ASCII
(Add new category "Text encoding") |
Dan Tobias (Talk | contribs) (→Specifications) |
||
(22 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
− | {| | + | {{FormatInfo |
− | | | + | |formattype=electronic |
− | | | + | |subcat=Character encoding |
− | | | + | |released=1963 |
− | | | + | |charset=US-ASCII |
− | | | + | |charsetaliases=iso-ir-6, ANSI_X3.4-1968, ANSI_X3.4-1986, ISO_646.irv:1991, ISO646-US, us, IBM367, cp367, csASCII |
− | | | + | |mibenum=3 |
− | | | + | |codepage=367 |
− | |} | + | |cfstringencoding=1536 |
− | + | |nsstringencoding=1 | |
− | The '''American Standard Code for Information Interchange''' (ASCII) is a character encoding designed for English-based information interchange. The first version was published in 1963, but had a number of differences from the later version published in 1967, which had some minor tweaks in 1986 to result in what is now referred to as '''us-ascii''' when specifying character encodings. ASCII was intended to replace a number of proprietary character sets used by various device manufacturers, and largely succeeded at that although IBM continued to use [[EBCDIC]] for a number of years. However, since only the English alphabet was included, many so-called "extended ASCII" sets were used with different characters (accented letters, other alphabets, and special symbols) in the positions from 128 to 255 which were available when an eighth bit was added to the seven bits needed to encode the 128 ASCII characters. (Some systems, however, used the eighth bit as a checksum or flag of some sort, precluding such character set extensions.) Some writing systems such as Chinese, Japanese, and Korean were entirely unsuitable for ASCII-based character sets, and adopted various multi-byte representations. Thus, there was once again a profusion of proprietary character encodings until [[Unicode]] brought some order to the chaos. | + | |pronom={{PRONOM|x-fmt/22}}, {{PRONOM|x-fmt/283}} |
+ | |wikidata={{wikidata|Q8815}} | ||
+ | }} | ||
+ | The '''American Standard Code for Information Interchange''' (ASCII) is a character encoding designed for English-based information interchange. The first version was published in 1963, but had a number of differences from the later version published in 1967 (for one, the 1963 version had only uppercase letters and left the eventual location of lowercase undefined, and also had some of the control characters in different locations), which had some minor tweaks in 1986 to result in what is now referred to as '''us-ascii''' (also known as ISO 646-US) when specifying character encodings. ASCII was intended to replace a number of proprietary character sets used by various device manufacturers, and largely succeeded at that although IBM continued to use [[EBCDIC]] for a number of years. However, since only the English alphabet was included, many so-called "extended ASCII" sets were used with different characters (accented letters, other alphabets, and special symbols) in the positions from 128 to 255 which were available when an eighth bit was added to the seven bits needed to encode the 128 ASCII characters. (Some systems, however, used the eighth bit as a checksum or flag of some sort, precluding such character set extensions.) Some writing systems such as Chinese, Japanese, and Korean were entirely unsuitable for ASCII-based character sets, and adopted various multi-byte representations. Thus, there was once again a profusion of proprietary character encodings until [[Unicode]] brought some order to the chaos. The 128 characters of ASCII are incorporated into Unicode in their ASCII code positions from 0000 to 007F (hex). | ||
Early personal computers didn't always implement ASCII consistently. The original version of the Apple II lacked lowercase letters, for instance, showing random gibberish where those characters were found. A "lower case adaptor" chip could be installed to remedy this, and later computers in the Apple II series (starting with the IIe) came with lowercase support built in. Meanwhile, the Commodore PET, VIC-20, 64, and 128 used an unusual variation sometimes called [[PETSCII]] (or PET ASCII or CBM ASCII), which could be switched between two modes, one which only had uppercase letters (with the codes usually containing lowercase instead containing graphical characters), and another which introduces lowercase, but in the odd manner of replacing the character codes normally used for uppercase with lowercase letters, and adding a new set of uppercase letters at a completely different position in the set (replacing some graphic characters, but not the ones that are in the spots usually used by lowercase). This makes the conversion of text files created on or for Commodore computers a challenge. Atari computers had their own [[ATASCII]]. | Early personal computers didn't always implement ASCII consistently. The original version of the Apple II lacked lowercase letters, for instance, showing random gibberish where those characters were found. A "lower case adaptor" chip could be installed to remedy this, and later computers in the Apple II series (starting with the IIe) came with lowercase support built in. 
Meanwhile, the Commodore PET, VIC-20, 64, and 128 used an unusual variation sometimes called [[PETSCII]] (or PET ASCII or CBM ASCII), which could be switched between two modes, one which only had uppercase letters (with the codes usually containing lowercase instead containing graphical characters), and another which introduces lowercase, but in the odd manner of replacing the character codes normally used for uppercase with lowercase letters, and adding a new set of uppercase letters at a completely different position in the set (replacing some graphic characters, but not the ones that are in the spots usually used by lowercase). This makes the conversion of text files created on or for Commodore computers a challenge. Atari computers had their own [[ATASCII]]. | ||
+ | |||
+ | ASCII characters have been used artistically to draw pictures, which is known as [[ASCII Art]]. | ||
== Control characters == | == Control characters == | ||
Line 17: | Line 22: | ||
The lower 32 characters of the ASCII set are control characters given various special uses by different systems and programs, and sometimes also given a graphic rendition in some platforms. | The lower 32 characters of the ASCII set are control characters given various special uses by different systems and programs, and sometimes also given a graphic rendition in some platforms. | ||
− | + | See article: '''[[C0 controls]]''' | |
== Specifications == | == Specifications == | ||
* RFC 20 | * RFC 20 | ||
+ | * [http://datatracker.ietf.org/doc/status-change-rfc20-ascii-format-to-standard/ RFC 20 finally designated an Internet Standard (2015)] | ||
* [http://www.wps.com/projects/codes/X3.4-1963/ ASA standard X3.4-1963] | * [http://www.wps.com/projects/codes/X3.4-1963/ ASA standard X3.4-1963] | ||
* [http://www.unicode.org/charts/PDF/U0000.pdf Unicode C0 Controls and Basic Latin Code Block] | * [http://www.unicode.org/charts/PDF/U0000.pdf Unicode C0 Controls and Basic Latin Code Block] | ||
+ | * [http://www.textfiles.com/programming/FORMATS/asciival.txt ASCII character code chart] | ||
+ | * [http://www.kreativekorp.com/charset/encoding/USASCII/ US ASCII code chart at Kreative Korp] | ||
+ | * [http://www.kreativekorp.com/charset/encoding/USASCIIQuotes/ US ASCII Quotes code chart; charts pre-1986 version with curly left and right single quotes] | ||
== External links == | == External links == | ||
− | * [ | + | * [[Wikipedia: ASCII]] |
− | * [http:// | + | * [http://trafficways.org/ascii/ascii.pdf The Evolution of Character Codes, 1874–1968] ([https://github.com/ericfischer/ascii Source code at GitHub]) |
− | [[Category: | + | [[Category:ISO 646]] |
Latest revision as of 05:20, 27 June 2019
The American Standard Code for Information Interchange (ASCII) is a character encoding designed for English-based information interchange. The first version was published in 1963 but differed in several ways from the version published in 1967: for one, the 1963 version had only uppercase letters and left the eventual location of lowercase undefined, and it also had some of the control characters in different locations. The 1967 version received minor tweaks in 1986, resulting in what is now referred to as us-ascii (also known as ISO 646-US) when specifying character encodings. ASCII was intended to replace a number of proprietary character sets used by various device manufacturers, and it largely succeeded, although IBM continued to use EBCDIC for a number of years. However, since only the English alphabet was included, many so-called "extended ASCII" sets came into use. These placed different characters (accented letters, other alphabets, and special symbols) in positions 128 to 255, which became available when an eighth bit was added to the seven bits needed to encode the 128 ASCII characters. (Some systems, however, used the eighth bit as a parity bit or a flag of some sort, precluding such character set extensions.) Some writing systems, such as Chinese, Japanese, and Korean, were entirely unsuitable for ASCII-based character sets and adopted various multi-byte representations. Thus, there was once again a profusion of proprietary character encodings until Unicode brought some order to the chaos. The 128 characters of ASCII are incorporated into Unicode at their ASCII code positions, U+0000 to U+007F.
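The seven-bit range and the match between ASCII and Unicode code positions can be illustrated with a short Python sketch. The function names `is_ascii` and `strip_eighth_bit` are hypothetical, chosen only for this illustration; the second shows roughly what happens to text on a system that reserves the eighth bit for its own use.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte fits in seven bits (0x00 to 0x7F)."""
    return all(b < 0x80 for b in data)


def strip_eighth_bit(data: bytes) -> bytes:
    """Clear the high bit of each byte, keeping only the seven ASCII bits."""
    return bytes(b & 0x7F for b in data)


text = "ASCII"
encoded = text.encode("us-ascii")

# ASCII byte values equal the Unicode code points of the same characters.
same_positions = [ord(ch) for ch in text] == list(encoded)
```

A byte such as 0xE9 (an accented letter in several "extended ASCII" sets) fails `is_ascii`, and stripping its eighth bit silently turns it into a different character, which is why such extensions could not coexist with systems that used that bit for other purposes.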
Early personal computers didn't always implement ASCII consistently. The original version of the Apple II lacked lowercase letters, for instance, showing random gibberish where those characters were found. A "lower case adaptor" chip could be installed to remedy this, and later computers in the Apple II series (starting with the IIe) came with lowercase support built in. Meanwhile, the Commodore PET, VIC-20, 64, and 128 used an unusual variation sometimes called PETSCII (or PET ASCII or CBM ASCII), which could be switched between two modes. One mode had only uppercase letters, with the codes that usually contain lowercase holding graphical characters instead. The other mode introduced lowercase, but in an odd manner: the character codes normally used for uppercase held lowercase letters, and a new set of uppercase letters was added at a completely different position in the set (replacing some graphic characters, but not the ones in the spots usually used by lowercase). This makes the conversion of text files created on or for Commodore computers a challenge. Atari computers had their own ATASCII.
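The kind of remapping needed for Commodore text in the lowercase mode might look like the following Python sketch. The code ranges are assumptions not stated in this article (lowercase at 0x41 to 0x5A and the relocated uppercase at 0xC1 to 0xDA), and the function name `petscii_shifted_to_ascii` is hypothetical; check the ranges against a PETSCII chart before relying on them.

```python
def petscii_shifted_to_ascii(data: bytes) -> bytes:
    """Remap letters from the lowercase PETSCII mode to ASCII (letters only)."""
    out = bytearray()
    for b in data:
        if 0x41 <= b <= 0x5A:
            # Assumed: lowercase letters occupy the codes ASCII uses for uppercase.
            out.append(b + 0x20)
        elif 0xC1 <= b <= 0xDA:
            # Assumed: uppercase letters live in a separate block above 0x7F.
            out.append(b - 0x80)
        else:
            # Everything else is passed through unchanged in this sketch.
            out.append(b)
    return bytes(out)
```

This handles letters only. Graphic characters, control codes, and the uppercase-only mode would each need their own handling, so the output is not guaranteed to be pure ASCII.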
ASCII characters have been used artistically to draw pictures, which is known as ASCII Art.
Control characters
The lowest 32 code positions of the ASCII set (0 through 31) are control characters, which different systems and programs put to various special uses and which some platforms also give a graphic rendition.
See article: C0 controls
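As a minimal illustration, the Python sketch below tests whether a code falls in the control range and produces one common textual rendition of it. The function names are hypothetical, and caret notation (such as ^I for tab) is a widespread convention for displaying control characters rather than part of ASCII itself.

```python
def is_c0_control(code: int) -> bool:
    """Return True for the control codes 0 through 31."""
    return 0 <= code < 32


def caret_notation(code: int) -> str:
    """Render a control code as '^' plus the character 64 positions higher."""
    if not is_c0_control(code):
        raise ValueError(f"not a C0 control code: {code}")
    return "^" + chr(code + 64)


def find_controls(data: bytes) -> list[tuple[int, str]]:
    """List (offset, caret form) for every control code found in data."""
    return [(i, caret_notation(b)) for i, b in enumerate(data) if is_c0_control(b)]
```

For example, `find_controls(b"a\tb\n")` reports a tab at offset 1 and a line feed at offset 3.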
Specifications
- RFC 20
- RFC 20 finally designated an Internet Standard (2015)
- ASA standard X3.4-1963
- Unicode C0 Controls and Basic Latin Code Block
- ASCII character code chart
- US ASCII code chart at Kreative Korp
- US ASCII Quotes code chart; charts pre-1986 version with curly left and right single quotes