Character encoding

From Just Solve the File Format Problem

Revision as of 00:57, 18 March 2015

File Format
Name: Character encoding


Character encodings are methods of representing the characters of text, usually as numeric values that can be stored on computers as bits and bytes, but sometimes in other media (e.g., Braille represents characters as patterns of raised dots). They are sometimes also called "character sets", but purists draw a distinction: strictly speaking, a character set is merely a repertoire of characters, the list of characters supported by some system, protocol, or file format, without necessarily having any inherent order or numbering. A character encoding assigns a specific value (in some coding system) to each character.

In practice the distinction gets fuzzy, because there are multiple levels of abstraction. Unicode, for example, defines a set of characters and assigns each one a numeric code point, but leaves it to more specific encodings such as UTF-8 to define the actual bits and bytes that represent those code points in a file. Some protocols even use parameter names such as 'charset' to indicate which character encoding is in use, so the terminology slips even in technical usage. This section documents character sets and encodings of all kinds.
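As a rough illustration of those levels of abstraction, the following Python sketch takes one character, looks up its Unicode code point, and then serializes it under three different encodings. The helper name `encode_all` and the choice of encodings are assumptions made for this example only.

```python
def encode_all(char: str) -> dict:
    """Return the code point of `char` and its bytes under a few encodings."""
    return {
        "code_point": ord(char),              # abstract number assigned by Unicode
        "utf-8": char.encode("utf-8"),        # variable-length byte sequence
        "utf-16-be": char.encode("utf-16-be"),  # 16-bit units, big-endian, no BOM
        "latin-1": char.encode("latin-1"),    # single byte (ISO 8859-1)
    }

# U+00E9, LATIN SMALL LETTER E WITH ACUTE
result = encode_all("\u00e9")
for name, value in result.items():
    print(name, value)

# Reading bytes with the wrong encoding yields different characters:
# the two UTF-8 bytes are interpreted as two separate Latin-1 characters.
misread = result["utf-8"].decode("latin-1")
print(misread)
```

The same code point (U+00E9) comes out as two bytes in UTF-8, two bytes in UTF-16, and one byte in Latin-1, which is why a file's bytes cannot be interpreted correctly without knowing which encoding produced them.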

See Fonts for how characters are rendered on screens and printouts. The visual appearance of a character is known as a "glyph", and a font is a set of glyphs mapped onto the more abstractly defined characters of the character set that underlies a character encoding.


Specific character sets or encodings

Format details

Character escape codes

(used to enter characters in various systems and formats)
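As a small, non-exhaustive illustration, the Python sketch below decodes several common escape notations that all denote the same character (U+00E9). The selection of notations and the variable names are assumptions made for this example; only standard-library functions are used.

```python
import html
import json
from urllib.parse import unquote

# Each expression yields the single character U+00E9 from a different escape syntax.
from_python_escape = "\u00e9"                    # Python string literal escape
from_html_decimal = html.unescape("&#233;")      # HTML decimal character reference
from_html_hex = html.unescape("&#xE9;")          # HTML hexadecimal character reference
from_html_named = html.unescape("&eacute;")      # HTML named character reference
from_json_escape = json.loads('"\\u00e9"')       # JSON string escape
from_percent = unquote("%C3%A9")                 # URL percent-encoding of the UTF-8 bytes

decoded = [
    from_python_escape,
    from_html_decimal,
    from_html_hex,
    from_html_named,
    from_json_escape,
    from_percent,
]
print(all(c == decoded[0] for c in decoded))
```

Note that most of these notations refer to the Unicode code point, while percent-encoding escapes the bytes of a particular encoding (UTF-8 here, which is the default assumed by `unquote`).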


Commentary and satire

Other external links

