UCS-2

From Just Solve the File Format Problem

Revision as of 17:20, 14 February 2013

File Format
Name UCS-2
Ontology

UCS-2 is the trivial 16-bit Unicode encoding. It is considered to be obsolete.

It was at one time the only popular Unicode encoding, so there was little need to distinguish between the terms Unicode and UCS-2. If an old format specification says that text is encoded in "Unicode", it probably means UCS-2.

UCS-2 encodes a sequence of Unicode code points in a sequence of unsigned 16-bit integers, one code point per integer, in the obvious way (U+0000=0x0000, U+0001=0x0001, ..., U+FFFF=0xffff). It is only capable of encoding code points up to U+FFFF, and does not support the higher code points (U+10000 through U+10FFFF).
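
As a minimal sketch of the mapping described above, the following Python function turns a string into a list of 16-bit values, one per code point, and rejects anything above U+FFFF. The function name `ucs2_encode_units` is a hypothetical helper made up for illustration; it is not part of any standard library.

```python
def ucs2_encode_units(text):
    """Return one unsigned 16-bit integer per code point in `text`.

    Hypothetical helper for illustration only. Raises ValueError for
    code points above U+FFFF, which UCS-2 cannot represent.
    """
    units = []
    for ch in text:
        cp = ord(ch)  # the code point as an integer
        if cp > 0xFFFF:
            raise ValueError(f"U+{cp:04X} is outside the range UCS-2 can encode")
        units.append(cp)  # the 16-bit value is just the code point itself
    return units
```

For example, `ucs2_encode_units("A\u20ac")` gives `[0x0041, 0x20AC]`, while a string containing U+1F600 raises `ValueError`.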

Since it is often necessary to encode code points into bytes, instead of 16-bit integers, there are two flavors of UCS-2 that do so: UCS-2BE (big-endian) and UCS-2LE (little-endian).
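
The difference between the two byte orders can be sketched with Python's standard `struct` module. The function name `ucs2_to_bytes` and its `big_endian` parameter are assumptions chosen for this example; the input is assumed to be a list of integers already in the range 0 to 0xFFFF.

```python
import struct

def ucs2_to_bytes(units, big_endian=True):
    """Serialize 16-bit values to bytes (hypothetical helper).

    big_endian=True  corresponds to UCS-2BE (high byte first).
    big_endian=False corresponds to UCS-2LE (low byte first).
    """
    order = ">" if big_endian else "<"
    # "H" is an unsigned 16-bit integer; the count prefix repeats it.
    return struct.pack(f"{order}{len(units)}H", *units)
```

With the values `[0x0041, 0x20AC]`, the big-endian form is the bytes `00 41 20 AC` and the little-endian form is `41 00 AC 20`.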

