UCS-2
Dan Tobias
* [[UTF-16]]
* [[Byte Order Mark]]

== External links ==

* [http://www.unicode.org/faq/basic_q.html#14 Unicode FAQ: What is the difference between UCS-2 and UTF-16?]
Revision as of 23:33, 14 February 2013
UCS-2 is the trivial fixed-width 16-bit Unicode encoding. It is considered obsolete, having been superseded by UTF-16.
It was at one time the only popular Unicode encoding, so there was little need to distinguish between the terms Unicode and UCS-2. If an old format specification says that text is encoded in "Unicode", it probably means UCS-2.
UCS-2 encodes a sequence of Unicode code points in a sequence of unsigned 16-bit integers, one code point per integer, in the obvious way (U+0000=0x0000, U+0001=0x0001, ..., U+FFFF=0xffff). It is only capable of encoding code points up to U+FFFF, and does not support the higher code points (U+10000 through U+10FFFF).
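The one-to-one mapping described above can be illustrated with a minimal Python sketch. The function names `ucs2_encode` and `ucs2_decode` are hypothetical, chosen only for this example, and the choice to raise an error for code points above U+FFFF is an assumption; a real implementation might substitute a replacement character instead.

```python
def ucs2_encode(text):
    """Map each code point of `text` to one unsigned 16-bit integer.

    Hypothetical helper: rejects code points that UCS-2 cannot represent.
    """
    units = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # U+10000 through U+10FFFF have no UCS-2 representation.
            raise ValueError(f"U+{cp:04X} cannot be represented in UCS-2")
        units.append(cp)  # the 16-bit value is simply the code point
    return units


def ucs2_decode(units):
    """Inverse mapping: one 16-bit integer back to one code point."""
    chars = []
    for u in units:
        if not 0 <= u <= 0xFFFF:
            raise ValueError(f"{u!r} is not an unsigned 16-bit value")
        chars.append(chr(u))
    return "".join(chars)
```

For example, `ucs2_encode("A\u20ac")` yields `[0x0041, 0x20AC]`, while a string containing U+1F600 raises `ValueError`.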
Since it is often necessary to encode code points into bytes instead of 16-bit integers, there are two flavors of UCS-2 that do so: UCS-2BE (big-endian) and UCS-2LE (little-endian).
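The difference between the two byte-oriented flavors can be sketched in Python as follows. The helper names `ucs2_to_bytes` and `ucs2_from_bytes` are hypothetical, and the sketch assumes the input is already a sequence of 16-bit values as described above; it does not handle a Byte Order Mark.

```python
def ucs2_to_bytes(units, byteorder="big"):
    """Serialize 16-bit values as UCS-2BE ("big") or UCS-2LE ("little").

    int.to_bytes raises OverflowError for values outside 0..0xFFFF.
    """
    return b"".join(u.to_bytes(2, byteorder) for u in units)


def ucs2_from_bytes(data, byteorder="big"):
    """Parse UCS-2BE or UCS-2LE bytes back into 16-bit values."""
    if len(data) % 2 != 0:
        raise ValueError("UCS-2 data must contain an even number of bytes")
    return [
        int.from_bytes(data[i:i + 2], byteorder)
        for i in range(0, len(data), 2)
    ]


units = [0x0041, 0x20AC]                 # "A" and the euro sign
be = ucs2_to_bytes(units, "big")         # bytes 00 41 20 AC
le = ucs2_to_bytes(units, "little")      # bytes 41 00 AC 20
```

For text that contains no surrogate code points, these byte sequences should match what Python's built-in `utf-16-be` and `utf-16-le` codecs produce, since UTF-16 and UCS-2 agree on that range.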