Windows encodings
Dan Tobias
(Edited a statement that might not be so true anymore)
Latest revision as of 12:45, 6 November 2025
Windows encodings (or Windows code pages) are the various legacy character encodings used by the non-Unicode Microsoft Windows API and by most non-Unicode-aware Windows applications.
In many contexts, the term means "whatever the user's default non-Unicode encoding happens to be", which is bad for portability: the same bytes can decode to different text on differently configured systems. All too many file formats use one of these encodings, with no reliable way to determine which one.
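The ambiguity can be illustrated with a minimal Python sketch. It relies on Python's built-in cp1251, cp1252 and cp1253 codecs; the helper name decode_candidates and the sample word are hypothetical, chosen only for illustration. Every candidate decodes without error, so nothing in the bytes themselves reveals which code page was intended.

```python
# Bytes as a program on a Cyrillic-locale system might have written them.
data = "Привет".encode("cp1251")

def decode_candidates(raw, codecs=("cp1251", "cp1252", "cp1253")):
    """Decode raw bytes under each candidate code page (hypothetical helper)."""
    results = {}
    for name in codecs:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # At least one byte is undefined in this code page.
            results[name] = None
    return results

candidates = decode_candidates(data)
for name, text in candidates.items():
    print(name, repr(text))
```

Only the cp1251 result is the original word; the others are plausible-looking mojibake. Without out-of-band information (a declared encoding, or a guess based on language statistics), a reader of the file cannot tell these cases apart.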
The term is somewhat ambiguous. In the strictest sense, it refers to the so-called "ANSI" encodings such as Windows 1252, but it can also encompass many of the even-more-legacy MS-DOS encodings (a.k.a. "OEM" encodings) supported by Windows, such as CP437.
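As a rough illustration of the "ANSI" versus "OEM" split, the following Python sketch encodes one word under Windows 1252 and under CP437, then misreads each byte string with the other code page. The sample word and variable names are hypothetical; the sketch assumes only Python's built-in cp1252 and cp437 codecs.

```python
text = "café"

# The same text yields different bytes under the ANSI and OEM code pages.
ansi_bytes = text.encode("cp1252")  # 'é' is byte 0xE9 in Windows 1252
oem_bytes = text.encode("cp437")    # 'é' is byte 0x82 in CP437

# Reading each byte string with the wrong code page corrupts the accent.
misread_as_oem = ansi_bytes.decode("cp437")
misread_as_ansi = oem_bytes.decode("cp1252")

print(ansi_bytes, oem_bytes)
print(misread_as_oem, misread_as_ansi)
```

This kind of mismatch typically shows up when text written by a console program is opened in a GUI program, or the reverse, on the same machine.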
The native encoding of Windows NT-based systems is UTF-16 (or UCS-2 on very old systems), but that is usually not considered to be a "Windows encoding". Sufficiently modern versions of Windows even support several ways of using UTF-8 (code page 65001) as a "legacy" encoding: as a systemwide locale setting, per process via an application manifest[1], or for C library functions via setlocale[2].
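The difference between these encodings can be sketched in Python by comparing how two characters are stored. The helper name utf16_units and the sample characters are hypothetical. The sketch uses Python's utf-16-le codec as a stand-in for the in-memory form used by the Windows Unicode API, which is an assumption made for illustration rather than a call into Windows itself.

```python
def utf16_units(s):
    """Count 16-bit code units in the UTF-16 form of s (hypothetical helper)."""
    return len(s.encode("utf-16-le")) // 2

euro = "\u20ac"       # EURO SIGN, inside the Basic Multilingual Plane
emoji = "\U0001F600"  # outside the BMP, so UTF-16 needs a surrogate pair

euro_utf16 = euro.encode("utf-16-le")
emoji_utf16 = emoji.encode("utf-16-le")
euro_utf8 = euro.encode("utf-8")
euro_ansi = euro.encode("cp1252")

print(utf16_units(euro), utf16_units(emoji))
print(euro_utf16, emoji_utf16, euro_utf8, euro_ansi)
```

A character that takes two UTF-16 code units has no representation in UCS-2, which is limited to a single 16-bit unit per character. The last two values show why UTF-8 and Windows 1252 data cannot be interchanged blindly: the same character is three bytes in one and a single byte in the other.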
List of encodings
- Windows 1250 (Central European) - code table
- Windows 1251 (Cyrillic) - code table
- Windows 1252 (Western European; ISO 8859-1 plus additional characters) - code table
- Windows 1253 (Greek) - code table
- Windows 1254 (Turkish) - code table
- Windows 1255 (Hebrew) - code table
- Windows 1256 (Arabic, Farsi, Urdu) - code table
- Windows 1257 (Baltic Rim) - code table
- Windows 1258 (Vietnamese) - code table