Windows encodings

From Just Solve the File Format Problem
(Difference between revisions)
Line 3: Line 3:
  |subcat=Character encoding
  }}
- '''Windows encodings''' refers to the various legacy character encodings used by the non-[[Unicode]] Microsoft Windows API, and most non-Unicode-aware Windows applications. (GUI applications, anyway. Windows ''console'' applications default to using the [[MS-DOS encodings]] in many cases.)
+ '''Windows encodings''' (or '''Windows code pages''') refers to the various legacy character encodings used by the non-[[Unicode]] Microsoft Windows API, and most non-Unicode-aware Windows applications.

- In Windows jargon, these encodings are misleadingly called "ANSI". Together with the [[MS-DOS encodings]], they are sometimes called the "multi-byte" encodings.
+ In many contexts, it means "whatever the user's default non-Unicode encoding happens to be", which is bad from a portability perspective. All too many file formats use one of these encodings, with no reliable way to determine which one.

- All too many file formats use one of these encodings, with no reliable way to determine which one.
+ The term is somewhat ambiguous. In the strictest sense, it refers to the so-called "ANSI" encodings such as [[Windows 1252]], but it can also encompass many of the even-more-legacy [[MS-DOS encodings]] (a.k.a. "OEM" encodings) supported by Windows, such as [[CP437]].
+
+ The native encoding of Windows NT-based systems is [[UTF-16]] (or [[UCS-2]] for very old systems), but that is usually not considered to be a "Windows encoding". Sufficiently modern versions of Windows even support [[UTF-8]] as a "legacy" encoding, though it is preferable to use the Unicode API instead.

  == List of encodings ==
Line 18: Line 20:
  == Links ==
  * [https://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/ Unicode mappings]
+ * [[Wikipedia: Windows code page]]

  [[Category:Microsoft]]
  [[Category:Windows]]

Revision as of 17:32, 27 January 2018

File Format
Name Windows encodings
Ontology

Windows encodings (or Windows code pages) are the various legacy character encodings used by the non-Unicode Microsoft Windows API and by most non-Unicode-aware Windows applications.

In many contexts, the term effectively means "whatever the user's default non-Unicode encoding happens to be", which is bad for portability. All too many file formats use one of these encodings, with no reliable way to determine which one.

The term is somewhat ambiguous. In the strictest sense, it refers to the so-called "ANSI" encodings such as Windows 1252, but it can also encompass many of the even-more-legacy MS-DOS encodings (a.k.a. "OEM" encodings) supported by Windows, such as CP437.
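
As a rough illustration of why the choice of code page matters, the sketch below decodes the same four bytes under two "ANSI" code pages and one "OEM" code page. It uses Python's built-in `cp1252`, `cp1251` and `cp437` codecs; the sample bytes and the helper name `decode_all` are made up for this example and do not come from any particular file format.

```python
# Hypothetical sample: the bytes a Windows-1252 program would write for "café".
raw = b"caf\xe9"

# Two "ANSI" code pages and one "OEM" code page, chosen only for illustration.
CANDIDATES = ("cp1252", "cp1251", "cp437")


def decode_all(data: bytes, encodings=CANDIDATES) -> dict:
    """Decode the same bytes under each candidate encoding."""
    return {name: data.decode(name) for name in encodings}


results = decode_all(raw)
for name, text in results.items():
    # Every decode succeeds, but the last character differs each time.
    print(f"{name}: {text}")
```

All three decodes succeed without raising an error, so the bytes alone give no hint as to which reading was intended.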

The native encoding of Windows NT-based systems is UTF-16 (or UCS-2 for very old systems), but that is usually not considered to be a "Windows encoding". Sufficiently modern versions of Windows even support UTF-8 as a "legacy" encoding, though it is preferable to use the Unicode API instead.
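
To make the contrast concrete, here is a small sketch comparing how one character is stored under a legacy code page, UTF-16 and UTF-8, again using Python's standard codecs. The choice of "é" and of Windows 1252 as the legacy encoding is only an assumption for the example.

```python
text = "é"  # U+00E9, an arbitrary sample character

as_ansi = text.encode("cp1252")      # one byte in the legacy code page
as_utf16 = text.encode("utf-16-le")  # one 16-bit code unit, little-endian
as_utf8 = text.encode("utf-8")       # two bytes

for label, data in (("cp1252", as_ansi), ("utf-16-le", as_utf16), ("utf-8", as_utf8)):
    print(f"{label}: {data.hex(' ')}")

# Reading UTF-8 bytes as if they were Windows 1252 yields the familiar mojibake.
misread = as_utf8.decode("cp1252")
print(misread)
```

The last line shows what happens when UTF-8 data is handed to software that assumes a legacy code page: the decode succeeds, but the text is wrong.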

List of encodings

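Links

* Unicode mappings: https://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/
* Wikipedia: Windows code page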
