C1 controls

From Just Solve the File Format Problem
(Difference between revisions)
(New version cloned from C0, still working on it.)
 
(12 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
{{FormatInfo
 
{{FormatInfo
 
|formattype=electronic
 
|formattype=electronic
|subcat=Character Encodings
+
|subcat=Character encoding
 
}}
 
}}
 +
The '''C1 controls''' are the control characters (code positions 128-159 decimal, 0x80-0x9F hexadecimal) which are defined by ISO/IEC 6429:1992 and are part of the [[ISO 8859]] and other encodings. They are rarely used, and in otherwise equivalent Microsoft character sets (e.g., Windows 1252) most of these positions are assigned to printing characters. On both the web and in e-mail, text is commonly announced as an ISO-8859 encoding while actually being in a Windows encoding, with printable characters at the positions the official standards reserve for these controls. This makes it unsafe for software to interpret such bytes as control characters, even though that is what the standards technically require. Because the controls are so rarely used, there is little agreement on their exact meaning and implementation.
  
The '''C1 controls''' are the control characters (code positions 128-159 decimal) which are part of the [[ISO-8859]] standard. They are also part of a number of other character sets derived from ASCII. They are not often used, and in otherwise equivalent Microsoft character sets (e.g., Windows 1252) they are replaced by printing characters.
+
Other alternative C1 controls have been used in some applications, such as [https://www.itscj.ipsj.or.jp/iso-ir/056.pdf UK videotex (1982)].
 
+
NOTE: I'm cloning this from the C0 article and saving it occasionally. There will be gross errors till I'm done. Bear with me or dive in. --[[User:Gmcgath|Gmcgath]] ([[User talk:Gmcgath|talk]]) 12:34, 30 November 2012 (UTC)
+
  
 
{| class="wikitable"
 
{| class="wikitable"
 
! title="Hexadecimal code point" | Hex
 
! title="Hexadecimal code point" | Hex
 
! title="Decimal code point" | Dec
 
! title="Decimal code point" | Dec
! title="Codes used to represent character" | Codes
+
! title="Standard abbreviation" | Abbreviation
! title="Standard Acronym" | Acronym
+
 
! title="Character name" | Name
 
! title="Character name" | Name
 
! title="Description and uses" | Description and uses
 
! title="Description and uses" | Description and uses
 
|-
 
|-
|80||0||^@, \0||NUL||Null character||Marks unused space or padding (e.g., to intentionally slow down terminals or to leave space for added data in memory or storage media). Used in C-based programming languages to mark end of string.
+
|80||128||PAD||Padding Character||Not part of ISO/IEC 6429.
 
|-
 
|-
|81||1||^A||SOH||Start of Heading||Marks the beginning of a header in a message or data structure.
+
|81||129||HOP||High Octet Preset||Not part of ISO/IEC 6429. On Commodore computers, sets text color to orange.
 
|-
 
|-
|82||2||^B||STX||Start of Text||Marks the beginning of the body text of a message, and/or the end of the header.
+
|82||130||BPH||Break Permitted Here||Follows a graphic character where a line break is permitted.
 
|-
 
|-
|83||3||^C||ETX||End of Text||Marks the end of the body text. Also used as "break character" (Control-C) to terminate a program or process.
+
|83||131||NBH||No Break Here||Follows a graphic character where a line break is not permitted.
 
|-
 
|-
|84||4||^D||EOT||End of Transmission||In Unix-style operating systems, signals end-of-file and is used to log out of a terminal. On Apple II, this character signalled that what followed was a DOS command when it was "printed" to standard output.
+
|84||132||IND||Index||Moves the active position one line down.
 
|-
 
|-
|85||5||^E||ENQ||Enquiry||Used in transmission protocols to request acknowledgement from the other end to make sure connection is still active. In DEC TOPS-20 mainframes, usually resulted in currently-active application outputting status information to terminal.
+
|85||133||NEL||Next Line||Moves the active position to the start of the next line, combining the effects of carriage return and line feed. On Commodore computers, Function Key 1.
 
|-
 
|-
|86||6||^F||ACK||Acknowledge||Sent as response to ENQ message, or used to positively acknowledge receipt of data or messages (as opposed to NAK).
+
|86||134||SSA||Start of Selected Area||On Commodore computers, Function Key 3.
 
|-
 
|-
|87||7||^G, \a||BEL||Bell||On some systems, this causes a bell, buzzer, or beep to sound, or flashes inverse video to alert a system operator. The Apple II had "BELL" on the front side of the "G" key to remind users that Ctrl-G caused this sound effect.
+
|87||135||ESA||End of Selected Area||On Commodore computers, Function Key 5.
 
|-
 
|-
|88||8||^H, \b||BS||Backspace||Moves back one space. Usually deletes last character (e.g., from input string), but on some old terminals it just moved backward without deleting and allowed "overstrike" effects overlaying multiple characters.
+
|88||136||HTS||Horizontal Tabulation Set||Sets a horizontal tab stop. On Commodore computers, Function Key 7.
 
|-
 
|-
|89||9||^I, \t||HT||Horizontal Tab||The typewriter "tab key", usually moving to the next tab stop as defined in the particular software being used.
+
|89||137||HTJ||Horizontal Tabulation with Justification||Sets a horizontal tab stop and indicates text should be justified out to the stop. On Commodore computers, Function Key 2.
 
|-
 
|-
|8A||10||^J, \n||LF||Line Feed||Move down one line. In Unix-style operating systems, it also moves to the beginning of the next line so that it can be used as a line break (newline) character, while in some other systems and terminals it just moves down without moving to the left, requiring the "CR LF" sequence to break a line.
+
|8A||138||VTS||Vertical Tabulation Set||Sets a vertical tab stop. On Commodore computers, Function Key 4.
 
|-
 
|-
|8B||11||^K, \v||VT||Vertical Tab||Moves to vertical tab stops; not used nearly as often as the more-common horizontal tab.
+
|8B||139||PLD||Partial Line Down||Moves the active position down to a position suitable for subscripts, or undoes PLU. On Commodore computers, Function Key 6.
 
|-
 
|-
|8C||12||^L, \f||FF||Form Feed||Causes page to eject in printers, and may clear the screen in some terminal emulators. Sometimes used as a logical division of sections of a document.
+
|8C||140||PLU||Partial Line Up||Moves the active position up to a position suitable for superscripts, or undoes PLD. On Commodore computers, Function Key 8.
 
|-
 
|-
|8D||13||^M, \r||CR||Carriage Return||Moves to the beginning of the line. In some systems (e.g., Apple II, Commodore 64, and TRS-80, and early Macintosh systems before its OS switched to a Unix-based system), also moves to the next line so that it can be used as a line break character, while in other systems it stays on the same line so that it must be accompanied by a LF character to break a line (but on some printing terminals CR with no LF was used for overstrike effects including underlining by printing underscores). Thus the three different line-break conventions (LF, CR, and CR+LF) arose, which bedevil users of text files to this day. As an input character, CR is generally mapped onto the Enter key, signaling the completion of input.
+
|8D||141||RI||Reverse Index||Moves the active position one line up. On Commodore computers, used for line feed.
 
|-
 
|-
|8E||14||^N||SO||Shift Out||Switch to alternate character set (reversed by SI). Used in various systems and terminals to set different characters (e.g., APL or Cyrillic), or change the color or font.
+
|8E||142||SS2||Single-Shift 2||Indicates that the next code only should be interpreted in the G2 character set. On Commodore computers, Shift In.
 
|-
 
|-
|8F||15||^O||SI||Shift In||Return to normal character set (reverses operation of SO).
+
|8F||143||SS3||Single-Shift 3||Indicates that the next code only should be interpreted in the G3 character set.
 
|-
 
|-
|90||16||^P||DLE||Data Link Escape||Signals the start of a sequence of raw data as opposed to normal printable or control characters.
+
|90||144||DCS||Device Control String||Introduces a device control sequence, which is terminated by ST (0x9C). On Commodore computers, sets text color to black.
 
|-
 
|-
|91||17||^Q||DC1||Device Control 1||One of four device-control codes intended to be system-specific. This one (CTRL-Q, also known as XON) is often used to resume operations of a process, device, or output stream that has been paused with CTRL-S (XOFF).
+
|91||145||PU1||Private Use 1||On Commodore computers, Cursor Up.
 
|-
 
|-
|92||18||^R||DC2||Device Control 2||Another device-control code; not used as much as DC1 and DC3.
+
|92||146||PU2||Private Use 2||On Commodore computers, Reverse Video Off.
 
|-
 
|-
|93||19||^S||DC3||Device Control 3||The third of the device-control codes; this one (CTRL-S, also known as XOFF) is often used to pause processes, devices, or output streams, with CTRL-Q (XON) resuming them (though in some cases, any keypress causes output to resume).
+
|93||147||STS||Set Transmit State||On Commodore computers, Form Feed.
 
|-
 
|-
|94||20||^T||DC4||Device Control 4||The fourth device-control code; not used as much as DC1 or DC3. In DEC TOPS-20 mainframes, usually resulted in output of system status to terminal.
+
|94||148||CCH||Cancel Character||Backspace and cancel the previous character. On Commodore computers, Insert.
 
|-
 
|-
|95||21||^U||NAK||Negative Acknowledge||In transmission protocols, indicates a failure requiring a re-send, or a negative response to a query of whether the process is ready to proceed.
+
|95||149||MW||Message Waiting||On Commodore computers, sets text color to brown.
 
|-
 
|-
|96||22||^V||SYN||Synchronous Idle||Signals that a correction may now be received in synchronous transmission protocols.
+
|96||150||SPA||Start of Protected Area||On Commodore computers, sets text color to light red.
 
|-
 
|-
|97||23||^W||ETB||End of Transmission Block||Marks the end of a block of data divided into blocks for transmission.
+
|97||151||EPA||End of Protected Area||On Commodore computers, sets text color to gray 1.
 
|-
 
|-
|98||24||^X||CAN||Cancel||Cancels an operation and signals that previously-sent data can be disregarded.
+
|98||152||SOS||Start of String||Introduces a control string, which is terminated by ST (0x9C). On Commodore computers, sets text color to gray 2.
 
|-
 
|-
|99||25||^Y||EM||End of Medium||Marks the end of a physical medium such as a data-storage tape.
+
|99||153||SGCI||Single Graphic Character Introducer||Not part of ISO/IEC 6429. On Commodore computers, sets text color to light green.
 
|-
 
|-
|9A||26||^Z||SUB||Substitute Character||Used to mark the spot where garbled, missing, or incomplete characters were received due to transmission errors, or various other uses involving place-holder characters.  This character (Ctrl-Z) is also used by MS/PC-DOS to mark the end of a file or input stream, calling it EOF (although CTRL-D, EOT, would have been more standards-compliant and is used by Unix-style OSs for this purpose; however, some DEC operating systems used the CTRL-Z convention and this is what was followed by PC-DOS).
+
|9A||154||SCI||Single Character Introducer||Followed by a single printing character or format effector. Meaning uncertain. On Commodore computers, sets text color to light blue.
 
|-
 
|-
|9B||27||^[||ESC||Escape||Mapped onto the ESC key on keyboards, this usually signals a user attempting to exit a menu or mode. It is also commonly used in printer and terminal control protocols to signal the beginning of a special "escape sequence" where immediately-following characters are interpreted as commands.
+
|9B||155||CSI||Control Sequence Introducer||On Commodore computers, sets text color to gray 3.
 
|-
 
|-
|9C||28||^\||FS||File Separator||One of four separator characters intended to delimit structured data. FS is the highest-level separator, intended to separate structures which are in turn internally delimited with GS, RS, and US (in descending order). Also used as a "quit and dump core" signal in Unix shells.
+
|9C||156||ST||String Terminator||Marks the end of control sequences introduced by several C1 codes. On Commodore computers, sets text color to purple.
 
|-
 
|-
|9D||29||^]||GS||Group Separator||The second of four separator characters, subordinate to FS, but higher-level than RS and US.
+
|9D||157||OSC||Operating System Command||Introduces an operating system command, which is terminated by ST (0x9C). On Commodore computers, Cursor Left.
 
|-
 
|-
|9E||30||^^||RS||Record Separator||The third of four separator characters, subordinate to FS and GS, but higher-level than US.
+
|9E||158||PM||Privacy Message||Introduces a privacy message, which is terminated by ST (0x9C). On Commodore computers, sets text color to yellow.
 
|-
 
|-
|9F||31||^_||US||Unit Separator||The lowest-level of the separator characters, used to divide strings of ASCII characters which are the base elements of a data structure. A sequence of such US-delimited strings can in turn be used as a higher-level data element separated by other such elements by the RS character, and this structure in turn can be delimited from other such elements by GS, and finally if a fourth level is needed the FS character separates those elements.
+
|9F||159||APC||Application Program Command||Introduces an application program command, which is terminated by ST (0x9C). On Commodore computers, sets text color to cyan.
 
|}
 
|}
 +
 +
== See also ==
 +
* [[C0 controls]]
  
 
[[Category:File format details]]
 
[[Category:File format details]]

Revision as of 06:09, 11 October 2020

File Format
Name C1 controls
Ontology

The C1 controls are the control characters (code positions 128-159 decimal, 0x80-0x9F hexadecimal) which are defined by ISO/IEC 6429:1992 and are part of the ISO 8859 and other encodings. They are rarely used, and in otherwise equivalent Microsoft character sets (e.g., Windows 1252) most of these positions are assigned to printing characters. On both the web and in e-mail, text is commonly announced as an ISO-8859 encoding while actually being in a Windows encoding, with printable characters at the positions the official standards reserve for these controls. This makes it unsafe for software to interpret such bytes as control characters, even though that is what the standards technically require. Because the controls are so rarely used, there is little agreement on their exact meaning and implementation.
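
The ambiguity described above can be seen by decoding the same byte both ways. The following is a minimal sketch, not a recommended detection method; it assumes Python's built-in `latin-1` and `cp1252` codecs as stand-ins for ISO-8859-1 and Windows 1252, and the helper name `describe_c1_byte` is made up for this illustration.

```python
# Bytes in the C1 range, 0x80-0x9F (128-159 decimal).
C1_RANGE = range(0x80, 0xA0)


def describe_c1_byte(value: int) -> str:
    """Show how one byte in 0x80-0x9F reads under each interpretation."""
    if value not in C1_RANGE:
        raise ValueError(f"0x{value:02X} is outside the C1 range")
    raw = bytes([value])
    # The "latin-1" codec maps the byte straight to the C1 control
    # code point U+0080-U+009F.
    as_iso = raw.decode("latin-1")
    # The "cp1252" codec maps most of these bytes to printing characters;
    # errors="replace" yields U+FFFD for positions the codec leaves undefined.
    as_windows = raw.decode("cp1252", errors="replace")
    return (
        f"0x{value:02X}: ISO-8859-1 U+{ord(as_iso):04X}, "
        f"Windows-1252 {as_windows!r}"
    )


if __name__ == "__main__":
    for byte in (0x85, 0x93, 0x9B):
        print(describe_c1_byte(byte))
```

A byte such as 0x85 is the NEL control under the first reading and a printable ellipsis under the second, which is why software cannot safely act on it as a control when the declared encoding may be wrong.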

Other alternative C1 controls have been used in some applications, such as UK videotex (1982).

{| class="wikitable"
! Hex !! Dec !! Abbreviation !! Name !! Description and uses
|-
|80||128||PAD||Padding Character||Not part of ISO/IEC 6429.
|-
|81||129||HOP||High Octet Preset||Not part of ISO/IEC 6429. On Commodore computers, sets text color to orange.
|-
|82||130||BPH||Break Permitted Here||Follows a graphic character where a line break is permitted.
|-
|83||131||NBH||No Break Here||Follows a graphic character where a line break is not permitted.
|-
|84||132||IND||Index||Moves the active position one line down.
|-
|85||133||NEL||Next Line||Moves the active position to the start of the next line, combining the effects of carriage return and line feed. On Commodore computers, Function Key 1.
|-
|86||134||SSA||Start of Selected Area||On Commodore computers, Function Key 3.
|-
|87||135||ESA||End of Selected Area||On Commodore computers, Function Key 5.
|-
|88||136||HTS||Horizontal Tabulation Set||Sets a horizontal tab stop. On Commodore computers, Function Key 7.
|-
|89||137||HTJ||Horizontal Tabulation with Justification||Sets a horizontal tab stop and indicates text should be justified out to the stop. On Commodore computers, Function Key 2.
|-
|8A||138||VTS||Vertical Tabulation Set||Sets a vertical tab stop. On Commodore computers, Function Key 4.
|-
|8B||139||PLD||Partial Line Down||Moves the active position down to a position suitable for subscripts, or undoes PLU. On Commodore computers, Function Key 6.
|-
|8C||140||PLU||Partial Line Up||Moves the active position up to a position suitable for superscripts, or undoes PLD. On Commodore computers, Function Key 8.
|-
|8D||141||RI||Reverse Index||Moves the active position one line up. On Commodore computers, used for line feed.
|-
|8E||142||SS2||Single-Shift 2||Indicates that the next code only should be interpreted in the G2 character set. On Commodore computers, Shift In.
|-
|8F||143||SS3||Single-Shift 3||Indicates that the next code only should be interpreted in the G3 character set.
|-
|90||144||DCS||Device Control String||Introduces a device control sequence, which is terminated by ST (0x9C). On Commodore computers, sets text color to black.
|-
|91||145||PU1||Private Use 1||On Commodore computers, Cursor Up.
|-
|92||146||PU2||Private Use 2||On Commodore computers, Reverse Video Off.
|-
|93||147||STS||Set Transmit State||On Commodore computers, Form Feed.
|-
|94||148||CCH||Cancel Character||Backspace and cancel the previous character. On Commodore computers, Insert.
|-
|95||149||MW||Message Waiting||On Commodore computers, sets text color to brown.
|-
|96||150||SPA||Start of Protected Area||On Commodore computers, sets text color to light red.
|-
|97||151||EPA||End of Protected Area||On Commodore computers, sets text color to gray 1.
|-
|98||152||SOS||Start of String||Introduces a control string, which is terminated by ST (0x9C). On Commodore computers, sets text color to gray 2.
|-
|99||153||SGCI||Single Graphic Character Introducer||Not part of ISO/IEC 6429. On Commodore computers, sets text color to light green.
|-
|9A||154||SCI||Single Character Introducer||Followed by a single printing character or format effector. Meaning uncertain. On Commodore computers, sets text color to light blue.
|-
|9B||155||CSI||Control Sequence Introducer||On Commodore computers, sets text color to gray 3.
|-
|9C||156||ST||String Terminator||Marks the end of control sequences introduced by several C1 codes. On Commodore computers, sets text color to purple.
|-
|9D||157||OSC||Operating System Command||Introduces an operating system command, which is terminated by ST (0x9C). On Commodore computers, Cursor Left.
|-
|9E||158||PM||Privacy Message||Introduces a privacy message, which is terminated by ST (0x9C). On Commodore computers, sets text color to yellow.
|-
|9F||159||APC||Application Program Command||Introduces an application program command, which is terminated by ST (0x9C). On Commodore computers, sets text color to cyan.
|}
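
Several entries in the table (DCS, SOS, OSC, PM, APC) introduce a string that runs until ST (0x9C). As a rough illustration of that framing, the sketch below separates such strings from the surrounding bytes. It assumes raw 8-bit input in which these bytes really are C1 controls (so not UTF-8 or Windows 1252 text), and the function name `split_control_strings` and its handling of unterminated strings are choices made for this example only, not behavior taken from any standard or implementation.

```python
# Introducers from the table whose strings are terminated by ST (0x9C).
STRING_INTRODUCERS = {
    0x90: "DCS",
    0x98: "SOS",
    0x9D: "OSC",
    0x9E: "PM",
    0x9F: "APC",
}
ST = 0x9C


def split_control_strings(data: bytes):
    """Return (plain_bytes, [(abbreviation, payload), ...]) for 8-bit input."""
    plain = bytearray()
    strings = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte in STRING_INTRODUCERS:
            end = data.find(bytes([ST]), i + 1)
            if end == -1:
                # Assumption of this sketch: an unterminated string
                # swallows the rest of the input.
                end = len(data)
            strings.append((STRING_INTRODUCERS[byte], data[i + 1:end]))
            i = end + 1  # skip past the terminator
        else:
            plain.append(byte)
            i += 1
    return bytes(plain), strings


if __name__ == "__main__":
    text, found = split_control_strings(b"ab\x9dtitle\x9ccd")
    print(text, found)
```

Given the mislabeling problem described in the introduction, a routine like this is only appropriate when the encoding of the input is known with certainty.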

See also

* C0 controls