TRON code
From Just Solve the File Format Problem
This article describes the character encoding used in TRON. Unlike Unicode, it does not use Han unification, so it can clearly distinguish Japanese text from Chinese text.
Character codes are two-byte codes and are split into four zones:
- A zone: High byte and low byte are both in range 0x21 to 0x7E.
- B zone: High byte in range 0x80 to 0xFD and low byte in range 0x21 to 0x7E.
- C zone: High byte in range 0x21 to 0x7E and low byte in range 0x80 to 0xFD.
- D zone: High byte and low byte are both in range 0x80 to 0xFD.
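The zone of a code follows directly from the byte ranges listed above. A minimal sketch, assuming the code is given as separate high and low byte values (the function name `tron_zone` is made up for illustration):

```python
from typing import Optional


def tron_zone(hi: int, lo: int) -> Optional[str]:
    """Return 'A', 'B', 'C' or 'D' for a two-byte code, or None if it lies in no zone."""
    hi_low, hi_high = 0x21 <= hi <= 0x7E, 0x80 <= hi <= 0xFD
    lo_low, lo_high = 0x21 <= lo <= 0x7E, 0x80 <= lo <= 0xFD
    if hi_low and lo_low:
        return "A"
    if hi_high and lo_low:
        return "B"
    if hi_low and lo_high:
        return "C"
    if hi_high and lo_high:
        return "D"
    return None  # e.g. a byte of 0xFE, 0x7F, or a control character
```

For instance, `tron_zone(0x30, 0x21)` falls in the A zone, while a high byte of 0xFE matches no zone.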
The character codes are grouped into planes. A plane is selected by a two-byte sequence: the first byte is 0xFE and the second byte is the plane number plus 0x20 (for example, plane 1 is selected by the code 0xFE21). The default plane, if none is specified, is usually plane 1.
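Plane selection can be sketched as a small decoder that tracks the current plane. This is only an illustration under simplifying assumptions: the input is treated as a sequence of two-byte units, any unit whose first byte is 0xFE is treated as a plane selector, and no other control codes are handled. The names `plane_selector` and `iter_codes` are hypothetical.

```python
from typing import Iterator, Tuple


def plane_selector(plane: int) -> bytes:
    """Build the two-byte selector for a plane: 0xFE, then plane number + 0x20."""
    if not 1 <= plane <= 0xFF - 0x20:  # assumption: only require that the byte fits
        raise ValueError("plane number out of range")
    return bytes([0xFE, 0x20 + plane])


def iter_codes(data: bytes, default_plane: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield (plane, code) pairs, switching plane whenever a selector is seen."""
    if len(data) % 2:
        raise ValueError("expected a whole number of two-byte units")
    plane = default_plane
    for i in range(0, len(data), 2):
        hi, lo = data[i], data[i + 1]
        if hi == 0xFE:
            plane = lo - 0x20  # selector: second byte is plane number + 0x20
        else:
            yield plane, (hi << 8) | lo
```

With this sketch, a code that appears before any selector is reported in plane 1, matching the usual default.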
List of planes:
- 1 = JIS, GB2312, KS X 1001, and Braille
- 2,3 = GT
- 6 = Big5
- 8,9 = Dai-Kan-Wa-Jiten, hentaigana, etc.
- 10 = Dongba symbols
Conversion from other formats into TRON code is described below.
Plane 1
JIS X 0208, and first plane of JIS X 0213:
hi = ku+0x20, lo = ten+0x20
hi = ku+0xA0, lo = ten+0x20
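Both formulas above add a fixed offset to the row (ku) and cell (ten) numbers. A minimal sketch, assuming ku and ten are the usual 1-based row and cell numbers of a 94x94 set; the function name `kuten_to_tron` and the `hi_offset` parameter are made up for illustration:

```python
def kuten_to_tron(ku: int, ten: int, hi_offset: int = 0x20) -> int:
    """Map a row/cell pair to a 16-bit code: hi = ku + hi_offset, lo = ten + 0x20."""
    if not 1 <= ten <= 94:
        raise ValueError("ten must be in 1..94")
    hi = ku + hi_offset
    lo = ten + 0x20
    # Reject results whose high byte leaves the zone ranges (0x21-0x7E, 0x80-0xFD).
    if not (0x21 <= hi <= 0x7E or 0x80 <= hi <= 0xFD):
        raise ValueError("ku is out of range for this offset")
    return (hi << 8) | lo
```

With the default offset, row 16, cell 1 gives 0x3021; with `hi_offset=0xA0` the same pair gives 0xB021.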
Second plane of JIS X 0213:
hi = 0x87 to 0xA0, assigned consecutively to the valid rows of the second plane of JIS X 0213 (1, 3-5, 8, 12-15, 78-94); lo = ten+0x20
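The 26 valid rows match the 26 high-byte values from 0x87 to 0xA0, so the mapping can be written as a lookup table. A sketch, assuming the rows are assigned to high bytes in ascending order (the names `VALID_ROWS`, `ROW_TO_HI` and `jisx0213_plane2_to_tron` are hypothetical):

```python
# Valid rows of the second plane of JIS X 0213, as listed in the text.
VALID_ROWS = [1, 3, 4, 5, 8, 12, 13, 14, 15, *range(78, 95)]

# Assumption: rows receive consecutive high bytes in ascending row order.
ROW_TO_HI = {row: 0x87 + i for i, row in enumerate(VALID_ROWS)}


def jisx0213_plane2_to_tron(ku: int, ten: int) -> int:
    """Map a JIS X 0213 plane 2 row/cell pair to a 16-bit code in plane 1."""
    if ku not in ROW_TO_HI:
        raise ValueError("row is not used by the second plane of JIS X 0213")
    if not 1 <= ten <= 94:
        raise ValueError("ten must be in 1..94")
    return (ROW_TO_HI[ku] << 8) | (ten + 0x20)
```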
hi = ((ku-1)*94+ten-1)/126+0x21, lo = ((ku-1)*94+ten-1)%126+0x80
hi = ((ku-1)*94+ten-1)/126+0xB7, lo = ((ku-1)*94+ten-1)%126+0x80
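The last two formulas first flatten the 94x94 grid into a single index and then repack it into rows of 126 cells, starting at low byte 0x80. A minimal sketch, assuming "/" means integer division and ku and ten are 1-based; `linear_to_tron`, `tron_to_linear` and `hi_base` are illustrative names:

```python
from typing import Tuple


def linear_to_tron(ku: int, ten: int, hi_base: int) -> int:
    """Repack a 94x94 row/cell pair into rows of 126 cells (hi_base is 0x21 or 0xB7)."""
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError("ku and ten must be in 1..94")
    index = (ku - 1) * 94 + (ten - 1)  # 0 .. 8835
    hi = index // 126 + hi_base
    lo = index % 126 + 0x80
    return (hi << 8) | lo


def tron_to_linear(code: int, hi_base: int) -> Tuple[int, int]:
    """Inverse of linear_to_tron: recover (ku, ten) from a 16-bit code."""
    hi, lo = code >> 8, code & 0xFF
    index = (hi - hi_base) * 126 + (lo - 0x80)
    return index // 94 + 1, index % 94 + 1
```

Since 94*94 = 8836 cells occupy 71 rows of 126, a base of 0x21 uses high bytes 0x21 to 0x67 and a base of 0xB7 uses 0xB7 to 0xFD.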
(unknown)
Plane 9
- Codes 0x9721 to 0x972A are the Chinese/Japanese numerals one to ten enclosed in a square.
- Codes 0x972B to 0x975A are katakana in parentheses.
- Codes 0x975B to 0x9766 are the lowercase Roman numerals i to xii enclosed in a circle.
- Codes 0x9767 to 0x977A are the numbers 1 to 20 enclosed in a triangle.
- Codes 0x9830 to 0x9839 are the Baronh digits 0 to 9.
- Codes 0x9840 to 0x985B are the Baronh letters: a e i ï u ü é o c s t l n h p f m ai y ÿ œ r au eu g z d b
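The ranges listed above can be turned into a simple lookup. This sketch assumes that values are assigned in ascending order within each range, as the list suggests. It does not try to name individual katakana, because their order is not given here. The names `PLANE9_RANGES` and `describe_plane9` are hypothetical.

```python
from typing import Optional

BARONH_LETTERS = "a e i ï u ü é o c s t l n h p f m ai y ÿ œ r au eu g z d b".split()

# (first code, last code, label, value represented by the first code)
PLANE9_RANGES = [
    (0x9721, 0x972A, "number in a square", 1),
    (0x975B, 0x9766, "lowercase Roman numeral in a circle", 1),
    (0x9767, 0x977A, "number in a triangle", 1),
    (0x9830, 0x9839, "Baronh digit", 0),
]


def describe_plane9(code: int) -> Optional[str]:
    """Describe a plane 9 code from the ranges above, or return None if unlisted."""
    if 0x9840 <= code <= 0x985B:
        return "Baronh letter " + BARONH_LETTERS[code - 0x9840]
    if 0x972B <= code <= 0x975A:
        return "katakana in parentheses"
    for first, last, label, start in PLANE9_RANGES:
        if first <= code <= last:
            return f"{label}: {code - first + start}"
    return None
```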