Atari BASIC tokenized file
Atari BASIC was used on the Atari 400 and 800 computers, among the many systems competing for the home computer market in the late '70s and early '80s. While Atari considered using an adapted Microsoft BASIC like some other manufacturers, they ultimately used an independently developed BASIC instead, so many characteristics of this BASIC (including its manner of tokenization) differ greatly from the other BASICs of the time.
While most BASICs used some byte values to represent literal characters (in string constants and variable names, for instance) and others (often in the "high-bit" range of 128-255) for tokenized keywords, Atari BASIC used a more complicated scheme in which values throughout the 256 possibilities were used for tokens of multiple classes. Context determined the interpretation of a value. If it was the first token of a statement (a line can hold several statements), its meaning was taken from a list of Statement Name Tokens. At other positions, a different token list was used, which included values representing functions, operators, or variables. The variables in the program were itemized in a variable table stored at the beginning of the program, so that a reference to a variable in the program used only a single-byte token standing for the name. There were 128 positions in the token list for variables (comprising the high-bit values), meaning that only 128 different variables could be used in a program.
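The context rule described above can be sketched roughly as follows. This is only an illustration: the function name classify_token, the is_leading flag, and the returned labels are hypothetical, and no actual token tables are included. It assumes the caller already knows whether the byte sits in the leading position where a statement name is expected.

```python
def classify_token(value, is_leading):
    """Return a (kind, detail) pair for one token byte.

    `is_leading` is a hypothetical flag supplied by the caller: True when
    the byte is in the position where a statement name token is expected.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("token must be a single byte")
    if is_leading:
        # Looked up in the Statement Name Token list (table not shown here).
        return ("statement", value)
    if value & 0x80:
        # High-bit values refer to variables; the low 7 bits give the
        # position in the variable table (0-127).
        return ("variable", value & 0x7F)
    if value == 0x0E:
        return ("numeric_constant", None)
    if value == 0x0F:
        return ("string_constant", None)
    # Anything else comes from the operator/function list (not shown here).
    return ("operator_or_function", value)
```

For example, classify_token(0x81, False) reports a reference to the variable at position 1 of the variable table, while the same byte in the leading position would be looked up as a statement name instead.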
String constants were marked by the byte 0F (hex), followed by a byte giving the string length (0-255), then the characters of the string itself. Numeric constants were marked by 0E (hex), followed by six bytes holding a floating point value.
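A minimal sketch of reading these two constant forms from a byte buffer is shown below. The names read_constant and decode_bcd_float are hypothetical. The layout of the six-byte value is an assumption not stated on this page: it is taken to be Atari's BCD floating point, with a first byte holding the sign in bit 7 and an excess-64 power-of-100 exponent in the low 7 bits, followed by five bytes of packed BCD digits with the decimal point after the first of them. String bytes are returned undecoded, since they are ATASCII rather than ASCII.

```python
def decode_bcd_float(raw):
    """Decode a 6-byte value, ASSUMING the Atari BCD floating-point layout."""
    if len(raw) != 6:
        raise ValueError("expected exactly 6 bytes")
    digits = 0
    for b in raw[1:]:
        hi, lo = b >> 4, b & 0x0F
        if hi > 9 or lo > 9:
            raise ValueError("invalid BCD digit")
        digits = digits * 100 + hi * 10 + lo
    if digits == 0:
        return 0.0
    sign = -1 if raw[0] & 0x80 else 1
    # One base-100 digit sits before the point and four after it,
    # so the integer `digits` is scaled by 100**(exponent - 4).
    shift = (raw[0] & 0x7F) - 64 - 4
    if shift >= 0:
        return float(sign * digits * 100 ** shift)
    return sign * digits / 100 ** (-shift)


def read_constant(data, pos):
    """Read one constant at data[pos]; return (kind, value, next_pos)."""
    marker = data[pos]
    if marker == 0x0F:  # string: length byte, then that many characters
        length = data[pos + 1]
        start, end = pos + 2, pos + 2 + length
        if end > len(data):
            raise ValueError("string constant runs past end of data")
        return ("string", bytes(data[start:end]), end)
    if marker == 0x0E:  # number: six bytes of floating-point data
        return ("number", decode_bcd_float(bytes(data[pos + 1:pos + 7])), pos + 7)
    raise ValueError("not a constant marker")
```

Returning the next position lets a caller step through a statement one token at a time, skipping over the variable-length string data.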
Literal characters are in ATASCII, Atari's not-quite-ASCII character set.
Software
References
- Atari BASIC Source Book - has assembly source and some descriptions of what it does, from which a complete token table and file format spec could be puzzled out, though the required info is spread out somewhat.
- Books on Atari programming in Internet Archive

