Commodore BASIC tokenized file

Commodore BASIC tokenized files store programs written in the version of the BASIC programming language used on Commodore computers, including the PET, VIC-20, Commodore 64, and Commodore 128. Several versions existed, all deriving from a version that Commodore licensed perpetually from Microsoft for a one-time fee and then developed further internally. The most common version is 2.0, which was found on the Commodore 64, even though earlier PET computers had BASIC 4.0 (Commodore put an out-of-date BASIC in the 64 because it was "just a home computer", not expected to be used for serious work). The Commodore 128 had BASIC 7.0.

Like most BASICs of its era, Commodore BASIC used a tokenized format to save its programs, rather than plain-text source code. Printable PETSCII characters generally stood for themselves, but other bytes had different meanings. The "high-bit" bytes from #128-#254 stood for the various BASIC commands and mathematical operators (#255 was used for the "pi" character). A null (#0) byte marked the end of a program line. Each line began with four header bytes: a link to the next line, followed by the line number, both 2-byte little-endian unsigned integers. The link was stored as the memory address of the next line rather than as a relative offset, and a link value of 0 marked the end of the program.
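The line layout described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes that each line starts with a 2-byte link field followed by a 2-byte line number, and that the file begins with a 2-byte load address, as program files saved to disk usually do. The function name iter_basic_lines and its has_load_address parameter are hypothetical. The link value is only tested for zero; the end of each line is found by scanning for the null terminator.

```python
import struct


def iter_basic_lines(data, has_load_address=True):
    """Yield (line_number, body_bytes) for each line of a tokenized program.

    Assumed layout (see the lead-in): optional 2-byte load address, then
    for each line a 2-byte link, a 2-byte line number, the tokenized body,
    and a terminating null byte. A link of 0 ends the program.
    """
    pos = 2 if has_load_address else 0
    while pos + 4 <= len(data):
        (link,) = struct.unpack_from("<H", data, pos)
        if link == 0:
            break  # end-of-program marker
        (line_number,) = struct.unpack_from("<H", data, pos + 2)
        end = data.find(0, pos + 4)  # null byte closes the line body
        if end == -1:
            break  # truncated file: no terminator found
        yield line_number, data[pos + 4:end]
        pos = end + 1


# Hand-built example: the single line 10 PRINT"HI", loaded at $0801.
example = bytes([
    0x01, 0x08,              # load address (assumed file prefix)
    0x0B, 0x08,              # link to the next line
    0x0A, 0x00,              # line number 10
    0x99,                    # token for PRINT in BASIC 2.0
    0x22, 0x48, 0x49, 0x22,  # "HI" in PETSCII
    0x00,                    # end of line
    0x00, 0x00,              # link of 0: end of program
])

for number, body in iter_basic_lines(example):
    print(number, body)
```

Scanning for the null byte instead of following the link keeps the sketch independent of the address at which the program was saved.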

In BASIC 2.0, only the byte values up to #203 were actually assigned BASIC commands, leaving #204-#254 unassigned and available for future expansion; later Commodore BASIC versions assign some of them, and third-party extended BASICs may use them as well.
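The byte ranges given above can be summarized in a small classifier. This is a sketch based only on the ranges stated in this article; the function name classify_byte and the category labels are hypothetical. It ignores context: bytes inside quoted strings are normally stored as literal PETSCII rather than as tokens, which this sketch does not track.

```python
def classify_byte(value):
    """Classify one byte of a tokenized line body by its numeric value.

    The ranges follow the description in the text and do not account
    for bytes that appear inside quoted strings.
    """
    if not 0 <= value <= 255:
        raise ValueError("not a byte value")
    if value == 0:
        return "end-of-line"
    if value == 255:
        return "pi"
    if 128 <= value <= 203:
        return "token"
    if 204 <= value <= 254:
        return "unassigned"
    return "literal"


print(classify_byte(0x99))  # a byte in the assigned token range
print(classify_byte(0x41))  # a printable PETSCII character
print(classify_byte(0xCC))  # first byte value past the assigned range
```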

Unlike some other BASICs of the time, the Commodore tokenizer didn't collapse extra whitespace; all space characters entered by the programmer were stored in the file. This meant that you could sometimes save disk and memory space by eliminating all unnecessary spaces from the code, though this might make the code harder to read in places.
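As a rough illustration of the possible savings, the following sketch counts the space bytes in a tokenized line body that lie outside quoted strings and before any REM comment. The function name count_removable_spaces is hypothetical, and the REM token value of #143 is an assumption taken from the BASIC 2.0 token table. The result is only an upper bound: DATA statements are not treated specially, and removing a space is not guaranteed to leave the program's meaning unchanged in every case.

```python
QUOTE = 0x22      # PETSCII double quote
SPACE = 0x20      # PETSCII space
REM_TOKEN = 0x8F  # assumed BASIC 2.0 token value for REM


def count_removable_spaces(body):
    """Count spaces outside quoted strings and before a REM token."""
    in_string = False
    count = 0
    for byte in body:
        if byte == QUOTE:
            in_string = not in_string  # entering or leaving a string
        elif not in_string:
            if byte == REM_TOKEN:
                break  # the rest of the line is comment text
            if byte == SPACE:
                count += 1
    return count


# Body of: PRINT "H I"   (one space after PRINT, one inside the string)
print_body = bytes([0x99, 0x20, 0x22, 0x48, 0x20, 0x49, 0x22])
print(count_removable_spaces(print_body))
```

Each space counted this way costs one byte both in the saved file and in memory.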

Format documentation

 * Commodore BASIC tokens