Commodore BASIC tokenized file

From Just Solve the File Format Problem

Revision as of 13:52, 23 December 2012

File Format
Name Commodore BASIC tokenized file
Ontology
Extension(s) .prg
Released 1977

Commodore BASIC tokenized files stored programs in the version of the BASIC programming language used on Commodore computers, including the PET, VIC-20, Commodore 64, and Commodore 128. Several versions existed, all deriving from a version that Commodore licensed perpetually from Microsoft for a one-time fee and then developed further in-house. The most common version is 2.0, which was found on the Commodore 64, though earlier PET computers had BASIC 4.0 (Commodore put an out-of-date BASIC in the 64 because it was "just a home computer" and not expected to be used for serious work). The Commodore 128 had BASIC 7.0.

Like most BASICs of its era, Commodore BASIC saved its programs in a tokenized format rather than as plain-text source code. Printable PETSCII characters (and the various control codes that could be used within literal strings to do things like change the color of text) generally stood for themselves, but other bytes had different meanings. Outside quoted strings, the "high-bit" bytes from #128-#254 stood for the various BASIC commands and mathematical operators (#255 was used for the "pi" character). A null (#0) byte marked the end of a program line. Each line began with four header bytes: first a pointer to the next line (a 2-byte little-endian unsigned integer holding that line's memory address rather than a relative offset, with 0 indicating the end of the program), then the line number (also a 2-byte little-endian unsigned integer).
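The line layout described above can be sketched in Python. This is a minimal illustration, not a complete detokenizer: it assumes each line starts with the 2-byte next-line field followed by a 2-byte line number, it walks the lines by their null terminators instead of following the next-line field, and its token table is a small hand-picked subset of BASIC 2.0. The names `TOKENS`, `iter_lines`, and `detokenize` are made up for this example.

```python
import struct

# Illustrative subset of the BASIC 2.0 token table; a real tool needs all of it.
TOKENS = {
    0x80: "END", 0x81: "FOR", 0x82: "NEXT", 0x89: "GOTO", 0x8F: "REM",
    0x97: "POKE", 0x99: "PRINT", 0x9E: "SYS", 0xA4: "TO", 0xB2: "=",
    0xFF: "{pi}",
}

def iter_lines(body):
    """Yield (line_number, raw_bytes) for each line of a tokenized program.

    `body` is the program with the 2-byte PRG load address already removed.
    Raises ValueError if a line has no terminating null byte.
    """
    pos = 0
    while pos + 2 <= len(body):
        (link,) = struct.unpack_from("<H", body, pos)
        if link == 0:                      # zero next-line field: end of program
            break
        (number,) = struct.unpack_from("<H", body, pos + 2)
        end = body.index(0, pos + 4)       # null byte terminates the line
        yield number, body[pos + 4:end]
        pos = end + 1

def detokenize(raw):
    """Expand token bytes to keywords; bytes inside quotes are left literal."""
    out = []
    in_quotes = False
    for b in raw:
        if b == 0x22:                      # double quote toggles string mode
            in_quotes = not in_quotes
        if b >= 0x80 and not in_quotes:
            out.append(TOKENS.get(b, "{$%02X}" % b))
        elif 0x20 <= b < 0x80:
            out.append(chr(b))             # approximates PETSCII as ASCII
        else:
            out.append("{$%02X}" % b)      # control codes and high-bit PETSCII
    return "".join(out)

# A hand-assembled two-line program: 10 PRINT "HI" / 20 GOTO 10
sample = bytes([
    0x0C, 0x08, 0x0A, 0x00, 0x99, 0x20, 0x22, 0x48, 0x49, 0x22, 0x00,
    0x15, 0x08, 0x14, 0x00, 0x89, 0x20, 0x31, 0x30, 0x00,
    0x00, 0x00,
])
listing = [(n, detokenize(raw)) for n, raw in iter_lines(sample)]
```

Here `listing` holds one `(line_number, text)` pair per program line. Unknown token bytes and non-printable bytes are shown as `{$XX}` placeholders instead of being guessed at.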

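In BASIC 2.0, only the byte values up to #203 were actually assigned BASIC commands, leaving #204-#254 unassigned and available for future expansion. Later Commodore versions such as BASIC 4.0 assigned some of these values to additional commands, and there may be third-party extended BASICs that use some of them.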
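As a rough illustration of the byte ranges described above, the following Python sketch classifies a single byte found outside a quoted string. The function name `classify_byte` and its category labels are invented for this example, and the ranges are the ones stated in this article, so they should not be relied on for extended BASICs.

```python
def classify_byte(b):
    """Classify one byte of a tokenized line (outside quoted strings)."""
    if not 0 <= b <= 255:
        raise ValueError("not a byte value")
    if b == 0:
        return "end-of-line"     # null byte terminates a program line
    if b == 255:
        return "pi"              # used for the "pi" character
    if 128 <= b <= 203:
        return "token"           # assigned BASIC command or operator
    if 204 <= b <= 254:
        return "unassigned"      # left free for future expansion
    return "literal"             # PETSCII character standing for itself
```

A tool that encounters an "unassigned" value might reasonably flag the file as using a BASIC extension instead of treating it as corrupt.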

Unlike some other BASICs of the time, the Commodore tokenizer didn't collapse extra whitespace; all space characters entered by the programmer were stored in the file. This meant that you could sometimes save disk and memory space by eliminating all unnecessary spaces from the code, though this might make the code harder to read in places.
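Since every space is stored as its own byte, the potential saving can be estimated by counting them. The Python sketch below is a rough estimate under stated assumptions: it takes the bytes of one tokenized line (header and terminating null removed), skips spaces inside quoted strings, and stops at a REM token (#143 in BASIC 2.0), since comment text is not code. The function name is hypothetical, and the result is only an upper bound, because not every space counted is necessarily safe to delete.

```python
REM_TOKEN = 0x8F  # BASIC 2.0 token byte for REM

def count_spaces_outside_strings(raw):
    """Count space bytes outside quoted strings, stopping at a REM token."""
    count = 0
    in_quotes = False
    for b in raw:
        if b == 0x22:                          # double quote toggles string mode
            in_quotes = not in_quotes
        elif b == REM_TOKEN and not in_quotes:
            break                              # rest of the line is comment text
        elif b == 0x20 and not in_quotes:
            count += 1
    return count

# Tokenized form of: FOR I = 1 TO 10
spaced = bytes([0x81, 0x20, 0x49, 0x20, 0xB2, 0x20, 0x31, 0x20, 0xA4, 0x20, 0x31, 0x30])
# Tokenized form of: PRINT "A B"
quoted = bytes([0x99, 0x20, 0x22, 0x41, 0x20, 0x42, 0x22])
```

For the `spaced` line, writing it as `FORI=1TO10` instead would save five bytes on disk and in memory.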

BASIC programs were stored by Commodore DOS as file type "PRG" (program), in which the first two bytes held the memory address at which the file was expected to be loaded. This address was only used when the file was loaded with the LOAD filename,8,1 command, where the final '1' told the computer to use the address stored in the file; LOAD filename,8 always loaded the file into the normal BASIC program memory space. When these files are transferred to other platforms, they are often saved with a .prg extension, though this extension was not part of the original filename on the Commodore (the file type is a separate field in Commodore directory structures).
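Reading the load address from a transferred .prg file can be sketched in Python as follows. The sketch assumes the address is stored low byte first, as is usual on these 6502-based machines; the function name `split_prg` is made up for this example.

```python
import struct

def split_prg(data):
    """Split raw PRG file contents into (load_address, program_body)."""
    if len(data) < 2:
        raise ValueError("a PRG file needs at least a 2-byte load address")
    (load_address,) = struct.unpack("<H", data[:2])  # little-endian 16-bit
    return load_address, data[2:]

# Example: a load address of $0801 followed by an empty program (two zero bytes)
address, body = split_prg(bytes([0x01, 0x08, 0x00, 0x00]))
```

On a Commodore 64, BASIC programs normally carry the load address $0801, so a different value can hint that the file came from another machine or is not a BASIC program at all.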

Format documentation

Software

Other links and references
