Tokenized BASIC

From Just Solve the File Format Problem
File Format
Name: Tokenized BASIC
Extension(s): .bas

Sol BASIC screen shot (as simulated in Solace)

Tokenized BASIC is a method of storing programs in the BASIC programming language by encoding the keywords of the language as "tokens" instead of as plain text. Because a token is a shorter byte sequence than the full text of its keyword, tokenized programs take up less space in memory and in external storage such as disks or tapes. This was a significant concern in an era when computers had far less memory and disk space than they do at present. An interpreter can also take less time to parse code that is already in token form, which mattered on slower computers (and is the reason some languages, such as Python, still create bytecode files from source code to help their interpreters). Now that computers are much faster and have much more memory and disk space, tokenized formats are rarely used for storing source code, though compilers may still generate intermediate data that is tokenized in some way in the course of producing executable code from text-based sources.
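The idea can be illustrated with a minimal Python sketch. It is not the format of any real BASIC dialect: the token table, its byte values, and the function names tokenize and detokenize are all hypothetical, chosen only to show keywords shrinking to single bytes while text inside string literals is left alone. It assumes 7-bit ASCII input and uses a crude substring match, so it would also tokenize a keyword that happens to appear inside a longer name.

```python
# Hypothetical token table: these byte values are invented for illustration
# and do not match any real BASIC dialect.
TOKENS = {
    "PRINT": 0x80,
    "GOTO": 0x81,
    "FOR": 0x82,
    "NEXT": 0x83,
    "INPUT": 0x84,
    "IF": 0x85,
    "THEN": 0x86,
}
KEYWORDS = {code: word for word, code in TOKENS.items()}


def tokenize(line: str) -> bytes:
    """Replace keywords outside string literals with one-byte tokens.

    Assumes the line is 7-bit ASCII, so ordinary characters never
    collide with the token values (all 0x80 or above).
    """
    out = bytearray()
    in_string = False
    # Try longer keywords first so a short one never shadows a longer one.
    words = sorted(TOKENS, key=len, reverse=True)
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == '"':
            in_string = not in_string
        if not in_string:
            match = next((w for w in words if line.startswith(w, i)), None)
            if match is not None:
                out.append(TOKENS[match])
                i += len(match)
                continue
        out.append(ord(ch))
        i += 1
    return bytes(out)


def detokenize(data: bytes) -> str:
    """Expand token bytes back into keyword text, as a LIST command would."""
    return "".join(KEYWORDS.get(b, chr(b)) for b in data)


line = '20 IF A=1 THEN PRINT "GOTO IS TEXT HERE"'
packed = tokenize(line)
print(len(line), len(packed))
print(detokenize(packed) == line)
```

In this sketch the three keywords outside the quotes collapse to one byte each, while the word GOTO inside the string literal is stored as ordinary characters, so expanding the tokens reproduces the original line.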

In its heyday, from the 1960s through the 1980s, BASIC existed in many dialects, each designed for a specific machine platform, and the format of tokenized programs differed from one dialect to the next. On systems where file types were commonly identified by extension, .BAS was usually used for BASIC programs. Other systems had their own ways of identifying file types and often had a type code specific to their platform's BASIC interpreter (or multiple codes for different versions of BASIC, such as Apple II DOS's 'I' for Integer BASIC and 'A' for Applesoft floating-point BASIC).

People who wanted to transfer BASIC programs between systems would usually export them in text form by sending the output of the LIST command to a text file. This sometimes required special tweaking to get the proper format; on the Apple II, for instance, one first needed a poke, POKE 33,33, to set the screen window width narrow enough to defeat the automatic insertion of padding spaces on normal-size lines. Some BASICs made things easier by offering a "save-as-text" option in the SAVE command (appending ",A", with A for ASCII, sometimes worked). Even then, cross-system porting usually required considerable revision of the program because of the great differences between BASIC dialects.

Specific tokenized BASIC formats:

As a bit of trivia, three of the companies referenced above are named after U.S. states: Texas Instruments (TI), Ohio Scientific, and Connecticut Leather Company (Coleco).

Links and references
