GW-BASIC tokenized file

GW-BASIC tokenized files store programs written in the version of the BASIC programming language used on IBM PC compatibles in the days when interpreted BASIC was regularly included on personal computers as shipped from the factory. Originally the IBM PC had versions of BASIC called BASIC and BASICA, the latter being an "advanced" BASIC with a few more features. Part of it was in ROM, and part was loaded from disk. Other manufacturers' PC compatibles (or "clones") didn't have the ROM BASIC, but used a BASIC from Microsoft which was compatible with it. This went by a few manufacturer-specific names but was generically known as GW-BASIC. Claims vary about what the GW stands for: either the initials of a Microsoft employee (Greg Whitten) involved in adapting it from Bill Gates' original CP/M BASIC, or possibly "Gee Whiz".

Like most BASICs of its era, BASIC/BASICA/GW-BASIC used a tokenized format to save its programs, rather than plain-text source code. Printable ASCII characters (space through tilde) generally stood for themselves (except when part of a multi-byte sequence), but other bytes had different meanings. The "high-bit" bytes from #128-#255 stood for the various BASIC commands (some as single bytes, others as part of two-byte sequences), while some of the control characters had special meanings, including signifying the start of a binary-encoded sequence encapsulating a numeric constant. A null (#0) byte marked the end of a program line, and some header bytes at the start of each line encoded the line number and some byte offsets.
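
As a rough illustration of the line structure described above, the following Python sketch splits an unprotected tokenized file into (line number, raw body) pairs. Several details are assumptions drawn from commonly circulated format notes rather than from this article: a leading #255 byte marking an unprotected file, a four-byte line header made of a little-endian next-line offset followed by a little-endian line number, a zero offset marking the end of the program, and the operand sizes listed in CONST_OPERAND_SIZE. The function name split_lines is hypothetical. Because binary-encoded constants can themselves contain null bytes, the sketch skips their operands instead of simply searching for the next null.

```python
import struct

# ASSUMPTION: operand sizes (in bytes) for control-character tokens that
# introduce a binary-encoded value. These come from commonly circulated
# format notes, not from the article text.
CONST_OPERAND_SIZE = {
    0x0B: 2,  # octal constant
    0x0C: 2,  # hexadecimal constant
    0x0D: 2,  # line pointer
    0x0E: 2,  # line number reference
    0x0F: 1,  # one-byte integer
    0x1C: 2,  # two-byte integer
    0x1D: 4,  # single-precision float
    0x1F: 8,  # double-precision float
}


def split_lines(data):
    """Return [(line_number, body_bytes), ...] for an unprotected file."""
    # ASSUMPTION: unprotected tokenized files begin with a 0xFF byte.
    if not data or data[0] != 0xFF:
        raise ValueError("missing the assumed 0xFF leading byte")
    pos = 1
    lines = []
    while pos + 4 <= len(data):
        # ASSUMPTION: header is <next-line offset><line number>, both
        # 16-bit little-endian; a zero offset ends the program.
        next_offset, line_number = struct.unpack_from("<HH", data, pos)
        if next_offset == 0:
            break
        pos += 4
        start = pos
        # Walk to the terminating null, stepping over constant operands
        # so that a zero byte inside a constant is not taken for the end.
        while pos < len(data) and data[pos] != 0:
            pos += 1 + CONST_OPERAND_SIZE.get(data[pos], 0)
        lines.append((line_number, data[start:pos]))
        pos += 1  # skip the null terminator
    return lines


# Hand-built sample: one line numbered 10 whose body is a high-bit token
# byte (0x91), a space, and a two-byte integer constant containing a zero.
sample = bytes([0xFF,
                0x34, 0x12,              # next-line offset (value unused here)
                0x0A, 0x00,              # line number 10
                0x91, 0x20,              # token byte, space
                0x1C, 0x00, 0x01,        # two-byte integer constant
                0x00,                    # end of line
                0x00, 0x00])             # end of program
print(split_lines(sample))  # [(10, b'\x91 \x1c\x00\x01')]
```

The sketch ignores the stored offsets and walks the bytes instead, since the offsets reflect memory addresses at save time and need not be meaningful to a reader of the file. Turning the token bytes back into keywords would additionally need a token table, which is not reproduced here.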

Format documentation

 * GW-BASIC tokenised program format