From Just Solve the File Format Problem
File Format
Name JOT
Released 1970s/1980s

JOT (Juggler Of Text) was an experimental word processor / text editor / notation program created by Ted Nelson (of Project Xanadu fame) and his associates as an attempt to embody how Nelson believed the creation and editing of computerized text ought to be done. Never released as a product, it had a number of experimental versions in the 1970s and 1980s, with version 0.53 for the Apple II platform (1986) its latest known version.

As envisioned by Nelson, JOT would move gracefully through a text document by logical units of words, sentences, and paragraphs, with increasingly abbreviated versions of parts of the document above and below the cursor showing summaries of distant document parts and full detail of near ones, changing smoothly as you move around. Technology of the day didn't fully allow this dream to be implemented, however.

The Lollipop state machine language was used in the design of this program.

File format

An examination of the raw bytes of the 0.53 disk dump (linked below) yields some clues about the format in which text was stored. The sample document that loads at startup in this version begins at offset 9400 (hex). In a DSK-format image of an Apple II 5.25" disk, sector boundaries fall at offsets whose hex value ends in "00".
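The offset arithmetic can be sketched as follows, assuming the common 16-sector layout with 256-byte sectors (an assumption; the dump itself is not described in that detail here). The function name `locate` is hypothetical, and the sector number it returns is the sector's position within the image file, which may not match the physical sector number on the disk because of sector interleaving.

```python
SECTOR_SIZE = 256        # bytes per sector, hence boundaries at hex offsets ending in "00"
SECTORS_PER_TRACK = 16   # assumed 16-sector disk layout

def locate(offset):
    """Split a byte offset in the image into (track, sector, byte within sector)."""
    sector_index, byte_in_sector = divmod(offset, SECTOR_SIZE)
    track, sector = divmod(sector_index, SECTORS_PER_TRACK)
    return track, sector, byte_in_sector

print(locate(0x9400))  # (9, 4, 0)
```

Under these assumptions, the sample document starts exactly on a sector boundary.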

The document begins with 51 (decimal) null (00) bytes. Perhaps this was padding to allow the early part of the document to be moved backward to accommodate insertions, which would be faster than moving the later part forward when the insertion is nearer the beginning. Then the text follows, with ASCII characters stored as their normal values, with one exception: capital letters at the start of sentences are stored as lowercase and capitalized upon display by JOT's algorithm, while other capitals (such as in proper names, acronyms, or emphasized "shouted" words) are stored as actual capitals. Word, sentence, and paragraph breaks, however, are represented by special characters in the 8-bit range 80 - FF (all character code points here are given in hexadecimal).

  • 80 represents a word break (displayed as a space).
  • 90 represents a sentence break (displayed as two spaces, with the first letter of the next sentence capitalized).
  • A0 represents a paragraph break (displayed as a line break followed by an indent).
  • F0 represents the end of the document. (Perhaps B0, C0, D0, and E0 were reserved for other division levels, such as chapters.)
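The observations above can be turned into a minimal decoding sketch. This is not JOT's own algorithm, only a reading of the byte values listed here. The function name `decode_jot` and the four-space indent are hypothetical, and the sketch assumes that the first character of the document and the first character after a paragraph break are capitalized in the same way as the first character after a sentence break. Bytes in the 80 - FF range with no known meaning are shown as hex placeholders rather than guessed at.

```python
WORD_BREAK = 0x80
SENTENCE_BREAK = 0x90
PARAGRAPH_BREAK = 0xA0
END_OF_DOCUMENT = 0xF0

def decode_jot(data: bytes, indent: str = "    ") -> str:
    """Render JOT-style document bytes as plain text (illustrative sketch only)."""
    out = []
    capitalize_next = True  # assumption: the document opens with a new sentence
    for byte in data.lstrip(b"\x00"):  # skip the leading null padding
        if byte == END_OF_DOCUMENT:
            break
        if byte == WORD_BREAK:
            out.append(" ")
        elif byte == SENTENCE_BREAK:
            out.append("  ")
            capitalize_next = True
        elif byte == PARAGRAPH_BREAK:
            out.append("\n" + indent)
            capitalize_next = True  # assumption: a new paragraph starts a new sentence
        elif byte < 0x80:
            ch = chr(byte)
            if capitalize_next:
                ch = ch.upper()  # no effect on digits or punctuation
                capitalize_next = False
            out.append(ch)
        else:
            out.append(f"<{byte:02X}>")  # unknown code in the 80 - FF range
    return "".join(out)

sample = b"\x00" * 51 + b"hello\x80world.\x90this\x80is\x80JOT.\xa0new\x80paragraph.\xf0"
print(decode_jot(sample))
```

Capitals stored explicitly, such as "JOT" in the sample, pass through unchanged because only the first character after a break is altered.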

