LocoScript

From Just Solve the File Format Problem

Revision as of 21:03, 12 January 2021

File Format
Name LocoScript
Ontology

LocoScript was the word processor bundled with the Amstrad PCW. There were four major versions for the PCW, and two for MS-DOS.

8-bit versions:

  • LocoScript 1 (1985) was bundled with the Amstrad PCW 8256/8512 (3" drives) and the PcW 9256 and PcW 10 (3.5" drives).
  • LocoScript 2 (1987) was bundled with the Amstrad PCW 9512 (3" drive) and PcW 9512+ (3.5" drive), and was a common upgrade for the other models.
  • LocoScript 3 (1993) was only available separately. It added semi-scalable fonts.
  • LocoScript 4 (1996/7) was only available separately. It added support for images (in MDA format) and colour printing.

Most PCW documents, on either 3" or 3.5" floppy discs, are thus likely to be in LocoScript 1 or 2 format.

(The PcW 16 did not run LocoScript.)

While standard LocoScript had a relatively wide range of characters, there were some specialised versions for particular scripts, such as Euro-Arabic LocoScript and Hebrew LocoScript.

PC versions:

  • LocoScript PC (later LocoScript PC Easy)
  • LocoScript Professional

File formats

Each major version of LocoScript changed the file format. Newer versions could read files from older versions, but not vice versa.

  • LocoScript 1 is relatively well documented.
    • Schneider PC International 1988/01 (pp. 84-97) has a fairly detailed description of the LocoScript 1 format (in German), and provides Turbo Pascal source for a program, LOCOCONV, to convert it (updated in the 1988/10 issue, pp. 92-97). Versions of that program exist in various places:
      • Werner Cirsovius' site (on the Wayback Machine) had a copy. The Wayback Machine doesn't have all the code, but the whole website is archived as a .7z.
      • A modified/translated version is on Frank van Empel's site (search for LOCOCON).
    • Another, briefer description of the LocoScript 1 file format (in English)
    • The text portions use the Amstrad CP/M Plus character set, except that the C1 control range holds LocoScript's own control codes in place of the box-drawing characters of the CP/M Plus set. These codes differ from those of the C1 control standard, which probably didn't exist yet.
  • LocoScript 2 and up: no known descriptions (although plenty of software exists to read them). These versions had a greatly expanded character repertoire, more than can fit in a single-byte character set; see reference from John Elliott. They share some of the same basic structure as LocoScript 1.
    • Reportedly, Locomotive/LocoScript Software did produce format documentation for at least LocoScript 3 and 4 documents, the latter called The Structure of LocoScript 4 Documents and released under NDA. These documents don't seem to have made their way online. Refs: David Langford's columns in PCW Plus 114 (March 1996) and PCW Today issue 8 (Winter 97/98).

LocoScript documents did not have a conventional file extension. The default filenames it suggested were DOCUMENT.000, DOCUMENT.001, etc.

Identification

All known (PCW) LocoScript files start with the three ASCII bytes 4A 4F 59 ("JOY"), followed by two bytes identifying the major format version:

  • 01 00: Identified by LocoScript 4 as "LocoScript 2 document (Export)"
  • 01 01: LocoScript 1 (emitted by 1.1, 1.11e, 1.20, 1.42H)
  • 01 02: LocoScript 2 (emitted by 2.03, 2.12, 2.16, 2.28b)
  • 01 03: Identified by LocoScript 4 as "LocoScript 3 document (Export)"
  • 01 04: LocoScript 3 (emitted by 3.06b)
  • 01 05: Identified by LocoScript 4 as "LocoScript 4 document (Export)"
  • 01 06: LocoScript 4 (emitted by 4.06, 4.10, 4.11)

In addition, the byte at offset 0x7F must be the 8-bit checksum of the preceding 127 bytes.
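The identification rules above can be sketched in Python. This is only an illustration under stated assumptions: "8-bit checksum" is taken to mean the sum of the bytes modulo 256, which the description above does not spell out, and the names JOY_VERSIONS, header_checksum_ok and identify_pcw_document are made up for this example.

```python
# Version bytes and labels are taken from the list above.
JOY_VERSIONS = {
    b"\x01\x00": "LocoScript 2 document (Export)",
    b"\x01\x01": "LocoScript 1",
    b"\x01\x02": "LocoScript 2",
    b"\x01\x03": "LocoScript 3 document (Export)",
    b"\x01\x04": "LocoScript 3",
    b"\x01\x05": "LocoScript 4 document (Export)",
    b"\x01\x06": "LocoScript 4",
}


def header_checksum_ok(data: bytes) -> bool:
    """Check the byte at offset 0x7F against the preceding 127 bytes.

    ASSUMPTION: the checksum is the byte sum modulo 256.
    """
    if len(data) < 0x80:
        return False
    return (sum(data[:0x7F]) & 0xFF) == data[0x7F]


def identify_pcw_document(data: bytes):
    """Return a version label for a PCW LocoScript document, or None."""
    if data[:3] != b"JOY":
        return None
    label = JOY_VERSIONS.get(bytes(data[3:5]))
    if label is None or not header_checksum_ok(data):
        return None
    return label
```

To try it on a real file, read the first 128 bytes and pass them to identify_pcw_document. If genuine documents are rejected, the checksum assumption is the first thing to revisit.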

PC versions' documents seem to have a similar initial structure but start with the three ASCII bytes 44 4F 43 ("DOC"):

  • LocoScript PC 1.08 (© 1990) comes with documents starting "DOC" followed by 01 01.
  • LocoScript Professional 2 Plus for MS-DOS (2.51) emits and reads documents starting with "DOC" followed by 01 03.

Other files used by LocoScript have a similar header, with their own three-letter identification codes:

  • "BMP" - Scalable font bitmap
  • "CHR" - Printer font
  • "CMB" - Dot matrix printer driver
  • "DMN" - Disc Manager data (LocoScript PC)
  • "DRV" - Driver
  • "EDC" - Spellchecker dictionary
  • "HLP" - Help file (LocoScript PC)
  • "KBD" - Keyboard layout
  • "KNO" - Settings
  • "OML" - Overlay (Mail merge)
  • "OSP" - Overlay (Spell checker)
  • "OVL" - Overlay
  • "PHR" - List of phrases
  • "PRI" - Printer driver
  • "SCR" - Screen characters
  • "SDC" - Spellchecker dictionary
  • "UDC" - Spellchecker dictionary
  • "XCH" - Scalable font
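As a rough illustration, the three-letter codes listed above can drive a simple lookup on the first three bytes of a file. The function name classify_header and the labels for "JOY" and "DOC" are assumptions made for this sketch. It checks only the code itself and does not validate the rest of the header, since the layout of the non-document headers is not described here.

```python
# Codes and descriptions are taken from the lists above.
LOCOSCRIPT_CODES = {
    "JOY": "LocoScript document (PCW)",
    "DOC": "LocoScript document (PC)",
    "BMP": "Scalable font bitmap",
    "CHR": "Printer font",
    "CMB": "Dot matrix printer driver",
    "DMN": "Disc Manager data (LocoScript PC)",
    "DRV": "Driver",
    "EDC": "Spellchecker dictionary",
    "HLP": "Help file (LocoScript PC)",
    "KBD": "Keyboard layout",
    "KNO": "Settings",
    "OML": "Overlay (Mail merge)",
    "OSP": "Overlay (Spell checker)",
    "OVL": "Overlay",
    "PHR": "List of phrases",
    "PRI": "Printer driver",
    "SCR": "Screen characters",
    "SDC": "Spellchecker dictionary",
    "UDC": "Spellchecker dictionary",
    "XCH": "Scalable font",
}


def classify_header(data: bytes):
    """Map the first three bytes of a file to a description, or None."""
    if len(data) < 3:
        return None
    try:
        code = data[:3].decode("ascii")
    except UnicodeDecodeError:
        return None
    return LOCOSCRIPT_CODES.get(code)
```

A three-byte match on its own is weak evidence. Codes such as "BMP" or "DOC" can easily appear at the start of unrelated files, so a match is best treated as a hint rather than a firm identification.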


Converting LocoScript documents

The hardest part of converting LocoScript documents into more readable formats is probably not the conversion itself, but the fact that most LocoScript files were stored on 3-inch floppy disks, which are now difficult to access. See the linked page for ideas on how to deal with this.

LocoLink (and the later 'LocoLink for Windows') is a hardware and software combination that connects a PC parallel port to the expansion connector of an Amstrad PCW. Its software tools transfer LocoScript documents to the PC and convert them to RTF or TXT formats. The later PcW 16 has part of LocoLink built in, so documents can be transferred from an older PCW to a PcW 16 and then on to a PC, but PcW 16 computers are few and far between, which makes this route impractical for most people.

PCW LocoScript used CP/M format for its discs, so LocoScript files are likely to be found in a CP/M file system.

Once you are at the stage of having individual document files:

  • AILINK by Ansible Information is formerly commercial software for Windows, now free, which can convert PCW LocoScript 1-4 documents to more modern formats such as RTF, keeping most of the formatting codes and special characters (but not Greek and Cyrillic). It can do bulk conversions. As long as you're not trying to read actual floppy discs with it, it should work fine under modern Windows, and it runs adequately on Linux under Wine.
  • The PC versions of LocoScript could read PCW files, and had an export function to other formats, but are no longer particularly easy to acquire and run themselves.

Links
