Crunch

From Just Solve the File Format Problem
(Difference between revisions)
(+ lbrate)
m (Sample files)
 
(21 intermediate revisions by 4 users not shown)
Line 3: Line 3:
 
|subcat=Compression
 
|subcat=Compression
 
|extensions={{ext|?z?}}, {{ext|zzz}}
 
|extensions={{ext|?z?}}, {{ext|zzz}}
 +
|released=~1986
 
}}
 
}}
 +
:''This article is about the CP/M compressed file format. See the [[#Disambiguation|disambiguation section]] for other "Crunch" formats.''
  
[[Crunch]] was a method of compressing single files popular on [[CP/M]], devised by Steve Greenberg circa 1986. It superseded [[Squeeze]] and was succeeded by [[CrLZH]], and crunched files were common in [[LBR]] archives. The underlying compression uses the [[LZW]] algorithm.
+
[[Crunch]] was a method of compressing single files popular on [[CP/M]], devised by Steve Greenberg circa 1986. It superseded [[Squeeze]] and was succeeded by [[CrLZH]], and crunched files were common in [[LBR]] archives. The underlying compression uses the [[LZW]] algorithm, combined with [[run-length encoding]].
  
 
Similar to [[Squeeze]], crunched files were signified in CP/M's 8.3 filename format by replacing the middle letter of the extension with Z (.?Z?), with the extension .ZZZ used for corner cases such as a blank extension.
 
Similar to [[Squeeze]], crunched files were signified in CP/M's 8.3 filename format by replacing the middle letter of the extension with Z (.?Z?), with the extension .ZZZ used for corner cases such as a blank extension.
  
== Tools ==
+
There are two main versions of the compressed data format, and not all decompressors support both. The new (v2.x) format is apparently more common.
 +
 
 +
== Disambiguation ==
 +
Not to be confused with:
 +
* The "crunched" compression methods used in [[ARC (compression format)|ARC]] format, though they are related.
 +
* [[Crunch-Mania]] - An Amiga file compression utility
 +
* [[Cruncher]] - An executable compression utility for DOS, by Ori Berger
 +
* [[Crunch (Luck Martins)|Crunch]] - A file encryption utility for DOS, by Luck Martins ({{OldskoolDOSEXE}} → DOSEXE Executable Tools Pack → packers/crunch.14...)
 +
* CRUNCH - A compression optimization utility for DOS, by Bruce Gavin [{{CdTextfilesURL|20mnn/ARCHIVE/CRUNCH10.ZIP}}]
 +
* Crunch - An old [[ARC (compression format)|ARC]] compression utility by Richard P. Byrne [{{CdTextfilesURL|rbbsv3n1/d86v/crunch.zip}}]
 +
* CRUNCH - A PKARC automation utility by Chuck Zulker [{{CdTextfilesURL|megarom/megarom1/ARC_LBR/CRUNCH.ZIP}}]
 +
 
 +
== See also ==
 +
* [[LZWCOM]] - predecessor
 +
* [[Squeeze]] - predecessor
 +
* [[CrLZH]] - successor
 +
* [[LBR]] - container
 +
* [[ZSQ (LZW compression)‎]] - Similar format
 +
* [[Zoo Z format]] - Same file naming convention
 +
 
 +
== Format details ==
 +
The file header follows a similar/compatible structure to [[CrLZH]]. It was derived from [[Squeeze]], but bears only a little resemblance to it.
 +
 
 +
Note that, as explained in the format documentation, the "filename" field contains not only the filename, but also extension data. If extension data exists, the filename extension is padded with spaces until it is exactly three characters long.
 +
 
 +
In archives originating on CP/M systems, the high bit of each byte in the filename field may contain encoded CP/M file attributes. To extract the original filename, each byte should be masked with 0x7F.
 +
 
 +
=== Compression ===
 +
V1.x compression is based on [[RLE90]] and [[LZWCOM]], very similar to [[ARC (compression format)|ARC]]'s method #6, except that Crunch reserves code 0 to mean "stop".
 +
 
 +
V2.x compression is considerably more complex. CRUNCH20.DOC shipped in CRUNCH20.LBR says: ''It embodies all of the concepts employed in the UNIX COMPRESS / ARC512 algorithm, but is additionally enhanced by a "metastatic code reassignment" facility. This is one of several concepts I am developing as part of an effort to advance data compression techniques beyond current performance limits. I believe this is the first time this principle has been proposed or implemented.''
 +
 
 +
== Identification ==
 +
Files begin with bytes {{magic|76 fe}}.
 +
 
 +
== Specifications ==
 +
 
 +
* The file header is described in the text file LZDEF20.DOC shipped with [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/sigm/vol294/crunch20.lbr CRUNCH20.LBR].
 +
** An extracted copy is provided [[Crunch/LZDEF20.DOC|here]].
 +
* [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/cpm/squsq/crunch.abs crunch.abs] - "Technical Abstract" by Steven Greenberg, 16 November 1986
 +
* [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/cpm/squsq/crunch.izf crunch.izf] → crunch.inf - Collected information about the format
 +
 
 +
== Software ==
  
 
* [[CFX]] (DOS/Unix)
 
* [[CFX]] (DOS/Unix)
* [http://www.svgalib.org/rus/lbrate.html lbrate] (Unix)
+
* [http://www.svgalib.org/rus/lbrate.html lbrate] by Russell Marks, c. 2001 (Unix, GPL2)
 +
* [[The Unarchiver]]
 
* On CP/M (or emulators):
 
* On CP/M (or emulators):
** The canonical tools were CRUNCH and UNCR. Possibly Greenberg's last version (Feb 1988) is v2.4: [http://www.classiccmp.org/cpmarchives/cpm/mirrors/oak.oakland.edu/pub/cpm/squsq/crunch24.lbr CRUNCH24.LBR], [http://www.classiccmp.org/cpmarchives/cpm/mirrors/oak.oakland.edu/pub/cpm/squsq/crnch24s.lbr CRNCH24S.LBR] (source code).
+
** The canonical tools were CRUNCH and UNCR. Possibly Greenberg's last version (Feb 1988) is v2.4:
** The later LT31 deals with extracting from all of [[Squeeze]], [[Crunch]], [[CrLZH]] and [[LBR]] formats. Widely available in CP/M archives, e.g. [http://www.classiccmp.org/cpmarchives/cpm/mirrors/oak.oakland.edu/pub/cpm/arc-lbr/lt31.lbr LT31.LBR]
+
*** [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/cpm/squsq/crunch24.lbr CRUNCH24.LBR]
 +
*** [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/cpm/squsq/crnch24s.lbr CRNCH24S.LBR] (source code)
 +
** The later LT31 deals with extracting from all of [[Squeeze]], [[Crunch]], [[CrLZH]] and [[LBR]] formats. Widely available in CP/M archives, e.g. [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/cpm/arc-lbr/lt31.lbr LT31.LBR]
 +
** crunch12.lbr - Crunch 1.2 - Possible sources: [http://gaby.de/ftp/pub/cpm/znode51/pcwworld/u111/user_0/crunch12.lbr], [https://www.worldofsam.org/products/fdos-disk-002-file-compressors-and-archivers]
 +
** [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/sigm/vol294/crunch20.lbr crunch20.lbr] - Crunch 2.0
 +
** [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/cpm/squsq/fcrnch11.lbr fcrnch11.lbr] - FCRUNCH v1.1 - An improved version of Crunch 2.x, by C.B. Falconer
 +
* {{CdTextfiles|megarom/megarom1/ARC_LBR/UNCR_DOS.ZIP|UNCR version "UNCR231"}} - Crunch v2 decompression source code by Frank Prindle. Package includes a DOS binary.
 +
** {{CdTextfiles|megarom/megarom1/ARC_LBR/UNCR233.ZIP|UNCR233}} - Based on UNCR231, with modifications by Skip Hansen (source code + DOS binary)
  
== References ==
+
== Sample files ==
 +
* [http://cpmarchives.classiccmp.org/cpm/mirrors/oak.oakland.edu/pub/cpm/ OAK CP/M archive] → .../*.?z?
 +
* Found in many [[LBR#Sample files|LBR]] files. Note that you may have to tell your LBR utility not to decompress them (e.g. <code>lbrate -n</code>).
 +
* {{DexvertSamples|archive/crunch}}
  
* The file header is described in the text file LZDEF20.DOC shipped with [http://www.classiccmp.org/cpmarchives/cpm/mirrors/oak.oakland.edu/pub/sigm/vol294/crunch20.lbr CRUNCH20.LBR].
+
[[Category:File formats with too many extensions]]
** An extracted copy is provided [[Crunch/LZDEF20.DOC|here]].
+
[[Category:CP/M]]
** Note that the file header follows a similar/compatible structure to [[Squeeze]] and [[CrLZH]].
+
* FIXME: is the exact compression algorithm documented anywhere?
+
** CRUNCH20.DOC shipped in CRUNCH20.LBR says: ''It embodies all of the concepts employed in the UNIX COMPRESS / ARC512 algorithm, but is additionally enhanced by a "metastatic code reassignment" facility. This is one of several concepts I am developing as part of an effort to advance data compression techniques beyond current performance limits. I believe this is the first time this principle has been proposed or implemented.''
+
** See also "Technical Abstract" by Steven Greenberg, 16 November 1986: [http://www.classiccmp.org/cpmarchives/cpm/mirrors/oak.oakland.edu/pub/cpm/squsq/crunch.abs CRUNCH.ABS]
+

Latest revision as of 04:14, 28 December 2023

File Format
Name: Crunch
Ontology (subcategory): Compression
Extension(s): .?z?, .zzz
Released: ~1986
This article is about the CP/M compressed file format. See the disambiguation section for other "Crunch" formats.

Crunch was a method of compressing single files popular on CP/M, devised by Steve Greenberg circa 1986. It superseded Squeeze and was succeeded by CrLZH, and crunched files were common in LBR archives. The underlying compression uses the LZW algorithm, combined with run-length encoding.

As with Squeeze, crunched files were marked in CP/M's 8.3 filename format by replacing the middle letter of the extension with Z (.?Z?). The extension .ZZZ was used for corner cases such as a blank extension.
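
The naming convention can be illustrated with a minimal Python sketch. The function name crunched_name is hypothetical, and only the two cases described above are handled (a three-letter extension and a blank extension); any other corner case is rejected rather than guessed at.

```python
def crunched_name(name: str) -> str:
    """Map a CP/M 8.3 filename to its conventional crunched name."""
    base, _, ext = name.upper().partition(".")
    ext = ext.strip()
    if not ext:
        # Blank extension: the corner case that becomes .ZZZ.
        return f"{base}.ZZZ"
    if len(ext) != 3:
        # Other corner cases are not spelled out here, so refuse to guess.
        raise ValueError(f"unhandled extension: {ext!r}")
    # Replace the middle letter of the extension with Z.
    return f"{base}.{ext[0]}Z{ext[2]}"


print(crunched_name("README.TXT"))  # README.TZT
print(crunched_name("MAKEFILE"))    # MAKEFILE.ZZZ
```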

There are two main versions of the compressed data format, and not all decompressors support both. The newer (v2.x) format appears to be the more common of the two.


Disambiguation

Not to be confused with:

  • The "crunched" compression methods used in ARC format, though they are related.
  • Crunch-Mania - An Amiga file compression utility
  • Cruncher - An executable compression utility for DOS, by Ori Berger
  • Crunch - A file encryption utility for DOS, by Luck Martins (ANORMAL's DOSEXE collections → DOSEXE Executable Tools Pack → packers/crunch.14...)
  • CRUNCH - A compression optimization utility for DOS, by Bruce Gavin [1]
  • Crunch - An old ARC compression utility by Richard P. Byrne [2]
  • CRUNCH - A PKARC automation utility by Chuck Zulker [3]

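See also

  • LZWCOM - predecessor
  • Squeeze - predecessor
  • CrLZH - successor
  • LBR - container
  • ZSQ (LZW compression) - Similar format
  • Zoo Z format - Same file naming convention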

Format details

The file header has a structure similar to, and compatible with, that of CrLZH. It was derived from the Squeeze header, but bears little resemblance to it.

Note that, as explained in the format documentation, the "filename" field contains not only the filename, but also extension data. If extension data exists, the filename extension is padded with spaces until it is exactly three characters long.

In archives originating on CP/M systems, the high bit of each byte in the filename field may contain encoded CP/M file attributes. To extract the original filename, each byte should be masked with 0x7F.
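
A minimal Python sketch of the masking step follows. The function names are hypothetical, and the sketch assumes the filename field has already been extracted from the header as raw bytes; it does not interpret which attribute each high bit represents.

```python
def decode_filename(raw: bytes) -> str:
    """Clear the high bit of every byte to recover the plain filename."""
    return bytes(b & 0x7F for b in raw).decode("ascii")


def high_bits(raw: bytes) -> list[bool]:
    """Report which bytes carried a set high bit (possible attribute flags)."""
    return [bool(b & 0x80) for b in raw]


# Example field with the high bit set on the 'F' and the 'T' of the extension.
field = bytes([ord("F") | 0x80]) + b"OO." + bytes([ord("T") | 0x80]) + b"XT"
print(decode_filename(field))  # FOO.TXT
print(high_bits(field))
```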

Compression

V1.x compression is based on RLE90 and LZWCOM, very similar to ARC's method #6, except that Crunch reserves code 0 to mean "stop".
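
The run-length stage can be sketched in Python. This is an assumption-laden illustration, not Crunch's documented behaviour: it follows the RLE90 scheme as commonly described for ARC, where the marker byte 0x90 followed by a count N means "repeat the previous byte so that it appears N times in total", and 0x90 followed by 0 stands for a literal 0x90. The function name rle90_decode is hypothetical, and the LZW stage is not shown.

```python
RLE_MARKER = 0x90  # marker byte assumed from the RLE90 scheme as used in ARC


def rle90_decode(data: bytes) -> bytes:
    """Expand RLE90-style runs (a sketch under the assumptions stated above)."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        i += 1
        if byte != RLE_MARKER:
            out.append(byte)
            continue
        if i >= len(data):
            raise ValueError("marker byte at end of input")
        count = data[i]
        i += 1
        if count == 0:
            # Marker followed by zero encodes a literal 0x90.
            out.append(RLE_MARKER)
        else:
            if not out:
                raise ValueError("run with no preceding byte")
            # The previous byte is already in the output once.
            out.extend(bytes([out[-1]]) * (count - 1))
    return bytes(out)


print(rle90_decode(b"A\x90\x04B"))  # b'AAAAB'
```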

V2.x compression is considerably more complex. CRUNCH20.DOC, shipped in CRUNCH20.LBR, says: "It embodies all of the concepts employed in the UNIX COMPRESS / ARC512 algorithm, but is additionally enhanced by a 'metastatic code reassignment' facility. This is one of several concepts I am developing as part of an effort to advance data compression techniques beyond current performance limits. I believe this is the first time this principle has been proposed or implemented."

Identification

Files begin with the two bytes 76 FE (hexadecimal).
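
A minimal Python sketch of a signature check is shown below. The names CRUNCH_MAGIC and looks_like_crunch are hypothetical, and a match on two bytes is only a hint, not proof, that the data is a crunched file.

```python
CRUNCH_MAGIC = bytes([0x76, 0xFE])  # signature bytes given above


def looks_like_crunch(data: bytes) -> bool:
    """Return True if the data starts with the Crunch signature bytes."""
    return data[:2] == CRUNCH_MAGIC


print(looks_like_crunch(b"\x76\xfe\x00\x00"))  # True
print(looks_like_crunch(b"\x76\xff\x00\x00"))  # False
```

In practice the first two bytes would be read from a file opened in binary mode, for example with open(path, "rb").read(2).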

Specifications

  • The file header is described in the text file LZDEF20.DOC shipped with CRUNCH20.LBR.
    • An extracted copy is provided here.
  • crunch.abs - "Technical Abstract" by Steven Greenberg, 16 November 1986
  • crunch.izf → crunch.inf - Collected information about the format

Software

  • CFX (DOS/Unix)
  • lbrate by Russell Marks, c. 2001 (Unix, GPL2)
  • The Unarchiver
  • On CP/M (or emulators):
    • The canonical tools were CRUNCH and UNCR. Greenberg's last version may have been v2.4 (Feb 1988):
      • CRUNCH24.LBR
      • CRNCH24S.LBR (source code)
    • The later LT31 deals with extracting from all of Squeeze, Crunch, CrLZH and LBR formats. Widely available in CP/M archives, e.g. LT31.LBR
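    • crunch12.lbr - Crunch 1.2 - Possible sources: http://gaby.de/ftp/pub/cpm/znode51/pcwworld/u111/user_0/crunch12.lbr, https://www.worldofsam.org/products/fdos-disk-002-file-compressors-and-archivers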
    • crunch20.lbr - Crunch 2.0
    • fcrnch11.lbr - FCRUNCH v1.1 - An improved version of Crunch 2.x, by C.B. Falconer
  • UNCR version "UNCR231" - Crunch v2 decompression source code by Frank Prindle. Package includes a DOS binary.
    • UNCR233 - Based on UNCR231, with modifications by Skip Hansen (source code + DOS binary)

Sample files
