Squeeze
Revision as of 15:16, 1 August 2020
- Not to be confused with Squeeze It, a compression-and-archival format for DOS released around 1992.
Squeeze was a method of compressing single files that was popular on CP/M, devised by Richard (Dick) Greenlaw(?) circa 1981. It was superseded by Crunch and later CrLZH. Squeezed files were common in LBR archives. The format uses Huffman compression.
Squeezed files were signified in CP/M's 8.3 filename format by replacing the middle letter of the extension with Q (.?Q? -- so FOO.TXT became FOO.TQT), with the extension .QQQ used for corner cases such as a blank extension.
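The renaming rule above can be sketched in a few lines of Python. This is only an illustration: the function name `squeezed_name` is hypothetical, and the text does not say what happens to one- or two-character extensions, so the sketch refuses those rather than guessing.

```python
def squeezed_name(name):
    """Return the conventional squeezed filename for an 8.3 name.

    Hypothetical helper: only the two cases described in the text
    (three-letter extension, blank extension) are handled.
    """
    base, dot, ext = name.rpartition(".")
    if not dot:
        # No dot at all: the whole string is the base name.
        base, ext = name, ""
    if ext == "":
        # Blank extension is one of the corner cases that maps to .QQQ.
        return base + ".QQQ"
    if len(ext) == 3:
        # Replace the middle letter of the extension with Q.
        return f"{base}.{ext[0]}Q{ext[2]}"
    # Assumption: behaviour for other extension lengths is not documented here.
    raise ValueError("extension length not covered by this sketch")


print(squeezed_name("FOO.TXT"))
print(squeezed_name("FOO"))
```

The two calls correspond to the FOO.TXT example and the blank-extension corner case.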
The /usr/share/misc/magic file on Linux systems suggests that it was perhaps also in use on the Apple ][ platform, and in fact Binary II files are often found squeezed (as .bqy instead of .bny). Versions for PC/MS-DOS were also in use in the early 1980s before ARC caught on as the dominant archiver.
Format details
Multi-byte integers are little-endian.
Field | Size in bytes | Description |
---|---|---|
signature | 2 | 0x76 0xff |
checksum | 2 | Low 16 bits of the sum of the decompressed byte values. |
original filename | variable | Terminated by a NUL byte. |
compressed data | variable | See below. |
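As a rough illustration of the header layout in the table above, the following Python sketch reads the signature, checksum, and NUL-terminated filename, and computes the checksum over already-decompressed bytes. The names `parse_header` and `checksum16` are hypothetical, and decoding the filename as ASCII is an assumption not stated in the table.

```python
import struct

SIGNATURE = b"\x76\xff"


def parse_header(buf):
    """Return (checksum, filename, offset of the compressed data)."""
    if buf[:2] != SIGNATURE:
        raise ValueError("not a Squeeze file: bad signature")
    # 16-bit little-endian checksum follows the signature.
    (checksum,) = struct.unpack_from("<H", buf, 2)
    # The original filename runs up to the first NUL byte.
    end = buf.index(b"\x00", 4)  # raises ValueError if no NUL is present
    # Assumption: the filename is plain ASCII.
    filename = buf[4:end].decode("ascii", errors="replace")
    return checksum, filename, end + 1


def checksum16(decompressed):
    """Low 16 bits of the sum of the decompressed byte values."""
    return sum(decompressed) & 0xFFFF
```

A reader would compare the stored checksum against `checksum16` of the fully decompressed output to detect corruption.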
The "compressed data" section has the following layout. This part is equivalent to ARC compression method 4.
Field | Size in bytes | Description |
---|---|---|
node_count | 2 | Number of nodes in the table. Valid values are 0 through 256, inclusive. |
node table | 4 × node_count | The encoded Huffman tree. See below. |
data | variable | Huffman-encoded data. Bits are read least-significant bit first. The output of Huffman decoding is still RLE90-compressed. The data is normally terminated by a special "stop" code, although the format can also be used in contexts where the end of the data is determined by other means. |
A table node contains two encoded values. Each is a signed 16-bit integer interpreted as follows:
Encoded value | Meaning |
---|---|
−257 | Stop |
−256 ... −1 | Byte value 255 ... 0 |
0 ... 255 | Pointer to a child node |
Some Squeeze software limits the length of a Huffman code to at most 16 bits.
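The node table and bit order described in this section can be sketched in Python as follows. Several points are assumptions rather than statements from the text: that decoding starts at node 0, that the first value of a node is followed on a 0 bit and the second on a 1 bit, and that a table with zero nodes means empty output. The function names are hypothetical, and the result still needs RLE90 expansion, which is not shown.

```python
import struct

STOP = -257  # encoded value that marks the end of the data


def read_node_table(buf, offset=0):
    """Return (nodes, next_offset); nodes is a list of (value0, value1) pairs."""
    (node_count,) = struct.unpack_from("<H", buf, offset)
    if node_count > 256:
        raise ValueError("node_count out of range")
    offset += 2
    # Each node holds two signed 16-bit little-endian values.
    nodes = [struct.unpack_from("<hh", buf, offset + 4 * i)
             for i in range(node_count)]
    return nodes, offset + 4 * node_count


def huffman_decode(nodes, data):
    """Walk the tree bit by bit; the output is still RLE90-compressed."""
    out = bytearray()
    if not nodes:
        return bytes(out)  # assumption: empty table means empty output
    cur = 0  # assumption: the root is node 0
    for byte in data:
        for bit_index in range(8):  # least-significant bit first
            bit = (byte >> bit_index) & 1
            value = nodes[cur][bit]  # assumption: first value is the 0 branch
            if value >= 0:
                if value >= len(nodes):
                    raise ValueError("child pointer out of range")
                cur = value
            elif value == STOP:
                return bytes(out)
            elif value >= -256:
                out.append(-(value + 1))  # -1 -> byte 0, -256 -> byte 255
                cur = 0
            else:
                raise ValueError("invalid encoded value")
    return bytes(out)  # input ended without a stop code
```

Returning the partial output when the input runs out reflects the note that the data may be terminated by means other than the stop code; a stricter reader could raise an error there instead.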
Identification
Files begin with bytes 76 ff.
Tools
References
- See the SQUSQ directory on CP/M archives for various source code and documentation (much of it, unfortunately, itself squeezed/crunched).
- The file header follows a similar/compatible structure to Crunch and CrLZH.
- Wikipedia:SQ (program)