DIET (compression)

From Just Solve the File Format Problem
(Difference between revisions)
(Added sample files)
(Software)
 
(10 intermediate revisions by 2 users not shown)
Line 3: Line 3:
 
|subcat=Compression
 
|subcat=Compression
 
|subcat2=Executable compression
 
|subcat2=Executable compression
−
|released=1991
+
|released=1990<!-- The first non-"test" version might have been 1991, but test versions are not clearly marked as such. -->
 
}}
 
}}
−
'''DIET''' is an executable compression and file compression utility for DOS, developed by Teddy Matsumoto. It does executable compression of [[MS-DOS EXE|EXE]] and [[DOS executable (.com)|COM]] files.
+
'''DIET''' is an executable compression and file compression utility for DOS, developed by Teddy Matsumoto. It does executable compression of [[MS-DOS EXE|EXE]] files (to EXE) and [[DOS executable (.com)|COM]] files (to EXE or COM).
  
 
It can also compress arbitrary data files. Such files can be transparently decompressed by DIET's TSR utility.
 
It can also compress arbitrary data files. Such files can be transparently decompressed by DIET's TSR utility.
  
−
Both types of files can be decompressed using the <code>-ra</code> option.
+
Both types of files can be decompressed using the <code>-RA</code> option.
 +
 
 +
== Technical notes ==
 +
Researchers should note that DIET's behavior depends on the cluster size of the relevant filesystem. Use the <code>-B</code> option (introduced in v1.10a) to turn off this feature, or else DIET will probably decide not to compress most of your files.
 +
 
 +
== Format details ==
 +
Roughly speaking, the known versions of DIET can be grouped into three format "eras": 1.00-1.00d, 1.02b-1.20, and 1.44-1.45f. Multiplied by the three file types (EXE, COM, data), that makes about 9 different DIET file formats.
 +
 
 +
Most of the formats contain a common 11-byte header preceding the compressed data:
 +
 
 +
{| class="wikitable"
 +
! Offset !! Size !! Description
 +
|-
 +
| +0 || 3 || Signature: ASCII "{{magic|dlz}}"
 +
|-
 +
| +3 || 1 || Flags, and high 4 bits of compressed size
 +
|-
 +
| +4 || 2 || Low 16 bits of compressed size
 +
|-
 +
| +6 || 2 || [[CRC-16#CRC-16/ARC|CRC-16/ARC]] of compressed data
 +
|-
 +
| +8 || 1 || High 6 bits of original size
 +
|-
 +
| +9 || 2 || Low 16 bits of original size
 +
|}
 +
 
 +
There is also a two-byte signature, {{magic|0x9d 0x89}}, that appears in most of the formats.
  
 
== Identification ==
 
== Identification ==
−
Compressed data files apparently start with bytes {{magic|b4 4c cd 21 9d 89 64 6c 7a}}.
+
For what it's worth, the newer versions of DIET detect compressed files by searching for the byte sequence {{magic|0x9d 0x89}}, and ASCII "{{magic|dlz}}", in the first 126 bytes of the file. Both must appear, in that order. This works for the newer formats, but not for all of the older ones.
  
−
EXE files most likely have ASCII "{{magic|diet}}" at offset 28.
+
=== Identification of EXE files ===
 +
Below are some version-specific characteristics of DIET-compressed EXE files.
 +
 
 +
Some DIET-compressed EXE files have {{magic|9d 89}} in the EXE checksum field at offset 18 (refer to [[MS-DOS EXE#Header structure]]), and some have ASCII "{{magic|diet}}" in the unused bytes at offset 28. These signatures might be less reliable than other means of identifying DIET format, as they could be modified.
 +
 
 +
Also, be aware of [[LGLZ]] format, which can be mistaken for DIET.
 +
 
 +
Let "<code>8e db 8e...</code>" be the byte sequence {{magic|8e db 8e c0 33 f6 33 ff b9 08 00 f3 a5 4b 48 4a}}.
 +
 
 +
v1.00-1.00d:
 +
* <code>03 00</code> at offset 20 (the IP register)
 +
* <code>8e db 8e...</code> at offset 55
 +
 
 +
v1.02b-1.20
 +
* {{magic|9d 89}} at offset 18
 +
* <code>8e db 8e...</code> at offset 52
 +
* "{{magic|dlz}}" at offset 87
 +
 
 +
v1.44
 +
* {{magic|9d 89}} at offset 18
 +
* "{{magic|diet}}" at offset 28
 +
* <code>8e db 8e...</code> at offset 72
 +
* "{{magic|dlz}}" at offset 107
 +
 
 +
v1.45f
 +
* {{magic|9d 89}} at offset 18
 +
* "{{magic|diet}}" at offset 28
 +
* <code>8e db 8e...</code> at offset 77
 +
* "{{magic|dlz}}" at offset 108
 +
 
 +
=== Identification of COM files ===
 +
v1.00-1.00d: Files start with {{magic|bf}}, and have {{magic|fd f3 a5 fc 8b f7 bf 00}} at offset 17. Note: The CRC field is at offset 35, and the compressed data starts at offset 37.
 +
 
 +
v1.02b-1.20: Files start with {{magic|be}}, have {{magic|fd f3 a5 fc 8b f7 bf 00}} at offset 17, and {{magic|'d' 'l' 'z'}} at offset 35.
 +
 
 +
v1.44-1.45f: Files start with {{magic|f9}}, have {{magic|9d 89}} at offset 10, and {{magic|'d' 'l' 'z'}} at offset 65.
 +
 
 +
=== Identification of data files ===
 +
v1.00-1.00d: Files start with bytes {{magic|b4 4c cd 21 9d 89}}. Note: The CRC field is at offset 6, and the compressed data starts at offset 8.
 +
 
 +
v1.02b-1.20: Files start with bytes {{magic|9d 89 'd' 'l' 'z'}}.
 +
 
 +
v1.44-1.45f: Files start with bytes {{magic|b4 4c cd 21 9d 89 'd' 'l' 'z'}}.
 +
 
 +
== See also ==
 +
* [[LEXEM]]
 +
* [[LGLZ]]
 +
* [[PACK (NoddegamrA)]]
 +
 
 +
== Specifications ==
 +
* [https://archive.org/details/msdos_shareware_fb_DIET102B DIET v1.02b] → DIETTECH.DOC [Possibly an unfinished draft -- lots of errors.]
  
 
== Software ==
 
== Software ==
 +
DIET:
 +
 +
* {{CdTextfiles|swextrav1993/disk3/assorted/diet10.zip|v1.00}}
 +
* v1.00d: [https://archive.org/details/tekno-6-1998 TEKNO 6-1998] → DEMOER/GRAVITY3/PC_4KB/SPARK.ZIP → SRC_NORM.ZIP → DIET.EXE
 +
* [https://archive.org/details/msdos_shareware_fb_DIET102B v1.02b]
 +
* {{CdTextfiles|microhaus/mhblackbox3/ARCHIVER/DIET110A.ZIP|v1.10a}}
 +
* {{CdTextfiles|garbo/PC/EXECOMP/DIET120.ZIP|v1.20}}
 +
* {{CdTextfiles|ftp.wwiv.com/pub/COMPRESS/DIET144.ZIP|v1.44}}
 +
* {{CdTextfiles|pdos9606/ARCHIVER/EXECOMP/DIET145F.ZIP|v1.45f}}
 +
* {{OldskoolDOSEXE}} → Executable Tools Pack → packers/diet.*
 +
* [http://old-dos.ru/index.php?page=files&mode=files&do=show&id=141 Various versions at old-dos.ru]
 +
 +
Decompression:
 +
 
* DIET
 
* DIET
−
** {{CdTextfiles|garbo/PC/EXECOMP/DIET120.ZIP|v1.20}}
+
* {{Deark}}
−
** {{CdTextfiles|ftp.wwiv.com/pub/COMPRESS/DIET144.ZIP|v1.44}}
+
* For other utilities that may decompress DIET-compressed executables, see [[Executable compression#Decompression software]].
−
** {{CdTextfiles|pdos9606/ARCHIVER/EXECOMP/DIET145F.ZIP|v1.45f}}
+
  
 
== Sample files ==
 
== Sample files ==
−
* https://telparia.com/fileFormatSamples/archive/diet/
+
* {{DexvertSamples|archive/diet}}

Latest revision as of 17:51, 29 December 2023

File Format
Name DIET (compression)
Ontology
Released 1990

DIET is an executable compression and file compression utility for DOS, developed by Teddy Matsumoto. It compresses EXE files (to EXE) and COM files (to EXE or COM).

It can also compress arbitrary data files. Such files can be transparently decompressed by DIET's TSR utility.

Both types of files (compressed executables and compressed data files) can be decompressed using the -RA option.

== Technical notes ==

Researchers should note that DIET's behavior depends on the cluster size of the relevant filesystem. Use the -B option (introduced in v1.10a) to turn this feature off; otherwise DIET will probably decline to compress most of your files.

== Format details ==

Roughly speaking, the known versions of DIET can be grouped into three format "eras": 1.00-1.00d, 1.02b-1.20, and 1.44-1.45f. Multiplied by the three file types (EXE, COM, data), that makes about 9 different DIET file formats.

Most of the formats contain a common 11-byte header preceding the compressed data:

{| class="wikitable"
! Offset !! Size !! Description
|-
| +0 || 3 || Signature: ASCII "dlz"
|-
| +3 || 1 || Flags, and high 4 bits of compressed size
|-
| +4 || 2 || Low 16 bits of compressed size
|-
| +6 || 2 || CRC-16/ARC of compressed data
|-
| +8 || 1 || High 6 bits of original size
|-
| +9 || 2 || Low 16 bits of original size
|}

There is also a two-byte signature, 0x9d 0x89, that appears in most of the formats.
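
The header layout above can be sketched in Python as follows. This is only an illustration, not a reference implementation. The table does not say which bits of the bytes at +3 and +8 hold the size bits, nor the byte order of the 16-bit fields, so the sketch assumes little-endian fields, the size bits in the low nibble of +3, and the size bits in the upper six bits of +8. The names <code>DietHeader</code>, <code>parse_diet_header</code> and <code>crc16_arc</code> are made up for this example.

```python
import struct
from typing import NamedTuple


class DietHeader(NamedTuple):
    flags: int
    compressed_size: int
    crc: int
    original_size: int


def crc16_arc(data: bytes) -> int:
    """CRC-16/ARC: reflected polynomial 0xA001, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def parse_diet_header(buf: bytes, pos: int = 0) -> DietHeader:
    """Parse the 11-byte header that starts with ASCII "dlz" at buf[pos]."""
    if len(buf) < pos + 11 or buf[pos:pos + 3] != b"dlz":
        raise ValueError("no 'dlz' header at this position")
    # ASSUMPTION: multi-byte fields are little-endian.
    b3, cmpr_lo, crc, b8, orig_lo = struct.unpack_from("<BHHBH", buf, pos + 3)
    return DietHeader(
        # ASSUMPTION: flags are the high nibble of +3, size bits the low nibble.
        flags=b3 & 0xF0,
        compressed_size=((b3 & 0x0F) << 16) | cmpr_lo,
        crc=crc,
        # ASSUMPTION: the six size bits are the upper six bits of +8.
        original_size=((b8 >> 2) << 16) | orig_lo,
    )
```

Under these assumptions, a file could be sanity-checked by comparing <code>crc16_arc()</code> of the <code>compressed_size</code> bytes that follow the header against the stored <code>crc</code> field.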

== Identification ==

For what it's worth, the newer versions of DIET detect compressed files by searching the first 126 bytes of the file for the byte sequence 0x9d 0x89 and for ASCII "dlz". Both must appear, in that order. This works for the newer formats, but not for all of the older ones.
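
A rough Python approximation of that check might look like the sketch below. It is not DIET's actual code. The function name <code>looks_like_diet</code> is made up, and the sketch assumes that both signatures must lie entirely within the first 126 bytes, which the description above does not spell out.

```python
def looks_like_diet(data: bytes, window: int = 126) -> bool:
    """Approximate the detection heuristic used by newer DIET versions."""
    # ASSUMPTION: both signatures must fit entirely inside the window.
    head = data[:window]
    sig = head.find(b"\x9d\x89")
    if sig < 0:
        return False
    # "dlz" must appear somewhere after the two-byte signature.
    return head.find(b"dlz", sig + 2) >= 0
```

A file that contains only the two-byte signature and no "dlz" is rejected by this check, which fits the remark that the method misses some of the older formats.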

=== Identification of EXE files ===

Below are some version-specific characteristics of DIET-compressed EXE files.

Some DIET-compressed EXE files have 9d 89 in the EXE checksum field at offset 18 (refer to MS-DOS EXE#Header structure), and some have ASCII "diet" in the unused bytes at offset 28. These signatures might be less reliable than other means of identifying DIET format, as they could be modified.

Also, be aware of LGLZ format, which can be mistaken for DIET.

Let "8e db 8e..." be the byte sequence 8e db 8e c0 33 f6 33 ff b9 08 00 f3 a5 4b 48 4a.

v1.00-1.00d:

  • 03 00 at offset 20 (the IP register)
  • 8e db 8e... at offset 55

v1.02b-1.20:

  • 9d 89 at offset 18
  • 8e db 8e... at offset 52
  • "dlz" at offset 87

v1.44:

  • 9d 89 at offset 18
  • "diet" at offset 28
  • 8e db 8e... at offset 72
  • "dlz" at offset 107

v1.45f:

  • 9d 89 at offset 18
  • "diet" at offset 28
  • 8e db 8e... at offset 77
  • "dlz" at offset 108
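
The per-version characteristics above can be turned into a small lookup table. The Python sketch below simply checks the listed byte patterns at the listed offsets. The names <code>EXE_RULES</code> and <code>identify_diet_exe</code> are made up for this example, and the sketch does not verify that the file is a valid EXE or rule out LGLZ.

```python
from typing import Optional

# The 16-byte sequence abbreviated as "8e db 8e..." in the text.
STUB = bytes.fromhex("8e db 8e c0 33 f6 33 ff b9 08 00 f3 a5 4b 48 4a")

# (version label, [(offset, expected bytes), ...]) taken from the lists above.
EXE_RULES = [
    ("1.00-1.00d", [(20, b"\x03\x00"), (55, STUB)]),
    ("1.02b-1.20", [(18, b"\x9d\x89"), (52, STUB), (87, b"dlz")]),
    ("1.44", [(18, b"\x9d\x89"), (28, b"diet"), (72, STUB), (107, b"dlz")]),
    ("1.45f", [(18, b"\x9d\x89"), (28, b"diet"), (77, STUB), (108, b"dlz")]),
]


def identify_diet_exe(data: bytes) -> Optional[str]:
    """Return the first version label whose patterns all match, else None."""
    for label, checks in EXE_RULES:
        if all(data[off:off + len(pat)] == pat for off, pat in checks):
            return label
    return None
```

Because the signatures at offsets 18 and 28 can be modified, a more tolerant variant might match on the <code>8e db 8e...</code> sequence and "dlz" alone.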

=== Identification of COM files ===

v1.00-1.00d: Files start with bf, and have fd f3 a5 fc 8b f7 bf 00 at offset 17. Note: The CRC field is at offset 35, and the compressed data starts at offset 37.

v1.02b-1.20: Files start with be, have fd f3 a5 fc 8b f7 bf 00 at offset 17, and 'd' 'l' 'z' at offset 35.

v1.44-1.45f: Files start with f9, have 9d 89 at offset 10, and 'd' 'l' 'z' at offset 65.
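
As a sketch, the three COM rules above could be checked like this in Python. The function name <code>identify_diet_com</code> and the constant <code>COM_PATTERN</code> are made up for this example.

```python
from typing import Optional

# Byte sequence found at offset 17 in the two older COM formats.
COM_PATTERN = bytes.fromhex("fd f3 a5 fc 8b f7 bf 00")


def identify_diet_com(data: bytes) -> Optional[str]:
    """Return the DIET version range suggested by a COM file's bytes, or None."""
    def at(offset: int, pattern: bytes) -> bool:
        return data[offset:offset + len(pattern)] == pattern

    if at(0, b"\xf9") and at(10, b"\x9d\x89") and at(65, b"dlz"):
        return "1.44-1.45f"
    if at(0, b"\xbe") and at(17, COM_PATTERN) and at(35, b"dlz"):
        return "1.02b-1.20"
    if at(0, b"\xbf") and at(17, COM_PATTERN):
        return "1.00-1.00d"
    return None
```

The v1.00-1.00d rule is the weakest of the three, since those files contain no "dlz" signature to confirm the match.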

=== Identification of data files ===

v1.00-1.00d: Files start with bytes b4 4c cd 21 9d 89. Note: The CRC field is at offset 6, and the compressed data starts at offset 8.

v1.02b-1.20: Files start with bytes 9d 89 'd' 'l' 'z'.

v1.44-1.45f: Files start with bytes b4 4c cd 21 9d 89 'd' 'l' 'z'.
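
The data-file rules above can be sketched in Python as follows. The function name <code>identify_diet_data</code> is made up for this example. For the two newer formats, the returned data offset assumes that the compressed data begins immediately after the 11-byte "dlz" header.

```python
from typing import Optional, Tuple


def identify_diet_data(data: bytes) -> Optional[Tuple[str, int]]:
    """Return (version range, offset of compressed data), or None."""
    # The longest prefix is tested first, because the v1.44-1.45f prefix
    # begins with the same six bytes as the v1.00-1.00d prefix.
    if data.startswith(b"\xb4\x4c\xcd\x21\x9d\x89dlz"):
        return ("1.44-1.45f", 6 + 11)  # ASSUMPTION: data follows the header
    if data.startswith(b"\x9d\x89dlz"):
        return ("1.02b-1.20", 2 + 11)  # ASSUMPTION: data follows the header
    if data.startswith(b"\xb4\x4c\xcd\x21\x9d\x89"):
        return ("1.00-1.00d", 8)  # CRC field at offset 6, data at offset 8
    return None
```

A v1.00-1.00d file whose CRC field and first data byte happen to spell "dlz" would be misreported as v1.44-1.45f by this ordering. That should be rare, but the signatures alone cannot rule it out.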

== See also ==

  • LEXEM
  • LGLZ
  • PACK (NoddegamrA)

== Specifications ==

  • DIET v1.02b → DIETTECH.DOC [Possibly an unfinished draft -- lots of errors.]

== Software ==

DIET:

Decompression:

== Sample files ==
