BinHex

From Just Solve the File Format Problem
Revision as of 04:07, 10 December 2012 by Dan Tobias (Talk | contribs)

File Format
Name BinHex
Ontology
Extension(s) .hqx, .hcx, .hex
MIME Type(s) application/mac-binhex40, application/mac-binhex, application/binhex
Released 1981


BinHex has a somewhat convoluted history. It began as a built-in protocol in a TRS-80 terminal program, which encoded binary files as hexadecimal numbers written in printable ASCII characters for safe transmission to online services that couldn't handle other characters. Tim Mann then implemented it as a standalone program. The file extension .hex was normally used for files in this format. CompuServe in particular had a problem with bytes outside the 7-bit character range until the mid-1980s, so BinHex was popular for binaries on that service. Macintosh users had the same problem with binaries as TRS-80 users did, so William Davis ported BinHex to the Mac in 1984, writing it in BASIC. This port went through a number of versions, and by version 3 it could encode both the data and resource forks of the original file in a single download file, taking care of a particular Mac issue. The BASIC program was very slow, however. At that point yet another author, Yves Lempereur, implemented a much faster assembly-language version. This was labeled "BinHex 1.0" despite all the earlier BinHex versions that had been written by others.

Up to this point, all versions of BinHex used the same basic format (though later Mac versions had special coding to deal with resource forks) and the .hex extension. Version 2.0 introduced a new encoding that used more characters than the 16 hexadecimal digits of the earlier versions, and so could carry 6 bits in every (1-byte) character instead of 4. This made the files smaller (though still bigger than the original binaries; this was not a file compression format). Because the new format was incompatible with the earlier BinHex format, a new extension, .hcx, was used. However, this version had problems of its own: some of the characters chosen for the encoding were altered in transit by internationalized e-mail servers attempting localization. BinHex was, by this time, often used for e-mail attachments (mail programs of those days were not yet intelligent enough to apply proper encoding to binary data by themselves, as they are now).
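
The size difference between 4 bits and 6 bits per character can be worked out with a little arithmetic. The following Python sketch is only an illustration: the helper name encoded_length is made up for this example, and it counts payload characters only, ignoring line breaks, headers, checksums and anything else a real BinHex file adds.

```python
def encoded_length(n_bytes, bits_per_char):
    """Characters needed to carry n_bytes when each character holds bits_per_char bits.

    Hypothetical helper for illustration; counts payload characters only.
    """
    total_bits = n_bytes * 8
    # Integer ceiling division: a partly filled final character still takes a byte.
    return (total_bits + bits_per_char - 1) // bits_per_char


original = 3000  # size of an example binary, in bytes

hex_size = encoded_length(original, 4)     # hexadecimal style: 4 bits per character
sixbit_size = encoded_length(original, 6)  # 64-character style: 6 bits per character

print(hex_size, sixbit_size)
```

Under these assumptions the hexadecimal form is twice the size of the original, while the 6-bit form is about one third larger.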

To deal with this problem, yet another version was released, called 4.0 because the author belatedly realized that other version numbers, including 3.0, had already been used for a different author's BinHex. This one used a different set of encoding characters, safer for transmission, and adopted the file extension .hqx, which became very popular for Mac binary uploads and e-mail attachments on online services and the Internet. By then CompuServe had improved its protocols to handle raw binary files, so BinHex was less necessary there, but users still liked the ability to combine data and resource forks in one file (though other methods such as MacBinary were devised for this purpose).

BinHex 5.0 used MacBinary to combine the forks of a file before encoding them in the BinHex characters, but this was not much used. Users either moved to straight MacBinary or stuck with the old, familiar BinHex 4.0, and the latter remained in use well into the 1990s.

Format

Starting with the move to the .hcx format and continuing with .hqx, the format was no longer hexadecimal despite its name; the encoding, which uses 64 different characters, is very similar to Base64, though with a different set of characters.

Files are entirely in 7-bit ASCII, with this first line:

(This file must be converted with BinHex 4.0)

What follows (after a blank line) is a block of characters from the set:

!"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr

which are used in that order as the digits of a base-64 representation of the binary data. The data has three parts, one after another: a header with file metadata, the data fork, and the resource fork. A two-byte CRC checksum is appended to each part. The entire data block begins and ends with a colon (:). Lines are broken with a CR every 64 characters. (This might turn into CR+LF or LF if the file is transferred across diverse systems.)
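
The character-to-bits mapping described above can be sketched in Python. This is a minimal illustration, not a complete BinHex 4.0 reader: the function names sixbit_encode and sixbit_decode are invented for this example, and the bit order (most significant bits first, as in Base64) is an assumption. The sketch does not split the result into header, data fork and resource fork, and does not check the CRCs. Descriptions of BinHex 4.0 elsewhere (for example RFC 1741) also mention a run-length step applied to the bytes before this conversion; that step is outside this sketch.

```python
# The 64 characters listed above; a character's position is its 6-bit value.
ALPHABET = "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr"
assert len(ALPHABET) == 64


def sixbit_encode(data):
    """Turn bytes into characters, 6 bits per character (assumed MSB first)."""
    acc = 0      # bit accumulator
    nbits = 0    # number of bits currently held in acc
    chars = []
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 6:
            nbits -= 6
            chars.append(ALPHABET[(acc >> nbits) & 0x3F])
            acc &= (1 << nbits) - 1
    if nbits:
        # Pad the final partial group with zero bits.
        chars.append(ALPHABET[(acc << (6 - nbits)) & 0x3F])
    return "".join(chars)


def sixbit_decode(block):
    """Turn a character block back into bytes.

    Colons and line breaks (CR, LF) are skipped, so the block may be passed
    with its delimiters and line wrapping intact.
    """
    acc = 0
    nbits = 0
    out = bytearray()
    for ch in block:
        if ch in ":\r\n":
            continue
        acc = (acc << 6) | ALPHABET.index(ch)  # ValueError on a foreign character
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1
    return bytes(out)  # leftover padding bits are discarded


# Round trip on sample bytes, wrapped in colons like a real data block.
sample = bytes(range(10))
block = ":" + sixbit_encode(sample) + ":"
assert sixbit_decode(block) == sample
```

Looking each character up with ALPHABET.index keeps the sketch short; a real decoder would more likely build a lookup table once.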

File extension

The most-used version of BinHex uses the .hqx extension, as noted above. The seldom-encountered earlier ones used .hex and .hcx. The .hqx extension is rather quirky for "data archeologists" because it clashes with a different convention: files compressed with the Squeeze protocol have the middle letter of their extension replaced with "q". Thus, one would expect a .hqx file to be a .hex file that has been squeezed, but this is wrong; the .hqx format does not in fact shrink the file (the result is somewhat larger than the original rather than smaller). It's just another one of the "gotchas" that plague people who delve into old computer archives. Just what one would use as a file extension for a .hqx file that's been run through Squeeze remains unknown.
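
The naming clash can be shown with a short Python sketch. It models only the convention as described above (middle letter of a three-letter extension replaced with "q"); the function name squeezed_extension is hypothetical and is not taken from any Squeeze tool.

```python
def squeezed_extension(ext):
    """Apply the Squeeze naming convention described above to a 3-letter extension.

    Hypothetical helper for illustration only.
    """
    if len(ext) != 3:
        raise ValueError("this sketch only handles three-letter extensions")
    return ext[0] + "q" + ext[2]


print(squeezed_extension("hex"))  # a squeezed .hex file collides with BinHex 4.0
print(squeezed_extension("hqx"))  # squeezing a .hqx file leaves the name unchanged
```

Because the convention maps both "hex" and "hqx" to "hqx", the extension alone cannot tell the three cases apart.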
