BinHex

BinHex is a family of formats used as binary-to-text transfer encodings and/or archive formats (to combine data and resource forks into a single file). It is mainly associated with Macintosh computers.

History
BinHex has a somewhat convoluted history. It began as a built-in protocol in a TRS-80 terminal program, which encoded binary files as hexadecimal numbers written in printable ASCII characters so that they could be sent safely to online services that couldn't handle other characters. Tim Mann then implemented it as a standalone program. The file extension .hex was normally used for files in this format. CompuServe in particular had a problem with bytes outside the 7-bit character range until the mid-1980s, so BinHex was popular for binaries on that service. Macintosh users had the same problem with binaries as TRS-80 users did, so in 1984 William Davis ported BinHex to the Mac, writing it in BASIC. The port went through a number of versions, and by version 3 it could encode both the data and resource forks of the original file in a single download file, which took care of a Mac-specific issue. The BASIC program was very slow, however, so another author, Yves Lempereur, wrote an assembly-language version that was much faster. This was labeled "BinHex 1.0" despite all the earlier BinHex versions that had been written by others.

Up to this point, all of the BinHex versions used the same basic format (though later Mac versions had special coding to deal with resource forks) and the .hex extension. Version 2.0 introduced a new encoding that used more characters than the 16 hexadecimal digits of the earlier versions, and hence could encode 6 bits in every (1-byte) character instead of 4. This made the files smaller, though still bigger than the original binaries; it was not a file compression format. Because the new format was incompatible with the earlier BinHex format, a new extension, .hcx, was used. This version had problems of its own, however: some of the characters chosen for the encoding were altered when sent by e-mail, because different internationalized servers changed them in an attempt at localization. By this time BinHex was often used for e-mail attachments, since mail programs of those days did not yet apply a proper encoding to binary data by themselves as they do now.

To deal with this problem, yet another version was released, called 4.0 because the author belatedly realized that other version numbers, including 3.0, had been used for a different author's BinHex. This one used a different set of encoding characters, safer for transmission, and adopted the file extension .hqx, which became very popular for Mac binary uploads and e-mail attachments on online services and the Internet. By then CompuServe had improved its protocols to handle raw binary files, so BinHex was less useful there, but users still liked the ability to combine data and resource forks in one file (though other methods such as MacBinary were devised for this purpose).

BinHex 5.0 used MacBinary to combine the forks of a file before encoding them in the BinHex characters, but this was not much used. Users either moved to straight MacBinary or stuck with the old, familiar BinHex 4.0, and the latter remained in use well into the 1990s.

Format details
Starting with the move to the .hcx format and continuing with .hqx, the format was no longer in hexadecimal despite its name. The encoding uses 64 different characters and is very similar to Base64, though with a different set of characters.
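The difference in density between the hexadecimal and 64-character encodings can be illustrated with a little arithmetic. The Python sketch below is only an approximation: the function names are made up for this example, and it counts encoded characters only, ignoring line breaks, the surrounding colons, headers, checksums and any run-length compression.

```python
def hex_encoded_length(n_bytes: int) -> int:
    """Characters needed when each character carries 4 bits (two hex digits per byte)."""
    return 2 * n_bytes


def six_bit_encoded_length(n_bytes: int) -> int:
    """Characters needed when each character carries 6 bits.

    Three bytes (24 bits) fit exactly into four characters; a trailing
    partial group is rounded up to a whole character.
    """
    return (n_bytes * 8 + 5) // 6


# Compare the two encodings for a few hypothetical file sizes.
for size in (3, 300, 30000):
    print(size, hex_encoded_length(size), six_bit_encoded_length(size))
```

Under these assumptions a 300-byte file needs 600 characters in the hexadecimal form but only 400 in the 6-bit form, which is why the later files were smaller yet still larger than the original binary.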

Version 4.0
The BinHex part of the files is encoded as 7-bit ASCII. It starts with the following line of text. Any text before this line is to be ignored.

(This file must be converted with BinHex 4.0)

What follows (after a blank line) is a block of characters from the set:

!"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr

which are used in that order as the digits of a base-64 representation of the binary data. There are three parts, one after another, the first being a header with file metadata, the second being the data fork, and the third the resource fork. A two-byte CRC checksum is added to each. The entire data block begins and ends with a colon. Lines are separated with CR every 64 characters. (This might turn into CR+LF or LF if the file is transferred across diverse systems.)
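The layout described above can be sketched in Python as follows. This is a minimal illustration, not a complete decoder: the names ALPHABET, extract_payload and decode_six_bit are invented for this example, the alphabet is written out in full (including the two parentheses, which make up the count of 64), and the sketch assumes that the first character of each group holds the most significant bits, as in Base64. It stops at the raw byte stream and does not expand run-length compression, split the three parts or check the CRCs.

```python
# The 64 "digits" of BinHex 4.0; a character's index is its 6-bit value.
ALPHABET = (
    "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMN"
    "PQRSTUVXYZ[`abcdefhijklmpqr"
)
assert len(ALPHABET) == 64
VALUE_OF = {ch: i for i, ch in enumerate(ALPHABET)}

BANNER = "(This file must be converted with BinHex 4.0)"


def extract_payload(text: str) -> str:
    """Return the encoded characters between the opening and closing colons."""
    # Anything before the banner line is ignored.
    start = text.index(BANNER) + len(BANNER)
    first = text.index(":", start)
    # The colon is not in the alphabet, so the next one closes the block.
    last = text.index(":", first + 1)
    body = text[first + 1:last]
    # Remove line breaks (CR, LF or CR+LF) and any other whitespace.
    return "".join(body.split())


def decode_six_bit(payload: str) -> bytes:
    """Pack 6-bit values into bytes, most significant bits first."""
    out = bytearray()
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    for ch in payload:
        acc = (acc << 6) | VALUE_OF[ch]   # KeyError on an invalid character
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1       # keep only the leftover bits
    return bytes(out)                      # leftover bits (< 8) are dropped
```

Because the accumulator is independent of line length, the same routine works whether the line breaks arrived as CR, LF or CR+LF.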

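Before the 6-bit encoding is applied, BinHex 4.0 compresses runs of repeated bytes with the RLE90 scheme, a simple run-length encoding that uses the byte 0x90 as its marker.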
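A minimal sketch of RLE90 expansion in Python is shown below, following the scheme as it is commonly described (for example in RFC 1741): a byte followed by 0x90 and a count stands for that byte repeated count times in total, and 0x90 followed by a zero count stands for a single literal 0x90. The function name rle90_expand is invented for this example, and the input is assumed to be the byte stream obtained after the 6-bit characters have been decoded.

```python
RLE_MARKER = 0x90


def rle90_expand(data: bytes) -> bytes:
    """Expand RLE90 run-length compression (sketch; see the assumptions above)."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        i += 1
        if byte != RLE_MARKER:
            out.append(byte)          # ordinary byte, copied as-is
            continue
        if i >= len(data):
            raise ValueError("RLE marker at end of input")
        count = data[i]
        i += 1
        if count == 0:
            out.append(RLE_MARKER)    # 0x90 0x00 is a literal 0x90
        else:
            if not out:
                raise ValueError("run marker with no preceding byte")
            # count is the total run length; one copy is already in out.
            out.extend(out[-1:] * (count - 1))
    return bytes(out)


# Example sequences of the kind given in RFC 1741.
print(rle90_expand(bytes.fromhex("1122900633")).hex())
print(rle90_expand(bytes.fromhex("112290003344")).hex())
```

A decoder would run this step after the 6-bit decoding and before splitting the stream into header, data fork and resource fork.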

File extension
The most-used version of BinHex uses the .hqx extension as noted above. Seldom-encountered earlier ones used .hex and .hcx. The .hqx extension is rather quirky for "data archeologists" because it clashes with a different convention, that of files compressed with the Squeeze protocol using file extensions with the middle letter replaced with "q". Thus, one would expect a .hqx file to be a .hex file that has been squeezed, but this is wrong. It's just another one of the "gotchas" that plague people who delve into old computer archives. Just what one would use as a file extension for a .hqx file that's been run through Squeeze remains unknown.

Programs and utilities

 * BinHex Perl library
 * Online BinHex encoder/decoder
 * UUDeview
 * macutil → hexbin

Sample files

 * http://libxad.cvs.sourceforge.net/viewvc/libxad/testfiles/ASCII/hqx/
 * http://cd.textfiles.com/carousel344/MACTOSH/ → LANG/, TECH/, ...
 * https://telparia.com/fileFormatSamples/archive/binHex/