FWKCS

FWKCS is a contents signature system for files, released by Frederick W. Kantor in 1989 and developed at least through 1993. Its primary intended user was a computer bulletin board system (BBS) sysop wishing to check for duplicates among files, including ones already on the system, new ones being uploaded, and ones available through networks from other systems. By calculating a hash value for each file (including compressed data within ZIP files) and checking it against stored values from previous file scans, the system could determine with a high degree of probability that a given file matched another one, without needing to compare every byte. The check is probabilistic rather than exact: "hash collisions", in which different files yield the same hash value, must exist because the full data has many more bits than the hash and hence many more possible values. A well-chosen hash algorithm will nonetheless rarely produce collisions in actual data.

MD5 hashing was used.
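
As a rough illustration of signature-based duplicate detection, the sketch below groups files whose contents yield the same signature. It is not FWKCS's own code: the names `signature` and `find_duplicates` are hypothetical, `hashlib.md5` is used only because this article names MD5, and pairing the digest with the file length is an assumption made for the example. Any other hash function could be substituted.

```python
import hashlib
from collections import defaultdict


def signature(data: bytes) -> tuple[str, int]:
    """Return (hex digest, length) for a blob of file contents.

    Assumption: digest plus length stands in for a "contents signature";
    the real FWKCS signature format is not reproduced here.
    """
    return hashlib.md5(data).hexdigest(), len(data)


def find_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents share a signature."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[signature(data)].append(name)
    # Only groups with more than one member are probable duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical uploads: two identical files and one different file.
incoming = {
    "A.TXT": b"hello",
    "B.TXT": b"hello",
    "C.TXT": b"world",
}
print(find_duplicates(incoming))
```

A match here means "probably identical"; a byte-by-byte comparison of the matched files would be needed to rule out a collision.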

The program gained some notoriety in the 2010s when it was cited as prior art to invalidate a group of patents regarding the use of file signature values in web platforms.

FWKCS was released for free download and installation, but users who continued to use it beyond a trial period were expected to pay for a license.

Files used
In addition to the various (PC/MS-DOS) executables and batch files used for the program itself, FWKCS uses files with .NDX and .SRT extensions to store data on the files being checked and indexed. CSLIST.SRT has a list of file paths, alphabetically arranged, in ASCII form, one path per line. CSLIST1.SRT has filenames (without paths) and hash values, sorted by the hash. CSLIST.NDX and CSLIST1.NDX are binary files with index data.
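
Because CSLIST1.SRT is sorted by hash, entries with a given hash can be located by binary search. The following is a minimal sketch of that idea, not a description of how FWKCS itself performs lookups (the role of the .NDX index files is not covered here). The function name `find_by_signature` is hypothetical, and the sample lines are written out by hand to follow the column layout described in this article.

```python
import bisect


def find_by_signature(sorted_lines: list[str], sig_hex: str) -> list[str]:
    """Return all lines whose first 8 characters equal sig_hex.

    Assumes the lines are sorted by their 8-digit uppercase hex prefix;
    fixed-width uppercase hex sorts the same as text and as numbers.
    """
    keys = [line[:8] for line in sorted_lines]  # built eagerly for simplicity
    lo = bisect.bisect_left(keys, sig_hex)
    hi = bisect.bisect_right(keys, sig_hex)
    return sorted_lines[lo:hi]


# Hypothetical in-memory copy of a hash-sorted listing.
lines = [
    "14A1E90A     587 ACCESION.BATaFWKCS122.ZIPv",
    "15B661D9     407 PRIVSPLT.BASaFWKCS122.ZIPv",
]
print(find_by_signature(lines, "15B661D9"))
```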

CSLIST1.SRT
The format of CSLIST1.SRT looks like this:

1498DE63  8902F FWKCS122.ZIPv z   cs
14A1E90A     587 ACCESION.BATaFWKCS122.ZIPv
15B661D9    407 PRIVSPLT.BASaFWKCS122.ZIPv

The hash comes first, as an 8-digit hexadecimal number in columns 0-7. (That is only 32 bits, whereas MD5 is 128 bits, so it is unclear exactly what is stored here.) The file length follows, right-justified to end at column 15. After a space, the next 12 positions contain the filename (in DOS 8.3 format). If the file is contained within a ZIP archive, the archive's filename follows, apparently with an 'a' preceding it and a 'v' following it; various other letters are used in these positions, and in the position of the archive name, in entries for files not within an archive.
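
The column layout described above can be sketched as a small parser. This is a hedged reading of the format rather than a specification: `Record` and `parse_record` are hypothetical names, the length field is assumed to be decimal, short filenames are assumed to be space-padded to 12 positions, and only the 'a' marker for archive members is interpreted.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    signature: int          # 32-bit value from columns 0-7
    length: int             # file length, right-justified to column 15
    filename: str           # DOS 8.3 name from columns 17-28
    archive: Optional[str]  # containing ZIP name, if the 'a' marker is present


def parse_record(line: str) -> Record:
    """Parse one CSLIST1.SRT-style line using fixed column positions."""
    sig = int(line[0:8], 16)
    length = int(line[8:16])  # assumption: decimal; int() ignores the padding
    filename = line[17:29].rstrip()
    archive = None
    # Column 29 holds a marker letter; 'a' appears to mean "inside an archive".
    if len(line) > 29 and line[29] == "a":
        archive = line[30:42].rstrip()
    return Record(sig, length, filename, archive)


# Sample line rebuilt with explicit padding so the columns are unambiguous.
sample = "14A1E90A" + "587".rjust(8) + " " + "ACCESION.BAT" + "a" + "FWKCS122.ZIP" + "v"
print(parse_record(sample))
```

Entries that carry other marker letters are returned with `archive` set to `None`, since the meaning of those letters is not established here.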

Links

 * 1993 shareware CD in which version 1.22 of FWKCS could be found: disc image
 * Testimony of Jason Scott regarding this CD
 * One of the patents that was invalidated as a result of the testimony
 * Final decision invalidating various patent claims
 * Documentation of extra fields in ZIP files, including FWKCS signatures