DjVu

DjVu is a multi-layer raster image file format for digital documents. It was originally developed at AT&T Labs, and is commonly used in book digitization, for example by the Internet Archive.

DjVu documents may include a plain text layer (e.g. from OCR), as well as other data such as a document outline, so the format can serve some of the same purposes as PDF.

Format
Files have a 4-byte preamble. The rest of the file uses IFF format.

Identification
Files begin with ASCII characters "AT&TFORM".

At offset 12 should be a tag indicating the specific file type. For DjVu v3, the possibilities are "DJVU", "DJVM", "DJVI", and "THUM".

There is an extension of DjVu called Secure DjVu. Secure DjVu files begin with "SDJVFORM".
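
The identification rules for ordinary (non-Secure) DjVu files can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes the 8-byte signature is "AT&TFORM" (the 4-byte "AT&T" preamble followed by the IFF "FORM" chunk ID) and that the four v3 form-type tags are "DJVU", "DJVM", "DJVI", and "THUM". The function name identify_djvu and the informal tag descriptions are made up for this example.

```python
import struct

# Form-type tags assumed from the DjVu v3 reference; the descriptions
# are informal summaries, not wording taken from the specification.
FORM_TYPES = {
    b"DJVU": "single-page document",
    b"DJVM": "multi-page document",
    b"DJVI": "shared data included by other pages",
    b"THUM": "thumbnails",
}


def identify_djvu(header: bytes):
    """Return (form_type, form_length) for a DjVu header, or None.

    `header` should hold at least the first 16 bytes of the file.
    """
    # Bytes 0-7: the "AT&T" preamble followed by the IFF "FORM" chunk ID.
    if len(header) < 16 or header[:8] != b"AT&TFORM":
        return None
    # Bytes 8-11: IFF chunk length, a 32-bit big-endian integer.
    (length,) = struct.unpack(">I", header[8:12])
    # Bytes 12-15: the tag indicating the specific file type.
    tag = header[12:16]
    if tag not in FORM_TYPES:
        return None
    return tag.decode("ascii"), length
```

To use it, read the first 16 bytes of a file opened in binary mode and pass them to identify_djvu. A result of None means the data does not look like a DjVu v3 file under the assumptions above; Secure DjVu files are deliberately not recognized by this sketch.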

Specifications

 * DjVu v3 Reference (requires DjVu plug-in)
 * DjVu 1999-04-29 (v2) Reference (requires DjVu plug-in)
 * Secure DjVu Specification (requires DjVu plug-in)

Software

 * DjVuLibre: Viewers, tools, C++ reference library
 * Viewers & Plug-ins
 * Konvertor

Sample files

 * The Specifications documents listed above
 * The DjVuLibre distributions include some DjVu files.
 * https://telparia.com/fileFormatSamples/document/djvu/

Links

 * DjVu.org
 * Overview
 * Wikipedia article
 * Media type registration