DFDL

DFDL (Data Format Description Language) is a language for describing file formats. It builds on XML Schema so that arbitrary binary or text-based data formats can be described in a way that lets software automatically parse a data file into a corresponding XML document with the same data elements in the same order, and convert that XML back to the original data format without loss. A DFDL document consists of an XML schema describing the data fields of a format, supplemented with annotations that describe how the data is stored (delimiters, endianness, etc.).
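The round-trip idea can be illustrated with a small Python sketch. It is not DFDL and does not use any DFDL implementation: the `RECORD_FORMAT` dictionary, its field names, and the `parse`/`unparse` helpers are all hypothetical stand-ins for a schema with storage annotations. The sketch only shows how one declarative description can drive both parsing binary data to XML and writing it back unchanged.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical toy description of a 5-byte binary record.
# It plays the role of a schema (field names and types) plus
# storage annotations (byte order); it is NOT real DFDL syntax.
RECORD_FORMAT = {
    "name": "record",
    "byte_order": "bigEndian",
    # (field name, struct type code): H = uint16, h = int16, B = uint8
    "fields": [("id", "H"), ("temperature", "h"), ("flags", "B")],
}


def _struct_for(fmt):
    """Build a struct layout from the description (no padding)."""
    prefix = ">" if fmt["byte_order"] == "bigEndian" else "<"
    return struct.Struct(prefix + "".join(code for _, code in fmt["fields"]))


def parse(data, fmt):
    """Turn raw bytes into an XML element, one child per field, in order."""
    values = _struct_for(fmt).unpack(data)
    root = ET.Element(fmt["name"])
    for (field, _), value in zip(fmt["fields"], values):
        ET.SubElement(root, field).text = str(value)
    return root


def unparse(root, fmt):
    """Turn the XML element back into bytes using the same description."""
    values = [int(root.find(field).text) for field, _ in fmt["fields"]]
    return _struct_for(fmt).pack(*values)


original = bytes([0x00, 0x2A, 0xFF, 0xF6, 0x01])
tree = parse(original, RECORD_FORMAT)
xml_text = ET.tostring(tree, encoding="unicode")
round_tripped = unparse(tree, RECORD_FORMAT)

print(xml_text)
print(round_tripped == original)
```

Changing `byte_order` to any other value makes the same description read and write little-endian data, which is the kind of storage detail DFDL expresses through annotations rather than code.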

A partial (and not quite fully standards-compliant) implementation of a DFDL parser has been released as the open-source project Defuddle. A new, improved DFDL parser called Daffodil is under development.

Format specs

 * DFDL 1.0 specification

Software

 * Defuddle
 * Defuddle files

Other links

 * Official DFDL site
 * Investigations of Data Representation
 * Do you speak Volkswriter? MultiMate? Visicalc? Making Steps Toward a Universal File Format Reader
 * Implementing Daffodil, a new DFDL parser