Boxes/atoms format

From Just Solve the File Format Problem
Revision as of 22:12, 30 September 2013 by Jsummers (Talk | contribs)

File Format
Name Boxes/atoms format
Ontology

Boxes/atoms format is our name for the metaformat used by JPEG 2000, QuickTime, and other formats. It is a tagged, segmented, hierarchical format, similar to IFF and RIFF. In some descriptions of it, the primary data structure is called a box, and in others it's called an atom.

Disambiguation

The QuickTime specification defines a data structure called a QT Atom, and another called an Atom Container. These are not part of the format described in this article.

Format

A file consists of a sequence of one or more boxes. Some boxes contain data, and some contain other boxes. Nothing in the box header says which kind a box is; telling them apart requires knowledge of the specific application format.

A box begins with a header of either 8 or (rarely) 16 bytes. The first four bytes of the header are a field we'll call size32. The next four are the Box Type code, usually corresponding to four ASCII characters. When size32 is 1, the Box Type field is followed by an 8-byte size64 field, which makes the header 16 bytes long. Following the header is the payload data.

The size fields use big-endian byte order. Each gives the size of the entire box, including the header. The size of a box's payload data is determined as follows:

size32   size64        Box payload size
0        not present   Payload extends to the end of the file
1        ≥16           (size64 − 16) bytes
2–7                    (reserved)
≥8       not present   (size32 − 8) bytes
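The rules in the table can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names parse_box_header and iter_boxes are hypothetical, the whole file is assumed to be in memory as a bytes object, and only a flat sequence of boxes is walked, since deciding whether a payload contains further boxes depends on the application format.

```python
import struct


def parse_box_header(buf, offset=0):
    """Return (box_type, header_size, payload_size) for the box at offset.

    payload_size is None when size32 is 0, meaning the payload
    extends to the end of the file.
    """
    if len(buf) - offset < 8:
        raise ValueError("truncated box header")
    # size32 (big-endian unsigned 32-bit) followed by the 4-byte Box Type.
    size32, box_type = struct.unpack_from(">I4s", buf, offset)
    if size32 == 0:
        return box_type, 8, None
    if size32 == 1:
        # A 16-byte header: size64 follows the Box Type field.
        if len(buf) - offset < 16:
            raise ValueError("truncated size64 field")
        (size64,) = struct.unpack_from(">Q", buf, offset + 8)
        if size64 < 16:
            raise ValueError("size64 is smaller than the 16-byte header")
        return box_type, 16, size64 - 16
    if size32 < 8:
        raise ValueError("reserved size32 value: %d" % size32)
    return box_type, 8, size32 - 8


def iter_boxes(buf):
    """Yield (box_type, payload) for each box in a flat sequence of boxes."""
    offset = 0
    while offset < len(buf):
        box_type, header_size, payload_size = parse_box_header(buf, offset)
        start = offset + header_size
        end = len(buf) if payload_size is None else start + payload_size
        if end > len(buf):
            raise ValueError("box extends past the end of the data")
        yield box_type, buf[start:end]
        offset = end
```

To descend into a box that is known to hold other boxes, the same iter_boxes function could be applied to that box's payload.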

Variations

Boxes/atoms format is independently defined by each specification that uses it, so the formulations differ slightly. For example, for QuickTime, a size32 of 0 is only allowed for top-level boxes. For JPEG 2000, it is also allowed for a box whose parent has a size32 of 0.
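As a rough illustration, the size32 = 0 difference described above could be captured in a small check like the one below. The function name, the family strings "quicktime" and "jpeg2000", and the parameters are all hypothetical, and the sketch covers only this one rule, not any other difference between the specifications.

```python
def size32_zero_allowed(family, is_top_level, parent_size32=None):
    """Say whether a box with size32 == 0 is permitted at this position.

    family        -- "quicktime" or "jpeg2000" (labels chosen for this sketch)
    is_top_level  -- True if the box is not nested inside another box
    parent_size32 -- the size32 field of the enclosing box, if there is one
    """
    if is_top_level:
        # Both formulations allow a top-level box to run to the end of the file.
        return True
    if family == "quicktime":
        return False
    if family == "jpeg2000":
        # Allowed only when the parent itself runs to the end of the file.
        return parent_size32 == 0
    raise ValueError("unknown format family: %r" % (family,))
```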

Related formats
