Boxes/atoms format

From Just Solve the File Format Problem

Boxes/atoms format is our name for the metaformat used by JPEG 2000, QuickTime, and other formats. (We also call it Box file format, which is apparently the name used by JPEG XT.) It is a tagged, segmented, hierarchical format, similar to IFF and RIFF. In some descriptions of it, the primary data structure is called a box, and in others it's called an atom.

Format details

A file consists of a sequence of one or more boxes. Some boxes contain data, and some contain other boxes. The only way to tell which is which is to have knowledge of the specific application format.

A box begins with a header of either 8 or (rarely) 16 bytes. The first four bytes of the header are a field we'll call size32. The next four are the Box Type code, usually corresponding to four ASCII characters. When size32 is 1, the Box Type field is followed by an 8-byte size64 field, which makes the header 16 bytes long. Following the header is the payload data.

The size fields use big-endian byte order. They include the size of the header. The size of a box's payload data is determined as follows:

size32   size64        Box payload size
0        not present   Payload extends to the end of the file
1        ≥16           (size64 − 16) bytes
2–7                    (reserved)
≥8       not present   (size32 − 8) bytes
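
The size rules in the table above can be sketched in Python as follows. This is a minimal illustration, not code from any specification: the function names read_box_header and iter_boxes are hypothetical, and the sketch assumes the whole file is already in memory as a bytes object.

```python
import struct


def read_box_header(data, offset=0):
    """Return (box_type, header_size, payload_size) for the box at offset.

    Assumes `data` holds the entire file, so a size32 of 0 can be
    resolved to "everything up to the end of the file".
    """
    if len(data) - offset < 8:
        raise ValueError("truncated box header")
    # size32 and the Box Type code; size fields are big-endian.
    size32, box_type = struct.unpack_from(">I4s", data, offset)
    if size32 == 0:
        # Payload extends to the end of the file.
        return box_type, 8, len(data) - offset - 8
    if size32 == 1:
        # A 16-byte header: size64 follows the Box Type field.
        if len(data) - offset < 16:
            raise ValueError("truncated size64 field")
        (size64,) = struct.unpack_from(">Q", data, offset + 8)
        if size64 < 16:
            raise ValueError("size64 is smaller than the header")
        return box_type, 16, size64 - 16
    if size32 < 8:
        raise ValueError("reserved size32 value: %d" % size32)
    return box_type, 8, size32 - 8


def iter_boxes(data, start=0, end=None):
    """Yield (box_type, payload_start, payload_end) for sibling boxes."""
    end = len(data) if end is None else end
    offset = start
    while offset < end:
        box_type, header_size, payload_size = read_box_header(data, offset)
        payload_start = offset + header_size
        payload_end = payload_start + payload_size
        if payload_end > end:
            raise ValueError("box overruns its container")
        yield box_type, payload_start, payload_end
        offset = payload_end
```

To descend into a box that is known (from the application format) to contain other boxes, call iter_boxes again with that box's payload_start and payload_end as the start and end arguments.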

Variations

Boxes/atoms format is independently defined by each specification that uses it. There are some slight differences in the various formulations of it. For example, for QuickTime, a size32 of 0 is only allowed for top-level boxes. For JPEG 2000, it is also allowed for a box whose parent has a size32 of 0.

Brands

Most (but not all) boxes/atoms-based formats use an "ftyp" box, and along with it a concept called "brands". A brand is a four-character code representing a format or subformat. Each file has a major brand (or primary brand), and also a compatibility list of brands. A brand's presence in the compatibility list indicates that a decoder which supports that brand can usefully decode the file in some way, even if it doesn't fully support it. This can be useful, though it makes it harder to say exactly what format a given file is in.
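
As a rough sketch of how the major brand and compatibility list might be read, the Python below parses the payload of an "ftyp" box. The layout used here (a 4-byte major brand, a 4-byte big-endian minor version, then zero or more 4-byte compatible brands) is an assumption taken from the ISO Base Media File Format definition of "ftyp" and is not stated in the text above; other formats may differ. The function name parse_ftyp_payload is hypothetical.

```python
import struct


def parse_ftyp_payload(payload):
    """Split an ftyp payload into (major_brand, minor_version, compatible).

    Assumed layout (ISO Base Media File Format style):
      4 bytes  major brand
      4 bytes  minor version, big-endian
      4*N      compatible brands
    """
    if len(payload) < 8 or len(payload) % 4 != 0:
        raise ValueError("unexpected ftyp payload length")
    major_brand, minor_version = struct.unpack_from(">4sI", payload, 0)
    # Every remaining 4-byte group is one entry in the compatibility list.
    compatible = [payload[i:i + 4] for i in range(8, len(payload), 4)]
    return major_brand, minor_version, compatible
```

Brands are kept as 4-byte values rather than stripped strings, because brands shorter than four letters are padded to four bytes and the padding is part of the code.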

Brand cross-reference

This table lists some of the known brands. There are many more brands that are not listed here.

Brand                  Description          Refer to
3g[a-z]?                                    3GP
3g2?                                        3G2
avc1                                        AVC (file format)
avif                                        AVIF
f4v                                         F4V
heic                                        HEIF
isom, iso2, iso[3-9]                        ISO Base Media File Format
jp2                                         JP2
jpm                                         JPM
jpx                                         JPX
jpxb                   Baseline JPX         JPX
jpxt                                        JPEG XT
M4A                                         AAC
mif1                                        HEIF
mj2s                   MJ2 Simple Profile   MJ2
mjp2                                        MJ2
mp41                                        MP4
mp42                                        MP4
piff                                        Protected Interoperable File Format
qt                                          QuickTime
crx                                         Canon RAW 3
jph                                         HTJ2K

UUID boxes

Some formats support private extensions using box type "uuid". In a uuid box, the first 16 bytes of what would normally be the payload data are a UUID (arbitrary 16-byte identifier). The real payload data begins immediately after the UUID.

Some of the known UUIDs are listed below. They are not necessarily official in any sense, and they should not be assumed to be meaningful in every format.

UUID                                   Format                                       Reference
0537cdab-9d0c-4431-a72a-fa561f2a113e   Exif                                         ExifTool
2c4c0100-8504-40b9-a03e-562148d6dfeb   Photoshop Image Resources                    ExifTool
33c7a4d2-b81d-4723-a0ba-f1a3e097ad38   IPTC-IIM
8974dbce-7be7-4c51-84f9-7148f9882554   PIFF Track Encryption Box                    PIFF specification
96a9f1f1-dc98-402d-a7ae-d68e34451809   GeoJP2 World File Box                        GeoJP2 specification
a2394f52-5a9b-4f14-a244-6c427c648df4   PIFF Sample Encryption Box                   PIFF specification
b14bf8bd-083d-4b43-a5ae-8cd7d5a6ce03   GeoJP2 GeoTIFF Box                           GeoJP2 specification
be7acfcb-97a9-42e8-9c71-999491e3afac   XMP                                          XMP specification
d08a4f18-10f3-4a82-b6c8-32d8aba183d3   PIFF Protection System Specific Header Box   PIFF specification
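
The uuid box layout described above (a 16-byte UUID followed by the real payload) might be handled along these lines. This is an illustrative sketch only: split_uuid_payload and KNOWN_UUIDS are hypothetical names, and the lookup holds just three entries copied from the table above, which are not guaranteed to be meaningful in every format.

```python
import uuid

# A few entries copied from the table above; not an official registry.
KNOWN_UUIDS = {
    uuid.UUID("0537cdab-9d0c-4431-a72a-fa561f2a113e"): "Exif",
    uuid.UUID("b14bf8bd-083d-4b43-a5ae-8cd7d5a6ce03"): "GeoJP2 GeoTIFF Box",
    uuid.UUID("be7acfcb-97a9-42e8-9c71-999491e3afac"): "XMP",
}


def split_uuid_payload(payload):
    """Split the payload of a 'uuid' box into (uuid, real_payload)."""
    if len(payload) < 16:
        raise ValueError("uuid box payload is shorter than 16 bytes")
    # The first 16 bytes are the identifier, in the order they are written.
    box_uuid = uuid.UUID(bytes=bytes(payload[:16]))
    return box_uuid, payload[16:]


def describe_uuid(box_uuid):
    """Return a label from KNOWN_UUIDS, or None if the UUID is not listed."""
    return KNOWN_UUIDS.get(box_uuid)
```

A caller would pass in the payload bytes of a box whose type is "uuid", then use the returned identifier to decide how to interpret the remaining bytes.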

Related formats

See also Category:Box file format.
