MAT

From Just Solve the File Format Problem

Revision as of 01:27, 30 July 2021

File Format
Name MAT
Ontology
Extension(s) .mat
MIME Type(s) application/x-matlab-data
PRONOM fmt/806, fmt/828

MAT is a format used by MathWorks' MATLAB for storing formatted data (matrices, structs, scalars and strings).

Apart from MATLAB, the format is also supported by Mathematica. GNU Octave, an open-source alternative to MATLAB, can read and write the format as well. In Python, the scientific computing packages NumPy and SciPy provide support for MAT-files.

Versions

The published specification distinguishes between Level 4 and Level 5 MAT-files: Level 4 files are compatible with older MATLAB versions (up to version 4), whereas Level 5 files are compatible with MATLAB versions 5 and higher. The overall layout of a Level 4 file is quite different from Level 5, and the two might even be considered separate formats. For example, Level 4 files don't have a unique 'magic' byte pattern that would allow easy identification, whereas the header of a Level 5 file includes a descriptive text field that can be used for this.
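As a minimal sketch of how the descriptive text field can be used, the Python snippet below tests whether a buffer starts with the Level 5 text string given in the Identification section. The function name looks_like_level5 is made up for this illustration; it only checks the leading text and does not validate the rest of the header.

```python
# Text string that Level 5 files begin with (see Identification).
LEVEL5_MAGIC = b"MATLAB 5.0 MAT-file"


def looks_like_level5(header: bytes) -> bool:
    """Return True if the buffer starts with the Level 5 text string.

    Hypothetical helper for illustration only: a match says nothing
    about whether the remainder of the file is well formed.
    """
    return header.startswith(LEVEL5_MAGIC)
```

In practice the buffer would be the first bytes read from a .mat file opened in binary mode.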

MATLAB figure files are really just a special case of the Level 5 MAT-file format.

Identification

Level 5 files begin with the text string MATLAB 5.0 MAT-file. Level 7 files begin with the text string MATLAB 7.0 MAT-file.

No such unique pattern exists for Level 4, but the fixed layout of its header constrains several bytes, which allows a heuristic check.

At offset 0 the type flag is stored as a 4-byte integer. Written in decimal, its digits are MOPT, where M is the thousands digit and indicates the numeric format used by the machine that wrote the file. The largest possible value is 4052 (0xFD4), so the two most significant bytes are always 0. For big-endian machines M is 1, so the lowest flag value is 1000 (0x3E8) and the highest is 1052 (0x41C). Within that range the second-least-significant byte is 0x04 only for 16-bit and 8-bit integer data; for all other number formats it is 0x03. For little-endian machines M is 0, so the highest type value is 52 (0x34).

At offset 12 the imaginary flag is stored as a 4-byte integer. It is 1 if the matrix has an imaginary part and 0 if the matrix contains only real data. Depending on the byte order, the flag value sits in the byte at offset 12 (little-endian) or at offset 15 (big-endian), and the other three bytes are 0. The two middle bytes, at offsets 13 and 14, are therefore always 0.

At offset 20 the matrix name is stored as a null-terminated ASCII string, so the byte at this offset holds a character such as a letter.

The MIME type of this format is unofficial; it is used by Apache Tika.
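The Level 4 checks described above can be combined into a rough heuristic. The following Python sketch is not a definitive detector: the function names are invented for this example, only the little-endian (M = 0) and big-endian (M = 1) cases mentioned here are handled, and the limits on the O, P and T digits (O = 0, P at most 5, T at most 2) are an assumption based on the published Level 4 specification and the maximum value 4052. Requiring an ASCII letter at offset 20 is likewise an assumption about valid matrix names.

```python
import struct
from typing import Optional, Tuple


def decode_mopt(type_flag: int) -> Tuple[int, int, int, int]:
    """Split the decimal type flag into its four digits M, O, P, T."""
    m, rest = divmod(type_flag, 1000)
    o, rest = divmod(rest, 100)
    p, t = divmod(rest, 10)
    return m, o, p, t


def guess_level4_byte_order(header: bytes) -> Optional[str]:
    """Return '<' (little-endian), '>' (big-endian) or None.

    Heuristic only: a match means the first 21 bytes are consistent
    with a Level 4 header, not that the file is definitely Level 4.
    """
    if len(header) < 21:
        return None
    # Type-flag range for each byte order: M = 0 or M = 1.
    ranges = {"<": (0, 52), ">": (1000, 1052)}
    for order, (low, high) in ranges.items():
        (type_flag,) = struct.unpack_from(order + "i", header, 0)
        (imag_flag,) = struct.unpack_from(order + "i", header, 12)
        if not low <= type_flag <= high:
            continue
        _, o, p, t = decode_mopt(type_flag)
        if o != 0 or p > 5 or t > 2:  # assumed digit limits
            continue
        if imag_flag not in (0, 1):
            continue
        # Assumption: the matrix name starts with an ASCII letter.
        if not header[20:21].isalpha():
            continue
        return order
    return None
```

For example, decode_mopt(1052) yields (1, 0, 5, 2), and a buffer that starts with the Level 5 text string is rejected because its first four bytes fall outside both type-flag ranges.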

Specification

Software

See also

Links
