Identifying Files

From Just Solve the File Format Problem

Once you have retrieved a file from its storage medium, you'll need to identify what kind of information you now have access to. In some cases, such as a phonograph record that has lost its label, you may have to find an expert on the recorded content to identify what song you now have (see Identifying Physical Media). However, when working with files generated by a computer, there are several clues you can use to begin the process.


External signatures

File Extension

If a file is still stored under its original file name, one of the best places to start is its extension. On many operating systems, this is the group of characters at the end of the name, often separated from the rest by a period. For example, OFFICE.DXF has a .DXF extension, which suggests that it is likely an AutoCAD file, most likely storing a drafting drawing.
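As a rough sketch of this first step, the following Python snippet pulls the extension off a file name and looks it up in a small table. The table contents and the function name guess_from_extension are illustrative assumptions, not a real registry; an actual lookup would need far more entries and would still only yield a guess.

```python
from pathlib import PurePath

# Illustrative table only (an assumption for this sketch); a real
# extension registry would contain thousands of entries.
KNOWN_EXTENSIONS = {
    ".dxf": "AutoCAD drawing (DXF)",
    ".png": "PNG image",
    ".txt": "plain text",
}

def guess_from_extension(filename):
    """Return a best-guess description based only on the file extension."""
    # Extensions are usually treated case-insensitively, so normalize.
    ext = PurePath(filename).suffix.lower()
    return KNOWN_EXTENSIONS.get(ext, "unknown")

print(guess_from_extension("OFFICE.DXF"))
```

An extension is only a hint: a file may have been renamed, or the same extension may be shared by unrelated formats.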

Creator and Type

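The classic Macintosh operating system from Apple did not use file extensions to identify files, but instead used four-character creator and type codes. The type code identifies the file's format, and the creator code identifies the application that created it.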
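Four-character codes of this kind are commonly handled as 32-bit integers, with the four characters packed in big-endian order. The sketch below shows that conversion in Python; the function names are hypothetical, and "TEXT" is used here as an example type code for plain text.

```python
import struct

def fourcc_to_int(code):
    """Pack a four-character code into a 32-bit big-endian integer."""
    raw = code.encode("mac_roman")
    if len(raw) != 4:
        raise ValueError("code must be exactly four characters")
    return struct.unpack(">I", raw)[0]

def int_to_fourcc(value):
    """Unpack a 32-bit integer back into its four-character code."""
    return struct.pack(">I", value).decode("mac_roman")

print(hex(fourcc_to_int("TEXT")))
print(int_to_fourcc(0x54455854))
```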


Some systems, such as HTTP and email, use MIME types to identify a file's data type. However, MIME types are of marginal use for identifying rare file types to humans. A MIME type may itself have been guessed from the file's filename extension or magic signature, in which case it provides no new information.
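To illustrate how a MIME type can be derived from nothing more than the file name, here is a minimal sketch using Python's standard mimetypes module. The wrapper name mime_from_name is an assumption made for this example; the module consults an extension table, so its answer carries no more information than the extension does.

```python
import mimetypes

def mime_from_name(filename):
    """Guess a MIME type from the file name alone (no content is read)."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type

print(mime_from_name("photo.png"))
print(mime_from_name("no_extension"))
```

The second call returns None, because there is no extension to base a guess on.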

Internal signatures

An internal signature is a distinctive pattern of bytes in the file's contents. Most often, it takes the form of a "magic signature" near the beginning of the file.
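A minimal sketch of signature matching in Python follows. The table lists a handful of well-known signatures that appear at the very start of a file; the names SIGNATURES, identify_bytes, and identify_file are hypothetical, and real identification tools use much larger databases and more elaborate rules.

```python
# A few well-known signatures found at offset 0. This short list is
# illustrative only, not a complete database.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (or a ZIP-based format)"),
]

def identify_bytes(data):
    """Return a description if the data starts with a known signature."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None

def identify_file(path):
    """Read only the first few bytes of a file and try to identify it."""
    with open(path, "rb") as f:
        return identify_bytes(f.read(16))

print(identify_bytes(b"GIF89a" + b"\x00" * 10))
```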

See File identification software for utilities that can help to identify files using such signatures.

Even without sophisticated software assistance, it may be possible to guess a file's format using a simple text editor or hex editor. For example, the second through fourth bytes of every PNG file spell out "PNG" in ASCII.
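If no hex editor is at hand, a few lines of Python can produce a similar view. The sketch below formats the first bytes of some data as hexadecimal alongside their printable ASCII characters; the function name hex_dump_line is an assumption for this example, and the eight-byte PNG signature is used as sample input.

```python
def hex_dump_line(data, width=16):
    """Format the first `width` bytes as hex plus printable ASCII."""
    chunk = data[:width]
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    # Show non-printable bytes as dots, as most hex editors do.
    text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{hex_part}  {text_part}"

# The eight-byte signature that begins every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

print(hex_dump_line(PNG_SIGNATURE))
# prints: 89 50 4e 47 0d 0a 1a 0a  .PNG....
```

To inspect a real file, read its first bytes with open(path, "rb") and pass them to the same function.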
