Category:File Format Identification

From Just Solve the File Format Problem
Revision as of 04:03, 6 November 2012

The purpose of File Format Identification is to determine the file format of a digital object.

There are three basic ways to identify the format of a digital object.

Extension

The file extension is used, e.g. ".doc" if the file is named "example.doc". This tells us the file might be a Word document.
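As a minimal sketch of extension-based identification, the following Python snippet looks up the file name's suffix in a small table. The table `EXTENSION_HINTS` and the function `guess_by_extension` are hypothetical names made up for this illustration; real tools rely on much larger registries.

```python
from pathlib import PurePath

# Hypothetical lookup table for illustration only.
EXTENSION_HINTS = {
    ".doc": "Microsoft Word document",
    ".pdf": "PDF document",
    ".png": "PNG image",
}

def guess_by_extension(filename):
    """Return a format hint based only on the file name, or None if unknown."""
    suffix = PurePath(filename).suffix.lower()
    return EXTENSION_HINTS.get(suffix)

print(guess_by_extension("example.doc"))
print(guess_by_extension("README"))
```

The result is only a hint: the function never opens the file, so a renamed file yields a wrong answer.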

Content Header

The content header (for example, the HTTP Content-Type header) declares the file type as a mime-type. As an example, your browser identified this web page as a "text/html" file.
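The following sketch shows one way to extract the mime-type from a content header value, assuming the header follows the usual "type/subtype; parameters" layout. It uses `email.message.Message` from the Python standard library; the helper name `mime_type_from_header` is hypothetical.

```python
from email.message import Message

def mime_type_from_header(header_value):
    """Return the lower-cased mime-type from a Content-Type header value."""
    message = Message()
    message["Content-Type"] = header_value
    # get_content_type() drops parameters such as charset and
    # falls back to "text/plain" when the value is malformed.
    return message.get_content_type()

print(mime_type_from_header("text/html; charset=UTF-8"))
```

As with the extension, the header is a declaration made by whoever serves the file, not a check of the file's actual bytes.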

Signature

The file is scanned for certain bytes, just as one would scan text looking for a keyword. Academic and forensic file format identification software makes use of signatures, also known as magic bytes. This method is more robust than relying on the extension or the content header. Each signature is assigned a unique identifier, such as a PUID.
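A minimal sketch of signature matching is shown below. It assumes the simplest case, where the magic bytes sit at the very start of the file; real identification tools also handle offsets, byte ranges and end-of-file markers. The table `SIGNATURES` and both function names are hypothetical, and the sketch uses plain names where a real registry would use identifiers such as PUIDs.

```python
# Well-known magic bytes at offset 0; the table itself is illustrative only.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. legacy Word .doc)"),
]

def identify_by_signature(data):
    """Return the name of the first signature that matches the start of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None

def identify_file(path):
    """Read only the first bytes of a file and match them against the table."""
    with open(path, "rb") as handle:
        return identify_by_signature(handle.read(16))

print(identify_by_signature(b"%PDF-1.4\n"))
```

Because the decision is based on the content itself, renaming "example.doc" to "example.png" does not change the result.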

Furthermore, anti-virus software also uses signatures to detect viruses, alongside semantics.

Pages in category "File Format Identification"

The following 18 pages are in this category, out of 18 total.
