Category:File Format Identification
Revision as of 04:06, 6 November 2012
The purpose of File Format Identification is to determine the file format of a digital object.
There are three basic ways to identify the format of a digital object.
Extension
The file extension is used, e.g. ".doc" if the file is named "example.doc". This tells us the file might be a Word document.
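As a minimal sketch of extension-based identification, the snippet below maps a filename's extension to a format hint. The table `EXTENSION_HINTS` and the function `guess_by_extension` are hypothetical names chosen for illustration, and the table is deliberately tiny. The result is only a hint, because nothing prevents a file from carrying a misleading extension.

```python
import os

# Hypothetical lookup table for illustration; a real tool would consult
# a much larger format registry.
EXTENSION_HINTS = {
    ".doc": "Microsoft Word document",
    ".pdf": "PDF document",
    ".png": "PNG image",
}


def guess_by_extension(filename):
    """Return a format hint for the filename, or None if the extension is unknown."""
    # splitext returns the final extension including the dot, or "" if there is none.
    _, ext = os.path.splitext(filename)
    # Extensions are compared case-insensitively, so ".DOC" matches ".doc".
    return EXTENSION_HINTS.get(ext.lower())


if __name__ == "__main__":
    print(guess_by_extension("example.doc"))
```

A filename without an extension, or with one missing from the table, yields `None` rather than a guess.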
Content Header
The content header (for example, the HTTP Content-Type header) is used to determine the file type based on the declared mime-type. As an example, your browser identified this webpage as a "text/html" file.
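The following sketch shows how a declared MIME type might be read from a header value. It assumes the value comes from an HTTP-style Content-Type header, which the page itself does not name, and `mime_type_from_header` is a hypothetical helper. Like the extension, the declared type is a claim made by the sender and is not verified against the file's bytes.

```python
def mime_type_from_header(content_type_value):
    """Extract the bare MIME type from a Content-Type header value."""
    # Parameters such as "charset=UTF-8" follow the first ';' and are dropped.
    # MIME types are case-insensitive, so the result is normalized to lowercase.
    return content_type_value.split(";", 1)[0].strip().lower()


if __name__ == "__main__":
    print(mime_type_from_header("text/html; charset=UTF-8"))
```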
Signature
The file is scanned for certain bytes, just as one would scan text looking for a keyword. Academic and forensic file format identification software makes use of signatures, also known as magic bytes. Each signature is assigned a unique identifier, such as a PUID.
Anti-virus software likewise uses signatures to detect viruses, in addition to semantic analysis.
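A minimal sketch of signature matching is shown below. The byte sequences are well-known magic bytes for the listed formats, but the table `SIGNATURES`, its labels, and the function `identify_by_signature` are hypothetical and not taken from any particular tool. The labels are plain descriptions, not PUIDs. For simplicity the sketch only checks the start of the data, whereas real signature files may also describe byte sequences at other offsets.

```python
# Hypothetical signature table for illustration: (magic bytes, description).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. legacy Word .doc)"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_by_signature(data):
    """Return a description of the first signature found at offset 0, or None."""
    for magic, label in SIGNATURES:
        # Only the leading bytes are compared in this sketch.
        if data.startswith(magic):
            return label
    return None


def identify_file(path):
    """Read the first few bytes of a file and match them against SIGNATURES."""
    longest = max(len(magic) for magic, _ in SIGNATURES)
    with open(path, "rb") as handle:
        return identify_by_signature(handle.read(longest))


if __name__ == "__main__":
    print(identify_by_signature(b"%PDF-1.4\n"))
```

Unlike the extension or the content header, this check looks at the bytes themselves, so a PDF renamed to "example.doc" would still match the PDF signature.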
Pages in category "File Format Identification"
The following 18 pages are in this category, out of 18 total.