File identification software
From Just Solve the File Format Problem
				
								
Previous revision: Rhetoric X (minor edit)
This revision: AndyJackson (Added Apache Tika to the list.)
Revision as of 14:36, 29 October 2012
Software that automates the process of Identifying Files.
- Apache Tika (Java, cross-platform, open source, website: http://tika.apache.org/): "The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries."
- DROID (cross-platform, open source, website: http://digital-preservation.github.com/droid/): "DROID is a software tool developed by The National Archives [of the United Kingdom] to perform automated batch identification of file formats." Requires Java 6; as of 28 Oct 2012 it will not run on Java 7.
- File command (various implementations): a standard Unix command, found on almost all Unix and Unix-like (e.g., Linux) systems. See the Debian man page (http://manpages.debian.net/cgi-bin/man.cgi?query=file&apropos=0&sektion=0&manpath=Debian+6.0+squeeze&format=html&locale=en) for an overview.
- TrID (Windows/Linux, free for non-commercial use, website: http://mark0.net/soft-trid-e.html): identifies files using a database of filetype signatures. An online version is also available (http://mark0.net/onlinetrid.aspx).
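Several of the tools above identify files by matching their leading bytes against a database of known signatures. The following Python sketch illustrates that general idea only; it is not how any of the listed tools is actually implemented. The table `SIGNATURES` and the functions `identify_bytes` and `identify_file` are hypothetical names introduced for illustration, and the table holds only a handful of well-known magic numbers, whereas real signature databases are far larger and also match at non-zero offsets.

```python
from typing import Optional

# Hypothetical, deliberately tiny signature table: (offset, magic bytes, label).
# Real tools ship databases with hundreds or thousands of entries.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"%PDF-", "PDF document"),
    (0, b"GIF87a", "GIF image"),
    (0, b"GIF89a", "GIF image"),
    (0, b"PK\x03\x04", "ZIP archive (or ZIP-based container)"),
    (0, b"\xff\xd8\xff", "JPEG image"),
]


def identify_bytes(data: bytes) -> Optional[str]:
    """Return the label of the first signature that matches, or None."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return None


def identify_file(path: str, probe_size: int = 64) -> Optional[str]:
    """Read only the first probe_size bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(probe_size))
```

Calling `identify_file("example.pdf")` on a well-formed PDF would return "PDF document", and an unrecognised file returns `None`. A first-match linear scan is the simplest possible design; it is assumed here for clarity and says nothing about how the tools above rank or score competing matches.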