Scientific Data formats

From Just Solve the File Format Problem

Revision as of 02:54, 3 November 2012


  • cdf (Common Data Format)
  • hdf (Hierarchical Data Format, originally from NCSA)
  • NetCDF (Network Common Data Form)
  • SDXF (Structured Data Exchange Format)
  • XDF (eXtensible Data Format)
  • XSIL (Extensible Scientific Interchange Language)

Astronomical and Space

  • FITS (Flexible Image Transport System)
  • PDS/ODL (Planetary Data System)

Biological


  • AB1 (Chromatogram files used by DNA sequencing instruments from Applied Biosystems)
  • ACE (Sequence assembly format)
  • BAM (Binary compressed SAM format)
  • BED (Browser Extensible Data format describing genes and other features of DNA sequences)
  • CAF (Common Assembly Format for sequence assembly)
  • EMBL (Flatfile format used by the EMBL for nucleotide and peptide sequences)
  • FASTA and FASTQ (Text formats for sequence data; FASTQ adds per-base quality scores)
  • GenBank (Flatfile format used by NCBI for nucleotide and peptide sequences)
  • GFF (General feature format for describing genes and other features of DNA, RNA and protein sequences)
  • GTF (Gene Transfer Format, which holds information about gene structure)
  • NEXUS (Encodes mixed information about genetic sequence data in a block structured format)
  • PDB (Structures of biomolecules deposited in Protein Data Bank)
  • PHD (Output from the basecalling software Phred)
  • SAM (Sequence Alignment/Map format)
  • SCF (Staden chromatogram files used to store data from DNA sequencing)
  • SBML (Systems Biology Markup Language used to store biochemical network computational models)
  • Stockholm (Representing multiple sequence alignments)
  • Swiss-Prot (Flatfile format used for protein sequences from the Swiss-Prot database)
  • VCF (Variant Call Format)
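
The difference between FASTA and FASTQ noted in the list above can be illustrated with a minimal Python sketch. It is not a reference parser. It assumes FASTA records start with a ">" line and may wrap the sequence over several lines, and that FASTQ records are the common four-line form ("@" header, sequence, "+" separator, quality string) with no line wrapping. The function names read_fasta and read_fastq are hypothetical, chosen only for this example.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from FASTA text.

    Assumes each record starts with a '>' line; sequence lines that
    follow are joined until the next '>' line or end of input.
    """
    header, parts = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def read_fastq(handle):
    """Yield (header, sequence, quality) triples from FASTQ text.

    Assumes the simple four-line record layout (no wrapped sequences).
    Reading stops at the first empty header line or at end of input.
    """
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            return
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("not a four-line FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], seq, qual


# Small in-memory samples (made up for illustration only).
fasta_text = io.StringIO(">seq1 demo\nACGT\nTTGA\n>seq2\nGG\n")
fastq_text = io.StringIO("@read1\nACGT\n+\nIIII\n")

fasta_records = list(read_fasta(fasta_text))
fastq_records = list(read_fastq(fastq_text))
```

The quality string in a FASTQ record has one character per base, which is why the sketch rejects records where the two lengths differ. Real files may use wrapped FASTQ or other variants that this sketch does not handle.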



  • Darwin Core (Standard for sharing information about biological diversity)
  • EML (Ecological Metadata Language)

Geographic and Geospatial

See also Geospatial

  • DEM (Digital Elevation Model)
  • DOQ (Digital Orthophoto Quadrangle)
  • e00 (ESRI ArcInfo Interchange File)
  • FGDC (Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata)
  • GeoTIFF (Geospatial extensions to TIFF)
  • GML (Geography Markup Language)
  • HDFEOS, HD2, HD4 (Hierarchical Data Format-Earth Observing System)
  • KML (formerly Keyhole Markup Language; version 2.2)
  • NDF (National Landsat Archive Production System (NLAPS) Data Format)
  • SAIF (Spatial Archive and Interchange Format, Canadian)
  • SDTS (Spatial Data Transfer Standard)
  • shp and shx (ESRI Shapefile mandatory components; other optional components exist as well, see entry)
  • SID (MrSID, Multiresolution Seamless Image Database)
  • TAB (MapInfo dataset format, mandatory component)


Medical Imaging

  • DICOM (Digital Imaging and Communications in Medicine)
  • OME-TIFF (Open Microscopy Environment TIFF)

Oceanographic, Atmospheric and Meteorological

  • GRIB (Grid in Binary)
  • BUFR (Binary Universal Form for the Representation of meteorological data)
  • IOAPI (netCDF augmented with metadata from the I/O API)
  • PP (UK Met Office format for weather model data)


  • CGNS (Computational Fluid Dynamics General Notation System)
  • NeXuS (Common data format for neutron, x-ray and muon science)
  • QCDml (Lattice QCD gauge configuration markup language)

Social Sciences

  • DDI (Data Documentation Initiative)
  • SAS (Statistical package)
  • SPSS (Statistical package)