Scientific Data formats

From Just Solve the File Format Problem
(Difference between revisions)
(Added an "Earth Sciences" section consisting mostly of redlinks (cannibalizing from the signal data section in the process) - I intend to fill most of these out when I have more time)
(Spectra)
(41 intermediate revisions by 5 users not shown)
Line 6: Line 6:
 
}}
 
}}
  
See also [[Health and Medicine]] for medical/biomedical data formats.
+
See also [[Health and Medicine]] for medical/biomedical data formats, and also see [[Engineering]].
  
 
== General ==
 
== General ==
Line 26: Line 26:
 
== Astronomical and Space ==
 
== Astronomical and Space ==
 
* [[Advanced Scientific Data Format]]
 
* [[Advanced Scientific Data Format]]
 +
* [[CPA (PRISM)]]
 
* [[Flexible Image Transport System]] (FITS)
 
* [[Flexible Image Transport System]] (FITS)
 
** [[PSRFITS]] (Pulsar data storage standard)
 
** [[PSRFITS]] (Pulsar data storage standard)
Line 52: Line 53:
 
* [[ARLEQUIN Project Format]]
 
* [[ARLEQUIN Project Format]]
 
* [[Axt Alignment Format]]
 
* [[Axt Alignment Format]]
* [[BAM]] (Binary compressed SAM format)
+
* [[BAM (Binary Alignment Map)|BAM]] (Binary compressed SAM format)
 
* [[BED]] (Browser extensible display format describing genes and other features of DNA sequences)
 
* [[BED]] (Browser extensible display format describing genes and other features of DNA sequences)
 
* [[BEDgraph]]
 
* [[BEDgraph]]
Line 72: Line 73:
 
* [[Clustered Data Table Format]]
 
* [[Clustered Data Table Format]]
 
* [[Complete Genomics]]
 
* [[Complete Genomics]]
 +
* [[CRAM]]
 
* [[DELTA]] (DEscription Language for TAxonomy)
 
* [[DELTA]] (DEscription Language for TAxonomy)
 
* [[DAS]] (Distributed Sequence Annotation System)
 
* [[DAS]] (Distributed Sequence Annotation System)
Line 93: Line 95:
 
* [[HMMER]]
 
* [[HMMER]]
 
* [[ICB]] (ICM binary file Format)
 
* [[ICB]] (ICM binary file Format)
 +
* [[Image Cytometry Experiment]] (ICE)
 +
* [[Image Cytometry Standard]] (ICS)
 
* [[imzML]] (imaging mz Markup Language)
 
* [[imzML]] (imaging mz Markup Language)
 
* [[ISA-Tab]] (Investigation Study Assay Tabular)
 
* [[ISA-Tab]] (Investigation Study Assay Tabular)
Line 140: Line 144:
 
* [[SDD]] (Structured Descriptive Data)
 
* [[SDD]] (Structured Descriptive Data)
 
* [[SED-ML]] (Simulation Experiment Description Markup Language)
 
* [[SED-ML]] (Simulation Experiment Description Markup Language)
* [[Sequence Alignment Map Format]]
 
 
* [[SOFT]] (Simple Omnibus Format in Text)
 
* [[SOFT]] (Simple Omnibus Format in Text)
 
* [[spML]] (Separation Markup Language)
 
* [[spML]] (Separation Markup Language)
Line 160: Line 163:
 
== Chemical ==
 
== Chemical ==
 
* [[CCP4]] (X-ray crystallography voxels (electron density))
 
* [[CCP4]] (X-ray crystallography voxels (electron density))
* [[CDX]] (ChemDraw file format)
+
* [[CDX (ChemDraw Exchange)|CDX]] (ChemDraw file format)
 
* [[CDXML]] (ChemDraw file format)
 
* [[CDXML]] (ChemDraw file format)
 
* [[CHM (ChemDraw)|CHM]] (ChemDraw file format)
 
* [[CHM (ChemDraw)|CHM]] (ChemDraw file format)
Line 173: Line 176:
 
* [[MST]] ACD/ChemSketch v1 file format
 
* [[MST]] ACD/ChemSketch v1 file format
 
* [[Protein Data Bank]] (PDB)
 
* [[Protein Data Bank]] (PDB)
* [[RPT]] ACD/ChemSketch v1 file format
+
* [[RPT (OpenLynx)]] Waters OpenLynx reports
 
* [[RXN]] (Reaction file format)
 
* [[RXN]] (Reaction file format)
 
* [[SK2]] (ACD/ChemSketch v2 file format)
 
* [[SK2]] (ACD/ChemSketch v2 file format)
Line 181: Line 184:
 
* [[Structure Data File]] (SDF)
 
* [[Structure Data File]] (SDF)
 
* [[TGF]] (ISIS/Draw reaction file format)
 
* [[TGF]] (ISIS/Draw reaction file format)
 +
* [[XYZ Chem]] [https://en.wikipedia.org/wiki/XYZ_file_format Wiki]
  
 
Chemical data may be distinguished in various ways, including [http://www.ch.ic.ac.uk/chemime/ Chemical MIME] types.
 
Chemical data may be distinguished in various ways, including [http://www.ch.ic.ac.uk/chemime/ Chemical MIME] types.
Line 190: Line 194:
 
* [[SEED]]
 
* [[SEED]]
 
* [[SEG Y]] (Reflection seismology data format)
 
* [[SEG Y]] (Reflection seismology data format)
 +
* [[SEIS-PROV]]
 +
* [[StationXML]]
  
 
== Ecological ==
 
== Ecological ==
Line 195: Line 201:
 
* [[Electronic Data Deliverable]] (EDD; EPA Superfund)
 
* [[Electronic Data Deliverable]] (EDD; EPA Superfund)
 
* [[EML (Ecological Metadata Language)]], not to be confused with [[EML (Environmental Markup Language)]]
 
* [[EML (Ecological Metadata Language)]], not to be confused with [[EML (Environmental Markup Language)]]
 +
 +
== Environmental ==
 +
* [[HYT]] (AquiferTest)
  
 
== Geographic and Geospatial ==
 
== Geographic and Geospatial ==
Line 221: Line 230:
 
* [[graph6, sparse6]] (ASCII encoding of Adjacency matrices (.g6, .s6))
 
* [[graph6, sparse6]] (ASCII encoding of Adjacency matrices (.g6, .s6))
 
* [[graphML]] (Graph Markup Language)
 
* [[graphML]] (Graph Markup Language)
 +
* [[JMP]] (.jmp)
 +
* [[KaleidaGraph]] (.qda, .qdc)
 +
* [[Life 1.05]]
 +
* [[Life 1.06]]
 
* [[MacWavelets]]
 
* [[MacWavelets]]
 
* Mathematica
 
* Mathematica
Line 227: Line 240:
 
** [[Mathematica package file]] (M)
 
** [[Mathematica package file]] (M)
 
** [[Wolfram Language]]
 
** [[Wolfram Language]]
 +
* [[Macrocell]]
 +
* [[MCell]]
 
* [[MathML]]
 
* [[MathML]]
 
* MATLAB
 
* MATLAB
Line 232: Line 247:
 
** [[Matlab figure]]
 
** [[Matlab figure]]
 
** [[MATLAB script file]] (m)
 
** [[MATLAB script file]] (m)
 +
** [[Matlab Model]] (.mdl, .slx)
 +
*[[Minitab]] (.mtw, .mpj)
 
* [[OPJ]] (Origin data format)
 
* [[OPJ]] (Origin data format)
 +
* [[PDL]] (Perl Data Language)
 +
* [[Plaintext (cellular automata)]]
 +
* [[RLE (cellular automata)]]
 +
* [[Rule (Golly)]]
 +
* [[Small Object Format]]
 
* [[Statistica]]
 
* [[Statistica]]
 +
** [[CSS Software]] (Complete Statistical System)
 +
** [[CSS STATISTICA]]
 
* [[WP2]] WinPlot
 
* [[WP2]] WinPlot
  
Line 243: Line 267:
 
* [[BioRad confocal image]]
 
* [[BioRad confocal image]]
 
* [[DeltaVision]]
 
* [[DeltaVision]]
* [[dm2]] (Gatan Digital Micrograph 2)
+
* [[DM2]] (Gatan Digital Micrograph 2)
* [[dm3]] (Gatan Digital Micrograph 3) ({{PRONOM|fmt/1131}})
+
* [[DM3]] (Gatan Digital Micrograph 3)
 +
* [[DM4]] (Gatan Digital Micrograph 4)
 
* [[GATAN]]
 
* [[GATAN]]
 +
* [[HMSA]] (.msa)
 +
* [[Image Cytometry Experiment]] (ICE)
 
* [[Image Cytometry Standard]] (ICS)
 
* [[Image Cytometry Standard]] (ICS)
 
* [[KONTRON]]
 
* [[KONTRON]]
Line 257: Line 284:
 
* [[VGS-8]]
 
* [[VGS-8]]
 
* [[Zeiss BIVAS]]
 
* [[Zeiss BIVAS]]
 +
 +
== Neutron and X-ray Scattering ==
 +
 +
* [[canSAS]] (tools for small-angle scattering)
 +
* [[CIF]] (Crystallographic Information File, standardised by IUCr)
 +
* [[NeXus]] (NeXus is a common data format for neutron, x-ray, and muon science)
  
 
== Oceanographic, Atmospheric and Meteorological ==
 
== Oceanographic, Atmospheric and Meteorological ==
Line 289: Line 322:
  
 
* [[Atlas.ti]] ([[Computer-assisted qualitative data analysis]] package)
 
* [[Atlas.ti]] ([[Computer-assisted qualitative data analysis]] package)
* [[DDI]] (Data Documentation Initiative)
+
* [[DDI (Data Documentation Initiative)|DDI]] (Data Documentation Initiative)
 
* [[DO]] ("DO file" command script for the [[Stata]] Statistical package)
 
* [[DO]] ("DO file" command script for the [[Stata]] Statistical package)
 
* [[DTA]] (Binary data file for the [[Stata]] Statistical package)
 
* [[DTA]] (Binary data file for the [[Stata]] Statistical package)
 +
* [[Linguistic Annotation Framework]] (LAF; used by computational linguists to annotate language samples)
 
* [[M2k]] (MAXQDA)
 
* [[M2k]] (MAXQDA)
 
* [[NVivo]] ([[Computer-assisted qualitative data analysis]] package)
 
* [[NVivo]] ([[Computer-assisted qualitative data analysis]] package)
 
* [[R]] (Statistical package)
 
* [[R]] (Statistical package)
 
* [[SAS]] (Statistical package)
 
* [[SAS]] (Statistical package)
 +
** [[SAS Transport File]] (.xpt)
 
* [[SAV]] (Binary "[[SPSS]] data format" for the [[SPSS]] Statistical package)
 
* [[SAV]] (Binary "[[SPSS]] data format" for the [[SPSS]] Statistical package)
 
* [[SPO]] (Output file for the [[SPSS]] Statistical package - version 14)
 
* [[SPO]] (Output file for the [[SPSS]] Statistical package - version 14)
Line 301: Line 336:
 
* [[SPV]] (Output file for the [[SPSS]] Statistical package - version 17 and later)
 
* [[SPV]] (Output file for the [[SPSS]] Statistical package - version 17 and later)
 
* [[Transana]] ([[Computer-assisted qualitative data analysis]] package)
 
* [[Transana]] ([[Computer-assisted qualitative data analysis]] package)
 +
 +
== Spectra ==
 +
* [[Bruker]] (XRF software, .pdz)
 +
* [[Niton]] (XRF software, .ndt)
 +
* [[EDAX Spectrum]] (.spc)
 +
* [[Thermo Scientific SPC]] (.spc)
 +
* [[EMSA/MAS]]
 +
* [[HMSA Hyper-Dimensional Data]]
  
 
== Miscellaneous ==
 
== Miscellaneous ==
  
 
* [[AIML]] (Artificial Intelligence Markup Language)
 
* [[AIML]] (Artificial Intelligence Markup Language)
 +
* [[IES]] (IESNA LM-63 Photometric Data File)
 
* [[Jupyter Notebook]] (.ipynb)
 
* [[Jupyter Notebook]] (.ipynb)
  
 
== Links ==
 
== Links ==
 
* [http://cameronneylon.net/blog/improving-on-access-to-research/ Improving on “Access to Research”]
 
* [http://cameronneylon.net/blog/improving-on-access-to-research/ Improving on “Access to Research”]
 +
* [[WikiBooks:Software Tools For Molecular Microscopy]]

Revision as of 01:34, 19 January 2022

File Format
Name Scientific Data formats
Ontology

Mad scientist from 1940 movie

See also Health and Medicine for medical/biomedical data formats, and also see Engineering.

General

  • Common Data Format (CDF)
  • EAS3 (binary file format for structured data)
  • HDF (Hierarchical Data Format, originally from NCSA, now maintained by The HDF Group)
  • NRRD (Nearly Raw Raster Data -- a simple format for n-dimensional raster data)
  • NetCDF (Network Common Data Form)
  • ROOT (CERN data-analysis package and related formats, used in their Open Data initiative)
  • SDXF (Structured Data Exchange Format)
  • Silo (a storage format for visualization developed at Lawrence Livermore National Laboratory)
  • Simple Data format (SDF), by George H. Fisher, Space Sciences Lab, UC Berkeley (a platform-independent, precision-preserving binary data I/O format capable of handling large, multi-dimensional arrays)
  • Standard Delay Format (SDF; a standard data structure for timing data)
  • XDF (eXtensible Data Format)
  • XSIL (Extensible Scientific Interchange Language)

Astronomical and Space

Biological

Chemical

  • CCP4 (X-ray crystallography voxels (electron density))
  • CDX (ChemDraw file format)
  • CDXML (ChemDraw file format)
  • CHM (ChemDraw file format)
  • CIF (Crystallographic Information File, standardised by IUCr)
  • CML (Chemical Markup Language)
  • CTab (Chemical table file .mol, .sd, .sdf)
  • HITRAN (spectroscopic data with one optical/infrared transition per line in the ASCII file (.hit))
  • JCAMP (Joint Committee on Atomic and Molecular Physical Data, .dx, .jdx)
  • MOL (MDL Molfile)
  • MOP (MOPAC format)
  • MRC (voxels in cryo-electron microscopy)
  • MST ACD/ChemSketch v1 file format
  • Protein Data Bank (PDB)
  • RPT (OpenLynx) Waters OpenLynx reports
  • RXN (Reaction file format)
  • SK2 (ACD/ChemSketch v2 file format)
  • SKC (ISIS/Draw file format)
  • SMILES (Simplified molecular input line entry specification, .smi)
  • SPC (Spectroscopic Data)
  • Structure Data File (SDF)
  • TGF (ISIS/Draw reaction file format)
  • XYZ Chem Wiki

Chemical data may be distinguished in various ways, including Chemical MIME types.
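
As a rough illustration of distinguishing chemical files by MIME type, the sketch below maps a few file extensions from the list above to their Chemical MIME types. The table is a small illustrative subset rather than the full registry, the function name `guess_chemical_mime` is hypothetical, and the mapping should be checked against the Chemical MIME list before being relied on.

```python
import os

# Illustrative subset of Chemical MIME types keyed by file extension.
# Assumption: these pairings follow the Chemical MIME project's published list;
# verify against that list before using them in production.
CHEMICAL_MIME_TYPES = {
    ".cdx": "chemical/x-cdx",          # ChemDraw Exchange
    ".cif": "chemical/x-cif",          # Crystallographic Information File
    ".cml": "chemical/x-cml",          # Chemical Markup Language
    ".mol": "chemical/x-mdl-molfile",  # MDL Molfile
    ".pdb": "chemical/x-pdb",          # Protein Data Bank
    ".rxn": "chemical/x-mdl-rxnfile",  # MDL reaction file
    ".sdf": "chemical/x-mdl-sdfile",   # Structure Data File
    ".xyz": "chemical/x-xyz",          # XYZ coordinates
}


def guess_chemical_mime(filename):
    """Return a Chemical MIME type for the filename, or None if unknown.

    Hypothetical helper: it looks only at the extension, so it cannot
    tell apart unrelated formats that happen to share one.
    """
    ext = os.path.splitext(filename)[1].lower()
    return CHEMICAL_MIME_TYPES.get(ext)


if __name__ == "__main__":
    for name in ("caffeine.mol", "1abc.PDB", "notes.txt"):
        print(name, "->", guess_chemical_mime(name))
```

Extension-based lookup is only a first guess: several extensions on this page (for example .spc and .sdf) are shared by unrelated formats, so inspecting file contents is generally more reliable.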

Earth Sciences

Ecological

Environmental

  • HYT (AquiferTest)

Geographic and Geospatial

See also Geospatial

  • DEM (Digital Elevation Model)
  • DOQ (Digital Orthophotos)
  • e00 (ESRI ArcInfo Interchange File)
  • FGDC (Content Standard for Digital Geospatial Metadata??)
  • GeoTIFF (Geospatial extensions to TIFF)
  • GML (Geography Markup Language)
  • HDFEOS, HD2, HD4 (Hierarchical Data Format-Earth Observing System)
  • KML (formerly Keyhole Markup Language; Version 2.2)
  • NDF (National Landsat Archive Production System (NLAPS) Data Format)
  • SAIF (Spatial Archive and Interchange Format, Canadian)
  • SDTS (Spatial Data Transfer Standard)
  • Shapefile (ESRI, shp/shx)
  • MrSID (Multi-resolution Seamless Image Database)
  • TAB (MapInfo dataset format, must have component)

Mathematical

Microscopy

Neutron and X-ray Scattering

  • canSAS (tools for small-angle scattering)
  • CIF (Crystallographic Information File, standardised by IUCr)
  • NeXus (NeXus is a common data format for neutron, x-ray, and muon science)

Oceanographic, Atmospheric and Meteorological

  • GRIB (Gridded Binary)
  • BUFR (Binary Universal Form for the Representation of meteorological data)
  • IOAPI (netCDF augmented with metadata from the I/O API)
  • Meteosat data
  • PP (UK Met Office format for weather model data)

Physics

See subcategory Physics data

Scientific Signal data

  • ACQ (AcqKnowledge File Format for Windows)
  • BioSemi (BDF) data format
  • BKR (EEG data format)
  • CFWB (Chart Data File Format)
  • EDF (European Data Format)
  • FEF (File Exchange Format for Vital signs)
  • General Data Format for Biosignals (GDF)
  • GMS (Gesture And Motion Signal format)
  • IROCK (intelliRock Sensor Data File Format)
  • MFER (Medical waveform Format Encoding Rules)
  • REC (ATI Vision recorder file)
  • SCP-ECG (Standard Communication Protocol for Computer assisted electrocardiography)
  • SIGIF (SIGnal Interchange Format)

Social Sciences

Spectra

Miscellaneous

Links
