FASTA and FASTQ

From Just Solve the File Format Problem
Revision as of 13:16, 13 October 2014 by Dan Tobias (Talk | contribs)

File Format
Name FASTA and FASTQ
Ontology
Extension(s) .fasta, .fas, .fa, .seq, .fsa, .fna, .ffn, .faa, .mpfa, .frn, .fastq

FASTA and FASTQ are text-based formats used in biology to represent nucleotide (DNA or RNA) or peptide sequences. The FASTA format represents each element of a sequence as a single letter: the standard C, G, T, and A for DNA nucleotides (with U in place of T for RNA), other letters for special uses such as ambiguous positions, and a separate set of letters for the amino acids that make up peptides. FASTQ additionally encodes a quality score for each position in the sequence.
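As a rough illustration of the difference between the two formats, the sketch below reads one FASTA record and one FASTQ record from in-memory strings. It assumes the common conventions that a FASTA record begins with a ">" header line followed by one or more sequence lines, and that a FASTQ record occupies exactly four lines (an "@" header, the sequence, a "+" separator, and a quality string in Phred+33 encoding). None of these details are spelled out on this page, and the function names read_fasta and read_fastq are hypothetical.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def read_fastq(handle):
    """Yield (header, sequence, qualities) from a simple four-line FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of input (or a blank line) stops this simple reader
        sequence = handle.readline().strip()
        handle.readline()  # the "+" separator line is ignored
        quality = handle.readline().strip()
        # Assumption: Phred+33 encoding, so each score is the character code minus 33.
        yield header[1:], sequence, [ord(c) - 33 for c in quality]

fasta_records = list(read_fasta(io.StringIO(">seq1 example\nACGT\nTTGA\n")))
fastq_records = list(read_fastq(io.StringIO("@read1\nACGT\n+\nII5!\n")))
```

With these inputs, the FASTA reader joins the two wrapped sequence lines into a single string, while the FASTQ reader returns one integer score per base alongside the sequence. Real files can deviate from these assumptions (older quality encodings, wrapped FASTQ lines), so an established parsing library is usually a safer choice for production use.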

File extensions

A number of extensions are in use, and their usage is not fully standardized.

  • .fasta, .fas, .fa, .seq, .fsa: Generic FASTA
  • .fna: FASTA nucleic acids
  • .ffn: FASTA nucleotide coding regions for a genome
  • .faa: FASTA amino acids
  • .mpfa: FASTA amino acids in multiple proteins
  • .frn: FASTA non-coding RNA
  • .fastq: FASTQ

Files may also be distributed in compressed form, in which case a second extension is appended, as in .fastq.gz.
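The extension conventions above can be sketched as a small lookup that also copes with a trailing .gz. This is only an illustration: the names EXTENSION_KINDS, classify, and open_text are hypothetical, the mapping simply mirrors the list on this page, and gzip is assumed to be the only compression scheme of interest.

```python
import gzip
import os

# Extension-to-description mapping, copied from the list on this page.
EXTENSION_KINDS = {
    ".fasta": "Generic FASTA",
    ".fas": "Generic FASTA",
    ".fa": "Generic FASTA",
    ".seq": "Generic FASTA",
    ".fsa": "Generic FASTA",
    ".fna": "FASTA nucleic acids",
    ".ffn": "FASTA nucleotide coding regions for a genome",
    ".faa": "FASTA amino acids",
    ".mpfa": "FASTA amino acids in multiple proteins",
    ".frn": "FASTA non-coding RNA",
    ".fastq": "FASTQ",
}

def classify(filename):
    """Return (kind, compressed); kind is None for an unrecognized extension."""
    root, ext = os.path.splitext(filename.lower())
    compressed = ext == ".gz"
    if compressed:
        # Look at the extension underneath the compression suffix.
        root, ext = os.path.splitext(root)
    return EXTENSION_KINDS.get(ext), compressed

def open_text(filename):
    """Open a sequence file as text, decompressing gzip files on the fly."""
    if filename.lower().endswith(".gz"):
        return gzip.open(filename, "rt")
    return open(filename, "r")
```

For example, classify("reads.fastq.gz") reports a compressed FASTQ file, while classify("genome.fna") reports uncompressed FASTA nucleic acids. Because extensions are not fully standardized, a lookup like this is only a hint, and inspecting the first character of the file (">" or "@") may be more reliable.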

Links
