SAM
From Just Solve the File Format Problem
Format type: electronic
Subcategory: Scientific Data formats
Extensions: .sam, .sai
Latest revision as of 03:51, 4 August 2020
SAM (Sequence Alignment/Map) is a data format used for storing DNA sequences mapped (aligned) to a reference. It is in text (tab-separated) form, while its companion format BAM is binary. It is classified as an alignment format, as is CRAM, as opposed to sequence-only, unaligned formats such as FASTA and FASTQ.
Indexes associated with a SAM file are called SAI.
Identification
Files will often start with a header line (though this is technically optional), such as @HD VN:1.6 SO:coordinate, where the VN field gives the version number of the format. It may be followed by an SO or GO field giving the sorting order or the grouping of alignments.
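The header check described above can be sketched in Python. This is a minimal illustration, not a full parser: the function names `parse_hd_line` and `looks_like_sam` are hypothetical, and the sketch assumes the header fields are separated by tabs, as in the rest of the format.

```python
def parse_hd_line(line):
    """Return the TAG:VALUE pairs of an @HD header line as a dict.

    Returns None if the line is not an @HD line. Hypothetical helper,
    assuming tab-separated fields of the form TAG:VALUE.
    """
    fields = line.rstrip("\r\n").split("\t")
    if fields[0] != "@HD":
        return None
    tags = {}
    for field in fields[1:]:
        # Split only on the first colon so values may themselves contain colons.
        tag, sep, value = field.partition(":")
        if sep:
            tags[tag] = value
    return tags


def looks_like_sam(first_line):
    """Heuristic: the first line is an @HD line that carries a VN (version) field."""
    tags = parse_hd_line(first_line)
    return tags is not None and "VN" in tags


if __name__ == "__main__":
    print(parse_hd_line("@HD\tVN:1.6\tSO:coordinate\n"))
```

Because the header is optional, a negative result from `looks_like_sam` does not rule out a SAM file; it only means this particular signature is absent.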