# Canonical Huffman code

From Just Solve the File Format Problem

## Latest revision as of 11:40, 12 September 2020

A **Canonical Huffman code** is essentially a set of tiebreaking rules that establish a single "canonical" Huffman codebook (see Huffman coding), given the set of encoded symbols and their corresponding code lengths (in bits). The main benefit of using a canonical Huffman code is that it reduces the amount of information needed to represent a Huffman codebook. Specifically, only the code length of each symbol needs to be known; the codes themselves can be regenerated from the lengths.
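As an illustration of how a codebook can be rebuilt from code lengths alone, here is a minimal Python sketch. The exact tiebreaking rules differ between formats; this sketch assumes the convention used by DEFLATE (RFC 1951), where shorter codes come first, symbols of equal length are ordered by symbol value, and codes are assigned in increasing numeric order. The function name `canonical_codes` and the treatment of length 0 as "symbol unused" are assumptions made for this example.

```python
def canonical_codes(lengths):
    """Return {symbol: code as a bit string} for a {symbol: code length} map.

    Assumed convention (DEFLATE-style): sort by (length, symbol) and hand
    out consecutive code values, left-shifting whenever the length grows.
    """
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        if length == 0:
            continue  # assumption: a length of 0 marks an unused symbol
        # Moving to a longer length appends zero bits to the running code.
        code <<= length - prev_len
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


# Code lengths taken from the worked example in RFC 1951, section 3.2.2.
example = canonical_codes(
    {"A": 3, "B": 3, "C": 3, "D": 3, "E": 3, "F": 2, "G": 4, "H": 4}
)
for sym in sorted(example):
    print(sym, example[sym])
```

With these lengths the sketch assigns `00` to F, `010` through `110` to A through E, and `1110` and `1111` to G and H, so an encoder and decoder that share only the eight lengths arrive at the same codebook.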

From a file format perspective, the term is normally used with "static" (non-adaptive) Huffman compression schemes in which the Huffman codebook to use is stored in the file prior to the encoded data. Canonical Huffman coding is common, though not universal, in such formats.