Bzip2
Dan Tobias (Talk | contribs) (→Links) |
(→Software: XAD) |
Line 6: | Line 6: | ||
|mimetypes={{mimetype|application/x-bzip2}} | |mimetypes={{mimetype|application/x-bzip2}} | ||
|pronom={{PRONOM|x-fmt/268}} | |pronom={{PRONOM|x-fmt/268}} | ||
+ | |wikidata={{wikidata|Q27866052}} | ||
|released=1997 | |released=1997 | ||
}} | }} | ||
− | '''bzip2''' is a data compression algorithm and compressed file format. | + | '''bzip2''' is a data compression algorithm and compressed file format. It was developed by Julian Seward. |
== Identification == | == Identification == | ||
− | + | A bzip2 file starts with the byte pattern {{magic|42 5a 68 ?? 31 41 59 26 53 59}}. | |
− | + | The first three bytes are ASCII "{{magic|BZh}}". (For signature "{{magic|BZ0}}", refer to the original [[bzip]] format.) The "<code>h</code>" has been said to stand for "Huffman coding", but confirmation is needed. | |
+ | |||
+ | The byte at offset 3 is a code for the block size. Its possible values range from <code>0x31</code> to <code>0x39</code> (ASCII "<code>1</code>" to "<code>9</code>"). | ||
+ | |||
+ | The bytes at offset 4-9 are derived from the digits of the mathematical constant π ([[Binary-coded decimal|BCD]]-encoded). | ||
+ | |||
+ | The end-of-file marker uses magic number (hex) {{magic|17 72 45 38 50 90}}, derived from the square root of π. However, it is not byte-aligned. The result is that one of the following byte sequences appears beginning 10 bytes from the end of the file: | ||
+ | |||
+ | b9 22 9c 28 48 | ||
+ | dc 91 4e 14 24 | ||
+ | ee 48 a7 0a 12 | ||
+ | 77 24 53 85 09 | ||
+ | bb 92 29 c2 84 | ||
+ | 5d c9 14 e1 42 | ||
+ | 2e e4 8a 70 a1 | ||
+ | 17 72 45 38 50 | ||
+ | |||
+ | == Specifications == | ||
+ | * [https://github.com/dsnet/compress/blob/master/doc/bzip2-format.pdf Unofficial specification by Joe Tsai] | ||
== Software == | == Software == | ||
− | * [ | + | * [https://sourceware.org/bzip2/ bzip2 and libbzip2] |
* [[7-Zip]] | * [[7-Zip]] | ||
+ | * {{XAD}} | ||
+ | |||
+ | == Sample files == | ||
+ | * {{DexvertSamples|archive/bz2}} | ||
== See also == | == See also == | ||
* [[Burrows–Wheeler transform]] | * [[Burrows–Wheeler transform]] | ||
− | * [[bzip]] | + | * [[bzip]] (predecessor) |
== Links == | == Links == | ||
* [[Wikipedia:Bzip2|Wikipedia article]] | * [[Wikipedia:Bzip2|Wikipedia article]] | ||
− | * [https:// | + | * [https://sourceware.org/bzip2/ bzip2 and libbzip2 website] |
− | * [https://lwn.net/Articles/762264/ bzip.org changes hands] | + | * [https://github.com/corkami/pics/blob/master/binary/BZ2.png Chart of format details] |
+ | * [https://lwn.net/Articles/762264/ bzip.org changes hands] (LWN article from August 9, 2018) | ||
+ | * [{{ForensicsWikiURL|bzip2}} ForensicsWiki entry] (also includes more details on the headers) | ||
+ | * [http://www.bzip.org/ bzip.org] |
Latest revision as of 10:37, 12 April 2024
bzip2 is a data compression algorithm and compressed file format. It was developed by Julian Seward.
Identification
A bzip2 file starts with the byte pattern 42 5a 68 ?? 31 41 59 26 53 59.
The first three bytes are ASCII "BZh". (For signature "BZ0", refer to the original bzip format.) The "h" has been said to stand for "Huffman coding", but confirmation is needed.
The byte at offset 3 is a code for the block size. Its possible values range from 0x31 to 0x39 (ASCII "1" to "9").
The bytes at offsets 4-9 are derived from the digits of the mathematical constant π (BCD-encoded).
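The header layout described above can be checked with a few lines of Python. This is a minimal sketch, not a full validator: the function names `sniff_bzip2` and `bcd_digits` are hypothetical, and the check assumes the file contains at least one compressed block (a stream produced from empty input has no block header, so bytes 4-9 would differ). The second helper decodes the six magic bytes as packed BCD to show the digits of π.

```python
import bz2
import math

# Bytes at offsets 4-9 of a bzip2 file, as given in the text above.
BLOCK_MAGIC = bytes.fromhex("314159265359")


def sniff_bzip2(data: bytes):
    """Return the block-size digit (1-9) if data matches the header pattern, else None."""
    if len(data) < 10 or data[:3] != b"BZh":
        return None
    level = data[3]
    if not 0x31 <= level <= 0x39:  # ASCII "1" to "9"
        return None
    if data[4:10] != BLOCK_MAGIC:
        return None
    return level - 0x30


def bcd_digits(raw: bytes) -> str:
    """Decode packed BCD: each 4-bit nibble holds one decimal digit."""
    digits = []
    for byte in raw:
        hi, lo = byte >> 4, byte & 0x0F
        if hi > 9 or lo > 9:
            raise ValueError("not valid BCD")
        digits.append(f"{hi}{lo}")
    return "".join(digits)


# Python's standard bz2 module is used here only to produce a sample stream.
sample = bz2.compress(b"hello, bzip2", compresslevel=5)
print(sniff_bzip2(sample))       # 5
print(bcd_digits(BLOCK_MAGIC))   # 314159265359
print(f"{math.pi:.11f}")         # 3.14159265359
```

Comparing the last two printed lines shows that the magic bytes spell out π rounded to eleven decimal places.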
The end-of-file marker uses the magic number (hex) 17 72 45 38 50 90, derived from the square root of π. However, it is not byte-aligned, so one of the following byte sequences appears beginning 10 bytes from the end of the file:
 b9 22 9c 28 48
 dc 91 4e 14 24
 ee 48 a7 0a 12
 77 24 53 85 09
 bb 92 29 c2 84
 5d c9 14 e1 42
 2e e4 8a 70 a1
 17 72 45 38 50
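The eight sequences above are the 48-bit marker shifted right by 1 to 8 bits, keeping the five bytes that are fully determined by the marker. The sketch below regenerates them and checks the tail of a file. It assumes the marker is followed by a 32-bit checksum and 0 to 7 padding bits, which is what places it roughly 10 bytes from the end. The function names are hypothetical.

```python
import bz2

# 48-bit end-of-file magic number from the text above.
END_MARKER = 0x177245385090


def trailer_candidates():
    """Return the five-byte sequences that can start 10 bytes before the end of the file."""
    mask = (1 << 40) - 1  # keep the 40 bits that fill five whole bytes
    return [
        ((END_MARKER >> shift) & mask).to_bytes(5, "big")
        for shift in range(1, 9)
    ]


def has_bzip2_trailer(data: bytes) -> bool:
    """Check whether the bytes at offsets -10..-6 match one of the shifted markers."""
    return len(data) >= 10 and data[-10:-5] in trailer_candidates()


for seq in trailer_candidates():
    print(seq.hex(" "))

# Python's standard bz2 module is used here only to produce a sample stream.
print(has_bzip2_trailer(bz2.compress(b"hello, bzip2")))
```

A shift of 8 corresponds to a byte-aligned marker (no padding bits at the end of the file), and smaller shifts correspond to more padding.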
Specifications
- Unofficial specification by Joe Tsai
Software
- bzip2 and libbzip2
- 7-Zip
- XAD
Sample files
See also
- Burrows–Wheeler transform
- bzip (predecessor)
Links
- Wikipedia article
- bzip2 and libbzip2 website
- Chart of format details
- bzip.org changes hands (LWN article from August 9, 2018)
- ForensicsWiki entry (also includes more details on the headers)
- bzip.org