H2database
From Just Solve the File Format Problem
Revision as of 10:42, 17 September 2023 by Ross-spencer (Talk | contribs)
h2database (H2) is an open source Java SQL database that can be used as an embedded, server, or in-memory database.
Embedded format
H2's embedded format is an MVStore file. An example header is as follows:
00000000: 48 3a 32 2c 62 6c 6f 63 6b 3a 33 2c 62 6c 6f 63  H:2,block:3,bloc
00000010: 6b 53 69 7a 65 3a 31 30 30 30 2c 63 68 75 6e 6b  kSize:1000,chunk
00000020: 3a 37 2c 63 72 65 61 74 65 64 3a 31 38 61 39 66  :7,created:18a9f
00000030: 61 34 64 65 35 39 2c 66 6f 72 6d 61 74 3a 32 2c  a4de59,format:2,
00000040: 76 65 72 73 69 6f 6e 3a 37 2c 66 6c 65 74 63 68  version:7,fletch
00000050: 65 72 3a 33 35 34 65 34 38 65 31 0a 00 00 00 00  er:354e48e1.....
From the documentation:
There are two file headers, which normally contain the exact same data. But once in a while, the file headers are updated, and writing could partially fail, which could corrupt a header. That's why there is a second header. Only the file headers are updated in this way (called "in-place update"). The headers contain the following data:
The data is stored in one file. The file contains two file headers (for safety), and a number of chunks. The file headers are one block each; a block is 4096 bytes.
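The two-header layout described above can be checked with a short script. This is a minimal sketch, not H2's own code: it assumes the two headers occupy the first two 4096-byte blocks of the file and that each header's text ends at the first newline byte, as in the example dump. The function names are hypothetical.

```python
BLOCK_SIZE = 4096  # block size stated in the quoted documentation


def read_file_headers(data: bytes) -> tuple[str, str]:
    """Return the text of the two file headers from the start of an MVStore file.

    Assumption: the headers sit in blocks 0 and 1, and the header text
    ends at the first newline byte within each block.
    """
    headers = []
    for index in range(2):
        block = data[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
        text, _, _ = block.partition(b"\n")
        headers.append(text.decode("latin-1"))
    return headers[0], headers[1]


def headers_match(data: bytes) -> bool:
    """True when both header copies carry identical text."""
    first, second = read_file_headers(data)
    return first == second


# Synthetic two-block image built from the example header, for illustration only.
sample = (
    b"H:2,block:3,blockSize:1000,chunk:7,created:18a9fa4de59,"
    b"format:2,version:7,fletcher:354e48e1\n"
)
image = sample.ljust(BLOCK_SIZE, b"\x00") * 2
print(headers_match(image))
```

For a real file, the first 8192 bytes could be read with `open(path, "rb").read(2 * BLOCK_SIZE)` and passed to `headers_match`. A mismatch does not necessarily mean the file is unusable, since the documentation says the second header exists precisely to survive a failed update of the first.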
The fields are broken down in version 1 of the docs as follows:
- H: The entry "H:2" stands for the H2 database.
- block: The block number where one of the newest chunks starts (but not necessarily the newest).
- blockSize: The block size of the file; currently always hex 1000, which is decimal 4096, to match the disk sector length of modern hard disks.
- chunk: The chunk id, which is normally the same value as the version; however, the chunk id might roll over to 0, while the version doesn't.
- created: The number of milliseconds since 1970 when the file was created.
- format: The file format number. Currently 1 according to that version of the documentation; the example header above reports format:2.
- version: The version number of the chunk.
- fletcher: The Fletcher-32 checksum of the header.
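The field list above can be applied to the example header with a small parser. This is a sketch under assumptions: the documentation only states explicitly that blockSize is hexadecimal, and the code assumes every value is a hexadecimal integer (the letters in the created value suggest as much). The name `parse_header` is hypothetical, and the checksum is decoded but not verified.

```python
from datetime import datetime, timezone

# Header text taken from the example dump above, without the trailing newline.
EXAMPLE_HEADER = (
    "H:2,block:3,blockSize:1000,chunk:7,created:18a9fa4de59,"
    "format:2,version:7,fletcher:354e48e1"
)


def parse_header(text: str) -> dict[str, int]:
    """Split a header into key/value pairs.

    Assumption: every value is a hexadecimal integer.
    """
    fields = {}
    for pair in text.strip().split(","):
        key, _, value = pair.partition(":")
        fields[key] = int(value, 16)
    return fields


fields = parse_header(EXAMPLE_HEADER)

# "created" is documented as milliseconds since 1970.
created = datetime.fromtimestamp(fields["created"] // 1000, tz=timezone.utc)

print(fields["blockSize"])  # 4096
print(created.isoformat())
```

Read this way, the example file's creation time falls in September 2023.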
Additional files in H2's persistent implementation
H2's database objects are described as follows. It may be important to identify these files together in a preservation context, but the documentation is unclear as to whether the database file can be interpreted on its own, without the accompanying files.
File Name | Description | Number of Files |
---|---|---|
test.mv.db | Database file. Contains the transaction log, indexes, and data for all tables. Format: <database>.mv.db | 1 per database |
test.newFile | Temporary file for database compaction. Contains the new MVStore file. Format: <database>.newFile | 0 or 1 per database |
test.tempFile | Temporary file for database compaction. Contains the temporary MVStore file. Format: <database>.tempFile | 0 or 1 per database |
test.lock.db | Database lock file. Automatically (re-)created while the database is in use. Format: <database>.lock.db | 1 per database (only if in use) |
test.trace.db | Trace file (if the trace option is enabled). Contains trace information. Format: <database>.trace.db Renamed to <database>.trace.db.old if too big. | 0 or 1 per database |
test.123.temp.db | Temporary file. Contains a temporary blob or a large result set. Format: <database>.<id>.temp.db | 1 per object |
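To group these files during appraisal, the "Format" patterns in the table above can be turned into a simple classifier. This is an illustrative sketch only: the role labels and the `classify` function are hypothetical, and it assumes that `<id>` in temporary file names is a run of digits, as in the test.123.temp.db example.

```python
import re

# Patterns derived from the "Format" entries in the table above.
# More specific suffixes are listed first so they win over shorter ones.
H2_FILE_PATTERNS = [
    (re.compile(r"^(?P<db>.+)\.trace\.db\.old$"), "trace file (renamed, too big)"),
    (re.compile(r"^(?P<db>.+)\.trace\.db$"), "trace file"),
    (re.compile(r"^(?P<db>.+)\.lock\.db$"), "lock file"),
    # Assumption: <id> is numeric, as in the example name.
    (re.compile(r"^(?P<db>.+)\.\d+\.temp\.db$"), "temporary file"),
    (re.compile(r"^(?P<db>.+)\.mv\.db$"), "database file"),
    (re.compile(r"^(?P<db>.+)\.newFile$"), "compaction file (new MVStore)"),
    (re.compile(r"^(?P<db>.+)\.tempFile$"), "compaction file (temporary MVStore)"),
]


def classify(filename: str):
    """Return (database name, role) for a recognised H2 file name, else None."""
    for pattern, role in H2_FILE_PATTERNS:
        match = pattern.match(filename)
        if match:
            return match.group("db"), role
    return None


for name in ["test.mv.db", "test.lock.db", "test.123.temp.db", "notes.txt"]:
    print(name, "->", classify(name))
```

Matching is by file name only; confirming that a .mv.db file really is an MVStore file would still require inspecting its header.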
Use in digital preservation
Fedora 6.0's webapp is backed by an H2 database by default.