H2database
From Just Solve the File Format Problem
Revision as of 15:30, 17 September 2023 by Ross-spencer (Talk | contribs)
h2database (H2) is an open source Java SQL database that can be used as an embedded, server, or in-memory database.
Embedded format
H2's embedded format is an MVStore file. An example header is as follows:
00000000: 48 3a 32 2c 62 6c 6f 63 6b 3a 33 2c 62 6c 6f 63  H:2,block:3,bloc
00000010: 6b 53 69 7a 65 3a 31 30 30 30 2c 63 68 75 6e 6b  kSize:1000,chunk
00000020: 3a 37 2c 63 72 65 61 74 65 64 3a 31 38 61 39 66  :7,created:18a9f
00000030: 61 34 64 65 35 39 2c 66 6f 72 6d 61 74 3a 32 2c  a4de59,format:2,
00000040: 76 65 72 73 69 6f 6e 3a 37 2c 66 6c 65 74 63 68  version:7,fletch
00000050: 65 72 3a 33 35 34 65 34 38 65 31 0a 00 00 00 00  er:354e48e1.....
From the documentation:
There are two file headers, which normally contain the exact same data. But once in a while, the file headers are updated, and writing could partially fail, which could corrupt a header. That's why there is a second header. Only the file headers are updated in this way (called "in-place update"). The headers contain the following data:
The data is stored in one file. The file contains two file headers (for safety), and a number of chunks. The file headers are one block each; a block is 4096 bytes.
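The two-header layout in the documentation quoted above could be checked along the following lines. This is a minimal sketch, not H2's own code: it assumes the two headers occupy the first two 4096-byte blocks of the file, and that the header text ends at the first newline with the rest of the block being padding (as the example dump suggests). The function names are hypothetical.

```python
BLOCK_SIZE = 4096  # block size stated in the quoted documentation


def read_header_blocks(path):
    """Return the raw bytes of the first two blocks of the file.

    Assumption: the two file headers are stored in blocks 0 and 1.
    """
    with open(path, "rb") as f:
        first = f.read(BLOCK_SIZE)
        second = f.read(BLOCK_SIZE)
    return first, second


def header_text(block):
    """Return the text before the first newline of a header block.

    Assumption: everything after the newline is padding.
    """
    return block.split(b"\n", 1)[0].decode("ascii", errors="replace")


def headers_agree(path):
    """True if both header blocks carry the same header text."""
    first, second = read_header_blocks(path)
    return header_text(first) == header_text(second)
```

A mismatch between the two headers would not necessarily mean the file is unusable; per the documentation, the second header exists precisely because an in-place update of one of them may partially fail.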
The fields are broken down in version 1 of the docs as follows:
- H: The entry "H:2" stands for the H2 database.
- block: The block number where one of the newest chunks starts (but not necessarily the newest).
- blockSize: The block size of the file; currently always hex 1000, which is decimal 4096, to match the disk sector length of modern hard disks.
- chunk: The chunk id, which is normally the same value as the version; however, the chunk id might roll over to 0, while the version doesn't.
- created: The number of milliseconds since 1970 when the file was created.
- format: The file format number. Currently 1.
- version: The version number of the chunk.
- fletcher: The Fletcher-32 checksum of the header.
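The field list above can be illustrated by splitting the example header into key/value pairs. The sketch below is only an illustration, not H2's parser: it assumes every value is hexadecimal (the documentation states this only for blockSize, though the created value visibly contains hex digits), and it does not verify the Fletcher-32 checksum. The name parse_mvstore_header is hypothetical.

```python
from datetime import datetime, timezone

# Header text transcribed from the example hex dump on this page.
EXAMPLE_HEADER = (
    b"H:2,block:3,blockSize:1000,chunk:7,created:18a9fa4de59,"
    b"format:2,version:7,fletcher:354e48e1\n"
)


def parse_mvstore_header(block):
    """Split a header block into a dict of integer fields.

    Assumptions: the header ends at the first newline, pairs are
    separated by commas, and every value is a hexadecimal number.
    """
    text = block.split(b"\n", 1)[0].decode("ascii")
    fields = {}
    for pair in text.split(","):
        key, _, value = pair.partition(":")
        fields[key] = int(value, 16)
    return fields


fields = parse_mvstore_header(EXAMPLE_HEADER)

# "created" is milliseconds since 1970, so divide by 1000 for seconds.
created = datetime.fromtimestamp(fields["created"] / 1000, tz=timezone.utc)

print(fields["blockSize"])  # 4096
print(fields["format"], fields["version"], fields["chunk"])
print(created.isoformat())
```

Keeping the values as integers makes the blockSize and created fields directly usable; the fletcher value is left as a plain integer here because checksum verification is out of scope for this sketch.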
Versioning
According to the developer [1], the format number doesn't map 1:1 onto major H2 releases. It maps as follows:
No, it doesn't. Historic version of H2 (1.4.200 and older) use format 1, H2 2.0.* and 2.1.* use format 2, H2 2.2.* uses format 3. You cannot read actual version of H2 from headers of MVStore files, MVStore doesn't preserve it, only H2 does.
Format signature development therefore might be as granular as overall "version" but it won't map to specific software.
Additional files in H2's persistent implementation
H2's database objects are described as follows. It may be important to identify these together in a preservation context but the documentation is unclear as to whether the database file can be interpreted in its own right.
- test.mv.db: Database file. Contains the transaction log, indexes, and data for all tables. Format: <database>.mv.db. Number of files: 1 per database.
- test.newFile: Temporary file for database compaction. Contains the new MVStore file. Format: <database>.newFile. Number of files: 0 or 1 per database.
- test.tempFile: Temporary file for database compaction. Contains the temporary MVStore file. Format: <database>.tempFile. Number of files: 0 or 1 per database.
- test.lock.db: Database lock file. Automatically (re-)created while the database is in use. Format: <database>.lock.db. Number of files: 1 per database (only if in use).
- test.trace.db: Trace file (if the trace option is enabled). Contains trace information. Format: <database>.trace.db. Renamed to <database>.trace.db.old if too big. Number of files: 0 or 1 per database.
- test.123.temp.db: Temporary file. Contains a temporary blob or a large result set. Format: <database>.<id>.temp.db. Number of files: 1 per object.
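To identify these files together, for example when appraising a directory, the naming patterns above could be matched roughly as follows. This is a sketch under assumptions: the function name and role labels are hypothetical, and <id> is assumed to be numeric because the only example given is 123.

```python
import re

# (pattern, role) pairs derived from the file descriptions above.
# Order matters: ".trace.db.old" must be tested before ".trace.db".
_H2_FILE_ROLES = [
    (re.compile(r"\.trace\.db\.old$"), "old trace file"),
    (re.compile(r"\.trace\.db$"), "trace file"),
    (re.compile(r"\.lock\.db$"), "lock file"),
    # Assumption: <id> consists of digits only.
    (re.compile(r"\.\d+\.temp\.db$"), "temporary file"),
    (re.compile(r"\.mv\.db$"), "database file"),
    (re.compile(r"\.newFile$"), "compaction file (new MVStore)"),
    (re.compile(r"\.tempFile$"), "compaction file (temporary MVStore)"),
]


def classify_h2_file(name):
    """Return a role label for an H2 file name, or None if unrecognised."""
    for pattern, role in _H2_FILE_ROLES:
        if pattern.search(name):
            return role
    return None


for example in ["test.mv.db", "test.lock.db", "test.123.temp.db", "notes.txt"]:
    print(example, "->", classify_h2_file(example))
```

Matching on file names alone is only a first pass; a name such as test.mv.db says nothing about whether the content is actually an MVStore file, so a header check remains useful.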
Use in digital preservation
Fedora 6.0's webapp is backed by an H2 database by default.
Further information