H2database

From Just Solve the File Format Problem
File Format
Name H2database
Ontology
Extension(s) .db

h2database (H2) is an open source SQL database written in Java that can be used as an embedded, server, or in-memory database.

Embedded format

H2's embedded format is an MVStore file. An example header is as follows:

00000000: 48 3a 32 2c 62 6c 6f 63 6b 3a 33 2c 62 6c 6f 63  H:2,block:3,bloc
00000010: 6b 53 69 7a 65 3a 31 30 30 30 2c 63 68 75 6e 6b  kSize:1000,chunk
00000020: 3a 37 2c 63 72 65 61 74 65 64 3a 31 38 61 39 66  :7,created:18a9f
00000030: 61 34 64 65 35 39 2c 66 6f 72 6d 61 74 3a 32 2c  a4de59,format:2,
00000040: 76 65 72 73 69 6f 6e 3a 37 2c 66 6c 65 74 63 68  version:7,fletch
00000050: 65 72 3a 33 35 34 65 34 38 65 31 0a 00 00 00 00  er:354e48e1.....

From the documentation:

There are two file headers, which normally contain the exact same data. But once in a while, the file headers are updated, and writing could partially fail, which could corrupt a header. That's why there is a second header. Only the file headers are updated in this way (called "in-place update"). The headers contain the following data:
The data is stored in one file. The file contains two file headers (for safety), and a number of chunks. The file headers are one block each; a block is 4096 bytes.
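
The two-header layout can be illustrated with a minimal Python sketch. It assumes the two headers occupy the first two 4096-byte blocks of the file (offsets 0 and 4096) and that the header text ends at the first newline, as in the hex dump above; the documentation quoted here does not state the offsets, and `read_file_headers` is a hypothetical helper name.

```python
BLOCK_SIZE = 4096  # one block, per the documentation quoted above


def read_file_headers(path):
    """Return the header text found in each of the first two blocks."""
    with open(path, "rb") as fh:
        raw = fh.read(2 * BLOCK_SIZE)
    headers = []
    for index in range(2):
        block = raw[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
        # Assumption: the text ends at the first newline and the rest
        # of the block is zero padding.
        text = block.split(b"\n", 1)[0].decode("latin-1")
        headers.append(text)
    return headers
```

If the two returned strings differ, one header may have been damaged by a partial write, which is the case the second copy exists for.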

The fields are broken down in version 1 of the docs as follows:

  • H: The entry "H:2" stands for the H2 database.
  • block: The block number where one of the newest chunks starts (but not necessarily the newest).
  • blockSize: The block size of the file; currently always hex 1000, which is decimal 4096, to match the disk sector length of modern hard disks.
  • chunk: The chunk id, which is normally the same value as the version; however, the chunk id might roll over to 0, while the version doesn't.
  • created: The number of milliseconds since 1970 when the file was created.
  • format: The file format number. Currently 1.
  • version: The version number of the chunk.
  • fletcher: The Fletcher-32 checksum of the header.
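
The field list above can be turned into a small Python sketch that parses the example header and checks its checksum. Several points are assumptions rather than statements from the quoted documentation: that numeric values are hexadecimal (as blockSize and created clearly are), that the checksum covers the bytes before ",fletcher:", and that the Fletcher-32 variant sums 16-bit big-endian words with both sums seeded to 0xffff and an odd trailing byte padded with zero. Under those assumptions the sketch reproduces the checksum in the example header. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Header text from the hex dump above, without the newline and padding.
EXAMPLE = (
    "H:2,block:3,blockSize:1000,chunk:7,created:18a9fa4de59,"
    "format:2,version:7,fletcher:354e48e1"
)


def parse_header(text):
    """Split 'key:value,key:value' header text into a dict of strings."""
    return dict(item.split(":", 1) for item in text.strip().split(","))


def fold16(value):
    """Reduce a sum to 16 bits with end-around carry."""
    while value >> 16:
        value = (value & 0xFFFF) + (value >> 16)
    return value


def fletcher32(data):
    """Fletcher-32 over big-endian 16-bit words (assumed variant)."""
    s1 = s2 = 0xFFFF
    even_len = len(data) & ~1
    for i in range(0, even_len, 2):
        s1 += (data[i] << 8) | data[i + 1]
        s2 += s1
    if len(data) & 1:
        # Odd length: treat the last byte as the high byte of a word.
        s1 += data[-1] << 8
        s2 += s1
    return (fold16(s2) << 16) | fold16(s1)


def checksum_ok(text):
    """Compare the stored checksum with one computed over the other fields."""
    body, sep, stored = text.strip().rpartition(",fletcher:")
    if not sep:
        return False
    return fletcher32(body.encode("latin-1")) == int(stored, 16)


fields = parse_header(EXAMPLE)
block_size = int(fields["blockSize"], 16)
created = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
    milliseconds=int(fields["created"], 16)
)
```

For the example header, `block_size` is 4096, `created` falls on 16 September 2023 (UTC), and `checksum_ok(EXAMPLE)` returns `True`.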


Versioning

According to the developer [1], the format number does not map 1:1 onto major H2 releases. In the developer's words:

No, it doesn't. Historic version of H2 (1.4.200 and older) use format 1, H2 2.0.* and 2.1.* use format 2, H2 2.2.* uses format 3. You cannot read actual version of H2 from headers of MVStore files, MVStore doesn't preserve it, only H2 does.

Format signatures can therefore be as granular as the format number in the header, but that number identifies only a range of H2 releases, not a specific software version.
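
The developer's mapping can be written down as a small Python lookup table. This is only a sketch of the statement quoted above: formats outside 1 to 3 are reported as unknown, the format field is parsed as hexadecimal by assumption, and `releases_for_format` is a hypothetical helper name.

```python
# Format number -> H2 releases, as stated by the developer [1].
FORMAT_TO_RELEASES = {
    1: "H2 1.4.200 and older",
    2: "H2 2.0.* and 2.1.*",
    3: "H2 2.2.*",
}


def releases_for_format(format_field):
    """Map the header's 'format' value to the H2 releases that write it."""
    # Assumption: the value is hexadecimal, like other header fields.
    number = int(format_field, 16)
    return FORMAT_TO_RELEASES.get(number, "unknown")
```

The example header above carries format:2, so `releases_for_format("2")` points to the H2 2.0.* and 2.1.* range.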

Additional files in H2's persistent implementation

The files that make up a persistent H2 database are described as follows. It may be important to identify these together in a preservation context, but the documentation is unclear as to whether the database file can be interpreted on its own.

File Name | Description | Number of Files
test.mv.db | Database file. Contains the transaction log, indexes, and data for all tables. Format: <database>.mv.db | 1 per database
test.newFile | Temporary file for database compaction. Contains the new MVStore file. Format: <database>.newFile | 0 or 1 per database
test.tempFile | Temporary file for database compaction. Contains the temporary MVStore file. Format: <database>.tempFile | 0 or 1 per database
test.lock.db | Database lock file. Automatically (re-)created while the database is in use. Format: <database>.lock.db | 1 per database (only if in use)
test.trace.db | Trace file (if the trace option is enabled). Contains trace information. Format: <database>.trace.db. Renamed to <database>.trace.db.old if too big. | 0 or 1 per database
test.123.temp.db | Temporary file. Contains a temporary blob or a large result set. Format: <database>.<id>.temp.db | 1 per object
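
To identify these files together, the naming patterns in the table can be matched by suffix. The Python sketch below is illustrative only: `classify_h2_file` is a hypothetical helper, and it assumes the `<id>` in temporary file names is numeric, as in the test.123.temp.db example.

```python
import re

# Suffix patterns taken from the "Format" entries in the table above.
H2_FILE_PATTERNS = [
    (re.compile(r"\.mv\.db$"), "database file"),
    (re.compile(r"\.newFile$"), "compaction: new MVStore file"),
    (re.compile(r"\.tempFile$"), "compaction: temporary MVStore file"),
    (re.compile(r"\.lock\.db$"), "lock file"),
    (re.compile(r"\.trace\.db(\.old)?$"), "trace file"),
    # Assumption: <id> is a run of digits.
    (re.compile(r"\.\d+\.temp\.db$"), "temporary blob or result set"),
]


def classify_h2_file(name):
    """Return a short description of an H2 file name, or None."""
    for pattern, description in H2_FILE_PATTERNS:
        if pattern.search(name):
            return description
    return None
```

Running this over a directory listing groups the database file with any lock, trace, or temporary files that accompany it.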


Use in digital preservation

Fedora 6.0's webapp is backed by an H2 database by default.

Further information
