Gwtar
From Just Solve the File Format Problem
Gwtar (pronounced like "guitar") is a format for storing a full webpage in a single HTML file, developed in 2026 by Gwern Branwen and Said Achmiz.
The format was developed as a convenient way to archive webpages while satisfying three criteria:
- the full contents of the webpage, including images, scripts, etc. must be stored locally;
- the webpage and all its assets must be stored as a single file;
- users must be able to browse the archived webpage without first downloading the entire file (the parts of the file that hold webpage assets are downloaded as needed).
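The third criterion can be illustrated with a minimal sketch. It assumes that on-demand download is done with HTTP Range requests, which this page does not state; the function names, URL parameter, and offsets are hypothetical and are not taken from the Gwtar specification.

```python
import urllib.request


def range_header(offset: int, length: int) -> dict[str, str]:
    """Build an HTTP Range header for `length` bytes starting at `offset`.

    HTTP byte ranges are inclusive at both ends, hence the `- 1`.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def fetch_span(url: str, offset: int, length: int) -> bytes:
    """Fetch one byte span of a remote file (hypothetical helper).

    Assumes the server honours Range requests; a server that ignores them
    answers 200 with the whole file instead of 206 with the span.
    """
    req = urllib.request.Request(url, headers=range_header(offset, length))
    with urllib.request.urlopen(req) as resp:
        return resp.read()


# Ask only for the 100 bytes that start at byte 512 of the archive.
print(range_header(512, 100))
```

Only the requested span crosses the network, so a reader who views one image never pays for the rest of the file.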
A Gwtar file consists of three parts:
- the header - a combination of HTML, JavaScript, and JSON, which contains the actual HTML markup, as well as scripts which handle piece-by-piece download of the rest of the file;
- all of the webpage resources, stored in tarball format;
- any other arbitrary data (e.g. metadata, electronic signatures, error correction codes).
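The three-part layout can be sketched as a toy model: a header, then a tarball, then trailing data, concatenated into one byte string. This is an illustration only and not the real Gwtar encoding. The `toy-index` script element, the index fields, and both function names are assumptions made up for this example; the actual header layout is defined by the specification.

```python
import io
import json
import tarfile


def build_toy_archive(html: str, assets: dict[str, bytes], trailer: bytes = b""):
    """Concatenate header + tarball + trailing data (toy layout, not real Gwtar)."""
    # Part 2: pack the assets into an uncompressed tarball in memory.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in assets.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    tar_bytes = buf.getvalue()

    # Record where each asset's data sits inside the tarball.
    index = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar.getmembers():
            index[member.name] = {"offset": member.offset_data, "size": member.size}

    # Part 1: the header carries the markup plus a JSON index (hypothetical form).
    header = (
        html
        + '<script type="application/json" id="toy-index">'
        + json.dumps(index)
        + "</script>\n"
    ).encode("utf-8")

    # Part 3: arbitrary trailing data (e.g. a signature) goes after the tarball.
    return header + tar_bytes + trailer, len(header), index


def read_asset(blob: bytes, header_len: int, index: dict, name: str) -> bytes:
    """Slice one asset out of the combined file using the recorded offsets."""
    entry = index[name]
    start = header_len + entry["offset"]
    return blob[start:start + entry["size"]]


blob, header_len, index = build_toy_archive(
    "<html><body>archived page</body></html>",
    {"img/a.png": b"PNGDATA", "s.js": b"alert(1)"},
    trailer=b"SIG",
)
print(read_asset(blob, header_len, index, "s.js"))
```

Because the index stores a byte offset and size for each asset, a reader needs the header and only the spans it actually uses, which is what makes piece-by-piece download possible.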
Specifications
- Gwtar: a static efficient single-file HTML format, from gwern.net