Archive Team hostname file

From Just Solve the File Format Problem
File Format
Name Archive Team hostname file
Ontology
Extension(s) .hostnames

When Archive Team prepares to archive a multi-user, multi-hostname site that is about to be shut down (e.g., Posterous), an early step is often to obtain, through automated scripted access, a list of the hostnames used on that site. A later stage of archiving can then retrieve the web data from those hostnames.

The format is simple: plain ASCII with Unix-style line breaks (LF, hex 0A), one hostname per line. Each line consists of a sequential serial number, a tab character (hex 09), and then the hostname:

2000001	dwellz.posterous.com
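As an illustration of the line layout described above, here is a minimal Python sketch of a reader. The function name parse_hostnames is hypothetical and not part of any Archive Team tooling; the sketch assumes the input has already been decompressed and decoded to text.

```python
def parse_hostnames(text):
    """Parse .hostnames content into a list of (serial, hostname) pairs."""
    entries = []
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty string left after the final LF
        # Each line is: serial number, tab (hex 09), hostname.
        serial, hostname = line.split("\t", 1)
        entries.append((int(serial), hostname))
    return entries


sample = "2000001\tdwellz.posterous.com\n"
print(parse_hostnames(sample))  # [(2000001, 'dwellz.posterous.com')]
```

Splitting on the first tab only keeps the hostname intact even if a malformed line contains further tabs; a stricter reader might reject such lines instead.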

The file is saved with a .hostnames extension and a filename that is one less than the serial number on the first line of the file (e.g., 2000000.hostnames for a file whose first serial number is 2000001). It is then compressed in gzip format for upload/download.
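The naming and compression rule can be sketched in Python as follows. This is only an illustration under stated assumptions: write_hostnames is a hypothetical helper, the .gz suffix on the compressed file is an assumption (the description only says gzip is used), and example.posterous.com is a made-up hostname.

```python
import gzip
import os
import tempfile


def write_hostnames(hostnames, first_serial, directory):
    """Write a gzip-compressed .hostnames file and return its path."""
    # One line per hostname: serial number, tab, hostname, LF.
    lines = [f"{first_serial + i}\t{host}\n" for i, host in enumerate(hostnames)]
    # The filename is one less than the first serial number in the file.
    # The ".gz" suffix is an assumption, not stated in the format description.
    path = os.path.join(directory, f"{first_serial - 1}.hostnames.gz")
    with gzip.open(path, "wb") as f:
        f.write("".join(lines).encode("ascii"))
    return path


with tempfile.TemporaryDirectory() as tmp:
    # "example.posterous.com" is a hypothetical second entry.
    hosts = ["dwellz.posterous.com", "example.posterous.com"]
    path = write_hostnames(hosts, 2000001, tmp)
    print(os.path.basename(path))  # 2000000.hostnames.gz
```

Encoding with "ascii" makes the writer fail loudly on any non-ASCII hostname rather than silently producing a file outside the described format.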
