CDX

From Just Solve the File Format Problem
A '''CDX''' file contains metadata related to web archiving. It is a text-based file format designed to accompany [[ARC (Internet Archive)|ARC]] or [[WARC]] files.

Usually, a CDX file contains little or no information that is not already contained in, or derivable from, the ARC or WARC file it accompanies; its primary purpose is to serve as an index of that file.
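As a rough illustration of how such a text-based index might be read, the sketch below assumes the header convention described in the c.2006 specification: the first character of the first line is the field delimiter, followed by the literal <code>CDX</code> and a legend of single-letter field codes. The function name <code>parse_cdx</code> and the sample record are hypothetical, and the field letters are kept as opaque keys rather than interpreted.

```python
# Minimal sketch of a CDX reader, under the assumptions stated above.
# parse_cdx and SAMPLE are illustrative names, not part of any specification.

# Hypothetical two-line CDX file: a header legend and one index record.
SAMPLE = (
    " CDX N b a m s k r M S V g\n"
    "example.com/ 20230603202400 http://example.com/ text/html 200 "
    "ABC123 - - 1234 5678 example.warc.gz\n"
)


def parse_cdx(text):
    """Return a list of dicts, one per record, keyed by header field letter."""
    lines = text.splitlines()
    if not lines:
        return []

    header = lines[0]
    delimiter = header[0]                  # assumed: first character is the delimiter
    legend = header[1:].split(delimiter)   # e.g. ["CDX", "N", "b", ...]
    if legend[0] != "CDX":
        raise ValueError("first line does not look like a CDX header")
    fields = legend[1:]

    records = []
    for line in lines[1:]:
        if not line.strip():
            continue                       # skip blank lines
        values = line.split(delimiter)
        # zip() silently drops extra values if a line is longer than the legend.
        records.append(dict(zip(fields, values)))
    return records


if __name__ == "__main__":
    for record in parse_cdx(SAMPLE):
        print(record)
```

Because the legend is read from the file itself, this sketch does not hard-code a column order; which letters appear, and what each one means, should be checked against the specification that applies to the file at hand.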
  
 
== Specifications ==

* [https://iipc.github.io/warc-specifications/specifications/cdx-format/cdx-2006/ The CDX File Format (c.2006)]
* [https://github.com/iipc/warc-specifications/blob/master/specifications/cdx-format/cdx-2015/index.md The CDX File Format (c.2015)]
* [https://archive.org/web/researcher/cdx_file_format.php CDX File Format]

== Other links and references ==

Latest revision as of 20:24, 3 June 2023

File Format
Name CDX
Ontology
Extension(s) .cdx
PRONOM fmt/869
Released ~2006

