PDFXML

From Just Solve the File Format Problem
File Format
Name PDFXML
Ontology
Extension(s) .mars, .pdfxml
MIME Type(s) application/vnd.adobe.x-mars, application/vnd.adobe.pdfxml
Released 2006

Adobe Labs developed a plugin for Acrobat to create an "XML-friendly representation of PDF".[1] Originally called the MARS project[2], it was later renamed PDFXML. The project started in 2006, was shut down in 2011, and was then removed from public access.[3]

File Information

The PDFXML file format uses ZIP as its container format, with each page stored as SVG and each image as JPEG2000.

File format specifications and schemas were published at the time, but the archive.org captures of the website contain only truncated PDFs.[4] If anyone has copies of the original specification or schema, please post a link here.

A basic container has the following structure:

├── META-INF
│   ├── compatibility.pdf
│   ├── container.xml
│   └── metadata.xml
├── backbone.xml
├── bookmarks.xml
├── color
│   └── cs-0.icc
├── form
│   └── form_data.xfdf
├── mimetype
├── page
│   └── 0
│       ├── form_0.svg
│       ├── form_1.svg
│       ├── form_2.svg
│       ├── info.xml
│       ├── pg.can
│       └── pg.svg
└── script
    ├── javascripts.xml
    └── js_0
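
Because the container is described as a ZIP file, a layout like the one above can be inspected with any ZIP library. The following Python sketch is illustrative only: the function names summarize_container and page_svgs are hypothetical, and the code assumes that a given .mars or .pdfxml file opens with a standard ZIP reader and that pages follow the page/<n>/pg.svg naming seen in the sample listing. Neither assumption has been checked against the original specification, which is not available.

```python
import zipfile
from collections import defaultdict


def summarize_container(source):
    """Group the file entries of a ZIP-based container by top-level directory.

    `source` is a path or a binary file-like object. Entries stored at the
    top level of the archive are grouped under the key "(root)".
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            top, sep, _rest = name.partition("/")
            groups[top if sep else "(root)"].append(name)
    return {key: sorted(names) for key, names in groups.items()}


def page_svgs(source):
    """Return the per-page SVG entries.

    Assumption: each page lives at page/<n>/pg.svg, as in the sample
    listing; this is not taken from the specification.
    """
    with zipfile.ZipFile(source) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.startswith("page/") and name.endswith("/pg.svg")
        )
```

For example, summarize_container("sample.mars") (a hypothetical file name) would return a dictionary with keys such as "META-INF", "page" and "(root)", which can be compared against the tree above.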

References

  1. https://web.archive.org/web/20080919181131/https://blogs.adobe.com/mars/2008/09/pdfxml_plugin_prerelease.html
  2. http://csis.pace.edu/~marchese/CS835/Student_Readings/p161-hardy.pdf
  3. https://web.archive.org/web/20110902063203/http://labs.adobe.com/technologies/mars
  4. https://web.archive.org/web/20061222203316/http://labs.adobe.com/technologies/mars/