PDFXML
From Just Solve the File Format Problem
Latest revision as of 10:40, 19 June 2026
Adobe Labs developed a plugin for Acrobat to create an "XML-friendly representation of PDF".[1] Originally called the MARS project,[2] it was later renamed PDFXML. The project started in 2006, was shut down in 2011, and was then removed from public access.[3]
File Information
The PDFXML file format uses ZIP as its container format, with SVG for each page and JPEG 2000 for each image.
File format specifications and a schema were published at the time, but unfortunately the archive.org captures of the website contain only truncated PDFs.[4] If anyone has copies of the original specification or schema, please post a link here.
A basic container has the following structure:
├── META-INF
│   ├── compatibility.pdf
│   ├── container.xml
│   └── metadata.xml
├── backbone.xml
├── bookmarks.xml
├── color
│   └── cs-0.icc
├── form
│   └── form_data.xfdf
├── mimetype
├── page
│   └── 0
│       ├── form_0.svg
│       ├── form_1.svg
│       ├── form_2.svg
│       ├── info.xml
│       ├── pg.can
│       └── pg.svg
└── script
    ├── javascripts.xml
    └── js_0
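The layout above can be inspected with any ZIP library. The following is a minimal sketch in Python, not a validator: the function name summarize_pdfxml is hypothetical, and the checks rely only on the entry names shown in the listing above. The expected content of the mimetype entry is not documented on this page, so the sketch only reports it.

```python
import io
import zipfile
from collections import defaultdict


def summarize_pdfxml(source):
    """Summarize a PDFXML-style ZIP container (hypothetical helper).

    `source` may be a file path or a binary file-like object.
    Only the entry names from the example listing are assumed.
    """
    with zipfile.ZipFile(source) as zf:
        names = zf.namelist()

        # Report the mimetype entry verbatim; its expected value is an
        # unknown here because the original specification is unavailable.
        mimetype = None
        if "mimetype" in names:
            mimetype = zf.read("mimetype").decode("ascii", "replace").strip()

        # Group files under page/<n>/ by page directory name.
        pages = defaultdict(list)
        for name in names:
            parts = name.split("/")
            if len(parts) == 3 and parts[0] == "page" and parts[2]:
                pages[parts[1]].append(parts[2])

    return {
        "mimetype": mimetype,
        "has_backbone": "backbone.xml" in names,
        "pages": {page: sorted(files) for page, files in pages.items()},
    }
```

Calling summarize_pdfxml("example.pdfxml") on a file with the structure above (the file name is a placeholder) would be expected to report one page, "0", with its pg.svg, pg.can, info.xml and form_*.svg entries.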
Further information
Information not immediately visible through the references below.
- MARS FAQ
- MARS via PDF Junkie
- Martin Kováč on PDF XML
- Eliot Kimber - Adobe Mars: Looks Interesting
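- Microsoft.Fandom: Adobe Mars (https://microsoft.fandom.com/wiki/Portable_Document_Format#Mars)
- The Mars project: PDF in XML by Matthew R. B. Hardy on ResearchGate (https://www.researchgate.net/publication/221353051_The_Mars_project_PDF_in_XML); also online at Pace University
- Mapping and Displaying Structural Transformations between XML and PDF (https://scispace.com/pdf/mapping-and-displaying-structural-transformations-between-32fig49lb8.pdf)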
References
[1] https://web.archive.org/web/20080919181131/https://blogs.adobe.com/mars/2008/09/pdfxml_plugin_prerelease.html
[2] http://csis.pace.edu/~marchese/CS835/Student_Readings/p161-hardy.pdf
[3] https://web.archive.org/web/20110902063203/http://labs.adobe.com/technologies/mars
[4] https://web.archive.org/web/20061222203316/http://labs.adobe.com/technologies/mars/