Chrome bookmarks

From Just Solve the File Format Problem
(Difference between revisions)
(Date format)
 
(9 intermediate revisions by 3 users not shown)
Line 3: Line 3:
 
|subcat=Web
 
|subcat=Web
 
|subcat2=Web browser files
 
|subcat2=Web browser files
 +
|released=2008
 
}}
 
}}
  
 
Bookmarks in Google Chrome are stored in a file named '''Bookmarks''' (with no extension) in a data directory with a location that is system-specific, but in Windows Vista can be found in the directory '''\Users\''username''\AppData\Local\Google\Chrome\User Data\Default'''. A backup copy of the previous version of the file is saved in '''Bookmarks.bak'''.
 
Bookmarks in Google Chrome are stored in a file named '''Bookmarks''' (with no extension) in a data directory with a location that is system-specific, but in Windows Vista can be found in the directory '''\Users\''username''\AppData\Local\Google\Chrome\User Data\Default'''. A backup copy of the previous version of the file is saved in '''Bookmarks.bak'''.
  
The format does not appear to be documented anywhere, but it seems to be a hierarchical structure using lots of curly braces {} and some square brackets [], with the bookmarks stored as a sequence of quoted name and value pairs, one to a line, with a colon followed by a space between the attribute name and value. Attribute names include "date added", "id", "name", "type", and "url". The "id" values are numeric, probably assigned sequentially as bookmarks are added.
+
The format does not appear to be documented anywhere, but the file is in [[JSON]] format, with a hierarchical structure corresponding to the bookmark folder structure. Attribute names in a bookmark entry include "date_added", "id", "name", "type", and "url". The "id" values are numeric, probably assigned sequentially as bookmarks are added.
 +
 
 +
== Character set ==
 +
 
 +
The file appears to be expressed in [[ASCII]], with line breaks as CRLF (0D+0A), though the file examined was from the Windows version; it is possible that bookmark files from other systems could use line-break conventions appropriate to those systems. Non-ASCII characters are given as escape sequences like '''\u00ED''', representing the [[Unicode]] character given by a four-digit hexadecimal number following "\u" (in this case the accented letter í).
  
 
== Date format ==
 
== Date format ==
  
The date format consists of huge numbers like 12871673787657328 or 12605573593000000. These examples are taken from an actual bookmark file; the one with six zeroes at the end is a clue that perhaps the time units are some tiny fraction of a second. Stripping the zeroes yields 12605573593, still one digit longer than current Unix-style timestamps, but stripping the final 3 gets a timestamp that translates to Fri, 11 Dec 2009 18:49:19 GMT, which may be the correct timestamp for the bookmark; this means that the timestamps are expressed with seven digits of precision beneath the seconds (in other words, in ten-millionths of a second). Or perhaps the added digits have some other specialized meaning; apparently only Google knows.
+
Dates are stored as very large numbers such as 12871673787657328 or 12605573593000000. These examples are taken from an actual bookmark file; the six zeroes at the end of the second one suggest that the time unit is a small fraction of a second. This differs from normal [[Unix time|Unix-style timestamps]], which count seconds since 1970.
 +
 
 +
In fact, dates in the Chrome bookmark file, such as the <tt>date_added</tt> field, are in ''microseconds'' since January 1, 1601 (a much earlier epoch than the Unix 1970 date). This differs from [[Windows FILETIME]], which counts 100-nanosecond intervals from the same epoch, only by a factor of 10. See the open-source [http://dev.chromium.org/Home Chromium] source code, and search for <tt>MicrosecondsToFileTime</tt>. Sample conversion code can be found in the Stack Overflow question [http://stackoverflow.com/questions/19074423/how-to-parse-the-date-added-field-in-chrome-bookmarks-file "How to parse the date_added field in Chrome bookmarks file?"]
 +
 
 +
Note that the internal Chrome bookmark file date format differs from the date format of the Netscape-style exported bookmark HTML file. Dates in the exported HTML file, such as the <tt>ADD_DATE</tt> field, are in ''seconds'' since January 1, 1970.
 +
 
 +
== Source code ==
 +
 
 +
More information about the format could possibly be gleaned from the open-source [http://dev.chromium.org/Home Chromium] source code, if it deals with bookmarks the same way as Chrome itself. Try searching on "bookmarks". The code there reportedly says that timestamps are milliseconds since the epoch, but the stored values have more digits than that would allow and, as described under Date format, are in fact microseconds since January 1, 1601.
 +
 
 +
[[Category:Google]]
 +
[[Category:JSON based file formats]]

Latest revision as of 17:02, 19 July 2017

File Format
Name Chrome bookmarks
Ontology
Released 2008

Bookmarks in Google Chrome are stored in a file named Bookmarks (with no extension) in a data directory with a location that is system-specific, but in Windows Vista can be found in the directory \Users\username\AppData\Local\Google\Chrome\User Data\Default. A backup copy of the previous version of the file is saved in Bookmarks.bak.

The format does not appear to be documented anywhere, but the file is in JSON format, with a hierarchical structure corresponding to the bookmark folder structure. Attribute names in a bookmark entry include "date_added", "id", "name", "type", and "url". The "id" values are numeric, probably assigned sequentially as bookmarks are added.
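
Because the file is JSON, any JSON parser can read it. The following Python sketch walks the hierarchy and collects every entry that has a "url" attribute. The sample document is hypothetical: the "roots", "bookmark_bar", and "children" keys are assumptions used for illustration and are not described above, which is why the walker deliberately does not depend on any particular folder key names.

```python
import json


def iter_bookmarks(node):
    """Yield every JSON object that has a "url" attribute, at any depth."""
    if isinstance(node, dict):
        if "url" in node:
            yield node
        for value in node.values():
            yield from iter_bookmarks(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_bookmarks(item)


# Hypothetical sample; key names other than the five attributes listed
# above ("date_added", "id", "name", "type", "url") are assumptions.
SAMPLE = """
{
  "roots": {
    "bookmark_bar": {
      "name": "Bookmarks bar",
      "type": "folder",
      "children": [
        {"date_added": "12871673787657328", "id": "2",
         "name": "Example", "type": "url", "url": "http://example.com/"}
      ]
    }
  }
}
"""

bookmarks = list(iter_bookmarks(json.loads(SAMPLE)))
names = [b["name"] for b in bookmarks]
print(names)
```

To read a real file, the same function could be applied to the result of `json.load()` on the opened Bookmarks file.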

Character set

The file appears to be expressed in ASCII, with line breaks as CRLF (0D+0A), though the file examined was from the Windows version; it is possible that bookmark files from other systems could use line-break conventions appropriate to those systems. Non-ASCII characters are given as escape sequences like \u00ED, representing the Unicode character given by a four-digit hexadecimal number following "\u" (in this case the accented letter í).
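
The "\u" escapes described above are the standard JSON string escapes, so a JSON parser decodes them without extra work. A minimal Python sketch follows; the bookmark name "Pa\u00EDs" is a made-up example, not taken from a real file.

```python
import json

# Raw text as it might appear in the file: pure ASCII with a \u escape.
raw = r'{"name": "Pa\u00EDs"}'

entry = json.loads(raw)    # the parser turns \u00ED into the character í
decoded_name = entry["name"]

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
# so the re-encoded text is ASCII-only again.
reencoded = json.dumps(entry)

print(decoded_name)
print(reencoded)
```

Python writes the hexadecimal digits of the escape in lowercase when re-encoding; both spellings decode to the same character.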

Date format

Dates are stored as very large numbers such as 12871673787657328 or 12605573593000000. These examples are taken from an actual bookmark file; the six zeroes at the end of the second one suggest that the time unit is a small fraction of a second. This differs from normal Unix-style timestamps, which count seconds since 1970.

In fact, dates in the Chrome bookmark file, such as the date_added field, are in microseconds since January 1, 1601 (a much earlier epoch than the Unix 1970 date). This differs from Windows FILETIME, which counts 100-nanosecond intervals from the same epoch, only by a factor of 10. See the open-source Chromium source code, and search for MicrosecondsToFileTime. Sample conversion code can be found in the Stack Overflow question "How to parse the date_added field in Chrome bookmarks file?"
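
As an illustration of the microseconds-since-1601 interpretation, here is a minimal Python sketch applied to the two sample values quoted above. The function names are hypothetical, and the sketch assumes the epoch is midnight UTC, as it is for FILETIME.

```python
from datetime import datetime, timedelta, timezone

# Assumed epoch: January 1, 1601, 00:00:00 UTC.
CHROME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def chrome_time_to_datetime(value):
    """Convert a date_added value (microseconds since 1601) to a datetime."""
    return CHROME_EPOCH + timedelta(microseconds=int(value))


def chrome_time_to_filetime(value):
    """Convert microseconds to FILETIME's 100-nanosecond units (factor of 10)."""
    return int(value) * 10


for sample in (12871673787657328, 12605573593000000):
    print(sample, chrome_time_to_datetime(sample).isoformat())
```

Under this interpretation the two samples fall on 20 November 2008 and 15 June 2000 (UTC) respectively.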

Note that the internal Chrome bookmark file date format differs from the date format of the Netscape-style exported bookmark HTML file. Dates in the exported HTML file, such as the ADD_DATE field, are in seconds since January 1, 1970.
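
Converting between the two conventions only requires a change of unit and a shift of epoch. The Python sketch below uses hypothetical function names; the constant is the number of seconds between January 1, 1601 and January 1, 1970, and sub-second precision is discarded when going to ADD_DATE.

```python
# Seconds between 1601-01-01 and 1970-01-01 (both at 00:00:00 UTC).
EPOCH_DELTA_SECONDS = 11_644_473_600


def chrome_to_add_date(date_added):
    """date_added (microseconds since 1601) -> ADD_DATE (seconds since 1970)."""
    return int(date_added) // 1_000_000 - EPOCH_DELTA_SECONDS


def add_date_to_chrome(add_date):
    """ADD_DATE (seconds since 1970) -> date_added (microseconds since 1601)."""
    return (int(add_date) + EPOCH_DELTA_SECONDS) * 1_000_000


print(chrome_to_add_date(12605573593000000))
print(add_date_to_chrome(961099993))
```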

Source code

More information about the format could possibly be gleaned from the open-source Chromium source code, if it deals with bookmarks the same way as Chrome itself. Try searching on "bookmarks". The code there reportedly says that timestamps are milliseconds since the epoch, but the stored values have more digits than that would allow and, as described under Date format, are in fact microseconds since January 1, 1601.
