Browser Bookmark Extraction

Export bookmark data from web browsers and save it as text files. These can then be parsed and added to the database using the URL Manager application.

Chrome

This section applies to both the Chrome and Chromium web browsers. Both may be installed on the same system, and bookmarks from both may be imported into the URL Manager application.

Before continuing, follow the Identify Chrome Profiles instructions. Your user will be Default, or Profile 2, Profile 3, etc.

Copy an existing JSON file

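Choose the Chrome or Chromium user whose bookmarks you want to include in the project. In the example below, the 'Research' user from Chrome is used. The paths below show the Default profile directory; substitute the profile directory of the user you chose.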

Path to JSON file:

OS      Path
Linux   ~/.config/google-chrome/Default/Bookmarks
macOS   ~/Library/Application\ Support/Google/Chrome/Default/Bookmarks
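
As a rough illustration of how these paths could be built in code, here is a minimal Python sketch. The function name `bookmarks_path`, the `CHROME_BASE_DIRS` mapping and its keys are assumptions made for this example and are not part of the URL Manager application. Only the two Chrome locations listed above are covered. The backslash in the macOS path is shell escaping and is not needed in Python.

```python
from pathlib import Path
from typing import Optional

# Chrome base directories relative to the home directory, taken from the
# table above. The mapping and its keys are illustrative assumptions.
CHROME_BASE_DIRS = {
    "linux": Path(".config/google-chrome"),
    "macos": Path("Library/Application Support/Google/Chrome"),
}


def bookmarks_path(os_name: str, profile: str = "Default",
                   home: Optional[Path] = None) -> Path:
    """Return the expected location of a Chrome profile's Bookmarks file.

    Raises KeyError for an OS that is not in CHROME_BASE_DIRS.
    """
    home = Path.home() if home is None else home
    return home / CHROME_BASE_DIRS[os_name] / profile / "Bookmarks"
```

For example, `bookmarks_path("linux", "Profile 2")` points at the Bookmarks file of the second Chrome profile for the current user. The function only builds the path and does not check that the file exists.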

Navigate to the root of this repo, then run one of the following, replacing BOOKMARK_PATH with the path to the chosen user's Bookmarks file.

  • Symlink
    • Put a symlink to the user's Bookmarks file in the project. The link will always reflect the most up-to-date data in the original file. Note the file naming convention: area, browser name, then profile name.
    $ ln -s BOOKMARK_PATH \
        url_manager/var/lib/raw/bookmarks_chrome_research.json
  • Copy
    • Make a copy of the bookmarks data in the project. This duplicate will not be updated when the original changes, so it is not recommended unless you want to experiment with editing the copy by hand.
    $ cp BOOKMARK_PATH \
        url_manager/var/lib/raw/bookmarks_chrome_research.json

You now have a reference to a single user's bookmarks in your project. Repeat the steps for every user whose bookmarks you want to import into the URL Manager application.
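
To check that a linked or copied file contains what you expect, something like the following Python sketch could list the bookmarks in it. It assumes the usual layout of Chrome's Bookmarks file: a top-level "roots" object whose values are folder nodes, where each node has a "type" of "url" or "folder", a "name", and either a "url" or a "children" list. The function names are hypothetical and this is not how the URL Manager application itself parses the file.

```python
import json
from pathlib import Path
from typing import Iterator, List, Tuple


def iter_urls(node: dict) -> Iterator[Tuple[str, str]]:
    """Yield (name, url) pairs from a bookmark node and its descendants."""
    if node.get("type") == "url":
        yield node.get("name", ""), node["url"]
    for child in node.get("children", []):
        yield from iter_urls(child)


def read_bookmarks(path: Path) -> List[Tuple[str, str]]:
    """Read a Chrome Bookmarks JSON file and return all (name, url) pairs."""
    data = json.loads(path.read_text(encoding="utf-8"))
    results: List[Tuple[str, str]] = []
    for root in data.get("roots", {}).values():
        # Skip any non-folder values that may sit alongside the root folders.
        if isinstance(root, dict):
            results.extend(iter_urls(root))
    return results
```

Calling `read_bookmarks(Path("url_manager/var/lib/raw/bookmarks_chrome_research.json"))` from the repo root would then return the bookmarks for the 'Research' example above.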

The steps above could be automated with a script or Makefile. That would require handling the OS and browser type, accepting a username or display name as input (and looking up the profile directory internally), and generating the file or link name from that input.
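
A minimal Python sketch of such a script is shown below. It only covers the final step: given the path to a Bookmarks file, a browser name and a profile label, it builds the target name using the naming convention above and creates the symlink or copy. The names `target_name`, `add_bookmarks` and `RAW_DIR` are hypothetical. Looking up the profile directory from a display name is left out, and an existing target with the same name is replaced without asking.

```python
import re
import shutil
from pathlib import Path

# Directory used in the commands above, relative to the repo root.
RAW_DIR = Path("url_manager/var/lib/raw")


def target_name(browser: str, profile_label: str) -> str:
    """Build a name like 'bookmarks_chrome_research.json'."""
    slug = re.sub(r"[^a-z0-9]+", "_", profile_label.lower()).strip("_")
    return f"bookmarks_{browser.lower()}_{slug}.json"


def add_bookmarks(source: Path, browser: str, profile_label: str,
                  raw_dir: Path = RAW_DIR, copy: bool = False) -> Path:
    """Symlink (default) or copy a Bookmarks file into the raw directory."""
    if not source.is_file():
        raise FileNotFoundError(source)
    raw_dir.mkdir(parents=True, exist_ok=True)
    target = raw_dir / target_name(browser, profile_label)
    # Replace a previous link or copy with the same name.
    if target.is_symlink() or target.exists():
        target.unlink()
    if copy:
        shutil.copy2(source, target)
    else:
        target.symlink_to(source.resolve())
    return target
```

For example, `add_bookmarks(Path.home() / ".config/google-chrome/Default/Bookmarks", "chrome", "Research")` would mirror the symlink command above on Linux when run from the repo root.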

Export as JSON using JavaScript

You can also use JavaScript to export Chrome (or Chromium) bookmarks to a text file in JSON format. This is more tedious than the method above, but is included for interest.

  1. Open Chrome.
  2. Choose the Chrome user whose bookmarks you want to get.
  3. Enable bookmark permissions for a Chrome extension, as required by the Chrome Bookmarks API. You could edit the manifest file of an existing extension, but it is easier to find one which already has the permission, so install the Bookmarks to JSON Extension in Chrome. It can be used directly by clicking on the icon, then Options and Export, but the output is limited.
  4. Click the Bookmarks to JSON Extension icon and Options item.
  5. Open the developer console (Ctrl+Shift+I, or Cmd+Option+I on macOS).
  6. Open the JavaScript Console tab.
  7. To get the data as a single string indented with 4 spaces, paste in the following and press Enter:
    > chrome.bookmarks.getTree(function (tree) { console.log(JSON.stringify(tree, null, 4)); });
  8. Click Copy at the end to copy the entire result to the clipboard.
  9. Paste into a text editor and save as a .json file.
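
The saved output has a different shape from the Bookmarks file used in the first method. It is an array of tree nodes, where each node has a "title", folders have a "children" list and bookmarks have a "url". As a hedged illustration, the Python sketch below flattens that tree into a list of bookmarks with their folder path. The function names and the output keys are assumptions made for this example.

```python
import json
from typing import Iterator, List, Tuple


def iter_tree(nodes: list, folders: Tuple[str, ...] = ()) -> Iterator[dict]:
    """Walk chrome.bookmarks.getTree() output and yield one dict per bookmark."""
    for node in nodes:
        title = node.get("title", "")
        if "url" in node:
            yield {"title": title, "url": node["url"],
                   "folder": "/".join(folders)}
        children = node.get("children")
        if children:
            # The root node has an empty title, so leave it out of the path.
            next_folders = folders + ((title,) if title else ())
            yield from iter_tree(children, next_folders)


def flatten_export(text: str) -> List[dict]:
    """Parse the saved JSON text and return a flat list of bookmarks."""
    return list(iter_tree(json.loads(text)))
```

Pass the contents of the saved .json file to `flatten_export` to get one entry per bookmark.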

Export as XML

It is possible to export Chrome bookmarks to an XML file using the browser's Bookmark Manager and built-in Export functionality.

However, the file appears to use tags that are never closed, which the parsers tried below could not handle, so it is not practical to use.

This was the case when attempting to parse the file with these three methods:

  • Online XML to JSON converter - failed to process
  • Python package xmltodict - failed to process
  • Python package bs4 (BeautifulSoup4) - file was processed, but tags were incorrectly nested too deeply due to a lack of closing tags.

Therefore neither using the XML directly nor converting it to JSON is supported in this project. However, if another parser or a newer version of these packages can handle the file, the XML export could be used.
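
One parser that might tolerate the missing closing tags is `html.parser` from Python's standard library, because it reports tags as it meets them and does not require them to be balanced. The sketch below is untested against a real Chrome export and is not part of this project. It assumes each bookmark in the exported file is an anchor tag with an HREF attribute, and the class and function names are hypothetical. It collects only titles and URLs, and ignores the folder structure.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class BookmarkLinkParser(HTMLParser):
    """Collect (title, url) pairs from anchor tags, ignoring nesting."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(markup: str) -> List[Tuple[str, str]]:
    """Return all (title, url) pairs found in the exported markup."""
    parser = BookmarkLinkParser()
    parser.feed(markup)
    parser.close()
    return parser.links
```

Because the parser never builds a tree, the deep nesting seen with bs4 does not arise, at the cost of losing which folder each bookmark belongs to.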

Firefox

To be completed.

There doesn't seem to be a way to export bookmarks from within Firefox.