KeywordScape - Visual Document Exploration using Contextualized Keyword Embeddings

clause-bielefeld/keywordscape

KeywordScape

Jupyter Notebook Setup

  1. Clone Repository
  2. Rebuild the environment with either conda (from conda_requirements.txt) or pip (from pip_requirements.txt). For converting a conda environment into a pip requirements file, see: https://stackoverflow.com/questions/50777849/from-conda-create-requirements-txt-for-pip3
  3. cd into repository
  4. execute: jupyter lab
  5. open KeywordScape_Pipeline.ipynb

Add Custom PDF Dataset

  1. Create a separate folder in your working environment and collect a set of PDFs in that folder
  2. Set the path to the folder in the variable: pdf_files_folder (cell 4)
  3. Parse the PDFs to JSON using the allenai science-parse library (run cell 13)
  4. Clean Corpus (run cell 14)
  5. Create Maps (run cell 15)
  6. The system exports 3 map files, named: interactive_document_corpus_base_map_points_file_path, interactive_document_corpus_corpus_points_file_path, interactive_document_corpus_topic_corpus_file_path
  7. Copy the 3 files into the directory: keywordscape/docker/src/backend/base_fastapi_backend/src/web_frontend/ui/pages/web_pages/playgrounds/keywordscape_playground/assets/
  8. Import the 3 map files in keywordscape_playground.js
  9. Rebuild the application (from docker/src/, as in Demo Setup) with: docker-compose up --build
  10. Open localhost:8080/keywordscape_playground in browser
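
Steps 1, 2 and 7 above can be checked and automated with a small helper. The following is a minimal sketch, not part of the repository: the function names list_pdfs and copy_map_files are hypothetical, and the script assumes it is run from the directory that contains the cloned keywordscape folder. The map file paths are passed in by the caller, since the actual file names depend on what the notebook wrote.

```python
from pathlib import Path
import shutil

# Assets directory from step 7, relative to the directory containing the clone
# (assumption: the script is started from there).
ASSETS_DIR = Path(
    "keywordscape/docker/src/backend/base_fastapi_backend/src/web_frontend/"
    "ui/pages/web_pages/playgrounds/keywordscape_playground/assets"
)


def list_pdfs(pdf_files_folder):
    """Return the PDFs in the dataset folder (steps 1 and 2), sorted by path."""
    folder = Path(pdf_files_folder)
    if not folder.is_dir():
        raise NotADirectoryError(f"not a folder: {folder}")
    return sorted(p for p in folder.iterdir() if p.suffix.lower() == ".pdf")


def copy_map_files(map_file_paths, assets_dir=ASSETS_DIR):
    """Copy the exported map files (step 6) into the assets directory (step 7)."""
    sources = [Path(p) for p in map_file_paths]
    # Fail before copying anything if one of the exported files is missing.
    missing = [str(p) for p in sources if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing map files: {missing}")
    target = Path(assets_dir)
    if not target.is_dir():
        raise NotADirectoryError(f"assets directory not found: {target}")
    # copy2 keeps file metadata and returns the destination path.
    return [Path(shutil.copy2(src, target / src.name)) for src in sources]
```

A possible use: call list_pdfs(pdf_files_folder) before running cell 13 to confirm the folder is not empty, then call copy_map_files with the three paths exported by the notebook. The import in keywordscape_playground.js (step 8) still has to be edited by hand.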

Demo Setup

  1. Clone Repository
  2. cd into docker/src/
  3. execute: docker-compose up --build
  4. open localhost:8080/keywordscape_playground in browser
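
To confirm that the demo is being served before opening the browser, a short reachability check can help. This is a sketch under assumptions: the function name playground_is_up is hypothetical, and the http scheme is assumed because the steps above give only the host, port and path.

```python
import urllib.error
import urllib.request


def playground_is_up(url="http://localhost:8080/keywordscape_playground", timeout=5.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

The containers may need some time after docker-compose up --build finishes building, so a False result right after startup does not necessarily mean the setup failed.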
