KeywordScape - Visual Document Exploration using Contextualized Keyword Embeddings
- Clone Repository
- Rebuild the environment with either conda or pip, using conda_requirements.txt or pip_requirements.txt. For creating a pip requirements file from a conda environment, see: https://stackoverflow.com/questions/50777849/from-conda-create-requirements-txt-for-pip3
- cd into repository
- execute: jupyter lab
- open KeywordScape_Pipeline.ipynb
- Create a separate folder in your working environment and collect a set of PDFs in that folder
- Set the path to the folder in the variable: pdf_files_folder (cell 4)
- Parse the PDFs to JSON using the allenai science-parse library (run cell 13)
- Clean Corpus (run cell 14)
- Create Maps (run cell 15)
- The system exports 3 map files, named: interactive_document_corpus_base_map_points_file_path, interactive_document_corpus_corpus_points_file_path, interactive_document_corpus_topic_corpus_file_path
- Copy the 3 map files into the directory: keywordscape/docker/src/backend/base_fastapi_backend/src/web_frontend/ui/pages/web_pages/playgrounds/keywordscape_playground/assets/
- Import the 3 map files in keywordscape_playground.js
- Rebuild the application from docker/src/ using: docker-compose up --build
- Open localhost:8080/keywordscape_playground in browser
- To only start the web application (without the notebook steps above): Clone Repository
- cd into docker/src/
- execute: docker-compose up --build
- open localhost:8080/keywordscape_playground in browser
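
The step that copies the three exported map files into the assets directory could be scripted roughly as below. This is only a sketch and not part of the repository: the function name copy_map_files and the export_dir argument are hypothetical, and because the steps above do not state a file extension, files are matched on their name without extension.

```python
import shutil
from pathlib import Path

# Names taken from the step list above. The extension is not stated there,
# so files are matched on their stem (file name without the last extension).
MAP_FILE_STEMS = (
    "interactive_document_corpus_base_map_points_file_path",
    "interactive_document_corpus_corpus_points_file_path",
    "interactive_document_corpus_topic_corpus_file_path",
)


def copy_map_files(export_dir, assets_dir, stems=MAP_FILE_STEMS):
    """Copy every exported map file from export_dir into assets_dir.

    Hypothetical helper: raises FileNotFoundError if one of the expected
    map files is missing, so an incomplete export is noticed before the
    docker rebuild.
    """
    export_dir = Path(export_dir)
    assets_dir = Path(assets_dir)
    assets_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for stem in stems:
        matches = sorted(
            p for p in export_dir.iterdir() if p.is_file() and p.stem == stem
        )
        if not matches:
            raise FileNotFoundError(f"no exported file named {stem} in {export_dir}")
        for src in matches:
            # copy2 keeps timestamps and returns the destination path
            copied.append(Path(shutil.copy2(src, assets_dir / src.name)))
    return copied
```

A call would pass the folder the notebook exported to and the assets path from the copy step above, for example copy_map_files("path/to/export", "keywordscape/docker/src/backend/base_fastapi_backend/src/web_frontend/ui/pages/web_pages/playgrounds/keywordscape_playground/assets/"). The import in keywordscape_playground.js and the docker-compose rebuild still have to be done afterwards.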