A web app to detect, analyze and visualize topics from your documents using BERTopic.
Developers: Chris Tam, Ian Bulovic, Nhi Le (Brandeis University)
Note: Some of the pages in the app might take a few minutes to load because of the underlying BERTopic model.
This page shows all the documents you have uploaded, grouped into two upload types: documents (individual file uploads) and dataset (zip uploads).
You can filter and delete documents by various criteria.
You can either:
- Upload a single document and run a pretrained topic model on it (BERTopic Wikipedia in this case).
- Upload a dataset as a zip archive and train a topic model on it. Note: this may take a while depending on the size of your dataset.
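As an illustration of the dataset upload path, here is a minimal sketch of how the text files in a zip archive could be collected before training. The function name `read_zip_documents`, the `.txt` filter, and the UTF-8 default are assumptions made for this example; they are not taken from the app's code.

```python
import io
import zipfile


def read_zip_documents(zip_source, extensions=(".txt",), encoding="utf-8"):
    """Return {member name: text} for every matching file in a zip archive.

    zip_source may be a filesystem path or a binary file-like object,
    such as the object returned by a web upload widget.
    (Hypothetical helper for illustration; not part of the app.)
    """
    documents = {}
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            # Skip directory entries and files with other extensions.
            if info.is_dir() or not info.filename.lower().endswith(extensions):
                continue
            raw = archive.read(info)
            # Replace undecodable bytes rather than failing on one bad file.
            documents[info.filename] = raw.decode(encoding, errors="replace")
    return documents
```

The resulting texts could then be passed as a list of strings to BERTopic, for example `BERTopic().fit_transform(list(documents.values()))`. Training needs a reasonably large number of documents to produce meaningful topics.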
On this page, you are given the option to visualize topic modeling results on document uploads or dataset uploads, using either the pretrained BERTopic Wikipedia model, or your own model.
You can also view several topic visualizations of your trained model.
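For readers who want to reproduce similar plots outside the app, the sketch below gathers four of BERTopic's built-in visualizations from a fitted model. Which visualizations the app actually shows is not stated here, so the selection, the helper name `build_topic_figures`, and the dictionary keys are assumptions.

```python
def build_topic_figures(topic_model, top_n_topics=10):
    """Collect several BERTopic visualizations into a dict of Plotly figures.

    topic_model is expected to be a fitted bertopic.BERTopic instance.
    (Hypothetical helper for illustration; not part of the app.)
    """
    return {
        # 2D map of topics based on their embeddings.
        "intertopic_distance": topic_model.visualize_topics(top_n_topics=top_n_topics),
        # Bar charts of the highest-scoring words per topic.
        "top_words": topic_model.visualize_barchart(top_n_topics=top_n_topics),
        # Dendrogram showing how topics cluster hierarchically.
        "hierarchy": topic_model.visualize_hierarchy(top_n_topics=top_n_topics),
        # Pairwise topic similarity matrix.
        "similarity_heatmap": topic_model.visualize_heatmap(top_n_topics=top_n_topics),
    }
```

Each value is a Plotly figure, so in a Streamlit page it could be rendered with `st.plotly_chart(figure)`. Some of these plots may fail on models with very few topics, so a real page would likely guard each call.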
Note: it may take a few minutes for the build process to complete.
- Building the image:
docker build -t topic-modeling-app .
- Running the app:
docker run -p 8501:8501 topic-modeling-app
Navigate to http://localhost:8501 to access the app.
- Set up a virtual environment:
conda create -n topic-modeling-app python=3.11
conda activate topic-modeling-app
python -m pip install -r requirements.txt
cd into the app directory and run the Streamlit app:
python -m streamlit run 1_Your_Documents.py
cd into the app directory and run the unit tests:
python -m unittest discover -s unit_tests






