DocMind AI is a powerful, open-source Streamlit application that leverages Large Language Models (LLMs) running locally on your machine through Ollama. Analyze a wide variety of document types, extract key insights, generate summaries, identify action items, and surface open questions, all without sending your data to the cloud!
## Features of DocMind AI

- Privacy-Focused: Your documents are processed locally, ensuring data privacy and security.
- Versatile Document Handling: Supports a wide range of file formats:
  - PDF
  - DOCX
  - TXT
  - XLSX
  - MD (Markdown)
  - JSON
  - XML
  - RTF
  - CSV
  - MSG (Email)
  - PPTX (PowerPoint)
  - ODT (OpenDocument Text)
  - EPUB (E-book)
  - Code files (PY, JS, JAVA, TS, TSX, C, CPP, H, and more!)
- Powerful AI Analysis: Leverages LangChain to provide in-depth document analysis.
- Structured Output: Get results in a well-defined format using Pydantic (see the sketch after this list).
- Customizable Prompts: Tailor the analysis to your specific needs with pre-defined or custom prompts.
- Tone and Instruction Control: Fine-tune the LLM's responses by selecting the desired tone (e.g., professional, informal, academic) and specific instructions (e.g., act as a researcher, software engineer, business analyst).
- Length/Detail Selection: Control the length and level of detail of the generated responses (e.g., concise, detailed, comprehensive).
- Flexible Analysis Modes: Choose to analyze each document individually or combine them for a holistic analysis.
- Interactive Chat: Continue the conversation with the LLM to explore the documents further.
- Docker Support: Easily deploy the application using Docker or Docker Compose.
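To give a sense of what structured output can look like, here is a minimal sketch of a Pydantic schema covering the fields described above. The field names are illustrative assumptions, not necessarily the exact model defined in `app.py`:

```python
from typing import List, Optional

from pydantic import BaseModel, Field

# Hypothetical schema; the fields mirror the analysis outputs described
# above but are not necessarily DocMind AI's exact model.
class AnalysisOutput(BaseModel):
    summary: str = Field(description="Concise summary of the document")
    key_insights: List[str] = Field(description="Notable insights extracted from the text")
    action_items: Optional[List[str]] = Field(default=None, description="Concrete follow-up tasks")
    open_questions: Optional[List[str]] = Field(default=None, description="Unresolved questions raised by the document")
```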
## Table of Contents

- [Features of DocMind AI](#features-of-docmind-ai)
- [Getting Started with DocMind AI: Local LLM Analysis](#getting-started-with-docmind-ai-local-llm-analysis)
- [Usage](#usage)
- [Architecture](#architecture)
- [How to Cite](#how-to-cite)
- [Contributing](#contributing)
- [License](#license)
## Getting Started with DocMind AI: Local LLM Analysis

### Prerequisites

- Ollama installed and running.
- Python 3.8 or higher.
- (Optional) Docker and Docker Compose for containerized deployment.
### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/BjornMelin/docmind-ai.git
   cd docmind-ai
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
### Running the App

Locally:

```bash
streamlit run app.py
```

With Docker:

```bash
docker-compose up --build
```

The app will be accessible at `http://localhost:8501`.
## Usage

- Enter the Ollama Base URL (default: `http://localhost:11434`).
- Choose your desired Ollama Model Name (e.g., `llama2`) from the dropdown.
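Under the hood, a LangChain app typically connects to Ollama roughly like the sketch below. The values mirror the defaults above, but this is an illustration rather than the app's exact code:

```python
from langchain_community.llms import Ollama

# Connect to a locally running Ollama server; the values correspond to
# the Base URL and Model Name fields described above.
llm = Ollama(
    base_url="http://localhost:11434",
    model="llama2",
)

print(llm.invoke("Reply with one short sentence."))
```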
Click the "Browse files" button to upload one or more documents. Supported file types are listed above in the Features section.
Select a pre-defined prompt from the dropdown:
- Comprehensive Document Analysis: Get a summary, key insights, action items, and open questions.
- Extract Key Insights and Action Items: Focus on extracting these two elements.
- Summarize and Identify Open Questions: Generate a summary and a list of open questions.
- Custom Prompt: Enter your own prompt to guide the analysis.
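For custom prompts, it may help to know the general shape of a LangChain prompt template. The template text here is illustrative, in the spirit of "Comprehensive Document Analysis," not the exact prompt shipped with the app:

```python
from langchain.prompts import PromptTemplate

# Illustrative analysis prompt; the actual text in app.py may differ.
analysis_prompt = PromptTemplate(
    input_variables=["document_text"],
    template=(
        "Analyze the following document. Provide a summary, key insights, "
        "action items, and open questions.\n\n"
        "Document:\n{document_text}"
    ),
)

print(analysis_prompt.format(document_text="Quarterly report ..."))
```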
Choose the desired tone for the LLM's response:
- Professional: Objective and formal.
- Academic: Scholarly and research-oriented.
- Informal: Casual and conversational.
- Creative: Imaginative and artistic.
- Neutral: Unbiased and objective.
- Direct: Concise and to-the-point.
- Empathetic: Understanding and compassionate.
- Humorous: Witty and lighthearted.
- Authoritative: Confident and expert.
- Inquisitive: Curious and exploratory.
Select the persona or instructions that the LLM should follow:
- General Assistant: Act as a helpful assistant.
- Researcher: Provide in-depth research and analysis.
- Software Engineer: Focus on technical details and code.
- Product Manager: Consider product strategy and user experience.
- Data Scientist: Emphasize data analysis and modeling.
- Business Analyst: Analyze from a business and strategic perspective.
- Technical Writer: Create clear and concise documentation.
- Marketing Specialist: Focus on branding and customer engagement.
- HR Manager: Consider human resources aspects.
- Legal Advisor: Provide information from a legal standpoint.
- Custom Instructions: Enter your own specific instructions.
Choose the desired length and level of detail for the LLM's response:
- Concise: Brief and to-the-point.
- Detailed: Thorough and comprehensive.
- Comprehensive: Extensive and in-depth.
- Bullet Points: Provide the response in bullet-point format.
Select the analysis mode:
- Analyze each document separately: Process and analyze each document individually.
- Combine analysis for all documents: Treat all uploaded documents as a single unit for analysis.
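Conceptually, the two modes differ only in whether the documents' text is merged before analysis. A minimal sketch, where `analyze` is a hypothetical stand-in for the app's LLM chain:

```python
from typing import List

def analyze(text: str) -> str:
    """Placeholder for the app's actual LLM analysis chain."""
    return f"analysis of {len(text)} characters"

def run_analysis(document_texts: List[str], combine: bool) -> List[str]:
    if combine:
        # Combined mode: treat all uploaded documents as a single unit.
        return [analyze("\n\n".join(document_texts))]
    # Separate mode: analyze each document individually.
    return [analyze(text) for text in document_texts]
```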
- Upload your documents.
- Choose your analysis prompt, tone, instructions, desired length, and analysis mode.
- Click the "Extract and Analyze" button.
The application will display the analysis results, attempting to format them according to the defined output schema. If parsing fails, the raw LLM output will be shown.
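The parse-with-fallback behavior can be pictured like this; the schema is a trimmed-down stand-in, not the app's actual model:

```python
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel

class AnalysisOutput(BaseModel):
    """Trimmed-down stand-in for the app's output schema."""
    summary: str

parser = PydanticOutputParser(pydantic_object=AnalysisOutput)

def parse_or_raw(llm_output: str):
    """Return a structured result, or the raw text if parsing fails."""
    try:
        return parser.parse(llm_output)
    except Exception:
        return llm_output  # show the raw LLM output instead
```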
Use the chat interface to ask follow-up questions about the analyzed documents. The LLM will use the extracted information as context for its responses.
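One simple way such a chat loop can reuse the analysis as context (a hypothetical helper, not the app's exact implementation):

```python
from langchain_community.llms import Ollama

llm = Ollama(base_url="http://localhost:11434", model="llama2")

def ask_followup(analysis: str, question: str) -> str:
    """Answer a follow-up question using the earlier analysis as context."""
    prompt = (
        f"Context from the earlier document analysis:\n{analysis}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )
    return llm.invoke(prompt)
```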
## Architecture

Here's a Mermaid diagram illustrating the application's architecture:

```mermaid
graph TD
    A[User] -->|Uploads Documents| B(Streamlit App - app.py);
    B -->|Selects Model, Prompt, Tone, Instructions, Length, Mode| C{Ollama API};
    C -->|Processes Documents| D[LangChain];
    D -->|Loads Documents| E{Document Loaders};
    E -->|PDF| F[PyPDFLoader];
    E -->|DOCX| G[Docx2txtLoader];
    E -->|TXT, Code| H[TextLoader];
    E -->|...| I;
    D -->|Splits Text| J[RecursiveCharacterTextSplitter];
    D -->|Generates Analysis| K[LLM - Ollama Model];
    K -->|Structured Output| L[PydanticOutputParser];
    B -->|Displays Results| A;
    A -->|Asks Follow-up Questions| B;
    B -->|Interacts with LLM| C;
    style A fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#ccf,stroke:#333,stroke-width:2px
    style C fill:#cfc,stroke:#333,stroke-width:2px
    style D fill:#fcc,stroke:#333,stroke-width:2px
    style K fill:#ccf,stroke:#333,stroke-width:2px
```
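The loader-selection step in the diagram amounts to dispatching on file extension. A sketch under the assumption that loaders are chosen roughly like this (the real mapping in `app.py` supports many more types and may differ):

```python
from pathlib import Path

from langchain_community.document_loaders import (
    Docx2txtLoader,
    PyPDFLoader,
    TextLoader,
)

# Illustrative extension-to-loader map matching the diagram above.
LOADERS = {
    ".pdf": PyPDFLoader,
    ".docx": Docx2txtLoader,
    ".txt": TextLoader,
    ".py": TextLoader,
}

def load_document(path: str):
    # Fall back to plain-text loading for unrecognized extensions.
    loader_cls = LOADERS.get(Path(path).suffix.lower(), TextLoader)
    return loader_cls(path).load()
```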
## How to Cite

If you use DocMind AI in your research or work, please cite it as follows:

```bibtex
@software{melin_docmind_ai_2025,
  author = {Melin, Bjorn},
  title = {DocMind AI: Local LLM for AI-Powered Document Analysis},
  url = {https://github.com/BjornMelin/docmind-ai},
  version = {0.1.0},
  year = {2025}
}
```
## Contributing

We welcome contributions! Please see the CONTRIBUTING.md file for details on how to contribute to this project.
## License

This project is licensed under the MIT License - see the LICENSE file for details.
Built with ❤️ by Bjorn Melin