# Documentation Helper

A Streamlit-based chatbot that helps users navigate and understand LangChain documentation by providing intelligent answers to questions using RAG (Retrieval-Augmented Generation) with a Pinecone vector database.
## Features

- Intelligent Q&A: Ask questions about LangChain documentation and get accurate answers
- Source Citations: Each answer includes links to the relevant documentation sources
- Chat Interface: Interactive chat-like interface built with Streamlit
- Vector Search: Uses Pinecone vector database for efficient document retrieval
- OpenAI Integration: Powered by OpenAI's GPT models for natural language understanding
## Tech Stack

- Frontend: Streamlit
- Backend: Python, LangChain
- Vector Database: Pinecone
- LLM: OpenAI GPT
- Document Processing: BeautifulSoup4, LangChain document loaders
## Prerequisites

Before running this project, you'll need:

- Python 3.11+ installed on your system
- An OpenAI API key (get one from the OpenAI Platform)
- A Pinecone API key (get one from the Pinecone Console)
- Your Pinecone environment name (e.g., `us-east-1-aws`)
## Installation

### Using pip

1. Clone the repository

   ```bash
   git clone <your-repo-url>
   cd documentation-helper
   ```

2. Create a virtual environment

   ```bash
   python -m venv venv

   # On Windows
   venv\Scripts\activate

   # On macOS/Linux
   source venv/bin/activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```
### Using pipenv (alternative)

1. Clone the repository

   ```bash
   git clone <your-repo-url>
   cd documentation-helper
   ```

2. Install dependencies

   ```bash
   pipenv install
   pipenv shell
   ```
### Configuration

1. Create an environment file

   Create a `.env` file in the project root:

   ```
   OPENAI_API_KEY=your_openai_api_key_here
   PINECONE_API_KEY=your_pinecone_api_key_here
   PINECONE_ENVIRONMENT=your_pinecone_environment_here
   ```
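As a quick sanity check, you can verify at startup that the variables the `.env` file must provide are actually set. This is a hypothetical helper, not part of the project; it assumes something like python-dotenv (or Pipenv's automatic `.env` loading) has already populated the environment:

```python
import os

# Hypothetical helper (not part of this repo): check that the keys the
# .env file is expected to provide are present before the app starts.
REQUIRED_KEYS = ["OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENVIRONMENT"]

def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Calling `missing_keys()` at startup and failing fast with a clear message is friendlier than letting an API client raise an authentication error mid-request.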
2. Set up a Pinecone index

   - Create a new index in your Pinecone console
   - Use dimension `1536` (for text-embedding-3-small)
   - Use metric `cosine`
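The index settings above can also be captured as a small config sketch. The index name below is an assumption (use whatever name `ingestion.py` expects), and the commented-out client calls are only a rough illustration of the `pinecone` Python client:

```python
# Hypothetical sketch: the index settings described above, expressed as the
# arguments you would pass to the Pinecone client's create_index() call.
index_config = {
    "name": "langchain-doc-index",  # assumed name; match your own setup
    "dimension": 1536,              # matches text-embedding-3-small vectors
    "metric": "cosine",
}

# With the `pinecone` client this would look roughly like:
#   from pinecone import Pinecone, ServerlessSpec
#   pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
#   pc.create_index(spec=ServerlessSpec(cloud="aws", region="us-east-1"),
#                   **index_config)
```

The dimension must match the embedding model exactly; a mismatch causes upsert errors at ingestion time.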
## Data Ingestion

Before using the chatbot, you need to ingest the LangChain documentation:

1. Download the documentation (if not already present)

   The ingestion script will download the LangChain docs automatically.

2. Run the ingestion script

   ```bash
   python ingestion.py
   ```

   This will:

   - Download the LangChain documentation
   - Process and chunk the documents
   - Upload embeddings to Pinecone
   - Create the vector index
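The chunking step above can be sketched in pure Python. This is a simplified, hypothetical stand-in for the LangChain text splitters that `ingestion.py` would use; the size and overlap values are assumptions, not the project's actual settings:

```python
def chunk_text(text, size=600, overlap=100):
    """Split text into overlapping fixed-size chunks (a simplified
    stand-in for LangChain's text splitters; real splitters also try
    to break on separators like paragraphs and sentences)."""
    chunks = []
    step = size - overlap  # each chunk starts `step` chars after the last
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks
```

Overlap matters for retrieval quality: a sentence that straddles a chunk boundary still appears whole in at least one chunk.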
## Usage

1. Start the Streamlit app

   ```bash
   streamlit run main.py
   ```

2. Open your browser

   Navigate to `http://localhost:8501`.

3. Ask questions

   - Type your question about LangChain in the input field
   - Press Enter or click the button
   - Get answers with source citations
## Project Structure

```
documentation-helper/
├── backend/
│   ├── __init__.py
│   └── core.py          # Main LLM logic and RAG implementation
├── langchain-docs/      # Downloaded documentation (gitignored)
├── venv/                # Virtual environment (gitignored)
├── .env                 # Environment variables (gitignored)
├── .gitignore           # Git ignore rules
├── LICENSE              # Apache 2.0 license
├── Pipfile              # Dependencies (pipenv)
├── Pipfile.lock         # Locked dependencies
├── requirements.txt     # Dependencies (pip)
├── README.md            # This file
├── ingestion.py         # Data ingestion script
└── main.py              # Streamlit application
```
## Key Components

### `main.py`

- Streamlit web interface
- Chat history management
- User interaction handling

### `backend/core.py`

- RAG implementation using LangChain
- Pinecone vector store integration
- OpenAI LLM configuration

### `ingestion.py`

- Document downloading and processing
- Text chunking and embedding
- Vector database population
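To illustrate what the vector-store side of the RAG pipeline does conceptually: Pinecone performs this similarity search server-side at scale, but a minimal pure-Python sketch of cosine-similarity top-k retrieval (all names here are illustrative, not the project's API) looks like:

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=3):
    """docs: list of (doc_id, vector) pairs. Return the k most
    similar doc ids, best first."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

The retrieved chunks (plus their source URLs) are then stuffed into the LLM prompt, which is what makes the source citations in the answers possible.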
## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
## Acknowledgments

- LangChain for the amazing framework
- OpenAI for the language models
- Pinecone for the vector database
- Streamlit for the web framework
## Support

If you encounter any issues or have questions:
- Check the Issues page
- Create a new issue with detailed information
- Include error messages and steps to reproduce
Happy coding! 🚀