# ⚖️ LexTransition AI: Law Mapper & Document Analyzer

Live Demo: https://kvbgkvw4mehwhhdjt7crrg.streamlit.app/

LexTransition AI is an open-source, offline-first legal assistant. It helps users navigate the transition from the old Indian laws (IPC/CrPC/IEA) to the new BNS/BNSS/BSA frameworks. Using local machine learning and OCR, it analyzes legal documents and maps law sections, grounding every answer in the official texts.


## 🚀 Core Features

- 🔄 Intelligent Law Mapper: Maps old IPC sections to their new BNS equivalents. Uses an LLM to highlight specific changes in wording, penalties, and scope.
- 🖼️ Multimodal OCR Analysis: Upload photos of legal notices or FIRs. The system extracts text using local OCR and generates actionable summaries.
- 📚 Grounded Fact-Checking (RAG): Ask legal questions and get answers backed by official citations. The AI identifies the exact section and page from uploaded law PDFs to prevent hallucinations.
- 🎙️ Environment-Aware Voice Agent: High-fidelity offline TTS (Piper) with an automatic, lightweight cloud fallback (gTTS) to ensure seamless audio playback on headless platforms such as Streamlit Cloud.

πŸ› οΈ Offline Tech Stack (No-API Approach)

To ensure privacy and offline accessibility, this project can be configured to run without external APIs:

  • Frontend: Streamlit
  • Backend: Python, LangChain/LlamaIndex.
  • Local LLM Engine: Ollama (Llama 3 / Mistral)
  • Voice / TTS: Piper TTS (ONNX models)
  • OCR Engine: EasyOCR / PyTesseract
  • Vector Database (RAG): FAISS + Sentence-Transformers
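The FAISS + Sentence-Transformers pairing amounts to nearest-neighbour search over embedding vectors. Stripped of those libraries, the core idea can be shown with plain cosine similarity; the 3-d vectors and passage labels below are toy stand-ins for real sentence-transformer embeddings of indexed law text:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-d "embeddings" of indexed passages (real ones are 384-d or larger).
index = {
    "BNS s.103: punishment for murder": [0.9, 0.1, 0.0],
    "BNSS s.173: information in cognizable cases": [0.1, 0.8, 0.2],
}

query = [0.85, 0.15, 0.05]  # embedding of the user's question
best = max(index, key=lambda text: cosine(query, index[text]))
print(best)  # the passage whose embedding points the same way as the query
```

FAISS does exactly this comparison, but over millions of vectors with approximate-nearest-neighbour indexing instead of a brute-force `max`.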

## 📂 Project Structure

LexTransition-AI/
├── .github/
│   └── workflows/
│       └── lextransition-ci.yml  # GitHub Actions CI/CD pipeline
├── engine/
│   ├── comparator.py             # AI logic for comparing IPC & BNS texts
│   ├── llm.py                    # Fallback logic and LLM summarization
│   ├── mapping_logic.py          # Core IPC to BNS transition logic
│   ├── ocr_processor.py          # Local OCR extraction and processing
│   ├── rag_engine.py             # Local vector search logic (FAISS)
│   └── db.py                     # Database connection and queries
├── utils/
│   └── timeout_handler.py        # Resiliency and API timeout handlers
├── tests/
│   └── test_embeddings.py        # Pytest suite for automated testing
├── scripts/
│   └── ocr_benchmark.py          # OCR character error rate testing
├── models/
│   └── tts/                      # Local storage for Piper ONNX voice models
├── law_pdfs/                     # Upload directory for Grounded Fact-Checking
├── app.py                        # Main Streamlit UI application
├── Dockerfile                    # Production container configuration
├── requirements.txt              # Python dependencies & OS-specific markers
├── setup_agent.py                # Manual setup script for downloading TTS binaries
└── README.md                     # Master project documentation
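`utils/timeout_handler.py` is described as the resiliency layer. One common shape for such a helper is a retry-with-backoff wrapper around flaky calls (for example, an LLM endpoint that is still starting). This is a sketch of the pattern only, not the file's actual contents:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(fn: Callable[[], T], attempts: int = 3, delay: float = 0.1) -> T:
    """Call fn, retrying on failure with a growing pause between attempts.
    Re-raises the last error if every attempt fails."""
    last_err: Exception = RuntimeError("no attempts made")
    for i in range(attempts):
        try:
            return fn()
        except Exception as err:  # e.g. a timeout talking to Ollama
            last_err = err
            time.sleep(delay * (i + 1))  # linear backoff
    raise last_err

# Demo: a call that fails twice before succeeding on the third attempt.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("LLM endpoint not ready")
    return "ok"

print(with_retries(flaky))  # -> ok
```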

βš™οΈ Installation & Local Setup

Option A: Using Docker (Recommended)

The easiest way to run LexTransition-AI is with Docker. This handles all dependencies (including Tesseract OCR and system libraries) automatically.

  1. Clone the repository:

    git clone [https://github.com/[username]/LexTransition-AI.git](https://github.com/[username]/LexTransition-AI.git)
    cd LexTransition-AI
  2. Build the Docker Image in terminal

    docker build -t lextransition .
  3. Run the Application

    docker run -p 8501:8501 -e LTA_OLLAMA_URL="[http://host.docker.internal:11434](http://host.docker.internal:11434)" lextransition-ai
  4. Open the App

    http://localhost:8501

### Option B: Manual Local Setup (Windows / Linux / macOS)

If you prefer to run the app directly in your local Python environment:

  1. Install Dependencies (requires Python 3.10)

    python -m venv venv
    source venv/bin/activate  # On Windows use: venv\Scripts\activate
    pip install -r requirements.txt
  2. Download Voice Agent Models

    python setup_agent.py
  3. Start the Local LLM

    ollama serve
    ollama pull llama3
  4. Launch the App

    export LTA_OLLAMA_URL="http://localhost:11434"  # On Windows use: set LTA_OLLAMA_URL=http://localhost:11434
    streamlit run app.py

## 🟢 Current Implementation Status & Architecture

All core modules, offline LLM integrations, and containerization features are fully implemented and production-ready.

   =========================================================================
                  🚀 LEXTRANSITION-AI: SYSTEM ARCHITECTURE
   =========================================================================

                  [ 🖥️ Streamlit Frontend (app.py) ]
                                 |
         -------------------------------------------------
         |                       |                       |
   [ 🔄 IPC → BNS Mapper ]  [ 🖼️ Document OCR ]   [ 📚 Fact-Checker (RAG) ]
         |                       |                       |
   (SQLite Mapping DB)    (EasyOCR / PyTesseract)  (FAISS + sentence-transformers)
         |                       |                       |
         -------------------------------------------------
                                 |
                                 v
                  [ 🧠 Local LLM Engine (Ollama) ]
            (Semantic Analysis, Action Items, Summarization)
                                 |
                                 v
                  [ 🎙️ Offline Voice Agent (Piper TTS) ]
            (High-fidelity vocal dictation of AI outputs)

   =========================================================================
                     ⚙️ INFRASTRUCTURE GUARANTEES
   =========================================================================
   ✔️ 100% Offline Capable (No external API keys required)
   ✔️ Dockerized Deployment (Verified networking & TTS dependencies)
   ✔️ CI/CD Pipeline Active (GitHub Actions + Pytest)

## 💾 Data Persistence & Testing

1. **Local Data Storage (Privacy-First).** To maintain the strict offline-first architecture, no user data or legal documents ever leave your machine:
   - Relational data: mappings and system configuration are persisted in a local SQLite database (replacing the legacy mapping_db.json).
   - Vector store: uploaded law PDFs for Grounded Fact-Checking are processed and stored locally in a FAISS vector index (./vector_store).

2. **Automated Testing & CI/CD.** LexTransition-AI maintains reliability through local testing and GitHub Actions. To run the unit-test suite locally, ensure your virtual environment is active (Python 3.10) and execute:

    pip install -r requirements.txt
    pytest -q
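The SQLite mapping store described above can be pictured as a single table keyed by IPC section. A minimal sketch (the schema, table name, and file location are illustrative assumptions, not the project's actual schema):

```python
import sqlite3

# In-memory DB for illustration; the app would open a file on disk instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mapping (ipc_section TEXT PRIMARY KEY,"
    " bns_section TEXT, note TEXT)"
)
# Example row: IPC 302 (murder) maps to BNS 103 under the new code.
conn.execute(
    "INSERT INTO mapping VALUES (?, ?, ?)",
    ("302", "103", "Punishment for murder"),
)

row = conn.execute(
    "SELECT bns_section FROM mapping WHERE ipc_section = ?", ("302",)
).fetchone()
print(row[0])  # -> 103
```

Keeping the lookup in SQLite (rather than a JSON blob) gives indexed queries, transactional updates, and a schema that can grow extra columns without rewriting the whole file.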

### Continuous Integration (GitHub Actions)

Every pull request automatically triggers the .github/workflows/lextransition-ci.yml pipeline.


### OCR Benchmark Harness

To evaluate the local OCR engine's Character Error Rate (CER) and keyword recall against custom scanned datasets:

    python scripts/ocr_benchmark.py --dataset data/ocr_dataset.csv --report ocr_report.md
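Character Error Rate is the edit (Levenshtein) distance between the OCR output and the ground-truth text, divided by the ground-truth length. A minimal reference implementation of the metric (the benchmark script's actual code may differ):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(hypothesis: str, reference: str) -> float:
    """Character Error Rate: edits needed per reference character."""
    return edit_distance(hypothesis, reference) / max(len(reference), 1)

# One substituted character ("0" for "o") in a 15-character reference.
print(cer("Secti0n 302 IPC", "Section 302 IPC"))  # -> 0.0666...
```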

βš™οΈ Advanced Configuration (Environment Variables)

LexTransition-AI is designed to be plug-and-play, but power users can customize the engine behavior using environment variables. If you are using Docker, these are passed via the -e flag.

Variable Default Description
LTA_OLLAMA_URL http://localhost:11434 The endpoint for the local LLM. When running in Docker, use http://host.docker.internal:11434 to route traffic to your host machine.
LTA_OLLAMA_MODEL llama3 Specifies which local model to use for analysis and summarization.
LTA_USE_EMBEDDINGS 1 Toggles the FAISS/Sentence-Transformer RAG engine. Set to 0 to fallback to legacy keyword search.
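In code, variables like these are typically read once with their documented defaults. A sketch of such a loader (the variable names come from the table above; the function shape and key names are assumptions):

```python
import os

def load_config() -> dict:
    """Read LexTransition environment variables, falling back to the
    documented defaults when a variable is unset."""
    return {
        "ollama_url": os.getenv("LTA_OLLAMA_URL", "http://localhost:11434"),
        "ollama_model": os.getenv("LTA_OLLAMA_MODEL", "llama3"),
        # Any value other than "0" keeps the FAISS/embedding engine enabled.
        "use_embeddings": os.getenv("LTA_USE_EMBEDDINGS", "1") != "0",
    }

print(load_config())
```

Centralizing the reads keeps the Docker `-e` flags, local `export` statements, and in-app defaults consistent with one another.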

πŸ—ΊοΈ Project Roadmap & Future Scope

All foundational features (Local LLM, OCR, Vector DB, and CI/CD) are fully operational. The next phase of development focuses on expanding accessibility and enterprise utility:

  • Speech-to-Text (STT) Integration: Implement local Whisper models to allow users to verbally query the Fact-Checker without typing.
  • Multilingual Support (Indic Languages): Translate BNS mappings and OCR summaries into Hindi, Bengali, and other regional languages for broader accessibility.
  • Precedent & Case Law Expansion: Expand the RAG Vector Database beyond standard Bare Acts to include landmark judicial precedents.
  • Automated Legal Briefs: Add a reporting engine to export OCR analysis and IPC-to-BNS comparisons into cleanly formatted PDF/Docx files.

## ✨ Contributors

This project exists thanks to the amazing people who contribute their time, ideas, and improvements. We truly appreciate every contribution 💙
