astevens-lmds/Tabular_Review

Tabular Review for Bulk Document Analysis

An AI-powered document review workspace that transforms unstructured legal contracts into structured, queryable datasets. Designed for legal professionals, auditors, and procurement teams to accelerate due diligence and contract analysis.

🚀 Features

  • AI-Powered Extraction: Automatically extract key clauses, dates, amounts, and entities from PDF and DOCX files using Google Gemini (2.5 Flash, 2.5 Pro, or 3.0 Pro).
  • High-Fidelity Conversion: Uses Docling (running locally) to convert PDFs and DOCX files to clean Markdown text, preserving formatting and structure without hallucination.
  • Dynamic Schema: Define columns with natural language prompts (e.g., "What is the governing law?").
  • Verification & Citations: Click any extracted cell to view the exact source quote highlighted in the original document.
  • Spreadsheet Interface: A high-density, Excel-like grid for managing bulk document reviews.
  • Integrated Chat Analyst: Ask questions across your entire dataset (e.g., "Which contract has the most favorable MFN clause?").
  • Real-Time Progress: Progress bar showing extraction status (X/Y cells completed).
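
The Verification & Citations feature implies that an extracted quote can be located in the converted document text. The sketch below shows one possible way to do that, assuming whitespace-insensitive matching is enough. `find_quote` is a hypothetical helper written for illustration; it is not taken from the project, whose own logic lives in the frontend.

```python
import re
from typing import Optional, Tuple


def find_quote(document: str, quote: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of `quote` in `document`, or None.

    Any run of whitespace in the quote matches any run of whitespace in
    the document, so line wrapping in the converted Markdown does not
    break the match.
    """
    words = quote.split()
    if not words:
        return None
    # Escape each word literally, then allow flexible whitespace between words.
    pattern = r"\s+".join(re.escape(word) for word in words)
    match = re.search(pattern, document)
    return (match.start(), match.end()) if match else None


text = "This Agreement shall be governed by\nthe laws of New York."
span = find_quote(text, "governed by the laws of New York")
if span is not None:
    start, end = span
    print(text[start:end])  # the span a viewer could highlight
```

Returning offsets rather than a boolean lets a viewer highlight the matched span. A `None` result is a useful signal that the model's quote does not appear verbatim in the source.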

🎬 Demo

TabularReview.Final.mp4

πŸ— Architecture

┌─────────────────────────────────┐
│         React Frontend          │
│  (Vite + TypeScript + Tailwind) │
│                                 │
│  ┌─────────┐  ┌──────────────┐  │
│  │DataGrid │  │ Gemini SDK   │  │
│  │Sidebar  │  │ (extraction  │  │
│  │Chat     │  │  & chat)     │  │
│  └─────────┘  └──────┬───────┘  │
│                      │          │
│       Google Gemini API         │
└──────────┬──────────────────────┘
           │ /convert (file upload)
┌──────────▼──────────────────────┐
│      FastAPI Backend            │
│  (Python + Docling)             │
│                                 │
│  PDF/DOCX → Markdown conversion │
└─────────────────────────────────┘

Frontend (/): React 19 SPA. Handles the grid UI, Gemini API calls for extraction/chat (client-side via @google/genai SDK), and document viewing.

Backend (/server): Python FastAPI server running Docling for document conversion. Converts uploaded PDF/DOCX files to Markdown text. The frontend sends files here before storing them.
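
To make the upload step concrete, here is a minimal sketch of a client that prepares a file upload for the backend's /convert endpoint using only the Python standard library. The form field name `file` and the helper `build_convert_request` are assumptions made for illustration; the README does not state the field name or the response shape, so check server/main.py before relying on either.

```python
import uuid
import urllib.request


def build_convert_request(api_url: str, filename: str, data: bytes,
                          field_name: str = "file") -> urllib.request.Request:
    """Build a multipart/form-data POST to <api_url>/convert.

    ASSUMPTION: the upload field is named "file"; this is not confirmed
    by the README.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=api_url.rstrip("/") + "/convert",
        data=head + data + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


request = build_convert_request("http://localhost:8000", "contract.pdf", b"%PDF-1.7 ...")
print(request.get_method(), request.full_url)
# To actually send it (requires the backend to be running):
#     with urllib.request.urlopen(request) as response:
#         body = response.read().decode("utf-8")
```

Building the request separately from sending it keeps the sketch runnable without a live server.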

🛠 Tech Stack

  • Frontend: React 19, TypeScript, Tailwind CSS, Vite
  • AI Integration: Google GenAI SDK (Gemini 2.5 Flash, 2.5 Pro, 3.0 Pro)
  • Backend: Python, FastAPI, Docling (document conversion)

📦 Getting Started

Prerequisites

  • Node.js 18+ (with npm)
  • Python 3.10+
  • Google Gemini API Key (get one from Google AI Studio)

1. Clone the repository

git clone https://github.com/astevens-lmds/Tabular_Review.git
cd Tabular_Review

2. Environment Variables

Copy the example file and add your API key:

cp .env.example .env

Edit .env and set:

VITE_GEMINI_API_KEY=your_google_api_key_here
VITE_API_URL=http://localhost:8000

Variable             Required  Description
VITE_GEMINI_API_KEY  Yes       Google Gemini API key for AI extraction and chat
VITE_API_URL         No        Backend URL (defaults to http://localhost:8000)
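
The required/optional rules for these two variables can be sketched as follows. This only illustrates the rules; it is not how the app reads its configuration (Vite exposes `VITE_`-prefixed variables to the frontend through `import.meta.env`). `parse_env` and `resolve_config` are hypothetical helpers.

```python
from typing import Dict, Mapping

DEFAULT_API_URL = "http://localhost:8000"


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_config(env: Mapping[str, str]) -> Dict[str, str]:
    """Apply the documented rules: API key required, API URL optional."""
    api_key = env.get("VITE_GEMINI_API_KEY", "")
    if not api_key:
        raise ValueError("VITE_GEMINI_API_KEY is required")
    return {
        "VITE_GEMINI_API_KEY": api_key,
        # Fall back to the documented default when the URL is unset or empty.
        "VITE_API_URL": env.get("VITE_API_URL") or DEFAULT_API_URL,
    }


config = resolve_config(parse_env("VITE_GEMINI_API_KEY=your_google_api_key_here\n"))
print(config["VITE_API_URL"])
```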

3. Setup Frontend

npm install

4. Setup Backend (Docling)

The backend is required for document conversion (PDF/DOCX → Markdown).

cd server
python3 -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
pip install -r requirements.txt

5. Run

Start the backend (in one terminal):

cd server
source venv/bin/activate
python main.py
# Server runs at http://localhost:8000

Start the frontend (in another terminal):

npm run dev
# App runs at http://localhost:3000

🐳 Docker Deployment (Alternative)

cp .env.example .env
# Edit .env and add your Google Gemini API key

docker-compose up --build

🧪 Testing

Frontend (Vitest)

npx vitest run          # Run all tests once
npx vitest              # Watch mode

Tests live in tests/ and cover components (App, DataGrid, AddColumnMenu, KeyboardShortcutsHelp, ErrorBoundary, BatchUploadProgress), utilities (CSV/PDF export, column templates, theming), and type definitions.

Backend (pytest)

cd server
source venv/bin/activate
pytest test_main.py -v

Backend tests cover the /convert and /health endpoints, rate limiting, file size limits, CORS headers, and filename validation.

πŸ“ Project Structure

├── App.tsx                    # Main application component
├── index.tsx                  # React entry point
├── types.ts                   # TypeScript type definitions
├── components/
│   ├── DataGrid.tsx           # Spreadsheet-like grid
│   ├── VerificationSidebar.tsx # Cell inspection & document viewer
│   ├── ChatInterface.tsx      # AI chat analyst
│   ├── AddColumnMenu.tsx      # Column creation/editing
│   ├── BatchUploadProgress.tsx # Batch upload progress overlay
│   ├── ColumnTemplateMenu.tsx # Pre-built column templates
│   ├── ErrorBoundary.tsx      # React error boundary
│   ├── KeyboardShortcutsHelp.tsx # Shortcuts modal
│   ├── ProjectManager.tsx     # Project save/load
│   └── Icons.tsx              # Icon re-exports from lucide-react
├── hooks/
│   ├── useTheme.ts            # Dark mode hook
│   └── useKeyboardShortcuts.ts # Keyboard shortcut handler
├── services/
│   ├── geminiService.ts       # Gemini API integration
│   ├── documentProcessor.ts   # Frontend → backend file conversion
│   ├── batchExport.ts         # Multi-format export
│   └── projectStore.ts        # LocalStorage project persistence
├── tests/                     # Vitest test suite
├── utils/
│   ├── sampleData.ts          # Built-in sample documents
│   └── columnTemplates.ts     # Column template definitions
├── server/
│   ├── main.py                # FastAPI backend
│   └── requirements.txt       # Python dependencies
├── .env.example               # Environment variable template
├── vite.config.ts             # Vite configuration
├── tsconfig.json              # TypeScript configuration
├── docker-compose.yml         # Docker setup
├── Dockerfile.frontend        # Frontend Docker image
└── Dockerfile.backend         # Backend Docker image

📖 API Documentation

The FastAPI backend includes auto-generated interactive API documentation. By default, FastAPI serves Swagger UI at /docs and ReDoc at /redoc on the backend URL.

Endpoints

Method  Path      Description
POST    /convert  Upload a document (PDF, DOCX, etc.) and receive Markdown text. Rate-limited to 30 req/min per IP.
GET     /health   Health check endpoint. Returns {"status": "ok"}.
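
The "30 req/min per IP" limit on /convert can be pictured with a sliding-window counter. The sketch below is an assumption about how such a limit might behave, not the server's actual implementation, which may use a library or a different windowing scheme. `SlidingWindowLimiter` is a hypothetical class.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window_seconds`."""

    def __init__(self, limit: int = 30, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        # `now` is injectable so the behaviour can be checked without sleeping.
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
# 31 requests from one address, one per second: the 31st exceeds the limit.
results = [limiter.allow("203.0.113.7", now=float(i)) for i in range(31)]
print(results.count(True), results.count(False))
```

A client that receives a rejection would typically wait and retry. The README does not specify which status code or headers the server returns in that case.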

🛑 License

This project is licensed under the MIT License - see the LICENSE file for details.


Disclaimer: This tool is an AI assistant and should not be used as a substitute for professional legal advice. Always verify AI-generated results against the original documents.
