suyashsachdeva/Document_analysis

Document_analysis 📝

A Python-based toolkit for document layout analysis and classification, leveraging deep learning (CNNs/Transformers) to detect and segment document elements such as text, figures, tables, headers, and footnotes.

🚀 Features

  • Detects and segments layout components: titles, text blocks, images, tables, etc.
  • Trains on large-scale datasets like PubLayNet.
  • End-to-end pipeline: Data preparation ➜ Model training ➜ Inference ➜ Evaluation.
  • Supports state-of-the-art architectures: Faster R-CNN, Cascade R-CNN, or Transformer-based detectors.
  • Metrics: mean average precision (mAP) and intersection over union (IoU) for layout components.
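
As a rough illustration of the IoU metric listed above, the sketch below computes the overlap ratio of two axis-aligned boxes. The function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box convention are assumptions made for this example; the repository's evaluation code may use a different layout.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max).

    Hypothetical helper for illustration; not taken from the repository.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped to zero when the
    # boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes sharing a 1x1 corner: intersection 1, union 7.
overlap = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A predicted box is usually counted as a correct detection when its IoU with a ground-truth box of the same class exceeds a chosen threshold, and mAP averages precision over classes (and often over several such thresholds).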

🗂️ Table of Contents

  1. Installation
  2. Dataset
  3. Usage
  4. Examples
  5. Project Structure
  6. Configuration
  7. Troubleshooting
  8. License
  9. Contact

Installation

  1. Clone the repo:
    git clone https://github.com/suyashsachdeva/Document_analysis.git
    cd Document_analysis

  2. (Recommended) Create a virtual environment:
    python3 -m venv venv
    source venv/bin/activate

  3. Install dependencies:
    pip install -r requirements.txt

🚀 Usage

1. Data Preparation

Prepare and preprocess data:

python scripts/prepare_data.py \
  --input_dir data/publaynet/images \
  --anno_dir data/publaynet/annotations \
  --output_dir processed_data
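
Before running the preparation step it can help to sanity-check the annotations. PubLayNet ships its labels as COCO-style JSON, so a minimal sketch for counting annotations per category might look like the following. The helper names and the tiny in-memory example are assumptions for illustration only; this is not the logic of `scripts/prepare_data.py`.

```python
import json
from collections import Counter


def load_coco(path):
    """Load a COCO-style annotation file from disk."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def count_annotations_per_category(coco):
    """Return {category_name: number_of_annotations} for a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])
    return dict(counts)


# Tiny stand-in for a real annotation file (made-up values, illustration only).
# COCO boxes are stored as [x, y, width, height].
example = {
    "images": [{"id": 1, "file_name": "page_1.jpg", "width": 600, "height": 800}],
    "categories": [{"id": 1, "name": "text"}, {"id": 4, "name": "table"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [50, 60, 500, 120]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 200, 500, 90]},
        {"id": 12, "image_id": 1, "category_id": 4, "bbox": [50, 320, 500, 300]},
    ],
}

category_counts = count_annotations_per_category(example)
print(category_counts)
```

For a real file, `load_coco` would be pointed at a JSON file under `data/publaynet/annotations`; a heavily skewed count is an early hint that some classes may need more data or augmentation.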

2. Train the Model

Train a layout detection model:

python scripts/train.py \
  --data_dir processed_data \
  --model_dir models/layout_detector \
  --epochs 30 \
  --batch_size 8 \
  --lr 1e-4

3. Run Inference

Detect layouts on new PDF or image files:

python scripts/inference.py \
  --model_dir models/layout_detector \
  --input_file samples/sample_page.jpg \
  --output_file results/prediction.json
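
The README does not document the structure of the output file, so the sketch below assumes a hypothetical schema: a JSON list of detections, each with `label`, `score`, and `bbox` keys. Under that assumption, post-processing such as dropping low-confidence detections could look like this; adjust the keys to whatever `inference.py` actually writes.

```python
import json


def filter_predictions(predictions, min_score=0.5):
    """Keep detections with confidence >= min_score, sorted highest first.

    Assumes each detection is a dict with a numeric "score" key
    (hypothetical schema, not confirmed by the repository).
    """
    kept = [p for p in predictions if p["score"] >= min_score]
    return sorted(kept, key=lambda p: p["score"], reverse=True)


# Made-up stand-in for the contents of a prediction file.
raw = json.dumps([
    {"label": "title", "score": 0.97, "bbox": [40, 30, 560, 80]},
    {"label": "text", "score": 0.31, "bbox": [40, 100, 560, 140]},
    {"label": "table", "score": 0.88, "bbox": [40, 300, 560, 620]},
])

predictions = json.loads(raw)
confident = filter_predictions(predictions, min_score=0.5)
for det in confident:
    print(det["label"], det["score"])
```

The threshold trades recall for precision: a higher `min_score` removes more spurious boxes but may also drop small or unusual layout elements.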

📸 Examples

Visualizations of detected layouts overlaid on sample documents are available in the results/ directory.


🗂️ Project Structure

Document_analysis/
├── data/
│   └── publaynet/
├── processed_data/
├── models/
│   └── layout_detector/
├── scripts/
│   ├── prepare_data.py
│   ├── train.py
│   └── inference.py
├── requirements.txt
└── README.md

⚙️ Configuration

Customize parameters via CLI flags or config files:

  • --epochs, --batch_size, --lr
  • Paths: --data_dir, --model_dir, --input_file, --output_file
  • Backbone model params (ResNet, Transformer)
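
To make the flag list concrete, here is a minimal `argparse` sketch that accepts the training flags named above. It is an illustration, not the repository's actual parser: the defaults simply mirror the values used in the training example, and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the training flags listed in this README (sketch)."""
    parser = argparse.ArgumentParser(description="Train a layout detection model.")
    parser.add_argument("--data_dir", required=True, help="Directory with preprocessed data.")
    parser.add_argument("--model_dir", required=True, help="Where checkpoints are written.")
    # Defaults below are assumptions taken from the usage example.
    parser.add_argument("--epochs", type=int, default=30, help="Number of training epochs.")
    parser.add_argument("--batch_size", type=int, default=8, help="Images per batch.")
    parser.add_argument("--lr", type=float, default=1e-4, help="Learning rate.")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["--data_dir", "processed_data", "--model_dir", "models/layout_detector", "--lr", "5e-5"]
)
print(args.epochs, args.batch_size, args.lr)
```

Flags omitted on the command line fall back to their defaults, so only the values that differ from the baseline need to be passed.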

🛠️ Troubleshooting

  • CUDA errors: Ensure CUDA toolkit and GPU drivers are installed correctly.
  • Slow training or out-of-memory errors: Reduce the batch size or lower the input image resolution.
  • Low accuracy: Check dataset labels, augmentation pipeline, or model depth.

📬 Contact

For questions or bug reports, open an issue on the repository or contact Suyash Sachdeva.

