Skip to content

ViratSrivastava/OCR

Repository files navigation

OCR Model

This repository contains an Optical Character Recognition (OCR) model developed using PyTorch.

Table of Contents

Introduction

This OCR model is designed to recognize and extract text from images. It leverages the power of PyTorch to build and train deep learning models for accurate text recognition.

Installation

To install the necessary dependencies, run the following command:

pip install -r requirements.txt
## Folder Structure
OCR-Project/
│── data/                    # Dataset folder
│   ├── train/               # Training images
│   ├── val/                 # Validation images
│   ├── labels/              # Labels for OCR
│── weights/                 # Folder to store trained model weights
│── nn.py                    # Model definition (CRNN)
│── train.py                  # Training script
│── utils.py                  # Helper functions
│── requirements.txt          # Dependencies

Usage

To use the OCR model, follow these steps:

  1. Clone the repository:
    git clone https://github.com/yourusername/ocr-model.git
  2. Navigate to the project directory:
    cd ocr-model
  3. Run the OCR script on an image:
    python ocr.py --image path/to/your/image.jpg

Model Architecture

The OCR model is built using a combination of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to effectively recognize text in images. The architecture includes:

  • Convolutional layers for feature extraction
  • Recurrent layers for sequence modeling
  • Fully connected layers for character classification

Training

To train the OCR model, use the following command:

python train.py --data path/to/dataset --epochs 50

Ensure you have a dataset of labeled images for training.

Evaluation

To evaluate the performance of the OCR model, run:

python evaluate.py --data path/to/testset

This will provide metrics such as accuracy and loss.

Contributing

Contributions are welcome! Please fork the repository and submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

About

Multi Langauge OCR models

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published