This repository contains an Optical Character Recognition (OCR) model built with PyTorch. The model recognizes and extracts text from images.
## Installation
To install the necessary dependencies, run the following command:
pip install -r requirements.txt
## Folder Structure
OCR-Project/
├── data/                 # Dataset folder
│   ├── train/            # Training images
│   ├── val/              # Validation images
│   └── labels/           # Labels for OCR
├── weights/              # Folder to store trained model weights
├── nn.py                 # Model definition (CRNN)
├── train.py              # Training script
├── utils.py              # Helper functions
└── requirements.txt      # Dependencies
## Usage
To use the OCR model, follow these steps:
- Clone the repository:
git clone https://github.com/yourusername/ocr-model.git
- Navigate to the project directory:
cd ocr-model
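- Run the OCR script on an image (`ocr.py` is not listed in the folder structure above, so check that it is present in your checkout):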
python ocr.py --image path/to/your/image.jpg
## Model Architecture
The model, defined in `nn.py`, is a convolutional recurrent neural network (CRNN): it combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to recognize text in images. The architecture includes:
- Convolutional layers for feature extraction
- Recurrent layers for sequence modeling
- Fully connected layers for character classification
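
The three stages above can be sketched in PyTorch as follows. This is a minimal illustration and not the actual contents of `nn.py`: the class name `CRNN`, the layer sizes, the grayscale 32-pixel-high input, and the 37-class alphabet are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class CRNN(nn.Module):
    """Illustrative CNN + RNN + linear text recognizer (hypothetical sizes)."""

    def __init__(self, num_classes: int, img_height: int = 32, hidden_size: int = 128):
        super().__init__()
        # Convolutional layers: extract a feature map from a grayscale image.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),  # height and width halved
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),  # height and width halved again
        )
        feature_height = img_height // 4
        # Recurrent layers: read the feature map column by column.
        self.rnn = nn.LSTM(
            input_size=128 * feature_height,
            hidden_size=hidden_size,
            num_layers=2,
            bidirectional=True,
            batch_first=True,
        )
        # Fully connected layer: character scores for every time step.
        self.fc = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        features = self.cnn(images)  # (batch, channels, height, width)
        b, c, h, w = features.shape
        # Treat each column of the feature map as one time step.
        sequence = features.permute(0, 3, 1, 2).reshape(b, w, c * h)
        outputs, _ = self.rnn(sequence)  # (batch, width, 2 * hidden_size)
        return self.fc(outputs)  # (batch, width, num_classes)


# Two dummy 32x100 grayscale images; 37 classes is an assumed alphabet size.
model = CRNN(num_classes=37)
logits = model(torch.zeros(2, 1, 32, 100))
print(logits.shape)  # torch.Size([2, 25, 37])
```

Each of the 25 output positions corresponds to one column of the downsampled feature map, so wider images yield longer prediction sequences.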
## Training
To train the OCR model, use the following command:
python train.py --data path/to/dataset --epochs 50
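Training requires a dataset of labeled images. Following the folder structure above, images go in `data/train/` and `data/val/`, and their labels in `data/labels/`.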
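
This README does not say which loss `train.py` uses. CRNN models are commonly trained with CTC loss, so the sketch below assumes that and shows a single optimization step with `torch.nn.CTCLoss`. The stand-in linear model, the random features, the tensor sizes, and the label indices are all hypothetical and only demonstrate the tensor shapes CTC expects.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed sizes: 37 classes (index 0 reserved for the CTC blank),
# 25 time steps per image, batch of 2.
num_classes, time_steps, batch_size = 37, 25, 2

# Stand-in for the real network: maps 16 features per time step to class scores.
model = nn.Linear(16, num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

# Dummy batch: per-time-step features and padded integer-encoded labels.
features = torch.randn(batch_size, time_steps, 16)
targets = torch.tensor([[1, 2, 3], [4, 5, 0]])  # second label is padded
target_lengths = torch.tensor([3, 2])           # true label lengths
input_lengths = torch.full((batch_size,), time_steps, dtype=torch.long)

# CTCLoss expects log-probabilities shaped (time, batch, classes).
logits = model(features)                              # (batch, time, classes)
log_probs = logits.log_softmax(dim=2).permute(1, 0, 2)

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

CTC lets the network emit one prediction per image column without needing character-level position labels, which is why it pairs naturally with the column-wise sequence a CRNN produces.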
## Evaluation
To evaluate the performance of the OCR model, run:
python evaluate.py --data path/to/testset
The script reports metrics such as accuracy and loss.
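
How accuracy is computed is not specified here. As a rough sketch, two common OCR metrics are exact-match accuracy (the fraction of images whose predicted text equals the label) and character error rate (edit distance divided by label length). The function names `edit_distance` and `evaluate_predictions` below are hypothetical and not part of this repository.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate_predictions(predictions, references):
    """Return exact-match accuracy and character error rate for paired strings."""
    if len(predictions) != len(references) or not references:
        raise ValueError("predictions and references must be non-empty and equal in length")
    exact = sum(p == r for p, r in zip(predictions, references))
    total_edits = sum(edit_distance(p, r) for p, r in zip(predictions, references))
    total_chars = sum(len(r) for r in references)
    return {
        "exact_match_accuracy": exact / len(references),
        "char_error_rate": total_edits / max(total_chars, 1),
    }


metrics = evaluate_predictions(["hello", "w0rld"], ["hello", "world"])
print(metrics)  # one of two strings matches; 1 edit over 10 characters
```

Character error rate is often more informative than exact-match accuracy for long text lines, where a single wrong character makes the whole prediction count as a miss.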
## Contributing
Contributions are welcome! Please fork the repository and submit a pull request.
## License
This project is licensed under the MIT License. See the LICENSE file for details.