yui-mhcp/ocr

😋 Optical Character Recognition (OCR)

Check the CHANGELOG file for an overview of the latest changes! 😋

Project structure

├── architectures            : utilities for model architectures
│   ├── layers               : custom layer implementations
│   ├── transformers         : transformer architecture implementations
│   ├── common_blocks.py     : defines common blocks (e.g., Conv + BN + ReLU)
│   ├── crnn_arch.py         : CRNN architecture
│   ├── east_arch.py         : EAST architecture
│   ├── generation_utils.py  : utilities for text and sequence generation
│   ├── hparams.py           : hyperparameter management
│   ├── simple_models.py     : defines classical models such as CNN / RNN / MLP and siamese
│   └── yolo_arch.py         : YOLOv2 architecture
├── custom_train_objects     : custom objects used in training / testing
├── loggers                  : logging utilities for tracking experiment progress
├── models                   : main directory for model classes
│   ├── detection            : detector implementations
│   │   ├── base_detector.py : abstract base class for all detectors
│   │   ├── east.py          : EAST implementation for text detection
│   │   └── yolo.py          : YOLOv2 implementation for general object detection
│   ├── interfaces           : directories for interface classes
│   ├── ocr                  : OCR implementations
│   │   ├── base_ocr.py      : abstract base class for all OCR models
│   │   └── crnn.py          : CRNN implementation for OCR
│   └── weights_converter.py : utilities to convert weights between different models
├── tests                    : unit and integration tests for model validation
├── utils                    : utility functions for data processing and visualization
├── LICENCE                  : project license file
├── ocr.ipynb                : notebook demonstrating model creation + OCR features
├── README.md                : this file
└── requirements.txt         : required packages

Check the main project for more information about the unextended modules / structure / main classes.

Check the detection project for more information about the detection module and the EAST Scene-Text Detection model.

Available features

  • OCR (module models.ocr) :

| Feature | Function / class | Description |
| :------ | :--------------- | :---------- |
| OCR | ocr | Performs OCR on the given image(s) |

You can check the ocr notebook for a concrete demonstration.

Available models

Model architectures

Available architectures :

Model weights

Classes Dataset Architecture Trainer Weights

Models must be unzipped in the pretrained_models/ directory!

The pretrained CRNN models come from the EasyOCR library. Weights are automatically downloaded based on the language or the model name, then converted to Keras. The easyocr library itself is therefore not required, but PyTorch is required to load the original weights for conversion.

The pretrained version of EAST can be downloaded from this project. It should be placed in pretrained_models/pretrained_weights/east_vgg16.pth (torch is required to convert the weights: pip install torch).
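
As a minimal sketch (not part of this repository), the two prerequisites above can be checked before attempting to load EAST. The weights path is the one given in this README; the function name `check_east_prerequisites` and its `root` parameter are hypothetical.

```python
from importlib.util import find_spec
from pathlib import Path

# Path taken from the README; everything else in this sketch is illustrative.
EAST_WEIGHTS = Path("pretrained_models") / "pretrained_weights" / "east_vgg16.pth"


def check_east_prerequisites(root="."):
    """Return a list of human-readable problems (empty when everything is in place)."""
    problems = []
    weights = Path(root) / EAST_WEIGHTS
    if not weights.is_file():
        problems.append(f"missing weights file: {weights}")
    # torch is only needed to convert the .pth weights, so only check that it is importable.
    if find_spec("torch") is None:
        problems.append("torch is not installed (pip install torch)")
    return problems


if __name__ == "__main__":
    for problem in check_east_prerequisites():
        print(problem)
```

Running the script from the repository root prints one line per missing prerequisite and nothing when both are satisfied.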

Installation and usage

See the installation guide for a step-by-step installation 😄

Here is a summary of the installation procedure, assuming you already have a working Python environment:

  1. Clone this repository: git clone https://github.com/yui-mhcp/ocr.git
  2. Go to the root of this repository: cd ocr
  3. Install requirements: pip install -r requirements.txt
  4. Open the ocr notebook and follow the instructions!
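
After step 3, a quick check that the requirements were actually installed can save a failed notebook run. The sketch below is an assumption-laden helper, not something shipped with this project: `missing_requirements` is a hypothetical name, and it only understands simple `name` or `name==version` lines (pip options and URL requirements are skipped).

```python
import re
from importlib import metadata
from pathlib import Path


def missing_requirements(requirements_file="requirements.txt"):
    """Return the distribution names listed in the file that are not installed."""
    missing = []
    for line in Path(requirements_file).read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-") or "://" in line:
            continue  # skip blanks, pip options ("-r other.txt") and URL requirements
        # Keep only the leading distribution name, dropping extras and version specifiers.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match is None:
            continue
        name = match.group(0)
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    if Path("requirements.txt").is_file():
        print(missing_requirements() or "all listed requirements are installed")
```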

TO-DO list:

  • Make the TO-DO list
  • Convert the CRNN architecture / weights from the easyocr library to tensorflow
  • Convert the CRNN + attention architecture from this repo to tensorflow
  • Add examples to initialize pretrained models (both EAST and CRNN)
  • Add an example to perform OCR on image (with text detection)
  • Add an example to perform OCR on camera
  • Allow combining texts into lines / paragraphs (as EAST detects individual words)
  • Take into account the text rotation in the combination procedure
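
To illustrate the idea behind the last two items, here is one possible way to merge word-level detections into lines. This is a sketch under stated assumptions, not this project's implementation: the `Word` class, `group_words_into_lines` and the `tolerance` parameter are hypothetical, boxes are assumed axis-aligned, and text rotation is deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A recognized word with an axis-aligned bounding box (hypothetical structure)."""
    text: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def y_center(self):
        return (self.y_min + self.y_max) / 2

    @property
    def height(self):
        return self.y_max - self.y_min


def group_words_into_lines(words, tolerance=0.5):
    """Greedily group words whose vertical centers are close, then order each line left to right."""
    lines = []  # each entry is the list of words belonging to one text line
    for word in sorted(words, key=lambda w: w.y_center):
        if lines:
            current = lines[-1]
            line_center = sum(w.y_center for w in current) / len(current)
            # Same line if the word's center is within a fraction of its height of the line's center.
            if abs(word.y_center - line_center) <= tolerance * word.height:
                current.append(word)
                continue
        lines.append([word])
    return [" ".join(w.text for w in sorted(line, key=lambda w: w.x_min)) for line in lines]


if __name__ == "__main__":
    detections = [
        Word("line", 70, 41, 100, 61),
        Word("world", 60, 2, 110, 22),
        Word("Hello", 0, 0, 50, 20),
        Word("second", 0, 40, 60, 60),
    ]
    print(group_words_into_lines(detections))
```

Handling rotated text would require comparing positions along the text's own baseline rather than the image's vertical axis, which this sketch does not attempt.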

Notes and references

GitHub projects

The code for the CRNN architecture is heavily inspired by the easyocr repo:

The code for the EAST part of this project is heavily inspired by this repo:

Papers

Datasets

Tutorials

Contacts and licence

Contacts:

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See the LICENSE file for details.

This license allows you to use, modify, and distribute the code, as long as you include the original copyright and license notice in any copy of the software/source. Additionally, if you modify the code and distribute it, or run it on a server as a service, you must make your modified version available under the same license.

For more information about the AGPL-3.0 license, please visit the official website.

Citation

If you find this project useful in your work, please add this citation to give it more visibility! 😋

@misc{yui-mhcp,
    author  = {yui},
    title   = {A Deep Learning projects centralization},
    year    = {2021},
    publisher   = {GitHub},
    howpublished    = {\url{https://github.com/yui-mhcp}}
}
