YOLOv8 Patent Text Region Detection Model
Model Description
patent_text_regions is a YOLOv8 model fine-tuned on a custom dataset of page-level images drawn from historical patent specifications published by the British Patent Office. It is trained to detect all text regions on a page of a patent specification as a single class. We initialize from the officially released weights of the small YOLOv8 variant, YOLOv8s (yolov8s.pt), and fine-tune on our custom dataset.
Usage
This model can be used in the same way as any pre-trained YOLOv8 model by setting the model path to best.pt.
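A minimal inference sketch, assuming the ultralytics Python package is installed and best.pt has been downloaded locally. The function name detect_text_regions, the image path page.png, and the 0.25 confidence threshold are illustrative choices, not part of the released model.

```python
def detect_text_regions(image_path, weights="best.pt", conf=0.25):
    """Return detected text regions as [x1, y1, x2, y2] pixel boxes."""
    # Imported lazily so ultralytics is only required when the function runs.
    from ultralytics import YOLO

    model = YOLO(weights)  # load the fine-tuned weights
    results = model.predict(image_path, conf=conf)  # one Results object per image
    # The model predicts a single class, so only box coordinates are returned.
    return results[0].boxes.xyxy.tolist()


# Example call (page.png is a hypothetical page scan):
# boxes = detect_text_regions("page.png")
```

Each returned box is in the coordinate system of the input image, so it can be used directly to crop text regions before further processing.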
Training Data
The dataset was created by randomly sampling 420 page images from British patent specifications published between 1850 and 1899. The data was randomly split 80-10-10 (train-val-test). Standard preprocessing (auto-orientation and resizing by stretching to 640 x 640 pixels) and the following data augmentations were then applied using Roboflow:
- Crop: 0% Minimum Zoom, 20% Maximum Zoom
- Grayscale: Apply to 15% of images
- Saturation: Between -25% and +25%
- Blur: Up to 2.5px
- Noise: Up to 0.1% of pixels
Including augmented copies, the custom dataset consists of 1,092 labelled images in total, which are made available in this repository.
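The relationship between the 420 sampled pages and the 1,092 labelled images can be checked with a short calculation. This is only a sketch: it assumes that each training image contributes three augmented versions and that the validation and test images are left unaugmented, which is not stated above but reproduces the reported total exactly. The names split_counts and AUGMENTED_VERSIONS_PER_TRAIN_IMAGE are hypothetical.

```python
def split_counts(n_images, fractions=(0.8, 0.1, 0.1)):
    """Return (train, val, test) sizes for an 80-10-10 split."""
    train = round(n_images * fractions[0])
    val = round(n_images * fractions[1])
    test = n_images - train - val  # remainder goes to the test split
    return train, val, test


train, val, test = split_counts(420)

# ASSUMPTION (not stated in the model card): three augmented versions
# per training image, with val and test images left unaugmented.
AUGMENTED_VERSIONS_PER_TRAIN_IMAGE = 3
total = train * AUGMENTED_VERSIONS_PER_TRAIN_IMAGE + val + test

print(train, val, test, total)  # prints: 336 42 42 1092
```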
Hyperparameters
We train the model using the default hyperparameters, except for the batch size (128) and the number of epochs (300).
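A training run with these settings might look like the following sketch, assuming the ultralytics Python package. The dataset config path data.yaml and the names TRAIN_OVERRIDES and fine_tune are hypothetical; only the batch size, the epoch count, and the yolov8s.pt starting weights come from the description above.

```python
# Settings that differ from the Ultralytics defaults, as stated above.
TRAIN_OVERRIDES = {"batch": 128, "epochs": 300}


def fine_tune(data_yaml="data.yaml", base_weights="yolov8s.pt"):
    """Fine-tune YOLOv8s on a dataset described by a YOLO-format YAML file."""
    # Imported lazily so ultralytics is only required when training starts.
    from ultralytics import YOLO

    model = YOLO(base_weights)  # start from the official YOLOv8s weights
    # All other hyperparameters are left at their library defaults.
    return model.train(data=data_yaml, **TRAIN_OVERRIDES)


# Example call (data.yaml is a hypothetical path to the dataset config):
# fine_tune("data.yaml")
```

A batch size of 128 at 640 x 640 resolution needs substantial GPU memory, so a smaller batch may be required on modest hardware, at the cost of departing from the reported setup.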
Evaluation
Results on the held-out test set are reported below:
- mAP50: 0.987
- mAP50-95: 0.892
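Metrics of this kind can be computed with the validation mode of the ultralytics package. The sketch below assumes a YOLO-format dataset config (here the hypothetical data.yaml) that defines a test split; the function name evaluate_on_test is also illustrative, and the exact evaluation settings behind the reported numbers are not specified above.

```python
def evaluate_on_test(weights="best.pt", data_yaml="data.yaml"):
    """Return mAP50 and mAP50-95 for the test split named in data_yaml."""
    # Imported lazily so ultralytics is only required when evaluation runs.
    from ultralytics import YOLO

    model = YOLO(weights)
    # split="test" evaluates on the test split instead of the default val split.
    metrics = model.val(data=data_yaml, split="test")
    return {"mAP50": metrics.box.map50, "mAP50-95": metrics.box.map}


# Example call (data.yaml is a hypothetical path to the dataset config):
# print(evaluate_on_test())
```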
Citation
If you use our model or custom training/evaluation data in your research, please cite our accompanying paper as follows:
@article{bct2025,
title = {300 Years of British Patents},
author = {Enrico Berkes and Matthew Lee Chen and Matteo Tranchero},
journal = {arXiv preprint arXiv:2401.12345},
year = {2025},
url = {https://arxiv.org/abs/2401.12345}
}