forked from PaddlePaddle/PaddleOCR
root committed Jun 8, 2020 (1 parent bf34514, commit 5df1f7e)
Showing 13 changed files with 113 additions and 14 deletions.
@@ -0,0 +1,42 @@
## FAQ

1. **Prediction error: got an unexpected keyword argument 'gradient_clip'**
The installed paddle version is wrong. This project currently supports only paddle 1.7; support for 1.8 is planned for the near future.
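
As a quick sanity check before running prediction, the version constraint above can be expressed as a small helper. This is only a sketch: the function name `is_supported_paddle_version` is hypothetical and not part of PaddleOCR, and it assumes the version string looks like `1.7.2`. With paddle installed, the string to pass in would typically be `paddle.__version__`.

```python
def is_supported_paddle_version(version, supported=(1, 7)):
    """Return True if `version` (e.g. "1.7.2") has the supported major.minor.

    Hypothetical helper for illustration; not part of PaddleOCR.
    """
    parts = version.split(".")
    try:
        major_minor = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        # Unparseable strings are treated as unsupported.
        return False
    return major_minor == supported


if __name__ == "__main__":
    # With paddle installed, pass paddle.__version__ instead of a literal.
    for v in ("1.7.2", "1.8.0", "1.6.3"):
        print(v, is_supported_paddle_version(v))
```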
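
2. **Error when converting the attention recognition model: KeyError: 'predict'**
Inference for recognition models based on Attention loss is still being debugged. For Chinese text recognition, we recommend choosing a CTC-loss-based recognition model first; in practice we have also found that Attention-loss-based models perform worse than CTC-loss-based ones.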
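
3. **About inference speed**
When an image contains a lot of text, prediction time increases. You can use --rec_batch_num to set a smaller prediction batch size; the default is 30, and it can be changed to 10 or another value.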
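
To make the effect of the batch size concrete, here is a minimal sketch of how a list of detected text crops might be split into recognition batches. It is an illustration only, not PaddleOCR's actual implementation; `split_into_batches` and the placeholder crop names are hypothetical. A smaller batch number yields more, smaller batches.

```python
def split_into_batches(items, batch_num=30):
    """Split `items` into consecutive batches of at most `batch_num` elements."""
    if batch_num <= 0:
        raise ValueError("batch_num must be positive")
    return [items[i:i + batch_num] for i in range(0, len(items), batch_num)]


if __name__ == "__main__":
    # Stand-ins for cropped text-line images (hypothetical data).
    crops = ["crop_%d" % i for i in range(65)]
    for n in (30, 10):
        sizes = [len(batch) for batch in split_into_batches(crops, n)]
        print(n, sizes)
```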
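
4. **Service deployment and mobile deployment**
A Serving-based service deployment solution and a Paddle Lite-based mobile deployment solution are expected to be released one after the other in mid-to-late June. Stay tuned.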

5. **Release schedule for self-developed algorithms**
The self-developed algorithms SAST, SRN, and End2End-PSL will all be released over June and July. Stay tuned.
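
6. **How to run on Windows or Mac**
PaddleOCR has been adapted to Windows and Mac. Note two points when running it: (1) During [quick installation](installation.md), if you do not want to install docker, you can skip the first step and start directly from the second step, installing paddle. (2) When downloading the inference models, if wget is not installed, you can click the model link directly or copy the link address into a browser to download the file, then extract it into the corresponding directory.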
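
Where wget is unavailable, the download-and-extract step can also be scripted with the Python standard library. The following is a sketch under assumptions: `download_and_extract` is a hypothetical helper, the model is assumed to be distributed as a tar archive, and the URL and target directory must be taken from the installation guide (none are given here).

```python
import os
import shutil
import tarfile
import urllib.request


def download_and_extract(url, dest_dir):
    """Download a tar archive from `url` and extract it into `dest_dir`.

    Returns the path of the downloaded archive. Hypothetical helper.
    """
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    # Stream the response to disk instead of loading it all into memory.
    with urllib.request.urlopen(url) as response, open(archive_path, "wb") as out:
        shutil.copyfileobj(response, out)
    # Only extract archives from sources you trust.
    with tarfile.open(archive_path) as tar:
        tar.extractall(dest_dir)
    return archive_path
```

Call it with the model link copied from the documentation and the directory where the inference model should live.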
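
7. **Differences between the ultra-lightweight model and the general OCR model**
PaddleOCR has currently open-sourced 2 Chinese models: the 8.6M ultra-lightweight Chinese model and the general Chinese OCR model. They compare as follows:
- Similarities: both use the same **algorithms** and **training data**;
- Differences: they differ in **backbone network** and **channel parameters**. The ultra-lightweight model uses MobileNetV3 as its backbone; the general model uses Resnet50_vd as the detection backbone and Resnet34_vd as the recognition backbone. For the detailed parameter differences, compare the training configuration files of the two models.

|Model|Backbone|Detection training config|Recognition training config|
|-|-|-|-|
|8.6M ultra-lightweight Chinese OCR model|MobileNetV3+MobileNetV3|det_mv3_db.yml|rec_chinese_lite_train.yml|
|General Chinese OCR model|Resnet50_vd+Resnet34_vd|det_r50_vd_db.yml|rec_chinese_common_train.yml|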
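
8. **Are there plans to open-source models that recognize only digits, or only English plus digits?**
There are currently no plans to open-source digit-only, digit-plus-English, or other narrow-vertical models. PaddleOCR has open-sourced a variety of detection and recognition algorithms for custom training, and the two Chinese models were themselves trained with this open-source algorithm library. If you have a narrow-vertical need, you can prepare data following the tutorial, choose a suitable configuration file, and train your own model; we believe the results will be good. If you run into any problems during training, feel free to open an issue or ask in the discussion group, and we will respond promptly.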
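
9. **What training data do the open-source models use, and can it be open-sourced?**
The datasets behind the currently open-sourced models, and their sizes, are as follows:
- Detection:
  - English dataset: ICDAR2015
  - Chinese dataset: 30,000 training images from the LSVT street view dataset
- Recognition:
  - English dataset: MJSynth and SynthText synthetic data, more than ten million samples.
  - Chinese dataset: text regions cropped from the LSVT street view dataset according to the ground truth and then position-calibrated, 300,000 images in total. In addition, 5 million images were synthesized from the LSVT corpus.

The public datasets are all openly available; users can search for and download them, or refer to [Chinese datasets](datasets.md). The synthetic data is not being open-sourced for now; users can synthesize their own with open-source tools such as [text_renderer](https://github.com/Sanster/text_renderer), [SynthText](https://github.com/ankush-me/SynthText), and [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator).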
Binary file not shown.
@@ -0,0 +1,59 @@
## Datasets
This is a collection of commonly used Chinese datasets, updated continuously. Contributions of datasets are welcome!
- [ICDAR2019-LSVT](#ICDAR2019-LSVT)
- [ICDAR2017-RCTW-17](#ICDAR2017-RCTW-17)
- [Chinese Street View Text Recognition](#中文街景文字识别)
- [Chinese Document Text Recognition](#中文文档文字识别)
- [ICDAR2019-ArT](#ICDAR2019-ArT)

In addition to open-source data, users can also synthesize data themselves. Synthesis tools to consider include [text_renderer](https://github.com/Sanster/text_renderer), [SynthText](https://github.com/ankush-me/SynthText), and [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator).

<a name="ICDAR2019-LSVT"></a>
#### 1. ICDAR2019-LSVT
- **Source**: https://ai.baidu.com/broad/introduction?dataset=lsvt
- **Description**: 450,000 Chinese street view images in total, including 50,000 fully annotated images (20,000 test + 30,000 training; text coordinates + text content) and 400,000 weakly annotated images (text content only), as shown below:
[Image] (a) Fully annotated data
[Image] (b) Weakly annotated data
- **Download**: https://ai.baidu.com/broad/download?dataset=lsvt

<a name="ICDAR2017-RCTW-17"></a>
#### 2. ICDAR2017-RCTW-17
- **Source**: https://rctw.vlrlab.net/
- **Description**: Contains 12,000+ images. Most were captured in the wild with phone cameras; some are screenshots. The images cover a wide variety of scenes, including street views, posters, menus, indoor scenes, and screenshots of mobile apps.
- **Download**: https://rctw.vlrlab.net/dataset/

<a name="中文街景文字识别"></a>
#### 3. Chinese Street View Text Recognition
- **Source**: https://aistudio.baidu.com/aistudio/competition/detail/8
- **Description**: 290,000 images in total, of which 210,000 form the training set (annotated) and 80,000 form the test set (unannotated). The dataset was collected from Chinese street views and consists of text-line regions (such as shop signs and landmarks) cropped from street view images. All images have been preprocessed: each text region is mapped by an affine transformation, preserving the aspect ratio, to an image 48 pixels high, as shown:
[Image] (a) Annotation: 魅派集成吊顶
[Image] (b) Annotation: 母婴用品连锁
- **Download**: https://aistudio.baidu.com/aistudio/datasetdetail/8429
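
The proportional mapping to a fixed height of 48 pixels comes down to simple arithmetic on the image size. A sketch follows, assuming the width is rounded to the nearest pixel (the dataset description does not state the rounding rule); `scaled_size` is a hypothetical name. The actual warp of the pixel data would be done by an image library and is not shown.

```python
def scaled_size(width, height, target_height=48):
    """Return (new_width, new_height) after scaling to `target_height`
    while keeping the aspect ratio.

    Rounding to the nearest pixel is an assumption for illustration.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height
```

For example, a 200x96 text region would map to 100x48 under this rule.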

<a name="中文文档文字识别"></a>
#### 4. Chinese Document Text Recognition
- **Source**: https://github.com/YCG09/chinese_ocr
- **Description**:
  - About 3.64 million images in total, split 99:1 into training and validation sets.
  - The data is randomly generated from a Chinese corpus (news + classical Chinese) with variations in font, size, grayscale, blur, perspective, and stretching.
  - Covers 5,990 characters in total: Chinese characters, English letters, digits, and punctuation (character set: https://github.com/YCG09/chinese_ocr/blob/master/train/char_std_5990.txt )
  - Each sample contains exactly 10 characters, cut at random from sentences in the corpus.
  - Image resolution is uniformly 280x32.
- **Download**: https://pan.baidu.com/s/1QkI7kjah8SPHwOQ40rS1Pw (password: lu7m)
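
Two of the properties described above, fixed 10-character samples cut from corpus sentences and a 99:1 train/validation split, can be sketched in a few lines. This is not the dataset authors' generation code: the function names are hypothetical, and shuffling before the split is an assumption.

```python
import random


def sample_text(sentence, length=10, rng=random):
    """Cut a random run of `length` consecutive characters from `sentence`."""
    if len(sentence) < length:
        raise ValueError("sentence is shorter than the requested length")
    start = rng.randrange(len(sentence) - length + 1)
    return sentence[start:start + length]


def split_train_val(samples, train_parts=99, val_parts=1, seed=0):
    """Shuffle `samples` and split them train:val = train_parts:val_parts.

    Shuffling first is an assumption made for this illustration.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = len(shuffled) * val_parts // (train_parts + val_parts)
    return shuffled[n_val:], shuffled[:n_val]
```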

<a name="ICDAR2019-ArT"></a>
#### 5. ICDAR2019-ArT
- **Source**: https://ai.baidu.com/broad/introduction?dataset=art
- **Description**: 10,166 images in total: 5,603 in the training set and 4,563 in the test set. It is composed of three parts, Total-Text, SCUT-CTW1500, and Baidu Curved Scene Text, and contains text of various shapes, including horizontal, multi-oriented, and curved.
- **Download**: https://ai.baidu.com/broad/download?dataset=art