0 parents
commit 33ea9e0
Showing 22 changed files with 6,856 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
.DS_Store | ||
.vscode/ | ||
*/__pycache__/* |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
Improving Word Recognition using Multiple Hypotheses and Deep Embeddings | ||
================================================================================= | ||
[arXiv](http://arxiv.org/abs/2010.14411) | ||
[License: MIT](LICENSE) | ||
|
||
### [Project page](https://sid2697.github.io/embednet_cab/) | [Paper](https://arxiv.org/pdf/2010.14411.pdf) | Poster (Coming soon!) | ||
|
||
This repository contains code for the paper | ||
|
||
"**Improving Word Recognition using Multiple Hypotheses and Deep Embeddings**" *[Siddhant Bansal](https://sid2697.github.io), [Praveen Krishnan](https://kris314.github.io), [C.V. Jawahar](https://faculty.iiit.ac.in/~jawahar/index.html)* | ||
published in [ICPR 2020](https://www.icpr2020.it). | ||
|
||
## Abstract | ||
We propose a novel scheme for improving word recognition accuracy using word image embeddings. We use a trained text recognizer, which can predict multiple text hypotheses for a given word image. Our fusion scheme improves the recognition process by utilizing the word image and text embeddings obtained from a trained word image embedding network. We propose EmbedNet, which is trained using a triplet loss for learning a suitable embedding space where the embedding of the word image lies closer to the embedding of the corresponding text transcription. The updated embedding space thus helps in choosing the correct prediction with higher confidence. To further improve the accuracy, we propose a plug-and-play module called Confidence based Accuracy Booster (CAB). The CAB module takes in the confidence scores obtained from the text recognizer and the Euclidean distances between the embeddings to generate an updated distance vector. The updated distance vector has lower distance values for the correct words and higher distance values for the incorrect words. We rigorously evaluate our proposed method on a collection of books in the Hindi language. Our method achieves an absolute improvement of around 10% in terms of word recognition accuracy. | ||
|
||
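As a rough illustration of the selection step described in the abstract, the sketch below picks, from several text hypotheses, the one whose embedding lies closest (in Euclidean distance) to the word image embedding. It is a minimal pure-Python sketch, not the repository's implementation: the names `euclidean` and `pick_hypothesis` and the toy 3-dimensional vectors are assumptions made for illustration, and neither EmbedNet nor the CAB update is modelled.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimensionality")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pick_hypothesis(image_embedding, hypotheses):
    """Return (text, distance) for the hypothesis nearest to the image embedding.

    `hypotheses` is a list of (text, text_embedding) pairs, for example the
    top-K outputs of a text recognizer together with their embeddings.
    """
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    scored = [(text, euclidean(image_embedding, emb)) for text, emb in hypotheses]
    return min(scored, key=lambda pair: pair[1])


# Toy example with hypothetical values
image_vec = [0.9, 0.1, 0.0]
candidates = [
    ("word_a", [0.0, 1.0, 0.0]),
    ("word_b", [1.0, 0.0, 0.0]),
    ("word_c", [0.0, 0.0, 1.0]),
]
best_text, best_dist = pick_hypothesis(image_vec, candidates)
print(best_text)
```

In this toy case the second candidate is selected because its embedding is nearest to `image_vec`. A learned embedding space aims to make the correct transcription the nearest one more often.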
## Word Recognition Results | ||
<!-- ----------- --> | ||
[Figure: Word Recognition] | ||
|
||
Usage | ||
----------- | ||
### Cloning the repository | ||
```sh | ||
git clone https://github.com/Sid2697/Word-recognition-EmbedNet-CAB.git | ||
cd Word-recognition-EmbedNet-CAB | ||
``` | ||
### Install Pre-requisites | ||
- Python >= 3.5 | ||
- PyTorch | ||
- Scikit-learn | ||
- NumPy | ||
- tqdm | ||
|
||
**`requirements.txt`** has been provided for installing Python dependencies. | ||
|
||
```sh | ||
pip install -r requirements.txt | ||
``` | ||
### Generating/using deep embeddings | ||
The deep embeddings used in this work are generated using the End2End network proposed in: | ||
``` | ||
Krishnan, P., Dutta, K., Jawahar, C.V.: Word spotting and recognition using deep embedding. In: 2018 13th IAPR International Workshop on Document Analysis Systems (DAS). pp. 1–6 (April 2018). https://doi.org/10.1109/DAS.2018.70 | ||
``` | ||
Deep embeddings of the word images and of their text, for testing this repository, are provided in the ``embeddings`` folder. | ||
The code also requires text files that describe the embeddings, with one entry per line in the format:<br> | ||
```<img1-path><space><text1-string><space><dummyInt><space>1```<br> | ||
```<img2-path><space><text2-string><space><dummyInt><space>1```<br> | ||
...<br> | ||
Corresponding text files for testing this repository are provided in the ``gen_files`` folder. | ||
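The line format above can be read with a few lines of Python. This is a minimal sketch under the assumption that neither the image path nor the text string contains whitespace; the names `Entry`, `parse_info_line` and `read_info_file`, and the sample path, are hypothetical and not part of the repository.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    image_path: str
    text: str
    index: int  # the <dummyInt> field
    flag: int   # trailing field, 1 in the documented format


def parse_info_line(line):
    """Parse one '<img-path> <text> <dummyInt> 1' line into an Entry."""
    parts = line.split()
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 whitespace-separated fields, got {len(parts)}: {line!r}"
        )
    path, text, index, flag = parts
    return Entry(path, text, int(index), int(flag))


def read_info_file(path):
    """Read every non-empty line of an information file."""
    with open(path, encoding="utf-8") as handle:
        return [parse_info_line(line) for line in handle if line.strip()]


# Hypothetical sample line in the documented format
sample = "images/word_0001.png जूता 27 1"
entry = parse_info_line(sample)
print(entry.text, entry.index)
```

Opening the file as UTF-8 matters here because the text strings are in Devanagari script.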
|
||
### Performing word recognition (using a pre-trained EmbedNet) | ||
Pre-trained EmbedNet models are saved in the ``models`` folder.<br> | ||
|
||
For running baseline word recognition use the command: | ||
```sh | ||
python src/word_rec_EmbedNet.py | ||
``` | ||
For running word recognition with confidence score use the command: | ||
```sh | ||
python src/word_rec_EmbedNet.py --use_confidence | ||
``` | ||
For running word recognition using a pre-trained EmbedNet use the command: | ||
```sh | ||
python src/word_rec_EmbedNet.py --use_confidence --use_model --hidden_layers 1024 | ||
``` | ||
For running word recognition using a pre-trained EmbedNet and the CAB module use the command: | ||
```sh | ||
python src/word_rec_EmbedNet.py --use_confidence --use_model --hidden_layers 1024 --cab | ||
``` | ||
Arguments for the word recognition experiment include: | ||
```sh | ||
--image_embeds | ||
--topk_embeds | ||
--image_file | ||
--predictions_file | ||
--use_confidence | ||
--cab | ||
--cab_alpha | ||
--cab_beta | ||
--in_features | ||
--out_features | ||
--hidden_layers | ||
--model_path | ||
--testing | ||
--test_split | ||
--k | ||
``` | ||
- `image_embeds`: path to the image embeddings | ||
- `topk_embeds`: path to the embeddings of the TopK predictions | ||
- `image_file`: path to the text information file for the images | ||
- `predictions_file`: path to the text information file for the TopK predictions | ||
- `use_confidence`: if set, the confidence score is used for re-ranking the predictions | ||
- `cab`: if set, the CAB module is used for improving the word recognition accuracy | ||
- `cab_alpha`: hyper-parameter alpha of the CAB module | ||
- `cab_beta`: hyper-parameter beta of the CAB module | ||
- `in_features`: size of the input to EmbedNet | ||
- `out_features`: size of the output of EmbedNet | ||
- `hidden_layers`: list of input sizes of the hidden layers | ||
- `model_path`: path to the pre-trained model to be used for testing | ||
- `testing`: if set, only the test set is used for evaluation | ||
- `test_split`: split for testing the trained EmbedNet on unseen data | ||
- `k`: total number of predictions to test on (max 20) | ||
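To show how the switches and valued options combine on the command line, here is a hypothetical `argparse` sketch that mirrors a few of the documented flags. It is not taken from `src/word_rec_EmbedNet.py`: the argument types, the defaults, and the lower bound of 1 for `--k` are assumptions, and the remaining flags are omitted.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Word recognition flags (illustrative sketch)"
    )
    # Boolean switches: present on the command line -> True (assumed behaviour)
    for flag in ("--use_confidence", "--use_model", "--cab", "--testing"):
        parser.add_argument(flag, action="store_true")
    # Valued options; the types and defaults below are assumptions
    parser.add_argument("--hidden_layers", type=int, nargs="+", default=[])
    parser.add_argument("--cab_alpha", type=float, default=None)
    parser.add_argument("--cab_beta", type=float, default=None)
    # The README states a maximum of 20; the minimum of 1 is assumed
    parser.add_argument("--k", type=int, choices=range(1, 21), default=None)
    return parser


# Same flags as the "pre-trained EmbedNet and the CAB module" command
args = build_parser().parse_args(
    ["--use_confidence", "--use_model", "--hidden_layers", "1024", "--cab"]
)
print(args.use_confidence, args.use_model, args.cab, args.hidden_layers)
```

With `nargs="+"`, several hidden layer sizes could be passed after `--hidden_layers`, which would match its description as a list.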
|
||
### Training EmbedNet | ||
TODO | ||
|
||
License and Citation | ||
--------------------- | ||
|
||
The software is licensed under the MIT License. If you find this work useful, please cite the following paper: | ||
|
||
``` | ||
@misc{bansal2020fused, | ||
title={Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval}, | ||
author={Siddhant Bansal and Praveen Krishnan and C. V. Jawahar}, | ||
year={2020}, | ||
eprint={2007.00166}, | ||
archivePrefix={arXiv}, | ||
primaryClass={cs.CV} | ||
} | ||
``` | ||
|
||
Contact | ||
----------- | ||
For any queries, contact [Siddhant Bansal](https://sid2697.github.io). |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-9-22-5-842-1078-2742-2794-124184.png झर-झर 1 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-6-16-6-978-1471-2036-2110-124185.png समझाया-बुझाया 2 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-6-16-1-181-396-2014-2089-124186.png कोशिशें 3 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-6-13-17-2087-2284-1655-1726-124187.png देखोगी 4 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-6-12-11-1545-1725-1554-1631-124188.png कुसुस, 5 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-6-11-3-405-624-1417-1491-124189.png पौंछकर 6 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-6-9-4-836-993-1178-1253-124190.png ओंधी 7 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-5-7-11-1888-2122-962-1016-124191.png दरवाज़ा 8 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-5-7-7-1260-1416-964-1040-124192.png हुक्का 9 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-5-7-5-934-1166-947-1042-124193.png कुंचनाथ 10 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-3-4-10-1408-1553-611-685-124194.png भूता, 11 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-3-2-11-1872-2001-357-440-124195.png क्यों, 12 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-3-2-5-981-1195-375-429-124196.png बदज़ात 13 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-3-2-3-494-804-380-450-124197.png नमकहराम 14 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-4-9-5-749-1061-1160-1231-124198.png नमकहराम 15 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-66-3-2-1-189-314-359-433-124199.png जैसा 16 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-13-30-12-1979-2171-3642-3694-124200.png चरमर 17 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-12-29-3-472-567-3503-3576-124201.png ढले 18 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-12-29-1-235-365-3503-3576-124202.png साँचे 19 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-12-28-5-739-853-3380-3469-124203.png पाएँ 20 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-12-27-10-1520-1737-3260-3344-124204.png छी-छी; 21 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-11-25-1-377-561-3028-3117-124205.png “होता 22 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-9-22-3-512-692-2701-2776-124206.png फूलना 23 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-9-21-1-380-550-2565-2640-124207.png “कौन 24 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-8-19-5-948-1120-2331-2403-124208.png “जिसे 25 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-6-14-12-1983-2151-1739-1812-124209.png न्यौता 26 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-5-11-9-1702-1829-1397-1473-124210.png जूता 27 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-4-8-11-1561-1822-1024-1101-124211.png बैठी-बैठी 28 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-4-7-11-1548-1789-907-983-124212.png चालाकी 29 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-4-7-8-1254-1380-923-1007-124213.png ‘झूठ 30 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-4-6-6-1116-1337-795-867-124214.png नलडांगे 31 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-43-4-6-1-386-600-815-896-124215.png “बुखार 32 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-15-30-12-2145-2285-3678-3768-124216.png खड़े- 33 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-14-27-10-1639-1797-3325-3406-124217.png खोले, 34 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-14-26-3-455-598-3199-3272-124218.png ढंकने 35 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-14-26-1-169-348-3197-3274-124219.png “भैया 36 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-13-23-8-1512-1642-2844-2917-124220.png तरेर 37 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-11-20-15-1939-2100-2485-2561-124221.png डिगने 38 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-11-20-12-1410-1641-2489-2564-124222.png फिसलने 39 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-11-20-4-520-662-2487-2561-124223.png मोती 40 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-11-19-10-1463-1618-2367-2437-124224.png कंचल 41 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-11-19-4-679-847-2363-2449-124225.png चढ़ती 42 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-11-17-1-179-510-2150-2225-124226.png सुना-समझा 43 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-11-15-6-1139-1272-1889-1962-124227.png “मेरे 44 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-9-12-6-1190-1332-1551-1630-124228.png वृन्दा 45 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-9-12-1-329-661-1530-1605-124229.png “अभी-अभी 46 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-8-11-1-331-487-1422-1482-124230.png “कल 47 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-7-10-10-1804-1918-1292-1366-124231.png था” 48 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-7-10-6-1070-1289-1306-1388-124232.png “कुसुम, 49 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-52-2-2-5-834-1042-354-427-124233.png करती” 50 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-40-3-4-7-1146-1297-681-734-124234.png अक्षर 51 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-40-1-1-1-238-319-238-299-124235.png 40 52 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-3-26-10-1561-1860-3164-3246-124236.png “वृन्दावन, 53 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-22-6-1211-1367-2680-2765-124237.png डरेंगे, 54 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-21-5-693-1050-2566-2637-124238.png पाठशालाआें 55 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-20-4-617-774-2438-2511-124239.png जन्मों 56 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-19-6-879-1019-2313-2387-124240.png उठेंगे 57 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-18-5-1246-1557-2203-2274-124241.png काम-धन्धों 58 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-18-4-970-1217-2199-2271-124242.png जातिगत 59 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-18-3-698-937-2197-2283-124243.png छोड़कर, 60 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-18-1-197-438-2198-2284-124244.png छोड़कर, 61 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-17-13-2221-2311-2083-2157-124245.png वर्ग 62 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-14-8-1405-1496-1731-1804-124246.png वर्ग 63 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-16-1-195-300-1963-2036-124247.png झते 64 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-15-5-809-1254-1846-1940-124248.png किसान-मजदूरों 65 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-15-4-516-783-1844-1917-124249.png आशिक्षित 66 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-13-9-1368-1835-1645-1712-124250.png आचार-व्यवहार 67 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-12-11-1805-2014-1510-1584-124251.png बालकों 68 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-11-9-1724-1888-1393-1465-124252.png सीखो 69 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-11-7-1252-1481-1391-1463-124253.png आत्मीय 70 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-11-1-194-526-1388-1481-124254.png के-अर्थात् 71 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-10-10-1514-1689-1273-1356-124255.png केशव, 72 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-2-4-1098-1274-336-416-124256.png केशव, 73 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-8-11-1733-2008-1061-1114-124257.png बाप-दादा 74 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-7-13-2033-2312-925-1008-124258.png पढ़-लिख- 75 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-6-1-190-371-800-874-124259.png जिसमे 76 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-4-9-1605-1844-570-644-124260.png मालिकों 77 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-3-16-2057-2309-457-532-124261.png आत्मीयों 78 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-2-3-677-1070-331-422-124262.png कहा-“देखो 79 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-2-2-1-343-576-355-432-124263.png वृन्दावन 80 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-74-1-1-1-203-283-145-207-124264.png 74 81 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-11-25-11-1999-2195-3515-3607-124265.png कहोगी 82 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-11-25-8-1317-1730-3511-3594-124266.png सोच-समझकर 83 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-11-25-1-364-579-3517-3595-124267.png “इसका 84 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-10-24-1-363-758-3382-3457-124268.png “सोच-विचार 85 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-9-23-8-1252-1510-3268-3366-124269.png निकलूँगा 86 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-9-22-2-365-777-3147-3220-124270.png सोचने-समझने 87 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-8-20-7-1049-1139-2913-2996-124271.png गइ 88 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-8-19-10-2222-2341-2807-2898-124272.png हँसीं 89 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-8-19-4-1120-1238-2792-2889-124273.png लूँ,” 90 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-7-18-3-491-704-2676-2765-124274.png सँभाला 91 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-7-17-1-347-515-2578-2635-124275.png “सच 92 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-6-14-12-1777-2058-2218-2291-124276.png स्वजातीय 93 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-5-12-6-1401-1612-1978-2053-124277.png दर्दभरी 94 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-4-9-10-1873-2039-1634-1710-124278.png देखेगी 95 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-4-9-5-1141-1347-1631-1706-124279.png लाऊँगा 96 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-2-6-9-1410-1718-1283-1360-124280.png ढोती-ढोती 97 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-2-4-8-1191-1333-1057-1133-124281.png “मां, 98 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-2-3-10-1633-1800-954-1006-124282.png शकल 99 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-2-2-10-1778-1918-819-892-124283.png बेबस 100 1 | ||
/home/siddhant.b/e2eEmbed/wordImages/0020-45-1-1-1-845-1220-445-551-124284.png सातवाँ 101 1 |