added markdown document for ocr engine comparison #577

Open: wants to merge 1 commit into base: main

Kaan0029

This is related to the GSoC OCR project by Kaan Erdem.
JabRef/jabref#13313

* **Bad**, because increases support complexity with multiple engines

### Confirmation

Member

Elaborate on how this is done. I would assume that you have the 100+ PDFs at hand and wrote a test suite?


* Current implementation uses Tesseract 4.x with LSTM engine
* In benchmarks, Google Cloud Vision shows the highest overall accuracy
* Handwriting (categories 2 & 3) is the main differentiator among engines
Member

Where are these categories mentioned?


The web resources that informed this ADR:

1. <https://www.mdpi.com/2073-8994/12/5/715>
Member

Link that to each pro/con argument.

@@ -0,0 +1,153 @@
# ADR-002: OCR Engine Selection for JabRef
Member

Try to follow the format used in JabRef's repository, and place it in the JabRef folder: https://github.com/JabRef/jabref/tree/main/docs/decisions

I think this is AI-generated, because I cannot otherwise explain why this takes number 0002, and why the number appears in the heading.

(It also does not follow the MADR format.)

@calixtus
Member

This should go to the devdocs: jabref/docs/decisions

3 participants