Public examples of using John Snow Labs' OCR for Apache Spark.
You need a secret key and a license key.
Please contact us at [email protected] to get them.
The secret key is used to download and install the Python package and jar from https://pypi.johnsnowlabs.com/.
Each release has its own secret key.
The license key is required to run Spark OCR.
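
As a rough sketch of how the secret key might be used, the helper below builds a pip command that points at the private index. The package name `spark-ocr`, the URL layout (secret appended to the index URL), and the idea that the release version is the part of the secret before the first `-` are all assumptions that this README does not state; `build_pip_command` is a hypothetical helper. Check them against the instructions that come with your key.

```python
import sys

INDEX_URL = "https://pypi.johnsnowlabs.com/"  # index mentioned in this README


def build_pip_command(secret, package="spark-ocr"):
    """Return a pip argument list for installing `package` with `secret`.

    Assumptions (not confirmed by this README): the secret looks like
    "<version>-<token>" and is appended to the index URL.
    """
    if not secret or "-" not in secret:
        raise ValueError("secret is expected to look like '<version>-<token>'")
    version = secret.split("-")[0]
    return [
        sys.executable, "-m", "pip", "install",
        f"{package}=={version}",
        f"--extra-index-url={INDEX_URL}{secret}",
    ]


if __name__ == "__main__":
    # Placeholder secret, not a real key.
    print(" ".join(build_pip_command("1.0.0-EXAMPLE")))
```

The returned list can be passed to `subprocess.check_call` to run the installation.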
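Set the license key in the `license` variable:

    license = "license key"

The key can also be provided through the `JSL_OCR_LICENSE` environment variable.

Linux, Mac OS:

    export JSL_OCR_LICENSE=license_key

Windows:

    set JSL_OCR_LICENSE=license_key

To run the notebooks in Google Colab:

- Go to https://colab.research.google.com/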
- Open the notebook:
  - File -> Open notebook.
  - Switch to the `GitHub` tab.
  - Enter the URL: https://github.com/JohnSnowLabs/spark-ocr-workshop.
  - Select the `SparkOcrSimpleExample.ipynb` notebook.
- Set the `secret` and `license` variables to valid values in the first cell.
- Run all cells: Runtime -> Run all.
- Restart the runtime: Runtime -> Restart runtime (a restart is needed the first time after installing new packages).
- Run all cells again.
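
The first notebook cell expects a license value. One possible way to fill it is sketched below: use an explicit value if one is given, otherwise fall back to the `JSL_OCR_LICENSE` environment variable. `resolve_license` is a hypothetical helper written for illustration and is not part of Spark OCR.

```python
import os


def resolve_license(explicit=""):
    """Return the license key from `explicit`, else from JSL_OCR_LICENSE."""
    value = explicit.strip() or os.environ.get("JSL_OCR_LICENSE", "").strip()
    if not value:
        raise RuntimeError("No license key: set `license` or JSL_OCR_LICENSE")
    return value


if __name__ == "__main__":
    # Placeholder value so the sketch runs without a real key.
    os.environ.setdefault("JSL_OCR_LICENSE", "license_key")
    license = resolve_license()
    print("license key found:", bool(license))
```

Failing early with a clear message is a design choice here: a missing key is easier to diagnose in the first cell than later in the notebook.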
To run the notebooks locally with Jupyter:

- Install Jupyter Notebook (more details: https://jupyter.org/install):

      python3 -m pip install --user jupyter

- Clone the Spark OCR workshop repo:

      git clone https://github.com/JohnSnowLabs/spark-ocr-workshop
      cd spark-ocr-workshop

- Run Jupyter:

      jupyter-notebook

- Open Jupyter in the browser.
- Open the `jupyter/SparkOcrSimpleExample.ipynb` notebook.
- Set the `secret` and `license` variables to valid values in the first cell.
- Run all cells: Cell -> Run all.
- Restart the kernel: Kernel -> Restart (a restart is needed the first time after installing new packages).
- Run all cells again.
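
The restart is only needed the first time, right after new packages are installed. A small sketch for checking whether a package is already importable in the current kernel follows. The module name `sparkocr` is an assumption about how the Python package is imported, and `needs_install` is a hypothetical helper.

```python
import importlib.util


def needs_install(module_name="sparkocr"):
    """Return True if the top-level module cannot be found by the interpreter."""
    return importlib.util.find_spec(module_name) is None


if __name__ == "__main__":
    if needs_install():
        print("Package not found: run the install cell, then restart the kernel.")
    else:
        print("Package already available: no restart should be needed.")
```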
Spark OCR functionality can also be called from a Java project.
The `./java/` folder contains a sample Java project that can be built with Maven.