Skip to content

use python and anthopic to convert a resume.docx or resume.pdf into json format

License

Notifications You must be signed in to change notification settings

sbecker11/resume-parser-python

Folders and files

NameName
Last commit message
Last commit date

Latest commit

38193d9 · Aug 30, 2024

History

6 Commits
Aug 27, 2024
Aug 30, 2024
Aug 30, 2024
Aug 27, 2024
Aug 30, 2024
Aug 27, 2024
Aug 27, 2024
Aug 27, 2024
Aug 27, 2024
Aug 27, 2024
Aug 27, 2024
Aug 27, 2024
Aug 27, 2024
Aug 27, 2024
Aug 27, 2024

Repository files navigation

resume-parser-python

attempts to read PDF and DOCX files and parse it into common resume sections in a JSON file

Setup

python3 -m venv venv; 
source venv/bin/activate; 
python3 -m pip install --upgrade pip; 
python3 -m pip install -r requirements.txt;

use anthropic claude to extract json structure from a pdf file

python resume_parser.py inputs/proj-mngr.pdf outputs/proj-mngr-pdf.json

use anthropic claude to extract json structure from a docx file

python resume_parser.py inputs/proj-mngr.docx outputs/proj-mngr-docx.json

verify that the extracted json files are -nearly- identical

diff outputs/proj-mngr-pdf.json outputs/proj-mngr-docx.json

compute the json-schema of the json file from the pdf json file

python compute_json_schema.py outputs/proj-mngr-pdf.json outputs/proj-mngr-pdf-schema.json

compute the json-schema of the json file from the docx json file

python compute_json_schema.py outputs/proj-mngr-docx.json outputs/proj-mngr-docx-schema.json

verify that the json-schema files are -nearly- identical

diff outputs/proj-mngr-pdf-schema.json outputs/proj-mngr-docx-schema.json

validate the proj-mngr-pdf.json data object against the resume schema

python validate_data_object.py outputs/resume-pdf.json inputs/resume-schema.json

validate the proj-mngr-docx.json data object against the resume schema

python validate_data_object.py outputs/resume-docx.json inputs/resume-schema.json

About

use python and anthopic to convert a resume.docx or resume.pdf into json format

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published