fix: adding handling for str as filename argument to load/save methods #204

Closed
muhark wants to merge 202 commits from muhark/docling_issue1189

Conversation

muhark
Contributor

@muhark muhark commented Mar 19, 2025

Ugh, very sorry - I tried to amend the commit email. Please ignore this.

Cesar Berrospi Ramis and others added 30 commits July 15, 2024 12:55
Signed-off-by: Cesar Berrospi Ramis <[email protected]>
* fix(test): set static typing compatible to python 3.10

Signed-off-by: Cesar Berrospi Ramis <[email protected]>

* style: enforce some pre-commit hooks on tests

Enforce pre-commit hooks black, isort, autoflake, and mypy on test modules.

Signed-off-by: Cesar Berrospi Ramis <[email protected]>

* fix(rec): fix definition issues in attribute, predicate, subject

Remove duplicate generic types across base and predicate modules.
Create an identifier class for subject names.
Remove unnecessary type variables in attribute model.

Signed-off-by: Cesar Berrospi Ramis <[email protected]>

* docs: refer to Docling data objects

Signed-off-by: Cesar Berrospi Ramis <[email protected]>

---------

Signed-off-by: Cesar Berrospi Ramis <[email protected]>
* single source of version

Signed-off-by: Michele Dolfi <[email protected]>

* add cicd

Signed-off-by: Michele Dolfi <[email protected]>

* apply pre-commit validators

Signed-off-by: Michele Dolfi <[email protected]>

* Update pyproject.toml

Co-authored-by: Panos Vagenas <[email protected]>
Signed-off-by: Michele Dolfi <[email protected]>

* update lock

Signed-off-by: Michele Dolfi <[email protected]>

---------

Signed-off-by: Michele Dolfi <[email protected]>
Signed-off-by: Michele Dolfi <[email protected]>
Co-authored-by: Panos Vagenas <[email protected]>
…ject#5)

For consistency, set the default value for the name field in search.package.Package to docling-core,
since the field version is set to docling-core version by default.

Signed-off-by: Cesar Berrospi Ramis <[email protected]>
Signed-off-by: Panos Vagenas <[email protected]>
Co-authored-by: Michele Dolfi <[email protected]>
* docs: Update link to Docling

Signed-off-by: Christoph Auer <[email protected]>

* Update citation

Signed-off-by: Christoph Auer <[email protected]>

---------

Signed-off-by: Christoph Auer <[email protected]>
* added the XML export

Signed-off-by: Peter Staar <[email protected]>

* reformatted all

Signed-off-by: Peter Staar <[email protected]>

* fixed tests

Signed-off-by: Peter Staar <[email protected]>

* added the DocumentTokens class

Signed-off-by: Peter Staar <[email protected]>

* updating the to-xml method

Signed-off-by: Peter Staar <[email protected]>

* updating the to-xml method

Signed-off-by: Peter Staar <[email protected]>

* fixed the to-md method

Signed-off-by: Peter Staar <[email protected]>

* added the strict-text in the to-md method

Signed-off-by: Peter Staar <[email protected]>

* added page-tokens

Signed-off-by: Peter Staar <[email protected]>

* updated the location/page tokens

Signed-off-by: Peter Staar <[email protected]>

---------

Signed-off-by: Peter Staar <[email protected]>
* added the XML export

Signed-off-by: Peter Staar <[email protected]>

* reformatted all

Signed-off-by: Peter Staar <[email protected]>

* fixed tests

Signed-off-by: Peter Staar <[email protected]>

* added the DocumentTokens class

Signed-off-by: Peter Staar <[email protected]>

* updating the to-xml method

Signed-off-by: Peter Staar <[email protected]>

* updating the to-xml method

Signed-off-by: Peter Staar <[email protected]>

* fixed the to-md method

Signed-off-by: Peter Staar <[email protected]>

* added the strict-text in the to-md method

Signed-off-by: Peter Staar <[email protected]>

* added page-tokens

Signed-off-by: Peter Staar <[email protected]>

* updated the location/page tokens

Signed-off-by: Peter Staar <[email protected]>

* small fix to have correct special document-tokens

Signed-off-by: Peter Staar <[email protected]>

* reformatted the code

Signed-off-by: Peter Staar <[email protected]>

---------

Signed-off-by: Peter Staar <[email protected]>
github-actions[bot] and others added 24 commits February 27, 2025 10:40
chore: suppress warning for missing fallback case

Signed-off-by: Yusik Kim <[email protected]>
…odels (docling-project#187)

* Adding load_from_document_tokens method to Docling document

Signed-off-by: Maksym Lysak <[email protected]>

* Added skeleton for the unit test

Signed-off-by: Maksym Lysak <[email protected]>

* Processing ordered and unordered lists

Signed-off-by: Maksym Lysak <[email protected]>

* Define new types for DocTags containers

Signed-off-by: Christoph Auer <[email protected]>

* Implement load_from_doctags with container types

Signed-off-by: Christoph Auer <[email protected]>

* Update tests

Signed-off-by: Christoph Auer <[email protected]>

* update test data

Signed-off-by: Christoph Auer <[email protected]>

* Cleanup commented lines

Signed-off-by: Christoph Auer <[email protected]>

* Fix for proper list groups

Signed-off-by: Maksym Lysak <[email protected]>

* Check for number of page doctags must be equal to page images

Signed-off-by: Maksym Lysak <[email protected]>

---------

Signed-off-by: Maksym Lysak <[email protected]>
Signed-off-by: Christoph Auer <[email protected]>
Co-authored-by: Maksym Lysak <[email protected]>
Co-authored-by: Christoph Auer <[email protected]>
* Move model types from docling-parse

Signed-off-by: Christoph Auer <[email protected]>

* Add BoundingRectangle.from_bounding_box method

Signed-off-by: Christoph Auer <[email protected]>

* Add cell ID to model_validator

Signed-off-by: Christoph Auer <[email protected]>

* Add ordering to model_validator

Signed-off-by: Christoph Auer <[email protected]>

* update comments

Signed-off-by: Christoph Auer <[email protected]>

* Rewrite BoundingRectangle.to_bounding_box()

Signed-off-by: Christoph Auer <[email protected]>

* Add PdfCellRenderingMode

Signed-off-by: Christoph Auer <[email protected]>

* Update model_validator

Signed-off-by: Christoph Auer <[email protected]>

* Typing fixes and cleanup

Signed-off-by: Christoph Auer <[email protected]>

---------

Signed-off-by: Christoph Auer <[email protected]>
* chore: move to docling-project gh org

Signed-off-by: Michele Dolfi <[email protected]>

* retrigger ci with DCO

Signed-off-by: Michele Dolfi <[email protected]>

---------

Signed-off-by: Michele Dolfi <[email protected]>
fix favicon url

Signed-off-by: Michele Dolfi <[email protected]>
add caption to the table in load_from_doctags

Signed-off-by: Saidgurbuz <[email protected]>
…ject#188)

* add kv_item support for doctag to docling_document

Signed-off-by: Saidgurbuz <[email protected]>

* use resize_by_scale to save locations

Signed-off-by: Saidgurbuz <[email protected]>

* add kv region to tag_to_doclabel

Signed-off-by: Saidgurbuz <[email protected]>

* add test for doctags_load_for_kv_region

Signed-off-by: Saidgurbuz <[email protected]>

* update the naming to .dt for consistency

Signed-off-by: Saidgurbuz <[email protected]>

---------

Signed-off-by: Saidgurbuz <[email protected]>

mergify bot commented Mar 19, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
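The title rule above can be checked locally before opening a pull request. The following is a minimal sketch that copies the pattern from the rule and applies it with Python's `re` module; the helper name `is_conventional` is hypothetical, and it assumes the `~=` operator behaves like an unanchored regex search, which may differ in detail from how Mergify evaluates it.

```python
import re

# Pattern copied verbatim from the merge-protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix.

    Hypothetical helper for illustration; not part of docling-core or Mergify.
    """
    return CONVENTIONAL_TITLE.search(title) is not None


if __name__ == "__main__":
    # Title of this pull request: type "fix" with no scope.
    print(is_conventional(
        "fix: adding handling for str as filename argument to load/save methods"
    ))
    # A title with an optional scope in parentheses.
    print(is_conventional("fix(test): set static typing compatible to python 3.10"))
    # A title without a recognized type prefix.
    print(is_conventional("added the XML export"))
```

Under these assumptions, the first two titles satisfy the rule and the third does not, which is consistent with the rule reporting success for this pull request.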

@muhark muhark force-pushed the muhark/docling_issue1189 branch from d61cafc to 0dfdf55 Compare March 19, 2025 14:57
@muhark muhark closed this Mar 19, 2025
@muhark muhark force-pushed the muhark/docling_issue1189 branch from 0dfdf55 to 1217c58 Compare March 19, 2025 15:26
@muhark muhark deleted the muhark/docling_issue1189 branch March 19, 2025 15:27