Skip to content

feat(msexcel): ignore invisible sheet #1876

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ealyn
Copy link
Contributor

@ealyn ealyn commented Jul 1, 2025

As WYSIWYG, filter invisible sheets.

Checklist:

  • Documentation has been updated, if necessary.
  • Examples have been added, if necessary.
  • Tests have been added, if necessary.

ealyn added 2 commits July 1, 2025 19:38
Copy link
Contributor

github-actions bot commented Jul 1, 2025

DCO Check Passed

Thanks @ealyn, all your commits are properly signed off. 🎉

Copy link

mergify bot commented Jul 1, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

@PeterStaar-IBM
Copy link
Contributor

@ealyn I would be more in favor to not skip the invisible_sheet, but rather updeate the ContentLayer (eg set it to background instead of the default body).

see:

  1. ContentLayer (https://github.com/docling-project/docling-core/blob/cacf4ef67b902d14298b67dc7c84ca2a59be337f/docling_core/types/doc/document.py#L731)
  2. add_text (https://github.com/docling-project/docling-core/blob/cacf4ef67b902d14298b67dc7c84ca2a59be337f/docling_core/types/doc/document.py#L731)

@ealyn
Copy link
Contributor Author

ealyn commented Jul 2, 2025

@PeterStaar-IBM The meaning is slightly different between background and invisible sheet. Offer a new ContentLayer member would be better.

@PeterStaar-IBM
Copy link
Contributor

@PeterStaar-IBM The meaning is slightly different between background and invisible sheet. Offer a new ContentLayer member would be better.

I agree, just added extra content-layers to address this: docling-project/docling-core#345

@PeterStaar-IBM
Copy link
Contributor

@ealyn We just merged another content-layer "INVISIBLE". If you add this and do the code reformatting, we would be happy to merge!

Copy link

codecov bot commented Jul 4, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
docling/backend/msexcel_backend.py 50.00% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

@ceberam
Copy link
Contributor

ceberam commented Jul 28, 2025

@ealyn do you have any update on this PR?
As mentioned by @PeterStaar-IBM , docling-core is now supporting the invisible content layer, so the changes needed in the PR should be straightforward.
Let us know if you need any support for finalizing it.

@ealyn
Copy link
Contributor Author

ealyn commented Jul 29, 2025

@ceberam Just because I'm busy in july, I'll refactor this feature in august.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants