feat: translate equations to latex when running MSWord backend #825
base: main
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🔴 Require two reviewer for test updates
This rule is failing. When test data is updated, we require two reviewers.
🟢 Enforce conventional commit
Wonderful, this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
Signed-off-by: Rafael Teixeira de Lima <[email protected]>
aca25ae to f503494
LGTM!
docling/backend/msword_backend.py
Outdated
@@ -291,7 +312,6 @@ def handle_text_elements(self, element, docx_obj, doc):
doc.add_text(
    label=DocItemLabel.PARAGRAPH, parent=self.parents[level - 1], text=text
Inline equations will default to Text (DocItemLabel.TEXT), while separate equations need to be marked as DocItemLabel.EQUATION.
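To make the suggestion concrete, here is a minimal sketch of one way the label could be chosen. It assumes a paragraph counts as a separate equation when its text is exactly one equation, and as inline otherwise. `Label` and `label_for_paragraph` are hypothetical stand-ins for illustration, not part of docling or of this PR.

```python
from enum import Enum


class Label(Enum):
    # Hypothetical stand-in for DocItemLabel; the values are illustrative only.
    TEXT = "text"
    EQUATION = "equation"


def label_for_paragraph(text: str, equations: list[str]) -> Label:
    """Pick a label for a paragraph given the LaTeX strings found inside it.

    Assumption: a paragraph whose whole text is a single equation is a
    separate (display) equation; any other paragraph is treated as text.
    """
    if len(equations) == 1 and text.strip() == equations[0].strip():
        return Label.EQUATION
    return Label.TEXT
```

With this rule, a paragraph such as "The area is $a^2$" stays TEXT, while a paragraph holding only the equation becomes EQUATION. The real backend may need a different criterion.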
At the moment, equations present in MSWord documents are not exported. This PR translates MSWord equations to LaTeX and includes them in the text output. New test files have been added to test this feature.
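For readers unfamiliar with how Word stores equations, the sketch below shows the general idea of such a translation. It is not the code from this PR: it assumes the equation is available as Office Math Markup (OMML) XML and handles only text runs, fractions, and sub/superscripts. The name `omml_to_latex` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by Office Math Markup (OMML) elements inside .docx files.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
M = "{" + M_NS + "}"


def omml_to_latex(node: ET.Element) -> str:
    """Translate a small OMML subset to LaTeX (illustrative, not complete)."""

    def part(name: str) -> str:
        # Convert a named child (e.g. m:num); a missing child yields "".
        found = node.find(M + name)
        return omml_to_latex(found) if found is not None else ""

    tag = node.tag.replace(M, "")
    if tag == "t":  # text inside a math run
        return node.text or ""
    if tag == "f":  # fraction: numerator over denominator
        return r"\frac{%s}{%s}" % (part("num"), part("den"))
    if tag == "sSup":  # superscript: base and exponent
        return "%s^{%s}" % (part("e"), part("sup"))
    if tag == "sSub":  # subscript: base and index
        return "%s_{%s}" % (part("e"), part("sub"))
    # Containers (oMath, r, num, den, e, ...) and unsupported elements:
    # concatenate whatever their children produce.
    return "".join(omml_to_latex(child) for child in node)


SAMPLE = (
    '<m:oMath xmlns:m="%s">'
    "<m:sSup><m:e><m:r><m:t>x</m:t></m:r></m:e>"
    "<m:sup><m:r><m:t>2</m:t></m:r></m:sup></m:sSup>"
    "<m:r><m:t>+</m:t></m:r>"
    "<m:f><m:num><m:r><m:t>a</m:t></m:r></m:num>"
    "<m:den><m:r><m:t>b</m:t></m:r></m:den></m:f>"
    "</m:oMath>"
) % M_NS

print(omml_to_latex(ET.fromstring(SAMPLE)))  # x^{2}+\frac{a}{b}
```

A full translation would also need radicals, delimiters, n-ary operators, matrices, and symbol mapping, none of which this sketch covers.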
Checklist: