Thanks @kkew3 for reporting this issue.
While well-formed HTML documents are preferred, we understand that some tags, including body, are optional under the HTML5 specification.
We will therefore fix this issue.
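To illustrate the optional-body case, here is a minimal sketch using only Python's standard library. It is not docling's code: the names `BodyTagDetector` and `has_explicit_body` are hypothetical and exist only for this example. It simply reports whether a document contains an explicit body start tag, which is the condition the failing code path takes for granted.

```python
from html.parser import HTMLParser


class BodyTagDetector(HTMLParser):
    """Hypothetical helper: records whether an explicit <body> start tag appears."""

    def __init__(self) -> None:
        super().__init__()
        self.has_body = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook,
        # so <BODY> and <body> are treated the same.
        if tag == "body":
            self.has_body = True


def has_explicit_body(html: str) -> bool:
    """Return True if the markup contains an explicit <body> start tag."""
    detector = BodyTagDetector()
    detector.feed(html)
    detector.close()
    return detector.has_body
```

A backend that looks up body directly could use a check along these lines to decide whether to fall back to the document root, though the actual fix may well take a different approach.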
Bug
When parsing this HTML file:
syllabus.html.txt
docling raises:
But the HTML file renders correctly in Chrome.
Cause of the bug:
docling/docling/backend/html_backend.py
Line 82 in c2ae1cc
The code assumes there's a tag called body, but there isn't.
Steps to reproduce
Following the demo:
Docling version
Python version