Commit

fix: use file extension if filetype fails with PDF
The filetype library may fail to identify some files as PDF. Fall back to the
file extension as a simple solution.

Signed-off-by: Cesar Berrospi Ramis <[email protected]>
ceberam committed Jan 28, 2025
1 parent 5139b48 commit 6ca7daf
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions docling/datamodel/document.py
@@ -352,6 +352,8 @@ def _mime_from_extension(ext):
             mime = FormatToMimeType[InputFormat.MD][0]
         elif ext in FormatToExtensions[InputFormat.JSON_DOCLING]:
             mime = FormatToMimeType[InputFormat.JSON_DOCLING][0]
+        elif ext in FormatToExtensions[InputFormat.PDF]:
+            mime = FormatToMimeType[InputFormat.PDF][0]
         return mime

     @staticmethod
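
The fallback this commit enables can be sketched in isolation. The snippet below is a minimal illustration, not docling's actual code: `sniff_mime`, `mime_from_extension`, `guess_mime`, and the `EXTENSION_TO_MIME` table are hypothetical stand-ins for the content-based detector and for the `FormatToExtensions` / `FormatToMimeType` lookups. It assumes the content check only recognizes a PDF whose header sits at byte 0, so a file with leading bytes before `%PDF-` is resolved through its extension instead.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension-to-MIME table for illustration only; docling's real
# lookups (FormatToExtensions, FormatToMimeType) are not reproduced here.
EXTENSION_TO_MIME = {
    "pdf": "application/pdf",
    "md": "text/markdown",
    "json": "application/json",
}


def sniff_mime(data: bytes) -> Optional[str]:
    """Stand-in for a content-based detector: only knows the PDF magic bytes."""
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    return None


def mime_from_extension(ext: str) -> Optional[str]:
    """Look up a MIME type from a file extension given without the dot."""
    return EXTENSION_TO_MIME.get(ext.lower())


def guess_mime(filename: str, data: bytes) -> Optional[str]:
    """Try content sniffing first, then fall back to the file extension."""
    mime = sniff_mime(data)
    if mime is None:
        ext = Path(filename).suffix.lstrip(".")
        mime = mime_from_extension(ext)
    return mime
```

With this arrangement, a file named `report.PDF` whose bytes start with blank lines before the `%PDF-` header is still reported as `application/pdf`, while a file with an unknown extension and unrecognized content yields `None`.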