May not be a PDF file (continuing anyway) #7813

Open
txau opened this issue Mar 26, 2025 · 0 comments

txau (Collaborator) commented Mar 26, 2025

Every now and then we get this kind of error in the logs:

Error: pdftotext /tmp/1742981023757irtc7zos7e.jpg - failed with code 1
stderr output:
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
  • If anything, this should be logged at the "warn" level rather than as an error
  • We probably need to filter what we try to convert to text, so that the whole flow is not triggered when the file extension doesn't match (here a .jpg was passed to pdftotext)
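
One way the filtering suggested above could look is a small pre-check that runs before pdftotext is invoked. This is only a sketch: the function names (`hasPdfExtension`, `hasPdfHeader`, `shouldExtractText`) are hypothetical and not taken from the project's code, and the 1024-byte search window for the `%PDF-` marker is an assumption chosen to tolerate a few leading junk bytes.

```typescript
// Hypothetical pre-check, not taken from the project's code.
// Byte values of the ASCII marker "%PDF-" that starts a PDF header.
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];
// Assumption: only look for the marker near the start of the file.
const HEADER_SEARCH_LIMIT = 1024;

// True when the path ends in ".pdf", ignoring case.
function hasPdfExtension(filePath: string): boolean {
  return filePath.toLowerCase().endsWith(".pdf");
}

// True when "%PDF-" appears within the first HEADER_SEARCH_LIMIT bytes.
function hasPdfHeader(head: Uint8Array): boolean {
  const limit = Math.min(head.length, HEADER_SEARCH_LIMIT);
  for (let i = 0; i + PDF_MAGIC.length <= limit; i++) {
    if (PDF_MAGIC.every((byte, j) => head[i + j] === byte)) {
      return true;
    }
  }
  return false;
}

// Decide whether a file should be handed to pdftotext at all.
// `head` is optional: pass the first bytes of the file when they are
// available, otherwise only the extension is checked.
function shouldExtractText(filePath: string, head?: Uint8Array): boolean {
  if (!hasPdfExtension(filePath)) {
    return false;
  }
  return head === undefined ? true : hasPdfHeader(head);
}
```

With a check like this, a path such as /tmp/1742981023757irtc7zos7e.jpg would be skipped before the conversion starts, and the caller could log the skip at "warn" level instead of surfacing a pdftotext failure.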