A large PDF was processed by pdftotext, producing a ~170MB text file to be uploaded to ElasticSearch.
This resulted in a TransportError 413 "Request Entity Too Large". The default limit on HTTP request bodies in ES (the http.max_content_length setting) is 100MB.
The same file was processed offline using pymupdf4llm (the new PDF parser that will be used for the re-indexing), producing just a ~100kB file.
@joepio I propose not to mess with the upload default maximum of 100MB and accept that this will fail occasionally in the current production version. The problem is not present in the re-indexing branch and so will be gone after swapping the machines after re-indexing.
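If we want the production indexer to fail more gracefully in the meantime, a size check before the upload could look roughly like the sketch below. It is only an illustration: `request_body_size` and `fits_within_limit` are hypothetical helper names, not functions from our codebase, and the size is estimated from Python's own JSON serialisation, which may differ slightly from what the ES client actually sends.

```python
import json

# Assumption: the cluster uses the Elasticsearch default of 100MB
# for http.max_content_length.
DEFAULT_LIMIT_BYTES = 100 * 1024 * 1024


def request_body_size(document: dict) -> int:
    """Estimate the size in bytes of the JSON body for one document."""
    return len(json.dumps(document, ensure_ascii=False).encode("utf-8"))


def fits_within_limit(document: dict, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True if the serialised document should fit in one request."""
    return request_body_size(document) <= limit_bytes


if __name__ == "__main__":
    # Hypothetical document shape; the real index mapping may differ.
    doc = {"text": "extracted text from pdftotext"}
    if fits_within_limit(doc):
        print("small enough to upload")
    else:
        print("skipping: document exceeds the request size limit")
```

A document that fails the check could be logged and skipped instead of raising the 413, which matches the proposal to leave the limit untouched.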