elena-kalinina

The PDF document loader had a check in place to determine whether a URL is a presigned URL. This check was not working, first and foremost because its regex did not match the S3 URL structure correctly. A presigned URL therefore failed the check and was processed as a normal URL, which resulted in OSError: filename too long. However, fixing the regex alone would not make it possible to distinguish between public and presigned S3 bucket URLs. I rewrote the method so that it correctly determines whether a URL is specifically a presigned S3 URL (to be processed accordingly).

In my previous commit, I fixed the regex, which did not capture the S3 bucket URL structure and so failed to recognize presigned URLs. However, I realized that fixing the regex alone is not enough, because the check then does not distinguish between public and presigned S3 bucket URLs. So I introduced an improved check that matches only presigned URLs.
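
For illustration, here is a minimal sketch of how such a check could look. It is not the code from this PR: the function name `is_presigned_s3_url`, the host regex, and the choice of query parameters are assumptions. It relies on the fact that presigned S3 URLs carry signature parameters in the query string (`X-Amz-Algorithm`, `X-Amz-Credential`, `X-Amz-Signature` for Signature Version 4; `AWSAccessKeyId`, `Expires`, `Signature` for the older Version 2), while public object URLs do not.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed host pattern: covers virtual-hosted-style and path-style S3 endpoints,
# e.g. bucket.s3.amazonaws.com, bucket.s3.us-east-1.amazonaws.com,
# s3.amazonaws.com, s3-us-west-2.amazonaws.com.
_S3_HOST = re.compile(r"(^|\.)s3([.-][a-z0-9-]+)?\.amazonaws\.com$")

# Query parameters that mark a presigned URL (compared case-insensitively).
_SIGV4_PARAMS = {"x-amz-algorithm", "x-amz-credential", "x-amz-signature"}
_SIGV2_PARAMS = {"awsaccesskeyid", "expires", "signature"}


def is_presigned_s3_url(url: str) -> bool:
    """Return True only for S3 URLs that carry presigning parameters."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme not in ("http", "https") or not _S3_HOST.search(host):
        return False  # not an S3 URL at all
    params = {key.lower() for key in parse_qs(parsed.query, keep_blank_values=True)}
    # A public S3 URL has no signature parameters and falls through to False.
    return _SIGV4_PARAMS <= params or _SIGV2_PARAMS <= params
```

Checking the query parameters rather than only the host is what separates a presigned URL from a public one: both share the same host and path structure, so a host-only regex cannot tell them apart.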