feat!(backend): file sharing - prevent overwrites of uploaded file #6381
anna-parker wants to merge 8 commits into
|
Claude finished @anna-parker's task in 2m 59s
Code Review
The approach is correct: signing the presigned URL with The test updates are thorough and consistent: all A few issues noted via inline comments:
Remaining checklist items: The PR checklist still has open items (docs for disconnected-upload case, manual testing documentation). The disconnected-upload scenario (user gets presigned URL but upload never completes) is worth documenting since clients will now get 412 on retry with the same URL — they'd need to call |
…m response (#6386)
Addresses [@anna-parker's review on #6381](#6381 (comment)).
## Summary
- Refactors `GIVEN a request for a URL THEN returns a valid presigned URL` to read the required upload headers from the `headers` field on the response JSON instead of hardcoding `If-None-Match: *`. This exercises the same client-side flow that real callers (e.g. the preprocessing pipeline) are expected to follow.
- Adds a new test, `GIVEN a presigned URL has been used to upload THEN a second upload to the same URL fails`, that uses the same presigned URL twice and asserts the second PUT is rejected with HTTP 412, the overwrite-prevention guarantee that motivated #6381.
## Test plan
- [x] `./gradlew test --tests 'org.loculus.backend.controller.files.RequestUploadEndpointTest'`: all 16 tests pass, including the new one (412 returned by MinIO on the second PUT).
- [x] `./gradlew ktlintFormat`: no changes.
This PR is targeted at the `file_sharing_nooverwrite` branch so it can land alongside #6381.
🚀 Preview: Add `preview` label to enable
Co-authored-by: theosanderson-agent <theo@theo.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
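For callers outside the test suite, the flow in the summary above (read the required headers from the `headers` field rather than hardcoding them) might look roughly like the following sketch. It assumes a Python client using only the standard library. The `url` and `headers` field names come from the `/files/request-upload` response shown in this PR; the function name `build_upload_request` and the example host are hypothetical.

```python
import json
import urllib.request


def build_upload_request(entry: dict, data: bytes) -> urllib.request.Request:
    """Build (but do not send) the PUT for one /files/request-upload entry.

    The required headers are copied from the entry's `headers` field
    instead of being hardcoded, so the client follows whatever the
    backend says it signed.
    """
    return urllib.request.Request(
        entry["url"],
        data=data,
        method="PUT",
        headers=dict(entry.get("headers", {})),
    )


# Entry shaped like the response in this PR (URL shortened, hypothetical host).
entry = json.loads(
    '{"fileId": "example-id",'
    ' "url": "https://s3.example.org/bucket/files/example-id?X-Amz-Expires=1800",'
    ' "headers": {"If-None-Match": "*"}}'
)
request = build_upload_request(entry, b"hello")
# urllib.request.urlopen(request) would perform the actual upload.
```

Note that `urllib.request.Request` normalizes the stored header name to `If-none-match`; HTTP header names are case-insensitive, so this should not affect the signature check.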
|
I deployed a preview off this branch to see if the {If-None-Match: *} header prevents second write requests as expected. It seems to be working as intended!
Setup
Testing
# requesting a file upload as the testuser
➜ loculus git:(s3-garbage-collection) ✗ curl -X POST "https://backend-file-sharing-nooverwrite.loculus.org/files/request-upload?groupId=2&numberFiles=1" -H "Authorization: Bearer $TOKEN"
[{"fileId":"cad7567e-913f-4fbc-bd2e-a2ad44db294d","url":"https://s3-file-sharing-nooverwrite.loculus.org/loculus-preview-private/files/cad7567e-913f-4fbc-bd2e-a2ad44db294d?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20260611T122529Z&X-Amz-SignedHeaders=host%3Bif-none-match&X-Amz-Credential=8LRKJBFQ3G38BIJ9KCHS%2F20260611%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Expires=1800&X-Amz-Signature=d9c32cad6f1936de9a95580fb0deed6cf73d8853df1deedbb7ea058485235859","headers":{"If-None-Match":"*"}}]%
# first: attempt to upload without 'If-None-Match' header; this fails
➜ loculus git:(s3-garbage-collection) ✗ curl -X PUT --upload-file ./test_file.txt "https://s3-file-sharing-nooverwrite.loculus.org/loculus-preview-private/files/cad7567e-913f-4fbc-bd2e-a2ad44db294d?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20260611T122529Z&X-Amz-SignedHeaders=host%3Bif-none-match&X-Amz-Credential=8LRKJBFQ3G38BIJ9KCHS%2F20260611%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Expires=1800&X-Amz-Signature=d9c32cad6f1936de9a95580fb0deed6cf73d8853df1deedbb7ea058485235859"
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>There were headers present in the request which were not signed</Message><Key>files/cad7567e-913f-4fbc-bd2e-a2ad44db294d</Key><BucketName>loculus-preview-private</BucketName><Resource>/loculus-preview-private/files/cad7567e-913f-4fbc-bd2e-a2ad44db294d</Resource><RequestId>18B8067B6686D9BA</RequestId><HostId>dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8</HostId></Error>%
# second: upload with 'If-None-Match' header; succeeds
➜ loculus git:(s3-garbage-collection) ✗ curl -X PUT --upload-file ./test_file.txt "https://s3-file-sharing-nooverwrite.loculus.org/loculus-preview-private/files/cad7567e-913f-4fbc-bd2e-a2ad44db294d?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20260611T122529Z&X-Amz-SignedHeaders=host%3Bif-none-match&X-Amz-Credential=8LRKJBFQ3G38BIJ9KCHS%2F20260611%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Expires=1800&X-Amz-Signature=d9c32cad6f1936de9a95580fb0deed6cf73d8853df1deedbb7ea058485235859" -H "If-None-Match: *"
# (make sure it succeeded)
➜ loculus git:(s3-garbage-collection) ✗ echo $?
0
# third: try another upload; rejected
➜ loculus git:(s3-garbage-collection) ✗ curl -X PUT --upload-file ./test_file.txt "https://s3-file-sharing-nooverwrite.loculus.org/loculus-preview-private/files/cad7567e-913f-4fbc-bd2e-a2ad44db294d?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20260611T122529Z&X-Amz-SignedHeaders=host%3Bif-none-match&X-Amz-Credential=8LRKJBFQ3G38BIJ9KCHS%2F20260611%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Expires=1800&X-Amz-Signature=d9c32cad6f1936de9a95580fb0deed6cf73d8853df1deedbb7ea058485235859" -H "If-None-Match: *"
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>PreconditionFailed</Code><Message>At least one of the pre-conditions you specified did not hold</Message><Key>files/cad7567e-913f-4fbc-bd2e-a2ad44db294d</Key><BucketName>loculus-preview-private</BucketName><Resource>/loculus-preview-private/files/cad7567e-913f-4fbc-bd2e-a2ad44db294d</Resource><RequestId>18B8068BCAF0A935</RequestId><HostId>dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8</HostId></Error>%
# finally: stat the file to double check it exists on S3
➜ loculus git:(s3-garbage-collection) ✗ kubectl exec -n prev-file-sharing-nooverwrite minio-78d9dbd5b-cq776 -- mc stat "local/loculus-preview-private/files/cad7567e-913f-4fbc-bd2e-a2ad44db294d"
Name : cad7567e-913f-4fbc-bd2e-a2ad44db294d
Date : 2026-06-11 12:27:12 UTC
Size : 20 B
ETag : 4221d002ceb5d3c9e9137e495ceaa647
Type : file
Metadata :
Content-Type: binary/octet-stream |
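A client could tell the two failure modes in the transcript above apart by reading the `<Code>` element of the XML error body. The sketch below assumes the error format shown in this test (no XML namespace) and uses abridged copies of the two bodies; the function names and the wording of the returned messages are hypothetical, and `AccessDenied` can have other causes than a missing signed header.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def s3_error_code(body: bytes) -> Optional[str]:
    """Return the <Code> text of an S3-style XML error body, or None."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    return root.findtext("Code")


def describe_upload_failure(body: bytes) -> str:
    """Map the error codes seen in the transcript to a short explanation."""
    code = s3_error_code(body)
    if code == "PreconditionFailed":
        return "object already exists; an earlier upload completed"
    if code == "AccessDenied":
        return "request rejected; check that all signed headers were sent"
    return f"unexpected error: {code}"


# Abridged versions of the two error bodies from the transcript.
MISSING_HEADER = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b"<Error><Code>AccessDenied</Code><Message>There were headers present "
    b"in the request which were not signed</Message></Error>"
)
SECOND_UPLOAD = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b"<Error><Code>PreconditionFailed</Code><Message>At least one of the "
    b"pre-conditions you specified did not hold</Message></Error>"
)
```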
maverbiest
left a comment
There was a problem hiding this comment.
Looks good overall! It does seem slightly brittle to have to add the headers in a couple of different places; maybe it would be nice to pass them as arguments, as suggested in other comments?
|
thanks for the thorough testing! |
|
Confirmed that when I break off the upload to a presigned URL and stat the file in MinIO, it does not exist. I can then upload again, and once that upload completes I can see the file in the stat: |
resolves #4056
The backend now adds `"If-None-Match": "*"` as a header when requesting presigned URLs on behalf of a user. This prevents writes to S3 if data already exists at the target location, preventing accidental overwrites. Suggested by @tombch; see details in https://docs.aws.amazon.com/AmazonS3/latest/userguide/conditional-writes.html and https://security.stackexchange.com/a/286617
Note that this header is not required for multi-part S3 uploads, as the `complete-multipart-upload` request prevents future modifications of the S3 object using the presigned URL.
Breaking change
Clients using presigned URLs (i.e. requested via the `/files/request-upload` endpoint) now need to add `"If-None-Match": "*"` to the headers when submitting data using the presigned URL. This is because AWS and other S3 providers will block uploads that do not send the same headers as were signed when the presigned URL was created.
PR Checklist
🚀 Preview: https://file-sharing-nooverwrite.loculus.org
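To illustrate the breaking change described above: the presigned URLs in this PR carry an `X-Amz-SignedHeaders` query parameter (for example `host%3Bif-none-match`), which lists the headers covered by the signature. A client could inspect it to see which headers it must send, as in this minimal sketch. The function names and the example host are hypothetical, and reading the `headers` field of the `/files/request-upload` response remains the more direct route.

```python
from urllib.parse import parse_qs, urlsplit


def signed_headers(presigned_url: str) -> list[str]:
    """List the header names covered by a presigned URL's signature."""
    query = parse_qs(urlsplit(presigned_url).query)
    values = query.get("X-Amz-SignedHeaders", [])
    return values[0].split(";") if values else []


def extra_required_headers(presigned_url: str) -> list[str]:
    """Signed headers other than `host`, which HTTP clients set themselves."""
    return [name for name in signed_headers(presigned_url) if name != "host"]


# Shortened URL in the shape of the one in this PR (hypothetical host).
example_url = (
    "https://s3.example.org/bucket/files/example-id"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-SignedHeaders=host%3Bif-none-match"
    "&X-Amz-Expires=1800"
)
required = extra_required_headers(example_url)
```

For the example URL, `required` contains only `if-none-match`, matching the header that clients now need to add.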