@skshetry
Contributor

@skshetry skshetry commented Oct 11, 2025

`_put_file` will automatically increase the `chunksize` when uploading a large file, so that the upload stays within the 10,000-chunk limit, if `chunksize` is not set or is `None`.

Fixes #971.
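
For readers skimming the thread, the behaviour described above can be sketched roughly as follows. This is an illustration, not the code merged in this PR: the helper name `pick_chunksize`, the constant names, and the 50 MiB default are assumptions made for the example.

```python
MAX_PARTS = 10_000              # chunk-count limit mentioned in the description
DEFAULT_CHUNKSIZE = 50 * 2**20  # assumed default of 50 MiB (hypothetical constant)


def pick_chunksize(filesize, chunksize=None,
                   max_parts=MAX_PARTS, default=DEFAULT_CHUNKSIZE):
    """Hypothetical helper: choose a chunk size that fits within max_parts."""
    if chunksize is not None:
        # An explicitly requested chunk size is left untouched.
        return chunksize
    # Smallest chunk size that splits the file into at most max_parts chunks
    # (integer ceiling division avoids float rounding on very large sizes).
    required = -(-filesize // max_parts)
    # Never go below the default; only grow when the file is large enough.
    return max(default, required)


small = pick_chunksize(2**30)   # 1 GiB file keeps the default
large = pick_chunksize(2**40)   # 1 TiB file needs a larger chunk size
```

With these assumptions, a 1 GiB file keeps the default chunk size, while a 1 TiB file gets a chunk size just large enough to stay at or below 10,000 chunks.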

@skshetry skshetry moved this to In Progress in DVC Oct 12, 2025
@skshetry skshetry added this to DVC Oct 12, 2025
@martindurant
Member

I am glad you are working on this.

You may also be interested in fsspec/adlfs#508 , which implemented a coroutine pool rather than batching, a feature we could implement across fsspec ( @anjaliratnam-msft ).

@skshetry skshetry closed this Oct 28, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in DVC Oct 28, 2025
@skshetry skshetry reopened this Oct 28, 2025
@github-project-automation github-project-automation bot moved this from Done to Backlog in DVC Oct 28, 2025
@skshetry skshetry marked this pull request as ready for review October 28, 2025 12:18
`_put_file` will automatically increase the `chunksize` when uploading a
large file to remain within 10,000 chunks limit.
@skshetry skshetry moved this from Backlog to Review In Progress in DVC Oct 28, 2025
@martindurant
Member

martindurant commented Nov 4, 2025

I am happy to merge if you are. I believe this must be untestable in moto, right? (edit: unless we patch the limits)

@skshetry
Contributor Author

skshetry commented Nov 5, 2025

> I am happy to merge if you are. I believe this must be untestable in moto, right? (edit: unless we patch the limits)

Given that this logic only applies once the file size reaches >488 GiB, it isn’t testable with moto.

I'd prefer not to patch the limits. I have refactored the chunksize calculation and added a unit test.
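
As a rough check of the 488 GiB figure, here is the arithmetic, assuming a 50 MiB default chunk size (an assumption; it is the value the figure implies) and the 10,000-chunk limit. The names below are illustrative only.

```python
MiB = 2**20
GiB = 2**30

MAX_PARTS = 10_000            # chunk-count limit from the PR description
DEFAULT_CHUNKSIZE = 50 * MiB  # assumed default chunk size


def parts_needed(filesize, chunksize):
    """Number of chunks a file of `filesize` bytes needs at `chunksize`."""
    return -(-filesize // chunksize)  # integer ceiling division


# Largest file that still fits in MAX_PARTS chunks at the default size.
threshold = MAX_PARTS * DEFAULT_CHUNKSIZE
print(f"{threshold / GiB:.2f} GiB")  # 488.28 GiB
```

One byte past that threshold needs 10,001 chunks at the default size, which is the point where a larger chunk size becomes necessary. Because this is pure arithmetic, it can be unit-tested without uploading anything.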

@martindurant martindurant merged commit 641ecd5 into fsspec:main Nov 5, 2025
21 checks passed
@github-project-automation github-project-automation bot moved this from Review In Progress to Done in DVC Nov 5, 2025
@skshetry skshetry deleted the dynamic=chunksize branch November 5, 2025 16:51
Development

Successfully merging this pull request may close these issues.

using dynamic chunksize when uploading large files?