Description
Describe the feature
In Python with boto3, I can do the following:

```python
from boto3.s3.transfer import TransferConfig

config = TransferConfig(
    multipart_threshold=4 * 1024 * 1024 * 1024,  # 4 GB
    max_concurrency=1,
    multipart_chunksize=32 * 1024 * 1024,  # 32 MB
)

# some code here...
self.s3.download_file(
    Bucket="commoncrawl",
    Key="path_to_file.txt",
    Filename="local.txt",
    Config=config,
)
```
I'm looking for a way to do this from aws-rust-sdk, or at least a way to have it use the locally configured rules, for example:

```
aws configure set s3.multipart_threshold 4GB
```
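Until something like this is supported natively, the locally configured values could in principle be read by hand. Below is a minimal Python sketch (staying in the language of the boto3 example above) that parses the indented `s3` sub-block the AWS CLI writes to `~/.aws/config`. The helper names `parse_size` and `read_s3_setting` are hypothetical, and this is a simplification of the CLI's actual config-resolution logic, not a reimplementation of it:

```python
import configparser
import os

# Size suffixes accepted by `aws configure set s3.multipart_threshold 4GB`
_SUFFIXES = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}


def parse_size(value):
    """Parse a size string like '4GB' or '32MB' into bytes; plain ints pass through."""
    value = value.strip().upper()
    for suffix, factor in _SUFFIXES.items():
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)])) * factor
    return int(value)


def read_s3_setting(config_path, key, profile="default"):
    """Read a nested `s3.<key>` value from an AWS CLI config file, or None."""
    parser = configparser.ConfigParser()
    parser.read(os.path.expanduser(config_path))
    section = profile if profile == "default" else f"profile {profile}"
    # The CLI stores s3 transfer settings as an indented sub-block under `s3 =`;
    # configparser exposes that block as one multi-line value.
    block = parser.get(section, "s3", fallback="")
    for line in block.splitlines():
        if "=" in line:
            k, _, v = line.partition("=")
            if k.strip() == key:
                return parse_size(v)
    return None
```

After `aws configure set s3.multipart_threshold 4GB`, `read_s3_setting("~/.aws/config", "multipart_threshold")` would return `4 * 1024 ** 3`, which could then be fed into whatever transfer configuration the SDK exposes.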
Use Case
Common Crawl's bucket is rate limited on requests but not on bandwidth, so avoiding multipart downloads is the only way to download from it reliably. For the same file, the difference is between a 10-second and a 10-minute download.
Proposed Solution
No response
Other Information
No response
Acknowledgements
- I may be able to implement this feature request
- This feature might incur a breaking change
A note for the community
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue, please leave a comment