PB-1024: add large asset download guide for assets > 50 GB #127

Merged
ltclm merged 4 commits into master from feat_PB-1024_add_range_download_sample
Mar 19, 2026

Conversation

@ltclm
Contributor

@ltclm ltclm commented Mar 18, 2026

Add a new page explaining how to download STAC assets larger than 50 GB, which exceed CloudFront's object size limit and return HTTP 400 on regular GET/HEAD requests.
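
A minimal sketch of the range-based approach described above, not the script added in this PR. It assumes the server honours `Range` requests and reports the total size in a `Content-Range` header; the function names and the 1 GiB default chunk size are illustrative choices.

```python
import urllib.request


def parse_total_size(content_range: str) -> int:
    """Extract the total size from a header such as 'bytes 0-0/1234'."""
    total = content_range.rsplit("/", 1)[1]
    if total == "*":
        raise ValueError("server did not report the total size")
    return int(total)


def chunk_ranges(total_size: int, chunk_size: int):
    """Yield inclusive (start, end) byte offsets covering total_size bytes."""
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1


def fetch_range(url: str, start: int, end: int):
    """Open a ranged GET request; the caller closes the response."""
    request = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    return urllib.request.urlopen(request)


def download_in_chunks(url: str, dest: str, chunk_size: int = 1024**3) -> None:
    # A 1-byte range request stands in for HEAD to learn the total size.
    with fetch_range(url, 0, 0) as response:
        total = parse_total_size(response.headers["Content-Range"])
    with open(dest, "wb") as out:
        for start, end in chunk_ranges(total, chunk_size):
            with fetch_range(url, start, end) as response:
                # Stream in 1 MiB blocks so a whole chunk is never held in memory.
                while block := response.read(1024 * 1024):
                    out.write(block)
```

Calling `download_in_chunks(asset_url, "asset.bin")` with an asset href would then fetch the file sequentially, one range at a time.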

Test link

@ltclm ltclm force-pushed the feat_PB-1024_add_range_download_sample branch from c9dc0df to 5efc56b Compare March 18, 2026 15:06
@ltclm ltclm requested review from GeoPhilo and boecklic March 18, 2026 15:06
@ltclm ltclm force-pushed the feat_PB-1024_add_range_download_sample branch from 5efc56b to 2f7f9da Compare March 18, 2026 15:12
@ltclm ltclm requested a review from rebert March 18, 2026 15:12
@ltclm ltclm force-pushed the feat_PB-1024_add_range_download_sample branch 2 times, most recently from 3091930 to 2c2c2a2 Compare March 18, 2026 15:21
Contributor

@rebert rebert left a comment

Nice! Without such a description and an example script, people like me would be lost. Thanks.

@ltclm ltclm force-pushed the feat_PB-1024_add_range_download_sample branch from 2c2c2a2 to 7758331 Compare March 18, 2026 20:36
Contributor

@GeoPhilo GeoPhilo left a comment

Thanks a lot for the instructions and sample script!

Contributor

@boecklic boecklic left a comment

Nice, thanks!

@boecklic
Contributor

One more thing: this could be parallelized, so several chunks could be downloaded in parallel to speed the whole thing up. But that only makes sense if the user has a fast Internet connection. I would not modify the example script, but you can probably mention it somewhere.
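
As a rough illustration of the parallel variant mentioned here (a sketch only, not part of the example script): each worker fetches one byte range and writes it at its own offset in a preallocated file. The names `parallel_download` and `make_http_fetch` are hypothetical, and each chunk is held in memory, so the chunk size should stay modest.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def make_http_fetch(url):
    """Return fetch(start, end) -> bytes, assuming the server supports Range."""
    def fetch(start, end):
        request = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(request) as response:
            return response.read()
    return fetch


def parallel_download(fetch, total_size, dest, chunk_size, workers=4):
    """Download total_size bytes into dest using several threads.

    fetch(start, end) must return the bytes of the inclusive range.
    Returns the number of bytes written.
    """
    # Preallocate the file so every worker can write at its own offset.
    with open(dest, "wb") as out:
        out.truncate(total_size)

    def worker(start):
        end = min(start + chunk_size, total_size) - 1
        data = fetch(start, end)
        # Each worker uses its own file handle to avoid sharing a seek position.
        with open(dest, "r+b") as out:
            out.seek(start)
            out.write(data)
        return len(data)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(worker, range(0, total_size, chunk_size)))
```

Passing `fetch` as a parameter keeps the threading logic independent of the transport, which also makes it easy to try out with an in-memory stand-in before pointing it at a real asset URL.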

Another potential solution that just came to mind (it would need a small modification of service-stac): we could add the possibility to get a presigned URL for the GET request for the object. With that URL, requests would go directly to S3, bypassing CloudFront.
So something like

/collections/{collection_id}/items/{item_id}/assets/{asset_id}/s3_download_url

(or as part of the asset_id payload, but that would probably impact caching).
This is not fully thought through (implications for caching, etc.). Maybe it could be a future change/improvement, and only for authenticated users...

@ltclm
Contributor Author

ltclm commented Mar 19, 2026

> One more thing: this could be parallelized, so several chunks could be downloaded in parallel to speed the whole thing up. But that only makes sense if the user has a fast Internet connection. I would not modify the example script, but you can probably mention it somewhere.

Thank you, @boecklic. I have added a small comment regarding parallel downloads.

ltclm added 4 commits March 19, 2026 10:36

Add a new page explaining how to download STAC assets larger than 50 GB, which exceed CloudFront's object size limit and return HTTP 400 on regular GET/HEAD requests.

HTTP 206 (Partial Content) is a success status code, not an error, so urllib.request.urlopen() returns it normally without raising an HTTPError.
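
The behaviour described in this commit message can be checked locally. The sketch below is an assumption-laden demo, not code from the PR: it starts a throwaway server on 127.0.0.1 that always answers with 206, and `RangeHandler` and `fetch_partial` are made-up names. It only handles `bytes=start-end` ranges.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"0123456789"


class RangeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Minimal parsing: only the "bytes=start-end" form is supported.
        spec = self.headers["Range"].removeprefix("bytes=")
        start, end = (int(part) for part in spec.split("-"))
        body = PAYLOAD[start:end + 1]
        self.send_response(206)
        self.send_header("Content-Range", f"bytes {start}-{end}/{len(PAYLOAD)}")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_partial(start, end):
    """Request one byte range from a local test server via urlopen()."""
    server = HTTPServer(("127.0.0.1", 0), RangeHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        request = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        # No try/except HTTPError is needed here: 206 is in the 2xx range.
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    finally:
        server.shutdown()
        server.server_close()
```

Here `fetch_partial(2, 5)` returns the status code together with the requested slice of `PAYLOAD`, without any exception being raised.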
@ltclm ltclm force-pushed the feat_PB-1024_add_range_download_sample branch from b36a253 to c09fbba Compare March 19, 2026 09:36
@ltclm ltclm merged commit 733b41a into master Mar 19, 2026
2 checks passed
@ltclm ltclm deleted the feat_PB-1024_add_range_download_sample branch March 19, 2026 09:39
4 participants