refactor(datasets): add compress_level parameter to write_image() and set it to 1 #2135
+25
−2
This PR stems from the conversation in #1959.
Rationale
Why is compression not critical at this step?
We aim to preserve as much raw image information as possible, as these images are intermediate artifacts. They will later be compressed during video encoding at the end of each episode, where compression efficiency potentially matters more.
How was the compression level chosen?
The optimal compression level depends on the entropy characteristics of the images. However, since our main goal here is speed rather than file size, a low compression level is preferred to minimize CPU overhead during frequent writes.
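The dependence on image entropy can be checked with a rough, standard-library-only sketch. PNG compression is zlib deflate, and Pillow documents `compress_level` as the zlib level, so timing `zlib.compress` on raw pixel bytes gives a first approximation. Everything named here (the frame size, `make_gradient_frame`, `make_noise_frame`, `measure`) is a hypothetical stand-in rather than code from this PR, and the sketch ignores PNG row filtering and actual disk I/O.

```python
import os
import time
import zlib

# Hypothetical frame size, chosen only for illustration.
WIDTH, HEIGHT, CHANNELS = 640, 480, 3


def make_gradient_frame() -> bytes:
    """Low-entropy stand-in for a camera frame: a horizontal RGB gradient."""
    row = bytes(
        x * 255 // (WIDTH - 1) for x in range(WIDTH) for _ in range(CHANNELS)
    )
    return row * HEIGHT


def make_noise_frame() -> bytes:
    """High-entropy stand-in: random bytes, essentially incompressible."""
    return os.urandom(WIDTH * HEIGHT * CHANNELS)


def measure(raw: bytes, levels=(0, 1, 6)) -> dict:
    """Return {level: (compressed_size_bytes, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(raw, level)
        results[level] = (len(compressed), time.perf_counter() - start)
    return results


if __name__ == "__main__":
    frames = (("gradient", make_gradient_frame()), ("noise", make_noise_frame()))
    for name, frame in frames:
        print(f"{name}: raw={len(frame)} bytes")
        for level, (size, seconds) in measure(frame).items():
            print(f"  level={level} size={size} time={seconds * 1e3:.2f} ms")
```

Level 0 stores the data uncompressed and adds a small framing overhead, so its output is always slightly larger than the input. How much level 1 saves, and how much extra time level 6 costs, depends on the frame content, so real camera frames should be measured before drawing conclusions.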
Why compress_level=1 instead of 0?
Although 0 uses the least CPU for compression, it can paradoxically result in slower overall performance due to the larger output files. Writing significantly larger files increases I/O time, often offsetting any CPU gains. Setting compress_level=1 provides a better balance between CPU usage and disk throughput.
Future Work
As suggested in the original ticket, compression and encoding parameters (e.g., format, compression level, codec options) should eventually be exposed to users for fine-grained control. This will be addressed in a future PR, although it is not currently a priority.