(feat): remove dtype + fill val handling per chunk #124
+203
−204
Calling `to_native_dtype` + `__str__` came up as one of the only Python-CPU-bound things when doing some benchmarking. My use case is quite contrived (generating thousands of `WithSubset` objects), but I think it's probably worth investigating getting rid of these calls. Some observations:

1. `dtype` and `fill_val` be wrapped up in just relying on https://docs.rs/zarrs/latest/zarrs/array/struct.Array.html#method.open and then using the values directly (there are probably other benefits of doing this), but I think this is a separate PR
2. `Basic` anyway so that chunk handling is independent of the ability. I noticed that `ChunkRepresentation` requires ownership over its arguments, which means we copy per chunk. Not sure what would go into making that a reference, but it's no worse than the previous situation, where I think we were generating copies repeatedly, but from PyO3 calling Python
3. The benefit wasn't crazy (~5%), but I think going in this direction is good (see point 1)
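As a rough illustration of the per-chunk cost described above, here is a minimal pure-Python sketch. Every name in it (`CountingDtype`, `ChunkItem`, `per_chunk`, `per_array`) is hypothetical and made up for this example; it does not reproduce the real `WithSubset` or `to_native_dtype` code. It only shows the general idea of doing the dtype conversion once per array instead of once per chunk.

```python
from dataclasses import dataclass


class CountingDtype:
    """Hypothetical stand-in for a dtype whose string conversion is costly."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.str_calls = 0  # number of times __str__ has run

    def __str__(self) -> str:
        self.str_calls += 1
        return self.name


@dataclass(frozen=True)
class ChunkItem:
    """Hypothetical per-chunk descriptor (not the real WithSubset)."""

    key: str
    dtype: str
    fill_value: int


def per_chunk(dtype, fill_value, keys):
    # Before: the dtype is stringified again for every chunk.
    return [ChunkItem(k, str(dtype), fill_value) for k in keys]


def per_array(dtype, fill_value, keys):
    # After: convert once, then reuse the result for every chunk.
    dtype_str = str(dtype)
    return [ChunkItem(k, dtype_str, fill_value) for k in keys]


keys = [f"c/{i}" for i in range(1000)]
before, after = CountingDtype("uint8"), CountingDtype("uint8")
items_before = per_chunk(before, 0, keys)
items_after = per_array(after, 0, keys)
```

Both functions produce the same descriptors; the difference is only in how many times the conversion runs, which is the kind of overhead that shows up when thousands of objects are generated.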
TODO:
- `vlen` test error messages / warnings re: what we support.