Improve consistency of default engine and return memoryview instead of bytes from to_netcdf() #10656
This PR introduces a bug fix and a breaking change:

1. The default backend ``engine`` used by `Dataset.to_netcdf` and `DataTree.to_netcdf` is now chosen consistently with `open_dataset` and `open_datatree`, using whichever netCDF libraries are available and preferring netCDF4 to h5netcdf to scipy. Previously, `DataTree.to_netcdf` was hard-coded to use h5netcdf.
2. The return value of `Dataset.to_netcdf` without ``path`` is now a ``memoryview`` object instead of ``bytes``. This removes an unnecessary memory copy and ensures consistency when using either ``engine="scipy"`` or ``engine="h5netcdf"``.

Fixes pydata#10654
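The preference order described above (netCDF4, then h5netcdf, then scipy) can be illustrated with a small standalone sketch. This is not xarray's implementation: `pick_default_engine`, `ENGINE_MODULES`, and the `is_available` hook are hypothetical names used only for illustration, and the sketch checks only whether a module can be imported, not whether the engine is valid for a given write target.

```python
from importlib.util import find_spec
from typing import Callable

# (engine name, importable module) pairs in the preference order stated in
# the PR description. The names below are illustrative assumptions.
ENGINE_MODULES = (
    ("netcdf4", "netCDF4"),
    ("h5netcdf", "h5netcdf"),
    ("scipy", "scipy"),
)


def _is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found on this system."""
    return find_spec(module_name) is not None


def pick_default_engine(is_available: Callable[[str], bool] = _is_installed) -> str:
    """Return the first engine whose backing module is available."""
    for engine, module_name in ENGINE_MODULES:
        if is_available(module_name):
            return engine
    raise ValueError("no netCDF backend found (tried netCDF4, h5netcdf, scipy)")
```

Passing a custom `is_available` callable makes the selection easy to exercise without installing or uninstalling any backend.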
@dataclass
class BytesIOProxy(Generic[BytesOrMemory]):
    """Proxy object for a write that returns either bytes or a memoryview."""
class BytesIOProxy:
Note: I'm keeping around BytesIOProxy because we'll need it for #10624
I didn't realize the PR fixing the issue was already open. Thanks for opening it. I have added a comment about listing this as a breaking change. I noticed the issue because code that had run successfully in CI since DataTree was introduced in xarray stopped working with 2025.8.0.
This PR introduces two breaking changes:

1. The default backend `engine` used by `Dataset.to_netcdf` and `DataTree.to_netcdf` is now chosen consistently with `open_dataset` and `open_datatree`, using whichever netCDF libraries are available and valid, and preferring netCDF4 to h5netcdf to scipy. Previously, `Dataset.to_netcdf` was hard-coded to use scipy for writing to file-like objects or bytes, and `DataTree.to_netcdf` was hard-coded to use h5netcdf.
2. The return value of `Dataset.to_netcdf` without `path` is now a `memoryview` object instead of `bytes`. This removes an unnecessary memory copy and ensures consistency when using either `engine="scipy"` or `engine="h5netcdf"`.

It also includes a minor bug fix, raising an error when returning a memoryview with `compute=False`.

Fixes #10654
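
For downstream code that previously relied on `bytes`, the difference between a `memoryview` and `bytes` can be sketched with the standard library alone. The example below uses `io.BytesIO` as a stand-in for an in-memory write; the payload is placeholder data rather than a real netCDF file, and `to_bytes` is a hypothetical helper, not part of xarray.

```python
import io


def to_bytes(payload) -> bytes:
    """Hypothetical helper: normalize a bytes or memoryview payload to bytes."""
    if isinstance(payload, memoryview):
        # tobytes() makes the copy explicit, only where bytes are truly needed.
        return payload.tobytes()
    return bytes(payload)


buffer = io.BytesIO()
buffer.write(b"CDF\x01 example payload")  # placeholder data, not a valid file

view = buffer.getbuffer()  # memoryview over the buffer's contents, no copy
copy = buffer.getvalue()   # bytes object, which copies the contents

# Both expose the same data; only the view avoids the extra copy.
assert to_bytes(view) == copy
```

A `memoryview` can usually be passed directly to APIs that accept bytes-like objects, such as `file.write`, so the explicit conversion is only needed where a real `bytes` object is required.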
whats-new.rst