dhruvak001
Contributor

Users could not use SeasonResampler for chunking operations in xarray, even though it is a natural fit for seasonal data analysis. Calling ds.chunk(time=SeasonResampler(["DJF", "MAMJ", "JAS", "ON"])) raised obscure errors because the chunking logic was hardcoded to accept only TimeResampler objects. This limitation blocked efficient seasonal analysis workflows and forced users to fall back on workarounds or manual chunking strategies.

This PR generalizes chunking by adding a resolve_chunks method to the Resampler base class and updating the chunking logic to work with all Resampler objects, not just TimeResampler. It also adds a _for_chunking method to SeasonResampler that sets drop_incomplete=False during chunking operations, so that no data is silently dropped. Existing TimeResampler behavior is unchanged, so the change is fully backward compatible while enabling seasonal chunking.
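To make the design concrete, here is a minimal standalone sketch of the idea in plain Python. It is not xarray's implementation: the classes ToyResampler and ToySeasonResampler, the helper season_to_months, and the signature of resolve_chunks are all assumptions made for illustration. Only the method names resolve_chunks and _for_chunking and the drop_incomplete flag come from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import date

MONTH_INITIALS = "JFMAMJJASOND"


def season_to_months(season: str) -> list[int]:
    """Hypothetical helper: map "DJF" to [12, 1, 2], "MAMJ" to [3, 4, 5, 6], etc."""
    # Search in a doubled string so seasons that wrap the year end are found.
    start = (MONTH_INITIALS * 2).index(season)
    return [(start + i) % 12 + 1 for i in range(len(season))]


class ToyResampler:
    """Toy base class: each resampler turns itself into chunk sizes."""

    def resolve_chunks(self, times: list[date]) -> tuple[int, ...]:
        raise NotImplementedError


@dataclass(frozen=True)
class ToySeasonResampler(ToyResampler):
    seasons: tuple[str, ...]
    drop_incomplete: bool = True

    def _for_chunking(self) -> ToySeasonResampler:
        # Chunk sizes must cover every element along the axis,
        # so incomplete seasons at the edges must not be dropped.
        return replace(self, drop_incomplete=False)

    def resolve_chunks(self, times: list[date]) -> tuple[int, ...]:
        resampler = self._for_chunking()
        month_to_season = {
            month: index
            for index, season in enumerate(resampler.seasons)
            for month in season_to_months(season)
        }
        sizes: list[int] = []
        previous = None
        for t in times:
            label = month_to_season[t.month]
            if label == previous:
                sizes[-1] += 1  # same season: grow the current chunk
            else:
                sizes.append(1)  # season changed: start a new chunk
                previous = label
        return tuple(sizes)


# Two years of monthly timestamps, January 2001 through December 2002.
times = [date(2001 + m // 12, m % 12 + 1, 1) for m in range(24)]
chunks = ToySeasonResampler(("DJF", "MAMJ", "JAS", "ON")).resolve_chunks(times)
print(chunks)  # (2, 4, 3, 2, 3, 4, 3, 2, 1)
```

The first chunk (January and February 2001) and the last (December 2002) are partial DJF seasons. Keeping them as short chunks is what makes the sizes sum to the full axis length, which is the reason the toy _for_chunking forces drop_incomplete=False.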

@dhruvak001 dhruvak001 changed the title from "Support chunking" to "Support rechunking to seasonal frequency with SeasonalResampler" Jul 9, 2025
DHRUVA KUMAR KAUSHAL added 2 commits July 10, 2025 00:06
@dhruvak001 dhruvak001 requested a review from dcherian July 18, 2025 08:16
@dhruvak001 dhruvak001 requested a review from dcherian July 19, 2025 03:55
@dcherian
Contributor

dcherian commented Jul 21, 2025

Almost there! I have a couple of small requests.

@keewis does this API look like it can work with the DGGSResampler idea we talked about? You should be able to rechunk to a "zoom level" so that a chunk completely contains an integer number of "parent cells".

DHRUVA KUMAR KAUSHAL and others added 4 commits August 4, 2025 02:28
* main: (46 commits)
  use the new syntax of ignoring bots (pydata#10668)
  modification methods on `Coordinates` (pydata#10318)
  Silence warnings from test_tutorial.py (pydata#10661)
  test: update write_empty test for zarr 3.1.2 (pydata#10665)
  Bump actions/checkout from 4 to 5 in the actions group (pydata#10652)
  Add load_datatree function (pydata#10649)
  Support compute=False from DataTree.to_netcdf (pydata#10625)
  Fix typos (pydata#10655)
  In case of misconfiguration of dataset.encoding `unlimited_dims` warn instead of raise (pydata#10648)
  fix ``auto_complex`` for ``open_datatree`` (pydata#10632)
  Fix bug indexing with boolean scalars (pydata#10635)
  Improve DataTree typing (pydata#10644)
  Update Cartopy and Iris references (pydata#10645)
  Empty release notes (pydata#10642)
  release notes for v2025.08.0 (pydata#10641)
  Fix `ds.merge` to prevent altering original object depending on join value (pydata#10596)
  Add asynchronous load method (pydata#10327)
  Add DataTree.prune() method              … (pydata#10598)
  Avoid refining parent dimensions in NetCDF files (pydata#10623)
  clarify lazy behaviour and eager loading chunks=None in open_*-functions (pydata#10627)
  ...
@dcherian dcherian added the plan to merge Final call for comments label Aug 24, 2025
@dcherian dcherian enabled auto-merge (squash) August 25, 2025 15:10
@dcherian dcherian disabled auto-merge August 25, 2025 15:10
@dcherian dcherian merged commit 98732e7 into pydata:main Aug 25, 2025
35 of 37 checks passed
Development

Successfully merging this pull request may close these issues.

Support rechunking to seasonal frequency with SeasonalResampler
2 participants