Extend rolling_exp to support pd.Timedelta objects with window halflife #10237
Added validation and calculation functions for halflife operations. Updated docstrings and type hints accordingly. Copied _calculate_deltas verbatim from pandas/core/window/ewm.py so as not to rely on an internal pandas function.
Introduced new test cases to validate the behavior of rolling_exp when using Timedelta windows, specifically for the halflife window type. Checks for compatibility between window type, window, index, and operation. Check results match pandas.
Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
…compatibility with pandas < 2.2.0. pandas ewm can work with non-ns resolution from >= 2.2.0. Here we just test that this PR's rolling_exp can work with non-ns resolution.
thanks @abiasiol ! couple of quick questions:
Hi @max-sixty ! It works with uneven spacing (the way that Pandas does):
import numpy as np
import pandas as pd
from xarray import DataArray

# Daily timestamps, each shifted by a random 0-11 hour offset (uneven spacing)
times = pd.date_range("2000-01-01", freq="1D", periods=21)
times_delta = pd.to_timedelta(np.random.randint(0, 12, size=len(times)), unit="h")
times = times + times_delta
da = DataArray(
    np.random.random((21, 4)),
    dims=("time", "x"),
    coords=dict(time=times, x=["a", "b", "c", "d"]),
)
# Compare rolling_exp with a Timedelta halflife against pandas' ewm
np.allclose(
    da.rolling_exp(time=pd.Timedelta(hours=2), window_type="halflife").mean().values,
    da.to_pandas()
    .ewm(halflife=pd.Timedelta(hours=2), times=da.time.values)
    .mean()
    .values,
)  # True
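For readers wondering how a time-based halflife can cope with uneven spacing, here is a minimal pure-Python sketch of the general idea. It is not the code in this PR or in pandas: the function name `halflife_weighted_mean` is hypothetical, and the sketch assumes the normalized ("adjusted") form of the weighted mean, where each earlier observation is weighted by 0.5 raised to its age measured in halflives.

```python
import math
from datetime import datetime, timedelta


def halflife_weighted_mean(times, values, halflife):
    """Exponentially weighted mean with a time-based halflife.

    Hypothetical helper for illustration only (O(n^2), not optimized).
    """
    out = []
    for i, t_i in enumerate(times):
        num = 0.0
        den = 0.0
        for t_j, x_j in zip(times[: i + 1], values[: i + 1]):
            if math.isnan(x_j):
                continue  # missing observations contribute no weight
            # Age of observation j relative to time i, in units of halflife.
            age = (t_i - t_j) / halflife
            w = 0.5 ** age
            num += w * x_j
            den += w
        # If no valid observation carries weight, the result is NaN.
        out.append(num / den if den > 0 else math.nan)
    return out


times = [
    datetime(2000, 1, 1, 0),
    datetime(2000, 1, 1, 2),  # 2 hours after the first point
    datetime(2000, 1, 1, 8),  # 6 hours after the second point (uneven gap)
]
values = [1.0, 3.0, 5.0]
result = halflife_weighted_mean(times, values, timedelta(hours=2))
```

With a 2-hour halflife, the first point has half the weight of the second at the second timestamp, and only 1/16 of the weight of the third at the third timestamp, so larger gaps discount older data more strongly.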
Reading the docstring of Pandas
But let me take another look, and I'll get back to you.
ah, great, it uses the numbagg feature which takes an array of I don't fully understand why we're limited to halflife; all the window types are freely convertible to one another, though possibly I'm misunderstanding something. (and same thing with I haven't looked in enough detail at the calcs, but assuming we're well-tested against the pandas implementation, that's sufficient
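As an illustration of the point about window types being interchangeable, the sketch below converts a plain numeric window of each type to the smoothing factor alpha, using the standard relations documented for pandas' ewm. The helper names `to_alpha` and `alpha_to_halflife` are hypothetical and are not taken from this PR; the sketch covers only unitless windows, not Timedelta ones.

```python
import math


def to_alpha(window, window_type="span"):
    """Convert a numeric window of the given type to alpha (hypothetical helper)."""
    if window_type == "span":
        return 2.0 / (window + 1.0)
    if window_type == "com":
        return 1.0 / (1.0 + window)
    if window_type == "halflife":
        # Weights halve every `window` steps: (1 - alpha) ** window == 0.5
        return 1.0 - math.exp(math.log(0.5) / window)
    if window_type == "alpha":
        return float(window)
    raise ValueError(f"unknown window_type: {window_type!r}")


def alpha_to_halflife(alpha):
    """Inverse of the halflife relation above (hypothetical helper)."""
    return math.log(0.5) / math.log(1.0 - alpha)


# Three different parameterizations of the same decay rate.
a_span = to_alpha(3, "span")
a_com = to_alpha(1, "com")
a_half = to_alpha(1, "halflife")
```

All three calls describe the same decay, so any of them can be mapped to a halflife with `alpha_to_halflife`.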
Description
Extended rolling_exp to support pd.Timedelta objects for the window size when using window_type="halflife" along datetime dimensions, similar to pandas' ewm. This allows expressions like da.rolling_exp(time=pd.Timedelta(days=1), window_type="halflife").mean().
Implementation
pd.Timedelta object
"halflife"
mean
nanmean, which allows alpha to be an array
_calculate_deltas function rather than relying on pandas' private implementation
Behavior Note
One difference from pandas' behavior: when dealing with NaN values and a very short timedelta, this implementation returns NaN while pandas appears to carry forward the previous value. This behavior seems more appropriate to me (the user can fill it later if needed).
Example demonstrating the difference: