What is your issue?
If an array contains np.inf and a rolling operation is applied with numbagg enabled, every result after the windows that contain the inf is nan. Take the following example, first with numbagg disabled:
import xarray as xr
import numpy as np
xr.set_options(use_numbagg=False)
da=xr.DataArray([1,2,3,np.inf,4,5,6,7,8,9,10], dims=['x'])
da.rolling(x=2).sum()
Output
<xarray.DataArray (x: 11)> Size: 88B
array([nan, 3., 5., inf, inf, 9., 11., 13., 15., 17., 19.])
Dimensions without coordinates: x
With Numbagg:
xr.set_options(use_numbagg=True)
da=xr.DataArray([1,2,3,np.inf,4,5,6,7,8,9,10], dims=['x'])
print(da.rolling(x=2).sum())
Output
<xarray.DataArray (x: 11)> Size: 88B
array([nan, 3., 5., inf, inf, nan, nan, nan, nan, nan, nan])
Dimensions without coordinates: x
What did I expect?
I expected no user-visible changes in the output values when numbagg is enabled.
Maybe this is not a bug but expected behaviour for numbagg. The second call raised the following warning:
.../Local/virtual_environments/xarray_performance/lib/python3.10/site-packages/numbagg/decorators.py:247: RuntimeWarning: invalid value encountered in move_sum
return gufunc(*arr, window, min_count, axis=axis, **kwargs)
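One possible explanation, which is my assumption and not something I have confirmed in the numbagg source: if move_sum keeps a running total and subtracts the value that leaves the window, then once inf leaves the window the update computes inf - inf, which is nan, and that nan stays in the total. The NumPy-only sketch below reproduces both outputs above. The function names moving_sum_running and moving_sum_recompute are hypothetical and are not part of numbagg or xarray.

```python
import numpy as np


def moving_sum_running(values, window):
    """Hypothetical running-total moving sum (assumed, NOT numbagg's actual code)."""
    out = np.full(len(values), np.nan)
    acc = 0.0
    for i, v in enumerate(values):
        acc += v  # add the element entering the window
        if i >= window:
            # subtract the element leaving the window; inf - inf gives nan
            acc -= values[i - window]
        if i >= window - 1:
            out[i] = acc
    return out


def moving_sum_recompute(values, window):
    """Moving sum that re-sums every window from scratch."""
    out = np.full(len(values), np.nan)
    for i in range(window - 1, len(values)):
        out[i] = values[i - window + 1 : i + 1].sum()
    return out


data = np.array([1, 2, 3, np.inf, 4, 5, 6, 7, 8, 9, 10], dtype=float)

# inf - inf triggers an "invalid value" RuntimeWarning; silence it here.
with np.errstate(invalid="ignore"):
    running = moving_sum_running(data, 2)
recomputed = moving_sum_recompute(data, 2)

print(running)
print(recomputed)
```

If this assumption holds, the nan values are a side effect of the faster update scheme and would appear for any window size once an inf has passed through the window.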
If this is expected, I think it would be good to have a page in the documentation that lists the downsides and limitations of the various tools for accelerating xarray. From the current installation docs, I assumed I just needed to install numbagg/bottleneck to make xarray faster, without any changes in output values.
Environment
xarray==2024.2.0
numbagg==0.8.0