Proposal: Add anti-aliasing option to decimate()

`DecimateRecording` applies simple array slicing without an anti-aliasing (AA) filter:
```python
"""
class DecimateRecording(BasePreprocessor):
  Decimate the recording extractor traces using array slicing
  
  Important: This uses simple array slicing for decimation rather than eg scipy.decimate.
  This might introduce aliasing, or skip across signal of interest.
  Consider  spikeinterface.preprocessing.ResampleRecording for safe resampling.
  ...
"""
```

I'd like to add an "antialias" option to `DecimateRecording`:
- If `True`, it would use `spicy.decimate`. 
- If `True` and `decimation factor > 13`, it would warn the user that `decimate()` should be called multiple times, with decimation factors as balanced as possible (e.g. for a total decimation factor of 48, a pass with `q=8` followed by `q=6` is preferable to `q=12` followed by `q=4`). I can see I there's a good way to handle this _for_ the user, but I think it's impossible for the `DecimateRecording` constructor itself to chain together `DecimateRecordings`. Besides, this would be consistent with the behavior of the underlying `spicy.decimate()`, which warns but does not refuse `q > 13`. 
    - For consistency, I would add the same check and warning to the decimation path in `ResampleRecording`. 
- If `True`, a margin would be used to accommodate the AA filter, exactly as it is in `ResampleRecording`. 
- `ResampleRecording` says that decimation can "misbehave on some cases", and guards against this:
    ```python
    # Decimate can have issues for some cases, returning NaNs
    resampled_traces = signal.decimate(parent_traces, q=q, axis=0)
    # If that's the case, use signal.resample
    if np.any(np.isnan(resampled_traces)):
    resampled_traces = signal.resample(parent_traces, num, axis=0)
    ```
    I would check for `NaN` and warn the user if any are found. But in my experience, decimation only does this when `q > 13`, so it could be  dropped IMO. If there are other cases where this is known to happen, I would like to document them with a comment to justify the every-call overhead of `np.any(np.isnan())`. 
- I would make `antialias=False` the default, to preserve existing behavior. 

One reason I'd like to do this is that antialiased decimation is currently provided by `resample()`, which does not take a decimation factor, just a target resample rate, and checks if `np.mod(self._parent_rate, self._resample_rate) == 0` to determine whether antialiased decimation (vs FFT-based resampling) is done. The issue with this is that many integer decimation factors can't be achieved this way, because of float fragility. For example, with a parent rate of 625 Hz and a target rate of 125 Hz, `np.mod(625, 125) == 0` correctly determines that this is an integer downsample factor of 5. But `np.mod(625, 104.166666667) == 0` fails to detect a downsample factor of 6. 

I would preserve the decimation path within `ResampleRecording()`, since it's still a free optimization, and preserves existing behavior. 

If this sounds good, I'll submit a PR. 


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Add anti-aliasing option to decimate() #4605

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Proposal: Add anti-aliasing option to decimate() #4605

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions