Hi,
I am currently using SpikeInterface to preprocess my recording, as shown below:

import spikeinterface.full as si
import spikeinterface.preprocessing as spre

rec = si.read_nwb("recording.nwb")
rec = spre.bandpass_filter(rec)
# a chain of SpikeInterface preprocessing functions ...
Recently I have needed to iterate over the entire recording for a custom processing function, something like this:

results_by_chunk = [
    my_func(rec.get_traces(start_frame=beg, end_frame=beg + chunksize))
    for beg in range(0, rec.get_num_frames(), chunksize)
]
With this approach, each chunk's disk IO and CPU-intensive processing run back to back, so the two never overlap, and the whole loop is slow on my machine.
I would like to decouple the IO task from the CPU-intensive tasks: for example, one process that reads data chunks from disk into a queue, and a pool of processes that perform the processing (the outputs are much smaller than the timeseries data, so they can all fit in memory). However, I don't know how to set this up under the lazy-loading framework. How would you view this problem, and do you have any suggestions?
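As a starting point, here is a minimal producer-consumer sketch of the decoupling described above. It uses threads rather than processes, which avoids pickling the lazy recording object and still overlaps IO with computation when the heavy work is in NumPy (which releases the GIL); for pure-Python processing a process pool would be needed instead. `my_func`, the chunk size, and the array shapes are placeholders standing in for the recording and custom function in the question.

```python
import threading
import queue
import numpy as np

def my_func(chunk):
    # Placeholder for the custom CPU-intensive function:
    # here, a per-channel mean over the chunk.
    return chunk.mean(axis=0)

def producer(read_chunk, num_frames, chunk_size, q, n_consumers):
    """IO thread: read chunks sequentially and push them onto the queue."""
    for beg in range(0, num_frames, chunk_size):
        q.put((beg, read_chunk(beg, min(beg + chunk_size, num_frames))))
    for _ in range(n_consumers):
        q.put(None)  # one sentinel per consumer signals end of data

def consumer(q, results):
    """CPU thread: pop chunks and run the processing function."""
    while True:
        item = q.get()
        if item is None:
            break
        beg, chunk = item
        results[beg] = my_func(chunk)

# Stand-in for the lazy recording; with SpikeInterface this would wrap
# rec.get_traces(start_frame=..., end_frame=...).
data = np.random.randn(10_000, 8)
read_chunk = lambda beg, end: data[beg:end]

q = queue.Queue(maxsize=4)  # bounded, so the reader cannot outrun memory
results = {}
n_consumers = 2
threads = [threading.Thread(target=producer,
                            args=(read_chunk, len(data), 1000, q, n_consumers))]
threads += [threading.Thread(target=consumer, args=(q, results))
            for _ in range(n_consumers)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Reassemble outputs in chunk order (consumers may finish out of order).
results_by_chunk = [results[beg] for beg in sorted(results)]
```

The bounded queue is the key design choice: it lets the reader stay at most a few chunks ahead, so IO and processing overlap without the raw data ever accumulating in memory.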
Thank you very much!