[Advice appreciated] How to implement pre-fetching? #3883

@oaaij-gnahz

Description

Hi,

I am currently using SpikeInterface to preprocess my recording like shown below:

rec = si.read_nwb("recording.nwb")
rec = spre.bandpass_filter(rec)
# a chain of SpikeInterface preprocessing functions ...

Recently I have needed to iterate over the entire recording with a custom processing function, something like this:

results_by_chunk = [my_func(rec.get_traces(start_frame=beg, end_frame=beg+chunksize)) for beg in range(0, rec.get_num_frames(), chunksize)]

It seems that, the way I am using these functions, the disk IO and the CPU-intensive processing happen sequentially, and that is slow on my machine.

I would like to decouple the IO task from the CPU-intensive tasks: for example, one process that reads data chunks from disk into a queue, and a pool of processes that performs the preprocessing (the outputs will be much smaller than the timeseries data, so they can all fit in memory). However, I don't know how to set this up under the lazy-loading framework. How do you view this problem, and do you have any suggestions?
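To make the question concrete, here is a minimal sketch of the producer/consumer pattern I have in mind, using a background thread and a bounded queue from the standard library. The read_chunk and my_func callables below are hypothetical stand-ins for rec.get_traces(...) and my actual processing function; this is just to illustrate the shape of the pipeline, not a working SpikeInterface solution:

```python
import threading
import queue

def prefetch_chunks(read_chunk, n_chunks, maxsize=4):
    """Generator that reads chunks on a background thread while the
    caller processes previously fetched chunks."""
    q = queue.Queue(maxsize=maxsize)  # bounded: limits memory use
    SENTINEL = object()               # marks end of stream

    def producer():
        for i in range(n_chunks):
            q.put(read_chunk(i))      # blocks when the queue is full
        q.put(SENTINEL)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        chunk = q.get()
        if chunk is SENTINEL:
            break
        yield chunk

# hypothetical stand-ins for rec.get_traces(...) and my processing function
def read_chunk(i):
    return list(range(i * 3, i * 3 + 3))

def my_func(chunk):
    return sum(chunk)

results_by_chunk = [my_func(c) for c in prefetch_chunks(read_chunk, n_chunks=4)]
```

A thread only overlaps the IO wait with the processing; for truly CPU-bound my_func I would presumably need a process pool on the consumer side instead, which is where I am unsure how the lazily evaluated recording objects fit in.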

Thank you very much!

Labels: question (General question regarding SI)