Hi,
I am currently using SpikeInterface to preprocess my recording, as shown below:

import spikeinterface.full as si
import spikeinterface.preprocessing as spre

rec = si.read_nwb("recording.nwb")
rec = spre.bandpass_filter(rec)
# a chain of SpikeInterface preprocessing functions ...
Recently I have needed to iterate over the entire recording for a custom processing function, something like this:

results_by_chunk = [
    my_func(rec.get_traces(start_frame=beg, end_frame=beg + chunksize))
    for beg in range(0, rec.get_num_frames(), chunksize)
]
With this approach, each chunk's disk IO and CPU-intensive processing run back to back, so the two never overlap, and the whole loop is slow on my machine.
I would like to decouple the IO task from the CPU-intensive tasks: for example, one process that reads data chunks from disk into a queue, and a pool of processes that perform the processing (the outputs are much smaller than the timeseries data, so they can all fit in memory). However, I don't know how to set this up under the lazy-loading framework. How would you view this problem, and do you have any suggestions?
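As a starting point, here is a minimal producer-consumer sketch of the decoupling described above. It uses threads rather than processes, which avoids pickling the lazy recording object and still overlaps IO with computation when the heavy work is in NumPy (which releases the GIL); for pure-Python processing a process pool would be needed instead. `my_func`, the chunk size, and the array shapes are placeholders standing in for the recording and custom function in the question.

```python
import threading
import queue
import numpy as np

def my_func(chunk):
    # Placeholder for the custom CPU-intensive function:
    # here, a per-channel mean over the chunk.
    return chunk.mean(axis=0)

def producer(read_chunk, num_frames, chunk_size, q, n_consumers):
    """IO thread: read chunks sequentially and push them onto the queue."""
    for beg in range(0, num_frames, chunk_size):
        q.put((beg, read_chunk(beg, min(beg + chunk_size, num_frames))))
    for _ in range(n_consumers):
        q.put(None)  # one sentinel per consumer signals end of data

def consumer(q, results):
    """CPU thread: pop chunks and run the processing function."""
    while True:
        item = q.get()
        if item is None:
            break
        beg, chunk = item
        results[beg] = my_func(chunk)

# Stand-in for the lazy recording; with SpikeInterface this would wrap
# rec.get_traces(start_frame=..., end_frame=...).
data = np.random.randn(10_000, 8)
read_chunk = lambda beg, end: data[beg:end]

q = queue.Queue(maxsize=4)  # bounded, so the reader cannot outrun memory
results = {}
n_consumers = 2
threads = [threading.Thread(target=producer,
                            args=(read_chunk, len(data), 1000, q, n_consumers))]
threads += [threading.Thread(target=consumer, args=(q, results))
            for _ in range(n_consumers)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Reassemble outputs in chunk order (consumers may finish out of order).
results_by_chunk = [results[beg] for beg in sorted(results)]
```

The bounded queue is the key design choice: it lets the reader stay at most a few chunks ahead, so IO and processing overlap without the raw data ever accumulating in memory.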
Thank you very much!