
fftwf_plan_dft is not threadsafe #18

Open
kC0pter opened this issue Jun 15, 2020 · 3 comments

Comments


kC0pter commented Jun 15, 2020

Hey it's me again.

While testing the xcorr I found out that the FFTW implementation is not thread-safe.
More specifically, it is the calls to the fftwf_plan_dft functions.
The only functions that are thread-safe themselves are the fftwf_execute calls.
Found the info here and here.

I fixed it by passing a mutex to the xcorr call, passing it all the way down to the FFTW implementation as a member reference, and wrapping a lock around the fftwf_plan_dft calls.
It's a bit hacky, but it does the job for my project.

Example:

inline void dft(const value_type* src, complex_type* dst) {
    std::unique_lock<std::mutex> lock(mutex_);
    if (meta::is_null(plan_)) {
        plan_ = fftwf_plan_dft_r2c_1d(nfft_, internal::fftw_cast(src), internal::fftw_cast(dst),
                                      FFTW_ESTIMATE | FFTW_PRESERVE_INPUT);
    }
    lock.unlock();
    fftwf_execute_dft_r2c(plan_, internal::fftw_cast(src), internal::fftw_cast(dst));
}
mohabouje (Owner) commented

Sorry to hear that, but in its current design eDSP is a single-threaded library. It does not have any support for multithreaded applications, mostly because it aims to be a base implementation for DSP/low-latency applications and I did not want to introduce the penalty of having std::mutex, spin locks, or semaphores everywhere.

You can always build a wrapper on top of it or keep the lock in your implementation. In this scenario, I normally have an FFT instance per thread.


kC0pter commented Jun 18, 2020

Oh well, okay, I didn't know about the single-threaded approach. Then it actually is not really a bug.

On the other hand, I use it exactly as you describe: on two threads I call the xcorr function independently on independent vectors/datasets. Since xcorr creates an FFT instance itself, I have an independent FFT instance per thread, and it still crashes.

The FFTW documentation on thread-safety that I mentioned states: "The FFTW planner is intended to be called from a single thread. If you really must call it from multiple threads, you are expected to grab whatever lock makes sense for your application...", and since exactly that is happening, it kind of is a bug.
The same documentation also mentions that "FFTW is designed for the situation where the only performance-sensitive code is the actual execution of the transform", so wrapping a mutex around just the planning is not really a performance impact. Maybe including a multithreaded implementation like xcorr_mt would also be a nice solution.


mohabouje commented Jun 18, 2020

Good point! I completely forgot about that.

I need to update the official documentation to make that point clear.

I see what you mean. In my case, I'm using a factory provider: just a simple make_fft helper function with a lock. All threads use it.

I wanted to do something similar to what the standard library does with std::make_tuple or std::make_shared. In my current implementation I have a simple std::mutex in a .cpp file; not the most elegant approach, though.

I'm going to try to come up with a better approach and provide a helper in the utils section in future releases.
