
Potential memory order inconsistency on freelist block reusage #404

Open
ayles opened this issue Dec 17, 2024 · 9 comments

Comments

ayles commented Dec 17, 2024

Hi! I noticed that TSan complains about data races inside concurrentqueue. I was able to reproduce it with the following minimal example:

#include "concurrentqueue.h"

#include <thread>
#include <vector>

int main() {
    moodycamel::ConcurrentQueue<int> q;

    std::vector<std::thread> threads;

    // 5 consumer threads spinning on try_dequeue
    for (size_t i = 0; i < 5; ++i) {
        threads.emplace_back([&q]() {
            while (true) {
                int item;
                q.try_dequeue(item);
            }
        });
    }

    // 10 producer threads spinning on try_enqueue
    for (size_t i = 0; i < 10; ++i) {
        threads.emplace_back([&q]() {
            while (true) {
                q.try_enqueue(1);
            }
        });
    }

    for (auto& t : threads) {
        t.join();
    }
}
...

WARNING: ThreadSanitizer: data race (pid=1709853)
  Write of size 4 at 0x728c00000718 by thread T14:
    #0 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::ImplicitProducer::enqueue<(moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::AllocationMode)1, int>(int&&) ... concurrentqueue.h:2544:4
    #1 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::inner_enqueue<(moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::AllocationMode)1, int>(int&&) ... concurrentqueue.h:1376:94
    #2 moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::try_enqueue(int&&) ... concurrentqueue.h:1074:15
    #3 main::$_1::operator()() const ... main.cpp:23:19
    
...

  Previous read of size 4 at 0x728c00000718 by thread T5:
    #0 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::ImplicitProducer::dequeue<int>(int&) ... concurrentqueue.h:2596:17
    #1 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::ProducerBase::dequeue<int>(int&) ... concurrentqueue.h:1720:50
    #2 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::try_dequeue<int>(int&) ... concurrentqueue.h:1146:32
    #3 main::$_0::operator()() const ... main.cpp:15:19

...

I believe this race (in terms of the C++ memory model) happens like this:
T0: Thread A reads from memory slot S.
T1: Thread A releases memory slot S of block X.
T2: Thread B releases last slot of block X, pushes block into the freelist.
T3: Thread C acquires block X from the freelist, writes to slot S.

I believe the release in this and similar places should be acq_rel (or release plus an acquire fence on the last increment):
https://github.com/cameron314/concurrentqueue/blob/master/concurrentqueue.h#L1607
to build a happens-before relation between the slot release (thread A) and the whole-block release (thread B) and the subsequent push to the freelist.
(I tested it; TSan stopped complaining.)

I am not so good with the C++ memory model, so maybe I am missing something here?

XiaoXuan42 added a commit to XiaoXuan42/concurrentqueue that referenced this issue Jan 4, 2025
hs-vc added a commit to hs-vc/concurrentqueue that referenced this issue Jan 22, 2025
hs-vc commented Jan 22, 2025

@cameron314 This issue looks quite serious. To address it, I've submitted #406, which is based on this commit. When you have a moment, could you please review the changes? Thank you for your time and consideration.

cameron314 (Owner) commented

I can confirm this TSan report indicates it thinks the block was not handed off cleanly in the FreeList between the two threads. (Line 1607 is irrelevant, as it's in code that's never executed with the default block size.)

It is not clear to me whether or not this is a false positive. More analysis is needed. At the root of the free list implementation are CAS operations on freeListHead, which is already done with release semantics when a block is put on the freelist and acquire semantics when a block is taken off. Therefore my first impression is that this is likely a false positive.

If there is a bug, blindly changing memory ordering until TSan stops complaining does not necessarily indicate the bug is fixed. Actual human logic needs to be involved.

cameron314 (Owner) commented

I think I found one case where freeListRefs may get temporarily manipulated by a thread other than the one in add_knowing_refcount_is_zero, leading to a break in happens-before ordering with respect to that member (note that this should not affect the happens-before of putting a block on the list head and getting it back). Pushed e74b07d.


ayles commented Jan 27, 2025

Sorry, but I do not understand how line 1607 is irrelevant if it is executed (on my machine, with the example above).

And, yes, if thread B releases the block (after releasing the last element) and thread C acquires that block, there is definitely a clear happens-before relation between them.

However, when thread A releases a non-last element from that block, it does not interact with freeListRefs/freeListHead in any way. The only place where threads A and B can establish a happens-before relation is when they increment elementsCompletelyDequeued.

In fact, I do not believe in false positives with modern TSan, unless they are extremely rare or thread fences are in use (at least when talking about the C++ memory model and not about actual races on a particular architecture).


ayles commented Jan 27, 2025

Forgot to mention that, as expected, e74b07d does not fix the problem.

cameron314 (Owner) commented

Ah, I misread your initial analysis. I see what you mean now. It's not about the thread freeing the block but about the other dequeue operations on that block before that. Thank you, this is helpful.

It seems I also reread the code too quickly. Line 1607 is skipped with the default block size for explicit producers, but not implicit ones.

Yes, the commit I pushed is for an unrelated potential issue.

cameron314 added a commit that referenced this issue Jan 27, 2025
… that block's reuse when empty, affecting implicit producers (see #404)
cameron314 (Owner) commented

I don't have TSan set up at the moment, please let me know if the latest commit fixes this particular warning.

Based on #406, I'm guessing there are others that are unrelated to this one.


ayles commented Jan 27, 2025

I can't test this commit at the moment, but as I mentioned earlier, one change on line 1607 completely resolves the warnings in the example.

I suppose there might be a similar issue with flags in the if-branch as well.

Maybe it will be possible to introduce TSan tests in the future, starting with the example above?


ayles commented Jan 27, 2025

Also, in terms of performance, it may be better to use release memory order with an acquire fence on the last increment, as described here: https://www.boost.org/doc/libs/1_53_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.example_reference_counters.discussion
With a call to __tsan_acquire in TSan builds, of course.
(But the new conditional branch may cost a little, so it will require some benchmarking.)
