-
Notifications
You must be signed in to change notification settings - Fork 746
wasm2c: partial support for atomic memory ops #2233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
I assume there re some tests in the threading proposal repo that we can start running perhaps?
(Main comment edited to reflect the latest changes) @sbc100 Unfortunately, the tests in the proposal repo have all tests in a single wast including for wait/notify which is not implemented. So, to run a subset of tests, I have made a local copy of the wast with the unsupported tests removed. Let me know if this is reasonable I have also added the missing alignment checks from the last PR @sbc100 @keithw I am not thrilled with this approach of performing alignment checks, as I think it will incur non-trivial overhead. Open to any ideas. |
efc4af2
to
20906eb
Compare
PR now updates to support msvc also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm % comments.
Bigger picture question: do you think it's going to be feasible to support the proposal in full? There was an earlier conversation at #1766 that seems pretty pessimistic. :-( I haven't thought carefully about it though or looked at the C spec.
@keithw yup, I think the quick summary is that #1766 is a more elaborate version of the statement "lowering wasm atomics into C atomics will cause undefined behavior" that I noted earlier. I believe the current compiler intrinsic approach is safe from UB. In particular, since we are using compiler intrinsics, we aren't violating C's aliasing rules such as pointers of different types should be non-overlapping. Here is the full train of thought.
So overall, I believe we are fine here. However, we can continue to monitor UBSan, to make sure things remain good. I don't think msvc currently supports UBsan, but once that is enabled, it would be good to get that on CI as well. Just for completion, I'll note that the only thing that was a bit weird to handle in wasm2c was MSVC's lack of |
MEMCHECK(mem, addr, t1); \ | ||
ATOMIC_ALIGNMENT_CHECK(addr, t1); \ | ||
t1 result; \ | ||
wasm_rt_memcpy(&result, &mem->data[addr], sizeof(t1)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for having this additional memcpy before the atomic load?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, moving data from a source and destination of different types in C (apart from the basic supported things like int to long conversions) should be done by memcpy. This is because casting the data
byte buffer to multiple target types can trigger UB. Put simply, if you have both expressions u32* a = (u32*) &mem->data[addr]
and u64* b = (u64*) &mem->data[addr]
in the same C file output by wasm2c, then the C compiler which compiles this file is allowed to optimize assuming a & b don't alias. However, Wasm makes no such guarantees about aliasing, meaning a & b could alias/overlap. This end result here is UB in the C compilation.
t1 result; \ | ||
wasm_rt_memcpy(&result, &mem->data[addr], sizeof(t1)); \ | ||
result = atomic_load_##t1(&mem->data[addr]); \ | ||
wasm_asm("" ::"r"(result)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for having this forced read? Is it always necessary? What happens if it is not performed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wasm2c forces liveness of every load result (not just for atomic ops) because WebAssembly requires a deterministic trap on OOB load/store. We don't want the compiler to optimize the read out (e.g. if the value is never observed) and prevent the SIGSEGV/SIGBUS from getting triggered. (The spec tests will fail if you take these forced livenesses out and run wasm2c's output through an optimizing C compiler.) There's probably more efficient ways we could be doing this, but this is what we've got to avoid a software bounds-check on every load.
@shravanrn Great work! I'm currently implementing similar support in w2c2 and was wondering about some of the code |
@turbolent Also, have a look at the currently open PRs. I am changing this to move away from compiler intrinsics to C11 features. |
@shravanrn I'm curious if you have the following expression types working in wasm2c(AtomicWait, AtomicNotify)? Lines 3973 to 3976 in c24a216
They are currently unimplemented in the |
@squk You can try the patch from this PR #2268 |
@shravanrn Thanks. I also found these branches I'm curious, are the atomic memory operations something that's being actively worked on? |
Given my current availability, I think I'm unlikely to get to this in the next couple of months, especially because some of our plans to use wasm threads got dropped. This may change in the future, in which case, I'll try to prioritize implementation again. For the moment, you're welcome to give the implementation a shot if you have cycles. |
This PR starts the work for support for the threads proposal. In this initial PR, I am only starting with support for the Atomic memory operations part of the spec https://github.com/WebAssembly/threads/blob/cc01bf0d17ba3fb1dc59fb7c5c725838aff18b50/proposals/threads/Overview.md?plain=1#L228
Since there is a large amount of support needed, it seemed sensible to land this in chunks.
Broadly, there are three types of operations
This PR focuses on implementing operations 1 and 2 and relies on atomic builtins provided by gcc, clang and icc https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
and the MSVC equivalents
https://learn.microsoft.com/en-us/cpp/intrinsics/intrinsics-available-on-all-architectures?view=msvc-170
On godbolt, it appears that this has support from
on x86_64
on arm/arm64
Unfortunately, we can't use C11 operations as casting to/from atomics from/to a regular pointer in the Wasm heap is undefined behavior in C https://stackoverflow.com/questions/55299525/casting-pointers-to-atomic-pointers-and-atomic-sizes --- this means we can't use these ops on pointers to Wasm's linear memory
Next steps
Operation 3 needs further thinking on the best way to implement this, so we can continue to support different compilers and OSes. Currently the best way forward is to implement this, seems to be to implement this on top of atomic ops, similar to [this](There's https://developers.redhat.com/articles/2022/12/06/implementing-c20-atomic-waiting-libstdc#how_can_we_implement_atomic_waiting_). Thoughts welcome.
This code is tested with a subset of the thread atomic tests. These tests are included in-tree, with unsupported tests (relying on wait/notify) commented out. We can start testing with the full testsuite, once wait/notify is also supported.
It would be nice to find more efficient ways to perform atomic alignment checks instead of an explicit conditional