Hi,

I am trying more examples using crypto code, and this time the problem I've encountered is an integer arithmetic mismatch between the GPU and the CPU.

Here is an example of the code: CPU and GPU.

Both have the same code for the fiat_25519_to_bytes function; however, the result is slightly different between the CPU and the GPU:
CPU OUT: [192, 72, 16, 54, 192, 98, 172, 116, 44, 128, 112, 112, 150, 42, 195, 95, 129, 14, 47, 50, 18, 198, 117, 255, 32, 79, 57, 78, 137, 92, 244, 98]
GPU OUT: [192, 72, 16, 54, 192, 98, 172, 124, 44, 128, 112, 112, 86, 106, 195, 95, 129, 14, 47, 48, 20, 198, 117, 255, 32, 63, 57, 78, 137, 92, 244, 98]
(note element 7, for example: 116 on the CPU vs. 124 on the GPU)
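For what it's worth, a small standalone snippet (the arrays are copied verbatim from the outputs above) that prints the differing bytes and their XOR, which makes the bit-flip pattern easier to see:

```rust
fn main() {
    let cpu: [u8; 32] = [
        192, 72, 16, 54, 192, 98, 172, 116, 44, 128, 112, 112, 150, 42, 195, 95,
        129, 14, 47, 50, 18, 198, 117, 255, 32, 79, 57, 78, 137, 92, 244, 98,
    ];
    let gpu: [u8; 32] = [
        192, 72, 16, 54, 192, 98, 172, 124, 44, 128, 112, 112, 86, 106, 195, 95,
        129, 14, 47, 48, 20, 198, 117, 255, 32, 63, 57, 78, 137, 92, 244, 98,
    ];
    for (i, (c, g)) in cpu.iter().zip(gpu.iter()).enumerate() {
        if c != g {
            // The XOR shows exactly which bits flipped, which can hint
            // at a lost or mispropagated carry rather than random corruption.
            println!("byte {i}: cpu={c} gpu={g} xor={:#010b}", c ^ g);
        }
    }
}
```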
Understandably, the function is fairly large, but I was not able to reduce the example. When trying to narrow down what happens, I can see that the result of this line is incorrect:
fiat_25519_addcarryx_u51(&mut x14, &mut x15, x13, x3, (x11 & 0x7ffffffffffff));
However, when this line is taken out of the context of the large function, it works perfectly fine on the GPU, so the problem is likely some kind of optimization that breaks the integer arithmetic logic.
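For anyone trying to reproduce this in isolation, here is a minimal model of what that call should compute, based on fiat-crypto's documented semantics for addcarryx on 51-bit limbs (the function and variable names here are mine, not the generated code):

```rust
// Model of fiat_25519_addcarryx_u51: a full addition of carry-in + a + b,
// split into a 51-bit limb and a 1-bit carry-out.
// Preconditions: carry_in ∈ [0, 1], a and b ∈ [0, 2^51 - 1].
fn addcarryx_u51_model(carry_in: u8, a: u64, b: u64) -> (u64, u8) {
    let sum = (carry_in as u64) + a + b;   // cannot overflow u64 given the preconditions
    let limb = sum & 0x7ffffffffffff;      // low 51 bits
    let carry_out = (sum >> 51) as u8;     // everything above bit 50 is the carry
    (limb, carry_out)
}
```

Comparing the intermediate x14/x15 inside the large function against this model at that point might pinpoint where the GPU result starts to diverge.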