The rdrand implementation contains three calls to `rdrand()`:
1. One in the main loop, for full words of output.
2. One after the main loop, for the potential partial word of output.
3. One inside the self-test loop.
In the first two cases, each call is unrolled into:
```
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
rdrand <register>
jb <success>
```
In the third case, the self-test loop, the same unrolling happens, but
then the self-test loop itself is also unrolled, repeating the
20-instruction sequence above once per iteration, so the result is a
sequence of 160 instructions.
With this change, each call to `rdrand()` now looks like this:
```
rdrand <register>
jb <success>
call retry
test rax, rax
jne ...
jmp ...
```
The loop in `retry()` still gets unrolled though.
Since `rdrand` essentially never fails, the `jb <success>` in each
call will be predicted as taken, so the number of instructions
executed doesn't change. Instruction cache pressure should be
reduced, though, because each call site is now much smaller.
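
The shape described above can be sketched roughly as follows. This is
an illustration rather than the code in this commit: `rdrand_like`,
`RETRY_LIMIT`, and the closure-based `step` parameter are hypothetical
stand-ins (a closure replaces the real `rdrand` instruction so the
sketch runs on any target), and the retry count is an assumed value.

```rust
/// Number of extra attempts made on the cold path.
/// Assumed value for illustration; not taken from the commit.
const RETRY_LIMIT: usize = 10;

/// Cold, out-of-line retry loop. Because it is never inlined, each call
/// site only carries one attempt plus a `call`, instead of a full copy
/// of the unrolled loop.
#[cold]
#[inline(never)]
fn retry(step: &mut dyn FnMut() -> Option<u64>) -> Option<u64> {
    for _ in 0..RETRY_LIMIT {
        if let Some(value) = step() {
            return Some(value);
        }
    }
    None
}

/// Hot path: a single inline attempt (standing in for `rdrand` + `jb`),
/// falling back to the cold `retry` function only on failure.
#[inline(always)]
fn rdrand_like(step: &mut dyn FnMut() -> Option<u64>) -> Option<u64> {
    match step() {
        Some(value) => Some(value),
        None => retry(step),
    }
}
```

A caller would pass something like `&mut || Some(7)` as `step`. In the
common case the first attempt succeeds and `retry` is never entered,
which is why moving it out of line should not affect the hot path.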