Add x86 Keccak implementation #2619

Draft: manastasova wants to merge 4 commits into main

@manastasova manastasova commented Aug 15, 2025

Description of changes:

This PR prototypes the x86 Keccak implementation as part of the third-party module. Once the code and its proof are merged into s2n-bignum, the s2n-bignum importer script will be used to integrate the implementation.

Testing:

ninja && ./crypto/crypto_test
./tool/bssl speed -filter {SHA3-224, ...}

SHA3 Performance: Assembly vs C Implementation Tables

SHA3-224

| Input Size | Assembly (MB/s) | C (MB/s) | Speedup | Improvement (%) |
|---|---|---|---|---|
| 16 bytes | 54.4 | 45.2 | 1.20x | 20.4% |
| 256 bytes | 445.5 | 390.9 | 1.14x | 14.0% |
| 1350 bytes | 492.0 | 428.3 | 1.15x | 14.9% |
| 8192 bytes | 527.2 | 449.3 | 1.17x | 17.3% |
| 16384 bytes | 533.5 | 451.1 | 1.18x | 18.3% |

SHA3-256

| Input Size | Assembly (MB/s) | C (MB/s) | Speedup | Improvement (%) |
|---|---|---|---|---|
| 16 bytes | 54.4 | 45.9 | 1.19x | 18.5% |
| 256 bytes | 446.6 | 400.5 | 1.12x | 11.5% |
| 1350 bytes | 493.1 | 433.9 | 1.14x | 13.6% |
| 8192 bytes | 496.8 | 428.6 | 1.16x | 15.9% |
| 16384 bytes | 498.1 | 431.1 | 1.16x | 15.5% |

SHA3-384

| Input Size | Assembly (MB/s) | C (MB/s) | Speedup | Improvement (%) |
|---|---|---|---|---|
| 16 bytes | 54.3 | 48.0 | 1.13x | 13.1% |
| 256 bytes | 305.8 | 283.6 | 1.08x | 7.8% |
| 1350 bytes | 384.4 | 333.7 | 1.15x | 15.2% |
| 8192 bytes | 384.5 | 328.8 | 1.17x | 17.0% |
| 16384 bytes | 385.7 | 329.4 | 1.17x | 17.1% |

SHA3-512

| Input Size | Assembly (MB/s) | C (MB/s) | Speedup | Improvement (%) |
|---|---|---|---|---|
| 16 bytes | 54.0 | 46.4 | 1.16x | 16.4% |
| 256 bytes | 234.5 | 210.5 | 1.11x | 11.4% |
| 1350 bytes | 266.8 | 228.2 | 1.17x | 16.9% |
| 8192 bytes | 272.3 | 229.2 | 1.19x | 18.8% |
| 16384 bytes | 271.1 | 231.6 | 1.17x | 17.1% |
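
The Speedup and Improvement columns above are consistent with a plain ratio of the two throughput figures. A minimal sketch of that arithmetic, assuming speedup = assembly / C and improvement = (speedup - 1) * 100; the function name `speedup` is a hypothetical helper, not part of the PR:

```python
def speedup(asm: float, c: float) -> tuple:
    """Return (ratio, percent improvement) of the assembly figure over the C figure."""
    ratio = asm / c
    return ratio, (ratio - 1.0) * 100.0


# SHA3-224, 16-byte input, MB/s values taken from the table above.
ratio, pct = speedup(54.4, 45.2)
print(f"{ratio:.2f}x {pct:.1f}%")  # prints "1.20x 20.4%"
```

Recomputing from the rounded MB/s values can differ from a listed percentage by about 0.1 (for example the SHA3-384 8192-byte row), so treat the last digit as approximate.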

SHA3 Performance: Details

ASM Implementation

./tool/bssl speed -filter SHA3-224
Did 3399750 SHA3-224 (16 bytes) operations in 1000077us (3399488.2 ops/sec): 54.4 MB/s
Did 1741000 SHA3-224 (256 bytes) operations in 1000447us (1740222.1 ops/sec): 445.5 MB/s
Did 365000 SHA3-224 (1350 bytes) operations in 1001528us (364443.1 ops/sec): 492.0 MB/s
Did 65000 SHA3-224 (8192 bytes) operations in 1009980us (64357.7 ops/sec): 527.2 MB/s
Did 33000 SHA3-224 (16384 bytes) operations in 1013407us (32563.4 ops/sec): 533.5 MB/s
./tool/bssl speed -filter SHA3-256
Did 3403250 SHA3-256 (16 bytes) operations in 1000045us (3403096.9 ops/sec): 54.4 MB/s
Did 1744750 SHA3-256 (256 bytes) operations in 1000020us (1744715.1 ops/sec): 446.6 MB/s
Did 366000 SHA3-256 (1350 bytes) operations in 1001951us (365287.3 ops/sec): 493.1 MB/s
Did 61000 SHA3-256 (8192 bytes) operations in 1005814us (60647.4 ops/sec): 496.8 MB/s
Did 31000 SHA3-256 (16384 bytes) operations in 1019776us (30398.8 ops/sec): 498.1 MB/s
./tool/bssl speed -filter SHA3-384
Did 3395000 SHA3-384 (16 bytes) operations in 1000226us (3394232.9 ops/sec): 54.3 MB/s
Did 1195000 SHA3-384 (256 bytes) operations in 1000425us (1194492.3 ops/sec): 305.8 MB/s
Did 285000 SHA3-384 (1350 bytes) operations in 1000955us (284728.1 ops/sec): 384.4 MB/s
Did 47000 SHA3-384 (8192 bytes) operations in 1001271us (46940.3 ops/sec): 384.5 MB/s
Did 24000 SHA3-384 (16384 bytes) operations in 1019448us (23542.2 ops/sec): 385.7 MB/s
./tool/bssl speed -filter SHA3-512
Did 3377000 SHA3-512 (16 bytes) operations in 1000075us (3376746.7 ops/sec): 54.0 MB/s
Did 917000 SHA3-512 (256 bytes) operations in 1000998us (916085.7 ops/sec): 234.5 MB/s
Did 198000 SHA3-512 (1350 bytes) operations in 1001963us (197612.1 ops/sec): 266.8 MB/s
Did 34000 SHA3-512 (8192 bytes) operations in 1022690us (33245.7 ops/sec): 272.3 MB/s
Did 17000 SHA3-512 (16384 bytes) operations in 1027485us (16545.3 ops/sec): 271.1 MB/s

C Implementation

./tool/bssl speed -filter SHA3-224
Did 2827000 SHA3-224 (16 bytes) operations in 1000051us (2826855.8 ops/sec): 45.2 MB/s
Did 1528000 SHA3-224 (256 bytes) operations in 1000630us (1527038.0 ops/sec): 390.9 MB/s
Did 318000 SHA3-224 (1350 bytes) operations in 1002449us (317223.1 ops/sec): 428.3 MB/s
Did 55000 SHA3-224 (8192 bytes) operations in 1002827us (54845.0 ops/sec): 449.3 MB/s
Did 28000 SHA3-224 (16384 bytes) operations in 1016880us (27535.2 ops/sec): 451.1 MB/s
./tool/bssl speed -filter SHA3-256
Did 2867500 SHA3-256 (16 bytes) operations in 1000073us (2867290.7 ops/sec): 45.9 MB/s
Did 1564750 SHA3-256 (256 bytes) operations in 1000143us (1564526.3 ops/sec): 400.5 MB/s
Did 322000 SHA3-256 (1350 bytes) operations in 1001788us (321425.3 ops/sec): 433.9 MB/s
Did 53000 SHA3-256 (8192 bytes) operations in 1012940us (52322.9 ops/sec): 428.6 MB/s
Did 27000 SHA3-256 (16384 bytes) operations in 1026228us (26309.9 ops/sec): 431.1 MB/s
./tool/bssl speed -filter SHA3-384
Did 3000000 SHA3-384 (16 bytes) operations in 1000366us (2998902.4 ops/sec): 48.0 MB/s
Did 1108000 SHA3-384 (256 bytes) operations in 1000065us (1107928.0 ops/sec): 283.6 MB/s
Did 248000 SHA3-384 (1350 bytes) operations in 1003285us (247188.0 ops/sec): 333.7 MB/s
Did 41000 SHA3-384 (8192 bytes) operations in 1021609us (40132.8 ops/sec): 328.8 MB/s
Did 21000 SHA3-384 (16384 bytes) operations in 1044409us (20107.1 ops/sec): 329.4 MB/s
./tool/bssl speed -filter SHA3-512
Did 2902000 SHA3-512 (16 bytes) operations in 1000219us (2901364.6 ops/sec): 46.4 MB/s
Did 823000 SHA3-512 (256 bytes) operations in 1000766us (822370.1 ops/sec): 210.5 MB/s
Did 170000 SHA3-512 (1350 bytes) operations in 1005799us (169019.9 ops/sec): 228.2 MB/s
Did 28000 SHA3-512 (8192 bytes) operations in 1000970us (27972.9 ops/sec): 229.2 MB/s
Did 15000 SHA3-512 (16384 bytes) operations in 1061183us (14135.2 ops/sec): 231.6 MB/s
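
The ops/sec and MB/s figures in the logs above can be recomputed from the operation count, input size, and elapsed microseconds on each line. A minimal sketch, assuming the line format shown in the logs and that MB means 10^6 bytes (which matches the reported values); `LINE` and `throughput` are hypothetical names, not part of `bssl`:

```python
import re

# Matches lines such as:
#   Did 3399750 SHA3-224 (16 bytes) operations in 1000077us (3399488.2 ops/sec): 54.4 MB/s
LINE = re.compile(r"Did (\d+) (\S+) \((\d+) bytes\) operations in (\d+)us")


def throughput(line: str) -> tuple:
    """Return (name, input size, ops/sec, MB/s) recomputed from one log line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    ops, name = int(m.group(1)), m.group(2)
    size, micros = int(m.group(3)), int(m.group(4))
    ops_per_sec = ops / (micros / 1e6)
    # Bytes per microsecond equals 10^6 bytes per second, i.e. MB/s.
    mb_per_sec = ops * size / micros
    return name, size, ops_per_sec, mb_per_sec


line = ("Did 3399750 SHA3-224 (16 bytes) operations in 1000077us "
        "(3399488.2 ops/sec): 54.4 MB/s")
name, size, ops_s, mb_s = throughput(line)
print(f"{name} ({size} bytes): {ops_s:.1f} ops/sec, {mb_s:.1f} MB/s")
```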

ML-KEM Performance: Assembly vs C SHA3 Tables

ML-KEM Performance: Assembly vs C Implementation

ML-KEM-512

| Operation | Assembly (ops/sec) | C (ops/sec) | Speedup | Improvement (%) |
|---|---|---|---|---|
| Keygen | 59902.7 | 57511.2 | 1.04x | 4.2% |
| Encaps | 55389.8 | 52286.3 | 1.06x | 5.9% |
| Decaps | 45758.1 | 42746.4 | 1.07x | 7.0% |

ML-KEM-768

| Operation | Assembly (ops/sec) | C (ops/sec) | Speedup | Improvement (%) |
|---|---|---|---|---|
| Keygen | 35753.8 | 34474.1 | 1.04x | 3.7% |
| Encaps | 36155.6 | 34309.8 | 1.05x | 5.4% |
| Decaps | 30390.9 | 28625.7 | 1.06x | 6.2% |

ML-KEM-1024

| Operation | Assembly (ops/sec) | C (ops/sec) | Speedup | Improvement (%) |
|---|---|---|---|---|
| Keygen | 23652.7 | 22711.6 | 1.04x | 4.1% |
| Encaps | 25448.2 | 23889.1 | 1.07x | 6.5% |
| Decaps | 21344.2 | 19922.1 | 1.07x | 7.1% |

ASM Implementation

./tool/bssl speed -filter ML-KEM-512
Did 60000 ML-KEM-512 keygen operations in 1001625us (59902.7 ops/sec)
Did 56000 ML-KEM-512 encaps operations in 1011016us (55389.8 ops/sec)
Did 46000 ML-KEM-512 decaps operations in 1005287us (45758.1 ops/sec)
./tool/bssl speed -filter ML-KEM-768
Did 36000 ML-KEM-768 keygen operations in 1006886us (35753.8 ops/sec)
Did 37000 ML-KEM-768 encaps operations in 1023355us (36155.6 ops/sec)
Did 31000 ML-KEM-768 decaps operations in 1020042us (30390.9 ops/sec)
./tool/bssl speed -filter ML-KEM-1024
Did 24000 ML-KEM-1024 keygen operations in 1014683us (23652.7 ops/sec)
Did 26000 ML-KEM-1024 encaps operations in 1021685us (25448.2 ops/sec)
Did 22000 ML-KEM-1024 decaps operations in 1030726us (21344.2 ops/sec)

C Implementation

./tool/bssl speed -filter ML-KEM-512
Did 58000 ML-KEM-512 keygen operations in 1008500us (57511.2 ops/sec)
Did 53000 ML-KEM-512 encaps operations in 1013649us (52286.3 ops/sec)
Did 43000 ML-KEM-512 decaps operations in 1005932us (42746.4 ops/sec)
./tool/bssl speed -filter ML-KEM-768
Did 35000 ML-KEM-768 keygen operations in 1015254us (34474.1 ops/sec)
Did 35000 ML-KEM-768 encaps operations in 1020117us (34309.8 ops/sec)
Did 29000 ML-KEM-768 decaps operations in 1013076us (28625.7 ops/sec)
./tool/bssl speed -filter ML-KEM-1024
Did 23000 ML-KEM-1024 keygen operations in 1012698us (22711.6 ops/sec)
Did 24000 ML-KEM-1024 encaps operations in 1004641us (23889.1 ops/sec)
Did 20000 ML-KEM-1024 decaps operations in 1003908us (19922.1 ops/sec)
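
The ML-KEM comparison pairs each assembly run with the C run for the same parameter set and operation. A minimal sketch of that pairing, assuming the `bssl speed` line format shown in the logs above; `LINE`, `ops_per_sec`, and `compare` are hypothetical helpers, not part of the PR or of `bssl`:

```python
import re

# Matches lines such as:
#   Did 60000 ML-KEM-512 keygen operations in 1001625us (59902.7 ops/sec)
LINE = re.compile(r"Did \d+ (\S+) (\w+) operations in \d+us \(([\d.]+) ops/sec\)")


def ops_per_sec(log: str) -> dict:
    """Map (algorithm, operation) to the reported ops/sec figure."""
    return {(m.group(1), m.group(2)): float(m.group(3)) for m in LINE.finditer(log)}


def compare(asm_log: str, c_log: str) -> list:
    """Return (algorithm, operation, speedup, improvement %) for keys present in both logs."""
    asm, c = ops_per_sec(asm_log), ops_per_sec(c_log)
    rows = []
    for key, asm_rate in asm.items():
        if key in c:
            ratio = asm_rate / c[key]
            rows.append((key[0], key[1], round(ratio, 2), round((ratio - 1) * 100, 1)))
    return rows


# ML-KEM-512 lines copied from the logs above.
ASM_LOG = """\
Did 60000 ML-KEM-512 keygen operations in 1001625us (59902.7 ops/sec)
Did 56000 ML-KEM-512 encaps operations in 1011016us (55389.8 ops/sec)
Did 46000 ML-KEM-512 decaps operations in 1005287us (45758.1 ops/sec)
"""
C_LOG = """\
Did 58000 ML-KEM-512 keygen operations in 1008500us (57511.2 ops/sec)
Did 53000 ML-KEM-512 encaps operations in 1013649us (52286.3 ops/sec)
Did 43000 ML-KEM-512 decaps operations in 1005932us (42746.4 ops/sec)
"""

for alg, op, ratio, pct in compare(ASM_LOG, C_LOG):
    print(f"{alg} {op}: {ratio:.2f}x {pct:.1f}%")
```

Keying on the (algorithm, operation) pair keeps the comparison correct even if the two logs list their lines in a different order.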

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

codecov-commenter commented Aug 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.71%. Comparing base (04875db) to head (cc8f92a).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2619      +/-   ##
==========================================
- Coverage   78.72%   78.71%   -0.02%     
==========================================
  Files         645      645              
  Lines      111086   111112      +26     
  Branches    15690    15688       -2     
==========================================
+ Hits        87453    87458       +5     
- Misses      22941    22962      +21     
  Partials      692      692              
