polyval: implement Karatsuba multiplication for arm64 #181

ericlagergren · 2023-06-22T05:19:06Z

Improves performance by ~200 MB/s on a 2020 M1.


ericlagergren · 2023-06-22T05:21:40Z

The code is taken from https://github.com/ericlagergren/polyval-rs/tree/dev, which also has "wide" implementations (8 blocks at a time) that perform significantly better (~0.17 cycles per byte instead of ~2).

ericlagergren · 2023-06-22T05:25:20Z

I also have an x86 version that I can submit if you'd like.

polyval/src/backend/pmull.rs

tarcieri · 2023-06-23T18:28:28Z

Parallel and x86 versions would be appreciated, although perhaps as separate PRs to ease reviewability.


tarcieri

Tested locally on an M2 Max, where I observed the reported speedups.

Percentage-wise it's about a 17% speedup.

ericlagergren · 2023-06-24T06:50:07Z

> Parallel and x86 versions would be appreciated, although perhaps as separate PRs to ease reviewability

Actually, your x86 implementation only uses 3 clmul instructions, so I don't think the serial version can be improved much.

I'll look at adding parallel implementations. Offhand, do you know if the current API supports it? The input probably needs to be in one contiguous buffer (maybe not?), but that's the common case, at least for things like non-interleaved AES-GCM-SIV or HCTR2.
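For readers unfamiliar with the three-multiplication count mentioned above, here is a minimal sketch of Karatsuba applied to a 128x128-bit carry-less product. It is not the code from this PR: `clmul64`, `schoolbook`, and `karatsuba` are hypothetical names, and `clmul64` is a slow portable stand-in for a hardware instruction such as PMULL or PCLMULQDQ. The field reduction that POLYVAL performs afterwards is omitted.

```rust
/// Portable carry-less (polynomial over GF(2)) multiply of two 64-bit values.
/// Hypothetical stand-in for a hardware instruction such as PMULL/PCLMULQDQ.
fn clmul64(a: u64, b: u64) -> u128 {
    let mut acc = 0u128;
    for i in 0..64 {
        if (b >> i) & 1 == 1 {
            acc ^= (a as u128) << i;
        }
    }
    acc
}

/// Schoolbook 128x128 -> 256-bit product using four 64-bit multiplies.
/// Returns (low 128 bits, high 128 bits).
fn schoolbook(x: u128, y: u128) -> (u128, u128) {
    let (x0, x1) = (x as u64, (x >> 64) as u64);
    let (y0, y1) = (y as u64, (y >> 64) as u64);
    let lo = clmul64(x0, y0);
    let hi = clmul64(x1, y1);
    let mid = clmul64(x0, y1) ^ clmul64(x1, y0);
    (lo ^ (mid << 64), hi ^ (mid >> 64))
}

/// Karatsuba 128x128 -> 256-bit product using three 64-bit multiplies.
/// Returns (low 128 bits, high 128 bits).
fn karatsuba(x: u128, y: u128) -> (u128, u128) {
    let (x0, x1) = (x as u64, (x >> 64) as u64);
    let (y0, y1) = (y as u64, (y >> 64) as u64);
    let lo = clmul64(x0, y0);
    let hi = clmul64(x1, y1);
    // (x0 + x1)(y0 + y1) = x0*y0 + x0*y1 + x1*y0 + x1*y1 over GF(2),
    // so XORing out `lo` and `hi` leaves exactly the two cross terms.
    let mid = clmul64(x0 ^ x1, y0 ^ y1) ^ lo ^ hi;
    (lo ^ (mid << 64), hi ^ (mid >> 64))
}
```

Both functions return the same unreduced 256-bit product; the Karatsuba form trades one multiply for a few XORs, which only pays off where the multiply is the more expensive operation.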

tarcieri · 2023-06-24T13:06:01Z

Take a look at poly1305 for an example of a parallel multi-block backend (AVX2): https://github.com/RustCrypto/universal-hashes/blob/0054b30/poly1305/src/backend/avx2.rs#L188-L198
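The multi-block idea discussed above can be sketched as follows. This is an illustration under stated assumptions, not the crate's backend API: `gf_mul`, `horner_serial`, and `horner_wide4` are hypothetical names, the width is 4 rather than 8 for brevity, and `gf_mul` is plain bit-serial multiplication in GF(2^128) modulo x^128 + x^127 + x^126 + x^121 + 1, without the extra x^-128 factor that POLYVAL's `dot` operation includes.

```rust
/// Low 128 bits of the reduction polynomial x^128 + x^127 + x^126 + x^121 + 1.
const POLY_LOW: u128 = (1 << 127) | (1 << 126) | (1 << 121) | 1;

/// Bit-serial multiplication in GF(2^128) (bit i is the coefficient of x^i).
/// Simplification: this is plain field multiplication, not POLYVAL's `dot`.
fn gf_mul(a: u128, b: u128) -> u128 {
    let mut acc = 0u128;
    let mut a = a;
    for i in 0..128 {
        if (b >> i) & 1 == 1 {
            acc ^= a;
        }
        // Multiply `a` by x and reduce if a degree-128 term appears.
        let carry = a >> 127;
        a <<= 1;
        if carry == 1 {
            a ^= POLY_LOW;
        }
    }
    acc
}

/// One block at a time: acc = (acc + block) * h.
fn horner_serial(h: u128, blocks: &[u128]) -> u128 {
    blocks.iter().fold(0u128, |acc, &b| gf_mul(acc ^ b, h))
}

/// Four blocks at a time using precomputed powers of h:
/// acc = (acc + b0)*h^4 + b1*h^3 + b2*h^2 + b3*h.
fn horner_wide4(h: u128, blocks: &[u128]) -> u128 {
    let h2 = gf_mul(h, h);
    let h3 = gf_mul(h2, h);
    let h4 = gf_mul(h3, h);
    let mut acc = 0u128;
    let mut chunks = blocks.chunks_exact(4);
    for c in &mut chunks {
        // The four products are independent of each other.
        acc = gf_mul(acc ^ c[0], h4) ^ gf_mul(c[1], h3) ^ gf_mul(c[2], h2) ^ gf_mul(c[3], h);
    }
    // Fall back to the serial step for any trailing blocks.
    for &b in chunks.remainder() {
        acc = gf_mul(acc ^ b, h);
    }
    acc
}
```

Because the four products in each iteration do not depend on one another, a SIMD backend can issue them back to back and, typically, defer the reduction until the partial products have been summed. The sketch takes all blocks as one slice, which corresponds to the contiguous-buffer case raised earlier in the thread.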


polyval: implement Karatsuba multiplication for arm64

704f074

Improves performance by ~200 MB/s on a 2020 M1. Signed-off-by: Eric Lagergren <[email protected]>

tarcieri reviewed Jun 23, 2023

polyval/src/backend/pmull.rs (outdated, resolved)

use #[inline] and #[target_feature]

9a56be1

Signed-off-by: Eric Lagergren <[email protected]>

tarcieri approved these changes Jun 23, 2023


tarcieri merged commit 973fe29 into RustCrypto:master Jun 23, 2023

baloo added a commit to baloo/universal-hashes that referenced this pull request Mar 3, 2024

polyval v0.6.2

0b1de2e

Added - add `new_with_init_block` (RustCrypto#195) Changed - implement Karatsuba multiplication for arm64 (RustCrypto#181)

baloo mentioned this pull request Mar 3, 2024

polyval v0.6.2 #197

Merged

tarcieri pushed a commit that referenced this pull request Mar 3, 2024

polyval v0.6.2 (#197)

4e7d04f

Added - add `new_with_init_block` (#195) Changed - implement Karatsuba multiplication for arm64 (#181)
