Enable sha1 and sha2 AArch64 extensions from asm-hashes #97
Thank you for your PRs! IIUC your assembly code uses instructions from the One thing about which I am not sure is usage of One alternative could be to gate your asm on
I originally wanted to use intrinsics, but I’d like the xmpp-rs project to work on stable too (and be fast at that). I’m fine with doing that work again using intrinsics once a path towards stabilisation of AArch64 intrinsics has been decided. Also note that intrinsics aren’t inherently safer than assembly. What do you mean by “code which does not use the extension instructions”? Something which is approximately as slow as the Rust version?
Yeah no, I’d rather have this enabled the same way as on x86, to simplify its usage. We can migrate to the
Yeah, let's go with your approach after adding the appropriate gates and replace it sometime in the future after stabilization of the AArch64 intrinsics.
> What do you mean by “code which does not use the extension instructions”? Something which is approximately as slow as the Rust version?
Roughly, yes. IIRC even the current not-so-optimal assembly is a bit faster than the pure Rust version, so I think using OpenSSL assembly will result in a performance boost even when the `crypto` extension is not available. See RustCrypto/asm-hashes#5 for more details.
@@ -0,0 +1,11 @@
// From sys/auxv.h
const AT_HWCAP: u64 = 16;
const HWCAP_SHA2: u64 = 64;
BTW is it possible that a processor supports SHA-2 instructions, but not SHA-1 and/or AES? I thought they are part of the same `crypto` extension.
The “crypto extension” is an ARM marketing term for a Cortex-A core bundling these three extensions, and possibly the sha512, sha3, and more in the future.
I’m not aware of any actual CPU exposing e.g. the sha2 flag but not sha1, but given they are different flags I’d rather test them properly.
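For illustration, testing the two flags separately could look roughly like the sketch below. It assumes Linux on AArch64 with glibc; `HWCAP_SHA1` is bit 5 in the kernel's AArch64 `asm/hwcap.h`, next to the `HWCAP_SHA2` value from the diff. The helper names (`has_flag`, `read_hwcap`, `sha1_supported`, `sha2_supported`) are made up for this example and are not the PR's actual code.

```rust
// Bit positions from the Linux AArch64 asm/hwcap.h.
const HWCAP_SHA1: u64 = 1 << 5; // 32
const HWCAP_SHA2: u64 = 1 << 6; // 64, same value as in the diff

/// Pure helper (hypothetical name): is `flag` set in the `hwcap` word?
fn has_flag(hwcap: u64, flag: u64) -> bool {
    hwcap & flag != 0
}

/// Reads the hardware capability word through glibc's getauxval().
#[cfg(all(target_os = "linux", target_arch = "aarch64"))]
fn read_hwcap() -> u64 {
    // From sys/auxv.h
    const AT_HWCAP: u64 = 16;
    extern "C" {
        // `unsigned long` is 64 bits wide on AArch64 Linux.
        fn getauxval(kind: u64) -> u64;
    }
    unsafe { getauxval(AT_HWCAP) }
}

/// Assumption for this sketch: on every other target, report no flags
/// so that callers fall back to the portable implementation.
#[cfg(not(all(target_os = "linux", target_arch = "aarch64")))]
fn read_hwcap() -> u64 {
    0
}

/// Each extension is checked on its own bit, never inferred from the other.
fn sha1_supported() -> bool {
    has_flag(read_hwcap(), HWCAP_SHA1)
}

fn sha2_supported() -> bool {
    has_flag(read_hwcap(), HWCAP_SHA2)
}
```

With this split, a hypothetical CPU that sets only the sha2 bit would enable the SHA-256 assembly while SHA-1 keeps using the Rust implementation.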
Since we are using `getauxval` it does not matter, but during migration to intrinsics we will have to use more coarse-grained target features, which only know about the `crypto` extension. But I guess since LLVM does not introduce separate finer-grained features it should be fine.
How does it work for e.g. the sha3 flag?
Honestly, I don't quite get how LLVM handles features, e.g. in here they write:
> Extensions can now have different meanings based on the base architecture they apply to. For example on AArch64, 'crypto' means different things for v8.{1,2,3}-a than v8.4-a. The former adds 'sha2' and 'aes', the latter adds those and 'sm4' and 'sha3' on top.
I wonder why they use `crypto` instead of finer-grained `sha1`, `sha2`, `sha512`, `sha3` and `aes` features...
They apparently noticed they made a mistake and now have aliased crypto to sha2 + aes, see llvm-mirror/llvm@6634844#diff-5ff776706657bdcfc45aa7d7159bfeadR45
We should be able to use each feature separately nowadays.
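As a rough sketch of what per-feature detection might look like after a move away from `getauxval`: on toolchains where `std::arch::is_aarch64_feature_detected!` is available, each feature can be queried by name. The wrapper names `sha2_available` and `aes_available` are hypothetical, and the `false` fallback on other architectures is an assumption of this example.

```rust
/// Hypothetical wrapper: runtime check for the `sha2` feature on AArch64.
#[cfg(target_arch = "aarch64")]
fn sha2_available() -> bool {
    std::arch::is_aarch64_feature_detected!("sha2")
}

/// Hypothetical wrapper: runtime check for the `aes` feature on AArch64.
#[cfg(target_arch = "aarch64")]
fn aes_available() -> bool {
    std::arch::is_aarch64_feature_detected!("aes")
}

/// Assumption: other architectures never take the AArch64 code path.
#[cfg(not(target_arch = "aarch64"))]
fn sha2_available() -> bool {
    false
}

#[cfg(not(target_arch = "aarch64"))]
fn aes_available() -> bool {
    false
}
```

Querying `sha2` and `aes` independently avoids relying on whatever the umbrella `crypto` name happens to mean for a given base architecture.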
Force-pushed from 8737a0f to 1bcb1af.
I have also added an ARM worker to the CI; it only fails because there has been no sha1-asm and sha2-asm release containing my code yet (so it wouldn’t work anyway even if you merged it).
Force-pushed from 227cc2c to 078c1b4.
This only adds AArch64 support, everything else should be identical, so we can make a patch release instead of a minor one. RustCrypto/hashes#97 depends on that release.
Once RustCrypto/asm-hashes#14 is merged the CI should be green again, after which I think this PR is good to merge. :)
The feature gate is specific to glibc on Linux, using getauxval() to check the CPU flags, so it’s currently not used on other systems. Patches welcome. :) Enables RustCrypto/asm-hashes#10
The feature gate is specific to glibc on Linux, using getauxval() to check the CPU flags, so it’s currently not used on other systems. Patches welcome. :) The sha2 extension only provides SHA-256 instructions, so keep using the plain Rust implementation for SHA-512 until someone adds sha512 extension assembly (I have no computer to test that with).
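The selection rule described in these commit messages can be sketched as follows: SHA-256 picks the assembly only when the sha2 flag is set, while SHA-512 always stays on the plain Rust code. `Backend`, `sha256_backend` and `sha512_backend` are hypothetical names for this illustration; the real crate wires the choice into its compression functions instead.

```rust
// From the diff: the sha2 bit in the AT_HWCAP word.
const HWCAP_SHA2: u64 = 64;

/// Hypothetical marker for which implementation gets used.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    /// AArch64 assembly from sha2-asm.
    Asm,
    /// Portable Rust implementation.
    Soft,
}

/// SHA-256 can use the assembly when the CPU advertises the sha2 flag.
fn sha256_backend(hwcap: u64) -> Backend {
    if hwcap & HWCAP_SHA2 != 0 {
        Backend::Asm
    } else {
        Backend::Soft
    }
}

/// The sha2 extension has no SHA-512 instructions, so the flag is ignored
/// and the Rust implementation is always selected.
fn sha512_backend(_hwcap: u64) -> Backend {
    Backend::Soft
}
```

Passing the capability word in as a parameter keeps the rule itself testable on machines that are not AArch64.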
FYI: crate updates are published now!
This other crate is being maintained, it offers slightly better performances (when using the `asm` feature, especially [on AArch64](RustCrypto/hashes#97)). It also allows deduplicating SHA-1 crates in cargo-web.
This other crate is being maintained, it offers better performances (when using the `asm` feature, especially [on AArch64](RustCrypto/hashes#97)). It also allows deduplicating SHA-1 crates in cargo-web.
This other crate is being maintained, and it offers better performances when using the `asm` feature (especially [on AArch64](RustCrypto/hashes#97)). Once websockets-rs/rust-websocket#251 is merged, it will also allow removing this extra crate from the build.
* Replace sha1 dependency with sha-1 This other crate is being maintained, and it offers better performances when using the `asm` feature (especially [on AArch64](RustCrypto/hashes#97)). * Update CHANGES.md with the sha-1 migration * Add a test for hash_key()
This follows RustCrypto/asm-hashes#10
`getauxval()` is extremely cheap, as it is just a pointer to the beginning of the stack, so it doesn’t make much sense to use a `lazy_static!()` to remember the flags (at least I can’t see any difference while profiling).
Sorry about #96, I force-pushed from the wrong repository and can’t reopen the PR due to that…
Here are some benchmarks on my Nintendo Switch:
Without this patch (or without `--features=asm`):
With this patch: