-
Notifications
You must be signed in to change notification settings - Fork 5k
Add XxHash128 #77944
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add XxHash128 #77944
Conversation
Tagging subscribers to this area: @dotnet/area-system-security, @vcsjones Issue DetailsFixes #77885 Lots of changes in XxHash3 because I had to extract a good chunk of the shared implementation for XXH3 to a shared class For large buffers, performance is on par with You will notice that I have introduced constants for all the secrets (e.g I have added similar tests to XxHash3, by using the native implementation to generate the expected test results. This PR will have to wait for #77881 to be merged (and will have to be rebased, because XxHash3 file will conflict).
cc: @stephentoub, @EgorBo
|
# Conflicts: # src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash3.cs
Commit 548c2e8 is fixing the performance difference now on part with the C++ version. Should have done before submitting this PR! 😅 |
I've added the NO MERGE label temporarily since this hasn't yet gone through API review. Once that happens, anybody should feel free to remove that label and continue with the standard PR review. :) (Normal practice is to close PRs where API review hasn't yet taken place, but given that this API is likely to go through as-is I don't think closing this PR is warranted.) |
I ran the tests on a ARM64 and apart the optimized versions that is using vectorized HW intrinsincs, most of the scalar paths are quite slower (from 30% to 2x times slower). I haven't checked the generated code, but the ARM64 JIT is definitely struggling there. That would be a good playground to optimize things further for ARM64 actually?
|
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
Thanks for the PR review! I'm on a business travel this week but I will get back to this PR hopefully next week. |
The API was approved. As were the one-shot helpers in #76279. When you swing back around to working on this, can you please add those HashToUInt128 and GetCurrentHashAsUInt128 members as well? Thanks. |
Co-authored-by: Stephen Toub <[email protected]>
Co-authored-by: Stephen Toub <[email protected]>
Co-authored-by: Stephen Toub <[email protected]>
I'm not sure about how to add the new API to ref. Is it correct to use #if/#endif with public sealed partial class XxHash128 : System.IO.Hashing.NonCryptographicHashAlgorithm
{
public XxHash128() : base (default(int)) { }
public XxHash128(long seed) : base (default(int)) { }
public override void Append(System.ReadOnlySpan<byte> source) { }
protected override void GetCurrentHashCore(System.Span<byte> destination) { }
#if NET7_0_OR_GREATER
[System.CLSCompliantAttribute(false)] public UInt128 GetCurrentHashAsUInt128() { throw null; }
#endif
public static byte[] Hash(byte[] source) { throw null; }
public static byte[] Hash(byte[] source, long seed) { throw null; }
public static byte[] Hash(System.ReadOnlySpan<byte> source, long seed = (long)0) { throw null; }
public static int Hash(System.ReadOnlySpan<byte> source, System.Span<byte> destination, long seed = (long)0) { throw null; }
#if NET7_0_OR_GREATER
[System.CLSCompliantAttribute(false)] public static UInt128 HashToUInt128(ReadOnlySpan<byte> source, long seed = 0) { throw null; }
#endif
public override void Reset() { }
public static bool TryHash(System.ReadOnlySpan<byte> source, System.Span<byte> destination, out int bytesWritten, long seed = (long)0) { throw null; }
} |
We generally split it across files, e.g. |
# Conflicts: # src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash3.cs
I have merged latest master and fixed the ref file. (2 pipelines are failing but with a weird error on Azure, but don't think it's coming from this PR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
src/libraries/System.IO.Hashing/ref/System.IO.Hashing.netcoreapp.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.IO.Hashing/src/System/IO/Hashing/XxHash128.cs
Outdated
Show resolved
Hide resolved
…pp.cs Co-authored-by: Stephen Toub <[email protected]>
…8.cs Co-authored-by: Stephen Toub <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@@ -8,6 +8,10 @@ | |||
<Compile Include="System.IO.Hashing.cs" /> | |||
</ItemGroup> | |||
|
|||
<ItemGroup Condition="'$(TargetFramework)' == '$(NetCoreAppCurrent)'"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It would have been better to condition on a compatibility check / version check:
<ItemGroup Condition="$([MSBuild]::IsTargetFrameworkCompatible('$(TargetFramework)', 'net7.0'))">
to not lose the API on net7.0 when $(NetCoreAppCurrent)
becomes net8.0. This just started failing in the PR that upgrades to net8.0: https://github.com/dotnet/runtime/pull/78354/checks?check_run_id=9846188314.
I will change this in our PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
Looks like this commit caused CI failures on the big-endian s390x target:
I think the endian fixes provided by @stephentoub in #78084 got lost in the move from |
Ouch, indeed, my apologize for this regression, I pushed the PR #79186 to backport the fix. |
Fixes #77885
Lots of changes in XxHash3 because I had to extract a good chunk of the shared implementation for XXH3 to a shared class
XxHashShared
.Performance wise, it is overall under 10% slower compared to C++, but also sometimes 10% faster. On the large item, it is closer to 30% slower, but I noticed that the warmup was only 10% slower, while the actual run was getting slower, not sure if it is a tiered compilation issue.
You will notice that I have introduced constants for all the secrets (e.g
DefaultSecretUInt64_0
), as they are shared between XXH3-64 and XXH3-128 (before they were inlined and hardcoded).I have added similar tests to XxHash3, by using the native implementation to generate the expected test results.
cc: @stephentoub, @EgorBo