-
-
Notifications
You must be signed in to change notification settings - Fork 2
Couple of perf improvements #275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4b3e37e to
0e6afa7
Compare
| if nbBits >= MASK.len() as u32 { | ||
| unsafe { std::hint::unreachable_unchecked() }; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, we'd need to justify this right? probably we just inline this function altogether and then it's obvious?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did have to recursively inline look_bits and read_bits too then. And even then some of the callers of read_bits take the bit count from a field whose range is bounds checked in an entirely separate part of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I looked at the wrong thing. So, if we can't justify it, it should be behind the new feature flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to do that. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgot to push?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, forgot --force-with-lease.
0e6afa7 to
2623836
Compare
Inclusive ranges don't optimize well. Iterating in the forward direction seems to be every so slightly cheaper.
2623836 to
e092ff1
Compare
Codecov Report✅ All modified and coverable lines are covered by tests.
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 12 files with indirect coverage changes 🚀 New features to boost your workflow:
|
e092ff1 to
007e418
Compare
This roughly halves the perf hit compared with the C original.
This also introduces a unsafe-nochecks feature to disable a couple of very hot bounds checks and get rid of a mask operation inserted by rustc for a shift in
get_middle_bits. The latter may be possible to get rid without unsafe code of in the future once rust gets pattern types.