-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
miri: optimize zeroed alloc #136035
base: master
Are you sure you want to change the base?
miri: optimize zeroed alloc #136035
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @wesleywiser (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
Some changes occurred to the CTFE machinery cc @rust-lang/wg-const-eval The Miri subtree was changed cc @rust-lang/miri Some changes occurred to the CTFE / Miri interpreter cc @rust-lang/miri, @rust-lang/wg-const-eval Some changes occurred to the CTFE / Miri interpreter cc @rust-lang/miri |
This comment has been minimized.
This comment has been minimized.
r? miri |
Sorry, I'm not sure how I closed this – misclick? |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
…c, r=<try> miri: optimize zeroed alloc When allocating zero-initialized memory in MIR interpretation, rustc allocates zeroed memory, marks it as initialized and then re-zeroes it. Remove the last step. I don't expect this to have much of an effect on performance normally, but in my case in which I'm creating a large allocation via mmap miri is unusable without this. There's probably a better way – with less code duplication – to implement this. Maybe adding a zero_init flag to the relevant methods, but then `Allocation::uninit` & co need a new name :)
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
This is not gonna show up in perf. No code path outside miri is changed |
We should definitely explore ways to do this with less code duplication. :) |
Finished benchmarking commit (837b710): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary -2.2%, secondary 2.2%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary -1.6%, secondary -0.2%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 771.261s -> 771.165s (-0.01%) |
Changed 👍 |
@@ -289,7 +291,7 @@ impl<'tcx, M: Machine<'tcx>> InterpCx<'tcx, M> { | |||
|
|||
// For simplicities' sake, we implement reallocate as "alloc, copy, dealloc". | |||
// This happens so rarely, the perf advantage is outweighed by the maintenance cost. | |||
let new_ptr = self.allocate_ptr(new_size, new_align, kind)?; | |||
let new_ptr = self.allocate_ptr(new_size, new_align, kind, zero_init)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we'll memcopy right over the alloc, zeroing is not needed for realloc, just use false here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The memcopy only initializes part of the allocation, but mremap needs everything to be initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment explaining this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also seems worth saying that we assume here that zeroing the entire allocation is more efficient than zeroing just the parts not copied from the old allocation.
@@ -315,8 +320,8 @@ impl<Prov: Provenance, Bytes: AllocBytes> Allocation<Prov, (), Bytes> { | |||
|
|||
/// Try to create an Allocation of `size` bytes, failing if there is not enough memory | |||
/// available to the compiler to do so. | |||
pub fn try_uninit<'tcx>(size: Size, align: Align) -> InterpResult<'tcx, Self> { | |||
Self::uninit_inner(size, align, || { | |||
pub fn try_new<'tcx>(size: Size, align: Align, zero_init: bool) -> InterpResult<'tcx, Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bools are a bit icky, as the call sites don't make it clear what the bool means. Maybe an enum with Uninit
and Zeroed
variants would be better? What do you think @RalfJung
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not happy with the unlabeled bools. An additional (very simple) possibility could be to use comments – this is used by existing code: mem_copy(ptr, new_ptr.into(), old_size.min(new_size), /*nonoverlapping*/ true)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah a dedicated enum
makes sense.
When allocating zero-initialized memory in MIR interpretation, rustc allocates zeroed memory, marks it as initialized and then re-zeroes it. Remove the last step.
I don't expect this to have much of an effect on performance normally, but in my case in which I'm creating a large allocation via mmap it gets in the way.