Avoid double allocation when passing strings via IntoParam
#1713
Changes from 8 commits
63255d4
679a5af
db39bae
8f3dbe1
6efc6e7
4b32a59
3cf2c92
ca3f255
e3bca2b
0ad384c
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,20 +32,42 @@ pub unsafe fn heap_free(ptr: RawPtr) { | |
} | ||
} | ||
|
||
/// Copy a slice of `T` into a freshly allocated buffer with an additional default `T` at the end. | ||
/// Copy an iterator of `T` into a freshly allocated buffer with an additional default `T` at the end. | ||
/// | ||
/// Returns a pointer to the beginning of the buffer | ||
/// Returns a pointer to the beginning of the buffer. This pointer must be freed when done using `heap_free`. | ||
/// | ||
/// # Panics | ||
/// | ||
/// This function panics if the heap allocation fails or if the pointer returned from | ||
/// the heap allocation is not properly aligned to `T`. | ||
pub fn heap_string<T: Copy + Default + Sized>(slice: &[T]) -> *const T { | ||
unsafe { | ||
let buffer = heap_alloc((slice.len() + 1) * std::mem::size_of::<T>()).expect("could not allocate string") as *mut T; | ||
assert!(buffer.align_offset(std::mem::align_of::<T>()) == 0, "heap allocated buffer is not properly aligned"); | ||
buffer.copy_from_nonoverlapping(slice.as_ptr(), slice.len()); | ||
buffer.add(slice.len()).write(T::default()); | ||
buffer | ||
/// This function panics if the heap allocation fails, the alignment requirements of 'T' surpass | ||
/// 8 (HeapAlloc's alignment) or if len is less than the number of items in the iterator. | ||
pub fn string_from_iter<I, T>(iter: I, len: usize) -> *const T | ||
where | ||
I: Iterator<Item = T>, | ||
T: Copy + Default, | ||
{ | ||
// alignment of memory returned by HeapAlloc is at least 8 | ||
// Source: https://docs.microsoft.com/en-us/windows/win32/api/heapapi/nf-heapapi-heapalloc | ||
// Ensure that T has sufficient alignment requirements | ||
assert!(std::mem::align_of::<T>() <= 8, "T alignment surpasses HeapAlloc alignment"); | ||
|
||
let len = len + 1; | ||
let ptr = heap_alloc(len * std::mem::size_of::<T>()).expect("could not allocate string") as *mut T; | ||
let mut encoder = iter.chain(core::iter::once(T::default())); | ||
|
||
for i in 0..len { | ||
// SAFETY: ptr points to an allocation object of size `len`, indices accessed are always lower than `len` | ||
unsafe { | ||
core::ptr::write( | ||
ptr.add(i), | ||
nit: can you use
Of course! My bad.
Pardon, but how do I do that? The zip iterator consumes both inputs and only yields elements if both iterators have an element. What can I get from the zipped iterator besides None afterwards?
||
match encoder.next() { | ||
Some(encoded) => encoded, | ||
None => break, | ||
It might be useful to do the following here instead of the assert in the code a few lines later: debug_assert!(i == len - 1); Essentially, while this code is always safe (i.e., we'll never try to write to unallocated memory), if the iterator's length and
Do you mean
I meant
||
}, | ||
); | ||
} | ||
Make this a bit clearer perhaps:
for (offset, c) in (0..len).zip(encoder) {
    // SAFETY: ptr points to an allocation object of size `len`, indices accessed are always lower than `len`
    unsafe { core::ptr::write(ptr.add(offset), c); }
}
This looks much better! Unfortunately putting the encoder into assert!(encoder.next().is_none(), "encoder returned more characters than expected"); Your version does look much better and I'd take it in a heartbeat, but we'd need a non-consuming version of zip unfortunately.
It would be unusual, but is it invalid to request encoding fewer characters than there are?
Technically in my opinion it is impossible, but since there was uncertainty in the reviews about silent truncation I kept it in. It could be possible that somehow an invalid length gets passed in the future, so it'd be helpful to be aware via a panic rather than silent truncation. I'm not hugely opinionated on this issue, we can go for the more elegant
I wasn't worried about truncation, I was worried about writing past the allocated memory with the unsafe
You can store the zipped iterator into a local variable and still do the check afterwards if you want.
The
||
} | ||
|
||
assert!(encoder.next().is_none(), "encoder returned more characters than expected"); | ||
Won't this be fused because the chaining of
Nice catch, I didn't consider that.
I'd also like to avoid panics in general. I have customers for whom this is inappropriate. If it means we have to embed this inside
Should this panic:
That should never happen. My concern is that we're trying to harden a function that is only ever used internally, so if there's a general concern about the safety of this function then we can either mark it unsafe or just get rid of the function entirely.
||
|
||
ptr | ||
} |
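The review thread above asks whether the zip-based loop can coexist with the "encoder returned more characters than expected" check. One possible way, sketched here under stated assumptions, is to zip over `encoder.by_ref()` so the encoder is only borrowed and can still be polled afterwards. This is not the code from the PR: `copy_with_terminator` and `demo` are hypothetical names, and `std::alloc` stands in for the Windows-specific `heap_alloc` so the sketch runs on any platform.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::iter::once;

/// Hypothetical stand-in for `string_from_iter`: copies the items of `iter`
/// followed by one `T::default()` into a buffer with room for `len + 1` items.
/// Panics if the iterator yields more than `len` items.
fn copy_with_terminator<I, T>(iter: I, len: usize) -> (*mut T, Layout)
where
    I: Iterator<Item = T>,
    T: Copy + Default,
{
    // Assumption: zero-sized types are rejected, since `alloc` requires a non-zero size.
    assert!(std::mem::size_of::<T>() != 0, "zero-sized T is not supported");

    let total = len + 1;
    let layout = Layout::array::<T>(total).expect("layout overflow");
    // Assumption: `std::alloc::alloc` replaces `heap_alloc` for portability.
    let ptr = unsafe { alloc(layout) }.cast::<T>();
    assert!(!ptr.is_null(), "could not allocate buffer");

    let mut encoder = iter.chain(once(T::default()));

    // The range comes first in the zip, so once it is exhausted the encoder
    // is not advanced again and any surplus items stay observable below.
    for (offset, c) in (0..total).zip(encoder.by_ref()) {
        // SAFETY: `offset < total`, and the allocation holds `total` items.
        unsafe { ptr.add(offset).write(c) };
    }

    // Note: the buffer is leaked if this assertion fires; a real
    // implementation would need to free it first.
    assert!(encoder.next().is_none(), "encoder returned more characters than expected");
    (ptr, layout)
}

/// Encodes `s` as UTF-16 with a trailing 0 and reads the result back.
fn demo(s: &str) -> Vec<u16> {
    // `s.len()` is the UTF-8 byte length, an upper bound on the UTF-16 unit count.
    let (ptr, layout) = copy_with_terminator(s.encode_utf16(), s.len());
    let written = s.encode_utf16().count() + 1;
    // SAFETY: the first `written` items were initialized by the loop above.
    let out = unsafe { std::slice::from_raw_parts(ptr, written) }.to_vec();
    // SAFETY: `ptr` was allocated with exactly this layout.
    unsafe { dealloc(ptr.cast::<u8>(), layout) };
    out
}
```

The order of the zip operands matters in this sketch: with the encoder first, `zip` would pull one extra item from it before noticing that the range is exhausted, and the final check could miss a surplus item.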
nit: it always bothered me a bit that we're using the term `string` here when this function is more general than that. It might be nice to name it in a way that describes more closely what's actually happening. In fact, it might make sense for this to only copy an iterator and the caller is responsible for adding the trailing null byte. This function would then lose the `Default` bound and the caller would call it like so:
The caller is a bit more verbose, but it's way clearer what's actually happening.
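The snippet that accompanied this comment did not survive in the text above. As a rough illustration of the idea only, and not the reviewer's actual code, the sketch below shows a copy function without the `Default` bound, with the caller chaining the terminator itself. The names `copy_from_iter` and `utf16_with_nul` are hypothetical, and `std::alloc` is assumed in place of `heap_alloc`.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::iter::once;

/// Hypothetical general-purpose copy: writes at most `len` items from `iter`
/// into a fresh buffer. No terminator is added, so `T: Default` is not needed.
fn copy_from_iter<I, T>(iter: I, len: usize) -> (*mut T, Layout)
where
    I: Iterator<Item = T>,
    T: Copy,
{
    // Assumption: empty and zero-sized allocations are rejected to keep `alloc` sound.
    assert!(len > 0 && std::mem::size_of::<T>() != 0, "empty allocation");

    let layout = Layout::array::<T>(len).expect("layout overflow");
    let ptr = unsafe { alloc(layout) }.cast::<T>();
    assert!(!ptr.is_null(), "could not allocate buffer");

    for (offset, item) in (0..len).zip(iter) {
        // SAFETY: `offset < len`, and the allocation holds `len` items.
        unsafe { ptr.add(offset).write(item) };
    }
    (ptr, layout)
}

/// Caller side: the trailing 0 is added explicitly with `chain(once(0))`.
fn utf16_with_nul(s: &str) -> Vec<u16> {
    // UTF-8 byte length bounds the UTF-16 unit count; +1 for the terminator.
    let len = s.len() + 1;
    let (ptr, layout) = copy_from_iter(s.encode_utf16().chain(once(0)), len);
    let written = s.encode_utf16().count() + 1;
    // SAFETY: the first `written` items were initialized by `copy_from_iter`.
    let out = unsafe { std::slice::from_raw_parts(ptr, written) }.to_vec();
    // SAFETY: `ptr` was allocated with exactly this layout.
    unsafe { dealloc(ptr.cast::<u8>(), layout) };
    out
}
```

In this shape the terminator is visible at the call site, which is the trade-off the comment describes: slightly more code for the caller, but the copy function no longer needs to know anything about strings.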