RFC: Add a replace_with method to Option #2490

frol · 2018-06-29T07:17:30Z

Add the method Option::replace_with to the core library.

This RFC proposes the addition of Option::replace_with to compliment Option::replace (RFC #2296) and Option::take methods. It replaces the actual value in the option with the value returned from a closure given as a parameter, while the old value is passed into the closure.

Rendered

frol · 2018-06-29T10:03:44Z

It would be ideal to implement a proper optimization instead of introducing this Option::replace_with method, so the naive implementation ideally should produce the optimized code without a need to use special methods like replace_with.

I have initially attempted to dig into the optimizations, but soon I realized that I don't have enough expertise yet to do that. If someone can mentor me, I would eagerly dive into the topic!

UPDATE: I moved my findings into a StackOverflow question.

I would love to learn more details about optimizations, so your help is appreciated!

diwic · 2018-06-29T19:34:21Z

    let mut old_value = unsafe { mem::uninitialized() };
    mem::swap(self, &mut old_value);
    let mut new_value = f(old_value);
    mem::swap(self, &mut new_value);
    mem::forget(new_value);

The proposed implementation does not look panic safe to me, i e, in case f panics, then self will be left uninitialized, which is unsound (and will likely lead to segfaults later on). I guess it could work if old_value was initialized to None instead - but would that give the same performance?

frol · 2018-06-29T19:42:41Z

@diwic Well, initialization with None only adds a single mov qword ptr [r14 + 8], 0 instruction, that does not hurt the performance at all (yet I opted to avoid it using uninitialized, though I admit that I have no experience with the drawbacks uninitialized may introduce).

Centril · 2018-06-30T03:34:37Z

I would rewrite this as:

    *self = f(mem::replace(self, None));

which is safe and depends directly on zero unsafe operations.

diwic · 2018-06-30T06:33:28Z

@Centril and *self = f(self.take()) is even shorter, but since that does not optimize well enough, I'd assume your version won't either...?

Centril · 2018-06-30T06:37:21Z

@diwic No idea; I'll leave it up to @frol to benchmark this :)

*self = with(self.take()) looks great!

frol · 2018-06-30T06:48:27Z

@diwic is right. If *self = f(self.take()) would work, there would be no need in this Option::replace_with at all as it is mentioned in the drawbacks section.

I initially started with the following implementation:

        let mut new_value = f(self.take());
        mem::swap(self, &mut new_value);
        // Since self was None after take(), new_value holds None here after swap(),
        // so we can forget about it.
        mem::forget(new_value);

It is as performant as the proposed implementation but has one extra mov qword ptr [r14 + 8], 0 instruction, which I decided to avoid as well. Thus, even when I call swap for uninitialized new_value, it compiles down to just using the value of self and does not really replace it with uninitialized garbage in exchange, i.e. we may end up with double free in case of panic, but I am unsure about that due to lack of knowledge.

Centril · 2018-06-30T06:52:49Z

@frol So I primarily see the motivation for option.replace_with(|opt| ...) in that it is clearer wrt. intent even if it can be written as *option = (|opt| ...)(option.take()); which is itself short.

frol · 2018-06-30T06:59:14Z

@Centril To be honest, I would prefer fixing optimizer to handle = efficiently instead of this extra method, but I need someone to help me with the starting point if possible. I tried to dive into it myself, but I stuck with understanding whether this optimization belongs to MIR or LLVM or anything else...

scottmcm · 2018-07-03T03:51:10Z

The problem is not in replace_with, it's in take somewhere, probably actually in swap: https://play.rust-lang.org/?gist=4e1c4da9cefb3f17a187a7581e2da938&version=nightly&mode=release

With std's take:

	mov	rax, qword ptr [rdi]
	mov	rcx, rax
	shr	rcx, 32
	xor	edx, edx
	test	eax, eax
	setne	dl
	add	ecx, 1
	mov	dword ptr [rdi], edx
	mov	dword ptr [rdi + 4], ecx
	ret

With a different take:

	xor	eax, eax
	cmp	dword ptr [rdi], 0
	setne	al
	mov	dword ptr [rdi], eax
	add	dword ptr [rdi + 4], 1
	ret

Which is pretty good, other than the usual rust-lang/rust#49420 (comment).

I suspect swap is the culprit here because looking at the LLVM-IR for the former:

start:
  %1 = bitcast { i32, i32 }* %0 to i64*
  %2 = load i64, i64* %1, align 1, !alias.scope !0, !noalias !9

That align 1 is coming from the fact that the current swap implementation always swaps via *mut u8, which I suspect is what's making LLVM unwilling to optimize further here.

TL/DR: replace_with should have the obvious all-safe-code implementation, and any problems in the code that emits should be fixed elsewhere.

Edit: PR for swap that improves take: rust-lang/rust#52051

mem::swap the obvious way for types smaller than the SIMD optimization's block size LLVM isn't able to remove the alloca for the unaligned block in the post-SIMD tail in some cases, so doing this helps SRoA work in cases where it currently doesn't. Found in the `replace_with` RFC discussion. Examples of the improvements: <details> <summary>swapping `[u16; 3]` takes 1/3 fewer instructions and no stackalloc</summary> ```rust type Demo = [u16; 3]; pub fn swap_demo(x: &mut Demo, y: &mut Demo) { std::mem::swap(x, y); } ``` nightly: ```asm _ZN4blah9swap_demo17ha1732a9b71393a7eE: .seh_proc _ZN4blah9swap_demo17ha1732a9b71393a7eE sub rsp, 32 .seh_stackalloc 32 .seh_endprologue movzx eax, word ptr [rcx + 4] mov word ptr [rsp + 4], ax mov eax, dword ptr [rcx] mov dword ptr [rsp], eax movzx eax, word ptr [rdx + 4] mov word ptr [rcx + 4], ax mov eax, dword ptr [rdx] mov dword ptr [rcx], eax movzx eax, word ptr [rsp + 4] mov word ptr [rdx + 4], ax mov eax, dword ptr [rsp] mov dword ptr [rdx], eax add rsp, 32 ret .seh_handlerdata .section .text,"xr",one_only,_ZN4blah9swap_demo17ha1732a9b71393a7eE .seh_endproc ``` this PR: ```asm _ZN4blah9swap_demo17ha1732a9b71393a7eE: mov r8d, dword ptr [rcx] movzx r9d, word ptr [rcx + 4] movzx eax, word ptr [rdx + 4] mov word ptr [rcx + 4], ax mov eax, dword ptr [rdx] mov dword ptr [rcx], eax mov word ptr [rdx + 4], r9w mov dword ptr [rdx], r8d ret ``` </details> <details> <summary>`replace_with` optimizes down much better</summary> Inspired by rust-lang/rfcs#2490, ```rust fn replace_with<T, F>(x: &mut Option<T>, f: F) where F: FnOnce(Option<T>) -> Option<T> { *x = f(x.take()); } pub fn inc_opt(mut x: &mut Option<i32>) { replace_with(&mut x, |i| i.map(|j| j + 1)); } ``` Rust 1.26.0: ```asm _ZN4blah7inc_opt17heb0acb64c51777cfE: mov rax, qword ptr [rcx] movabs r8, 4294967296 add r8, rax shl rax, 32 movabs rdx, -4294967296 and rdx, r8 xor r8d, r8d test rax, rax cmove rdx, rax setne r8b or rdx, r8 mov qword ptr [rcx], rdx ret ``` Nightly (better thanks to ScalarPair, maybe?): ```asm _ZN4blah7inc_opt17h66df690be0b5899dE: mov r8, qword ptr [rcx] mov rdx, r8 shr rdx, 32 xor eax, eax test r8d, r8d setne al add edx, 1 mov dword ptr [rcx], eax mov dword ptr [rcx + 4], edx ret ``` This PR: ```asm _ZN4blah7inc_opt17h1426dc215ecbdb19E: xor eax, eax cmp dword ptr [rcx], 0 setne al mov dword ptr [rcx], eax add dword ptr [rcx + 4], 1 ret ``` Where that add is beautiful -- using an addressing mode to not even need to explicitly go through a register -- and the remaining imperfection is well-known (rust-lang#49420 (comment)). </details>

frol · 2018-07-24T16:01:30Z

@scottmcm FYI, I have tried the latest Rust nightly (6a1c0637c 2018-07-23), which includes the patch from PR rust-lang/rust#52051 and even though I see that your example snippet has been improved with the patch, there is no improvement for my use-case.

Given that the proposed optimization was concluded to be irrelevant for the implementation, and the fact that the proposed method basically duplicates the already existing way to do this operation in an obvious way (a = f(a.take())), I am closing this issue and will dig further into the optimizations space.

frol · 2019-01-31T18:16:01Z

UPDATE: There was a relevant RFC about the ideas I describe below, so feel free to ignore my message.

I have just had a conversation where this replace_with idea had bubbled up in yet another context, that is how to "toggle" enum variants when there is a non-Copy value inside:

#[derive(Debug)]
struct Bar;

#[derive(Debug)]
enum Foo {
	A(Bar),
	B(Bar),
}

#[derive(Debug)]
struct Baz {
	foo: Foo,
}

impl Baz {
    fn switch_variant_unsafe(&mut self) {
        let mut foo_temp: Foo = unsafe { ::std::mem::uninitialized() };
        ::std::mem::swap(&mut self.foo, &mut foo_temp);
        self.foo = match foo_temp {
            Foo::A(bar) => Foo::B(bar),
            Foo::B(bar) => Foo::A(bar),
        }
    }
    
    fn switch_variant_safe(&mut self) {
        self.foo = match self.foo {
            Foo::A(bar) => Foo::B(bar),
            Foo::B(bar) => Foo::A(bar),
        })
    }
}

fn main() {
    let mut baz = Baz { foo: Foo::A(Bar) };
    baz.foo = match baz.foo {
        Foo::A(bar) => Foo::B(bar),
        Foo::B(bar) => Foo::A(bar),
    };
    dbg!(&baz);
    baz.switch_variant_unsafe();
    dbg!(&baz);
    baz.switch_variant_safe();
    dbg!(&baz);
}

Playground

As is, you get a compilation error:

error[E0507]: cannot move out of borrowed content
  --> src/main.rs:26:26
   |
26 |         self.foo = match self.foo {
   |                          ^^^^^^^^
   |                          |
   |                          cannot move out of borrowed content
   |                          help: consider borrowing here: `&self.foo`
27 |             Foo::A(bar) => Foo::B(bar),
   |                    --- data moved here
28 |             Foo::B(bar) => Foo::A(bar),
   |                    --- ...and here
   |
note: move occurs because these variables have types that don't implement the `Copy` trait
  --> src/main.rs:27:20
   |
27 |             Foo::A(bar) => Foo::B(bar),
   |                    ^^^
28 |             Foo::B(bar) => Foo::A(bar),
   |                    ^^^

There is also a similar question on SO.

Here is the helper that I came up with (based on this RFC):

fn replace_with<T, F>(dest: &mut T, mut f: F) 
where
    F: FnMut(T) -> T,
{
    let mut old_value = unsafe { std::mem::uninitialized() };
    std::mem::swap(dest, &mut old_value);  // dest is "uninitialized" (in fact, it is not touched in release mode)
    let mut new_value = f(old_value);
    std::mem::swap(dest, &mut new_value);   // dest holds new_value, and new_value is "uninitialized"
    std::mem::forget(new_value);  // since it is "uninitialized", we forget about it
}

, and then we can implement switch_variant_safe as:

    fn switch_variant_safe(&mut self) {
        replace_with(&mut self.foo, |foo| match foo {
            Foo::A(bar) => Foo::B(bar),
            Foo::B(bar) => Foo::A(bar),
        })
    }

The generated assembly for switch_variant_unsafe and switch_variant_safe (with replace_with helper) is identical in release mode.

As to the unsoundness concerns raised in #2490 (comment), in release mode, replace_with gets compiled to:

example::replace_with:
        mov     al, byte ptr [rdi]
        not     al
        and     al, 1
        mov     byte ptr [rdi], al
        ret

in fact, it gets automatically inlined unless I put #inline(never):

        mov     al, byte ptr [rsp + 7]
        not     al
        and     al, 1

Thus, there is no unsoundness in the release mode. Yet, in debug mode, there is indeed an explicit uninitialized value gets assigned to the self.foo. I have tried to implement replace_with using raw pointers, but failed at the point that f needs to take ownership ("cannot move out of dereference of raw pointer"). I have also been pointed to std::panic::UnwindSafe, yet I haven't figured out what is the proper use of it.

Another way to implement enum variant "toggle" is to pass ownership to self and return Self:

    fn switch_variant_owned(mut self) -> Self {
        self.foo = match self.foo {
            Foo::A(bar) => Foo::B(bar),
            Foo::B(bar) => Foo::A(bar),
        };
        self
    }

, but it requires the API changes all the way down to the method (i.e. you have to use this API style all the way through your codebase if it is a low-level method).

Sidenote, while the generated assembly is mostly the same (there are some variables rearrangements), there is one interesting optimization gets applied when switch_variant_owned is implemented, that is xor al, 1 instead of not al; and al, 1.

I believe, there is a need for safe and sound std::mem::replace_with. What do you think?

P.S. This more generic std::mem::replace_with can easily resolve the need for Option::replace_with covered in this RFC. Instead of the proposed:

let mut some_option: Option<i32> = Some(123);

some_option.replace_with(|old_value| consume_option_i32_and_produce_option_i32(old_value));

I would write

let mut some_option: Option<i32> = Some(123);

std::mem::replace(&mut some_option, |old_value| consume_option_i32_and_produce_option_i32(old_value));

Ixrec · 2019-01-31T19:13:32Z

@frol Since it's unclear whether or not you're already aware of this: I believe that specific use case was a major part of the discussion around #1736

For those who aren't familiar with that discussion: The big sticking point that led to its closure was that apparently the only way to make a replace_with method panic-safe is to have it abort() if the closure ever panics, which is very ungreat.

frol · 2019-01-31T19:15:17Z

@Ixrec I was not aware of it. Thank you for pointing out in the right direction!

frol mentioned this pull request Jun 29, 2018

RFC: Add Option::replace to the core library #2296

Merged

Centril added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Jun 29, 2018

RFC: Add a replace_with method to Option

e43de83

frol force-pushed the option-replace-with branch from 31a6959 to e43de83 Compare June 29, 2018 13:55

scottmcm mentioned this pull request Jul 4, 2018

mem::swap the obvious way for types smaller than the SIMD optimization's block size rust-lang/rust#52051

Merged

frol closed this Jul 24, 2018

RFC: Add a replace_with method to Option #2490

RFC: Add a replace_with method to Option #2490

Uh oh!

Conversation

frol commented Jun 29, 2018

Uh oh!

frol commented Jun 29, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

diwic commented Jun 29, 2018

Uh oh!

frol commented Jun 29, 2018

Uh oh!

Centril commented Jun 30, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

diwic commented Jun 30, 2018

Uh oh!

Centril commented Jun 30, 2018

Uh oh!

frol commented Jun 30, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Centril commented Jun 30, 2018

Uh oh!

frol commented Jun 30, 2018

Uh oh!

scottmcm commented Jul 3, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

frol commented Jul 24, 2018

Uh oh!

frol commented Jan 31, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Ixrec commented Jan 31, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

frol commented Jan 31, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

frol commented Jun 29, 2018 •

edited

Loading

Centril commented Jun 30, 2018 •

edited

Loading

frol commented Jun 30, 2018 •

edited

Loading

scottmcm commented Jul 3, 2018 •

edited

Loading

frol commented Jan 31, 2019 •

edited

Loading

Ixrec commented Jan 31, 2019 •

edited

Loading