Skip to content

Add miri support #367

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
RalfJung opened this issue Jul 26, 2018 · 13 comments
Closed

Add miri support #367

RalfJung opened this issue Jul 26, 2018 · 13 comments
Labels
enhancement Something new the playground could do

Comments

@RalfJung
Copy link
Member

RalfJung commented Jul 26, 2018

Discussed briefly on IRC with @shepmaster: It would be great if the playground could run programs in miri. That's a MIR interpreter. One reason why one might want to do this is that miri will stop program execution when there is UB -- the checks are not complete yet, but more are coming. It will never be able to detect all kinds of UB (e.g. relying on how #[repr(Rust)] lays out data is hard to detect), but the idea is it should detect e.g. violation of the pointer aliasing rules.

Since I don't know how the playground works, I'll just give instructions for how to run miri manually. All of this has to happen with a nightly compiler.

Setup:

  • git clone https://github.com/solson/miri && cd miri
  • cargo install xargo
  • xargo/build.sh
  • Now xargo is no longer needed and could be uninstalled; the purpose was just to obtain the files that now live in ~/.xargo/HOST.
  • cargo install --all-features --force

Now, when you are in a crate, instead of cargo run you would do MIRI_SYSROOT=~/.xargo/HOST cargo miri -- -Zmiri-start-fn. That's it.

@shepmaster shepmaster added the enhancement Something new the playground could do label Jul 26, 2018
@shepmaster
Copy link
Member

One reason why one might want to do this is that miri will stop program execution when there is UB

For an arbitrary user coming to the playground, are there any other reasons they would want to run Miri? In the same vein, what's a 1-2 sentence of "marketing copy" that will accompany Miri in the menu:

image

This is all centered around "why should the playground expose Miri in the first place". Our bar has to be a bit higher than "it's cool" or "I like it" just because it's a forward-facing part of the Rust web presence, so I'm doing my due diligence.

(Is it OK to capitalize it as "Miri"?)

@RalfJung
Copy link
Member Author

RalfJung commented Jul 27, 2018

(Is it OK to capitalize it as "Miri"?)

Looking at https://github.com/solson/miri/, that seems to be common.

For an arbitrary user coming to the playground, are there any other reasons they would want to run Miri? In the same vein, what's a 1-2 sentence of "marketing copy" that will accompany Miri in the menu:

I can't think of another reason. I'm also not good at marketing. ;) Did you plan to put it in the tools menu? Since it's a way to run the program, I figured it'd be in the "run" menu. But I don't know how you are distributing tools over these menus.

Maybe something like

Run the program in the miri interpreter to detect many safety violations like out-of-bounds memory access.

Or should it say "to detect undefined behavior"? How strongly do we have to clarify that it cannot detect everything, but also that not every error is UB (it could just be that miri does not support this operation)?

We could probably also improve the error message to make it easier to interpret (I just opened rust-lang/miri#417 for that purpose).

This is all centered around "why should the playground expose Miri in the first place". Our bar has to be a bit higher than "it's cool" or "I like it" just because it's a forward-facing part of the Rust web presence, so I'm doing my due diligence.

Understood and appreciated. I would be willing to let this undergo some kind of process to make sure we have wider consensus. I just have no idea what the process is.^^

@shepmaster
Copy link
Member

Did you plan to put it in the tools menu?

Yes.

a way to run the program

But why would anyone ever want to run their code using Miri? You just stated that the only reason to use Miri is to check for UB. The fact that it runs the code to do so is irrelevant, much like the fact that Clippy uses cargo check to do its thing; these are just implementation details.

I don't know how you are distributing tools over these menus.

It's (always) in the process of being nailed down, of course, but right now anything in the Run menu is "built-in" to the compiler and tools are "add-ons".

some kind of process

Nah, I don't think it needs to be all that. The wrong answer would have been "so I can get bug reports from users", IMO.

@shepmaster
Copy link
Member

improve the error message to make it easier to interpret

In that realm, the output is really a wall of text, very hard to read. I think if you want people to easily drop in and start using Miri, you might need to pay some mind to that:

   Compiling playground v0.0.1 (file:///playground)
error[E0080]: constant evaluation error: pointer computed at offset 1, outside bounds of allocation 416 which has size 0
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr.rs:1383:9
     |
1383 |         intrinsics::offset(self, count) as *mut T
     |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pointer computed at offset 1, outside bounds of allocation 416 which has size 0
     |
note: inside call to `std::ptr::<impl *mut T><u8>::offset`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/slice/mod.rs:2079:15
     |
2079 |         &mut *slice.as_mut_ptr().offset(self as isize)
     |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside call to `<usize as std::slice::SliceIndex<[T]>><u8>::get_unchecked_mut`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/slice/mod.rs:405:9
     |
405  |         index.get_unchecked_mut(self)
     |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside call to `core::slice::<impl [T]><u8>::get_unchecked_mut::<usize>`
    --> src/main.rs:3:15
     |
3    |     unsafe { *a.get_unchecked_mut(1) = 1; }
     |               ^^^^^^^^^^^^^^^^^^^^^^
note: inside call to `main`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:74:34
     |
74   |     lang_start_internal(&move || main().report(), argc, argv)
     |                                  ^^^^^^
note: inside call to `closure`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:59:75
     |
59   |             ::sys_common::backtrace::__rust_begin_short_backtrace(move || main())
     |                                                                           ^^^^^^
note: inside call to `closure`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/sys_common/backtrace.rs:136:5
     |
136  |     f()
     |     ^^^
note: inside call to `std::sys_common::backtrace::__rust_begin_short_backtrace::<[closure@DefId(1/1:1839 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::panic::RefUnwindSafe + std::marker::Sync], i32>`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:59:13
     |
59   |             ::sys_common::backtrace::__rust_begin_short_backtrace(move || main())
     |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside call to `closure`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:310:40
     |
310  |             ptr::write(&mut (*data).r, f());
     |                                        ^^^
note: inside call to `std::panicking::try::do_call::<[closure@DefId(1/1:1838 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::panic::RefUnwindSafe + std::marker::Sync], i32>`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:306:5
     |
306  | /     fn do_call<F: FnOnce() -> R, R>(data: *mut u8) {
307  | |         unsafe {
308  | |             let data = data as *mut Data<F, R>;
309  | |             let f = ptr::read(&mut (*data).f);
310  | |             ptr::write(&mut (*data).r, f());
311  | |         }
312  | |     }
     | |_____^
note: inside call to `std::panicking::try::<i32, [closure@DefId(1/1:1838 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::panic::RefUnwindSafe + std::marker::Sync]>`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panic.rs:392:9
     |
392  |         panicking::try(f)
     |         ^^^^^^^^^^^^^^^^^
note: inside call to `std::panic::catch_unwind::<[closure@DefId(1/1:1838 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::panic::RefUnwindSafe + std::marker::Sync], i32>`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:58:25
     |
58   |           let exit_code = panic::catch_unwind(|| {
     |  _________________________^
59   | |             ::sys_common::backtrace::__rust_begin_short_backtrace(move || main())
60   | |         });
     | |__________^
note: inside call to `std::rt::lang_start_internal`
    --> /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:74:5
     |
74   |     lang_start_internal(&move || main().report(), argc, argv)
     |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error

@shepmaster
Copy link
Member

How strongly do we have to clarify that it cannot detect everything,

I don't think it has to be strong, but it should be present.

but also that not every error is UB

Does Miri detect non-UB errors? If so, what are some examples?

Run the program in the miri interpreter to detect many safety violations like out-of-bounds memory access.

Based on our discussion, I'd tweak this a bit to:

Run the program in the Miri interpreter to detect certain types of undefined behavior like out-of-bounds memory access.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 27, 2018

But why would anyone ever want to run their code using Miri? You just stated that the only reason to use Miri is to check for UB. The fact that it runs the code to do so is irrelevant, much like the fact that Clippy uses cargo check to do its thing; these are just implementation details.

Clippy is static tool. It looks at all code and does its magic.

Miri works dynamically. It runs a program and tells you if it sees anything fishy in that execution. So if e.g. main is empty, it doesn't matter which other code there is.

To me, that's a fundamental difference. But you are probably the better judge for how users would see this. ;)

The wrong answer would have been "so I can get bug reports from users", IMO.

Fair enough. Getting more users is always nice, of course, but I think we can actually deliver some value -- as you have seen, out-of-bounds pointer stuff is already handled, as are unaligned accesses. I have some plans for next week to significantly extend the range of things that are checked during execution.

In that realm, the output is really a wall of text, very hard to read. I think if you want people to easily drop in and start using Miri, you might need to pay some mind to that:

Any advise? This is the stack trace. Maybe everything but the top frame shouldn't be rendered as a span, but just show e.g. function name, file name, line number?

Does Miri detect non-UB errors? If so, what are some examples?

Miri cannot interpret anything everything. So it can error because of an unsupported operation.

Try (Box::into_raw(Box::new(0)) as usize) * 2.

Run the program in the Miri interpreter to detect certain types of undefined behavior like out-of-bounds memory access.

Sounds good! I'll keep you posted when there are more important items to be added to the list. (Violation of data layout invariants is what I want to reliably detect next week.)

@oli-obk anything you think should be mentioned already?

@shepmaster
Copy link
Member

that's a fundamental difference

I agree from the implementer side, but from a user side, it doesn't matter much...

if e.g. main is empty

Except for this. This means that it's important to convey this to the user — only certain classes of UB will be detected, and only if they actually happen during evaluation of the program. I'd try and cram that in as

Run the program in the Miri interpreter to detect if undefined behavior like out-of-bounds memory access occurs during execution.

Miri cannot interpret anything

I'd normally take this to mean "there is nothing that Miri can interpret", which would mean it's pretty useless 😈 Am I correct in assuming you mean "there are certain types of things that Miri cannot interpret?"

(Box::into_raw(Box::new(0)) as usize) * 2

I think this is kind of what I was asking. The thing that's exciting (and scary...?) is that this is pure safe Rust. Is this undefined behavior? I feel like it's not. This seems like we should expand the description to cover whatever this is.

Any advise? This is the stack trace. Maybe everything but the top frame shouldn't be rendered as a span, but just show e.g. function name, file name, line number?

Perhaps. The main thing is that it's very hard for a user to take action on the fact that the call passed through some stdlib things. Likewise, if there's UB in a crate, the average end user isn't going to be able to do much about it. The magical ideal would be to show "just the code that matters", but that's likely up there with the halting problem 😇.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 28, 2018

Except for this. This means that it's important to convey this to the user — only certain classes of UB will be detected, and only if they actually happen during evaluation of the program. I'd try and cram that in as

Right, that's why I think the right verb is to use "execute" or something. I am not sure how many people are familiar with "UB sanitizers", but they work the same way: Compile your program with some extra flags, run it, and then the sanitizers will tell you if this execution does something fishy.

It's important we manage to communicate that miri tests an execution for UB, not a program.

I'd normally take this to mean "there is nothing that Miri can interpret", which would mean it's pretty useless smiling_imp

oops typo corrected.^^ I meant "miri cannot interpret everything".

I think this is kind of what I was asking. The thing that's exciting (and scary...?) is that this is pure safe Rust. Is this undefined behavior? I feel like it's not. This seems like we should expand the description to cover whatever this is.

Yes it's scary that ptr -> int casts are safe. ;) And no this is not UB. From all we know. The rules around ptr <-> int casts are fuzzy at best and typically unknown. We are relying on de-facto LLVM behavior, because there are things you can do in safe Rust that are UB in C, and LLVM does not document those areas very well.

The magical ideal would be to show "just the code that matters", but that's likely up there with the halting problem innocent.

True, but we could likely do better. ;) E.g. we could try to not show spans in libstd/libcore? Or any other crate than the current one? Not sure.

@shepmaster
Copy link
Member

For reference based on our IRC discussion:

779ceb665413        45 hours ago        /bin/sh -c cargo install --all-features --pa…   75.2MB
3bd1c74e4bef        45 hours ago        /bin/sh -c ~/miri/xargo/build.sh                67.1MB
d1be3de5ac24        45 hours ago        /bin/sh -c git clone https://github.com/sols…   4.05MB
0762699e3d7a        45 hours ago        /bin/sh -c rustup component add rust-src        28MB
9ba20a36ba10        45 hours ago        /bin/sh -c cargo install xargo                  7.54MB

@shepmaster
Copy link
Member

E.g. we could try to not show spans in libstd/libcore? Or any other crate than the current one? Not sure.

Heh, not installing rust-src gives what seems like reasonable output:

error[E0080]: constant evaluation error: pointer computed at offset 1, outside bounds of allocation 416 which has size 0
     |
note: inside call to `std::ptr::<impl *mut T><u8>::offset`
note: inside call to `<usize as std::slice::SliceIndex<[T]>><u8>::get_unchecked_mut`
note: inside call to `core::slice::<impl [T]><u8>::get_unchecked_mut::<usize>`
    --> src/main.rs:6:10
     |
6    |         *a.get_unchecked_mut(1) = 1;
     |          ^^^^^^^^^^^^^^^^^^^^^^
note: inside call to `main`
note: inside call to `closure`
note: inside call to `closure`
note: inside call to `std::sys_common::backtrace::__rust_begin_short_backtrace::<[closure@DefId(1/1:1839 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::panic::RefUnwindSafe + std::marker::Sync], i32>`
note: inside call to `closure`
note: inside call to `std::panicking::try::do_call::<[closure@DefId(1/1:1838 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::panic::RefUnwindSafe + std::marker::Sync], i32>`
note: inside call to `std::panicking::try::<i32, [closure@DefId(1/1:1838 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::panic::RefUnwindSafe + std::marker::Sync]>`
note: inside call to `std::panic::catch_unwind::<[closure@DefId(1/1:1838 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&&dyn std::ops::Fn() -> i32 + std::panic::RefUnwindSafe + std::marker::Sync], i32>`
note: inside call to `std::rt::lang_start_internal`

I wonder if "current crate" + "bottommost stack frame" would be useful.

@shepmaster
Copy link
Member

Well, I'm glad I looked into that. Much smaller now

/root/.cargo/bin/cargo-miri	4.54MB	
/root/.cargo/bin/miri	        2.31MB	
/root/.xargo	                67MB	
rustup component add rust-src	28MB	

@shepmaster
Copy link
Member

How about

Detect certain cases of undefined behavior (like out-of-bounds
memory access) that occur when executing this program in the
Miri interpreter.

@RalfJung
Copy link
Member Author

The purpose of the xargo step is to get a libstd that includes MIR for all functions. Usually, we only store MIR for generic functions; the others are already sent to LLVM and have their machine code generated so we don't usually need the MIR any more. Except, of course, miri needs it. So, rust-src should not be needed at the end to get miri to work, it'll just mean we cannot show spans in libstd/libcore. Which maybe we do not even want to?

I wonder if "current crate" + "bottommost stack frame" would be useful.

Something along these lines, yes.

Well, I'm glad I looked into that. Much smaller now

Ah, I forgot about the rust-src component, sorry for that. So we are slightly above 100MiB.

I am wondering why cargo-miri is so big, it hardly does anything... it's just a wrapper to call rustc or miri, respectively.

Detect certain cases of undefined behavior (like out-of-bounds
memory access) that occur when executing this program in the
Miri interpreter.

👍
We could add a sentence like "This is akin to executing the program with a UB sanitizer." or so? Or is that jargon that not enough people know?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Something new the playground could do
Projects
None yet
Development

No branches or pull requests

2 participants