Skip to content

Enable Miri in CI #853

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Conversation

KaiserKarel
Copy link
Contributor

@KaiserKarel KaiserKarel commented Jul 12, 2021

This PR does not fix the possible UB in storage (#850), but does run miri in CI and disables tests using/depending on asm!, which is not supported by Miri.

@HCastano HCastano changed the title [CI] enable miri Enable Miri in CI Jul 12, 2021
@HCastano HCastano added the A-CI Continuous integration work item label Jul 12, 2021
Copy link
Contributor

@HCastano HCastano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should make failing this check a hard error for the CI (yet), can you allow build failures?

Comment on lines +569 to +570
/// # #[cfg(not(miri))]
/// # {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why's this behind a not(miri) flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These examples call into functions using asm! and thus do not compile under Miri

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah you mentioned that in the PR description, but I didn't see any asm! blocks. A short comment explaining that in the code would be useful imo

@HCastano HCastano requested review from cmichi and Robbepop July 12, 2021 19:36
@KaiserKarel
Copy link
Contributor Author

I don't think we should make failing this check a hard error for the CI (yet), can you allow build failures?

I can do that, however I am hesitant. I haven't looked into the actual failure yet, but UB inside storage may mean a severe risk to contracts.

@HCastano
Copy link
Contributor

I can do that, however I am hesitant. I haven't looked into the actual failure yet, but UB inside storage may mean a severe risk to contracts.

Sure, but I say this for two reasons.

  1. The checks are still experimental

If we look at the CI logs we'll see the following message:

= help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental

I don't think it's quite "fair" to block a developer's PR on a check that's experimental.

That said, I think there are compile time flags we can use to ignore some of these experimental checks, so that might be one approach to use here.

  1. Allows us to fix potential UB in follow up PRs

If we make it a hard check we'll have to fix the potential issues in this PR. I'd prefer to fix them in follow ups so that the review area is smaller and we can ensure that the fixes are indeed correct.

@KaiserKarel
Copy link
Contributor Author

I don't think it's quite "fair" to block a developer's PR on a check that's experimental.

I agree with that. I could enable those compile time flags, or currently annotate the failing test with an ignore attribute and create a related issue; that way we'll still have these checks on PR, but if a developer submits something which is correct, but unsupported by Miri, they can also annotate it with a comment.

Of course allowing failures there is possible too; but in my experience once you have a check that regularly fails, reviewers (including me) stop paying attention to it.

@HCastano
Copy link
Contributor

Of course allowing failures there is possible too; but in my experience once you have a check that regularly fails, reviewers (including me) stop paying attention to it.

Yeah, I totally agree here. We should definitely avoid having a check which is always failing.

I agree with that. I could enable those compile time flags, or currently annotate the failing test with an ignore attribute and create a related issue; that way we'll still have these checks on PR, but if a developer submits something which is correct, but unsupported by Miri, they can also annotate it with a comment.

I've been looking through the Miri README and it mentions that despite the stacked borrow rule being experimental, disabling the check is "unsound". Playing it safe then, let's go with your second suggestion and ignore things for now, but ensure that issues are created so we can track them and address them in follow up PRs.

@KaiserKarel
Copy link
Contributor Author

KaiserKarel commented Jul 14, 2021

Miri seems to error at the end of the test run during some cleanup routine. I'm unsure where to look to fix this, or if that is even the cause.

Gotcha I think rust-lang/miri#1360 ;)

@KaiserKarel
Copy link
Contributor Author

Cheered too soon. Still unsure why the thread_local panics.

@HCastano
Copy link
Contributor

@KaiserKarel Hey, do you need any help with this, or have you just not had time to get around to looking at this again?

@KaiserKarel
Copy link
Contributor Author

@KaiserKarel Hey, do you need any help with this, or have you just not had time to get around to looking at this again?

Been a bit busy, but could use some help here as well. The thread_local panics at the end of the test run, not during the tests.

@HCastano
Copy link
Contributor

@KaiserKarel Hey, do you need any help with this, or have you just not had time to get around to looking at this again?

Been a bit busy, but could use some help here as well. The thread_local panics at the end of the test run, not during the tests.

I haven't had time to look at this either. I'll try (🤞) to make some time this week to help

@HCastano
Copy link
Contributor

I'm a bit stumped about this too actually.

At first I thought it could be the ink_storage doctests causing the problem, since from the miri output those aren't run. However, removing them didn't change anything.

I find it curious that all 320 storage tests pass (well, 315 pass and 5 are ignored, which we expect) and then the failure happens 🤔

@KaiserKarel
Copy link
Contributor Author

KaiserKarel commented Jul 29, 2021

I'm a bit stumped about this too actually.

At first I thought it could be the ink_storage doctests causing the problem, since from the miri output those aren't run. However, removing them didn't change anything.

I find it curious that all 320 storage tests pass (well, 315 pass and 5 are ignored, which we expect) and then the failure happens

I think it happens in the Drop of the thread_local, which causes the env to drop with the associated Engine.

@HCastano
Copy link
Contributor

I think it happens in the Drop of the thread_local, which causes the env to drop with the associated Engine.

Yeah, I agree with you here.

I'll go a step further and say that I'm pretty sure it happens when we Drop the DynamicAllocator. I found that I can reproduce this by doing: cargo miri test -p ink_storage alloc (although nothing more specific than that unfortunately).

@KaiserKarel
Copy link
Contributor Author

KaiserKarel commented Aug 4, 2021

So the docs for LocalKey::with state: "(panics) If the key has been destroyed (which may happen if this is called in a destructor)". Perhaps we are hitting that case? Although abort is called not on the AccessError being returned.

@KaiserKarel
Copy link
Contributor Author

cargo miri test -- --Z unstable-options --exclude-should-panic still triggers the abort.

Copy link
Collaborator

@Robbepop Robbepop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the state of this PR? It has been stale for quite a while now.

@HCastano
Copy link
Contributor

I wasn't able to figure out how to get this working, but I agree that it would be nice to include this as part of the CI.

You're more knowledgeable around unsafe and Miri, maybe you can take a look @Robbepop 😅

@Robbepop
Copy link
Collaborator

I wasn't able to figure out how to get this working, but I agree that it would be nice to include this as part of the CI.

You're more knowledgeable around unsafe and Miri, maybe you can take a look @Robbepop sweat_smile

Generally the PR looks alright but I am concerned about the failing miri CI job.

@KaiserKarel
Copy link
Contributor Author

I can't figure out what the panic at the end of the test run is. I think it is not UB; however we shouldn't ignore of course. Seems to be caused during test teardown?

@HCastano
Copy link
Contributor

@KaiserKarel This would be nice to have, but tbh I was stumped back then and nobody on the team has time to circle back to this. Also, with #852 being closed this isn't strictly required anymore.

Let us know if you disagree with this, and thanks again for your time!

@HCastano HCastano closed this Nov 24, 2021
@Robbepop
Copy link
Collaborator

The main problem with miri in our CI is that is is super slow unless we disable most tests which makes this less useful.
For example none of our extensive UI tests can be run with miri and most of our examples cannot be run either - any example that uses more elaborate off-chain features to be precise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-CI Continuous integration work item
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants