Relax the block check in OperatorsReader<'_> #2202

Open
wants to merge 1 commit into base: main

Conversation

alexcrichton
Member

This commit relaxes a check added in #2134 which maintains a stack of frame kinds in the operators reader, in addition to the validator. The goal of #2134 was to ensure that modules which are syntactically invalid per the spec are caught in the parser without the need for the validator, but investigation in #2180 has shown that this is a source of at least part of a performance regression. The change here is to relax the check so that it can still pass the spec tests while making such infrastructure cheaper.

The reader now maintains just a small `depth: u32` counter instead of a stack of frame kinds. This means that the reader can still catch invalid modules such as instructions-after-`end`, but the validator is required to handle situations such as `else` outside of an `if` block.

This required some adjustments to tests as well as some workarounds for the upstream spec tests that assert legacy exception-handling instructions are malformed, not invalid.
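
For illustration, here is a minimal sketch of the difference between a bare depth counter and a stack of frame kinds. The `Op` and `OpReader` types are made up for this sketch and are not wasmparser's actual internals; the point is only that a counter can reject trailing operators after the final `end`, but cannot tell which kind of frame is currently open.

```rust
// Sketch only: a bare depth counter in an operators reader. It can still
// reject operators that appear after the function's final `end`, but it no
// longer knows which kind of frame it is inside, so an `else` outside of an
// `if` has to be reported by the validator instead.
enum Op {
    Block,
    Loop,
    If,
    Else,
    End,
    Nop,
}

struct OpReader {
    depth: u32,
}

impl OpReader {
    fn new() -> Self {
        // Start at 1 for the implicit frame of the function body itself.
        OpReader { depth: 1 }
    }

    fn visit(&mut self, op: &Op) -> Result<(), &'static str> {
        if self.depth == 0 {
            return Err("operators remaining after end of function");
        }
        match op {
            Op::Block | Op::Loop | Op::If => self.depth += 1,
            Op::End => self.depth -= 1,
            // Accepted here even when no `if` frame is open; only the
            // validator can now flag that case (as invalid, not malformed).
            Op::Else | Op::Nop => {}
        }
        Ok(())
    }
}
```

The real code is more involved than this, but the shape of the check is the point: a counter cannot distinguish frame kinds, so the `else`-outside-`if` case moves to the validator.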

@alexcrichton
Member Author

cc @keithw I'm curious for your thoughts on this. I'm not wedded to this approach and would be happy to close this without merging, but if you don't have an objection, I think reducing the cost of parsing would be good.

cc @Robbepop in relation to #2180

@Robbepop
Collaborator

@alexcrichton thank you for trying to come up with a solution for this! :)

@keithw
Contributor

keithw commented May 22, 2025

Hmm... Honestly, I've really tried to replicate a slowdown in parsing individual large modules, and to date my measurements (#2180 (comment)) suggest that parsing alone doesn't seem to have gotten slower when we're talking about parsing a single huge module, e.g. https://github.com/turbolent/w2c2/blob/main/examples/clang/clang.wasm, as opposed to something like the testsuite, which is about parsing thousands of tiny modules.

So my preference would be to find and be able to replicate the problem first, because I think that will help inform how to fix it and, e.g., where it's worth complicating the story and spec adherence. My view is that the spec defines a notion of binary well-formedness in chapter 5 and validity in chapter 3, and it would be nice if the parser enforced chapter 5 by itself and the validator enforced chapter 3, rather than just saying that the parser+validator is going to jointly enforce the requirements of chapters 5+3. This is particularly relevant to me because we're building a development environment that enforces syntactic well-formedness on entry (and shows validation errors but doesn't prevent them), so the distinction matters to us for that reason.

All things being equal, chapter 5 does seem pretty clear that the binary module corresponding to `(func else)` is malformed, and it would be nice if wasmparser could continue to detect that by failing the parse...
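
For the development-environment use case described above, one way to keep the chapter 5 / chapter 3 split visible is to run the two passes separately. The sketch below does that with wasmparser's `Parser` and `Validator`; it is illustrative only, and exact method names can vary between wasmparser versions. Note that under this PR's relaxation, the binary corresponding to `(func else)` would sail through the first pass and only be reported by the second.

```rust
// Illustrative sketch: report "malformed" (decode error) separately from
// "invalid" (validation error) using wasmparser's two layers.
use wasmparser::{Parser, Payload, Validator};

enum Verdict {
    Malformed(String), // rejected while decoding (spec chapter 5)
    Invalid(String),   // decoded fine, rejected by validation (chapter 3)
    Ok,
}

fn classify(bytes: &[u8]) -> Verdict {
    // First pass: decode every payload and every operator without validating.
    for payload in Parser::new(0).parse_all(bytes) {
        match payload {
            Err(e) => return Verdict::Malformed(e.to_string()),
            Ok(Payload::CodeSectionEntry(body)) => {
                let mut ops = match body.get_operators_reader() {
                    Ok(r) => r,
                    Err(e) => return Verdict::Malformed(e.to_string()),
                };
                while !ops.eof() {
                    if let Err(e) = ops.read() {
                        return Verdict::Malformed(e.to_string());
                    }
                }
            }
            Ok(_) => {}
        }
    }
    // Second pass: full validation.
    match Validator::new().validate_all(bytes) {
        Ok(_) => Verdict::Ok,
        Err(e) => Verdict::Invalid(e.to_string()),
    }
}
```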

@keithw
Contributor

keithw commented May 22, 2025

To be a little more concrete, as far as I can tell so far, the slowdowns seem to be in "parsing and validating thousands of tiny modules" but not in "parsing one gigantic module." But I'm not sure if we're all on the same page about this? That's certainly what I was trying to figure out in #2180. :-)

If we are on the same page about where the slowdown is, then I guess my thinking would be that it's worth it to reduce the startup overhead of the `Parser` (maybe by being able to share its heap allocations across runs like the `Validator` can do?), but I'd be less eager to use startup overhead as a reason to reclassify a module like `(func else)` as well-formed if chapter 5 says otherwise.

@Robbepop
Collaborator

> Hmm... Honestly, I've really tried to replicate a slowdown in parsing individual large modules [..]

@keithw Did you also use `cargo bench -p wasmparser` instead of the `wasm-tools` CLI tool, as @alexcrichton commented earlier?

I ran the benchmarks on an Apple M2 Pro; you ran them on an AMD Ryzen 7 PRO 4750U. What did you use, @alexcrichton? Both Alex and I could reproduce the regressions via `cargo bench -p wasmparser`.

> [..] parsing alone doesn't seem to have gotten slower, when we're talking about parsing a single huge module [..]

> [..] To be a little more concrete, as far as I can tell so far, the slowdowns seem to be in "parsing and validating thousands of tiny modules" but not in "parsing one gigantic module." [..]

I also wondered about this. It has been a hypothesis so far since I could not prove that it really behaves like this. If true, then indeed we should see nice speed-ups by introducing `ParserAllocations` similar to `ValidatorAllocations`. Though, the parser's state looks fairly small in contrast to the validator's state.
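
To make the `ParserAllocations` idea concrete, here is a purely hypothetical sketch of the pattern; `ParserAllocations` and `ReusableParser` are invented names and do not exist in wasmparser today. It mirrors how function-validator allocations can be recovered and fed into the next validation.

```rust
// Hypothetical sketch of allocation reuse for the parser. The idea is that a
// caller parsing thousands of tiny modules hands the same scratch buffers to
// each run instead of re-allocating them per module.
#[derive(Default)]
pub struct ParserAllocations {
    // Whatever per-function scratch space the reader needs while decoding.
    scratch: Vec<u32>,
}

pub struct ReusableParser {
    allocs: ParserAllocations,
}

impl ReusableParser {
    pub fn new(allocs: ParserAllocations) -> Self {
        ReusableParser { allocs }
    }

    /// Parse one module, reusing buffers left over from previous runs.
    pub fn parse(&mut self, _wasm: &[u8]) -> Result<(), String> {
        // `clear` keeps the capacity allocated by earlier parses.
        self.allocs.scratch.clear();
        // ... decode the module, pushing into `self.allocs.scratch` ...
        Ok(())
    }

    /// Hand the allocations back so the next parse can reuse them.
    pub fn into_allocations(self) -> ParserAllocations {
        self.allocs
    }
}
```

A caller would thread the returned `ParserAllocations` into the next `ReusableParser::new` call, so per-module heap allocation happens only once for the whole batch.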

> My view is that the spec defines a notion of binary well-formedness in chapter 5, and validity in chapter 3, and it would be nice if the parser enforced chapter 5 by itself, and the validator enforced chapter 3 [..]

I fully agree with you. It would certainly be better and we should ideally find a set of changes that allow for this while keeping good performance.

> If we are on the same page about where the slowdown is, then I guess my thinking would be that it's worth it to reduce the startup overhead of the Parser [..]

@alexcrichton has shown in a prior post which parts of the parser cause which slowdowns: #2180 (comment)

I am not saying that the mentioned fixes should be implemented, but they provide a clue as to what is actually causing the regressions, and at least in that comment it doesn't look like the regressions stem entirely from the startup of the parser.

@alexcrichton
Member Author

FWIW I do agree with @keithw that I'm not really certain what the regression/benchmark being optimized for is here. I don't dispute that criterion is printing regressions in wasmparser/wasmi, but the numbers currently don't make much sense. The main number I at least personally care about is the time it takes to parse + validate a module. In wasmparser, using a variety of modules, the regression is in the 5% range, while wasmi shows it's in the 30% range. I don't think any of us know at this time where such a difference comes from.

I'm the same as @keithw in that I've been unable to show anything more than a 5% regression when simply validating a module (or within that ballpark at least). I would also agree that the benchmark of just iterating over all tests in this repository isn't necessarily representative, so while it can be useful to track, I don't think it's worthwhile to live and die by that number.


@keithw after thinking about this for a while I'm now at the point where I think I have a more crisp articulation of my thinking. Right now we've effectively got two competing goals:

  1. The wasm specification has strict definitions of what is a syntactically valid module and what is a semantically valid module.
  2. Wasmparser, for speed/maintenance reasons, does not perfectly match the specification about whether an invalid module is syntactically invalid or semantically invalid.

My understanding though is that this distinction of whether an invalid module is syntactically invalid or semantically invalid is basically an artifact of the way the spec is authored and/or the spec interpreter. I'm also under the impression that web engines do not worry about distinguishing between these two kinds of modules (and maybe no other engine does either?). To me this creates a tension because it means that there's not much recourse for putting pressure on the specification itself to handle a concern such as this.

This puts wasmparser in a bit of an unfortunate spot where ideally I'd like it to follow the spec exactly, but that's pretty unmaintainable from a 100% compliance perspective (insofar as exactly matching whether invalid modules are semantically invalid or syntactically invalid). We already have some affordances for where wasmparser differs from the spec, and this PR could arguably be classified as such a change too.

Ok, that's all a bunch of background, which leads me to my articulation: I'm not sure where to draw the line on how different wasmparser can/should be from the spec. Is the current state of "just a little different" ok? Is this PR's state of "ok, a bit more significantly different" ok? Personally I don't really feel equipped to answer these questions myself, as it feels like we're stuck between a rock and a hard place with the way the spec is developed and tested.

Do you think that it would be worthwhile to bring these concerns to the CG perhaps? For example, here are two possible (and radical) ways the development of the spec could hypothetically shift:

  • Remove the distinction between `assert_{malformed,invalid}` and instead have one single predicate of "a module is valid or not". Precisely how it's valid/invalid is irrelevant for the tests, engines, etc. This would amount to dropping the `assert_malformed` assertion and possibly updating the wording of the specification in various places (maybe? unsure).
  • Massage the specification to take into account performance/maintainability concerns on behalf of engines in exactly how the AST of a module is constructed. For example, I'd like block-like instructions to look more like normal instructions for this, instead of having stricter syntactic rules. I'm not sure if this is even possible though.

In any case that's at least some more thoughts (sorry about so many words on this somewhat niche topic). I'll clarify again that I personally have no predetermined outcome I'd like to reach at this time, so if everyone's willing this is definitely a topic I'd like to dig into and explore as opposed to rushing to any particular solution. Basically I'm not trying to land this PR ASAP or anything like that.
