Relax the block check in OperatorsReader<'_> #2202
Conversation
This commit relaxes a check added in bytecodealliance#2134 which maintains a stack of frame kinds in the operators reader, in addition to the one in the validator. The goal of bytecodealliance#2134 was to ensure that modules which are syntactically invalid per the spec are caught in the parser without needing the validator, but investigation in bytecodealliance#2180 has shown that this is a source of at least some of a performance regression. The change here is to relax the check so that the spec tests still pass while making this infrastructure cheaper.

The reader now maintains just a small `depth: u32` counter instead of a stack of frame kinds. This means that the reader can still catch invalid modules such as instructions-after-`end`, but the validator is required to handle situations such as `else` outside of an `if` block.

This required some adjustments to tests as well as some workarounds for the upstream spec tests that assert legacy exception-handling instructions are malformed, not invalid.
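To make the difference concrete, below is a minimal, hypothetical sketch of the depth-counter idea; it is not the actual wasmparser code, and the `FrameTracker` and `Op` names are invented for illustration. A bare `u32` is enough to reject operators that appear after the function's final `end`, but it cannot tell whether an `else` actually sits inside an `if` frame, so that check has to move to the validator.

```rust
// Hypothetical sketch of a depth-only check, assuming a simplified operator
// enum; it is not the actual wasmparser implementation.

/// Simplified stand-in for the operators the reader visits.
#[allow(dead_code)]
enum Op {
    Block,
    Loop,
    If,
    Else,
    End,
    Other, // any non-block instruction, e.g. `nop`
}

/// Tracks only *how many* frames are open, not what kind each frame is.
struct FrameTracker {
    depth: u32,
}

impl FrameTracker {
    fn new() -> Self {
        // A function body starts inside one implicit frame.
        FrameTracker { depth: 1 }
    }

    /// Flags operators that are syntactically out of place.
    fn visit(&mut self, op: &Op) -> Result<(), &'static str> {
        if self.depth == 0 {
            // Anything after the function's final `end` is still caught here.
            return Err("operators remaining after end of function");
        }
        match op {
            Op::Block | Op::Loop | Op::If => self.depth += 1,
            Op::End => self.depth -= 1,
            // With only a counter, the reader cannot know whether the current
            // frame is an `if`, so `else` outside of an `if` is left for the
            // validator to reject.
            Op::Else | Op::Other => {}
        }
        Ok(())
    }
}

fn main() {
    // `block end end nop`: the trailing `nop` is rejected by the reader,
    // while an `if`/`else` kind mismatch would not be.
    let ops = [Op::Block, Op::End, Op::End, Op::Other];
    let mut tracker = FrameTracker::new();
    for op in &ops {
        if let Err(e) = tracker.visit(op) {
            println!("malformed: {e}");
            return;
        }
    }
    println!("syntactically plausible (validity is still up to the validator)");
}
```

The trade-off is that bumping an integer is essentially free per block-like instruction, whereas a stack of frame kinds typically implies a heap-backed vector that must be pushed, popped, and (unless reused) allocated per function body.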
@alexcrichton thank you for trying to come up with a solution for this! :)
Hmm... Honestly, I've really tried to replicate a slowdown in parsing individual large modules, and to date my measurements (#2180 (comment)) were that parsing alone doesn't seem to have gotten slower when we're talking about parsing a single huge module, e.g. https://github.com/turbolent/w2c2/blob/main/examples/clang/clang.wasm, vs. something like the testsuite, which is about parsing thousands of tiny modules. So my preference would be to find and be able to replicate the problem first, because I think that will help inform how to fix it, and e.g. where it's worth complicating the story and spec adherence.

My view is that the spec defines a notion of binary well-formedness in chapter 5, and validity in chapter 3, and it would be nice if the parser enforced chapter 5 by itself and the validator enforced chapter 3, rather than just saying that the parser+validator is going to jointly enforce the requirements of chapters 5+3. This is particularly relevant to me because we're building a development environment that enforces syntactic well-formedness on entry (and shows validation errors but doesn't prevent them), so the distinction matters to us for that reason.

All things being equal, chapter 5 does seem pretty clear that the binary module corresponding to […]
To be a little more concrete, as far as I can tell so far, the slowdowns seem to be in "parsing and validating thousands of tiny modules" but not in "parsing one gigantic module." But I'm not sure if we're all on the same page about this? That's certainly what I was trying to figure out in #2180. :-)

If we are on the same page about where the slowdown is, then I guess my thinking would be that it's worth it to reduce the startup overhead of the Parser (maybe by being able to share its heap allocations across runs, like the Validator can do?), but I'd be less eager to use startup overhead as a reason to reclassify a module like […]
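For reference, the two scenarios being contrasted here could be measured with something like the following criterion sketch; this is not the benchmark used in #2180, and the file path, the tiny module, and the iteration count are placeholders. It assumes the `wat` crate and wasmparser's `Validator::validate_all`.

```rust
// Sketch of the two measurement scenarios; inputs are placeholders.
use criterion::{criterion_group, criterion_main, Criterion};
use wasmparser::Validator;

fn benches(c: &mut Criterion) {
    // Scenario 1: parse + validate one gigantic module per iteration.
    let big = std::fs::read("clang.wasm").expect("placeholder path to a large module");
    c.bench_function("validate one large module", |b| {
        b.iter(|| Validator::new().validate_all(&big).unwrap())
    });

    // Scenario 2: parse + validate thousands of tiny modules, where any
    // per-module startup cost (fresh parser/validator allocations) dominates.
    let tiny = wat::parse_str("(module (func nop))").unwrap();
    c.bench_function("validate many tiny modules", |b| {
        b.iter(|| {
            for _ in 0..1000 {
                Validator::new().validate_all(&tiny).unwrap();
            }
        })
    });
}

criterion_group!(scenarios, benches);
criterion_main!(scenarios);
```

Constructing a fresh `Validator` inside the loop is deliberate in the second case: that is where any per-module startup or allocation overhead would show up, which is the cost being discussed above.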
@alexcrichton Did you also use […]? I ran the benchmarks on an […].
I also wondered about this. It was just a hypothesis so far, since I could not prove that it really behaves like this. If true, then indeed we should see nice speed-ups by introducing […].
I fully agree with you. It would certainly be better, and we should ideally find a set of changes that allows for this while keeping good performance.
@alexcrichton has shown in a prior post which parts of the parser cause which slowdowns: #2180 (comment). I am not saying that the mentioned fixes should be implemented, but they provide a clue as to what is actually causing the regressions, and at least in that comment it doesn't look like the regressions stem entirely from the startup of the parser.
FWIW I do agree with @keithw that I'm not really certain what the regression/benchmark being optimized for is here. I don't dispute that criterion is printing regressions in wasmparser/wasmi, but the numbers currently don't make much sense. The main number I at least personally care about is the time it takes to parse + validate a module. In wasmparser, using a variety of modules, the regression is in the 5% range, while wasmi shows it's in the 30% range. I don't think any of us know at this time where such a difference comes from. I'm in the same position as @keithw in that I've been unable to show anything more than a 5% regression (or within that ballpark at least) when simply validating a module. I would also agree that the benchmark of just iterating over all tests in this repository isn't necessarily representative, so while it can be useful to track, I don't think it's worthwhile to live and die by that number.

@keithw after thinking about this for a while I'm now at the point where I think I have a more crisp articulation of my thinking. Right now we've effectively got two competing goals: […]
My understanding though is that this distinction of whether an invalid module is syntactically invalid or semantically invalid is basically an artifact of the way the spec is authored and/or the spec interpreter. I'm also under the impression that web engines (and perhaps no other engine either?) do not worry about distinguishing between these two kinds of modules. To me this creates a tension because it means that there's not much recourse for providing a form of pressure on the specification itself to handle a concern such as this. This puts wasmparser in a bit of an unfortunate spot: I'd ideally like it to follow the spec exactly, but that's pretty unmaintainable from a 100% compliance perspective (insofar as exactly matching whether invalid modules are semantically invalid or syntactically invalid). We already have some affordances for where wasmparser differs from the spec, and this PR could arguably be classified as such a change too.

Ok, that's all a bunch of background which leads me to my articulation: I'm not sure where to draw the line of where wasmparser can/should be different from the spec. Is the current state of "just a little different" ok? Is this PR's state of "ok, a bit more significantly different" ok? Personally I don't really feel equipped to answer these questions myself, as it feels like we're stuck between a rock and a hard place with the way the spec is developed and tested. Do you think that it would be worthwhile to bring these concerns to the CG perhaps? For example, these are two possible (and radical) ways the development of the spec could hypothetically shift: […]
In any case, that's at least some more thoughts (sorry about so many words on this somewhat niche topic). I'll clarify again that I personally have no predetermined outcome I'd like to reach at this time, so if everyone's willing, this is definitely a topic I'd like to dig into and explore as opposed to rushing to any particular solution. Basically I'm not trying to land this PR ASAP or anything like that.