Skip to content

JIT: stop copying EH tab info into inlinee compiler #83622

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

AndyAyersMS
Copy link
Member

Having this info show up for inlinees just confuses things -- if the inlinee looks in this table and finds a block reference, it will be to a block that doesn't exist in the inlinee's flow graph.

This has been tripping up profile synthesis, which wants to attribute some flow to EH entries; inlinees don't have any, but currently look like they do.

There is one place in inlinee importation where we might ask about EH properties -- inline pinvoke checks. We're already careful to redirect this query to the call site block, so in the correspnonding lookup we must make sure to answer this via the root compiler's EH tab.

If we ever decide to support inlining of methods with EH we'll have to reconcile all this differently.

Having this info show up for inlinees just confuses things -- if
the inlinee looks in this table and finds a block reference, it will
be to a block that doesn't exist in the inlinee's flow graph.

This has been tripping up profile synthesis, which wants to attribute
some flow to EH entries; inlinees don't have any, but currently look
like they do.

There is one place in inlinee importation where we might ask about EH
properties -- inline pinvoke checks. We're already careful to redirect
this query to the call site block, so in the correspnonding lookup
we must make sure to answer this via the root compiler's EH tab.

If we ever decide to support inlining of methods with EH we'll have
to reconcile all this differently.
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 17, 2023
@ghost ghost assigned AndyAyersMS Mar 17, 2023
@ghost
Copy link

ghost commented Mar 17, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Having this info show up for inlinees just confuses things -- if the inlinee looks in this table and finds a block reference, it will be to a block that doesn't exist in the inlinee's flow graph.

This has been tripping up profile synthesis, which wants to attribute some flow to EH entries; inlinees don't have any, but currently look like they do.

There is one place in inlinee importation where we might ask about EH properties -- inline pinvoke checks. We're already careful to redirect this query to the call site block, so in the correspnonding lookup we must make sure to answer this via the root compiler's EH tab.

If we ever decide to support inlining of methods with EH we'll have to reconcile all this differently.

Author: AndyAyersMS
Assignees: AndyAyersMS
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@EgorBo PTAL
cc @dotnet/jit-contrib

There is more we could (and perhaps should) do here, like:

  • tagging all inlinee blocks with some flag so we can distinguish them (then removing the tag when we successfully inline)
  • setting the EH region info early when we create the blocks

@AndyAyersMS AndyAyersMS requested a review from EgorBo March 17, 2023 22:41
Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming test failure are unrelated. I had a branch somewhere where I tried to enable inlining + EH

@AndyAyersMS
Copy link
Member Author

Failures? Must have gotten automatically retried or something...

@AndyAyersMS
Copy link
Member Author

No Diffs but SPMI sees some modest TP regressions.

I can't imagine we call ehGetDsc or ehGetIndex all that often, so I'm puzzled.

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Mar 18, 2023

I can't imagine we call ehGetDsc or ehGetIndex all that often, so I'm puzzled.

Using checked windows x64 jit on coreclr_tests (worst TP impact) there are 29783 methods out of 520968 that call one or both of these.

checked

ehGetDsc is called a total of 3,630,753,209 times, and the distribution of calls is highly skewed to just two method instances.
So will have to look more closely at those.

ehGetDsc ehGetIndex ContextNo
1680260893 16732 197871
1680260893 16732 432482
99413773 4068 197870
99413773 4068 432481
2272625 418120 246307
1869308 15 212876
1869308 15 436155
1722721 75417 247207
1722721 75417 457078
1693830 72967 247206
1693830 72967 457074
1099481 50 235703
1091413 50 235696

There are some sizeable regressions elsewhere (eg asp.net) so once I've looked into these pathological cases I'll look over there too.

release

632,931,084 total calls. MCs that make more than 1,000,000 calls shown below. Same set of suspects, checked just amplifies the problem.

ehGetDsc ehGetIndex Context
280177341 16732 197871
280177341 16732 432482
16601509 4068 197870
16601509 4068 432481
2024752 388963 246307
1836791 15 212876
1836791 15 436155
1209731 75368 247207
1209731 75368 457078
1195843 72918 247206
1195843 72918 457074

@AndyAyersMS
Copy link
Member Author

aspnet similar data. 10K methods that call, 74M calls, avg 7K, top callers are more evenly distributed:

ehGetDsc ehGetIndex Context
480598 9102 47080
480598 9102 62065
461942 9735 47076
461942 9735 60777
461503 35708 92416
380567 377 105371
374135 15586 37082
371987 15593 33629
370215 16365 38859
363367 461 88461
363367 461 92338
363367 461 103823

@jakobbotsch
Copy link
Member

jakobbotsch commented Mar 18, 2023

ehGetDsc is called a total of 3,630,753,209 times, and the distribution of calls is highly skewed to just two method instances.

Could maybe be #78267 (comment) ?

@AndyAyersMS
Copy link
Member Author

ehGetDsc is called a total of 3,630,753,209 times, and the distribution of calls is highly skewed to just two method instances.

Could maybe be #78267 (comment) ?

Looks like it is; here's a release profile:

image

@AndyAyersMS
Copy link
Member Author

Seems like we could revamp impImportLeave to walk up the enclosing EH scopes directly instead of searching like it does now. But that won't fix the regression here, as generally that search will mostly fail and when failing, doesn't call the methods I changed.

The extra cost here seems to come from successful searches that end up calling fgNewBBInRegion -- we have to induce a step block for each region the leave exits so if leaving deeply nested EH or even shallow EH with lots of leaves we do work figuring out where to put the step blocks.

There is also likely some redundancy in these step block chains we could try and remove, eg if a region has two leaves with identical targets, or a region and its enclosing region share a leave target, etc. This would require more substantial reworking as we'd first want to build a graph of all the leave sources and targets and then compute the sets of step blocks needed to get everything to the right places. This could also be factored if we wanted to introduce some runtime state, eg if we have two deeply nested inner leaves with different outer targets they could at share the middle parts of the step block chain and just diverge at the last step based on the state. I suspect doing this would remove the need for fgMergeFinallyChains which tries to clean some of this up after the fact.

That might reduce some of the impact here as we'd probably be creating fewer step blocks in some cases. But it's an even more radical reworking of leave handling. And not clear it's worth the effort overall as the vast majority of methods won't have huge numbers of leaves. But still looking the TP impact of this change on ASP.NET perhaps we have some motivation.

@AndyAyersMS
Copy link
Member Author

Seems like we could revamp impImportLeave to walk up the enclosing EH scopes directly instead of searching like it does now.

Tried this and it was more complex than the above would make it seem, for a few reasons:

  • mutual protect EH regions need to be logically handled as if they are a single try
  • leave may be in catch and target in associated try
  • leave and target may be in sibling regions
  • difference between eh region nums used in blocks and compiler APIs (0 => root, N ==> table entry N-1) and those stored in EH tables (n ==> table entry n, NO_ENCLOSING_REGION (ushrt max) ==> root) and ehBlkDsc apis.
  • entirely separate implementation for x86 so any fix has to be done twice

Still seems worth pursuing perhaps, but not as surgical as I'd hoped. Might be simpler in the short run to factor out some of the flow-inducing loop invariants

@BruceForstall
Copy link
Contributor

Why do you need the changes to ehGetDsc or ehGetIndex at all? (Also, I'm not sure I would depend on all code using these.)

We don't inline functions containing EH, right? So the EH table can just remain empty? Even if we did, wouldn't the inlinee get its own EH table that would then get incorporated into the main function at the same time the blocks get incorporated (and renumbered)?

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Mar 23, 2023

Why do you need the changes to ehGetDsc or ehGetIndex at all? (Also, I'm not sure I would depend on all code using these.)

We don't inline functions containing EH, right? So the EH table can just remain empty? Even if we did, wouldn't the inlinee get its own EH table that would then get incorporated into the main function at the same time the blocks get incorporated (and renumbered)?

We ask about some EH properties in the inlinee compiler -- the only one I know of for certain is whether a PInvoke call site is within a handler (since we might be inlining into a handler region of the root method, and there are/were some odd restrictions on how PInvokes should be handled in such cases). Currently we cope with this by copying the root compiler EH info into the inlinee compiler. Instead, we could explicitly redirect that query back to the root compiler instance. Assuming that's the only reason we copied the EH info to the inlinee then we could assert in the ehGet.. methods that we're always calling them from the root compiler instance. That should ferret out any other cases where we're making queries from an inlinee compiler.

Seems worth a try.

@BruceForstall
Copy link
Contributor

With #83610, the block numbers in the inlinee are in the same "namespace" as the inliner. Do any of these EH-related queries from the inlinee use this block number, and thus need to "translate" to the inliner block number namespace?

@AndyAyersMS
Copy link
Member Author

We don't inline functions containing EH, right? So the EH table can just remain empty? Even if we did, wouldn't the inlinee get its own EH table that would then get incorporated into the main function at the same time the blocks get incorporated (and renumbered)?

We ask about some EH properties in the inlinee compiler -- the only one I know of for certain is whether a PInvoke call site is within a handler (since we might be inlining into a handler region of the root method, and there are/were some odd restrictions on how PInvokes should be handled in such cases). Currently we cope with this by copying the root compiler EH info into the inlinee compiler. Instead, we could explicitly redirect that query back to the root compiler instance. Assuming that's the only reason we copied the EH info to the inlinee then we could assert in the ehGet.. methods that we're always calling them from the root compiler instance. That should ferret out any other cases where we're making queries from an inlinee compiler.

Seems worth a try.

Remembered a bit more about why I didn't try this -- the query for pinvokes is block->HasHndIndex() which we can't properly resolve for inlinees, since blocks don't know if they belong to the root instance an inlinee instance, and even if they did, they don't have any reference to the instance.

So if say we want block->HasHndIndex() to be legal for inlinees we'll have to pass them the owning compiler instance (which will be almost always the current compiler instance); then if it's an inlinee compiler it will redirect the query to impInlineInfo->iciBlock as currently all inlinee blocks inherit the EH properties of their call site.

Anyways let me add the asserts mentioned above and see where else we might be asking.

@AndyAyersMS
Copy link
Member Author

With #83610, the block numbers in the inlinee are in the same "namespace" as the inliner. Do any of these EH-related queries from the inlinee use this block number, and thus need to "translate" to the inliner block number namespace?

I don't know for sure, but I suspect there aren't any that key off of bbNum.

Other structural properties of EH may not hold though. For instance, if an inlinee compiler managed to get a hold of a try region and thought it could enumerate all the blocks in the try by walking from begin to end while still in the process of inlining it would not see the inlinee blocks. But I don't think we do this sort of thing.

@AndyAyersMS
Copy link
Member Author

Going to close this as this whole subject needs more thinking and isn't currently a problem.

@ghost ghost locked as resolved and limited conversation to collaborators May 10, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants