-
Notifications
You must be signed in to change notification settings - Fork 5k
JIT: local assertion prop and redundant zeroing #74967
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In local assertion prop, optimize out stores of zeros to locations that are already known to be zero. Don't do this for integer locals as it can lead to excessive loop cloning. If we succeed in a LCL_FLD copy prop, try and find a knock-on zero prop. Update VN to handle the `struct LCL_FLD = 0` case better.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsIn local assertion prop, optimize out stores of zeros to locations that are If we succeed in a LCL_FLD copy prop, try and find a knock-on zero prop. Update VN to handle the
|
@dotnet/jit-contrib PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also enable VN-based propagation of VNF_ZeroObj
, seeing as optVNConstantPropCurStmt
has access to the user.
src/coreclr/jit/assertionprop.cpp
Outdated
canOptimize = true; | ||
} | ||
|
||
// LHS is zero. Is RHS known to be zero from prior assertions? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would running the ZEROOBJ
propagation before this work (it seems unfortunate to duplicate constant propagation logic like this)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be, let me try this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to work out.
src/coreclr/jit/valuenum.cpp
Outdated
{ | ||
// Note that it is possible to see pretty much any kind of type for the local | ||
// (not just TYP_STRUCT) here because of the ASG(BLK(ADDR(LCL_VAR/FLD)), 0) form. | ||
initObjVN = (lhsVarDsc->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(lhsVarDsc->GetStructHnd()) | ||
: vnStore->VNZeroForType(lhsVarDsc->TypeGet()); | ||
} | ||
else if (isZeroInit) | ||
{ | ||
initObjVN = (lhs->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(layout->GetClassHandle()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all layouts will have handles. The right(-ish, it's not actually optimal, but better than the current state) fix for this is to simply replace the first argument of VNF_ZeroObj
with a VNForIntPtrCon(reinterpret_cast<ssize_t>(layout))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what you are suggesting here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To replace VN of CORINFO_CLASS_HANDLE
with VN of ClassLayout*
as the argument to VNF_ZeroObj
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, use the layout instance to classify the zeros instead of the handle, and use VNForIntPtrCon to encode the layout instance.
I wonder if this would lead us (at times) to reconsider what expansion we'd choose. I think I'll keep this to mainly be a local AP change. |
In terms of promotion, this should not affect things, as we decide whether to decompose in global morph. (In fact, we must decide this before SSA, to mark things DNER appropriately, at least in the current design)
Of course; was just noting this as one "obvious" area of possible further improvements. |
Native AOT failure looks similar to #75005 (timeout). |
SuperPMI replay failures, e.g.
|
We should improve the rules for supported IR shapes in the post-phase IR checking. Otherwise, it's a bit of a guessing game. |
I see the code which propagates |
src/coreclr/jit/valuenum.cpp
Outdated
ValueNum initObjVN = ValueNumStore::NoVN; | ||
if (isEntire && rhs->IsIntegralConst(0)) | ||
ValueNum initObjVN = ValueNumStore::NoVN; | ||
bool const isZeroInit = rhs->IsIntegralConst(0); | ||
if (isZeroInit && isEntire) | ||
{ | ||
// Note that it is possible to see pretty much any kind of type for the local | ||
// (not just TYP_STRUCT) here because of the ASG(BLK(ADDR(LCL_VAR/FLD)), 0) form. | ||
initObjVN = (lhsVarDsc->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(lhsVarDsc->GetStructHnd()) | ||
initObjVN = (lhsVarDsc->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(lhsVarDsc->GetLayout()) | ||
: vnStore->VNZeroForType(lhsVarDsc->TypeGet()); | ||
} | ||
else if (isZeroInit) | ||
{ | ||
// Non-entire zeroing | ||
initObjVN = (lhs->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(layout) | ||
: vnStore->VNZeroForType(lhs->TypeGet()); | ||
} | ||
else | ||
{ | ||
// Non-zero init, or non-entire block zero init | ||
// | ||
// Non-zero block init is very rare so we'll use a simple, unique VN here. | ||
initObjVN = vnStore->VNForExpr(compCurBB, lhsVarDsc->TypeGet()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the layout change, this can be simplified:
ValueNum initObjVN = ValueNumStore::NoVN;
if (rhs->IsIntegralConst(0))
{
initObjVN = (lhs->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(layout)
: vnStore->VNZeroForType(lhs->TypeGet());
}
else
{
// Non-zero block init is very rare so we'll use a simple, unique VN here.
initObjVN = vnStore->VNForExpr(compCurBB, lhs->TypeGet());
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/coreclr/jit/assertionprop.cpp
Outdated
// Rescan the table looking for possible constants | ||
// | ||
onlyConstProp = true; | ||
index = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth to set the first valid index for which !curAssertion->CanPropLclVar()
is true and use that to set the index
here to avoid some unnecessary lookups?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out this "second chance" assertion prop no longer kicks in so I just removed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out this "second chance" assertion prop no longer kicks in so I just removed it.
Note that morph repeatedly invokes assertion prop so we would naturally have caught this case:
runtime/src/coreclr/jit/morph.cpp
Lines 14069 to 14087 in 3d630a8
/* Before morphing the tree, we try to propagate any active assertions */ | |
if (optLocalAssertionProp) | |
{ | |
/* Do we have any active assertions? */ | |
if (optAssertionCount > 0) | |
{ | |
GenTree* newTree = tree; | |
while (newTree != nullptr) | |
{ | |
tree = newTree; | |
/* newTree is non-Null if we propagated an assertion */ | |
newTree = optAssertionProp(apFull, tree, nullptr, nullptr); | |
} | |
assert(tree != nullptr); | |
} | |
} | |
PREFAST_ASSUME(tree != nullptr); | |
} |
// The latter part of the if below is a heuristic. | ||
// | ||
// If we elimiate a zero assignment for integral lclVars it can lead to | ||
// unnecessary cloning. We need to make sure `optExtractInitTestIncr` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can lead to unnecessary cloning
Not sure if I follow this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Say we remove a Vxx = 0
because assertion prop already knows Vxx
is zero, and Vxx
is the control variable for some loop.
With the redundant zeroing removed, optExtractdInitTestIncr
may fail to find the initial value for Vxx
because it just does a simple local search. When this happens, the cloner will fail to prove that the loop cloning conditions are always satisfied. and so do extra cloning that it otherwise would not do.
What's a little odd here is that we can't then later on (using more powerful VN/global AP) prove that the loop cloning conditions are indeed satisfied and clean things back up as if we'd never cloned. I didn't dig into this because knowingly enabling unnecessary cloning and relying on later phases to clean up seemed like a bad plan.
@@ -3821,14 +3840,92 @@ GenTree* Compiler::optAssertionProp_LclFld(ASSERT_VALARG_TP assertions, GenTreeL | |||
GenTree* Compiler::optAssertionProp_Asg(ASSERT_VALARG_TP assertions, GenTreeOp* asg, Statement* stmt) | |||
{ | |||
GenTree* rhs = asg->gtGetOp2(); | |||
if (asg->OperIsCopyBlkOp() && varTypeIsStruct(rhs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the varTypeIsStruct(rhs)
condition removal, along with the assert in optZeroObjAssertionProp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because optZeroObjAssertionProp
was extended to handle LCL_FLD.
Some kind of CI issue downloading things:
|
Retrying the failed legs |
SPMI replay failures are redundant artifacts:
|
Going to assume the small TP regressions are artifacts of the compiler update. @kunalspathak can you approve since you already are somewhat familiar with the changes? |
I will take a look tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
In local assertion prop, optimize out stores of zeros to locations that are
already known to be zero. Don't do this for integer locals as it can lead to
excessive loop cloning.
If we succeed in a LCL_FLD copy prop, try and find a knock-on zero prop.
Update VN to handle the
struct LCL_FLD = 0
case better.diffs