Skip to content

JIT: local assertion prop and redundant zeroing #74967

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 8, 2022

Conversation

AndyAyersMS
Copy link
Member

@AndyAyersMS AndyAyersMS commented Sep 1, 2022

In local assertion prop, optimize out stores of zeros to locations that are
already known to be zero. Don't do this for integer locals as it can lead to
excessive loop cloning.

If we succeed in a LCL_FLD copy prop, try and find a knock-on zero prop.

Update VN to handle the struct LCL_FLD = 0 case better.

diffs

In local assertion prop, optimize out stores of zeros to locations that are
already known to be zero. Don't do this for integer locals as it can lead to
excessive loop cloning.

If we succeed in a LCL_FLD copy prop, try and find a knock-on zero prop.

Update VN to handle the `struct LCL_FLD = 0` case better.
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 1, 2022
@ghost ghost assigned AndyAyersMS Sep 1, 2022
@ghost
Copy link

ghost commented Sep 1, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

In local assertion prop, optimize out stores of zeros to locations that are
already known to be zero. Don't do this for integer locals as it can lead to
excessive loop cloning.

If we succeed in a LCL_FLD copy prop, try and find a knock-on zero prop.

Update VN to handle the struct LCL_FLD = 0 case better.

Author: AndyAyersMS
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@dotnet/jit-contrib PTAL

Copy link
Contributor

@SingleAccretion SingleAccretion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also enable VN-based propagation of VNF_ZeroObj, seeing as optVNConstantPropCurStmt has access to the user.

canOptimize = true;
}

// LHS is zero. Is RHS known to be zero from prior assertions?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would running the ZEROOBJ propagation before this work (it seems unfortunate to duplicate constant propagation logic like this)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, let me try this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to work out.

{
// Note that it is possible to see pretty much any kind of type for the local
// (not just TYP_STRUCT) here because of the ASG(BLK(ADDR(LCL_VAR/FLD)), 0) form.
initObjVN = (lhsVarDsc->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(lhsVarDsc->GetStructHnd())
: vnStore->VNZeroForType(lhsVarDsc->TypeGet());
}
else if (isZeroInit)
{
initObjVN = (lhs->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(layout->GetClassHandle())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all layouts will have handles. The right(-ish, it's not actually optimal, but better than the current state) fix for this is to simply replace the first argument of VNF_ZeroObj with a VNForIntPtrCon(reinterpret_cast<ssize_t>(layout)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you are suggesting here.

Copy link
Contributor

@SingleAccretion SingleAccretion Sep 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To replace VN of CORINFO_CLASS_HANDLE with VN of ClassLayout* as the argument to VNF_ZeroObj.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, use the layout instance to classify the zeros instead of the handle, and use VNForIntPtrCon to encode the layout instance.

@AndyAyersMS
Copy link
Member Author

We could also enable VN-based propagation of VNF_ZeroObj, seeing as optVNConstantPropCurStmt has access to the user.

I wonder if this would lead us (at times) to reconsider what expansion we'd choose.

I think I'll keep this to mainly be a local AP change.

@SingleAccretion
Copy link
Contributor

SingleAccretion commented Sep 2, 2022

I wonder if this would lead us (at times) to reconsider what expansion we'd choose.

In terms of promotion, this should not affect things, as we decide whether to decompose in global morph.

(In fact, we must decide this before SSA, to mark things DNER appropriately, at least in the current design)

I think I'll keep this to mainly be a local AP change.

Of course; was just noting this as one "obvious" area of possible further improvements.

@AndyAyersMS
Copy link
Member Author

Native AOT failure looks similar to #75005 (timeout).

@BruceForstall
Copy link
Contributor

SuperPMI replay failures, e.g.

ISSUE: <ASSERT> #217629 D:\a\_work\1\s\src\coreclr\jit\lower.cpp (1089) - Assertion failed 'arg->OperIs(GT_OBJ, GT_FIELD_LIST) || arg->OperIsLocalRead()' in 'Microsoft.CodeAnalysis.VisualBasic.Formatting.NodeBasedFormattingRule:AddIndentBlockOperationsSlow(System.Collections.Generic.List`1[[Microsoft.CodeAnalysis.Formatting.Rules.IndentBlockOperation, Microsoft.CodeAnalysis.Workspaces, Version=4.4.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.SyntaxNode,byref):this' during 'Lowering nodeinfo' (IL size 1001; hash 0x83392791; FullOpts)

@AndyAyersMS
Copy link
Member Author

We should improve the rules for supported IR shapes in the post-phase IR checking. Otherwise, it's a bit of a guessing game.

@SingleAccretion
Copy link
Contributor

I see the code which propagates ZEROOBJ on local fields is still there. Likely that's the source of asserts. If we want to propagate zeroes into local fields, it can be done in optZeroObjAssertionProp.

Comment on lines 8328 to 8349
ValueNum initObjVN = ValueNumStore::NoVN;
if (isEntire && rhs->IsIntegralConst(0))
ValueNum initObjVN = ValueNumStore::NoVN;
bool const isZeroInit = rhs->IsIntegralConst(0);
if (isZeroInit && isEntire)
{
// Note that it is possible to see pretty much any kind of type for the local
// (not just TYP_STRUCT) here because of the ASG(BLK(ADDR(LCL_VAR/FLD)), 0) form.
initObjVN = (lhsVarDsc->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(lhsVarDsc->GetStructHnd())
initObjVN = (lhsVarDsc->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(lhsVarDsc->GetLayout())
: vnStore->VNZeroForType(lhsVarDsc->TypeGet());
}
else if (isZeroInit)
{
// Non-entire zeroing
initObjVN = (lhs->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(layout)
: vnStore->VNZeroForType(lhs->TypeGet());
}
else
{
// Non-zero init, or non-entire block zero init
//
// Non-zero block init is very rare so we'll use a simple, unique VN here.
initObjVN = vnStore->VNForExpr(compCurBB, lhsVarDsc->TypeGet());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the layout change, this can be simplified:

ValueNum initObjVN = ValueNumStore::NoVN;
if (rhs->IsIntegralConst(0))
{
    initObjVN = (lhs->TypeGet() == TYP_STRUCT) ? vnStore->VNForZeroObj(layout)
                                               : vnStore->VNZeroForType(lhs->TypeGet());
}
else
{
    // Non-zero block init is very rare so we'll use a simple, unique VN here.
    initObjVN = vnStore->VNForExpr(compCurBB, lhs->TypeGet());
}

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting diffs.

image

image

// Rescan the table looking for possible constants
//
onlyConstProp = true;
index = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth to set the first valid index for which !curAssertion->CanPropLclVar() is true and use that to set the index here to avoid some unnecessary lookups?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out this "second chance" assertion prop no longer kicks in so I just removed it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out this "second chance" assertion prop no longer kicks in so I just removed it.

Note that morph repeatedly invokes assertion prop so we would naturally have caught this case:

runtime/src/coreclr/jit/morph.cpp

Lines 14069 to 14087 in 3d630a8

/* Before morphing the tree, we try to propagate any active assertions */
if (optLocalAssertionProp)
{
/* Do we have any active assertions? */
if (optAssertionCount > 0)
{
GenTree* newTree = tree;
while (newTree != nullptr)
{
tree = newTree;
/* newTree is non-Null if we propagated an assertion */
newTree = optAssertionProp(apFull, tree, nullptr, nullptr);
}
assert(tree != nullptr);
}
}
PREFAST_ASSUME(tree != nullptr);
}

// The latter part of the if below is a heuristic.
//
// If we elimiate a zero assignment for integral lclVars it can lead to
// unnecessary cloning. We need to make sure `optExtractInitTestIncr`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it can lead to unnecessary cloning

Not sure if I follow this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say we remove a Vxx = 0 because assertion prop already knows Vxx is zero, and Vxx is the control variable for some loop.

With the redundant zeroing removed, optExtractdInitTestIncr may fail to find the initial value for Vxx because it just does a simple local search. When this happens, the cloner will fail to prove that the loop cloning conditions are always satisfied. and so do extra cloning that it otherwise would not do.

What's a little odd here is that we can't then later on (using more powerful VN/global AP) prove that the loop cloning conditions are indeed satisfied and clean things back up as if we'd never cloned. I didn't dig into this because knowingly enabling unnecessary cloning and relying on later phases to clean up seemed like a bad plan.

@@ -3821,14 +3840,92 @@ GenTree* Compiler::optAssertionProp_LclFld(ASSERT_VALARG_TP assertions, GenTreeL
GenTree* Compiler::optAssertionProp_Asg(ASSERT_VALARG_TP assertions, GenTreeOp* asg, Statement* stmt)
{
GenTree* rhs = asg->gtGetOp2();
if (asg->OperIsCopyBlkOp() && varTypeIsStruct(rhs))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the varTypeIsStruct(rhs) condition removal, along with the assert in optZeroObjAssertionProp?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because optZeroObjAssertionProp was extended to handle LCL_FLD.

@AndyAyersMS
Copy link
Member Author

Some kind of CI issue downloading things:

__DistroRid: linux-arm
Downloading 'https://dotnet.microsoft.com/download/dotnet/scripts/v1/dotnet-install.sh'
curl: (92) HTTP/2 stream 1 was not closed cleanly: CANCEL (err 8)
Curl failed; dumping some information about dotnet.microsoft.com for later investigation

@AndyAyersMS
Copy link
Member Author

Retrying the failed legs

@AndyAyersMS
Copy link
Member Author

SPMI replay failures are redundant artifacts:

##[error]Artifact SuperPMI_BuildLogs_x64_checked already exists for build 7551.

@AndyAyersMS
Copy link
Member Author

Going to assume the small TP regressions are artifacts of the compiler update.

@kunalspathak can you approve since you already are somewhat familiar with the changes?

@kunalspathak
Copy link
Member

I will take a look tomorrow.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AndyAyersMS AndyAyersMS merged commit 3d630a8 into dotnet:main Sep 8, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 9, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants