-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8350864: C2: verify structural invariants of the Ideal graph #26362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
8350864: C2: verify structural invariants of the Ideal graph #26362
Conversation
👋 Welcome back mchevalier! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
@marc-chevalier The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
Some high-level comments:
- IMO it's better to have node-specific invariant checks co-located with corresponding node (as
Node::verify()
maybe?); it would make it clearer what are the expectations when changing the implementation. - on naming: IMO
VerifyIdealGraph
would clearly describe what the logic does, fits existing conventions well, and easy to find
I understand the motivation, but I'm not sure what to do in every case. For instance, when the pattern is not so small (like strip mining), it's hard to associate the invariant with a single node: many are involved, and it's not really describable as the expectation of a single node. Of course, we could split the pattern into a lot of sub-patterns, centered each around each node type, but then, we lose the overview of the structure, and it becomes context-free (e.g. a IfFalse must have a CountedLoopEnd input only when it comes before a safepoint before a OuterStripMinedLoopEnd, but not in general). Another problematic case is the control successor check that has special handling for some kinds, that could be relocated to the said node types, but for the general case, it simply tests whether the node is a CFG node. One could still do that in I also have a readability concern. Even if we sort them by node types, then we mix the implementation of all invariants of a given node in a single method, making it extra-hard to understand the big picture, and when looking for overrides, I will find some, but maybe they won't be about the invariant I'm interested in. And a code/maintenance concern. If I have a default implementation of the control successor check in Overall, it seems to me that it's beneficial to move checks to the node types if:
For instance, |
Sure, fine with me! I'd be curious to see if somebody has other ideas. |
I am fine with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, and great explanation as well! The invariants that are already implemented seem quite useful already, and it seems there is a lot of potential.
Having recently worked on a few missed optimizations related to PhaseIterGVN::add_users_of_use_to_worklist
, I agree that it would be interesting to use such patterns for automatic notifications. The way I see it, we would need to somehow "reverse" the patterns, as they would be expressed from the point of view of the node on which the optimizations is applied, and would require notification when dependencies changes. Probably quite non-trivial, but interesting nonetheless.
I only have a few basic remarks/questions.
@vnkozlov: That is true. I think the idea was to use the in tests (typically stress tests) once integrated. Or at least, use it in the issues I found thanks to this flag. That should make it not totally at least. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, very nice work @marc-chevalier !
Cool that you tried a pattern-matching approach. I really do wonder if we could use that more widely 😊
// An invariant that needs only a local view of the graph, around a given node. | ||
class LocalGraphInvariant : public ResourceObj { | ||
public: | ||
static constexpr int OutputStep = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a quick comment what this is for? After all, it is a public static constant ;)
bool is_node_dead(const Node*); | ||
private: | ||
void fill(); | ||
Unique_Node_List live_nodes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the hotspot convention is to have fields with an underscore _live_nodes
. Especially if they are private.
/* Check whether the invariant is true around the node [center]. The argument [steps] and [path] are initially empty. | ||
* | ||
* If the check fails steps and path must be filled with the path from the center to the failing node (where it's relevant to show). | ||
* Given a list of node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Given a list of node | |
* Given a list of nodes |
* - path must have length k, and contain rk ... r1 where ri is: | ||
* - a non-negative integer p for each step such that N{i-1} has Ni as p-th input (we need to follow an input edge) | ||
* - the OUTPUT_STEP value in case N{i-1} has Ni as an output (we need to follow an output edge) | ||
* The list are reversed to allow to easily fill them lazily on failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* The list are reversed to allow to easily fill them lazily on failure. | |
* The lists are reversed to allow to easily fill them lazily on failure. |
* The parameter [live_nodes] is used to share the lazily computed set of CFG nodes reachable from root. This is because some | ||
* checks don't apply to dead code, suppress their error if a violation is detected in dead code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that mean we only cache the result if it is reachable, but not if it is not reachable? Does that mean we may check reachability for non-reachable nodes many many times?
return result; | ||
} | ||
assert(counted_loop != nullptr, "sanity"); | ||
if (is_long) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you cache the value? Seems is_long
is only used once ... and center
should not change pointers around.
ResourceMark rm; | ||
|
||
if (_checks.is_empty()) { | ||
return true; | ||
} | ||
|
||
VectorSet enqueued; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move the ResourceMark to the beginning of the allocations, and do the fast bail-out first.
// Sometimes, we get weird structure in dead code that will be cleaned up later. It typically happens | ||
// when data dies, but control is not cleanup right away, possibly kept alive by un unreachable loop. | ||
// Since we don't want to eagerly traverse the whole graph to remove dead code in IGVN, we can accept | ||
// weird structure in dead code. | ||
// For CFG-related errors, we will compute the set of reachable CFG nodes and decide whether to keep | ||
// the issue if the problematic node is reachable. This set of reachable node is thus computed lazily | ||
// (and it seems not to happen often in practice), and shared across checks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Sometimes, we get weird structure in dead code that will be cleaned up later. It typically happens | |
// when data dies, but control is not cleanup right away, possibly kept alive by un unreachable loop. | |
// Since we don't want to eagerly traverse the whole graph to remove dead code in IGVN, we can accept | |
// weird structure in dead code. | |
// For CFG-related errors, we will compute the set of reachable CFG nodes and decide whether to keep | |
// the issue if the problematic node is reachable. This set of reachable node is thus computed lazily | |
// (and it seems not to happen often in practice), and shared across checks. | |
// Sometimes, we get weird structures in dead code that will be cleaned up later. It typically happens | |
// when data dies, but control is not cleaned up right away, possibly kept alive by an unreachable loop. | |
// Since we don't want to eagerly traverse the whole graph to remove dead code in IGVN, we can accept | |
// weird structures in dead code. | |
// For CFG-related errors, we will compute the set of reachable CFG nodes and decide whether to keep | |
// the issue if the problematic node is reachable. This set of reachable nodes is thus computed lazily | |
// (and it seems not to happen often in practice), and shared across checks. |
if (in != nullptr && !enqueued.test_set(in->_idx)) { | ||
worklist.push(in); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not make a Unique_Node_List
? It would already have a VectorSet
included, and you could just push without checking if we already pushed the node. Very nice for BFS traversals.
You would then not even pop nodes, but just traverse over the worklist, as it grows.
ttyLocker ttyl; | ||
tty->print("%d failure%s for node\n", failures, failures == 1 ? "" : "s"); | ||
center->dump(); | ||
tty->print_cr("%s", ss.base()); | ||
ss.reset(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really want to use the ttyLocker
here? I thought we were trying to get away from it because it sometimes leads to lock-priority issues / dead-locks.
Why not just use yet another stringStream ss3
, and do it all via that one?
Yes. What you need is at least a "Hello World" test, where the flag is enabled. |
Some crashes are consequences of earlier misshaped ideal graphs, which could be detected earlier, closer to the source, before the possibly many transformations that lead to the crash.
Let's verify that the ideal graph is well-shaped earlier then! I propose here such a feature. This runs after IGVN, because at this point, the graph, should be cleaned up for any weirdness happening earlier or during IGVN.
This feature is enabled with the develop flag
VerifyIdealStructuralInvariants
. Open to renaming. No problem with me! This feature is only available in debug builds, and most of the code is even not compiled in product, since it uses some debug-only functions, such asNode::dump
orNode::Name
.For now, only local checks are implemented: they are checks that only look at a node and its neighborhood, wherever it happens in the graph. Typically: under a
If
node, we have aIfTrue
and aIfFalse
. To ease development, each check is implemented in its own class, independently of the others. Nevertheless, one needs to do always the same kind of things: checking there is an output of such type, checking there is N inputs, that the k-th input has such type... To ease writing such checks, in a readable way, and in a less error-prone way than pile of copy-pasted code that manually traverse the graph, I propose a set of compositional helpers to write patterns that can be matched against the ideal graph. Since these patterns are... patterns, so not related to a specific graph, they can be allocated once and forever. When used, one provides the node (called center) around which one want to check if the pattern holds.On top of making the description of pattern easier, these helpers allows nice printing in case of error, by showing the path from the center to the violating node. For instance (made up for the purpose of showing the formatting), a violation with a path climbing only inputs:
or with outputs:
So far a small set of checks are implemented:
If
nodes have aIfTrue
andIfFalse
Phi
nodes have aRegion
node of the same arity as 0th inputOuterStripMinedLoopEnd
MultiBranch
,outcnt
is smaller than or equal torequired_outcnt
(it is legitimate to have a smaller number of output, especially after some optimizations).Some of these checks have an additional subtlety: it's ok to have some wrong shape in dead code, for instance
IfProjections
. After a lot of investigation, it seems that some dead loops are not always detected eagerly and can make some control path survive longer, until being removed before loop opts. This seems to be by design to avoid traversing the whole graph everytime a region lose an input. It seems such misshape is harmless because they are not reachable from the inputs, and the cost of removing them would be prohibitive. To deal with such cases, when such a check fails, we check whether it happened in dead code. The dead of unreachable control nodes is lazily computed to answer that, and it's shared across checkers. While computing unreachable nodes is somewhat expensive, it seems to happen rarely in practice.This verification has found JDK-8359344 and JDK-8359121. It has been run on tiers 1 to 3, plus some internal testing and, after fixing the above-mentioned, it seems all passing!
Related future: add more checks, should be easy.
Less related future: could we imagine using similar patterns (without the error reporting mechanism) to use for optimizations, instead of manual traversing? It could make the code clearer to understand. We could also imagine optionally using such things in idealization to declare which patterns nodes are looking for, and if they have depth greater than 1, automatically adapting the enqueuing strategy without having to pimp
PhaseIterGVN::add_users_of_use_to_worklist
everytime. Could at least cover some basic (but numerous) cases.Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26362/head:pull/26362
$ git checkout pull/26362
Update a local copy of the PR:
$ git checkout pull/26362
$ git pull https://git.openjdk.org/jdk.git pull/26362/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 26362
View PR using the GUI difftool:
$ git pr show -t 26362
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26362.diff
Using Webrev
Link to Webrev Comment