Skip to content

GH-130415: Improve the JIT's unneeded uop removal pass #132333

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 21, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions Lib/test/test_capi/test_opt.py
Original file line number Diff line number Diff line change
Expand Up @@ -1283,7 +1283,7 @@ class Bar:
load_attr_top = opnames.index("_LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES", 0, call)
load_attr_bottom = opnames.index("_LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES", call)
self.assertEqual(opnames[:load_attr_top].count("_GUARD_TYPE_VERSION"), 1)
self.assertEqual(opnames[call:load_attr_bottom].count("_CHECK_VALIDITY"), 1)
self.assertEqual(opnames[call:load_attr_bottom].count("_CHECK_VALIDITY"), 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have two _CHECK_VALIDITYs now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There used to be one validity check, after the setattr call. This adds an additional one after the _POP_TOP following the call, since in theory the function could return an object with a destructor that invalidates the JIT code when it's popped.

Because of the old structure of this optimization pass, we weren't actually treating _POP_TOP as escaping, even though it is (this is handled in the default case, which we fall through to now).


def test_guard_type_version_removed_escaping(self):

Expand All @@ -1306,7 +1306,7 @@ class Foo:
load_attr_top = opnames.index("_LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES", 0, call)
load_attr_bottom = opnames.index("_LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES", call)
self.assertEqual(opnames[:load_attr_top].count("_GUARD_TYPE_VERSION"), 1)
self.assertEqual(opnames[call:load_attr_bottom].count("_CHECK_VALIDITY"), 1)
self.assertEqual(opnames[call:load_attr_bottom].count("_CHECK_VALIDITY"), 2)

def test_guard_type_version_executor_invalidated(self):
"""
Expand Down Expand Up @@ -1601,7 +1601,7 @@ def testfunc(n):
self.assertIsNotNone(ex)
uops = get_opnames(ex)
self.assertNotIn("_COMPARE_OP_INT", uops)
self.assertIn("_POP_TWO_LOAD_CONST_INLINE_BORROW", uops)
self.assertNotIn("_POP_TWO_LOAD_CONST_INLINE_BORROW", uops)

def test_to_bool_bool_contains_op_set(self):
"""
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
Improve the JIT's ability to remove unused constant and local variable
loads, and fix an issue where deallocating unused values could cause JIT
code to crash or behave incorrectly.
53 changes: 38 additions & 15 deletions Python/optimizer_analysis.c
Original file line number Diff line number Diff line change
Expand Up @@ -555,28 +555,47 @@ remove_unneeded_uops(_PyUOpInstruction *buffer, int buffer_size)
}
break;
case _POP_TOP:
case _POP_TOP_LOAD_CONST_INLINE:
case _POP_TOP_LOAD_CONST_INLINE_BORROW:
case _POP_TWO_LOAD_CONST_INLINE_BORROW:
optimize_pop_top_again:
{
_PyUOpInstruction *last = &buffer[pc-1];
while (last->opcode == _NOP) {
last--;
}
if (last->opcode == _LOAD_CONST_INLINE ||
last->opcode == _LOAD_CONST_INLINE_BORROW ||
last->opcode == _LOAD_FAST ||
last->opcode == _LOAD_FAST_BORROW ||
last->opcode == _COPY
) {
last->opcode = _NOP;
buffer[pc].opcode = _NOP;
}
if (last->opcode == _REPLACE_WITH_TRUE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we not optimizing this any more? I would have thought it would go in the same case as _POP_TOP_LOAD_CONST_INLINE_BORROW.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. Though these should probably just be turned into _POP_TOP_LOAD_CONST_INLINE_BORROW by the optimizer instead.

I'll leave it this way and open up another PR for that. Having the hardcoded constant makes things more difficult, since we'd need to start writing operands instead of operating purely on opcodes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I just added that change as part of this PR.

last->opcode = _NOP;
switch (last->opcode) {
case _POP_TWO_LOAD_CONST_INLINE_BORROW:
last->opcode = _POP_TOP;
break;
case _POP_TOP_LOAD_CONST_INLINE:
case _POP_TOP_LOAD_CONST_INLINE_BORROW:
last->opcode = _NOP;
goto optimize_pop_top_again;
case _COPY:
case _LOAD_CONST_INLINE:
case _LOAD_CONST_INLINE_BORROW:
case _LOAD_FAST:
case _LOAD_FAST_BORROW:
case _LOAD_SMALL_INT:
last->opcode = _NOP;
if (opcode == _POP_TOP) {
opcode = buffer[pc].opcode = _NOP;
}
else if (opcode == _POP_TOP_LOAD_CONST_INLINE) {
opcode = buffer[pc].opcode = _LOAD_CONST_INLINE;
}
else if (opcode == _POP_TOP_LOAD_CONST_INLINE_BORROW) {
opcode = buffer[pc].opcode = _LOAD_CONST_INLINE_BORROW;
}
else {
assert(opcode == _POP_TWO_LOAD_CONST_INLINE_BORROW);
opcode = buffer[pc].opcode = _POP_TOP_LOAD_CONST_INLINE_BORROW;
goto optimize_pop_top_again;
}
}
break;
_Py_FALLTHROUGH;
}
case _JUMP_TO_TOP:
case _EXIT_TRACE:
return pc + 1;
default:
{
/* _PUSH_FRAME doesn't escape or error, but it
Expand All @@ -591,7 +610,11 @@ remove_unneeded_uops(_PyUOpInstruction *buffer, int buffer_size)
buffer[last_set_ip].opcode = _SET_IP;
last_set_ip = -1;
}
break;
}
case _JUMP_TO_TOP:
case _EXIT_TRACE:
return pc + 1;
}
}
Py_UNREACHABLE();
Expand Down
1 change: 1 addition & 0 deletions Python/optimizer_bytecodes.c
Original file line number Diff line number Diff line change
Expand Up @@ -913,6 +913,7 @@ dummy_func(void) {
}

op(_REPLACE_WITH_TRUE, (value -- res)) {
REPLACE_OP(this_instr, _POP_TOP_LOAD_CONST_INLINE_BORROW, 0, (uintptr_t)Py_True);
res = sym_new_const(ctx, Py_True);
}

Expand Down
1 change: 1 addition & 0 deletions Python/optimizer_cases.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading