GC can clear the locals of a live generator when its frame wrapper is GC-collected #150753

@KowalskiThomas

Bug report

Bug description:

frame_traverse has a guard that skips traversal of the locals and value stack when the underlying _PyInterpreterFrame is not owned by the PyFrameObject itself:

cpython/Objects/frameobject.c

Lines 1964 to 1977 in c5516e7

frame_traverse(PyObject *op, visitproc visit, void *arg)
{
    PyFrameObject *f = PyFrameObject_CAST(op);
    Py_VISIT(f->f_back);
    Py_VISIT(f->f_trace);
    Py_VISIT(f->f_extra_locals);
    Py_VISIT(f->f_locals_cache);
    Py_VISIT(f->f_overwritten_fast_locals);
    if (f->f_frame->owner != FRAME_OWNED_BY_FRAME_OBJECT) {
        return 0;
    }
    assert(f->f_frame->frame_obj == NULL);
    return _PyFrame_Traverse(f->f_frame, visit, arg);
}

frame_tp_clear has the same structure but is missing this guard:

cpython/Objects/frameobject.c

Lines 1979 to 1999 in c5516e7

static int
frame_tp_clear(PyObject *op)
{
    PyFrameObject *f = PyFrameObject_CAST(op);
    Py_CLEAR(f->f_trace);
    Py_CLEAR(f->f_extra_locals);
    Py_CLEAR(f->f_locals_cache);
    Py_CLEAR(f->f_overwritten_fast_locals);
    /* locals and stack */
    _PyStackRef *locals = _PyFrame_GetLocalsArray(f->f_frame);
    _PyStackRef *sp = f->f_frame->stackpointer;
    assert(sp >= locals);
    while (sp > locals) {
        sp--;
        PyStackRef_CLEAR(*sp);
    }
    f->f_frame->stackpointer = locals;
    Py_CLEAR(f->f_frame->f_locals);
    return 0;
}

I believe this can happen when a generator is created from an existing PyFrameObject via the C API (e.g. with PyGen_New). The original frame object is tracked by the GC, and its frame has owner FRAME_OWNED_BY_FRAME_OBJECT. Its data is copied into gen->gi_iframe, and f->f_frame is redirected to &gen->gi_iframe. After this, we have f->f_frame->owner == FRAME_OWNED_BY_GENERATOR and gi_iframe.frame_obj == NULL.

Then, because gi_iframe.frame_obj == NULL, gen_traverse does not visit the frame object. The frame object can therefore end up in a reference cycle that the GC treats as unreachable, and the generator does nothing to keep it alive. When the GC collects that cycle and calls frame_tp_clear, the function walks the slots of f->f_frame->localsplus and calls PyStackRef_CLEAR on each one. As a result, the generator's locals are cleared while the generator is still suspended.
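
To make the sequence above concrete, here is a minimal standalone C model. It is not CPython code and not a reproducer: every `toy_` name is hypothetical, plain `int *` slots stand in for `_PyStackRef`, and reference counting and the GC itself are left out. It only shows that an unguarded clear through the frame wrapper wipes slots stored inside the generator, and that the same ownership check `frame_traverse` already performs would leave them alone. Whether mirroring that guard is the right fix for `frame_tp_clear` is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for CPython's structures; every name here is hypothetical. */
enum toy_owner { TOY_OWNED_BY_FRAME_OBJECT, TOY_OWNED_BY_GENERATOR };

#define TOY_NLOCALS 2

struct toy_iframe {
    enum toy_owner owner;
    int *localsplus[TOY_NLOCALS];   /* slots holding "references" */
    int **stackpointer;             /* one past the last live slot */
};

struct toy_generator {
    struct toy_iframe gi_iframe;    /* frame data embedded in the generator */
};

struct toy_frame_object {
    struct toy_iframe *f_frame;     /* may point into a generator */
};

/* Mirrors the loop in frame_tp_clear: wipe every slot unconditionally. */
static void toy_clear_unguarded(struct toy_frame_object *f)
{
    int **locals = f->f_frame->localsplus;
    int **sp = f->f_frame->stackpointer;
    assert(sp >= locals);
    while (sp > locals) {
        sp--;
        *sp = NULL;
    }
    f->f_frame->stackpointer = locals;
}

/* Same loop, preceded by the ownership check that frame_traverse has. */
static void toy_clear_guarded(struct toy_frame_object *f)
{
    if (f->f_frame->owner != TOY_OWNED_BY_FRAME_OBJECT) {
        return;  /* the slots belong to the generator; leave them alone */
    }
    toy_clear_unguarded(f);
}

/* Build a suspended "generator" with two live locals and a frame wrapper
   pointing at its embedded frame, clear the wrapper, and return how many
   of the generator's locals are still set afterwards. */
static int toy_live_locals_after_clear(int use_guard)
{
    static int a = 1, b = 2;
    struct toy_generator gen;
    gen.gi_iframe.owner = TOY_OWNED_BY_GENERATOR;
    gen.gi_iframe.localsplus[0] = &a;
    gen.gi_iframe.localsplus[1] = &b;
    gen.gi_iframe.stackpointer = gen.gi_iframe.localsplus + TOY_NLOCALS;

    struct toy_frame_object wrapper = { &gen.gi_iframe };
    if (use_guard) {
        toy_clear_guarded(&wrapper);
    } else {
        toy_clear_unguarded(&wrapper);
    }

    int live = 0;
    for (int i = 0; i < TOY_NLOCALS; i++) {
        if (gen.gi_iframe.localsplus[i] != NULL) {
            live++;
        }
    }
    return live;
}
```

Calling `toy_live_locals_after_clear(0)` models the current behavior (both of the generator's slots end up NULL), while `toy_live_locals_after_clear(1)` models a guarded clear (both slots survive). In the real function the wrapper's own fields (f_trace, f_extra_locals, and so on) would presumably still be cleared before the guard, as they are visited before it in frame_traverse.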

(I wasn't able to come up with a simple / self-contained reproducer.)

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS

Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), type-bug (An unexpected behavior, bug, or error)
