gdb/testsuite/gdb.rocm: extract reusable multi-inferior driver helpers#166
spatrang wants to merge 2 commits into
Move the shared non-stop multi-inferior driver logic into two helper procs in lib/rocm.exp: rocm_multi_inferior_run_to_kernels (set up the session, run the parent to the pre-fork breakpoint, resume, and collect one kernel stop per child) and rocm_multi_inferior_drain (continue each child to a clean exit and run the parent to completion). Convert multi-inferior-gpu.exp to use them. The extracted driver is intentionally stricter than the inlined original: it deduplicates GPU stops by inferior, uses literal-matched regexes, and fails loudly on timeout or a non-zero child exit instead of hanging. A follow-up test reuses the same helpers.
lancesix
left a comment
I have not really thought deeply about it, but the thing which tickles me here is that this helper implicitly relies on properties of the source file (the markers where to insert breakpoints).
If the functions built around those source assumptions are common, I'd expect the source file to be common as well. If we get to a point where we have multiple tests using those helpers, it will get harder to keep the source / tcl bits in sync.
It really feels like the .cpp of multi-inferior-gpu should also be made generic if we go this way.
Done. Rather than add a separate program, I generalized |
Generalize multi-inferior-gpu.cpp into a shared driver for the multi-inferior tests so the breakpoint markers the lib/rocm.exp helpers rely on live in one program instead of being duplicated. The child count is taken from argv when given and otherwise defaults to the number of GPU devices found at runtime, and each child re-execs itself through a "child" argv dispatch. Give rocm_multi_inferior_run_to_kernels default argument values so callers that want runtime discovery can omit them.
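For readers following the review, the argv handling described in this commit message can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the patch: the helper names `is_child_invocation` and `resolve_child_count` are hypothetical, and the device count is passed in as a plain parameter here, whereas the real program would presumably query the HIP runtime for it.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not from the patch): true when this process was
// re-exec'd as a child, i.e. invoked as "<prog> child ...".
static bool
is_child_invocation (int argc, char **argv)
{
  return argc > 1 && std::strcmp (argv[1], "child") == 0;
}

// Hypothetical helper (not from the patch): number of children the
// parent should fork.  argv[1] is used when it parses completely as a
// positive integer; otherwise fall back to DEVICE_COUNT, which stands
// in for the number of GPU devices found at runtime.
static int
resolve_child_count (int argc, char **argv, int device_count)
{
  assert (device_count >= 0);

  if (argc > 1)
    {
      char *end = nullptr;
      long n = std::strtol (argv[1], &end, 10);

      // Accept only a fully consumed, positive number.
      if (end != argv[1] && *end == '\0' && n > 0)
        return static_cast<int> (n);
    }

  return device_count;
}
```

In this sketch, the parent's `main` would test `is_child_invocation` first and only then call `resolve_child_count`, so that the "child" dispatch word is never mistaken for a count.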
N comes from argv[1] when given; otherwise it defaults to the number
of GPU devices found at runtime (one child per device). The
companion .exp helpers in lib/rocm.exp plant breakpoints on the
pre-fork and post-join source markers and on the kernel. */
s/post-join/post-waitpid.
return "others"
}
}
This entire rocm.exp change/code should be in gdb.rocm/ rather than lib/rocm.exp. You just need the common part as a TCL file and then include the file in whatever test that wants to use it. lib/rocm.exp changes are supposed to be for infrastructure purposes or very basic testsuite functionality.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
Cosmetic: Group the headers. First sys/, then hip/ then the rest. Or another order, but keep it organized since we're touching this anyway.
Why this PR
This is preparatory refactoring split out of #131 at reviewer request.
While reviewing #131 (which adds a new multi-inferior stress test), it
was noted that the new test shares most of its driver logic with the
existing gdb.rocm/multi-inferior-gpu.exp. Rather than duplicate that
logic, the common parts are extracted here into shared helpers first, so
that #131 can reuse them and its diff reduces to just what is genuinely
new.
The tests are kept separate (only the driver logic is shared).
Dependent PR
depends on this PR and will be rebased on top of it once this merges.
Summary
Extract the shared non-stop multi-inferior driver logic out of
gdb.rocm/multi-inferior-gpu.exp into two reusable helper procs in
gdb/testsuite/lib/rocm.exp, and convert the existing test to use them.
Helpers added (lib/rocm.exp)
rocm_multi_inferior_run_to_kernels {args_list expected}: load the
program, enable non-stop with detach-on-fork off and follow-fork-mode
parent, plant the breakpoints, run the parent to its pre-fork
breakpoint, resume in the background, and collect one kernel
breakpoint stop per child inferior. Returns the list of stopped GPU
thread ids. The child count can be passed explicitly or discovered at
runtime.
rocm_multi_inferior_drain {threads}: continue each stopped GPU
inferior to a clean exit, wait for the parent to reach its
post-waitpid breakpoint, and run the parent to completion.
Behavior note
The extracted driver is intentionally stricter than the inlined
original: it deduplicates GPU stops by inferior, uses literal-matched
regexes, and fails loudly on timeout or a non-zero child exit instead of
hanging. Coverage of the converted test is otherwise unchanged.
Files changed
gdb/testsuite/lib/rocm.exp: add the two helpers.
gdb/testsuite/gdb.rocm/multi-inferior-gpu.exp: convert to use them.