Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow filtering to list of pid (additionally via executable name) / tid for --exportTo #524

Open
GitMensch opened this issue Oct 12, 2023 · 7 comments

Comments

@GitMensch
Copy link
Contributor

Is your feature request related to a problem? Please describe.
After running system-wide recordings (actually limited to a service user already) I often get huge amount of warning messages like "Invalid memory read requested by dwf" and "failed to attach state Couldn't find architecture of any ELF", "No DWARF information found 2", "No DWARF information found 5" during open/export.

Then I'm in Hotspot seeing a lot of processes I'm not interested in at all (bash, grep, ls, and others), so I need to always filter when inspecting the recording.

Describe the solution you'd like
Provide an option to filter when converting to perfparser file like it is possible in the timeline view ([list of] name/pid).
It may be possible to skip several things then, that not match the filter during import, possibly also reducing the amount of warning messages (for example adding a filter "only processes that contain myApp") - but in any case the .perfparser output file would be much smaller, could be opened faster in Hotspot, would not need the filter to be post-applied each time it is opened.

@milianw
Copy link
Member

milianw commented Oct 13, 2023

While possible, this would require special code as the current filtering is applied to the analyzed hotspot specific data, not the raw perfparser data. Meaning this isn't that trivial to implement.

@GitMensch

This comment was marked as outdated.

@GitMensch
Copy link
Contributor Author

GitMensch commented Jan 11, 2025

More important seem These are the one that perf report have which I'd like to see implemented:

       --pid=
           Only show events for given process ID (comma separated list).

       --tid=
           Only show events for given thread ID (comma separated list).

This would allow to use things like perf report --tasks to look into the perf file, find the PIDs that are relevant, then give this as a list to perfparser (via hotspot --exportTo).

Additional to this something like

       --executable=
           Filter on PIDs that exactly match the given name (comma separated list).

would be good, because this often removes the need to get the list of PIDs at all and perfectly match the scenario of "I record the user/system/bunch of executables below a script but am only interested in one or two executables which I can adjust and have full debug data for".

@milianw
Copy link
Member

milianw commented Jan 12, 2025

all of these filters are supported interactively in hotspot, which is much better. doing such kind of filtering as a pre-processing step could be done, but is more work. Most notably, how do you filter on a singular symbol or dso - do you include the full backtrace then if the symbol or dso is contained, or do you ignore other frames then? do you take into account whether the symbol/dso is the leaf self cost or whether it's on the inclusive stack?

there are various ways to support this, and you need to define how it should behave (and/or implement it yourself).

also note that I doubt anything will be opened faster - the bulk of the time is spent within perfparser and the filtering you suggest here would - except for the tid/pid filter - still require the full unwinding which is slow and thus not really make things faster.

I also believe that the warnings should be handled differently if anyone wants to spent time on this - reducing them by doing pre-processing / filtering is the wrong approach imo.

@GitMensch
Copy link
Contributor Author

GitMensch commented Jan 13, 2025

Thank you for the explanation. This makes it clear that the only reasonable filter during parse would be pid+tid (comma separated list), and that should speed up the perfparser part a lot (and in case of the export would also make the result much smaller).

I personally feel that this would also allow a bigger recording - this is mostly about system/user-wide or "all processes spawned by this start program/script" recording, where the PIDs are not know up-front - to be split for analysis (perf report --tasks allows us to inspect the list quickly "for free"). I'm changing the FR's title to have this covered.

This would filter on the PIDs - so no sub-process spawned from it.

@GitMensch GitMensch changed the title allow filtering for --exportTo allow filtering to list of pid/tid for --exportTo Jan 13, 2025
@GitMensch GitMensch changed the title allow filtering to list of pid/tid for --exportTo allow filtering to list of pid (additionally via executable name) / tid for --exportTo Jan 13, 2025
@milianw
Copy link
Member

milianw commented Jan 13, 2025

I'm not opposed to adding a pid filter, but I generally question why you'd do that - why not simply runtime attach to those pids directly during record? That way you already save a ton of overhead at that time, and all tools downstream will benefit too.

This is how I do it personally. If I want to profile startup times, I can still look at an individual app. But if I'm interested on the interaction of multiple apps, that's usually something that I can investigate by attaching after they have started (and I thus know their pid).

@GitMensch
Copy link
Contributor Author

The issue is that multiple processes are started from an original starting script (partially starting multiple processes from a script again), so to actually gather the PIDs takes some effort (at least for me; but a script that uses ps could possibly find those and start the recording; with some of the PIDs living ~10 seconds this doesn't get easier.

An option (in theory) is to start each program under perf recording without sub-programs (possibly using the exact timestamp as part of the output file), but then I get hundreds of recordings. (is there a way to combine those?)

To see the relation of the PIDs (they use IPC between them) and where the time general is spent over the complete complex (most programs are running the same code, but each with a (different) data-driven code-path), so recording the whole complex, then filtering is the most easy way to do.
Do you have a suggestion how to adjust my recording for this scenario?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants