-
Notifications
You must be signed in to change notification settings - Fork 148
perf/s390: Regression: Move uid filtering to BPF filters #9368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Upstream branch: 5345e64 |
b5fabf3
to
c883dca
Compare
Upstream branch: 5b4c54a |
6b69b66
to
5d732cf
Compare
c883dca
to
c82c120
Compare
Upstream branch: cd7c97f |
5d732cf
to
b50feed
Compare
c82c120
to
d97a5de
Compare
Upstream branch: e8d780d |
b50feed
to
2251d67
Compare
d97a5de
to
ba11888
Compare
Upstream branch: 821c9e5 |
2251d67
to
b673c89
Compare
ba11888
to
7b5771c
Compare
Upstream branch: a6923c0 |
V1 --> V2: Added Jiri Olsa's suggestion and introduced member bpf_perf_event_opts::no_ioctl_enable. On linux-next commit b4c658d ("perf target: Remove uid from target") introduces a regression on s390. In fact the regression exists on all platforms when the event supports auxiliary data gathering. Command # ./perf record -u 0 -aB --synth=no -- ./perf test -w thloop [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data ] # ./perf report --stats | grep SAMPLE # does not generate samples in the perf.data file. On x86 command # sudo perf record -e intel_pt// -u 0 ls is broken too. Looking at the sequence of calls in 'perf record' reveals this behavior: 1. The event 'cycles' is created and enabled: record__open() +-> evlist__apply_filters() +-> perf_bpf_filter__prepare() +-> bpf_program.attach_perf_event() +-> bpf_program.attach_perf_event_opts() +-> __GI___ioctl(..., PERF_EVENT_IOC_ENABLE, ...) The event 'cycles' is enabled and active now. However the event's ring-buffer to store the samples generated by hardware is not allocated yet. This happens now after enabling the event: 2. The event's fd is mmap() to create the ring buffer: record__open() +-> record__mmap() +-> record__mmap_evlist() +-> evlist__mmap_ex() +-> perf_evlist__mmap_ops() +-> mmap_per_cpu() +-> mmap_per_evsel() +-> mmap__mmap() +-> perf_mmap__mmap() +-> mmap() This allocates the ring-buffer for the event 'cycles'. With mmap() the kernel creates the ring buffer: perf_mmap(): kernel function to create the event's ring | buffer to save the sampled data. | +-> ring_buffer_attach(): Allocates memory for ring buffer. | The PMU has auxiliary data setup function. The | has_aux(event) condition is true and the PMU's | stop() is called to stop sampling. It is not | restarted: | if (has_aux(event)) | perf_event_stop(event, 0); | +-> cpumsf_pmu_stop(): Hardware sampling is stopped. No samples are generated and saved anymore. 3. After the event 'cycles' has been mapped, the event is enabled a second time in: __cmd_record() +-> evlist__enable() +-> __evlist__enable() +-> evsel__enable_cpu() +-> perf_evsel__enable_cpu() +-> perf_evsel__run_ioctl() +-> perf_evsel__ioctl() +-> __GI___ioctl(., PERF_EVENT_IOC_ENABLE, .) The second ioctl(fd, PERF_EVENT_IOC_ENABLE, 0); is just a NOP in this case. The first invocation in (1.) sets the event::state to PERF_EVENT_STATE_ACTIVE. The kernel functions perf_ioctl() +-> _perf_ioctl() +-> _perf_event_enable() +-> __perf_event_enable() returns immediately because event::state is already set to PERF_EVENT_STATE_ACTIVE. This happens on s390, because the event 'cycles' offers the possibility to save auxilary data. The PMU call backs setup_aux() and free_aux() are defined. Without both call back functions, cpumsf_pmu_stop() is not invoked and sampling continues. To remedy this, remove the first invocation of ioctl(..., PERF_EVENT_IOC_ENABLE, ...). in step (1.) Create the event in step (1.) and enable it in step (3.) after the ring buffer has been mapped. Make the change backward compatible and introduce a new structure member bpf_perf_event_opts::no_ioctl_enable. It defaults to false and only bpf_program__attach_perf_event() sets it to true. This way only perf tool invocation do not enable the sampling event. Output after: # ./perf record -aB --synth=no -u 0 -- ./perf test -w thloop 2 [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.876 MB perf.data ] # ./perf report --stats | grep SAMPLE SAMPLE events: 16200 (99.5%) SAMPLE events: 16200 # The software event succeeded before and after the patch: # ./perf record -e cpu-clock -aB --synth=no -u 0 -- \ ./perf test -w thloop 2 [ perf record: Woken up 7 times to write data ] [ perf record: Captured and wrote 2.870 MB perf.data ] # ./perf report --stats | grep SAMPLE SAMPLE events: 53506 (99.8%) SAMPLE events: 53506 # Fixes: 63f2f5e ("libbpf: add ability to attach/detach BPF program to perf event") To: Andrii Nakryiko <[email protected]> To: Ian Rogers <[email protected]> To: Ilya Leoshkevich <[email protected]> Signed-off-by: Thomas Richter <[email protected]> Suggested-by: Jiri Olsa <[email protected]>
b673c89
to
92e0fba
Compare
7b5771c
to
ccc6c10
Compare
At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=986451 expired. Closing PR. |
Pull request for series with
subject: perf/s390: Regression: Move uid filtering to BPF filters
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=986451