Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpftime trace: tracepipe support for bpf_printk #253

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

aneeshdamle11
Copy link
Contributor

@aneeshdamle11 aneeshdamle11 commented Mar 21, 2024

Description

bpf_printk helper currently outputs to the target program. The kernel eBPF behaviour outputs the result to a named tracepipe (/sys/kernel/debug/tracing/trace_pipe). This PR tries to emulate this behaviour: bpf_printk outputs to a tracepipe, this pipe is accessed by "bpftime trace" option.
The trace pipe is created at "$HOME/.bpftime/tracepipe" or "(bpftime installation location) + /tracepipe"
Dependency: bpftime trace needs to be run in a separate terminal before running bpftime start option.

Fixes #157

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Test A
  • Test B

Test Configuration:

  • Firmware version:
  • Hardware:
  • Toolchain:
  • SDK:

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

@aneeshdamle11
Copy link
Contributor Author

I have tested this feature thus:
I opened 3 terminals, in given order:

  1. bpftime load ./example/malloc/malloc
  2. bpftime trace
  3. bpftime start ./example/malloc/victim

Below are the output screenshots:

  1. bpftime load
    bpftime_load

  2. bpftime trace
    bpftime_trace

  3. bpftime start (called 3 times, screenshot of 2 calls due to screenshot space constraints)
    bpftime_start

@Officeyutong
Copy link
Contributor

Officeyutong commented Mar 22, 2024

Thank you! But note that we also need to keep the ability to get bpf_trace_printk prints things to stdout. don't remove such feature

@aneeshdamle11
Copy link
Contributor Author

Okay, could you help with clarifying these doubts before I make the changes in the next commit?

  1. Should the continuous reading of tracepipe be carried out in the bpftime load option (like load_sock_ops in previous Linux versions ref:https://elixir.bootlin.com/linux/v4.18/source/samples/bpf/load_sock_ops.c#93)? Then, the bpf_printk would output contents to the "bpftool load" process.
    OR
  2. Should the feature work thus: load, start, attach work as they normally should (Here, bpf_printk outputs things to tracepipe, and is not seen in the target output). When "bpftool trace" option is called, the tracepipe contents are read and printed.
    It would help immensely if you could clear these doubts, thank you!

@Officeyutong
Copy link
Contributor

Officeyutong commented Mar 22, 2024

Okay, could you help with clarifying these doubts before I make the changes in the next commit?

  1. Should the continuous reading of tracepipe be carried out in the bpftime load option (like load_sock_ops in previous Linux versions ref:https://elixir.bootlin.com/linux/v4.18/source/samples/bpf/load_sock_ops.c#93)? Then, the bpf_printk would output contents to the "bpftool load" process.
    OR
  2. Should the feature work thus: load, start, attach work as they normally should (Here, bpf_printk outputs things to tracepipe, and is not seen in the target output). When "bpftool trace" option is called, the tracepipe contents are read and printed.
    It would help immensely if you could clear these doubts, thank you!

I think we could use a option such as --print-to-trace-pipe, -p to control whether bpf_printk should print things to trace pipe. If the option is not set, bpf_printk should print things to stdout of the injected program. if set, it should print things to a pipe, and we can use bpftime trace to read things from it.

The option could be set through bpftime start, e.g bpftime start -p ./victim. We can use environment variable to pass the option to libbpftime-agent.so.

For further, we could also support set that option when using bpftime load. But in this way the option should be stored in shared memory, and agent should load that option from shared memory and determine where to print things printed by bpf_printk

@pull-request-size pull-request-size bot added size/L and removed size/M labels Mar 23, 2024
@aneeshdamle11
Copy link
Contributor Author

The code now runs with --print-to-trace-pipe option, or -p option for bpftime start, and normally printing on the target program's stdout otherwise. "TRACEPIPE_PATH=" is the environment variable

Test results:
bpftime load ./example/malloc/malloc run normally in a separate terminal.
bpftime trace:
bpftime_trace2
bpftime start -p ./example/malloc/victim
bpftime_start2

@Officeyutong
Copy link
Contributor

Thank you! But please make the CI happy

@aneeshdamle11
Copy link
Contributor Author

I tried poring over the code, and tried some checks. But I'm not able to point out the exact check fail. Could you please help in pointing out some areas in code that could be the cause?

@Officeyutong
Copy link
Contributor

I tried poring over the code, and tried some checks. But I'm not able to point out the exact check fail. Could you please help in pointing out some areas in code that could be the cause?

You may try running target bpftime_runtime_tests on your local machine. In yout newest commit, this check fails

@aneeshdamle11
Copy link
Contributor Author

The check was failing at normal bpftime attach. The error was due to direct conversion from char * getenv to string. On adding a separate condition, the attach worked properly.

@aneeshdamle11
Copy link
Contributor Author

aneeshdamle11 commented Mar 26, 2024

Now there's a different failing check. On checking a different PR where this check passes, the following command is run:

image

When I run it on my Ubuntu machine, it gives the following success output:
Specs: Kubuntu 23.10, kernel 6.5.0-26-generic
image

Since I am not able to regenerate the issue, I am at a blocker.
Could you please help with any inputs on what the error might be, or what I should concentrate on correcting?

@aneeshdamle11
Copy link
Contributor Author

Also, I forgot to mention one thing: when bpftime start is used with the --print-to-trace-pipe option, it has to have bpftime trace running already, or else the bpf_printk output might block. Would this be an issue; should I document it?

@Officeyutong
Copy link
Contributor

Also, I forgot to mention one thing: when bpftime start is used with the --print-to-trace-pipe option, it has to have bpftime trace running already, or else the bpf_printk output might block. Would this be an issue; should I document it?

It should be documented, or the better is that it could be fixed

@Officeyutong
Copy link
Contributor

I'm investigating on that. seems that the CI script (run_example.py) was stuck at somewhere pending for IO, I guess it would be related to the flush of stdio fds (When executed by run_example.py, stdout/stderr seems necessary to be manually flushed, like

fflush(stdout);
). I need more time to figure out what's the root cause, but you may be also try to flush stdout/stderr

@Kailian-Jacy
Copy link
Contributor

Gone through the code in PR, it seems modifications has nothing to do with stdio during the test example.

I tried to run test locally, printing debugger told me the program hangs on the line

await asyncio.gather(server.communicate(), agent.communicate())

And at that time these two proc-s have exited according to htop.
image

Are we sure that communicate() with exit process really works?

@yunwei37 yunwei37 force-pushed the master branch 3 times, most recently from 1354316 to a6a132b Compare September 11, 2024 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Add trace_pipe support for bpf_printk
3 participants