-
Notifications
You must be signed in to change notification settings - Fork 6
feat: profile memory of command #164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ca571b9 to
4bdc191
Compare
d168920 to
cffcf37
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds memory profiling capability to the codebase through a new Memory executor that uses eBPF-based tracking via a custom heaptrack crate. The implementation allows tracking memory allocations (malloc, free, calloc, realloc, etc.) in benchmarked programs.
Key Changes:
- Introduces a new
MemoryExecutorthat uses eBPF uprobes to track memory allocations - Adds a
heaptrackcrate with BPF programs for monitoring malloc/free and related syscalls - Refactors FIFO communication handling into shared code to support multiple executor types
- Updates test snapshots to use Git LFS for large binary data
Reviewed changes
Copilot reviewed 52 out of 54 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/run/runner/memory/executor.rs |
Implements the new Memory executor using heaptrack IPC for control |
src/run/runner/shared/fifo.rs |
Extracts shared FIFO message handling logic for reuse across executors |
src/run/runner/wall_time/perf/mod.rs |
Refactors to use shared FIFO handling code |
crates/heaptrack/* |
New crate providing eBPF-based memory tracking with BPF programs and IPC |
src/run/uploader/upload.rs |
Adds Memory executor case (currently unimplemented) |
flake.nix |
Adds Nix development environment configuration |
.github/workflows/ci.yml |
Updates CI to install dependencies and run BPF tests separately |
| Snapshot files | Migrates large test snapshots to Git LFS |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.
| loop { | ||
| if self.ctl_fifo.read_exact(&mut buffer).await.is_ok() { | ||
| break; | ||
| } | ||
| } |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This infinite loop will never exit if read_exact continuously fails. The loop should either have a timeout, a maximum retry count, or handle/propagate the error. This could lead to the process hanging indefinitely.
9aaf842 to
28a793b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI does not build, and the CI change commit scares me a bit, going forward not gonna lie
Just so I know, have you had the protobuf decision challenged ?
Could be good to have a small design document just to make sure the whole team is aligned behind the choice.
7f99647 to
3e3e2be
Compare
b782293 to
ab9e7f8
Compare
0e32eca to
9917752
Compare
b05d317 to
100f423
Compare
|
@GuillaumeLagrange As discussed, I updated the benchmark result types to use |
GuillaumeLagrange
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really cool! Let me know if you want to discuss naming
|
|
||
| #[derive(Debug, Clone, Copy, Serialize, Deserialize)] | ||
| pub struct HeaptrackEvent { | ||
| pub pid: u32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use pid_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's finally time to export a pid_t from runner-shared and use it from here
| use serde::{Deserialize, Serialize}; | ||
|
|
||
| #[derive(Debug, Clone, Serialize, Deserialize)] | ||
| pub struct MarkerResult { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the initial idea, but MarkerResult does feel a bit weird, especially since this struct has both the markers and the uri by timestamp.
How about ExecutionTimestamps or something like this ?
| Brk { size: u64 }, | ||
| } | ||
|
|
||
| pub trait BenchmarkResultExt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what we are defining here is "A benchmark execution output that we persist to file"
I like Artifact to convey the idea of persisting this to file rather than Result, WDYT ?
And I think Execution rather than Benchmark is more neutral, and technically this is the output of executor.
Since we are persisting these let's try and nail the naming before releasing
| .source("src/bpf/heaptrack.bpf.c") | ||
| .clang_args([ | ||
| "-I", | ||
| &vmlinux::include_path_root().join(arch).to_string_lossy(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this builds using bpf headers for the current kernel. Wont that be a huge issue going forward that basically make shipping pre-built heaptrack is going to be impossible ?
| } calloc_size SEC(".maps"); | ||
|
|
||
| SEC("uprobe") | ||
| int uprobe_calloc(struct pt_regs *ctx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for this file, if I understand correctly it's not generated, we could make this much less repetitive by using proc macros. WDYT ?
This would also ease integration with custom allocator in the future as we could expose the proc macros and clear and documented arguments to each, and the burden of itnegration would be quite small IMO
100f423 to
863e36a
Compare
863e36a to
5f87e90
Compare
Changes in this PR:
TODO:
Parse the backend responseskipped for now, we'll add this later