🚀 The feature, motivation and pitch
Measuring inference and load-time memory consumption is a common task when looking to use ExecuTorch. It's useful for determining whether a model can actually run on consumer hardware and for checking parity against other inference solutions. Users can profile their entire app, but it takes some work to figure out which memory is owned by ExecuTorch (ET), and whole-app profiling doesn't give a detailed breakdown of where the memory usage comes from.
We have performance profiling in the ExecuTorch devtools. It would be good to also have memory profiling support. What exactly is captured at runtime needs some discussion, but as a user, I'd ideally want to know the following:
- What is the peak memory used by the framework (including delegates) during inference?
- How much is owned by ET? (One rough way to approximate this today is sketched after this list.)
- How much is owned by delegates?
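For the ET-owned portion specifically, the best workaround I'm aware of today is to wrap the allocators handed to the runtime and record a high-water mark. Below is a minimal sketch, assuming `MemoryAllocator::allocate()` is virtual and takes `(size, alignment)` as in `runtime/core/memory_allocator.h`; the namespace and exact signature may differ across releases, and `TrackingAllocator` is a hypothetical name, not an existing devtools API.

```cpp
// Hypothetical sketch: record a high-water mark on a MemoryAllocator.
// Assumes allocate() is virtual with (size, alignment) parameters; verify
// against the MemoryAllocator in your ExecuTorch version.
#include <algorithm>
#include <cstddef>
#include <cstdint>

#include <executorch/runtime/core/memory_allocator.h>

class TrackingAllocator : public executorch::runtime::MemoryAllocator {
 public:
  TrackingAllocator(uint32_t size, uint8_t* base_address)
      : executorch::runtime::MemoryAllocator(size, base_address) {}

  void* allocate(size_t size, size_t alignment) override {
    void* ptr = MemoryAllocator::allocate(size, alignment);
    if (ptr != nullptr) {
      // Ignores alignment padding, so this slightly under-counts.
      used_bytes_ += size;
      peak_bytes_ = std::max(peak_bytes_, used_bytes_);
    }
    return ptr;
  }

  size_t peak_bytes() const { return peak_bytes_; }

 private:
  size_t used_bytes_ = 0;
  size_t peak_bytes_ = 0;
};
```

Something like this only sees allocations routed through the wrapped allocator; it says nothing about memory that delegates or kernels allocate internally, which is exactly the gap this request is about.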
Some nice-to-haves:
- How much (peak) memory did we use during model load?
- Can we take RSS delta measurements on supported systems? (A minimal Linux sketch follows this list.)
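As a baseline for the RSS-delta idea, here is a minimal Linux sketch using `getrusage()`. The `peak_rss_bytes()` helper is hypothetical (not part of ExecuTorch), and the model-load call site is elided.

```cpp
// Minimal Linux sketch of an RSS delta around model load. ru_maxrss is the
// process's peak RSS in KiB on Linux (bytes on macOS); reading
// /proc/self/statm would give current RSS instead of the high-water mark.
#include <sys/resource.h>

#include <cstdio>

static long peak_rss_bytes() {
  struct rusage usage{};
  if (getrusage(RUSAGE_SELF, &usage) != 0) {
    return -1;
  }
  return usage.ru_maxrss * 1024L;  // KiB -> bytes on Linux.
}

int main() {
  const long before = peak_rss_bytes();

  // ... load the .pte and initialize the method here ...

  const long after = peak_rss_bytes();
  std::printf("load raised peak RSS by %ld bytes\n", after - before);
  return 0;
}
```

The caveat is that RSS deltas include everything in the process (allocator caches, other threads), so they're a sanity check rather than a precise attribution, which is why built-in devtools support would still help.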
There is some overlap between this and memory planning visualization. In my opinion, memory profiling tooling fills an important role that static memory planning visualization can't: it covers delegates that allocate memory dynamically, as well as other dynamic allocations (such as from kernels or tensors with unbound dynamic shapes).
Alternatives
No response
Additional context
No response
RFC (Optional)
No response
cc @Jack-Khuu