
Memory Profiling in DevTools #8911

Open
@GregoryComer

Description

🚀 The feature, motivation and pitch

Measuring inference and load-time memory consumption is a common task when evaluating ExecuTorch. It's useful for determining whether a model can actually run on consumer hardware and for checking parity against other inference solutions. Users can profile their entire app, but it takes effort to figure out which memory is owned by ET, and whole-app profiling doesn't provide a detailed breakdown of where the memory usage is coming from.

We have performance profiling in the ExecuTorch devtools. It would be good to also have memory profiling support. What exactly is captured at runtime needs some discussion, but as a user, I'd ideally want to know the following:

  • What is the peak memory used by the framework (including delegates) during inference?
    • How much is owned by ET?
    • How much is owned by delegates?

Some nice to haves:

  • How much (peak) memory did we use during model load?
  • Can we take RSS delta measurements on supported systems?

There is some overlap between this and memory planning visualization. In my opinion, memory profiling tooling fills an important role that static memory planning visualization can't: covering delegates that allocate memory dynamically, and tracking other dynamic allocations (such as from kernels or dynamic unbound tensors).

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

cc @Jack-Khuu

Metadata

Labels: module: devtools (Issues related to developer tools and code under devtools/), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Status: Backlog
