Enable ExecuTorch Tensor to store arbitrary resources #8939

@SS-JIA

🚀 The feature, motivation and pitch

Currently, ExecuTorch Tensors (henceforth referred to as ETensor) store a pointer to a CPU array containing the Tensor's data. Technically, since ETensor only stores a raw pointer, the pointer could refer to any resource, but there is an understanding that the data pointer points to a CPU array.

The consequence of this is that backends such as Vulkan (which performs compute on the GPU) will need to copy the contents of input/output ETensors to/from some kind of specialized memory/representation before and after inference. This adds a copy overhead when using certain delegates.

The copy overhead is unavoidable if the overall inference pipeline produces input data on the CPU and the outputs must be consumed on the CPU. However, in some cases it is possible to produce/consume data on the same memory type/device used by a delegate, for example when using the Vulkan delegate for inference on a Vulkan-based rendering platform. In this case the "restriction" that ETensor should only store a CPU buffer adds even more overhead, since inputs/outputs have to be copied to the CPU to be wrapped with an ETensor, only to be copied again to the original memory type/device for inference.

To alleviate the copy overhead in these use-cases, it would be great to provide a mechanism to specify what kind of data structure is being referenced by the raw pointer stored by an ETensor and thus allow ETensor to wrap arbitrary opaque data structures that can be interpreted by delegates.

One possible solution comes from @JacobSzwejbka, who suggested adding a Device tag to ETensor, which would signal to consumers of the ETensor how the raw pointer should be interpreted.
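To make the idea concrete, here is a rough sketch of what a device-tagged tensor could look like. Everything in it is hypothetical and for illustration only: `DeviceType`, `TaggedTensor`, `FakeVulkanBuffer`, and the helper functions are made-up names, not part of the ExecuTorch API, and the internal RFC may propose something quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical device tag; NOT an existing ExecuTorch type.
enum class DeviceType : std::uint8_t {
  CPU,     // data points to a host-memory array (today's implicit assumption)
  Vulkan,  // data points to an opaque, backend-owned GPU resource
};

// Hypothetical, stripped-down stand-in for ETensor: a raw pointer plus a tag
// that tells consumers how the pointer should be interpreted.
struct TaggedTensor {
  void* data;
  std::size_t numel;
  DeviceType device;
};

// Stand-in for a backend-owned resource; a real delegate would use its own
// buffer/image wrapper here.
struct FakeVulkanBuffer {
  std::vector<float> storage;  // pretend this lives in device memory
};

// CPU consumers check the tag before dereferencing the raw pointer.
inline float* cpu_data(const TaggedTensor& t) {
  assert(t.device == DeviceType::CPU && "pointer is not a CPU array");
  return static_cast<float*>(t.data);
}

// A Vulkan-like delegate only needs a staging copy when the tensor does not
// already wrap one of its own resources.
inline bool needs_staging_copy(const TaggedTensor& t) {
  return t.device != DeviceType::Vulkan;
}

// Returns the wrapped resource, or nullptr if the tag says it is not ours.
inline FakeVulkanBuffer* vulkan_resource(const TaggedTensor& t) {
  return t.device == DeviceType::Vulkan
      ? static_cast<FakeVulkanBuffer*>(t.data)
      : nullptr;
}
```

With a tag like this, a delegate could skip the CPU round trip whenever the caller hands it a tensor that already wraps the delegate's own resource, while existing CPU-only code paths keep working as long as they check the tag first.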

Alternatives

No response

Additional context

No response

RFC (Optional)

@JacobSzwejbka wrote an internal RFC for adding a Device tag to ETensor. Unfortunately this document is not available externally at the moment.

cc @JacobSzwejbka

Metadata

Assignees

Labels

module: runtime (Issues related to the core runtime and code under runtime/)

triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

Status

Backlog

Milestone

No milestone
