
How to Apply Different Quantization Settings Per Layer in ExecuTorch? #6846

Open
@crinex

Description

Dear @kimishpatel @jerryzh168 @shewu-quic

I want to split a model (e.g., Llama-3.2-3B) into multiple layers and apply different quantization settings (qnn_8a8w, qnn_16a4w, ...) to each layer.
Has such a method been tested in ExecuTorch?
If not, could you suggest how this can be achieved?
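
For illustration, here is the kind of per-module override I have in mind. This is only a minimal sketch using the generic PT2E quantization flow and the XNNPACKQuantizer's set_module_name hook; the model, module names (block0, block1), and configs are made up for the example, and whether the Qualcomm QnnQuantizer with its qnn_8a8w / qnn_16a4w configs exposes an equivalent per-module mechanism is exactly what I am asking about.

```python
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)


class TwoBlockModel(torch.nn.Module):
    """Toy stand-in for a model whose blocks should get different quant configs."""

    def __init__(self):
        super().__init__()
        self.block0 = torch.nn.Linear(64, 64)
        self.block1 = torch.nn.Linear(64, 64)

    def forward(self, x):
        return self.block1(torch.relu(self.block0(x)))


model = TwoBlockModel().eval()
example_inputs = (torch.randn(1, 64),)

# Capture the model for PT2E quantization. Depending on the PyTorch version,
# capture_pre_autograd_graph(...) may be needed here instead.
exported = torch.export.export_for_training(model, example_inputs).module()

quantizer = XNNPACKQuantizer()
# Global default: 8-bit symmetric, per-tensor weights.
quantizer.set_global(get_symmetric_quantization_config())
# Override one submodule by its fully qualified name with a different config
# (here: per-channel weights), which is the per-layer behavior I want for QNN.
quantizer.set_module_name(
    "block1",
    get_symmetric_quantization_config(is_per_channel=True),
)

prepared = prepare_pt2e(exported, quantizer)
prepared(*example_inputs)  # calibration pass
quantized = convert_pt2e(prepared)
```

If something along these lines is already supported (or planned) for the QNN quantizer, a pointer to the relevant API or example would be very helpful.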

Thank you

Labels

module: quantization — Issues related to quantization
partner: qualcomm — For backend delegation, kernels, demo, etc. from the 3rd-party partner, Qualcomm
triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
