Question Regarding Code Execution in the Calibration Process.

Dear @cccclai,

I’m reviewing the code while following the [guide](https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/LlamaDemo/docs/delegates/qualcomm_README.md) (the "Export with Spinquant" section) you provided for converting the Llama3.2-3B-Instruct model with Qualcomm SpinQuant. When I execute the `_export_llama` function in `export_llama_lib.py`, the `pt2e_quantize(quantizers)` function is called. Within this function, `pt2e_calibrate` is executed before `convert_pt2e`. Why is `pt2e_calibrate` performed before `convert_pt2e` here?
Generally, wouldn't it make more sense to perform calibration after quantization?

Thank you
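
For reference, the ordering I am asking about can be illustrated with a toy sketch. It assumes the usual observer-based post-training quantization flow, where calibration records activation ranges and the convert step then derives the scale and zero point from those ranges. All class and function names below are hypothetical stand-ins written for illustration; they are not the ExecuTorch or PyTorch APIs, and the real `pt2e_calibrate` / `convert_pt2e` may differ in detail.

```python
# Toy illustration only: hypothetical stand-ins, not ExecuTorch/PyTorch code.


class MinMaxObserver:
    """Records the smallest and largest value seen during calibration."""

    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, values):
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))

    def qparams(self, qmin=0, qmax=255):
        # Affine (asymmetric) parameters; the range is widened to include 0.0
        # so that zero is exactly representable.
        lo = min(self.min_val, 0.0)
        hi = max(self.max_val, 0.0)
        scale = (hi - lo) / (qmax - qmin)
        zero_point = round(qmin - lo / scale)
        return scale, zero_point


def calibrate(observer, batches):
    """Feed sample data through the observer (assumed role of calibration)."""
    for batch in batches:
        observer.observe(batch)


def convert(observer):
    """Turn observed statistics into quantization parameters."""
    if observer.min_val > observer.max_val:
        # No data was observed, so there is no range to derive a scale from.
        raise RuntimeError("convert called before calibration")
    return observer.qparams()


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to integers in [qmin, qmax] using the derived parameters."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


observer = MinMaxObserver()
calibrate(observer, [[-1.0, 0.5], [0.25, 3.0]])  # observed range: [-1.0, 3.0]
scale, zero_point = convert(observer)
quantized = quantize([-1.0, 0.0, 3.0], scale, zero_point)
print(scale, zero_point, quantized)
```

In this toy version, calling `convert` on a fresh observer fails because no range has been recorded yet, which is one possible reason the calibration call would come first.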

Question Regarding Code Execution in the Calibration Process. #6629
