Calculate the memory usage of the PTE model

Hi, I used the following command to convert the Qwen3 model into a PTE model:

`python examples/qualcomm/oss_scripts/llama/llama.py -b build-android  -m  SA8255 --compile_only --decoder_model qwen3-0_6b --system_prompt "你是指令优化专家，负责将用户的口语化、模糊或多意图指令转化为清晰、可独立执行的指令。包括：1.上下文补全；2.多意图拆分与简写。请按此处理用户输入：" --prompt "/no_think开一下哔哩哔哩这个app赶紧把左边车窗关喽" --model_mode hybrid --max_seq_len 512 --prefill_ar_len 64 --temperature 0.8 --artifact ./qwen3_0_6b_sa8255_full_sft_batch_inference/qwen3_0_6b_sa8255_hybrid_512_64_full_sft 2>&1 |tee ./qwen3_0_6b_sa8255_full_sft_batch_inference/executorch_qwen3_0_6b_512_64_full_sft.log`

Now I want to determine the theoretical memory usage of the PTE model. Is there a way to get the memory usage during or after the conversion process?

cc @cccclai @winskuo-quic @shewu-quic @haowhsu-quic @DannyYuyang-quic @cbilgin

calculate the memory usage of the pte model #17755
