
QLoRA and SpinQuant recipes fail to export on CI #8154

Open
@guangy10

Description


🐛 Describe the bug

The job started failing last Thursday: https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=export-models%20(meta-llama&mergeLF=true

It looks like the root cause is this PR: #7927

Stacktrace:

2025-01-31T00:34:23.3001996Z + DOWNLOADED_PATH=/var/lib/ci-user/.cache/huggingface/hub/models--meta-llama--Llama-3.2-1B-Instruct-QLORA_INT4_EO8/snapshots/3fdf98b6bc1069f632a468b0676299a0a1b65071
2025-01-31T00:34:23.3007472Z + python -m examples.models.llama.export_llama --model llama3_2 --checkpoint /var/lib/ci-user/.cache/huggingface/hub/models--meta-llama--Llama-3.2-1B-Instruct-QLORA_INT4_EO8/snapshots/3fdf98b6bc1069f632a468b0676299a0a1b65071/consolidated.00.pth --params /var/lib/ci-user/.cache/huggingface/hub/models--meta-llama--Llama-3.2-1B-Instruct-QLORA_INT4_EO8/snapshots/3fdf98b6bc1069f632a468b0676299a0a1b65071/params.json -qat -lora 16 --preq_mode 8da4w_output_8da8w --preq_group_size 32 --preq_embedding_quantize 8,0 --use_sdpa_with_kv_cache -kv -X --xnnpack-extended-ops -d fp32 --max_seq_length 2048 --output_name llama-3.2-1b-instruct-qlora-int4-eo8_llama3_qlora.pte --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}'
2025-01-31T00:34:23.3012177Z Traceback (most recent call last):
2025-01-31T00:34:23.3012854Z   File "/opt/conda/envs/py_3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2025-01-31T00:34:23.3013599Z     return _run_code(code, main_globals, None,
2025-01-31T00:34:23.3014243Z   File "/opt/conda/envs/py_3.10/lib/python3.10/runpy.py", line 86, in _run_code
2025-01-31T00:34:23.3014870Z     exec(code, run_globals)
2025-01-31T00:34:23.3015517Z   File "/pytorch/executorch/examples/models/llama/export_llama.py", line 32, in <module>
2025-01-31T00:34:23.3016259Z     main()  # pragma: no cover
2025-01-31T00:34:23.3016899Z   File "/pytorch/executorch/examples/models/llama/export_llama.py", line 28, in main
2025-01-31T00:34:23.3017583Z     export_llama(args)
2025-01-31T00:34:23.3018265Z   File "/pytorch/executorch/examples/models/llama/export_llama_lib.py", line 540, in export_llama
2025-01-31T00:34:23.3019069Z     builder = _export_llama(args)
2025-01-31T00:34:23.3019817Z   File "/pytorch/executorch/examples/models/llama/export_llama_lib.py", line 677, in _export_llama
2025-01-31T00:34:23.3020602Z     _validate_args(args)
2025-01-31T00:34:23.3021305Z   File "/pytorch/executorch/examples/models/llama/export_llama_lib.py", line 650, in _validate_args
2025-01-31T00:34:23.3022094Z     raise ValueError(
2025-01-31T00:34:23.3023561Z ValueError: max_context_length 128 must be >= max_seq_len 2048. max_context_length impacts kv cache size that is used to remember history, while max_seq_length refers to user prompt length. Please use --max_context_length to specify context length.
2025-01-31T00:34:23.3046001Z ##[error]Process completed with exit code 1.
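
Based on the validation error, a likely workaround (untested, and only following the error message's own suggestion) is to add --max_context_length to the CI export command so it is at least as large as --max_seq_length; whether 2048 is the intended context length for this recipe is an assumption:

python -m examples.models.llama.export_llama --model llama3_2 ... --max_seq_length 2048 --max_context_length 2048 ...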

Versions

trunk

Labels

module: benchmark (Issues related to the benchmark infrastructure)
module: ci (Issues related to continuous integration)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
