
QLoRA and SpinQuant recipes fail to export on CI #8154

Open
@guangy10

Description


🐛 Describe the bug

The job started failing last Thursday: https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=export-models%20(meta-llama&mergeLF=true

It looks like the root cause is this PR: #7927

Stacktrace:

2025-01-31T00:34:23.3001996Z + DOWNLOADED_PATH=/var/lib/ci-user/.cache/huggingface/hub/models--meta-llama--Llama-3.2-1B-Instruct-QLORA_INT4_EO8/snapshots/3fdf98b6bc1069f632a468b0676299a0a1b65071
2025-01-31T00:34:23.3007472Z + python -m examples.models.llama.export_llama --model llama3_2 --checkpoint /var/lib/ci-user/.cache/huggingface/hub/models--meta-llama--Llama-3.2-1B-Instruct-QLORA_INT4_EO8/snapshots/3fdf98b6bc1069f632a468b0676299a0a1b65071/consolidated.00.pth --params /var/lib/ci-user/.cache/huggingface/hub/models--meta-llama--Llama-3.2-1B-Instruct-QLORA_INT4_EO8/snapshots/3fdf98b6bc1069f632a468b0676299a0a1b65071/params.json -qat -lora 16 --preq_mode 8da4w_output_8da8w --preq_group_size 32 --preq_embedding_quantize 8,0 --use_sdpa_with_kv_cache -kv -X --xnnpack-extended-ops -d fp32 --max_seq_length 2048 --output_name llama-3.2-1b-instruct-qlora-int4-eo8_llama3_qlora.pte --metadata '{"get_bos_id":128000, "get_eos_ids":[128009, 128001]}'
2025-01-31T00:34:23.3012177Z Traceback (most recent call last):
2025-01-31T00:34:23.3012854Z   File "/opt/conda/envs/py_3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2025-01-31T00:34:23.3013599Z     return _run_code(code, main_globals, None,
2025-01-31T00:34:23.3014243Z   File "/opt/conda/envs/py_3.10/lib/python3.10/runpy.py", line 86, in _run_code
2025-01-31T00:34:23.3014870Z     exec(code, run_globals)
2025-01-31T00:34:23.3015517Z   File "/pytorch/executorch/examples/models/llama/export_llama.py", line 32, in <module>
2025-01-31T00:34:23.3016259Z     main()  # pragma: no cover
2025-01-31T00:34:23.3016899Z   File "/pytorch/executorch/examples/models/llama/export_llama.py", line 28, in main
2025-01-31T00:34:23.3017583Z     export_llama(args)
2025-01-31T00:34:23.3018265Z   File "/pytorch/executorch/examples/models/llama/export_llama_lib.py", line 540, in export_llama
2025-01-31T00:34:23.3019069Z     builder = _export_llama(args)
2025-01-31T00:34:23.3019817Z   File "/pytorch/executorch/examples/models/llama/export_llama_lib.py", line 677, in _export_llama
2025-01-31T00:34:23.3020602Z     _validate_args(args)
2025-01-31T00:34:23.3021305Z   File "/pytorch/executorch/examples/models/llama/export_llama_lib.py", line 650, in _validate_args
2025-01-31T00:34:23.3022094Z     raise ValueError(
2025-01-31T00:34:23.3023561Z ValueError: max_context_length 128 must be >= max_seq_len 2048. max_context_length impacts kv cache size that is used to remember history, while max_seq_length refers to user prompt length. Please use --max_context_length to specify context length.
2025-01-31T00:34:23.3046001Z ##[error]Process completed with exit code 1.
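
Based on the validation error, a likely workaround (untested, and only following the error message's own suggestion) is to add --max_context_length to the CI export command so it is at least as large as --max_seq_length; whether 2048 is the intended context length for this recipe is an assumption:

python -m examples.models.llama.export_llama --model llama3_2 ... --max_seq_length 2048 --max_context_length 2048 ...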

Versions

trunk

Labels

module: benchmark (Issues related to the benchmark infrastructure)
module: ci (Issues related to continuous integration)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
