Open
Description
Describe the bug
I tried to compile and run inference with Qwen3-30B-A3B-Instruct-2507 (which is listed as a validated model), but the process was killed, possibly due to OOM.
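To sanity-check the OOM hypothesis, here is a rough back-of-the-envelope sketch that compares the host RAM with the memory the weights alone might need. The parameter count (about 30 billion, taken from the "30B" in the model name) and the candidate dtypes are assumptions; I have not confirmed which precision QEfficient uses when loading or exporting the checkpoint, and the real peak is likely higher than the weights alone.

```python
import os

# Assumption: ~30e9 parameters, read off the "30B" in the model name.
ASSUMED_NUM_PARAMS = 30e9


def estimate_weight_bytes(num_params: float, bytes_per_param: int) -> int:
    """Memory needed to hold the weights alone, ignoring any temporary copies."""
    return int(num_params * bytes_per_param)


def total_ram_bytes() -> int:
    """Physical RAM on this host (Linux; uses POSIX sysconf values)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


if __name__ == "__main__":
    gib = 1024 ** 3
    ram = total_ram_bytes()
    print(f"host RAM: {ram / gib:.1f} GiB")
    # Assumption: the loader keeps weights in one of these precisions.
    for label, width in (("bf16/fp16", 2), ("fp32", 4)):
        need = estimate_weight_bytes(ASSUMED_NUM_PARAMS, width)
        verdict = "fits" if need < ram else "does NOT fit"
        print(f"{label}: ~{need / gib:.1f} GiB for weights alone, {verdict} in RAM")
```

If the estimate exceeds the host RAM, the kernel OOM killer is a plausible cause of the bare `Killed` message; the kernel log (for example via `dmesg`) should then contain a matching out-of-memory entry for the Python process.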
To Reproduce
Steps to reproduce the behavior:
- Command Used to run / script used
- Error details
(qeff_env) ai@ai-ABPI-130:~/workspace/gmkim/efficient-transformers$ python -m QEfficient.cloud.infer --model_name Qwen/Qwen3-30B-A3B-Instruct-2507 --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 24 --device_group [0] --prompt "My name is" --mos 1 --aic_enable_depth_first
/home/ai/workspace/gmkim/qeff_env/lib/python3.10/site-packages/onnxscript/converter.py:816: FutureWarning: 'onnxscript.values.Op.param_schemas' is deprecated in version 0.1 and will be removed in the future. Please use '.op_signature' instead.
param_schemas = callee.param_schemas()
WARNING - QEfficient - mxfp6 is going to be deprecated in a future release, use -mxfp6_matmul instead.
[Warning]: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████| 16/16 [00:38<00:00, 2.41s/it]
Killed
Expected behavior
The model compiles and inference proceeds.
Screenshots
Environment (please complete the following information):
- OS: Linux ai-ABPI-130 6.8.0-86-generic 87~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Sep 29 09:48:07 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
- Environment details with packages version etc.
- Version/Branch/Commit ID [e.g. 22]: v1
Additional context