T-MAC reports an error when using the qwen2.5 3-bit model #76

Open
xdd130 opened this issue Dec 4, 2024 · 4 comments

Comments

xdd130 commented Dec 4, 2024

I tried using the AutoGPTQ tool to quantize the Qwen2.5-3B-Instruct model to 3 bits. I successfully obtained the model in GPTQ format, but when I ran the T-MAC compile script:

python compile.py -fa -o tuned -da -nt 8 -tb -gc -gs 128 -ags 64 -m gptq-auto -md /home/phytium/yq/Qwen2.5-3B-yq/Qwen2.5-3B-Quant-3b -t

I got the following error:

Traceback (most recent call last):
  File "/home/tmac/tmac/T-MAC/deploy/compile.py", line 244, in <module>
    main()
  File "/home/tmac/tmac/T-MAC/deploy/compile.py", line 234, in main
    compile(**device_kwargs)
  File "/home/tmac/tmac/T-MAC/deploy/compile.py", line 130, in compile
    qgemm_mod = qgemm_lut.compile(
                ^^^^^^^^^^^^^^^^^^
  File "/home/tmac/tmac/T-MAC/python/t_mac/ops/base.py", line 255, in compile
    self.tuning(*args, n_trial=n_trial, thread_affinity=thread_affinity, **eval_kwargs)
  File "/home/tmac/tmac/T-MAC/python/t_mac/ops/base.py", line 95, in tuning
    task = autotvm.task.create(template_name, args=args, target=self.target)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tmac/T-MAC/3rdparty/tvm/python/tvm/autotvm/task/task.py", line 480, in create
    sch, _ = ret.func(*args)
             ^^^^^^^^^^^^^^^
  File "/home/tmac/tmac/T-MAC/3rdparty/tvm/python/tvm/autotvm/task/task.py", line 240, in __call__
    return self.fcustomized(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tmac/tmac/T-MAC/python/t_mac/ops/base.py", line 71, in _func
    tensors = self._compute(*args)
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/tmac/tmac/T-MAC/python/t_mac/ops/qgemm.py", line 138, in _compute
    raise TVMError("K({}) must be devisible by group_size({})".format(K, self.group_size))
tvm._ffi.base.TVMError: K(10304) must be devisible by group_size(128)

Does this indicate a problem with my quantization step, or does T-MAC not support using 3-bit models directly? What additional steps do I need to take?
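
(For reference, a quick way to see which K dimensions the quantized checkpoint actually contains is to list its weight shapes. This is a minimal sketch, assuming AutoGPTQ saved the weights as a single model.safetensors file; adjust the path and filename to match your checkpoint.)

# Minimal sketch: print every tensor name and shape in the quantized checkpoint,
# so you can see which K dimensions occur and which group sizes divide them.
# Assumption: the checkpoint is a single "model.safetensors" file; adjust as needed.
from safetensors import safe_open

path = "/home/phytium/yq/Qwen2.5-3B-yq/Qwen2.5-3B-Quant-3b/model.safetensors"

with safe_open(path, framework="pt") as f:
    for name in f.keys():
        print(name, f.get_slice(name).get_shape())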

@QingtaoLi1
Contributor

tvm._ffi.base.TVMError: K(10304) must be devisible by group_size(128)

@xdd130 Does your model have any weight of shape 10304? Since 10304=64*161, you should set -gs=64 in the command.
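
(A quick arithmetic check of this suggestion, as a minimal sketch rather than anything from T-MAC itself: 10304 = 2**6 * 7 * 23, so 64 is the largest power of two dividing it and 128 is not a divisor, which is why -gs 128 fails for this weight.)

K = 10304
for gs in (32, 64, 128, 256):
    print(gs, K % gs == 0)   # 32 True, 64 True, 128 False, 256 False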


xdd130 commented Dec 9, 2024

@QingtaoLi1
Thanks for your reply.
When I set -gs=64, I encountered a new shape problem:
tvm._ffi.base.TVMError: K(10320) must be devisible by act_group_size(64)
So, I set -ags=16.

python compile.py -fa -o tuned -da -nt 8 -tb -gc -gs 64 -ags 16 -m gptq-auto -md /home/phytium/yq/Qwen2.5-3B-yq/Qwen2.5-3B-Quant-3b -t

The error is as follows:
tvm._ffi.base.TVMError: K(10320) must be devisible by group_size(128)
No matter what value I set -gs to, I get the same error as above.

@QingtaoLi1
Contributor

@xdd130 -ags is the activation group size (it should not be smaller than -gs), so you need to set -gs to match the weight shape as well. And currently, neither should be smaller than 32, so 10320 may not be supported for now.
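
(To make this constraint concrete, a minimal sketch assuming group sizes are powers of two, as in the commands above: the largest power of two dividing 10320 is 16, which is below the stated minimum of 32, so no valid -gs/-ags setting can satisfy both the divisibility and the minimum-size requirements for this K.)

K = 10320               # 10320 = 2**4 * 3 * 5 * 43
gs = 1
while K % (gs * 2) == 0:
    gs *= 2
print(gs)               # 16, smaller than the minimum group size of 32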


xdd130 commented Dec 12, 2024

That's right, thanks for your reply!
