I used the AutoGPTQ tool to quantize the Qwen2.5-3B-Instruct model to 3 bits. I successfully obtained the model in GPTQ format, but when I ran the T-MAC compile script, I got the following error:
Traceback (most recent call last):
File "/home/tmac/tmac/T-MAC/deploy/compile.py", line 244, in <module>
main()
File "/home/tmac/tmac/T-MAC/deploy/compile.py", line 234, in main
compile(**device_kwargs)
File "/home/tmac/tmac/T-MAC/deploy/compile.py", line 130, in compile
qgemm_mod = qgemm_lut.compile(
^^^^^^^^^^^^^^^^^^
File "/home/tmac/tmac/T-MAC/python/t_mac/ops/base.py", line 255, in compile
self.tuning(*args, n_trial=n_trial, thread_affinity=thread_affinity, **eval_kwargs)
File "/home/tmac/tmac/T-MAC/python/t_mac/ops/base.py", line 95, in tuning
task = autotvm.task.create(template_name, args=args, target=self.target)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tmac/T-MAC/3rdparty/tvm/python/tvm/autotvm/task/task.py", line 480, in create
sch, _ = ret.func(*args)
^^^^^^^^^^^^^^^
File "/home/tmac/tmac/T-MAC/3rdparty/tvm/python/tvm/autotvm/task/task.py", line 240, in __call__
return self.fcustomized(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tmac/tmac/T-MAC/python/t_mac/ops/base.py", line 71, in _func
tensors = self._compute(*args)
^^^^^^^^^^^^^^^^^^^^
File "/home/tmac/tmac/T-MAC/python/t_mac/ops/qgemm.py", line 138, in _compute
raise TVMError("K({}) must be devisible by group_size({})".format(K, self.group_size))
tvm._ffi.base.TVMError: K(10304) must be devisible by group_size(128)
Does this indicate a problem with my quantization step, or does T-MAC not support using 3-bit models directly? What additional steps do I need to take?
@QingtaoLi1
Thanks for your reply.
When I set -gs=64, I hit a new shape problem: tvm._ffi.base.TVMError: K(10320) must be devisible by act_group_size(64)
So I set -ags=16, and the error is now: tvm._ffi.base.TVMError: K(10320) must be devisible by group_size(128)
No matter what value I set -gs to, I get the same error as above.
@xdd130 -ags is the activation group size (but should not be smaller than -gs), so you should set -gs for the weight shape as well. Currently, neither value should be smaller than 32, so K = 10320, which is not divisible by 32, may not be supported for now.
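To see why no setting works for this shape, here is a minimal sketch of the divisibility check behind the error messages above. The helper name `divisible_group_sizes` and the candidate list are assumptions for illustration and are not part of T-MAC; the floor of 32 comes from the comment above.

```python
def divisible_group_sizes(k, candidates=(32, 64, 128, 256)):
    """Return the candidate group sizes that divide K evenly.

    Hypothetical helper, not a T-MAC API. The default candidates start
    at 32 because smaller group sizes are reported as unsupported.
    """
    return [g for g in candidates if k % g == 0]


# The two K values reported in the errors in this thread.
for k in (10304, 10320):
    print(k, divisible_group_sizes(k))
```

Under these assumptions, K = 10304 is divisible by 32 and 64 but not by 128, while K = 10320 is divisible by none of the candidates (it is only divisible by 16), which is consistent with the errors reported above.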