After some tinkering, the backend raises an error: TypeError: '<' not supported between instances of 'tuple' and 'float'

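After some tinkering, the int4 model now loads fine, but submitting a prompt from the browser makes the backend raise the error shown in the traceback that follows.
Ubuntu: 22.04
NVIDIA-SMI 530.41.03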

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243



`~/MOSS$ python moss_gui_demo.py
Waiting for all devices to be ready, it may take a few minutes...


Running on local URL:  http://0.0.0.0:6006
Running on public URL: https://7b29d06f6fba682b.gradio.live
This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces


Traceback (most recent call last):
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/gradio/routes.py", line 401, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/gradio/blocks.py", line 1302, in process_api
    result = await self.call_function(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/gradio/blocks.py", line 1025, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "moss_gui_demo.py", line 122, in predict
    outputs = model.generate(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/transformers/generation/utils.py", line 1571, in generate
    return self.sample(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/transformers/generation/utils.py", line 2534, in sample
    outputs = self(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/good/MOSS/models/modeling_moss.py", line 678, in forward
    transformer_outputs = self.transformer(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/good/MOSS/models/modeling_moss.py", line 545, in forward
    outputs = block(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/good/MOSS/models/modeling_moss.py", line 270, in forward
    attn_outputs = self.attn(
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/good/MOSS/models/modeling_moss.py", line 164, in forward
    qkv = self.qkv_proj(hidden_states)
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/good/MOSS/models/quantization.py", line 371, in forward
    out = QuantLinearFunction.apply(x.reshape(-1, x.shape[-1]), self.qweight, self.scales,
  File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 94, in decorate_fwd
    return fwd(*args, **kwargs)
  File "/home/good/MOSS/models/quantization.py", line 283, in forward
    output = matmul248(input, qweight, scales, qzeros, g_idx, bits, maxq)
  File "/home/good/MOSS/models/quantization.py", line 254, in matmul248
    matmul_248_kernel[grid](input, qweight, output,
  File "/home/good/MOSS/models/custom_autotune.py", line 93, in run
    self.cache[key] = builtins.min(timings, key=timings.get)
TypeError: '<' not supported between instances of 'tuple' and 'float'`
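
The last frame calls `builtins.min(timings, key=timings.get)`, so the message implies that `timings` holds a mix of tuple and float values. The sketch below reproduces that failure in isolation and shows one possible way to compare such values. The dictionary contents and the helpers `reproduce`, `as_scalar`, and `best_config` are hypothetical and are not taken from `custom_autotune.py`. Ranking a tuple by its first element is an assumption, since the trace does not show what the benchmark actually returns.

```python
import builtins

# Hypothetical timings: config name -> benchmark result.
# One entry is a plain float, the other a tuple, mirroring the mix
# that the error message implies.
timings = {
    "config_a": float("inf"),        # float-valued entry
    "config_b": (0.12, 0.10, 0.15),  # tuple-valued entry
}


def reproduce(timings):
    """Run the same min() call as the failing frame and return the error text."""
    try:
        builtins.min(timings, key=timings.get)
    except TypeError as exc:
        return str(exc)
    return None


def as_scalar(value):
    """Reduce a timing to one number; tuples are ranked by their first element (assumption)."""
    return value[0] if isinstance(value, tuple) else value


def best_config(timings):
    """Pick the key with the smallest timing after normalising every value to a scalar."""
    return builtins.min(timings, key=lambda name: as_scalar(timings[name]))


if __name__ == "__main__":
    print(reproduce(timings))
    print(best_config(timings))
```

If this reading is right, making every entry in `timings` the same type (all floats or all tuples) would avoid the comparison error. Which side to change depends on the installed versions, which this report does not list.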
