Test environment: Ubuntu 22.04, 128 GB RAM, Python 3.11, CUDA 12.4, Torch 2.6.0, single RTX 4090 with 24 GB VRAM.
Test model: Ming-Lite-Omni-1.5, quantized with a combination of NF4 and bf16.
I first analyzed the model's structure, weights, and precision, and concluded that quantization should make it possible to run the project within a single 24 GB GPU, although I was not sure how much quantization would affect output quality.
During testing, apart from an OOM on video understanding, VRAM usage with all models loaded now sits at roughly 20-23 GB.
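For reference, a minimal sketch of the kind of NF4 + bf16 loading setup described above, assuming the bitsandbytes integration in transformers. The model id and `AutoModelForCausalLM` class are illustrative placeholders only, since the repo ships its own custom model classes:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit weights with bf16 compute: roughly the mixed scheme used in
# this test. Non-quantized modules stay in bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,  # second quantization pass, saves a bit more VRAM
)

model = AutoModelForCausalLM.from_pretrained(
    "inclusionAI/Ming-Lite-Omni-1.5",  # placeholder repo id
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
```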
During testing I also noticed that much of the code in the repo still appears to date from v1.0; it does not seem to have been fully updated to match the v1.5 upgrade yet.
Right after the model finishes loading, VRAM usage is about 20 GB, as shown below:
Basic text chat works:
Image understanding chat works:
Video understanding chat hits OOM:
Speech recognition chat works:
Text-to-image generation works:
Text input with speech output, as well as voice chat, fails with an error that appears to be a problem loading the talker model:
Error log:
history: [(('/home/tkadm/Ming/temp/e8b1237d57a5922949fe61c6bca802ae2ddd7d63c4159763d08acaa7aaef4683/speechQA_sample.wav',), None)]
[{'role': 'HUMAN', 'content': [{'type': 'audio', 'audio': '/home/tkadm/Ming/temp/e8b1237d57a5922949fe61c6bca802ae2ddd7d63c4159763d08acaa7aaef4683/speechQA_sample.wav'}]}]
Traceback (most recent call last):
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/queueing.py", line 715, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/blocks.py", line 2220, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/blocks.py", line 1743, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/utils.py", line 739, in async_iteration
return await anext(iterator)
^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/utils.py", line 733, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 967, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/utils.py", line 716, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/utils.py", line 877, in gen_wrapper
response = next(iterator)
^^^^^^^^^^^^^^
File "/home/tkadm/Ming/gradio_demo_me-old.py", line 344, in chat_predict
text, audio_path, image_path = generate(model, processor, messages, state, use_audio_response=use_audio_response)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/Ming/gradio_demo_me-old.py", line 252, in generate
audio_path = text_to_speach(model, text, outputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/Ming/gradio_demo_me-old.py", line 151, in text_to_speach
audio_detokenizer = AudioDetokenizer(
^^^^^^^^^^^^^^^^^
File "/home/tkadm/Ming/modeling_bailing_talker.py", line 661, in __init__
self.model.load(flow_model_path, hifigan_model_path)
File "/home/tkadm/Ming/audio_detokenizer/cli/flow_stream_model.py", line 39, in load
self.flow.load_state_dict(torch.load(flow_model, map_location=self.device), strict=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/torch/serialization.py", line 1470, in load
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint.
(1) In PyTorch 2.6, we changed the default value of the weights_only argument in torch.load from False to True. Re-running torch.load with weights_only set to False will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
(2) Alternatively, to load with weights_only=True please check the recommended steps in the following error message.
WeightsUnpickler error: Unsupported global: GLOBAL pathlib.PosixPath was not an allowed global by default. Please use torch.serialization.add_safe_globals([PosixPath]) or the torch.serialization.safe_globals([PosixPath]) context manager to allowlist this global if you trust this class/function.
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
The image editing feature fails with an error, apparently because an image-size keyword argument is passed to generate() twice:
The error log is as follows:
history: [(('/home/tkadm/Ming/temp/16da8d87c405652ce67ca4fe9eb661562eb47afb9ff1b0a3cd872bd3a9a5a1d0/cake.jpg',), None), ('Add a candle on top of the cake', None)]
[{'role': 'HUMAN', 'content': [{'type': 'image', 'image': '/home/tkadm/Ming/temp/16da8d87c405652ce67ca4fe9eb661562eb47afb9ff1b0a3cd872bd3a9a5a1d0/cake.jpg'}, {'type': 'text', 'text': 'Add a candle on top of the cake'}]}]
Traceback (most recent call last):
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/queueing.py", line 715, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/blocks.py", line 2220, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/blocks.py", line 1743, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/utils.py", line 739, in async_iteration
return await anext(iterator)
^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/utils.py", line 733, in __anext__
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 967, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/utils.py", line 716, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "/home/tkadm/miniconda3/envs/video/lib/python3.11/site-packages/gradio/utils.py", line 877, in gen_wrapper
response = next(iterator)
^^^^^^^^^^^^^^
File "/home/tkadm/Ming/gradio_demo_me-old.py", line 344, in chat_predict
text, audio_path, image_path = generate(model, processor, messages, state, use_audio_response=use_audio_response)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/Ming/gradio_demo_me-old.py", line 256, in generate
image_path = generate_image(model, processor, messages, has_audio=has_audio)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tkadm/Ming/gradio_demo_me-old.py", line 133, in generate_image
image = model.generate(
^^^^^^^^^^^^^^^
TypeError: modeling_bailingmm.BailingMMNativeForConditionalGeneration.generate() got multiple values for keyword argument 'image_gen_width'
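This TypeError means `image_gen_width` reaches `generate()` twice: once as an explicit keyword and once inside a forwarded kwargs dict. A minimal reproduction with a stand-in function (the signature is hypothetical, for illustration only) and the usual fix of dropping the duplicated keys before the call:

```python
# Stand-in for BailingMMNativeForConditionalGeneration.generate();
# the signature here is hypothetical, for illustration only.
def generate(inputs, image_gen_width=512, image_gen_height=512, **kwargs):
    return image_gen_width, image_gen_height

# Extra generation kwargs as the demo might assemble them.
gen_kwargs = {"image_gen_width": 512, "image_gen_height": 512}

# generate("x", image_gen_width=512, **gen_kwargs)
#   -> TypeError: generate() got multiple values for keyword argument 'image_gen_width'

# Fix: remove keys that are also passed explicitly before forwarding.
for key in ("image_gen_width", "image_gen_height"):
    gen_kwargs.pop(key, None)

width, height = generate("x", image_gen_width=512, **gen_kwargs)
```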
Could the project team please help analyze the cause of these errors? Thank you!
Due to time constraints I have not yet read through the code in detail, so my understanding may well be incomplete; I would appreciate the team's guidance.
Once all the code passes testing, I will upload the quantized model to my Hugging Face profile at https://huggingface.co/wikeeyang to share.