
does it support multiple gpu #691

Open · BASSEM45325 opened this issue Jan 28, 2025 · 4 comments

BASSEM45325 commented Jan 28, 2025

System Info

I used the I2V pipeline and added .to('cuda') after removing the CPU offload call, but it is still not using all GPUs. I am running 4× A10 GPUs with 24 GB VRAM each.

Information

  • The official example scripts
  • My own modified scripts

Reproduction

import torch
from diffusers import (
    AutoencoderKLCogVideoX,
    CogVideoXImageToVideoPipeline,
    CogVideoXTransformer3DModel,
)
from diffusers.utils import export_to_video
from transformers import T5EncoderModel
from torchao.quantization import quantize_

# NOTE: quantization() is not defined in this snippet; presumably a torchao
# config factory such as int8_weight_only.
text_encoder = T5EncoderModel.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", subfolder="text_encoder", torch_dtype=torch.bfloat16
)
quantize_(text_encoder, quantization())

transformer = CogVideoXTransformer3DModel.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", subfolder="transformer", torch_dtype=torch.bfloat16
)
quantize_(transformer, quantization())

vae = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", subfolder="vae", torch_dtype=torch.bfloat16
)
quantize_(vae, quantization())

# Create pipeline and run inference
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    text_encoder=text_encoder,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.bfloat16,
).to('cuda')

# Manually assign components to GPUs
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

print(pipe.text_encoder.device)
print(pipe.transformer.device)
print(pipe.vae.device)

# NOTE: `image` (the conditioning image) is not defined in this snippet
video = pipe(
    prompt='test',
    image=image,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
).frames[0]
out = 'temp.mp4'
export_to_video(video, f'{out}', fps=8)

Expected behavior

Only one GPU gets utilized; I expected all four GPUs to be used.

zRzRzRzRzRzRzR self-assigned this Jan 28, 2025

zRzRzRzRzRzRzR (Member) commented:

Please check inference/cli_demo.py to see how to distribute the model across multiple GPUs; note, however, that this does not support quantization.
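(For context: if the goal is simply fitting the 5B model on a 24 GB card without quantization, diffusers also offers sequential CPU offload. Below is a minimal sketch; the API calls are standard diffusers, but whether cli_demo.py uses exactly this pattern is an assumption.)

import torch
from diffusers import CogVideoXImageToVideoPipeline

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16
)
# Streams weights through a single GPU layer by layer instead of sharding
# across GPUs; slower, but fits the model on one 24 GB card unquantized.
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()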

BASSEM45325 (Author) commented Jan 28, 2025

I have tried that and got an error:

import torch
from diffusers import CogVideoXDPMScheduler, CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
)

pipe.scheduler = CogVideoXDPMScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

# NOTE: `prompt` and `image` are defined elsewhere in the script
video = pipe(
    prompt=prompt,
    image=image,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
    use_dynamic_cfg=True,
    generator=torch.Generator().manual_seed(112),
).frames[0]

export_to_video(video, "output.mp4", fps=8)

Loading checkpoint shards: 100%|█| 2/2 [00:01<00:00, 1.24it
Loading pipeline components...: 100%|█| 5/5 [00:07<00:00, 1
Traceback (most recent call last):
  File "/home/ec2-user/test/t.py", line 21, in <module>
    video = pipe(
  File "/home/ec2-user/test/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/ec2-user/test/.venv/lib/python3.10/site-packages/diffusers/pipelines/cogvideo/pipeline_cogvideox_image2video.py", line 782, in __call__
    latents, image_latents = self.prepare_latents(
  File "/home/ec2-user/test/.venv/lib/python3.10/site-packages/diffusers/pipelines/cogvideo/pipeline_cogvideox_image2video.py", line 407, in prepare_latents
    image_latents = torch.cat([image_latents, latent_padding], dim=1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:1! (when checking argument for argument tensors in method wrapper_CUDA_cat)
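(What the traceback shows: with device_map="balanced" the pipeline's components land on different GPUs, and inside prepare_latents two intermediate tensors end up on cuda:1 and cuda:2 while torch.cat requires a single device. A minimal standalone illustration of this failure mode and the generic fix, assuming a machine with at least two visible GPUs:)

import torch

a = torch.zeros(1, 2, device="cuda:0")
b = torch.zeros(1, 2, device="cuda:1")

# torch.cat([a, b], dim=1)  # RuntimeError: Expected all tensors to be on the same device
c = torch.cat([a, b.to(a.device)], dim=1)  # fix: move operands to a common device first
print(c.device)  # cuda:0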

BASSEM45325 (Author) commented:

It was solved when I used this. I don't know why it does not work on 4 GPUs; it only works on 2 GPUs:

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

BASSEM45325 (Author) commented:

And when I put .to('cuda') I got an error as well.
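(A likely explanation, offered as an assumption: in diffusers, loading a pipeline with device_map and calling .to() are two mutually exclusive placement strategies, and combining them is expected to fail. A sketch of the distinction:)

import torch
from diffusers import CogVideoXImageToVideoPipeline

# Option A: let accelerate shard the components across the visible GPUs.
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16, device_map="balanced"
)
# pipe.to("cuda")  # don't combine with device_map; the components are already placed

# Option B: load normally and move the whole pipeline to a single GPU.
# pipe = CogVideoXImageToVideoPipeline.from_pretrained(
#     "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16
# ).to("cuda")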
