cuda : fix vmm pool with multi GPU (#4620) · ggerganov/llama.cpp@dc68f00

cuda : fix vmm pool with multi GPU (#4620)

* cuda : fix vmm pool with multi GPU

* hip

* use recommended granularity instead of minimum

* better error checking

* fix mixtral

* use cudaMemcpy3DPeerAsync

* use cuda_pool_alloc in ggml_cuda_op_mul_mat

* consolidate error checking in ggml_cuda_set_device

* remove unnecessary inlines

ggml-ci

* style fixes

* only use vmm for the main device

* fix scratch buffer size, re-enable vmm pool for all devices

* remove unnecessary check id != g_main_device

slaren authored Dec 26, 2023

1 parent de8e496 commit dc68f00
