CUDA: add set #14980

jeemzz147 · 2025-07-31T03:50:21Z

Make sure to read the contributing guidelines before submitting a PR

jeemzz147 · 2025-07-31T06:27:07Z

Part of #14909

ggml/src/ggml-cuda/set.cu

jeemzz147 · 2025-07-31T17:25:20Z

Hi, @JohannesGaessler Could you please review the changes when you have a chance? Thank you in advance!

JohannesGaessler

One of my current goals is to consolidate and deduplicate the code around copying data in the CUDA backend. As such, I think that rather than adding new kernels here it would be better to reuse the existing code. If the operation is not in place, you can use cudaMemcpyAsync to initialize dst with the contents of src0. Afterwards you can use ggml_cpy_flt_cuda in cpy.cu to do the copy. That kernel does not have an argument for the offset, but none is needed because you can simply apply the offset in host code.
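The two steps described above can be sketched on the CPU as follows. This is only a reference for the semantics, not the backend code: `set_f32_reference` and its parameter names are hypothetical, src1 is assumed to be contiguous, and `std::memcpy` stands in for both cudaMemcpyAsync and the strided copy kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical CPU reference for a float32 "set": write src1 into a strided
// window of dst that starts `offset` bytes into the dst buffer.
//   src1_ne[4]    : shape of src1 in elements (src1 is assumed contiguous)
//   nb1, nb2, nb3 : byte strides of the destination window for dims 1..3
//                   (dim 0 is assumed to be densely packed floats)
void set_f32_reference(const float * src0, const float * src1, float * dst,
                       size_t dst_nbytes, const int64_t src1_ne[4],
                       size_t nb1, size_t nb2, size_t nb3,
                       size_t offset, bool inplace) {
    // Step 1: if the op is not in place, dst starts out as a copy of src0.
    // On the GPU this is the job the comment assigns to cudaMemcpyAsync.
    if (!inplace) {
        std::memcpy(dst, src0, dst_nbytes);
    }

    // Step 2: apply the offset on the host, then do a plain strided copy.
    char * base = reinterpret_cast<char *>(dst) + offset;
    const size_t row_bytes = static_cast<size_t>(src1_ne[0]) * sizeof(float);

    for (int64_t i3 = 0; i3 < src1_ne[3]; ++i3) {
        for (int64_t i2 = 0; i2 < src1_ne[2]; ++i2) {
            for (int64_t i1 = 0; i1 < src1_ne[1]; ++i1) {
                const size_t row = static_cast<size_t>(i1) * nb1
                                 + static_cast<size_t>(i2) * nb2
                                 + static_cast<size_t>(i3) * nb3;
                // The window must stay inside the dst buffer.
                assert(offset + row + row_bytes <= dst_nbytes);
                // Rows of the contiguous src1 are laid out back to back.
                const float * s = src1
                    + ((i3 * src1_ne[2] + i2) * src1_ne[1] + i1) * src1_ne[0];
                std::memcpy(base + row, s, row_bytes);
            }
        }
    }
}
```

For example, writing a 2x2 src1 into a 3x4 dst at row 1, column 1 would use `nb1 = 4 * sizeof(float)` and `offset = 5 * sizeof(float)`; the copy itself never needs to know about the offset.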

jeemzz147 · 2025-08-01T02:51:49Z

@JohannesGaessler

I have already used cudaMemcpyAsync to copy the contents from src0 to dst.

Regarding ggml_cpy_flt_cuda, I am not sure its behavior matches what the set kernel needs. Comparing the two:
- ggml_cpy_flt_cuda
- set_f32_cuda
  - The element size of src1 may be less than in dst.
  - The set kernel doesn’t rely on any implicit shape in dst itself;
  - The dst of the set may be a slice, so when writing from src1 to dst the kernel may not be aware of dst->ne. Instead, the dst strides nb1, nb2, and nb3 are passed in as parameters.
  - Could it be that the data of src1 within dst are not contiguous?

If I’m wrong, please correct me. Thanks for your help!
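On the contiguity question, a small check makes the situation concrete. The sketch below is a hypothetical helper, not a ggml function: it reports whether a float32 window with a given shape and byte strides occupies one dense block. For a window that is narrower than the rows of dst, the row stride is larger than the row length, so the window is not contiguous and a strided copy is required.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if a float32 window with shape ne[4] (elements)
// and byte strides nb[4] occupies a single dense block of memory.
bool window_is_contiguous_f32(const int64_t ne[4], const size_t nb[4]) {
    size_t expected = sizeof(float);  // stride a dense layout would have
    for (int d = 0; d < 4; ++d) {
        // A dimension of size 1 is never stepped over, so its stride is irrelevant.
        if (ne[d] != 1 && nb[d] != expected) {
            return false;
        }
        expected *= static_cast<size_t>(ne[d]);
    }
    return true;
}
```

A 2x2 window inside a dst whose rows hold 4 floats has a row stride of 16 bytes but a row length of 8 bytes, so the check returns false for it.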

jeemzz147 · 2025-08-19T09:30:55Z

Hi @am17an, thanks again for your previous review. Since @JohannesGaessler hasn’t had a chance to respond for a few weeks, would it be possible to ask another maintainer or contributor to review this as well? I’d really appreciate any additional feedback to help move this forward.

JohannesGaessler · 2025-08-19T11:42:59Z

Sorry, I forgot about this PR. The code in cpy.cuh can do the same as the one in this PR (and more), you just need to call it with the right parameters.

> The element size of src1 may be less than in dst.

Set both types to float.

> The set kernel doesn’t rely on any implicit shape in dst itself;

Use the same shape twice.
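To illustrate "use the same shape twice", here is a CPU sketch of a generic strided float copy that takes a shape and byte strides for both sides. It is an assumption about the general pattern, not the code in cpy.cuh, and `byte_offset` and `cpy_f32_strided` are hypothetical names. Passing the shape of src1 for both source and destination, together with the strides of dst and a destination pointer already advanced by the offset, gives the behavior of set.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Byte offset of flat element index i in a tensor with shape ne and byte strides nb.
static size_t byte_offset(int64_t i, const int64_t ne[4], const size_t nb[4]) {
    const int64_t i0 = i % ne[0]; i /= ne[0];
    const int64_t i1 = i % ne[1]; i /= ne[1];
    const int64_t i2 = i % ne[2];
    const int64_t i3 = i / ne[2];
    return static_cast<size_t>(i0) * nb[0] + static_cast<size_t>(i1) * nb[1]
         + static_cast<size_t>(i2) * nb[2] + static_cast<size_t>(i3) * nb[3];
}

// Generic strided copy of float32 elements. The flat index runs over the
// source shape; source and destination each use their own shape and strides.
void cpy_f32_strided(const char * src, const int64_t ne_src[4], const size_t nb_src[4],
                     char * dst, const int64_t ne_dst[4], const size_t nb_dst[4]) {
    const int64_t n = ne_src[0] * ne_src[1] * ne_src[2] * ne_src[3];
    for (int64_t i = 0; i < n; ++i) {
        float v;
        std::memcpy(&v, src + byte_offset(i, ne_src, nb_src), sizeof(v));
        std::memcpy(dst + byte_offset(i, ne_dst, nb_dst), &v, sizeof(v));
    }
}
```

When the two shapes are identical, both sides decompose the flat index into the same coordinates and only the strides differ, which is why the copy does not need to know the full shape of dst.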

Jeemzz added 2 commits July 30, 2025 17:11

draft: set cuda

9a53f40

draft: cuda set op

e38e857

github-actions bot added the labels "Nvidia GPU" (issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) on Jul 31, 2025

am17an reviewed Jul 31, 2025

ggml/src/ggml-cuda/set.cu (outdated, resolved)

am17an reviewed Jul 31, 2025

ggml/src/ggml-cuda/set.cu (resolved)

jeemzz147 marked this pull request as ready for review July 31, 2025 08:35

am17an requested a review from JohannesGaessler July 31, 2025 08:36

am17an approved these changes Jul 31, 2025

JohannesGaessler reviewed Jul 31, 2025

Replace copy kernel with cudaMemcpyAsync

bfdca26

jeemzz147 requested a review from JohannesGaessler August 1, 2025 13:20
