Set default other = 0 for masked load #356

ziliangzl · 2025-09-18T05:29:47Z

For masked loads where other is not specified, the Triton CUDA backend fills the masked elements with 0 by default. This change ensures that the behavior matches the Triton CUDA backend results.

A reference test case named basic_load in Triton can be found here: test/Conversion/tritongpu_to_llvm.mlir
This test case specifies an other value. Its generated LLVM IR includes:

(module attributes {...}
  llvm.func @basic_load(...) {
    ...
    %10 = llvm.inline_asm has_side_effects ... "mov.u32 $0, $1;\0A\09@$3 ld.global.b32 { $0 }, [ $2 + 0 ];", ...
    ...
  }
)

Another test case vecadd_masked_vec1 without an other value generates LLVM IR like:

(module attributes {...}
  llvm.func @vecadd_masked_vec1(...) {
    ...
    %72 = llvm.inline_asm has_side_effects asm_dialect = att operand_attrs = [] "mov.u32 $0, 0x0;\0A\09@$2 ld.global.b32 { $0 }, [ $1 + 0 ];", "=r,l,b" %70, %71 : (!llvm.ptr<1>, i1) -> i32
    ...
  }
)

In this case, the masked elements are always filled with 0x0 when other is not specified.
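
To make the intended semantics concrete, here is a minimal plain-Python sketch of what the two PTX patterns above do. It is only an emulation under stated assumptions, not Triton's API or this PR's implementation: `masked_load` and its parameters are hypothetical names chosen for illustration. The destination is first initialized with a fill value (the `mov.u32`), and the load only overwrites it where the predicate is set.

```python
from typing import List, Optional, Sequence


def masked_load(
    memory: Sequence[int],
    offsets: Sequence[int],
    mask: Sequence[bool],
    other: Optional[int] = None,
) -> List[int]:
    """Emulate a masked load in plain Python (illustration only, not Triton's API)."""
    # Assumption taken from this PR: an unspecified `other` behaves like 0.
    fill = 0 if other is None else other
    result = []
    for offset, keep in zip(offsets, mask):
        # Same order as the PTX above: initialize the destination with the
        # fill value, then load only when the predicate is true.
        value = fill
        if keep:
            value = memory[offset]
        result.append(value)
    return result


memory = [10, 20, 30, 40]
offsets = [0, 1, 2, 7]  # the last offset is out of bounds, so it must be masked off
mask = [True, True, False, False]

default_fill = masked_load(memory, offsets, mask)             # `other` not specified
explicit_fill = masked_load(memory, offsets, mask, other=-1)  # `other` specified

print(default_fill)
print(explicit_fill)
```

With `other` omitted, the masked lanes come back as 0, which is the behavior this PR aims to match; with `other=-1`, the masked lanes take that value instead.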

ziliangzl · 2025-09-18T05:34:41Z

The tests for this PR have not been updated yet. It currently conflicts with #355, and the tests will be updated once #355 is resolved.

ziliangzl · 2025-09-29T05:19:07Z

> The tests for this PR have not been updated yet. It currently conflicts with #355, and the tests will be updated once #355 is resolved.

I have updated the tests; this PR is ready for review now.

bmyerz0 · 2025-10-03T18:02:36Z

While we often treat the CUDA backend as the language spec, this particular behavior is not listed in the documentation for load: https://triton-lang.org/main/python-api/generated/triton.language.load.html#triton.language.load. Forcing every backend to initialize the masked elements looks like a performance cost, and it takes control away from the programmer.

Do you have any references to discussions in Triton establishing that other=0 should be the default behavior?

ziliangzl added 2 commits September 28, 2025 17:18

Set default other = 0 for masked load

8f27555

update test

9c6a575

ziliangzl force-pushed the fix/masked-value branch from 227c4cc to 9c6a575 September 29, 2025 05:17
