add missing mul! implementation #47


Merged: 12 commits into FluxML:main, Apr 2, 2025

Conversation

@ExpandingMan (Contributor) commented on Feb 3, 2025:

Prior to this PR, this package did not define a method for LinearAlgebra.mul!. This low-level function is called by some procedures instead of *, so without the new method added here such calls could break. In particular, some of the optimized matmul calls used by Lux.jl would error with a method ambiguity or a scalar-indexing error when given GPU arrays.

Ultimately the * methods should probably be removed, but I have left them alone for now to get this in without worrying too much about it breaking anything.

GPU tests pass for me on CUDA.
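
For context, a minimal illustration of the kind of call that is affected (the array names and sizes here are made up for the example, not taken from the PR; on the GPU it is the pre-PR fallback that errors):

julia> using OneHotArrays, LinearAlgebra

julia> A = rand(Float32, 3, 4); B = onehotbatch([1, 4, 2], 1:4);  # B is a 4×3 OneHotMatrix

julia> Y = similar(A, 3, 3);  # optimised kernels (e.g. in Lux.jl) call in-place mul! directly

julia> mul!(Y, A, B) ≈ A * collect(B)  # pre-PR this fell through to generic matmul
true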

PR Checklist

  • Tests are added
  • Documentation, if applicable

@mcabbott (Member) left a comment:

Overloading mul! seems fine.

Re deleting *, note that @less *(rand(2,2), rand(2,2)) shows it calls similar(B), and that will give the wrong type here... although that could be fixed.

julia> using JLArrays

julia> onehotbatch([1,3,2], 0:5) |> jl
6×3 OneHotMatrix(::JLArray{UInt32, 1}) with eltype Bool:
 ⋅  ⋅  ⋅
 1  ⋅  ⋅
 ⋅  ⋅  1
 ⋅  1  ⋅
 ⋅  ⋅  ⋅
 ⋅  ⋅  ⋅

julia> similar(ans, Float32)
6×3 Matrix{Float32}:
 3.0f-44  3.66694f29  3.66699f29
 0.0      1.0f-45     1.0f-45
 2.7f-44  3.66696f29  3.66701f29
 0.0      1.0f-45     1.0f-45
 2.8f-44  3.66697f29  3.0f-45
 0.0      1.0f-45     0.0
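
One possible fix is sketched below; it assumes (as an internal detail, not something this package documents) that the wrapper stores its index array in a field called indices, and forwards similar to that storage so the backend is preserved:

julia> Base.similar(x::OneHotArrays.OneHotArray, ::Type{T}) where {T} = similar(x.indices, T, size(x))

julia> similar(onehotbatch([1,3,2], 0:5) |> jl, Float32) isa JLArray
true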

test/gpu.jl Outdated
end

# some specialized implementations call only mul! and not *, so we must ensure this works
@test LinearAlgebra.mul!(similar(gA, 3, 3), gA, y) isa CuArray

Member:

This tests only one case. I think that it should test vector & matrix output, each with a OneHotArray, and some reshaped array... and maybe something for which _isonehot(B) === false to test the invoke path?

And it should check that the results are correct, not just the type.
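
For illustration, a CPU-only sketch of roughly that coverage (all names here are made up; _isonehot is an internal, and the last case assumes reshape keeps the one-hot wrapper so that _isonehot(r) === false):

julia> using OneHotArrays, LinearAlgebra, Test

julia> A = rand(3, 4); b = onehotbatch([1, 4, 2], 1:4); v = onehot(2, 1:4);

julia> @test mul!(similar(A, 3, 3), A, b) ≈ A * collect(b)  # matrix output, correct values
Test Passed

julia> @test mul!(similar(A, 3), A, v) ≈ A * collect(v)  # vector output
Test Passed

julia> A2 = rand(3, 2); r = reshape(onehotbatch([1, 2, 3], 1:4), 2, :);  # 2×6, no longer one-hot

julia> @test mul!(zeros(3, 6), A2, r) ≈ A2 * collect(r)  # should exercise the invoke path
Test Passed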

@ExpandingMan (Contributor, Author):

Many of the tests in the GPU suite only check type, but I now have this comparing to * for the time being.

The branching here opens up a huge can of worms that I consider out of scope for this PR; see #48.

Member:

Sure. But new code really has to test that it produces the correct value. (And enough cases that obvious method ambiguities would be found.)

@ExpandingMan (Contributor, Author):

fixed

@ExpandingMan (Contributor, Author) commented:

The overall situation with matrix operations in this package is likely still incredibly wonky; ideally somebody would think carefully through exactly how these should be done and replace all of linalg.jl. The primary challenge there is, of course, the need to support GPU-memory-backed arrays. For the time being, I'm just trying to get this package to a state where using it does not automatically fail spectacularly with Lux.

@ExpandingMan (Contributor, Author) commented:

Also, am I correct in assuming that 1.6 is no longer relevant now that it is no longer the LTS? If so, I can remove the 1.6 tests in favor of 1.10.

@ExpandingMan (Contributor, Author) commented:

Alright, I've swapped out 1.6 for 1.10. That's as much as I was planning to fix here; the package should no longer immediately fail for someone trying to use it with Lux.

@ToucheSir (Member) commented:

Bumping to 1.10 should get GPU CI to pass.

codecov bot commented Feb 5, 2025:

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.07%. Comparing base (0ea288d) to head (8c4b6e3).
Report is 1 commit behind head on main.

@@            Coverage Diff             @@
##             main      #47      +/-   ##
==========================================
- Coverage   96.26%   95.07%   -1.20%     
==========================================
  Files           3        3              
  Lines         134      142       +8     
==========================================
+ Hits          129      135       +6     
- Misses          5        7       +2     


@ExpandingMan (Contributor, Author) commented:

I've admittedly stopped paying attention to this myself, but can we either merge this or set up an extended discussion about the right thing to do here? This PR definitely doesn't fix the whole situation, but the current latest tag is very broken on GPU right now, which (for me at least, and probably for everyone) renders OneHotArrays.jl totally broken on what I believe was its originally intended use case: deep learning.

end

# some specialized implementations call only mul! and not *, so we must ensure this works
@test LinearAlgebra.mul!(similar(gA, 3, 3), gA, y) ≈ gA*y

Member:

I think it would be good to check a vector case too, and also a non-GPU case. I'm not 100% sure what happens when tests are run without CUDA, but the new code is LinearAlgebra.mul!(Y::AbstractVecOrMat, A::AbstractMatrix, B::OneHotLike), which certainly includes A::Matrix.

@ExpandingMan (Contributor, Author):

OK, I've added a test for the vector case.

@mcabbott (Member) commented:

Here's an example where this PR changes an error into a silently wrong answer:

julia> using OneHotArrays, LinearAlgebra

julia> A = rand(3, 2)
3×2 Matrix{Float64}:
 0.660383  0.685363
 0.726413  0.795676
 0.325422  0.511682

julia> y = onehotbatch(ones(3), 1:2)
2×3 OneHotMatrix(::Vector{UInt32}) with eltype Bool:
 1  1  1
 ⋅  ⋅  ⋅

julia> z = A * y
3×3 Matrix{Float64}:
 0.660383  0.660383  0.660383
 0.726413  0.726413  0.726413
 0.325422  0.325422  0.325422

julia> z ≈ A * collect(y)
true

julia> function LinearAlgebra.mul!(Y::AbstractVecOrMat, A::AbstractMatrix, B::OneHotArrays.OneHotLike)
         OneHotArrays._isonehot(B) || return invoke(mul!, Tuple{AbstractArray,AbstractMatrix,AbstractMatrix}, Y, A, B)
         size(A,2) == size(B,1) || throw(DimensionMismatch("Matrix column must correspond with the OneHot Size $(size(A,2)) ≠ $(size(B,1))"))
         # matmul sometimes wraps in ReshapedArray, taking parent is a simple way to handle that case
         copyto!(Y, view(A, :, onecold(parent(B))))
       end

julia> z ≈ mul!(fill(NaN,3,3), A, y)
true

julia> z ≈ mul!(fill(NaN,3,3), transpose(permutedims(A)), y)
true

julia> mul!(fill(NaN,4,3), A, y)
4×3 Matrix{Float64}:
 0.660383  0.726413    0.325422
 0.726413  0.325422  NaN
 0.325422  0.660383  NaN
 0.660383  0.726413  NaN

I realise it's annoying to write more tests than your own use case needs, especially when other people adding features have not bothered to do so. But adding more ways to get wrong answers is something we should try pretty hard to avoid.
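
A sketch of one way to close that hole (this is not the code that was merged; it reuses the same internal names as above and simply adds the destination-size check that the NaN example exploits):

julia> function LinearAlgebra.mul!(Y::AbstractVecOrMat, A::AbstractMatrix, B::OneHotArrays.OneHotLike)
         OneHotArrays._isonehot(B) || return invoke(mul!, Tuple{AbstractArray,AbstractMatrix,AbstractMatrix}, Y, A, B)
         size(A,2) == size(B,1) || throw(DimensionMismatch("A has $(size(A,2)) columns, but B has $(size(B,1)) rows"))
         # the missing check: the destination must be exactly size(A,1) × size(B,2)
         size(Y,1) == size(A,1) && size(Y,2) == size(B,2) ||
           throw(DimensionMismatch("destination has size $(size(Y)), expected ($(size(A,1)), $(size(B,2)))"))
         copyto!(Y, view(A, :, onecold(parent(B))))
       end

julia> mul!(fill(NaN,4,3), A, y)  # now an error rather than a silently wrong answer
ERROR: DimensionMismatch: destination has size (4, 3), expected (3, 3)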

@ExpandingMan (Contributor, Author) commented:

Alright, sorry it took me so long to get to this, especially after I had been complaining about it.

OK, I think everything should be just about wrapped up now. There are still a lot of issues floating around because it hits getindex, but they will require a more comprehensive fix (and some cases are probably just not possible to handle elegantly with GPU arrays). With this, at least mul! should not cause problems any more.

@ExpandingMan ExpandingMan requested a review from mcabbott March 29, 2025 00:27

@mcabbott (Member) left a comment:

I pushed some Vector/Matrix tests like I keep suggesting, re-using some of the existing cases for *, and they fail with this PR (but pass without).

> There are still a lot of issues floating around because it hits getindex, but they will require a more comprehensive fix (and some cases are probably just not possible to handle elegantly with GPU arrays).

If you have other cases which fail, please make issues. Otherwise it's just vague feelings & hard for anyone to act on.

mcabbott and others added 2 commits March 28, 2025 22:06
@mcabbott mcabbott merged commit 0b49d1b into FluxML:main Apr 2, 2025
11 checks passed