Audit uses of 32-bit indexing #1968
Labels: bug (Something isn't working)
Comments
Dear CUDA.jl team, I would like to bump this issue.
Error 1 (broadcasting)
julia> using CUDA
julia> A = CUDA.fill(1f0, 2^32); A .= 2f0
ERROR: InexactError: trunc(Int32, 4294967296)
Stacktrace:
[1] throw_inexacterror(::Symbol, ::Vararg{Any})
@ Core ./boot.jl:750
[2] checked_trunc_sint
@ ./boot.jl:764 [inlined]
[3] toInt32
@ ./boot.jl:801 [inlined]
[4] Int32
@ ./boot.jl:891 [inlined]
[5] convert
@ ./number.jl:7 [inlined]
[6] cconvert
@ ./essentials.jl:687 [inlined]
[7] macro expansion
@ ~/.julia/packages/CUDA/1kIOw/lib/utils/call.jl:222 [inlined]
[8] macro expansion
@ ~/.julia/packages/CUDA/1kIOw/lib/cudadrv/libcuda.jl:5139 [inlined]
[9] #735
@ ~/.julia/packages/CUDA/1kIOw/lib/utils/call.jl:35 [inlined]
[10] check
@ ~/.julia/packages/CUDA/1kIOw/lib/cudadrv/libcuda.jl:35 [inlined]
[11] cuOccupancyMaxPotentialBlockSize
@ ~/.julia/packages/CUDA/1kIOw/lib/utils/call.jl:34 [inlined]
[12] launch_configuration(fun::CuFunction; shmem::Int64, max_threads::Int64)
@ CUDA ~/.julia/packages/CUDA/1kIOw/lib/cudadrv/occupancy.jl:61
[13] launch_configuration
@ ~/.julia/packages/CUDA/1kIOw/lib/cudadrv/occupancy.jl:56 [inlined]
[14] (::KernelAbstractions.Kernel{…})(::CuArray{…}, ::Vararg{…}; ndrange::Tuple{…}, workgroupsize::Nothing)
@ CUDA.CUDAKernels ~/.julia/packages/CUDA/1kIOw/src/CUDAKernels.jl:107
Error 2 (filling a large array, no explicit broadcasting)
julia> A = CUDA.fill(true, 2^32);
ERROR: InexactError: trunc(Int32, 4294967296)
Stacktrace:
[1] throw_inexacterror(::Symbol, ::Vararg{Any})
@ Core ./boot.jl:750
[2] checked_trunc_sint
@ ./boot.jl:764 [inlined]
[3] toInt32
@ ./boot.jl:801 [inlined]
[4] Int32
@ ./boot.jl:891 [inlined]
[5] convert
@ ./number.jl:7 [inlined]
[6] cconvert
@ ./essentials.jl:687 [inlined]
[7] macro expansion
@ ~/.julia/packages/CUDA/1kIOw/lib/utils/call.jl:222 [inlined]
[8] macro expansion
@ ~/.julia/packages/CUDA/1kIOw/lib/cudadrv/libcuda.jl:5139 [inlined]
[9] #735
@ ~/.julia/packages/CUDA/1kIOw/lib/utils/call.jl:35 [inlined]
[10] check
@ ~/.julia/packages/CUDA/1kIOw/lib/cudadrv/libcuda.jl:35 [inlined]
[11] cuOccupancyMaxPotentialBlockSize
@ ~/.julia/packages/CUDA/1kIOw/lib/utils/call.jl:34 [inlined]
[12] launch_configuration(fun::CuFunction; shmem::Int64, max_threads::Int64)
@ CUDA ~/.julia/packages/CUDA/1kIOw/lib/cudadrv/occupancy.jl:61
[13] launch_configuration
@ ~/.julia/packages/CUDA/1kIOw/lib/cudadrv/occupancy.jl:56 [inlined]
[14] (::KernelAbstractions.Kernel{…})(::CuArray{…}, ::Vararg{…}; ndrange::Tuple{…}, workgroupsize::Nothing)
@ CUDA.CUDAKernels ~/.julia/packages/CUDA/1kIOw/src/CUDAKernels.jl:107
[15] fill!(A::CuArray{Bool, 1, CUDA.DeviceMemory}, x::Bool)
@ GPUArrays ~/.julia/packages/GPUArrays/uiVyU/src/host/construction.jl:22
[16] fill
@ ~/.julia/packages/CUDA/1kIOw/src/array.jl:777 [inlined]
[17] macro expansion
@ ~/.julia/packages/CUDA/1kIOw/src/utilities.jl:35 [inlined]
[18] macro expansion
@ ~/.julia/packages/CUDA/1kIOw/src/memory.jl:831 [inlined]
[19] top-level scope
@ ./REPL[114]:1
Some type information was truncated. Use `show(err)` to see complete types.
EDIT: I believe this was fixed a couple of days ago. I'll wait for the next release and re-run my code.
As you noted, those issues are unrelated, and are fixed on the master branch.
We're currently using Int32 indices in some kernels, using the `i32` hack, because that often results in significantly better performance. However, GPUs are getting larger, and users are starting to use arrays with more than `typemax(Int32)` elements. This can result in bugs like #1963.
We should be more careful about using 32-bit indexing, and probably not use `i32` until we have a better way of deciding which index type to use. Maybe we can add some kind of `index_type` trait, defaulting to `Int` but possibly using `Int32` when the input arrays allow it, e.g., using #1895.
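To make the trait idea concrete, here is a minimal host-side sketch of what such a selection could look like. It is only an illustration under stated assumptions: `index_type` is a hypothetical name taken from the wording above and is not an existing CUDA.jl or GPUArrays function, and the sketch decides purely on `length`, without the bounds information that #1895 might provide. It runs on the CPU and uses ranges as stand-ins for large arrays, so no GPU or large allocation is needed. A 64-bit Julia is assumed, so that `2^32` does not overflow.

```julia
# Hypothetical trait (not part of CUDA.jl): choose Int32 only when every
# input can be fully addressed with a 32-bit index, otherwise fall back to Int.
index_type(As::AbstractArray...) =
    all(A -> length(A) <= typemax(Int32), As) ? Int32 : Int

# Ranges are AbstractArrays, so they can stand in for arrays of that length
# without allocating any memory.
small = 1:10
large = 1:2^32          # 4294967296 elements, more than typemax(Int32)

I_small = index_type(small)          # 32-bit indexing is safe here
I_large = index_type(large)          # must use the wide index type
I_mixed = index_type(small, large)   # one large input forces the wide type

# Converting an out-of-range length to Int32 is what raises the
# InexactError seen in the stack traces above.
overflowed = try
    Int32(length(large))
    false
catch err
    err isa InexactError
end
```

The value-based choice is made once on the host, so a kernel launch could be specialized on the returned type while defaulting to `Int` whenever any input is too large.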