
Commit c70bb39

Merge pull request #514 from JuliaGPU/tb/docs
Add some docs
2 parents 11eb8ec + 89896bf

File tree

9 files changed: +287 -20 lines

- docs/Manifest.toml
- docs/make.jl
- docs/src/development/troubleshooting.md
- docs/src/faq.md
- docs/src/installation/overview.md
- docs/src/installation/troubleshooting.md
- docs/src/usage/array.md
- lib/cudadrv/error.jl
- src/device/intrinsics/wmma.jl

docs/Manifest.toml

Lines changed: 14 additions & 8 deletions
@@ -18,10 +18,16 @@ uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
 version = "0.8.3"

 [[Documenter]]
-deps = ["Base64", "Dates", "DocStringExtensions", "InteractiveUtils", "JSON", "LibGit2", "Logging", "Markdown", "REPL", "Test", "Unicode"]
-git-tree-sha1 = "fb1ff838470573adc15c71ba79f8d31328f035da"
+deps = ["Base64", "Dates", "DocStringExtensions", "IOCapture", "InteractiveUtils", "JSON", "LibGit2", "Logging", "Markdown", "REPL", "Test", "Unicode"]
+git-tree-sha1 = "71e35e069daa9969b8af06cef595a1add76e0a11"
 uuid = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
-version = "0.25.2"
+version = "0.25.3"
+
+[[IOCapture]]
+deps = ["Logging"]
+git-tree-sha1 = "377252859f740c217b936cebcd918a44f9b53b59"
+uuid = "b5f81e59-6552-4d32-b1f0-c071b021bf89"
+version = "0.1.1"

 [[InteractiveUtils]]
 deps = ["Markdown"]
@@ -42,9 +48,9 @@ uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"

 [[Literate]]
 deps = ["Base64", "JSON", "REPL"]
-git-tree-sha1 = "0ee3b052b944e1a84b6eb0ca15ce3899718df599"
+git-tree-sha1 = "7f289e9db7a93d30b9a44af4a8ae9cf92af74683"
 uuid = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
-version = "2.6.0"
+version = "2.7.0"

 [[Logging]]
 uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
@@ -57,10 +63,10 @@ uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"
 uuid = "a63ad114-7e13-5084-954f-fe012c677804"

 [[Parsers]]
-deps = ["Dates", "Test"]
-git-tree-sha1 = "8077624b3c450b15c087944363606a6ba12f925e"
+deps = ["Dates"]
+git-tree-sha1 = "6fa4202675c05ba0f8268a6ddf07606350eda3ce"
 uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
-version = "1.0.10"
+version = "1.0.11"

 [[Pkg]]
 deps = ["Dates", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "UUIDs"]

docs/make.jl

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ function main()
         ],
         "Development" => Any[
             "development/profiling.md",
+            "development/troubleshooting.md",
         ],
         "API reference" => Any[
             "api/essentials.md",
docs/src/development/troubleshooting.md

Lines changed: 175 additions & 0 deletions

@@ -0,0 +1,175 @@

# Troubleshooting

To increase the logging verbosity of the CUDA.jl compiler, launch Julia with the
`JULIA_DEBUG` environment variable set to `CUDA`.
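For example, a minimal sketch from within Julia, assuming you set the variable before CUDA.jl is loaded (setting `JULIA_DEBUG=CUDA` in the shell environment before starting Julia is equivalent):

```julia
# enable debug-level log messages from the CUDA module
ENV["JULIA_DEBUG"] = "CUDA"

using CUDA  # compiler activity is now logged verbosely
```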
## InvalidIRError: compiling ... resulted in invalid LLVM IR

Not all of Julia is supported by CUDA.jl. Several commonly-used features, like strings or
exceptions, will not compile to GPU code because of their interactions with the CPU-only
runtime library.

For example, say we define and try to execute the following kernel:

```julia
julia> function kernel(a)
         @inbounds a[threadId().x] = 0
         return
       end

julia> @cuda kernel(CuArray([1]))
ERROR: InvalidIRError: compiling kernel kernel(CuDeviceArray{Int64,1,1}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to setindex!)
Stacktrace:
 [1] kernel at REPL[2]:2
Reason: unsupported dynamic function invocation (call to getproperty)
Stacktrace:
 [1] kernel at REPL[2]:2
Reason: unsupported use of an undefined name (use of 'threadId')
Stacktrace:
 [1] kernel at REPL[2]:2
```

CUDA.jl does its best to decode the unsupported IR and figure out where it came from. In
this case, there are two so-called dynamic invocations, which happen when a function call
cannot be statically resolved (often because the compiler could not fully infer the call,
e.g., due to inaccurate or unstable type information). Here, these are a red herring, and
the real cause is listed last: a typo in the use of the `threadIdx` function! If we fix
this, the IR error disappears and our kernel successfully compiles and executes, as shown
below.
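For reference, a sketch of the corrected kernel, which compiles and runs cleanly:

```julia
julia> function kernel(a)
           @inbounds a[threadIdx().x] = 0  # threadIdx, not threadId
           return
       end
kernel (generic function with 1 method)

julia> @cuda kernel(CuArray([1]))
```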
## KernelError: kernel returns a value of type `Union{}`

Where the previous section clearly pointed to the source of the invalid IR, in other cases
your function will return an error. This is encoded by the Julia compiler as a return value
of type `Union{}`:

```julia
julia> function kernel(a)
         @inbounds a[threadIdx().x] = CUDA.sin(a[threadIdx().x])
         return
       end

julia> @cuda kernel(CuArray([1]))
ERROR: GPU compilation of kernel kernel(CuDeviceArray{Int64,1,1}) failed
KernelError: kernel returns a value of type `Union{}`
```
Now we don't know where this error came from, and we will have to take a look at the
generated code ourselves. This is easily done using the `@device_code` introspection
macros, which mimic their Base counterparts (e.g. `@device_code_llvm` instead of
`@code_llvm`, etc.).

To debug an error returned by a kernel, we should use `@device_code_warntype` to inspect
the Julia IR. Furthermore, this macro has an `interactive` mode, which further facilitates
inspecting this IR using Cthulhu.jl. First, install and import this package, and then try
to execute the kernel again prefixed by `@device_code_warntype interactive=true`:

```julia
julia> using Cthulhu

julia> @device_code_warntype interactive=true @cuda kernel(CuArray([1]))
Variables
  #self#::Core.Compiler.Const(kernel, false)
  a::CuDeviceArray{Int64,1,1}
  val::Union{}

Body::Union{}
1 ─ %1  = CUDA.sin::Core.Compiler.Const(CUDA.sin, false)
│   ...
│   %14 = (...)::Int64
└──       goto #2
2 ─       (%1)(%14)
└──       $(Expr(:unreachable))

Select a call to descend into or ↩ to ascend.
%17 = call CUDA.sin(::Int64)::Union{}
```
Both from the IR and the list of calls Cthulhu offers to inspect further, we can see that
the call to `CUDA.sin(::Int64)` results in an error: in the IR it is immediately followed
by an `unreachable`, while in the list of calls it is inferred to return `Union{}`. Now
that we know where to look, it's easy to figure out what's wrong:

```julia
help?> CUDA.sin
  # 2 methods for generic function "sin":
  [1] sin(x::Float32) in CUDA at /home/tim/Julia/pkg/CUDA/src/device/intrinsics/math.jl:13
  [2] sin(x::Float64) in CUDA at /home/tim/Julia/pkg/CUDA/src/device/intrinsics/math.jl:12
```

There's no method of `CUDA.sin` that accepts an `Int64`, and thus the function was
determined to unconditionally throw a method error. For now, we disallow these situations
and refuse to compile, but in the spirit of dynamic languages we might change this behavior
to just throw an error at run time.
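One way out, sketched here as our own addition: give the kernel an array whose element type matches one of the available methods, e.g. `Float32`:

```julia
julia> function kernel(a)
           @inbounds a[threadIdx().x] = CUDA.sin(a[threadIdx().x])
           return
       end

julia> @cuda kernel(CuArray([1f0]))  # Float32 elements match CUDA.sin(x::Float32)
```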
## Debug info and line-number information

On Julia debug level 1, which is the default setting if unspecified, CUDA.jl emits line
number information corresponding to `nvcc -lineinfo`. This information does not hurt
performance, and is used by a variety of tools to improve the debugging experience.

To emit actual debug info as `nvcc -G` does, you need to start Julia on debug level 2 by
passing the flag `-g2`. Support for emitting PTX-compatible debug info is a recent addition
to the NVPTX LLVM back-end, so it's possible this information is incorrect or otherwise
affects compilation.

!!! warning

    Due to bugs in LLVM and/or CUDA, the debug info as emitted by LLVM 8.0 or higher
    results in crashes when loading the compiled code. As a result, all types of debug info
    are disabled by CUDA.jl on Julia 1.4 or above. If you need line number information, you
    need to revert to using Julia 1.3, which uses LLVM 6.0 (note that actual debug info is
    not supported by LLVM 6.0).

To disable all debug info emission, start Julia with the flag `-g0`.
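In summary, the debug levels map to the following invocations (a quick reference, assuming a Unix shell):

```
$ julia -g0    # no debug info at all
$ julia        # default (-g1): line-number information, as with nvcc -lineinfo
$ julia -g2    # actual debug info, as with nvcc -G
```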
## Stack trace information

The Julia debug level is also used to determine how much backtrace information to embed in
the module. This information is used when displaying exceptions on the device, e.g., when
going out of bounds:

```julia
julia> function kernel(a)
         a[threadIdx().x] = 0
         return
       end
kernel (generic function with 1 method)

julia> @cuda threads=2 kernel(CuArray([1]))
```

Since we launch two threads over a one-element array, the second thread will index out of
bounds. On the default debug level of 1, a simple error message will be displayed:

```
ERROR: a exception was thrown during kernel execution.
Run Julia on debug level 2 for device stack traces.
```

If we set the debug level to 2, by passing `-g2` to `julia`, we see:

```
ERROR: a exception was thrown during kernel execution.
Stacktrace:
 [1] throw_boundserror at abstractarray.jl:541
 [2] checkbounds at abstractarray.jl:506
 [3] arrayset at /home/tim/Julia/pkg/CUDA/src/device/array.jl:84
 [4] setindex! at /home/tim/Julia/pkg/CUDA/src/device/array.jl:101
 [5] kernel at REPL[4]:2
```

Note that these messages are embedded in the module (CUDA does not support stack
unwinding), and thus bloat its size. To avoid any overhead, you can disable these messages
by setting the debug level to 0 (passing `-g0` to `julia`). This disables any device-side
message, but retains the host-side detection:

```julia
julia> @cuda threads=2 kernel(CuArray([1]))
# no device-side error message!

julia> synchronize()
ERROR: KernelException: exception thrown during kernel execution
```

docs/src/faq.md

Lines changed: 42 additions & 0 deletions
@@ -3,6 +3,48 @@
This page is a compilation of frequently asked questions and answers.


## An old version of CUDA.jl keeps getting installed!

Sometimes it happens that a breaking version of CUDA.jl or one of its dependencies is
released. If any package you use isn't yet compatible with this release, it will block the
automatic upgrade of CUDA.jl. For example, with Flux.jl v0.11.1 we get CUDA.jl v1.3.3
despite there being a v2.x release:

```
pkg> add Flux
  [587475ba] + Flux v0.11.1

pkg> add CUDA
  [052768ef] + CUDA v1.3.3
```

To examine which package is holding back CUDA.jl, you can "force" an upgrade by
specifically requesting a newer version. The resolver will then complain, and explain why
this upgrade isn't possible:

```
pkg> add CUDA@2
   Resolving package versions...
ERROR: Unsatisfiable requirements detected for package Adapt [79e6a3ab]:
 Adapt [79e6a3ab] log:
 ├─possible versions are: [0.3.0-0.3.1, 0.4.0-0.4.2, 1.0.0-1.0.1, 1.1.0, 2.0.0-2.0.2, 2.1.0, 2.2.0, 2.3.0] or uninstalled
 ├─restricted by compatibility requirements with CUDA [052768ef] to versions: [2.2.0, 2.3.0]
 │ └─CUDA [052768ef] log:
 │   ├─possible versions are: [0.1.0, 1.0.0-1.0.2, 1.1.0, 1.2.0-1.2.1, 1.3.0-1.3.3, 2.0.0-2.0.2] or uninstalled
 │   └─restricted to versions 2 by an explicit requirement, leaving only versions 2.0.0-2.0.2
 └─restricted by compatibility requirements with Flux [587475ba] to versions: [0.3.0-0.3.1, 0.4.0-0.4.2, 1.0.0-1.0.1, 1.1.0] — no versions left
   └─Flux [587475ba] log:
     ├─possible versions are: [0.4.1, 0.5.0-0.5.4, 0.6.0-0.6.10, 0.7.0-0.7.3, 0.8.0-0.8.3, 0.9.0, 0.10.0-0.10.4, 0.11.0-0.11.1] or uninstalled
     ├─restricted to versions * by an explicit requirement, leaving only versions [0.4.1, 0.5.0-0.5.4, 0.6.0-0.6.10, 0.7.0-0.7.3, 0.8.0-0.8.3, 0.9.0, 0.10.0-0.10.4, 0.11.0-0.11.1]
     └─restricted by compatibility requirements with CUDA [052768ef] to versions: [0.4.1, 0.5.0-0.5.4, 0.6.0-0.6.10, 0.7.0-0.7.3, 0.8.0-0.8.3, 0.9.0, 0.10.0-0.10.4] or uninstalled, leaving only versions: [0.4.1, 0.5.0-0.5.4, 0.6.0-0.6.10, 0.7.0-0.7.3, 0.8.0-0.8.3, 0.9.0, 0.10.0-0.10.4]
       └─CUDA [052768ef] log: see above
```

A common source of these incompatibilities is having both CUDA.jl and the older
CUDAnative.jl/CuArrays.jl/CUDAdrv.jl stack installed: these are incompatible and cannot
coexist. You can inspect which exact packages you have installed in the Pkg REPL using the
`status --manifest` option, as shown below.
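For example, via the `Pkg` API from the Julia prompt (a sketch; `pkg> status --manifest` in the Pkg REPL is equivalent):

```julia
import Pkg

# list everything in the manifest, including indirect dependencies;
# look for CUDAnative, CuArrays or CUDAdrv entries alongside CUDA
Pkg.status(mode=Pkg.PKGMODE_MANIFEST)
```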
## Can you wrap this or that CUDA API?

If a certain API isn't wrapped with some high-level functionality, you can always use the

docs/src/installation/overview.md

Lines changed: 41 additions & 0 deletions
@@ -8,6 +8,47 @@ using the artifact subsystem.


## Package installation

For most users, installing the latest tagged version of CUDA.jl will be sufficient. You can
easily do that using the package manager:

```
pkg> add CUDA
```

Or, equivalently, via the `Pkg` API:

```julia
julia> import Pkg; Pkg.add("CUDA")
```

In some cases, you might need to use the `master` version of this package, e.g., because it
includes a specific fix you need. Often, however, the development version of this package
itself relies on unreleased versions of other packages. This information is recorded in the
manifest at the root of the repository, which you can use by starting Julia from the
CUDA.jl directory with the `--project` flag:

```
$ cd .julia/dev/CUDA.jl     # or wherever you have CUDA.jl checked out
$ julia --project
pkg> instantiate            # to install correct dependencies
julia> using CUDA
```

In case you want to use the development version of CUDA.jl with other packages, you cannot
use the manifest, and you need to manually install those dependencies from the master
branch. Again, the exact requirements are recorded in CUDA.jl's manifest, but often the
following instructions will work:

```
pkg> add GPUCompiler#master
pkg> add GPUArrays#master
pkg> add LLVM#master
```


## Platform support

All three major operating systems are supported: Linux, Windows and macOS. However, that

docs/src/installation/troubleshooting.md

Lines changed: 8 additions & 0 deletions
@@ -21,12 +21,14 @@ Generally though, it's impossible to say what's the reason for the error, but Julia is
likely not to blame. Make sure your set-up works (e.g., try executing `nvidia-smi`, a CUDA C
binary, etc), and if everything looks good file an issue.


## NVML library not found (on Windows)

Check and make sure the `NVSMI` folder is in your `PATH`. By default it may not be. Look in
`C:\Program Files\NVIDIA Corporation` for the `NVSMI` folder - you should see `nvml.dll`
within it. You can add this folder to your `PATH` and check that `nvidia-smi` runs properly.
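As a hypothetical sketch (the path and the approach are assumptions, not from the original text), you could also make that folder visible to just the current Julia session before loading CUDA.jl:

```julia
# hypothetical: append the default NVSMI location to this session's PATH
# (adjust the path to match your driver installation)
ENV["PATH"] *= ";C:\\Program Files\\NVIDIA Corporation\\NVSMI"

using CUDA  # NVML should now be locatable
```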
## LLVM error: Cannot cast between two non-generic address spaces

You are using an unpatched copy of LLVM, likely caused by using Julia as packaged by your

@@ -37,3 +39,9 @@ extensive list of patches to be applied to the specific versions of LLVM that are
It is thus recommended to use the official binaries, or use a version of Julia built without
setting `USE_SYSTEM_LLVM=1` (which you can suggest to maintainers of your Linux distribution).


## LoadError: UndefVarError: AddrSpacePtr not defined

You are using an old version of CUDA.jl in combination with a recent version of Julia
(1.5+). This is not supported, and you should be using CUDA.jl 1.x or above.

docs/src/usage/array.md

Lines changed: 2 additions & 2 deletions
@@ -326,8 +326,8 @@ julia> y = CUDA.rand(2)
  0.03902049
  0.9689629

-julia> CUBLAS.dot(2, x, 0, y, 0)
-0.057767443f0
+julia> CUBLAS.dot(2, x, y)
+0.92129254f0

 julia> using LinearAlgebra

lib/cudadrv/error.jl

Lines changed: 2 additions & 8 deletions
@@ -26,18 +26,12 @@ Base.:(==)(x::CuError,y::CuError) = x.code == y.code

 Gets the string representation of an error code.

-This name can often be used as a symbol in source code to get an instance of this error.
-For example:
-
 ```jldoctest
-julia> err = CuError(1)
-CuError(1, ERROR_INVALID_VALUE)
+julia> err = CuError(CUDA.cudaError_enum(1))
+CuError(CUDA_ERROR_INVALID_VALUE)

 julia> name(err)
 "ERROR_INVALID_VALUE"
-
-julia> ERROR_INVALID_VALUE
-CuError(1, ERROR_INVALID_VALUE)
 ```
 """
 function name(err::CuError)

src/device/intrinsics/wmma.jl

Lines changed: 2 additions & 2 deletions
@@ -379,8 +379,8 @@ All WMMA operations take a `Config` as their final argument.

 # Examples
 ```jldoctest
-julia> config = Config{16, 16, 16, Float32}
-Config{16,16,16,Float32}
+julia> config = WMMA.Config{16, 16, 16, Float32}
+CUDA.WMMA.Config{16,16,16,Float32}
 ```
 """
 struct Config{M, N, K, d_type} end
