Rationalize and try to fix failing ldiv tests #2809
base: master
Your PR requires formatting changes to meet the project's style guidelines. Click here to view the suggested changes.
diff --git a/test/libraries/cusparse/interfaces.jl b/test/libraries/cusparse/interfaces.jl
index fa25d8330..34f9d75f8 100644
--- a/test/libraries/cusparse/interfaces.jl
+++ b/test/libraries/cusparse/interfaces.jl
@@ -258,7 +258,7 @@ nB = 2
end
end
@testset "\\ -- CuMatrix" begin
- C = triangle(opa(A)) \ opb(B)
+ C = triangle(opa(A)) \ opb(B)
dC = triangle(opa(dA)) \ opb(dB)
@test C ≈ collect(dC)
if CUSPARSE.version() < v"12.0"
CUDA.jl Benchmarks
Benchmark suite | Current: 2b58bc1 | Previous: 4f38802 | Ratio |
---|---|---|---|
latency/precompile | 42937214294 ns | 42824416725 ns | 1.00 |
latency/ttfp | 7006899212 ns | 7051266950 ns | 0.99 |
latency/import | 3598749321 ns | 3574411987 ns | 1.01 |
integration/volumerhs | 9609056 ns | 9608389 ns | 1.00 |
integration/byval/slices=1 | 146856 ns | 146872 ns | 1.00 |
integration/byval/slices=3 | 426026.5 ns | 425794 ns | 1.00 |
integration/byval/reference | 145080 ns | 144942 ns | 1.00 |
integration/byval/slices=2 | 286326 ns | 286144 ns | 1.00 |
integration/cudadevrt | 103529 ns | 103388 ns | 1.00 |
kernel/indexing | 14148 ns | 14276 ns | 0.99 |
kernel/indexing_checked | 15036 ns | 15083 ns | 1.00 |
kernel/occupancy | 673.5796178343949 ns | 677.6114649681529 ns | 0.99 |
kernel/launch | 2258.777777777778 ns | 2157.8888888888887 ns | 1.05 |
kernel/rand | 14983 ns | 14900 ns | 1.01 |
array/reverse/1d | 19640 ns | 20028 ns | 0.98 |
array/reverse/2d | 23399 ns | 25007 ns | 0.94 |
array/reverse/1d_inplace | 10382 ns | 10952 ns | 0.95 |
array/reverse/2d_inplace | 12031 ns | 12545 ns | 0.96 |
array/copy | 20933 ns | 21084 ns | 0.99 |
array/iteration/findall/int | 157263 ns | 158043.5 ns | 1.00 |
array/iteration/findall/bool | 139616 ns | 140007 ns | 1.00 |
array/iteration/findfirst/int | 167118 ns | 164557.5 ns | 1.02 |
array/iteration/findfirst/bool | 170423 ns | 167385 ns | 1.02 |
array/iteration/scalar | 71422 ns | 74295 ns | 0.96 |
array/iteration/logical | 213235 ns | 215875.5 ns | 0.99 |
array/iteration/findmin/1d | 45880 ns | 47331 ns | 0.97 |
array/iteration/findmin/2d | 96439 ns | 97017 ns | 0.99 |
array/reductions/reduce/Int64/1d | 42507 ns | 43072.5 ns | 0.99 |
array/reductions/reduce/Int64/dims=1 | 45746.5 ns | 55698.5 ns | 0.82 |
array/reductions/reduce/Int64/dims=2 | 62141 ns | 62572.5 ns | 0.99 |
array/reductions/reduce/Int64/dims=1L | 88721 ns | 89129 ns | 1.00 |
array/reductions/reduce/Int64/dims=2L | 86683 ns | 88184.5 ns | 0.98 |
array/reductions/reduce/Float32/1d | 34086 ns | 35313 ns | 0.97 |
array/reductions/reduce/Float32/dims=1 | 43953 ns | 51818 ns | 0.85 |
array/reductions/reduce/Float32/dims=2 | 59614 ns | 59835 ns | 1.00 |
array/reductions/reduce/Float32/dims=1L | 52197.5 ns | 52336 ns | 1.00 |
array/reductions/reduce/Float32/dims=2L | 69873 ns | 70233.5 ns | 0.99 |
array/reductions/mapreduce/Int64/1d | 42098 ns | 44093 ns | 0.95 |
array/reductions/mapreduce/Int64/dims=1 | 46766 ns | 47633.5 ns | 0.98 |
array/reductions/mapreduce/Int64/dims=2 | 61726 ns | 62709 ns | 0.98 |
array/reductions/mapreduce/Int64/dims=1L | 88921 ns | 89036 ns | 1.00 |
array/reductions/mapreduce/Int64/dims=2L | 86327 ns | 87347.5 ns | 0.99 |
array/reductions/mapreduce/Float32/1d | 34357.5 ns | 34780.5 ns | 0.99 |
array/reductions/mapreduce/Float32/dims=1 | 41477 ns | 41996.5 ns | 0.99 |
array/reductions/mapreduce/Float32/dims=2 | 59999 ns | 60450.5 ns | 0.99 |
array/reductions/mapreduce/Float32/dims=1L | 52679 ns | 52739 ns | 1.00 |
array/reductions/mapreduce/Float32/dims=2L | 70249.5 ns | 70715 ns | 0.99 |
array/broadcast | 20011 ns | 20360 ns | 0.98 |
array/copyto!/gpu_to_gpu | 12766 ns | 12890 ns | 0.99 |
array/copyto!/cpu_to_gpu | 213580.5 ns | 217680 ns | 0.98 |
array/copyto!/gpu_to_cpu | 283317 ns | 286671 ns | 0.99 |
array/accumulate/Int64/1d | 124768.5 ns | 125190 ns | 1.00 |
array/accumulate/Int64/dims=1 | 83243 ns | 84136 ns | 0.99 |
array/accumulate/Int64/dims=2 | 157572 ns | 158690 ns | 0.99 |
array/accumulate/Int64/dims=1L | 1708962 ns | 1709534 ns | 1.00 |
array/accumulate/Int64/dims=2L | 965903 ns | 967437 ns | 1.00 |
array/accumulate/Float32/1d | 109074 ns | 109803 ns | 0.99 |
array/accumulate/Float32/dims=1 | 80983 ns | 81170 ns | 1.00 |
array/accumulate/Float32/dims=2 | 147799 ns | 147834 ns | 1.00 |
array/accumulate/Float32/dims=1L | 1619103.5 ns | 1619112.5 ns | 1.00 |
array/accumulate/Float32/dims=2L | 698281 ns | 698583 ns | 1.00 |
array/construct | 1292.3 ns | 1275.8 ns | 1.01 |
array/random/randn/Float32 | 43716.5 ns | 44761 ns | 0.98 |
array/random/randn!/Float32 | 25068 ns | 25104 ns | 1.00 |
array/random/rand!/Int64 | 27432 ns | 27468 ns | 1.00 |
array/random/rand!/Float32 | 8771 ns | 8662 ns | 1.01 |
array/random/rand/Int64 | 38026 ns | 30080 ns | 1.26 |
array/random/rand/Float32 | 12913 ns | 13152 ns | 0.98 |
array/permutedims/4d | 60201 ns | 60473 ns | 1.00 |
array/permutedims/2d | 53933 ns | 54524 ns | 0.99 |
array/permutedims/3d | 54865.5 ns | 55468 ns | 0.99 |
array/sorting/1d | 2756471 ns | 2763710 ns | 1.00 |
array/sorting/by | 3366837 ns | 3356377 ns | 1.00 |
array/sorting/2d | 1087589 ns | 1085339 ns | 1.00 |
cuda/synchronization/stream/auto | 1032.8 ns | 1018.0909090909091 ns | 1.01 |
cuda/synchronization/stream/nonblocking | 7152.2 ns | 7602.700000000001 ns | 0.94 |
cuda/synchronization/stream/blocking | 810.1630434782609 ns | 806.236559139785 ns | 1.00 |
cuda/synchronization/context/auto | 1178.4 ns | 1183.8 ns | 1.00 |
cuda/synchronization/context/nonblocking | 7946.4 ns | 7801 ns | 1.02 |
cuda/synchronization/context/blocking | 918.8809523809524 ns | 897.2923076923076 ns | 1.02 |
This comment was automatically generated by workflow using github-action-benchmark.
5db6744 to ef0395c
ef0395c to 2b58bc1
@maleadt failing tests are the |
Trying to fix intermittently failing CI. It doesn't make sense to have these checks for only one of the in-place/not-in-place versions. Hopefully this helps stability.
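To illustrate the idea of applying the same check to both variants, here is a minimal CPU-only sketch. It is not the code from this PR: `skip_case` and `check_ldiv` are hypothetical names, the version is passed in by hand instead of coming from `CUSPARSE.version()`, and dense `UpperTriangular` matrices stand in for the sparse GPU types so that it runs without a GPU. What the real tests do on older library versions is not shown in the diff above, so the sketch simply skips the case.

```julia
using LinearAlgebra
using Test

# Hypothetical guard shared by both variants. The real tests query
# CUSPARSE.version(); here the version is just an argument.
skip_case(version::VersionNumber) = version < v"12.0"

function check_ldiv(A, B; version = v"12.1")
    # One guard decides for the in-place and the non-in-place path alike.
    skip_case(version) && return :skipped
    T = UpperTriangular(A)
    C = T \ B          # non-in-place solve, allocates the result
    D = copy(B)
    ldiv!(T, D)        # in-place solve, overwrites D with the result
    @test C ≈ D        # both variants must agree
    @test T * C ≈ B    # and the result must actually solve the system
    return :checked
end

A = rand(4, 4) + 4I    # shifted diagonal keeps the triangular solve well conditioned
B = rand(4, 2)
check_ldiv(A, B)
```

Routing both variants through one guard means a version-dependent skip cannot silently apply to only one of them.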