update GPUToolbox to v0.2 #2687

Open
wants to merge 2 commits into
base: master
Conversation

vchuravy
Member

No description provided.

@@ -4,7 +4,7 @@ using GPUCompiler

using GPUArrays

-using GPUToolbox: SimpleVersion, @sv_str
+using GPUToolbox: SimpleVersion, @sv_str, i32
Member

Why not simply import everything? We control that package, after all.
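
For context on the question above, here is a minimal sketch of the two import styles being weighed. `ToyToolbox` and everything inside it are hypothetical stand-ins, not GPUToolbox's actual code, and the `i32` definition assumes a literal-suffix helper (`2i32` yielding an `Int32`), which this PR does not itself confirm.

```julia
# Hypothetical stand-in for a package; none of these definitions come from GPUToolbox.
module ToyToolbox
export SimpleThing, helper, i32

struct SimpleThing
    x::Int
end

helper(t::SimpleThing) = t.x + 1

# Assumed literal-suffix pattern: `2i32` parses as `2 * i32` and converts to Int32.
struct Literal{T} end
Base.:*(x::Number, ::Type{Literal{T}}) where {T} = T(x)
const i32 = Literal{Int32}
end

# Style used in the diff: every borrowed name is listed explicitly.
module ExplicitUser
using ..ToyToolbox: SimpleThing, i32
two() = 2i32
end

# Style suggested in the review: bring in everything the package exports.
module BlanketUser
using ..ToyToolbox
two() = 2i32
bump() = helper(SimpleThing(1))  # `helper` is visible without being listed
end

println(ExplicitUser.two(), " ", typeof(ExplicitUser.two()))
println(BlanketUser.two(), " ", BlanketUser.bump())
```

The explicit list has to be edited whenever a new name is needed, as this PR does for `i32`. The blanket form avoids that edit, at the cost of any future export from the package silently becoming visible in the importing module.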

Contributor

github-actions bot left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: 9161a0e | Previous: c75b56f | Ratio |
|---|---|---|---|
| latency/precompile | 46241783821 ns | 46181620127.5 ns | 1.00 |
| latency/ttfp | 7104757954 ns | 7068629094 ns | 1.01 |
| latency/import | 3740569375 ns | 3724762175 ns | 1.00 |
| integration/volumerhs | 9624153.5 ns | 9624216.5 ns | 1.00 |
| integration/byval/slices=1 | 146935 ns | 146894 ns | 1.00 |
| integration/byval/slices=3 | 425223 ns | 425137.5 ns | 1.00 |
| integration/byval/reference | 144909 ns | 144952 ns | 1.00 |
| integration/byval/slices=2 | 286077 ns | 285936 ns | 1.00 |
| integration/cudadevrt | 103282 ns | 103412 ns | 1.00 |
| kernel/indexing | 14211 ns | 14099 ns | 1.01 |
| kernel/indexing_checked | 14757 ns | 14674 ns | 1.01 |
| kernel/occupancy | 671.8037974683544 ns | 701.1379310344828 ns | 0.96 |
| kernel/launch | 2059.7 ns | 2179.6666666666665 ns | 0.94 |
| kernel/rand | 14590 ns | 14749 ns | 0.99 |
| array/reverse/1d | 19644 ns | 19776 ns | 0.99 |
| array/reverse/2d | 25320 ns | 24908 ns | 1.02 |
| array/reverse/1d_inplace | 10356.5 ns | 10219 ns | 1.01 |
| array/reverse/2d_inplace | 12076 ns | 11910 ns | 1.01 |
| array/copy | 21226 ns | 21311 ns | 1.00 |
| array/iteration/findall/int | 159303 ns | 158209 ns | 1.01 |
| array/iteration/findall/bool | 139699 ns | 139123 ns | 1.00 |
| array/iteration/findfirst/int | 154653 ns | 153168 ns | 1.01 |
| array/iteration/findfirst/bool | 155261 ns | 154631 ns | 1.00 |
| array/iteration/scalar | 72166 ns | 71886 ns | 1.00 |
| array/iteration/logical | 216022.5 ns | 213254 ns | 1.01 |
| array/iteration/findmin/1d | 41704 ns | 40786 ns | 1.02 |
| array/iteration/findmin/2d | 94285 ns | 93428 ns | 1.01 |
| array/reductions/reduce/1d | 40190.5 ns | 35669 ns | 1.13 |
| array/reductions/reduce/2d | 51223.5 ns | 40477 ns | 1.27 |
| array/reductions/mapreduce/1d | 38542 ns | 33443 ns | 1.15 |
| array/reductions/mapreduce/2d | 51598 ns | 40694.5 ns | 1.27 |
| array/broadcast | 20887 ns | 20825 ns | 1.00 |
| array/copyto!/gpu_to_gpu | 11687 ns | 13806 ns | 0.85 |
| array/copyto!/cpu_to_gpu | 210260 ns | 208873 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 245325 ns | 242948 ns | 1.01 |
| array/accumulate/1d | 109209 ns | 108924 ns | 1.00 |
| array/accumulate/2d | 79725.5 ns | 80034 ns | 1.00 |
| array/construct | 1312.6 ns | 1297.3 ns | 1.01 |
| array/random/randn/Float32 | 44061 ns | 43906.5 ns | 1.00 |
| array/random/randn!/Float32 | 26846 ns | 26669 ns | 1.01 |
| array/random/rand!/Int64 | 27252 ns | 27027 ns | 1.01 |
| array/random/rand!/Float32 | 8717.333333333334 ns | 8863 ns | 0.98 |
| array/random/rand/Int64 | 33643 ns | 30048.5 ns | 1.12 |
| array/random/rand/Float32 | 13102 ns | 13342 ns | 0.98 |
| array/permutedims/4d | 61149.5 ns | 60675.5 ns | 1.01 |
| array/permutedims/2d | 55648.5 ns | 55115.5 ns | 1.01 |
| array/permutedims/3d | 56588 ns | 55700 ns | 1.02 |
| array/sorting/1d | 2766862 ns | 2777689 ns | 1.00 |
| array/sorting/by | 3354947 ns | 3368739 ns | 1.00 |
| array/sorting/2d | 1085516.5 ns | 1084912 ns | 1.00 |
| cuda/synchronization/stream/auto | 1039.5 ns | 1013.5384615384615 ns | 1.03 |
| cuda/synchronization/stream/nonblocking | 6487.8 ns | 6485.2 ns | 1.00 |
| cuda/synchronization/stream/blocking | 840.1782178217821 ns | 826 ns | 1.02 |
| cuda/synchronization/context/auto | 1161.2 ns | 1197.3 ns | 0.97 |
| cuda/synchronization/context/nonblocking | 6706.8 ns | 6768.6 ns | 0.99 |
| cuda/synchronization/context/blocking | 931.1578947368421 ns | 946.4193548387096 ns | 0.98 |

This comment was automatically generated by workflow using github-action-benchmark.
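
The Ratio column in the benchmark table above is consistent with current time divided by previous time, rounded to two decimals, so values above 1 mean the current commit is slower. A small sketch that recomputes a few rows and labels them follows; the 10% tolerance band is an arbitrary assumption for illustration, not a setting taken from the benchmark workflow.

```julia
# Ratio = current / previous, rounded to two decimals (timings copied from the table).
ratio(current, previous) = round(current / previous; digits=2)

rows = [
    ("array/reductions/reduce/1d", 40190.5, 35669.0),
    ("array/reductions/reduce/2d", 51223.5, 40477.0),
    ("array/copyto!/gpu_to_gpu",   11687.0, 13806.0),
    ("kernel/launch",              2059.7,  2179.6666666666665),
]

# Hypothetical tolerance: treat anything within 10% of 1.0 as unchanged.
tolerance = 0.10

classify(r) = r > 1 + tolerance ? "slower" :
              r < 1 - tolerance ? "faster" : "within tolerance"

for (name, current, previous) in rows
    r = ratio(current, previous)
    println(rpad(name, 30), r, "  ", classify(r))
end
```

With this band, the reduction rows (1.13 and 1.27) are labeled slower and `array/copyto!/gpu_to_gpu` (0.85) is labeled faster. Whether those differences reflect this change or run-to-run variation is not something the table alone can settle.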
