Labels: multithreading, performance
Description
This is the minimal example I arrived at. It consists of calling mul! within a multi-threaded loop:
    import Pkg
    Pkg.activate(".")

    using LinearAlgebra
    using BenchmarkTools
    using Random

    BLAS.set_num_threads(1)

    function test()
        nMax = 1e2
        mMax = 1e3
        s = zeros(Threads.nthreads())
        Threads.@threads for i in 1:nMax
            tmp = 0.0
            a = rand(10,10)
            b = similar(a)
            for m = 1:mMax
                mul!(b, a, a)
                tmp = tmp + sum(b)
            end
            s[Threads.threadid()] = tmp
        end
        return sum(s)
    end

    @btime test()

    using MKL
    @btime test()
When running with a single thread, one gets:
% julia -t 1 code.jl
Activating project at `~/Downloads`
29.816 ms (208 allocations: 175.73 KiB)
28.485 ms (208 allocations: 175.73 KiB)
meaning that mul! performs about the same with the default LinearAlgebra backend (OpenBLAS) and with MKL.
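As far as I understand, on Julia 1.7 and later `using MKL` does not define a separate `mul!`; it swaps the BLAS library behind `LinearAlgebra`. A minimal sketch for confirming which backend and thread counts are actually in effect at each `@btime` call follows. The helper name `blas_report` is hypothetical; `BLAS.get_config` and `BLAS.get_num_threads` are the standard `LinearAlgebra` calls.

```julia
using LinearAlgebra

# Hypothetical helper: summarize the BLAS backend currently in use.
# BLAS.get_config() lists the libraries loaded through libblastrampoline
# (requires Julia 1.7 or later).
function blas_report()
    libs = [basename(lib.libname) for lib in BLAS.get_config().loaded_libs]
    return (libraries = libs,
            blas_threads = BLAS.get_num_threads(),
            julia_threads = Threads.nthreads())
end

BLAS.set_num_threads(1)
println(blas_report())
```

Calling `blas_report()` once before and once after `using MKL` would show whether the thread count set with `BLAS.set_num_threads(1)` still applies to the newly loaded library. Whether it carries over is an assumption worth checking rather than something this report establishes.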
With multi-threading, one gets:
% julia -t 4 code.jl
Activating project at `~/Downloads`
43.085 ms (226 allocations: 177.53 KiB)
10.046 ms (226 allocations: 177.53 KiB)
With MKL, the 4-thread run is roughly 2.8x faster than the serial one (10.0 ms vs. 28.5 ms), as expected. With the default backend, the 4-thread run is slower than the serial one (43.1 ms vs. 29.8 ms).
I've run this on Julia 1.9.0-rc1 and 1.8.5, and the results are the same.
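Two details of the reproducer do not change the workload but are worth knowing when reading it: `s[Threads.threadid()] = tmp` overwrites the slot on every iteration instead of accumulating, and `nMax = 1e2` makes the loop range floating-point. A hedged variant that avoids both is sketched below. The name `test_chunked` and its arguments are hypothetical, and it has not been timed against the numbers above.

```julia
using LinearAlgebra

# Hypothetical variant of test(): integer bounds, one task per chunk,
# and a task-local accumulator instead of an array indexed by threadid().
function test_chunked(nMax::Int = 100, mMax::Int = 1000)
    chunksize = cld(nMax, Threads.nthreads())
    tasks = map(Iterators.partition(1:nMax, chunksize)) do chunk
        Threads.@spawn begin
            acc = 0.0
            for _ in chunk
                a = rand(10, 10)
                b = similar(a)
                for _ in 1:mMax
                    mul!(b, a, a)   # same 10x10 product as in test()
                    acc += sum(b)
                end
            end
            acc   # value returned by the task
        end
    end
    return sum(fetch, tasks)
end

println(test_chunked())
```

Because each task owns its accumulator, the result does not depend on which thread runs which iteration. The per-iteration work (`rand`, `similar`, `mul!`, `sum`) is kept the same as in `test()` so that the two can be compared with `@btime`.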