multi-threaded parallel calls to LinearAlgebra.mul! are slower than serial calls #1000

@lmiq

from: https://discourse.julialang.org/t/multi-threading-of-julia-1-8-5-does-not-improve-speed-when-combined-with-blas/97568/12

This is the minimal example I arrived at. It consists of calling mul! within a multi-threaded loop:

import Pkg
Pkg.activate(".")

using LinearAlgebra
using BenchmarkTools
using Random

BLAS.set_num_threads(1)

function test()
    nMax = 1e2
    mMax = 1e3
    s = zeros(Threads.nthreads())
    Threads.@threads for i in 1:nMax
        tmp = 0.0
        a = rand(10,10)
        b = similar(a)
        for m = 1:mMax
            mul!(b, a, a)
            tmp = tmp + sum(b)
        end
        s[Threads.threadid()] = tmp
    end
    return sum(s)
end

@btime test()

using MKL
@btime test()

When running with a single thread, one gets:

% julia -t 1 code.jl
   Activating project at `~/Downloads`
  29.816 ms (208 allocations: 175.73 KiB)
  28.485 ms (208 allocations: 175.73 KiB)

meaning that mul! performs about the same with the default BLAS backend (OpenBLAS) as with MKL.

With multi-threading, one gets:

% julia -t 4 code.jl
  Activating project at `~/Downloads`
  43.085 ms (226 allocations: 177.53 KiB)
  10.046 ms (226 allocations: 177.53 KiB)

With MKL, the 4-thread run is about 2.8x faster than the serial one (10.0 ms vs. 28.5 ms), as expected, but with the default backend it is about 1.4x slower than the serial one (43.1 ms vs. 29.8 ms).
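Since the comparison hinges on which BLAS backend is active and how many threads each layer uses, a quick check of the configuration can help when reproducing this. The following is a minimal sketch, assuming Julia 1.7 or later (where `BLAS.get_config()` is available); it only reports settings and does not change the benchmark:

```julia
using LinearAlgebra

# Same setting as in the reproducer: one BLAS thread.
BLAS.set_num_threads(1)

# Julia threads (set with `julia -t N`) and BLAS threads are independent settings.
println("Julia threads: ", Threads.nthreads())
println("BLAS threads:  ", BLAS.get_num_threads())

# List the BLAS/LAPACK libraries currently loaded through libblastrampoline.
# After `using MKL`, an MKL library is expected to show up here.
for lib in BLAS.get_config().loaded_libs
    println(basename(lib.libname), " (", lib.interface, ")")
end
```

Running this once before and once after `using MKL` should show which library each `@btime test()` call was actually measuring.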

I've run this on Julia 1.9.0-rc1 and 1.8.5, and the results are the same.
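To rule out the shared `s` array and the `Threads.threadid()` indexing as contributors, one could time a variant in which each outer iteration runs in its own task and returns its own sum. This is only a sketch: `test_tasks` is a hypothetical name, the loop bounds are integers instead of `1e2`/`1e3`, and unlike `test()` it adds up every iteration's result rather than keeping the last value written per thread. It is not claimed to remove the slowdown.

```julia
using LinearAlgebra

BLAS.set_num_threads(1)

# Hypothetical variant of test(): same work per outer iteration,
# but no shared array and no indexing by threadid().
function test_tasks(nMax::Int = 100, mMax::Int = 1000)
    tasks = map(1:nMax) do _
        Threads.@spawn begin
            tmp = 0.0
            a = rand(10, 10)
            b = similar(a)
            for _ in 1:mMax
                mul!(b, a, a)   # in-place product b = a * a
                tmp += sum(b)
            end
            tmp                 # value returned by the task
        end
    end
    # Wait for every task and add up the per-iteration sums.
    return sum(fetch, tasks)
end

test_tasks()
```

If `@btime test_tasks()` shows the same pattern under `julia -t 4`, that would point at the `mul!` calls themselves rather than at how the results are collected.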

Labels: multithreading (Base.Threads and related functionality), performance (Must go faster)
