-
-
Notifications
You must be signed in to change notification settings - Fork 28
Open
Labels
performanceMust go fasterMust go faster
Description
When copying a vector to a vector, copyto_unaliased!
checks IndexStyle(src) isa IndexLinear
: https://github.com/JuliaLang/julia/blob/9de107a63a56f9ca922497991a7eb9dcd027dc06/base/abstractarray.jl#L961
On the other hand, ldiv! uses copyto!(dst, view(src, 1:n, :))
and that array view has index style IndexCartesian
: https://github.com/JuliaLang/julia/blob/9de107a63a56f9ca922497991a7eb9dcd027dc06/stdlib/LinearAlgebra/src/factorization.jl#L142
n = 10^6; x1, x2 = randn(n), randn(n);
julia> @benchmark copyto!($x1, $x2)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 312.540 μs (0.00% GC)
median time: 329.336 μs (0.00% GC)
mean time: 338.304 μs (0.00% GC)
maximum time: 738.871 μs (0.00% GC)
--------------
samples: 10000
evals/sample: 1
julia> @benchmark copyto!($x1, view($x2, 1:n, :))
BenchmarkTools.Trial:
memory estimate: 288 bytes
allocs estimate: 7
--------------
minimum time: 1.544 ms (0.00% GC)
median time: 1.587 ms (0.00% GC)
mean time: 1.590 ms (0.00% GC)
maximum time: 1.947 ms (0.00% GC)
--------------
samples: 3143
evals/sample: 1
julia> IndexStyle(view(x2, 1:n, :))
IndexCartesian()
julia> IndexStyle(x1)
IndexLinear()
To give a concrete performance issue, calling ldiv!(::Vector{Float64}, ::LDLt{Float64,SymTridiagonal{Float64,Array{Float64,1}}}, ::Vector{Float64})
is substantially slower just because of the slow path through copy_unaliased!
julia> n=10^6; A = SymTridiagonal(4 .+ rand(n), randn(n)); Ainv = ldlt(A);
julia> @benchmark ldiv!(zeros(n), Ainv, randn(n))
BenchmarkTools.Trial:
memory estimate: 15.26 MiB
allocs estimate: 6
--------------
minimum time: 6.828 ms (0.00% GC)
median time: 6.982 ms (0.00% GC)
mean time: 7.134 ms (1.91% GC)
maximum time: 9.062 ms (0.00% GC)
--------------
samples: 701
evals/sample: 1
julia> @benchmark ldiv!(Ainv, randn(n))
BenchmarkTools.Trial:
memory estimate: 7.63 MiB
allocs estimate: 2
--------------
minimum time: 4.855 ms (0.00% GC)
median time: 4.907 ms (0.00% GC)
mean time: 4.985 ms (1.46% GC)
maximum time: 6.199 ms (14.13% GC)
--------------
samples: 1003
evals/sample: 1
julia> @benchmark ldiv!(Ainv, copyto!(zeros(n), randn(n)))
BenchmarkTools.Trial:
memory estimate: 15.26 MiB
allocs estimate: 4
--------------
minimum time: 5.535 ms (0.00% GC)
median time: 5.685 ms (0.00% GC)
mean time: 5.858 ms (2.30% GC)
maximum time: 8.155 ms (10.01% GC)
--------------
samples: 853
evals/sample: 1
version info:
julia> versioninfo()
Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: AMD Ryzen 7 3700X 8-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, znver2)
Metadata
Metadata
Assignees
Labels
performanceMust go fasterMust go faster