-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POC: Make ZonedDateTime
an isbits type
#287
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #287 +/- ##
==========================================
+ Coverage 93.64% 93.74% +0.09%
==========================================
Files 32 33 +1
Lines 1527 1583 +56
==========================================
+ Hits 1430 1484 +54
- Misses 97 99 +2
Continue to review full report at Codecov.
|
I'll note the benchmarks above are all using #281 |
One thing to benchmark: Current master:
This PR:
Huge difference! And notably much lower GC time now, which is one thing we were trying to fix. |
serialize(Serializer(b), zdt) | ||
seekstart(b) | ||
|
||
@test deserialize(Serializer(b)) == zdt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a test that emulates, or just plain is, another process with a different set of globals.
This looks like it handles that right to me, but let's have tests to prove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can temporarily clear the globals for the tests
Looks like we don't always get better performance (Julia 1.5.1) Current master (70df06e) julia> using BenchmarkTools, TimeZones, Dates
[ Info: Precompiling BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf]
[ Info: Precompiling TimeZones [f269a46b-ccf7-5d73-abea-4c690281aa53]
julia> f2() = collect(ZonedDateTime(2000, tz"America/Winnipeg"):Day(1):ZonedDateTime(2020, tz"America/Winnipeg"))[end]
f2 (generic function with 1 method)
julia> @benchmark f2()
BenchmarkTools.Trial:
memory estimate: 2.40 MiB
allocs estimate: 21967
--------------
minimum time: 710.109 μs (0.00% GC)
median time: 982.287 μs (0.00% GC)
mean time: 1.249 ms (9.08% GC)
maximum time: 8.318 ms (0.00% GC)
--------------
samples: 3998
evals/sample: 1 This PR julia> using BenchmarkTools, TimeZones, Dates
julia> f2() = collect(ZonedDateTime(2000, tz"America/Winnipeg"):Day(1):ZonedDateTime(2020, tz"America/Winnipeg"))[end]
f2 (generic function with 1 method)
julia> @benchmark f2()
BenchmarkTools.Trial:
memory estimate: 1.79 MiB
allocs estimate: 29284
--------------
minimum time: 1.528 ms (0.00% GC)
median time: 2.330 ms (0.00% GC)
mean time: 3.157 ms (4.84% GC)
maximum time: 118.613 ms (3.35% GC)
--------------
samples: 1584
evals/sample: 1 |
This seems to be down to the math part: master:
this PR:
|
Nice find. We can probably address that by making |
Hmm I think this mainly has to do with the speed of |
Problem:
(this is Julia 1.5) I was looking at the performance of using BenchmarkTools
using TimeZones
const xs = ZonedDateTime.(1000:2000, 1, 2, tz"America/Winnipeg");
@benchmark diff($xs)
# master
julia> @benchmark diff($xs)
BenchmarkTools.Trial:
memory estimate: 7.94 KiB
allocs estimate: 1
--------------
minimum time: 1.141 μs (0.00% GC)
median time: 1.654 μs (0.00% GC)
mean time: 2.646 μs (32.99% GC)
maximum time: 667.165 μs (99.68% GC)
--------------
samples: 10000
evals/sample: 10
# this branch
julia> @benchmark diff($xs)
BenchmarkTools.Trial:
memory estimate: 7.94 KiB
allocs estimate: 1
--------------
minimum time: 780.589 ns (0.00% GC)
median time: 1.177 μs (0.00% GC)
mean time: 2.181 μs (42.82% GC)
maximum time: 77.476 μs (96.32% GC)
--------------
samples: 10000
evals/sample: 90 |
Isn't that true for all isbits types? julia> xs = [1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> isbits(xs[1])
true
julia> isbits(xs)
false |
fair enough. masterjulia> @code_native xs[1]
.section __TEXT,__text,regular,pure_instructions
; ┌ @ array.jl:809 within `getindex'
pushq %rbp
movq %rsp, %rbp
movq %rdx, %r8
movq %rdi, %rax
leaq -1(%rcx), %rdi
cmpq 8(%rdx), %rdi
jae L80
movq (%r8), %rdx
leaq (%rdi,%rdi,4), %rdi
movq 8(%rdx,%rdi,8), %rcx
testq %rcx, %rcx
je L114
movq 16(%rdx,%rdi,8), %r8
vmovups 24(%rdx,%rdi,8), %xmm0
movq (%rdx,%rdi,8), %rdx
movq %rcx, (%rsi)
movq %r8, 8(%rsi)
movq %rdx, (%rax)
movq %rcx, 8(%rax)
movq %r8, 16(%rax)
vmovups %xmm0, 24(%rax)
movq %rbp, %rsp
popq %rbp
retq
L80:
movq %rsp, %rax
leaq -16(%rax), %rsi
movq %rsi, %rsp
movq %rcx, -16(%rax)
movabsq $jl_bounds_error_ints, %rax
movl $1, %edx
movq %r8, %rdi
callq *%rax
L114:
movabsq $jl_throw, %rax
movabsq $jl_system_image_data, %rdi
callq *%rax
nopl (%rax,%rax)
; └ this PRjulia> @code_native xs[1]
.section __TEXT,__text,regular,pure_instructions
; ┌ @ array.jl:809 within `getindex'
pushq %rbp
movq %rsp, %rbp
leaq -1(%rdx), %rcx
cmpq 8(%rsi), %rcx
jae L38
movq %rdi, %rax
; │ @ array.jl:809 within `getindex'
movq (%rsi), %rdx
shlq $4, %rcx
vmovups (%rdx,%rcx), %xmm0
vmovups %xmm0, (%rdi)
movq %rbp, %rsp
popq %rbp
retq
L38:
movq %rsp, %rcx
leaq -16(%rcx), %rax
movq %rax, %rsp
movq %rdx, -16(%rcx)
movabsq $jl_bounds_error_ints, %rcx
movl $1, %edx
movq %rsi, %rdi
movq %rax, %rsi
callq *%rcx
nopl (%rax,%rax)
; └ |
Just look at masterjulia> @benchmark $xs[1]
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 2.218 ns (0.00% GC)
median time: 2.229 ns (0.00% GC)
mean time: 2.297 ns (0.00% GC)
maximum time: 23.736 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000 This PRjulia> @benchmark $xs[1]
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 1.248 ns (0.00% GC)
median time: 1.468 ns (0.00% GC)
mean time: 1.445 ns (0.00% GC)
maximum time: 15.447 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000 |
Alright, I've determined that |
Here is another benchmark, Masterjulia> using TimeZones
julia> const xs = ZonedDateTime.(1000:2000, 1, 2, tz"America/Winnipeg");
julia> const earliest_permitted = ZonedDateTime(1920, tz"UTC");
julia> @benchmark filter(x->x > $earliest_permitted, $xs)
BenchmarkTools.Trial:
memory estimate: 39.20 KiB
allocs estimate: 3
--------------
minimum time: 2.899 μs (0.00% GC)
median time: 3.280 μs (0.00% GC)
mean time: 3.700 μs (6.53% GC)
maximum time: 69.362 μs (88.95% GC)
--------------
samples: 10000
evals/sample: 9 this PRjulia> @benchmark filter(x->x > $earliest_permitted, $xs)
BenchmarkTools.Trial:
memory estimate: 15.81 KiB
allocs estimate: 1
--------------
minimum time: 1.037 μs (0.00% GC)
median time: 1.471 μs (0.00% GC)
mean time: 2.632 μs (40.14% GC)
maximum time: 543.182 μs (99.71% GC)
--------------
samples: 10000
evals/sample: 10 Do we want to add some of these to the packages benchmark suite? |
Yes, we will. I do need to get the package benchmarks up to date but posting them as comments for now is great. |
One thing we can do to help is overload
Or create a new type to hold the pair of them that hashs only using the
But if we can just make it |
Something that might be of interest. |
We can't use ShortStrings.jl as the longest time zone name in the tzdb is 32 bytes ( |
Too bad, Even for |
df6dc00
to
ae608b0
Compare
Rebased to take advantage of #291 and #292. Here's the result of running the package benchmarks: Benchmark Report for /Users/omus/.julia/dev/TimeZonesJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
|
I'm looking into ShortStrings.jl and StaticArrays.jl as alternatives to making ZonedDateTime an isbitstype |
I've been doing a bunch of experimentation and I'll provide an overview of some of bits of knowledge I've discovered:
|
You should ask @Keno if that is a reasonable expectation. |
I should have been clearer in my statement. The current implementation being 2x slower doesn't seem worth merging at this time. There are areas left to explore. I just wanted to provide an update as to why this PR has stalled. |
ae608b0
to
74c0be3
Compare
Benchmark Report for /Users/omus/.julia/dev/TimeZonesJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
|
Code definitely needs polishing but I've managed to get the memory reductions from having |
I assume this is not pursued anymore? |
This was just a POC. |
It's good to know there is still interest in this work. |
Fixes #271.
Current master:
I originally used a single
Dict
for lookup and the performance was:The latest version uses a
Vector
for lookup:I need to run some additional tests yet.