POC: Make ZonedDateTime an isbits type #287

Open — omus wants to merge 5 commits into master from cv/zdt-isbits-poc
Conversation

@omus (Member) commented Sep 4, 2020

Fixes #271.

Current master:

julia> isbitstype(ZonedDateTime)
false

julia> @btime ZonedDateTime(2000, 1, 2, tz"America/Winnipeg")
  72.826 ns (3 allocations: 160 bytes)
2000-01-02T00:00:00-06:00

julia> @btime fill(ZonedDateTime(2000, 1, 2, tz"America/Winnipeg"), 100);
  212.670 ns (4 allocations: 1.03 KiB)

I originally used a single Dict for lookup and the performance was:

julia> isbitstype(ZonedDateTime)
true

julia> @btime ZonedDateTime(2000, 1, 2, tz"America/Winnipeg")
  721.936 ns (2 allocations: 160 bytes)
2000-01-02T00:00:00-06:00

julia> @btime fill(ZonedDateTime(2000, 1, 2, tz"America/Winnipeg"), 100);
  748.686 ns (3 allocations: 1.92 KiB)

The latest version uses a Vector for lookup:

julia> isbitstype(ZonedDateTime)
true

julia> @btime ZonedDateTime(2000, 1, 2, tz"America/Winnipeg")
  177.060 ns (4 allocations: 208 bytes)
2000-01-02T00:00:00-06:00

julia> @btime fill(ZonedDateTime(2000, 1, 2, tz"America/Winnipeg"), 100);
  285.683 ns (5 allocations: 1.97 KiB)

I still need to run some additional tests.
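
Roughly, the idea behind the Vector lookup (an illustrative sketch with hypothetical names, not the PR's actual code) is to keep only bits inside the struct and store the non-isbits TimeZone in a global table:

using Dates, TimeZones

# Hypothetical global table; the real PR uses its own internal storage.
const TIMEZONE_TABLE = TimeZone[]

function timezone_index(tz::TimeZone)
    i = findfirst(==(tz), TIMEZONE_TABLE)  # linear scan over the Vector
    i !== nothing && return i
    push!(TIMEZONE_TABLE, tz)
    return length(TIMEZONE_TABLE)
end

# Hypothetical isbits layout: a DateTime plus an integer index instead of
# a reference to the (non-isbits) TimeZone object.
struct IsbitsZDT
    utc_datetime::DateTime
    zone_index::Int
end

IsbitsZDT(dt::DateTime, tz::TimeZone) = IsbitsZDT(dt, timezone_index(tz))
get_zone(zdt::IsbitsZDT) = TIMEZONE_TABLE[zdt.zone_index]

# isbitstype(IsbitsZDT) == true, since DateTime and Int are both isbits.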

@codecov-commenter commented Sep 4, 2020

Codecov Report

Merging #287 into master will increase coverage by 0.09%.
The diff coverage is 96.66%.


@@            Coverage Diff             @@
##           master     #287      +/-   ##
==========================================
+ Coverage   93.64%   93.74%   +0.09%     
==========================================
  Files          32       33       +1     
  Lines        1527     1583      +56     
==========================================
+ Hits         1430     1484      +54     
- Misses         97       99       +2     
| Impacted Files | Coverage Δ |
|---|---|
| src/TimeZones.jl | 100.00% <ø> (ø) |
| src/types/timezone.jl | 88.00% <ø> (ø) |
| src/external.jl | 80.00% <80.00%> (ø) |
| src/types/variabletimezone.jl | 96.66% <94.11%> (-3.34%) ⬇️ |
| src/types/fixedtimezone.jl | 100.00% <100.00%> (ø) |
| src/types/zoneddatetime.jl | 96.70% <100.00%> (+0.70%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3800fca...74c0be3.

@omus (Member, Author) commented Sep 4, 2020

I'll note the benchmarks above are all using #281.

@iamed2 (Member) commented Sep 4, 2020

One thing to benchmark: f(zdt) = fill(zdt, 100)[end]

Current master:

julia> @benchmark f($(ZonedDateTime(2000, 1, 2, tz"America/Winnipeg")))
BenchmarkTools.Trial:
  memory estimate:  4.06 KiB
  allocs estimate:  1
  --------------
  minimum time:     402.869 ns (0.00% GC)
  median time:      653.872 ns (0.00% GC)
  mean time:        1.472 μs (48.25% GC)
  maximum time:     52.163 μs (97.32% GC)
  --------------
  samples:          10000
  evals/sample:     199

This PR:

julia> @benchmark f($(ZonedDateTime(2000, 1, 2, tz"America/Winnipeg")))
BenchmarkTools.Trial:
  memory estimate:  1.77 KiB
  allocs estimate:  1
  --------------
  minimum time:     135.235 ns (0.00% GC)
  median time:      144.223 ns (0.00% GC)
  mean time:        158.605 ns (6.51% GC)
  maximum time:     881.062 ns (80.74% GC)
  --------------
  samples:          10000
  evals/sample:     886

Huge difference! And notably much lower GC time now, which is one thing we were trying to fix.

serialize(Serializer(b), zdt)
seekstart(b)

@test deserialize(Serializer(b)) == zdt

Contributor:
We should have a test that emulates, or just plain is, another process with a different set of globals. This looks like it handles that correctly to me, but let's have tests to prove it.

Member (Author):
I can temporarily clear the globals for the tests.

@omus (Member, Author) commented Sep 4, 2020

Looks like we don't always get better performance (Julia 1.5.1).

Current master (70df06e)

julia> using BenchmarkTools, TimeZones, Dates

[ Info: Precompiling BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf]
[ Info: Precompiling TimeZones [f269a46b-ccf7-5d73-abea-4c690281aa53]

julia> f2() = collect(ZonedDateTime(2000, tz"America/Winnipeg"):Day(1):ZonedDateTime(2020, tz"America/Winnipeg"))[end]
f2 (generic function with 1 method)

julia> @benchmark f2()
BenchmarkTools.Trial:
  memory estimate:  2.40 MiB
  allocs estimate:  21967
  --------------
  minimum time:     710.109 μs (0.00% GC)
  median time:      982.287 μs (0.00% GC)
  mean time:        1.249 ms (9.08% GC)
  maximum time:     8.318 ms (0.00% GC)
  --------------
  samples:          3998
  evals/sample:     1

This PR

julia> using BenchmarkTools, TimeZones, Dates

julia> f2() = collect(ZonedDateTime(2000, tz"America/Winnipeg"):Day(1):ZonedDateTime(2020, tz"America/Winnipeg"))[end]
f2 (generic function with 1 method)

julia> @benchmark f2()
BenchmarkTools.Trial:
  memory estimate:  1.79 MiB
  allocs estimate:  29284
  --------------
  minimum time:     1.528 ms (0.00% GC)
  median time:      2.330 ms (0.00% GC)
  mean time:        3.157 ms (4.84% GC)
  maximum time:     118.613 ms (3.35% GC)
  --------------
  samples:          1584
  evals/sample:     1

@iamed2 (Member) commented Sep 4, 2020

This seems to be down to the math part:

master:

julia> f3(zdt) = zdt + Day(1)
f3 (generic function with 1 method)

julia> @benchmark f3($(ZonedDateTime(2000, 1, 2, tz"America/Winnipeg")))
BenchmarkTools.Trial:
  memory estimate:  304 bytes
  allocs estimate:  3
  --------------
  minimum time:     85.805 ns (0.00% GC)
  median time:      88.082 ns (0.00% GC)
  mean time:        99.505 ns (8.84% GC)
  maximum time:     2.017 μs (95.36% GC)
  --------------
  samples:          10000
  evals/sample:     962

this PR:

julia> f3(zdt) = zdt + Day(1)
f3 (generic function with 1 method)

julia> @benchmark f3($(ZonedDateTime(2000, 1, 2, tz"America/Winnipeg")))
BenchmarkTools.Trial:
  memory estimate:  240 bytes
  allocs estimate:  4
  --------------
  minimum time:     195.435 ns (0.00% GC)
  median time:      199.786 ns (0.00% GC)
  mean time:        216.572 ns (5.79% GC)
  maximum time:     4.863 μs (95.45% GC)
  --------------
  samples:          10000
  evals/sample:     621

@omus (Member, Author) commented Sep 4, 2020

Nice find. We can probably address that by making FixedTimeZone an isbits type and having it directly accessible from the ZonedDateTime struct.

@iamed2 (Member) commented Sep 4, 2020

Hmm, I think this mainly has to do with the speed of interpret; I'm going to look at that.

@oxinabox (Contributor) commented Sep 4, 2020

Problem:
It doesn't seem to make vectors of ZonedDateTimes isbits

julia> const xs = ZonedDateTime.(1000:2000, 1, 2, tz"America/Winnipeg");

julia> isbits(xs[1])
true

julia> isbits(xs)
false

(this is Julia 1.5)

I was looking at the performance of diff, which I was hoping would improve by more, since indexing should be faster when the elements are stored inline rather than behind references.
It is faster, but I was hoping for more.

using BenchmarkTools
using TimeZones

const xs = ZonedDateTime.(1000:2000, 1, 2, tz"America/Winnipeg");

@benchmark diff($xs)


# master
julia> @benchmark diff($xs)
BenchmarkTools.Trial:
  memory estimate:  7.94 KiB
  allocs estimate:  1
  --------------
  minimum time:     1.141 μs (0.00% GC)
  median time:      1.654 μs (0.00% GC)
  mean time:        2.646 μs (32.99% GC)
  maximum time:     667.165 μs (99.68% GC)
  --------------
  samples:          10000
  evals/sample:     10


# this branch
julia> @benchmark diff($xs)
BenchmarkTools.Trial:
  memory estimate:  7.94 KiB
  allocs estimate:  1
  --------------
  minimum time:     780.589 ns (0.00% GC)
  median time:      1.177 μs (0.00% GC)
  mean time:        2.181 μs (42.82% GC)
  maximum time:     77.476 μs (96.32% GC)
  --------------
  samples:          10000
  evals/sample:     90

@omus (Member, Author) commented Sep 4, 2020

> It doesn't seem to make vectors of ZonedDateTimes isbits

Isn't that true for all isbits types?

julia> xs = [1,2,3]
3-element Array{Int64,1}:
 1
 2
 3

julia> isbits(xs[1])
true

julia> isbits(xs)
false
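
What matters for inline array storage is the element type, which is isbits in both cases:

julia> isbitstype(eltype(xs))
true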

@oxinabox (Contributor) commented Sep 4, 2020

Fair enough.
Looking at the @code_native for getindex, it does seem short now.
I haven't looked closely enough to confirm that the shortness is because it no longer needs to deal with pointers under the hood, but that seems likely.

master

julia> @code_native xs[1]
        .section        __TEXT,__text,regular,pure_instructions
; ┌ @ array.jl:809 within `getindex'
        pushq   %rbp
        movq    %rsp, %rbp
        movq    %rdx, %r8
        movq    %rdi, %rax
        leaq    -1(%rcx), %rdi
        cmpq    8(%rdx), %rdi
        jae     L80
        movq    (%r8), %rdx
        leaq    (%rdi,%rdi,4), %rdi
        movq    8(%rdx,%rdi,8), %rcx
        testq   %rcx, %rcx
        je      L114
        movq    16(%rdx,%rdi,8), %r8
        vmovups 24(%rdx,%rdi,8), %xmm0
        movq    (%rdx,%rdi,8), %rdx
        movq    %rcx, (%rsi)
        movq    %r8, 8(%rsi)
        movq    %rdx, (%rax)
        movq    %rcx, 8(%rax)
        movq    %r8, 16(%rax)
        vmovups %xmm0, 24(%rax)
        movq    %rbp, %rsp
        popq    %rbp
        retq
L80:
        movq    %rsp, %rax
        leaq    -16(%rax), %rsi
        movq    %rsi, %rsp
        movq    %rcx, -16(%rax)
        movabsq $jl_bounds_error_ints, %rax
        movl    $1, %edx
        movq    %r8, %rdi
        callq   *%rax
L114:
        movabsq $jl_throw, %rax
        movabsq $jl_system_image_data, %rdi
        callq   *%rax
        nopl    (%rax,%rax)
; └

this PR

julia> @code_native xs[1]
        .section        __TEXT,__text,regular,pure_instructions
; ┌ @ array.jl:809 within `getindex'
        pushq   %rbp
        movq    %rsp, %rbp
        leaq    -1(%rdx), %rcx
        cmpq    8(%rsi), %rcx
        jae     L38
        movq    %rdi, %rax
; │ @ array.jl:809 within `getindex'
        movq    (%rsi), %rdx
        shlq    $4, %rcx
        vmovups (%rdx,%rcx), %xmm0
        vmovups %xmm0, (%rdi)
        movq    %rbp, %rsp
        popq    %rbp
        retq
L38:
        movq    %rsp, %rcx
        leaq    -16(%rcx), %rax
        movq    %rax, %rsp
        movq    %rdx, -16(%rcx)
        movabsq $jl_bounds_error_ints, %rcx
        movl    $1, %edx
        movq    %rsi, %rdi
        movq    %rax, %rsi
        callq   *%rcx
        nopl    (%rax,%rax)
; └

@oxinabox (Contributor) commented Sep 4, 2020

Looking just at getindex time, it's now 1.7x faster, which is going to really add up in a lot of processing situations.
🎉

master

julia> @benchmark $xs[1]
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     2.218 ns (0.00% GC)
  median time:      2.229 ns (0.00% GC)
  mean time:        2.297 ns (0.00% GC)
  maximum time:     23.736 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

This PR

julia> @benchmark $xs[1]
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.248 ns (0.00% GC)
  median time:      1.468 ns (0.00% GC)
  mean time:        1.445 ns (0.00% GC)
  maximum time:     15.447 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

@iamed2 (Member) commented Sep 4, 2020

Alright, I've determined that interpret was only slower because it constructs new ZonedDateTimes, which is now slower. Specifically, ZonedDateTime(::DateTime, ::VariableTimeZone, ::FixedTimeZone) used to be <10 ns, and now it's >70 ns.

@oxinabox (Contributor) commented Sep 4, 2020

Here is another benchmark: filtering, which is pretty similar to operations we do a lot of. It is now 2.8x faster.

Master

julia> using TimeZones

julia> const xs = ZonedDateTime.(1000:2000, 1, 2, tz"America/Winnipeg");

julia> const earliest_permitted = ZonedDateTime(1920, tz"UTC");

julia> @benchmark filter(x->x > $earliest_permitted, $xs)
BenchmarkTools.Trial:
  memory estimate:  39.20 KiB
  allocs estimate:  3
  --------------
  minimum time:     2.899 μs (0.00% GC)
  median time:      3.280 μs (0.00% GC)
  mean time:        3.700 μs (6.53% GC)
  maximum time:     69.362 μs (88.95% GC)
  --------------
  samples:          10000
  evals/sample:     9

this PR

julia> @benchmark filter(x->x > $earliest_permitted, $xs)
BenchmarkTools.Trial:
  memory estimate:  15.81 KiB
  allocs estimate:  1
  --------------
  minimum time:     1.037 μs (0.00% GC)
  median time:      1.471 μs (0.00% GC)
  mean time:        2.632 μs (40.14% GC)
  maximum time:     543.182 μs (99.71% GC)
  --------------
  samples:          10000
  evals/sample:     10

Do we want to add some of these to the package's benchmark suite?

@omus (Member, Author) commented Sep 4, 2020

> Do we want to add some of these to the package's benchmark suite?

Yes, we will. I do need to get the package benchmarks up to date but posting them as comments for now is great.

@oxinabox (Contributor) commented Sep 4, 2020

> Alright, I've determined that interpret was only slower because it constructs new ZonedDateTimes, which is now slower. Specifically, ZonedDateTime(::DateTime, ::VariableTimeZone, ::FixedTimeZone) used to be <10 ns, and now it's >70 ns.

One thing we can do to help is overload hash for FixedTimeZone to just use the string (like #281 did for VariableTimeZone).
That will shave a small chunk off (~20 ns):

julia> @btime hash(TimeZones.UTC_ZERO);
  30.749 ns (0 allocations: 0 bytes)

julia> @btime hash(TimeZones.UTC_ZERO.name);
  8.724 ns (0 allocations: 0 bytes)

julia> @btime hash(TimeZones.UTC_ZERO.offset);
  12.328 ns (0 allocations: 0 bytes)

Or create a new type to hold the pair of them that hashes using only the VariableTimeZone part.
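
A minimal sketch of the first option (not part of this PR; it assumes hashing a FixedTimeZone by its name alone is acceptable here):

# Sketch only: make hashing a FixedTimeZone as cheap as hashing its name,
# mirroring what #281 did for VariableTimeZone.
Base.hash(tz::FixedTimeZone, h::UInt) = hash(tz.name, h)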

> Nice find. We can probably address that by making FixedTimeZone an isbits type and having it directly accessible from the ZonedDateTime struct.

But if we can just make it isbits and not need to have it live in a Dict, that is even better.

@oxinabox (Contributor) commented Sep 5, 2020

Something that might be of interest: ShortStrings.jl is isbits.
It supports strings of up to 15 characters by reinterpreting a UInt.
We could consider using it, or the same trick, for some of our String fields, if we know they have to be short (or can reasonably apply that restriction).

https://github.com/xiaodaigh/ShortStrings.jl
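
A toy illustration of the trick (just a sketch of the general idea, not ShortStrings.jl's actual implementation): pack a short ASCII name into a single UInt128 so the value is isbits.

# Toy sketch: store a name of up to 15 bytes plus a length byte in a UInt128.
function pack_name(s::AbstractString)
    bytes = codeunits(s)
    length(bytes) <= 15 || throw(ArgumentError("name too long to pack"))
    x = UInt128(length(bytes))      # low byte holds the length
    for (i, b) in enumerate(bytes)
        x |= UInt128(b) << (8 * i)  # remaining bytes hold the characters
    end
    return x
end

function unpack_name(x::UInt128)
    n = Int(x & 0xff)
    return String(UInt8[(x >> (8 * i)) & 0xff for i in 1:n])
end

pack_name("UTC")                # an isbits value
unpack_name(pack_name("UTC"))   # "UTC"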

@omus (Member, Author) commented Sep 8, 2020

> ShortStrings.jl is isbits. It supports strings of up to 15 characters by reinterpreting a UInt.

We can't use ShortStrings.jl as the longest time zone name in the tzdb is 32 bytes ("America/Argentina/ComodRivadavia")

@oxinabox (Contributor) commented Sep 8, 2020

Too bad. Even for FixedTimeZone? (I don't understand what a FixedTimeZone is.)
Possibly ShortStrings could be changed to allow longer strings:
JuliaString/ShortStrings.jl#9

@omus (Member, Author) commented Sep 23, 2020

Rebased to take advantage of #291 and #292. Here's the result of running the package benchmarks:

Benchmark Report for /Users/omus/.julia/dev/TimeZones

Job Properties

  • Time of benchmarks:
    • Target: 23 Sep 2020 - 13:48
    • Baseline: 23 Sep 2020 - 13:49
  • Package commits:
    • Target: ae608b
    • Baseline: 42bc06
  • Julia commits:
    • Target: 697e78
    • Baseline: 697e78
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| ["ZonedDateTime", "date-period-range"] | 2.12 (5%) ❌ | 1.10 (1%) ❌ |
| ["ZonedDateTime", "fill"] | 0.33 (5%) ✅ | 0.43 (1%) ✅ |
| ["ZonedDateTime", "local", "ambiguous"] | 2.66 (5%) ❌ | 0.00 (1%) ✅ |
| ["ZonedDateTime", "local", "standard"] | 1.59 (5%) ❌ | 0.00 (1%) ✅ |
| ["ZonedDateTime", "time-period-range"] | 2.64 (5%) ❌ | 0.73 (1%) ✅ |
| ["ZonedDateTime", "utc"] | 2.27 (5%) ❌ | 0.00 (1%) ✅ |
| ["arithmetic", "DatePeriod"] | 2.12 (5%) ❌ | 1.67 (1%) ❌ |
| ["arithmetic", "TimePeriod"] | 2.66 (5%) ❌ | 1.00 (1%) |
| ["parse", "ISOZonedDateTimeFormat"] | 1.15 (5%) ❌ | 1.00 (1%) |
| ["parse", "issue#25"] | 1.01 (5%) | 0.88 (1%) ✅ |
| ["transition_range", "local", "standard"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["transition_range", "utc"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["tryparsenext_tz", "EST"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["tryparsenext_tz", "GMT"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["tryparsenext_tz", "UTC"] | 0.93 (5%) ✅ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["ZonedDateTime"]
  • ["ZonedDateTime", "local"]
  • ["arithmetic"]
  • ["interpret", "local"]
  • ["interpret"]
  • ["parse"]
  • ["transition_range", "local"]
  • ["transition_range"]
  • ["tryparsenext_fixedtz"]
  • ["tryparsenext_tz"]
  • ["tz_data"]

Julia versioninfo

Target

Julia Version 1.5.1
Commit 697e782ab8 (2020-08-25 20:08 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  uname: Darwin 18.7.0 Darwin Kernel Version 18.7.0: Thu Jun 18 20:50:10 PDT 2020; root:xnu-4903.278.43~1/RELEASE_X86_64 x86_64 i386
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz:
              speed         user         nice          sys         idle          irq
       #1  3500 MHz     111442 s          0 s      96757 s     667877 s          0 s
       #2  3500 MHz      68121 s          0 s      35735 s     772215 s          0 s
       #3  3500 MHz     113850 s          0 s      67926 s     694296 s          0 s
       #4  3500 MHz      62131 s          0 s      30166 s     783775 s          0 s

  Memory: 16.0 GB (422.0625 MB free)
  Uptime: 87608.0 sec
  Load Avg:  3.81103515625  3.82373046875  3.66259765625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.5.1
Commit 697e782ab8 (2020-08-25 20:08 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  uname: Darwin 18.7.0 Darwin Kernel Version 18.7.0: Thu Jun 18 20:50:10 PDT 2020; root:xnu-4903.278.43~1/RELEASE_X86_64 x86_64 i386
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz:
              speed         user         nice          sys         idle          irq
       #1  3500 MHz     111732 s          0 s      96821 s     668197 s          0 s
       #2  3500 MHz      68354 s          0 s      35772 s     772622 s          0 s
       #3  3500 MHz     114184 s          0 s      67986 s     694577 s          0 s
       #4  3500 MHz      62343 s          0 s      30201 s     784202 s          0 s

  Memory: 16.0 GB (70.98046875 MB free)
  Uptime: 87675.0 sec
  Load Avg:  3.2783203125  3.67236328125  3.61767578125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

@omus added the performance label (Changes that impact code performance) on Sep 23, 2020
@omus (Member, Author) commented Sep 24, 2020

I'm looking into ShortStrings.jl and StaticArrays.jl as alternative ways of making ZonedDateTime an isbitstype.

@omus (Member, Author) commented Sep 25, 2020

I've been doing a bunch of experimentation and I'll provide an overview of some of the things I've learned:

  • BitIntegers.jl had a bug where only IOBuffer could be used with serialization (Serialization doesn't work with IOStream: rfourquet/BitIntegers.jl#19)
  • ShortStrings.jl could only be constructed from a String and not a SubString{String} (Support SubString{String}: JuliaString/ShortStrings.jl#13)
  • ShortStrings.jl requires constructing a String again for most operations, including hashing, which results in worse performance when combined with storing the fields externally via a Dict/Vector
  • Attempting to make ZonedDateTime an isbitstype by making all subfields isbitstype is challenging.
    • A FixedTimeZone can be made isbitstype by using ShortStrings.jl, but VariableTimeZone requires using ShortString and StaticArrays.jl (to replace the Vector{Transition}), which performed very poorly (it never actually finished).
    • Updating VariableTimeZone to no longer precompute transitions (thereby eliminating Vector{Transition}) and instead use zones/rules from the tzsource may be feasible, but would require a tuple of zones/rules as well as a reference to a function.
  • Storing the fields externally via a Dict/Vector (as in this PR) results in ZonedDateTime constructor performance that is over 2x slower than master, but can result in no allocations in some scenarios. To me this seems like a bad trade-off as I'd expect GC improvements in the future to reduce the impact of GC.

@oxinabox (Contributor) commented Sep 25, 2020

> To me this seems like a bad trade-off as I'd expect GC improvements in the future to reduce the impact of GC.

You should ask @Keno if that is a reasonable expectation.
Writing code that causes problems today on the assumption the language will improve to make it not cause problems tomorrow is risky.

@omus (Member, Author) commented Sep 26, 2020

I should have been clearer in my statement: with the current implementation being 2x slower, it doesn't seem worth merging at this time. There are areas left to explore. I just wanted to provide an update as to why this PR has stalled.

@omus force-pushed the cv/zdt-isbits-poc branch from ae608b0 to 74c0be3 on September 29, 2020
@omus (Member, Author) commented Sep 29, 2020

Benchmark Report for /Users/omus/.julia/dev/TimeZones

Job Properties

  • Time of benchmarks:
    • Target: 29 Sep 2020 - 15:56
    • Baseline: 29 Sep 2020 - 15:57
  • Package commits:
    • Target: 74c0be
    • Baseline: 3800fc
  • Julia commits:
    • Target: 539f3c
    • Baseline: 539f3c
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| ["ZonedDateTime", "fill"] | 0.55 (5%) ✅ | 0.62 (1%) ✅ |
| ["ZonedDateTime", "local", "ambiguous"] | 0.73 (5%) ✅ | 0.00 (1%) ✅ |
| ["ZonedDateTime", "local", "standard"] | 0.83 (5%) ✅ | 0.00 (1%) ✅ |
| ["ZonedDateTime", "range", "FixedTimeZone/DatePeriod"] | 0.71 (5%) ✅ | 0.61 (1%) ✅ |
| ["ZonedDateTime", "range", "FixedTimeZone/TimePeriod"] | 0.74 (5%) ✅ | 0.61 (1%) ✅ |
| ["ZonedDateTime", "range", "VariableTimeZone/DatePeriod"] | 0.91 (5%) ✅ | 0.27 (1%) ✅ |
| ["ZonedDateTime", "range", "VariableTimeZone/TimePeriod"] | 0.80 (5%) ✅ | 0.27 (1%) ✅ |
| ["ZonedDateTime", "utc"] | 0.73 (5%) ✅ | 0.00 (1%) ✅ |
| ["arithmetic", "DatePeriod"] | 0.82 (5%) ✅ | 0.00 (1%) ✅ |
| ["arithmetic", "TimePeriod"] | 0.76 (5%) ✅ | 0.00 (1%) ✅ |
| ["arithmetic", "broadcast", "FixedTimeZone/DatePeriod"] | 0.87 (5%) ✅ | 1.00 (1%) |
| ["arithmetic", "broadcast", "FixedTimeZone/TimePeriod"] | 0.67 (5%) ✅ | 0.61 (1%) ✅ |
| ["arithmetic", "broadcast", "VariableTimeZone/DatePeriod"] | 0.91 (5%) ✅ | 0.00 (1%) ✅ |
| ["arithmetic", "broadcast", "VariableTimeZone/TimePeriod"] | 0.81 (5%) ✅ | 0.18 (1%) ✅ |
| ["interpret", "local", "non-existent"] | 1.05 (5%) ❌ | 1.00 (1%) |
| ["interpret", "local", "standard"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["parse", "ISOZonedDateTimeFormat"] | 0.99 (5%) | 0.95 (1%) ✅ |
| ["parse", "issue#25"] | 1.00 (5%) | 0.83 (1%) ✅ |
| ["transition_range", "utc"] | 1.13 (5%) ❌ | 1.00 (1%) |
| ["tryparsenext_tz", "UTC"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["tz_data", "parse_components"] | 0.98 (5%) | 0.92 (1%) ✅ |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["ZonedDateTime"]
  • ["ZonedDateTime", "local"]
  • ["ZonedDateTime", "range"]
  • ["arithmetic"]
  • ["arithmetic", "broadcast"]
  • ["interpret", "local"]
  • ["interpret"]
  • ["parse"]
  • ["transition_range", "local"]
  • ["transition_range"]
  • ["tryparsenext_fixedtz"]
  • ["tryparsenext_tz"]
  • ["tz_data"]

Julia versioninfo

Target

Julia Version 1.5.2
Commit 539f3ce943 (2020-09-23 23:17 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  uname: Darwin 18.7.0 Darwin Kernel Version 18.7.0: Thu Jun 18 20:50:10 PDT 2020; root:xnu-4903.278.43~1/RELEASE_X86_64 x86_64 i386
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz:
              speed         user         nice          sys         idle          irq
       #1  3500 MHz     784274 s          0 s     636187 s    4712956 s          0 s
       #2  3500 MHz     422826 s          0 s     233582 s    5477004 s          0 s
       #3  3500 MHz     798144 s          0 s     459417 s    4875851 s          0 s
       #4  3500 MHz     381055 s          0 s     195692 s    5556665 s          0 s

  Memory: 16.0 GB (185.93359375 MB free)
  Uptime: 613664.0 sec
  Load Avg:  3.69287109375  4.08935546875  4.1923828125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.5.2
Commit 539f3ce943 (2020-09-23 23:17 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.7.0)
  uname: Darwin 18.7.0 Darwin Kernel Version 18.7.0: Thu Jun 18 20:50:10 PDT 2020; root:xnu-4903.278.43~1/RELEASE_X86_64 x86_64 i386
  CPU: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz:
              speed         user         nice          sys         idle          irq
       #1  3500 MHz     784738 s          0 s     636302 s    4713134 s          0 s
       #2  3500 MHz     423035 s          0 s     233633 s    5477500 s          0 s
       #3  3500 MHz     798480 s          0 s     459544 s    4876144 s          0 s
       #4  3500 MHz     381256 s          0 s     195738 s    5557174 s          0 s

  Memory: 16.0 GB (281.078125 MB free)
  Uptime: 613739.0 sec
  Load Avg:  4.03564453125  4.10986328125  4.18798828125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

@omus (Member, Author) commented Sep 29, 2020

The code definitely needs polishing, but I've managed to get the memory reductions from having ZonedDateTime isbits while keeping, or improving upon, the original performance. Mainly what needed to change was reducing the number of calls to hash in order to get the index.
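
As an illustration of that kind of change (a hypothetical sketch, not the PR's actual code): resolving the table index with a single get! call hashes the time zone once, instead of separate haskey/getindex/setindex! lookups that each re-hash it.

const TZ_INDEX = Dict{TimeZone,Int}()   # hypothetical index cache
const TZ_TABLE = TimeZone[]             # hypothetical storage table

function timezone_index!(tz::TimeZone)
    return get!(TZ_INDEX, tz) do        # hashes `tz` only once
        push!(TZ_TABLE, tz)
        length(TZ_TABLE)
    end
end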

@laborg commented Nov 9, 2024

I assume this is not being pursued anymore?

@oxinabox (Contributor):
This was just a POC.
In general this is a desirable feature.
No one is currently working on it.

@omus
Copy link
Member Author

omus commented Nov 12, 2024

It's good to know there is still interest in this work.

Labels: performance (Changes that impact code performance)
Successfully merging this pull request may close these issues: ZonedDateTime not isbits
5 participants