perf(json): address performance penalty found in v0.4.0+ #149


Merged · 2 commits · Jul 18, 2025

Conversation

dominicbarnes (Contributor)

As part of #137, a new generic helper function, decodeInto, was introduced and added to the decode path for objects, arrays, strings, and numbers.

Unfortunately, this seems to have introduced a severe memory penalty for one of our internal services: a significant increase in heap allocations (>30% in a testing environment) when decoding map[string]any values.
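For context, the regressed workload looks roughly like the sketch below: decoding arbitrary JSON objects into map[string]any. This uses the standard library's encoding/json for self-containment (segmentio/encoding/json exposes the same Unmarshal signature); decodeMixed is a hypothetical name, not library code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeMixed is a hypothetical stand-in for the affected code path:
// decoding an arbitrary JSON object into map[string]any. Every value
// lands behind an interface, so object/array/string decoding dominates
// the allocation profile.
func decodeMixed(data []byte) (map[string]any, error) {
	var v map[string]any
	err := json.Unmarshal(data, &v)
	return v, err
}

func main() {
	m, err := decodeMixed([]byte(`{"name":"svc","count":3,"tags":["a","b"]}`))
	if err != nil {
		panic(err)
	}
	// JSON numbers decode to float64 when the target is map[string]any.
	fmt.Println(m["name"], m["count"])
}
```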

For the last few years, we've actually just stayed pinned to v0.3.6, but now we're being pulled forward by other dependency updates and need to have this addressed.

After isolating the root cause to this particular helper function, I added benchmarks to cover this situation and compared the current main branch ("main.txt") against this proposed branch ("branch.txt"):

$ # on master
$ go test ./json -bench "BenchmarkUnmarshal(String|Object|Array)" -count=10 > main.txt
$ # after checking out this branch
$ go test ./json -bench "BenchmarkUnmarshal(String|Object|Array)" -count=10 > branch.txt
$ benchstat main.txt branch.txt
goos: darwin
goarch: arm64
pkg: github.com/segmentio/encoding/json
cpu: Apple M3 Pro
                        │  main.txt   │             branch.txt              │
                        │   sec/op    │   sec/op     vs base                │
UnmarshalString-11        9.462n ± 6%   7.115n ± 3%  -24.80% (p=0.000 n=10)
UnmarshalObjectMixed-11   718.1n ± 1%   214.0n ± 1%  -70.21% (p=0.000 n=10)
UnmarshalArrayMixed-11    698.7n ± 0%   172.1n ± 1%  -75.37% (p=0.000 n=10)
geomean                   168.1n        63.99n       -61.93%

                        │  main.txt   │              branch.txt              │
                        │    B/op     │    B/op     vs base                  │
UnmarshalString-11         16.00 ± 0%   16.00 ± 0%        ~ (p=1.000 n=10) ¹
UnmarshalObjectMixed-11   5328.0 ± 0%   712.0 ± 0%  -86.64% (p=0.000 n=10)
UnmarshalArrayMixed-11    5280.0 ± 0%   664.0 ± 0%  -87.42% (p=0.000 n=10)
geomean                    766.4        196.3       -74.39%
¹ all samples are equal

                        │  main.txt  │              branch.txt              │
                        │ allocs/op  │ allocs/op   vs base                  │
UnmarshalString-11        1.000 ± 0%   1.000 ± 0%        ~ (p=1.000 n=10) ¹
UnmarshalObjectMixed-11   26.00 ± 0%   24.00 ± 0%   -7.69% (p=0.000 n=10)
UnmarshalArrayMixed-11    18.00 ± 0%   16.00 ± 0%  -11.11% (p=0.000 n=10)
geomean                   7.764        7.268        -6.38%
¹ all samples are equal

In short, reverting the object/array/string decode logic to its previous form yields significant improvements across the board.

Since the main point of #137 was number decoding, I left that path alone: we don't use it, and I don't want to presume its performance characteristics. The rest of the changes seemed like fair game to revert.
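A quick way to sanity-check an allocation change like this, outside a full benchstat run, is testing.AllocsPerRun. The sketch below is a hypothetical helper (allocsPerDecode is not part of the library) that counts heap allocations per decode of a mixed payload into map[string]any, again using encoding/json as a self-contained stand-in:

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// allocsPerDecode measures the average number of heap allocations per
// Unmarshal of the given payload into map[string]any. AllocsPerRun runs
// the callback repeatedly and reports allocations per invocation.
func allocsPerDecode(data []byte) float64 {
	return testing.AllocsPerRun(200, func() {
		var v map[string]any
		if err := json.Unmarshal(data, &v); err != nil {
			panic(err)
		}
	})
}

func main() {
	payload := []byte(`{"name":"svc","tags":["a","b"],"count":3}`)
	fmt.Printf("allocs/op: %.0f\n", allocsPerDecode(payload))
}
```

Running this before and after a change gives a fast signal on whether the allocs/op numbers in the tables above are moving in the right direction.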

@extemporalgenome (Contributor) left a comment:


LGTM

@dominicbarnes dominicbarnes merged commit 7b9ca4e into master Jul 18, 2025
8 checks passed
@dominicbarnes dominicbarnes deleted the perf-int-consumer branch July 18, 2025 16:10
3 participants