Migrate to encoding/json/v2 #292

Open
wants to merge 5 commits into
base: master
from

Conversation

inteon
Member

@inteon inteon commented Jun 16, 2025

Replaces the github.com/json-iterator/go dependency with encoding/json/v2.
Performance is not yet great (feel free to push improvements or create new PRs based on this PR):

# based on f90a164c304e010637632ae24ab0f05d66851259
$ go test -benchmem -run=^$ -bench ^BenchmarkFieldSet/serialize.*$ sigs.k8s.io/structured-merge-diff/v6/fieldpath -count=6 > pr.txt
$ go test -benchmem -run=^$ -bench ^BenchmarkFieldSet/serialize.*$ sigs.k8s.io/structured-merge-diff/v6/fieldpath -count=6 > master.txt
$ benchstat master.txt pr.txt

goos: linux
goarch: amd64
pkg: sigs.k8s.io/structured-merge-diff/v6/fieldpath
cpu: Intel(R) Core(TM) Ultra 7 165H
                            │  master.txt  │                pr.txt                │
                            │    sec/op    │    sec/op     vs base                │
FieldSet/serialize-20-8       3.647µ ±  3%    7.915µ ± 1%  +117.04% (p=0.002 n=6)
FieldSet/deserialize-20-8     11.07µ ±  7%    15.64µ ± 2%   +41.22% (p=0.002 n=6)
FieldSet/serialize-50-8       10.82µ ±  1%    21.85µ ± 2%  +102.00% (p=0.002 n=6)
FieldSet/deserialize-50-8     29.76µ ± 36%    43.60µ ± 2%   +46.48% (p=0.002 n=6)
FieldSet/serialize-100-8      38.58µ ±  4%    78.64µ ± 5%  +103.87% (p=0.002 n=6)
FieldSet/deserialize-100-8    83.91µ ±  2%   143.50µ ± 1%   +71.01% (p=0.002 n=6)
FieldSet/serialize-500-8      228.9µ ± 44%    420.8µ ± 6%   +83.84% (p=0.002 n=6)
FieldSet/deserialize-500-8    423.9µ ±  4%    699.2µ ± 1%   +64.95% (p=0.002 n=6)
FieldSet/serialize-1000-8     527.7µ ±  6%    929.0µ ± 2%   +76.04% (p=0.002 n=6)
FieldSet/deserialize-1000-8   979.2µ ± 33%   1578.9µ ± 5%   +61.24% (p=0.002 n=6)
geomean                       67.99µ          119.1µ        +75.18%

                            │  master.txt  │                pr.txt                │
                            │     B/op     │     B/op       vs base               │
FieldSet/serialize-20-8       1.543Ki ± 0%    2.754Ki ± 0%  +78.48% (p=0.002 n=6)
FieldSet/deserialize-20-8     9.761Ki ± 0%    8.269Ki ± 0%  -15.28% (p=0.002 n=6)
FieldSet/serialize-50-8       4.466Ki ± 0%    7.451Ki ± 0%  +66.83% (p=0.002 n=6)
FieldSet/deserialize-50-8     19.87Ki ± 0%    23.29Ki ± 0%  +17.22% (p=0.002 n=6)
FieldSet/serialize-100-8      15.73Ki ± 0%    25.00Ki ± 0%  +58.89% (p=0.002 n=6)
FieldSet/deserialize-100-8    59.21Ki ± 0%    80.49Ki ± 0%  +35.95% (p=0.002 n=6)
FieldSet/serialize-500-8      73.77Ki ± 0%   123.98Ki ± 0%  +68.06% (p=0.002 n=6)
FieldSet/deserialize-500-8    269.9Ki ± 0%    394.6Ki ± 0%  +46.20% (p=0.002 n=6)
FieldSet/serialize-1000-8     154.4Ki ± 1%    269.9Ki ± 0%  +74.84% (p=0.002 n=6)
FieldSet/deserialize-1000-8   593.5Ki ± 0%    870.3Ki ± 0%  +46.63% (p=0.002 n=6)
geomean                       34.33Ki         49.67Ki       +44.70%

                            │ master.txt  │               pr.txt                │
                            │  allocs/op  │  allocs/op   vs base                │
FieldSet/serialize-20-8        8.000 ± 0%   28.000 ± 0%  +250.00% (p=0.002 n=6)
FieldSet/deserialize-20-8      232.0 ± 0%    254.0 ± 0%    +9.48% (p=0.002 n=6)
FieldSet/serialize-50-8        14.00 ± 0%    74.00 ± 0%  +428.57% (p=0.002 n=6)
FieldSet/deserialize-50-8      655.0 ± 0%    739.0 ± 0%   +12.82% (p=0.002 n=6)
FieldSet/serialize-100-8       39.00 ± 0%   249.00 ± 0%  +538.46% (p=0.002 n=6)
FieldSet/deserialize-100-8    2.305k ± 0%   2.595k ± 0%   +12.60% (p=0.002 n=6)
FieldSet/serialize-500-8       185.0 ± 0%   1234.0 ± 0%  +567.03% (p=0.002 n=6)
FieldSet/deserialize-500-8    11.46k ± 0%   13.00k ± 0%   +13.45% (p=0.002 n=6)
FieldSet/serialize-1000-8      393.0 ± 0%   2662.0 ± 0%  +577.35% (p=0.002 n=6)
FieldSet/deserialize-1000-8   25.01k ± 0%   28.39k ± 0%   +13.52% (p=0.002 n=6)
geomean                        355.1         888.0       +150.06%

closes #202

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 16, 2025
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jun 16, 2025
@dims
Member

dims commented Jun 17, 2025

xref: kubernetes/kubernetes#132312

@dims
Member

dims commented Jun 18, 2025

For the pull-structured-merge-diff-test failure, please add this to fix the error @inteon

diff --git a/internal/cli/main_test.go b/internal/cli/main_test.go
index 3e409ede0673..8beed5c6cf55 100644
--- a/internal/cli/main_test.go
+++ b/internal/cli/main_test.go
@@ -21,6 +21,7 @@ import (
        "encoding/json"
        "io/ioutil"
        "path/filepath"
+       "strings"
        "testing"
 )

@@ -135,7 +136,7 @@ func (tt *testCase) checkOutput(t *testing.T, got []byte) {
                t.Fatalf("couldn't read expected output %q: %v", tt.expectedOutputPath, err)
        }

-       if a, e := string(got), string(want); a != e {
+       if a, e := strings.TrimSpace(string(got)), strings.TrimSpace(string(want)); a != e {
                t.Errorf("output didn't match expected output: got:\n%v\nwanted:\n%v\n", a, e)
        }
 }
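
The effect of the trimmed comparison in the diff above can be sketched in isolation. This is a minimal illustration, not code from the PR: the helper names are hypothetical, and which side carries the trailing newline is an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// equalIgnoringOuterSpace mirrors the comparison in the diff above: both
// sides lose their leading and trailing whitespace before being compared.
func equalIgnoringOuterSpace(got, want string) bool {
	return strings.TrimSpace(got) == strings.TrimSpace(want)
}

// equalIgnoringFinalNewline is a hypothetical, stricter alternative that
// forgives only a single trailing newline on either side.
func equalIgnoringFinalNewline(got, want string) bool {
	return strings.TrimSuffix(got, "\n") == strings.TrimSuffix(want, "\n")
}

func main() {
	want := "{\"a\":1}\n" // assumed: expected output ends with a newline
	got := "{\"a\":1}"    // assumed: encoder output has no trailing newline

	fmt.Println(got == want)                              // false
	fmt.Println(equalIgnoringOuterSpace(got, want))       // true
	fmt.Println(equalIgnoringFinalNewline(got, want))     // true
	fmt.Println(equalIgnoringOuterSpace(" "+got, want))   // true
	fmt.Println(equalIgnoringFinalNewline(" "+got, want)) // false
}
```

Trimming all outer whitespace also hides leading-whitespace differences, which may or may not matter for this test.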

@inteon inteon force-pushed the use_json_v2 branch 2 times, most recently from a4b6871 to bdce391 Compare June 18, 2025 10:14
@inteon
Member Author

inteon commented Jun 18, 2025

> For the pull-structured-merge-diff-test failure, please add this to fix the error @inteon
> ...

I fixed the test failure.

@dims
Member

dims commented Jun 18, 2025

/assign @BenTheElder @liggitt

@liggitt
Contributor

liggitt commented Jun 19, 2025

/assign @jpbetz
who is the primary apimachinery approver on this bit and was deeply involved in the initial performance-driven use of json-iterator in these bits

@liggitt
Contributor

liggitt commented Jun 19, 2025

> For the pull-structured-merge-diff-test failure, please add this to fix the error @inteon

I suspect using a json marshal function (like MarshalWrite) that doesn't append a newline would be a more efficient way to accomplish that
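
The newline difference behind this suggestion can be shown with the stable encoding/json (v1) package, used here as a stand-in because the v2 API is still experimental. In v1, `Encoder.Encode` always writes a trailing newline and `json.Marshal` does not. The helper names are hypothetical, and the sketch assumes, as the comment suggests, that v2's `MarshalWrite` behaves like the second case.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encodeWithEncoder uses json.Encoder, whose Encode method writes the JSON
// value followed by a newline character.
func encodeWithEncoder(v any) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// encodeWithMarshal uses json.Marshal, which returns only the value's bytes.
func encodeWithMarshal(v any) ([]byte, error) {
	return json.Marshal(v)
}

func main() {
	v := map[string]int{"a": 1}

	e, err := encodeWithEncoder(v)
	if err != nil {
		panic(err)
	}
	m, err := encodeWithMarshal(v)
	if err != nil {
		panic(err)
	}

	fmt.Printf("%q\n", e) // "{\"a\":1}\n"
	fmt.Printf("%q\n", m) // "{\"a\":1}"
}
```

Choosing an encoding call that never emits the newline would remove the need to trim in the test.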

return nil, fmt.Errorf("parsing JSON: %v", err)
}

k := rawKey.String()
Contributor

is rawKey.String() the same as decoding to a string, in terms of interpreting escape sequences, etc?
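
The distinction being asked about can be illustrated with the stable encoding/json (v1) package: the raw text of a JSON string token keeps its quotes and escape sequences, while decoding interprets them. This sketch does not show how `rawKey.String()` behaves in the PR, which is exactly the open question; `rawAndDecoded` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawAndDecoded returns the raw text of a JSON string token next to the Go
// string obtained by decoding that same token.
func rawAndDecoded(token string) (raw, decoded string, err error) {
	rm := json.RawMessage(token)
	if err = json.Unmarshal(rm, &decoded); err != nil {
		return "", "", err
	}
	return string(rm), decoded, nil
}

func main() {
	raw, decoded, err := rawAndDecoded(`"a\u0062"`)
	if err != nil {
		panic(err)
	}
	fmt.Println(raw)     // "a\u0062"  (quotes and escape kept verbatim)
	fmt.Println(decoded) // ab
}
```

If keys are compared or stored as raw text, two spellings of the same key (for example `"ab"` and `"a\u0062"`) would not match unless they are decoded first.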

Comment on lines -330 to -339
{
JSON: `1.0`,
IntoType: reflect.TypeOf(json.Number("")),
Want: json.Number("1.0"),
},
{
JSON: `1`,
IntoType: reflect.TypeOf(json.Number("")),
Want: json.Number("1"),
},
Contributor

curious if it's ok to drop these... were they added to try to catch a specific issue?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: inteon
Once this PR has been reviewed and has the lgtm label, please ask for approval from jpbetz. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@liggitt
Contributor

liggitt commented Jun 23, 2025

The .../deserialize... benchmarks actually don't look terrible now... I'd be willing to accept that performance drop in pursuit of correctness / safety.

The serialize benchmarks still look pretty rough. Need to see what we can improve there.

@liggitt
Contributor

liggitt commented Jun 24, 2025

did you run the full set of benchmarks to see how we looked across all of them?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 30, 2025
@liggitt
Contributor

liggitt commented Jul 1, 2025

Thanks for the updates, how are the overall benchmarks looking (not just the subset in the description)?

As you were adjusting the implementation, were there any unit tests it would make sense to add to catch edge cases that the previous implementations handled and that we want to ensure the new one handles as well? I'm thinking specifically of things like:

  • handling of extra data in the input bytes/buffer when decoding/deserializing (e.g. "somevalue"extrastuff or {"key":"value"}extrastuff)
  • handling of special characters in strings that need escaping where the raw bytes would not be the same as the decoded or encoded/escaped bytes
  • handling of ignorable whitespace when decoding
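
The three cases listed above could be probed along these lines. This is a sketch against the stable encoding/json (v1) package, not against the PR's code: `decodeString` and `roundTrip` are hypothetical helpers, and real tests would call the fieldpath serialization functions instead.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeString decodes a single JSON string value with the standard library.
func decodeString(input string) (string, error) {
	var s string
	err := json.Unmarshal([]byte(input), &s)
	return s, err
}

// roundTrip encodes s as JSON and decodes the result again, returning both
// the encoded text and the decoded string.
func roundTrip(s string) (encoded, decoded string, err error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", "", err
	}
	decoded, err = decodeString(string(b))
	return string(b), decoded, err
}

func main() {
	inputs := []string{
		`"somevalue"extrastuff`,  // extra data after the value
		" \t\n\"somevalue\" \n", // ignorable whitespace around the value
	}
	for _, in := range inputs {
		s, err := decodeString(in)
		fmt.Printf("input=%q value=%q err=%v\n", in, s, err)
	}

	// Special characters: the encoded bytes differ from the raw bytes, but
	// the decoded value must equal the original.
	enc, dec, err := roundTrip("tab\there \"quoted\"")
	fmt.Printf("encoded=%s decoded=%q err=%v\n", enc, dec, err)
}
```

With encoding/json v1, the first input is rejected and the second decodes cleanly; whether the new implementation should match that behavior exactly is for the reviewers to decide.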

@jpbetz
Contributor

jpbetz commented Jul 1, 2025

First off, it's amazing to see this happening and the benchmarks are VERY promising. Thanks @inteon!

To get this to the finish line, and merge, what should our criteria be?

I chatted offline with @liggitt briefly, and some of the criteria we discussed were:

  • Golang releases a stable json/v2 (The alternative would be to add an internal copy of json-experiment to this repo like kube-openapi has, but I don't know if it's worth it given how close json/v2 is to stable)
  • github.com/kubernetes/kubernetes CI test stability is not negatively impacted
  • This passes a scale test (SIG instrumentation)
  • We are confident in the correctness (triple-check the implementation, shore up with additional functional tests)

Intuitively, it seems like the deserialization is already sufficiently fast. I suspect we need to optimize serialization a bit further since we serialize managed fields on all updates (not just patches). That said, I'm willing to be data driven here. If we can show downstream scale and performance is acceptable, I'm willing to accept a higher serialization perf regression in order to migrate to json/v2.

Thoughts, concerns?

Development

Successfully merging this pull request may close these issues.

remove use of json-iterator / reflect2
6 participants