Skip to content

Fix #1176 JSON Decoder and Encoder limit disagreement #1242

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jevonmao
Copy link

@jevonmao jevonmao commented Apr 7, 2025

Fix the JSON Decoder and Encoder limit disagreement bug described in issue #1176.

Due to the JSONWriter uses passed parameter to track the depth limit, its depth starts counting from 1 for the 1st iteration. But JSONScanner starts counting from 0. This result in large deep nested JSONs being able to decode without issues, but crashes when encoded in reverse.

I added additional JSON test file and test case to reproduce the error, and then incremented the maximum recursion depth variable for encoding to account for the off-by-1.

@jevonmao
Copy link
Author

jevonmao commented Apr 7, 2025

@YOCKOW Can you please take a look?

@weissi weissi requested a review from parkera April 8, 2025 10:24
Copy link
Contributor

@parkera parkera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine. The limit is somewhat arbitrary but it does make sense that we should be able to round trip to exactly the limit.

OpenStepPlist.swift also has a limit of 512 - does that one need to be updated as well? @kperryua

@kperryua
Copy link
Contributor

kperryua commented Apr 8, 2025

You can't round-trip an OpenStep plist, so that's less of a concern.

To make the symmetry more obvious, would it make more sense to start the counting at the same value instead? This 513 value begs for a comment explaining the choice.

@jevonmao
Copy link
Author

jevonmao commented Apr 8, 2025

@kperryua I considered about that. However, that would involve some much more significant code logic refactoring, rewriting most of JSONWriter.swift to be like JSONScanner.swift.

Another possibility is make depth start from -1, so it is at 0 for the first iteration. But I think that needs similar comment just like increasing the maximum limit.

mutating func serializeJSON(_ value: JSONEncoderValue, depth: Int = 0) throws {

But if you still prefer to see an alternate appraoch that refactor more code, lmk and I can look into that.

@YOCKOW
Copy link
Member

YOCKOW commented Apr 8, 2025

Edit: I noticed that I hadn't submitted my review. I'm sorry for having just requested reviews from others before my review...🙇


@kperryua @jevonmao
Just my spur-of-the-moment idea, can't we put guard depth < Self.maximumRecursionDepth else { ... } in serializeJSON instead of putting it in each function?

It may look like:

    mutating func serializeJSON(_ value: JSONEncoderValue, depth: Int = 0) throws {
        switch value {
        case .string(let str):
            serializeString(str)
        case .bool(let boolValue):
            writer(boolValue ? "true" : "false")
        case .number(let numberStr):
            writer(contentsOf: numberStr.utf8)
        case .array(let array):
            guard depth < Self.maximumRecursionDepth else {
                throw JSONError.tooManyNestedArraysOrDictionaries()
            }
            try serializeArray(array, depth: depth + 1)
        case .nonPrettyDirectArray(let arrayRepresentation):
            writer(contentsOf: arrayRepresentation)
        case let .directArray(bytes, lengths):
            guard depth < Self.maximumRecursionDepth else {
                throw JSONError.tooManyNestedArraysOrDictionaries()
            }
            try serializePreformattedByteArray(bytes, lengths, depth: depth + 1)
        case .object(let object):
            guard depth < Self.maximumRecursionDepth else {
                throw JSONError.tooManyNestedArraysOrDictionaries()
            }
            try serializeObject(object, depth: depth + 1)
        case .null:
            writer("null")
        }
    }

@jevonmao
Copy link
Author

jevonmao commented Apr 9, 2025

I think it is possible to further move the guard statements to top of the function?

mutating func serializeJSON(_ value: JSONEncoderValue, depth: Int = 0) throws {
        guard depth < Self.maximumRecursionDepth else {
            throw JSONError.tooManyNestedArraysOrDictionaries()
        }
        switch value {
        case .string(let str):
            serializeString(str)
        case .bool(let boolValue):
            writer(boolValue ? "true" : "false")
        case .number(let numberStr):
            writer(contentsOf: numberStr.utf8)
        case .array(let array):
            try serializeArray(array, depth: depth + 1)
        case .nonPrettyDirectArray(let arrayRepresentation):
            writer(contentsOf: arrayRepresentation)
        case let .directArray(bytes, lengths):
            try serializePreformattedByteArray(bytes, lengths, depth: depth + 1)
        case .object(let object):
            try serializeObject(object, depth: depth + 1)
        case .null:
            writer("null")
        }
    }

@jevonmao
Copy link
Author

jevonmao commented Apr 9, 2025

@YOCKOW Additionally, while debugging I discovered another issue with the way error is propagated in JSONDecoder for the test cases. It casts any JSONError thrown as as "not valid JSON" which I think can be misleading in this case, when its actually a too many nested array error.

image

@YOCKOW
Copy link
Member

YOCKOW commented Apr 10, 2025

@jevonmao

I think it is possible to further move the guard statements to top of the function?

Yeah, I think it's also ok if it doesn't affect perf.

It casts any JSONError thrown as "not valid JSON" which I think can be misleading in this case, when its actually a too many nested array error.

I'm not sure if it's intended or a bug, but I agree that it can be misleading.
Would you mind filing an issue separately?

@YOCKOW
Copy link
Member

YOCKOW commented Apr 10, 2025

@swift-ci Please smoke test

@YOCKOW
Copy link
Member

YOCKOW commented Apr 10, 2025

@swift-ci Please test

@YOCKOW
Copy link
Member

YOCKOW commented Apr 10, 2025

Jenkins hasn't reported results on macOS and on Linux (why?), but they seem to be passed:

However, it seems that JSONEncoderTests.test_JSONPassTests unexpectedly exited on Windows. Are there any possible Windows-specific issues?

@jevonmao
Copy link
Author

@YOCKOW I'm not too sure either, looking at the Windows test outputs it says:

√ Test run with 0 tests passed after 0.010 seconds.

All the test also pass without issues on my local development environment in Xcode. Can you give some help on this?

@YOCKOW
Copy link
Member

YOCKOW commented Apr 11, 2025

@YOCKOW I'm not too sure either, looking at the Windows test outputs it says:

√ Test run with 0 tests passed after 0.010 seconds.

That message just means that no tests did run with swift-testing. That's because swift-foundation doesn't migrate from XCTest to swift-testing yet: #908. You can see the same messages in the logs on macOS/Linux.

I guess the problem might lie in test_JSONPassTests.

On macOS:

Test Case '-[FoundationEssentialsTests.JSONEncoderTests test_JSONPassTests]' started.
Test Case '-[FoundationEssentialsTests.JSONEncoderTests test_JSONPassTests]' passed (0.218 seconds).

On Linux:

Test Case 'JSONEncoderTests.test_JSONPassTests' started at 2025-04-10 06:14:47.294
Test Case 'JSONEncoderTests.test_JSONPassTests' passed (0.197 seconds)

On the other hand, on Windows:

Test Case 'JSONEncoderTests.test_JSONPassTests' started at 2025-04-10 06:24:12.054

...Neither passed nor failed line can't be found. That implies the test process crashed during test_JSONPassTests.


@compnerd Would you give us some hints why this change fails only on Windows?

@compnerd
Copy link
Member

It is difficult to say, but my guess would be stack exhaustion as it seems that you are removing some of the recursion limits. Note that Windows has a much tighter bounds on the stack than Linux or macOS, so unless you are ensuring that the call is performing tail call elimination, you would exhaust the stack sooner on Windows than on other platforms.

@jevonmao
Copy link
Author

It is difficult to say, but my guess would be stack exhaustion as it seems that you are removing some of the recursion limits. Note that Windows has a much tighter bounds on the stack than Linux or macOS, so unless you are ensuring that the call is performing tail call elimination, you would exhaust the stack sooner on Windows than on other platforms.

Do you know if it is possible to debug the Windows issue if I only have Mac device?

@YOCKOW
Copy link
Member

YOCKOW commented Apr 18, 2025

@compnerd

Windows has a much tighter bounds on the stack than Linux or macOS, so unless you are ensuring that the call is performing tail call elimination, you would exhaust the stack sooner on Windows than on other platforms.

Thank you for your response. That makes sense.
But I don't think this PR introduces the deep-recursion bug. I guess the new tests added by this PR just revealed such a bug.

By the way, I did some coarse experiments1. From the results of them, program compiled with swift-6.1-RELEASE on Windows didn't crash during decoding/encoding a deeply nested instance. On the other hand, program compiled with the latest development snapshot on Windows crashed for the same situation.

(My questions is: Do some build configs of the compiler affect the results?)

🤔 Then, what do you think we should do? @jevonmao @kperryua @parkera

  • Refactor JSONWriter further in this PR.
  • Postpone refactoring JSONWriter and skip a new test in test_JSONPassTests temporarily on Windows after filing the issue.
  • Or...?

Footnotes

  1. https://github.com/YOCKOW/SF-Issue1176-Experiment and https://github.com/swiftlang/swift-foundation/pull/1250

@jevonmao
Copy link
Author

jevonmao commented May 5, 2025

@YOCKOW Personally I'm more on the side of keeping this PR within the scope of the issue it addresses, and postpone the fix for another future PR. But if you have other thoughts, just let me know and I can work on it.

@YOCKOW
Copy link
Member

YOCKOW commented May 6, 2025

@jevonmao

keeping this PR within the scope of the issue it addresses, and postpone the fix for another future PR.

I agree with you.
I think sign off is necessary on this. Sorry for pinging again @parkera. What do you think of the Windows-specific issue for this?

@YOCKOW YOCKOW requested a review from parkera May 6, 2025 01:31
@parkera
Copy link
Contributor

parkera commented May 6, 2025

The bottom line is that we can't merge this if it breaks Windows. Maybe the limit needs to be platform-specific?

@jevonmao
Copy link
Author

jevonmao commented May 7, 2025

@parkera I believe as pointed out by @YOCKOW , the new limits added isn't what caused the Windows crash, but rather the test case added just revealed it by testing a deep JSON recursion case.
Do you think it would be alright to merge if I conditionally skip the test on Windows and open a new issue? Or you intend to have this PR refactor JSONWriter and investgate more into the platform specific crash?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants