Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBOR serializer encodes null class instance as empty map instead of null #2848

Open
Jacob-Amazon opened this issue Nov 5, 2024 · 11 comments

Comments

@Jacob-Amazon
Copy link

Traced to this line:

I consider this a bug. Null objects should be encoded as the simple type null, no matter whether they are class types or other types.

In our use case, this is causing issues after deserialization, since the empty map is deserialized into a default instance of the type, rather than null.

Please consider removing this line and always encoding as null.

@sandwwraith
Copy link
Member

Can you please provide an example of serializable class and expected/actual output?

@Jacob-Amazon
Copy link
Author

@Serializable
data class Test(val nullableClass: Inner?, val nullableMap: Map<String, String>?)

@Serializable
data class Inner(val test: Int)

val test = Test(nullableClass = null, nullableMap = null)
Cbor.encodeToByteArray(Test.serializer(), test)
// Hex output: bf6d6e756c6c61626c65436c617373a06b6e756c6c61626c654d6170f6ff

The hex output bf6d6e756c6c61626c65436c617373a06b6e756c6c61626c654d6170f6ff looks like this (from https://cbor.me/):

{ "nullableClass": {}, "nullableMap": null}
BF                               # map(*)
   6D                            # text(13)
      6E756C6C61626C65436C617373 # "nullableClass"
   A0                            # map(0)
   6B                            # text(11)
      6E756C6C61626C654D6170     # "nullableMap"
   F6                            # primitive(22)
   FF                            # primitive(*)

Expected output would be this: bf6d6e756c6c61626c65436c617373f66b6e756c6c61626c654d6170f6ff

{ "nullableClass": null, "nullableMap": null}
BF                               # map(*)
   6D                            # text(13)
      6E756C6C61626C65436C617373 # "nullableClass"
   F6                            # primitive(22)
   6B                            # text(11)
      6E756C6C61626C654D6170     # "nullableMap"
   F6                            # primitive(22)
   FF                            # primitive(*)

@ardune
Copy link
Contributor

ardune commented Feb 10, 2025

+1 on this

I am working on a project that involves ISO 18013-5 (mDL)
One of the types in that spec is "SessionTranscript" - which is a CBOR array with an optional element at the end (may be null or some other type)

That type is used in cryptographic operations - so the values must match (each party creates the SessionTranscript independently)

For sake of clarity, a simplified version of the type could be:

    @OptIn(ExperimentalSerializationApi::class)
    @CborArray
    @Serializable
    data class SessionTranscript(
        val someValue: String,
        val nested: OtherKind?
    )

    @OptIn(ExperimentalSerializationApi::class)
    @CborArray
    @Serializable
    data class OtherKind(
        val otherValue: String,
    )

Describe the solution you'd like

Using cbor (with useDefiniteLengthEncoding = true) like so:

        val withNull = SessionTranscript(
            someValue = "foo",
            nested = null
        )

        cbor.encodeToByteArray(SessionTranscript.serializer(), withNull) // should encode to a value that ends in 0xF6 (NULL)

Should encode to:

    /**
     * 82           # array(2)
     *    63        # text(3)
     *       666F6F # "foo"
     *    F6        # null <- notice this null
     */

Maybe with some other CborBuilder option and/or annotation?

As mentioned:
The current behavior (null becomes empty map for complex types) is expected, as per this PR: #2412

Rough changes - but that would be breaking to existing consumers expecting this as the default behavior
ardune@cfd6951

Maybe a "CborBuilder" opt-in for this behavior change?

@JesusMcCloud
Copy link
Contributor

JesusMcCloud commented Feb 28, 2025

I consider this a bug. Null objects should be encoded as the simple type null, no matter whether they are class types or other types.

It does indeed sound like a bug and fixing would be a breaking change.
Good news is: You can work around it, if you do some manual work with the serializer.
@ardune here's how we work with this in our mDL implementation. As it stands now, it seems to comply with the spec and interops with others (in de EUDIW context) perfectly fine. @nodh please care to chime in here? you know mDL better than me.

One more thing regarding mDL: if you need it in the EUDIW context: The EU pretty much botched their spec by disregarding the ISO standard in some subtle details. The ISO disregarded some strict CBOR/COSE encoding rules. (Whoever was responsible for this probably never wrote a parser in their life.) This means that you will need Obor and there is no way around it given how the default cbor decoder works (see here).

@ardune
Copy link
Contributor

ardune commented Feb 28, 2025

I appreciate the example @JesusMcCloud, as well as the heads up about EUDIW/Obor.
In the current case: we worked around it in our solution, it just didn't meet my assumed behavior when starting (i.e. an explicit null was not encoded as such)

Do you think it makes sense to that CborBuilder option to opt-in to this breaking change?
I can make it and do a PR if that seems like a viable/maintainable approach.
My concern is that as a toggle may be worse in the long run.

@JesusMcCloud
Copy link
Contributor

That is for @sandwwraith to decide. @Jacob-Amazon has me convinced that the original behaviour is incorrect and should be resolved. How to handle such a breaking change wrt. the existing ecosystem using this cbor implementation is a different cup of tea I don't feel qualified/experienced enough to comment on.

@nodh
Copy link
Contributor

nodh commented Mar 3, 2025

Now that I've seen this issue, I'm also convinced that a null object should be encoded as F6. This will also become an issue in our implementation as soon as we're implementing other handover structures.

@JesusMcCloud
Copy link
Contributor

Just to clarify publicly: The EU consortium did not botch it. The ISO mDL spec itself loosened up some encoding rules compared to vanilla CBOR/COSE. So that consortium is to blame.
What that means is: This issue it will not only affect the EUDIW, but every mDL implementation regardless of context.

@sandwwraith
Copy link
Member

I think we cannot change behavior as-is, indeed. You can send a PR that would incorporate the correct behavior under the flag.

@JesusMcCloud
Copy link
Contributor

While I do understand the compatibility argument, isn't CBOR still experimental? I'm not advocating for a radical breaking change, but rather a sensible strategy to make the default behaviour gradually the default:

  1. Introducing a flag with a default value that behaves, as-is
  2. Some time later, deprecating the flag and the function call without the flag and replacing the flag it with a new one, with flipped behaviour
  3. Raising the deprecation level to ´ERROR`
  4. Raising the deprecation level to. HIDDEN
  5. Never actually removing it, so that old code remains working

Probably too naive and never going to work as easily, but keeping known-wrong behaviour for to maintain compatibility with a state that is officially experimental smells funky.

Alternatively: Deprecate this here CBOR format and replace it by something entirely new, heavily based on the much more flexible Obor.

I think a discussion (both internally in the serialization team and with users) is warranted, because in the end the current behaviour remains wrong.

@ardune
Copy link
Contributor

ardune commented Mar 13, 2025

I've made a PR for the first draft of this with the flag discussed.

Hopefully it is a reasonable jumping off point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants