-
Notifications
You must be signed in to change notification settings - Fork 635
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Protobuf OneOf type declarations to avoid an intermediate OneOf Type class #2918
Comments
In protobuf, data structures are reused at the message Data {
required string name = 1;
oneof phone {
HomePhone homePhone = 2;
WorkPhone workPhone = 3;
}
}
message Data2 {
required string name = 1;
oneof phone {
HomePhone homePhone = 2;
}
}
message HomePhone {
required string foo = 1;
}
message WorkPhone {
required string bar = 1;
} will need 2 kt definition to describe
I would like to know if there any real use case to require the consistent form of output for these 2 formats. And if any, it would be better to be supported by a well-defined ProtoJSON format, instead of using the origin JSON format. |
Why would it? In that case @Serializable
data class Data(
val name: String,
@ProtoOneOf val phone: IPhone,
)
@Serializable
data class Data2(
val name: String,
@ProtoOneOf val phone: IPhone2,
)
@Serializable
sealed interface IPhone
@Serializable
sealed interface IPhone2
@Serializable
@ProtoOneOfNumber(2)
data class HomePhone(val number: String): IPhone, IPhone2
@Serializable
@ProtoOneOfNumber(3)
data class WorkPhone(val number: String, val extension: String): IPhone With this structure, the only thing that can't be done is that
Its less about the output form (though making it consistent is a good thing), and more about having an idiomatic representation of model classes in Kotlin, rather than one constrained and determined by the serialization format. |
The kotlinx.serialization library (and formats) is fundamentally a Kotlin model first serialization approach (the serialization follows the Kotlin model). Given that it is a multi-format serialization approach this is somewhat unavoidable due to the infinite target issue. However, some formats allow a bit more customization of the output than others. Xml is fairly flexible (highly customizable), Json has a reasonable amount of knobs and protobuf has little. Even XML, while able to support most formats, has limits on the data structures. Json doesn't have/need the flexibility as by its very nature it has a good mapping to the code. Protobuf is just not very featureful. Note also that flexibility comes with complexity (and likely computational overheads). Even the most flexible of formats struggles to support arbitrary Kotlin - serialization mappings. The best way to do this is to create your own format. This format could be a delegating format / transforming serializer. |
I understand all that, but Protobuf actually outputs the representation one would expect. Its the Kotlin representation of the Protobuf OneOf that has the extra "type class" ceremony which is at issue here. |
That is the point. And you will need to modify the existing class |
Ok, a potential solution is to use type annotations: // example annotation, this would be in the library
@Target(AnnotationTarget.TYPE)
annotation class ProtoOneOfNumber(val value: Int)
@Serializable
data class Data(
@ProtoNumber(1) val name: String,
@ProtoOneOf val phone: IPhone,
)
@Serializable
data class Data2(
@ProtoNumber(1) val name: String,
@ProtoOneOf val phone: IPhone2,
)
@Serializable
sealed interface IPhone
@Serializable
sealed interface IPhone2
@Serializable
data class HomePhone(val number: String): @ProtoOneOfNumber(2) IPhone, @ProtoOneOfNumber(5) IPhone2
@Serializable
data class WorkPhone(val number: String, val extension: String): @ProtoOneOfNumber(3) IPhone Now serialization can infer that If we have additional field numbers we need another interface. But this is analogous to, but cheaper than, creating another @Serializable
class HomePhoneType2(
@ProtoNumber(2) val homePhone: HomePhone,
): IPhoneType
@Serializable
class HomePhoneType5(
@ProtoNumber(5) val homePhone: HomePhone,
): IPhoneType
...
Fair point, but this is a Kotlin modeling issue, not a serialization issue. This same logic applies to models that would be serialized to JSON as well. The point here is to give the modeler the choice on how to create their models. If the modeler wants an intermediate type class, go ahead. But many use cases do not require it. Its certainly unusual in my experience to have a single class participate in multiple sealed hierarchies, so this would almost never be a concern in any of my projects, and I suspect most projects. That being said, I acknowledge here that the need to define the |
Actually this is not how the ProtoBuf format serializes. Data is serialized as: Or in protobuf: syntax = "proto2";
message Data {
required string name = 1;
oneof phone {
HomePhoneType homePhone = 2; //note names disappear in the wire format
WorkPhoneType workPhone = 3;
}
}
message HomePhoneType {
required homePhone: HomePhone;
}
message WorkPhoneType {
required workPhone: WorkPhone;
}
message HomePhone {
required string number = 1;
}
message WorkPhone {
required string number = 1;
optional string extension = 2;
} What |
Are you sure? When I use this input: @Serializable
data class Data(
@ProtoNumber(1) val name: String,
@ProtoOneOf val phone: IPhoneType?,
)
@Serializable
sealed interface IPhoneType
@Serializable
class HomePhoneType(
@ProtoNumber(2) val homePhone: HomePhone,
): IPhoneType
@Serializable
class WorkPhoneType(
@ProtoNumber(3) val workPhone: WorkPhone,
): IPhoneType
@Serializable
data class HomePhone(val number: String)
@Serializable
data class WorkPhone(val number: String, val extension: String) then the output of:
is:
and encoding a protobuf like this:
and decoding that with
|
@rocketraman You are right. It would be good to avoid the single attribute wrapper (I believe it was introduced because The other factor is that in protobuf schemas the |
First of all, I agree that the wapper type here in oneof is quite redundent and would be good to omit it. But this implementation is the best one for now. Indeed, directly implementing sealed interface in the inner oneof Kotlin data class (lets call it the simple impl from here) make it simpler in some case, but it has some deadly limitations
So the simple impl would make it impossible to use
That is true, but
not agreed. IMO, modeling in ProtoBuf format is to create definitions in To sum up, the current impl is not the perfect solution, but what I can do most for now to make it functionable for most cases, both in simple project and multi-module project. |
I'm in the process of writing my own serializer tool, partially due to the lack of flexibility with the oneOf in ktx serialization. In my project, I'm using an annotation to be able to specify proto numbers of each implementations (open polymorphism on multiple modules). If ever it can help, it looks like that @ProtoPolymorphism(
MyInterface::class, // Define on a "common" module
[
Pair(MyFirstSubClass::class, 1), // Implements MyInterface in a module A
Pair(MySecondSubClass::class, 2) // Implements MyInterface in a module B
]
)
private object K2PBPolymorphismConfigHolder // Define on my app module This is allowing a better class re-usability, and a definition of those proto numbers for many different cases. (More similar to the protobuf schema description.) Note that I'm producing a message MyInterface which is only composed of the oneOf message MyInterface {
oneof myinterface {
MyFirstSubClass myFirstSubClass = 1;
MySecondSubClass mySecondSubClass = 2;
}
} |
@glureau You could just create your own variant of the Proto format that has your own annotations/semantics. But indeed, it is possible to specify mappings in an annotation, but said annotation will have to live with the limitations that annotations have (Pair is not valid) |
Oh i'm sorry, I forgot to share the annotation declaration @Repeatable
public annotation class ProtoPolymorphism(
val parent: KClass<*>,
val oneOf: Array<Pair>,
) {
public annotation class Pair(val kClass: KClass<*>, val number: Int)
} (Didn't find a better way to write it for now, but I just wanted to share the idea of 1 annotation in the "final" module to declare the polymorphism/oneof setup.) |
@glureau can you share your kt data class declaration of these messages? |
There's nothing special in those classes, you can have a look here, they are just marked with the equivalent of |
I'm not sure that serialization supports repeatable annotations (it didn't, but with k2 that may no longer be an issue), that would work to replace Pair (as you can have an array of them) |
Thanks everyone for the discussion. It seems there are no action points to make here, as alternative designs require annotation definitions that are not currently supported by kotlinx.serialization. I'll leave this open as a reference though. |
What is your use-case and why do you need this feature?
I am investigating using Protobuf to serialize polymorphic types. Using an example similar to the one in the guide, I have to declare my an example hierarchy that itself contains nested classes like this:
This results in inconsistent serialization between JSON and Protobuf, the former of which serializes like this:
and the latter of which uses a cleaner schema with no trace of the intermediate type:
Furthermore, the natural way of describing such a data structure in Kotlin would be like this:
which would result in the expected JSON, but of course does not work with Protobuf OneOf.
Describe the solution you'd like
I'd like to define the data structure in an idiomatic way, and have the ProtoBuf OneOf work as expected.
It seems like the only value the intermediate OneOf type adds is the definition of the
@ProtoNumber
for the OneOf.Therefore, maybe we could define the data structure in Kotlin using annotations like the following:
The result of this would be an idiomatic simpler Kotlin data structure, and consistent form of serialized output for JSON and Protobuf.
The text was updated successfully, but these errors were encountered: