Skip to content
This repository was archived by the owner on Nov 6, 2024. It is now read-only.

Versionize kvm structures #25

Closed
wants to merge 3 commits into from
Closed

Conversation

sboeuf
Copy link

@sboeuf sboeuf commented Sep 3, 2020

Based on the discussion #24, I think it's important for kvm-bindings to provide a common way for all projects to serialize and versionize KVM structures in order to perform snapshot.
This pull request ported multiple patches from the Firecracker fork of kvm-bindings to derive the KVM structures that are needed.
More pull requests can follow if more KVM structure needs to be derived with Versionize, but this can happen in a second stage.

One important thing to note, we can't really merge this pull request until the versionize crate is moved to rust-vmm organization.

Copy link
Contributor

@alexandruag alexandruag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi folks! Does anyone mind if we wait until early next week to have a conclusion for #24? Also, if it turns out it's best to make upstream kvm-bindings versionize-aware, we should still gate this behind a feature at least.

Copy link
Member

@andreeaflorescu andreeaflorescu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After we gate the versionize support behind a feature, we also need to update the Readme to add the documentation for this feature.

@@ -940,14 +997,14 @@ pub struct kvm_ioapic_state {
pub redirtbl: [kvm_ioapic_state__bindgen_ty_1; 24usize],
}
#[repr(C)]
#[derive(Copy, Clone)]
#[derive(Copy, Clone, Versionize)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to call out that this is safe only because we are dealing with kernel bindings where the backwards compatibility is ensured, and the union has C representation.

This derive is not generally safe because the generated code assumes that the enum variants memory offset starts at 0, which is only available for repr(C).

With this in mind, we should see if we want versionize to autogenerate code for unions since this can put us at risk when dealing with other structures. To stay on the safe side, I would say that versionize for unions should be manually implemented, and we should keep track of the active variant (and write when serializing, reading it when deserializing).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same concern for Union support in versionize crate too. Suggest to drop union support from versionize.
By the way, I have no clear picture about the way to deal with deleted struct fields. It seems that to support deleted fields, we must define a dedicated state struct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe deleting of fields in not supported by default with versionize. Looking at the autogenerated code, it seems that it will generate a panic. @sandreim will probably have a more accurate description here.

This is of course not the case for Linux bindings (again), but I would like for the other things we want to "versionize" to define a separate state structure where we uphold these requirements (i.e. fields don't vanish).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.:)
We use versionize to generate serde code for FFI structs, which is stable and repr(C).
For other data structs, we should define companion state structs.

Copy link

@sandreim sandreim Sep 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addition and deletion of fields happens only at the serialization level. The versionized Rust struct can only grow in size while the code generator will branch based on target version and serialize only the fields that exist at that version. When a field is removed, it will not be present in the serializer output, but will still be present in the structure. When a structure with removed fields is deserialized, the generated code will provide a default value for that field via the Default trait. If an application removes a field from a structure, it should not care about it anymore with one exception. That is when when we serialize at a previous version when the field existed. In this case Versionize offers an interface to implement a semantic serializer that must fill in that field with data that carries the same semantics or otherwise return an error if this translation is impossible (for example, a feature in current version does not exist in the older version we are targeting).

Union serialization/deserialization works by only serializing the largest field of the union at a specific target version. So the generated code would read/write a very specific union field and would not be worried about repr.

Generally Versionize does not care about the semantics of what's serialized, it only cares about structure - which fields to serialize/deserialize given a target/source version of a structure.

Based on this, I can agree with what Andreea said about why it is safe to use the versionized kvm bindings unions, the Versionize layer acts as a proxy for filling in the values in Rust structs/enum/unions. Actually using the union in the application would require it to know the semantics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi all! Looks like what versionize_derive implements for unions is essentially a bitwise (de)serialization based on the size of the object (i.e. for the kvm_ioapic_state__bindgen_ty_1 union, we end up reading/writing the opaque bits field every time). As far as I can tell, the kvm bindings guarantee that structures do not change their size, nor alter the field layout (at most, some of the "reserved" bits/bytes become meaningful). If that's true, then we can provide a manual Versionize implementation for kvm-bindings unions that just reads/writes size_of::<Self>() bytes.

Moreover, the same properties hold for all kvm-bindings structures (not just the unions), and this raises the question: do we actually need versioned serialization for kvm-bindings, or can we just save/restore the raw binary representation of the objects? The only concern is backwards/forward compatibility for reserved bits. Do we need to avoid writing to them when restoring state, or reading from them when saving? If the answer is no, then serializing bindings could become very straightforward. OTOH, if we do have to be mindful of the reserved bits, then our current approach is probably insufficient. Looks like we need to dive deeper into this, and it would be great if someone has more insight to share.

Copy link
Member

@jiangliu jiangliu Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good insight. Actually the linux KVM uapi has kept an eye on forward/backward compatibility, so we may rely on this property. Previously we have used bit-copy to serialize/deserialize KVM structs. A (version, size, bit_copy_data) tuple may work here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: we removed union support from versionize and yanked all versions that had that support.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to revive the thread here :)
Don't you think as a starter we could rely on versionize the way Firecracker relies on it?
I'm suggesting this because at the moment not a single VMM project relies on the upstream kvm-bindings :(
It's a fundamental block for VMMs and I think it's okay if someone comes with a better solution in the future. But for now, let's try to agree on a simple way to move forward and let all our projects pull from rust-vmm/kvm-bindings.
/cc @sameo @rbradford

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! We found some interesting things since starting to look closer into the KVM bindings. It turns out they're an ABI with some special properties defined by the kernel; for example their size never changes and neither does the layout of currently used fields. This means we can treat them as opaque binary blobs in terms of serialization at the lowest level. Bit/field manipulation situations can arise, but they're specific to each application and/or the underlying kernel. It also means we can get rid of having separate features of the crate for different kernel versions, and just keep the latest.

In Firecracker, we continue to rely on deriving versionize on everything because there's no straightforward way yet of including blobs in the whole process, but we don't otherwise need the versionize semantics and generated code for the bindings. The [patch] approach works well here, while we figure out how to avoid the extra logic. Our view on the short term is either to continue using the separate bindings forks, or elevate something like kvm-bindings-versionize/serialize to an official sidecar repository in rust-vmm, if multiple projects can leverage it via [patch]. Had a quick look at CH, and it seems your bindings fork differs from the one in Firecracker; what are the main differences there?

@roypat
Copy link
Member

roypat commented May 13, 2024

Closing this as superceded by #101, where we added serde based serialization to kvm-bindings.

@roypat roypat closed this May 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants