[RFC] Add serde [de]serialization for KVM bindings #4
@@ -1,4 +1,61 @@ | |||
[[package]] |
I remember there was a discussion about whether the Cargo.lock file should be included, and the conclusion was:
- for a library crate, Cargo.lock should be excluded.
- for a bin/app crate, Cargo.lock should be included.
So how about removing the Cargo.lock file?
Really great work, thanks @aghecenco ! |
I’m going to work on that, haven’t had the chance to yet. I’m meaning to change the implementation itself too, with macros. I’ll add a build feature that optionally pulls in serde and adds the serialization functionality. |
It's a common issue to control serde by features, so a common solution is welcomed! |
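One common way to put serde behind a Cargo feature is `cfg_attr`, which attaches the derive only when the feature is enabled. The sketch below is an illustration, not what this PR implements: `kvm_example_regs` is a hypothetical stand-in for a generated struct, the feature name `with_serde` is borrowed from a later commit message in this thread, and the Cargo.toml is assumed to declare an optional `serde` dependency with its `derive` feature enabled.

```rust
// Assumption: Cargo.toml declares `serde` as an optional dependency and a
// `with_serde` feature that turns it on. With the feature off, nothing below
// refers to serde at all.
#[cfg(feature = "with_serde")]
use serde::{Deserialize, Serialize};

// Hypothetical stand-in for a bindgen-generated struct.
#[allow(non_camel_case_types)]
#[repr(C)]
#[derive(Debug, Default, Clone, Copy, PartialEq)]
// The serde derives are attached only when `with_serde` is enabled.
#[cfg_attr(feature = "with_serde", derive(Serialize, Deserialize))]
pub struct kvm_example_regs {
    pub rax: u64,
    pub rbx: u64,
}
```

With the feature disabled the struct compiles exactly as bindgen would emit it, so consumers that do not need serialization never pull in serde.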
How about adding a wrapper crate to feature-gate serde, typetag, etc.? |
@jiangliu what do you mean by wrapper crate? I am asking because I think it's similar to the proposal I had in the sync call which is: create another crate that uses kvm-bindings and re-exports the structures with added serde. Is that also what you had in mind? |
Seems we have different ideas. What I'm suggesting is to implement serde_fake/serde_fake_derive crates, which provide blank implementations of the #[derive(Serialize, Deserialize)] macros. Then we could easily switch to the fake serde crates when the serde feature is disabled. |
@alexandruag had a suggestion too - if I understood it correctly it was something along the lines of adding a new macro to @alexandruag please correct me if I got it wrong. |
I feel the only difference is this: @alexandruag suggests introducing some new derive macros, while my suggestion is to redefine the derive macros of the serde/serde_derive crates. |
@jiangliu if there are no name clashes because of using the same names as |
Another observation to consider - all data structures defined here are FFI-safe, so they can all be serialized as a raw byte dump. We could have a generic
(This isn't actual code, it's just for demonstrative purposes)
fn serialize_ffi<T>(object: &T) -> &[u8] {
    // View the object's memory as bytes; only sound for FFI-safe, plain-old-data types.
    unsafe { slice::from_raw_parts(object as *const T as *const u8, mem::size_of::<T>()) }
}
fn deserialize_ffi<T>(serialized_object: &[u8]) -> T {
    // The buffer must hold exactly one T; read_unaligned tolerates any alignment.
    assert_eq!(serialized_object.len(), mem::size_of::<T>());
    unsafe { ptr::read_unaligned(serialized_object.as_ptr() as *const T) }
}
Then if we want
#[cfg(use_serde)]
use serde::{Deserialize, Deserializer, Serialize, Serializer};
impl Serialize for PLACEHOLDER {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> where S: Serializer {
        // serialize_bytes takes only the slice; the length travels with it.
        serializer.serialize_bytes(serialize_ffi::<PLACEHOLDER>(self))
    }
}
use serde::de::{self, SeqAccess, Visitor};
impl<'de> Deserialize<'de> for PLACEHOLDER {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error> where D: Deserializer<'de> {
        struct PLACEHOLDER_visitor;
        impl<'de> Visitor<'de> for PLACEHOLDER_visitor {
            // The visitor produces the struct itself, not the visitor type.
            type Value = PLACEHOLDER;
            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("PLACEHOLDER_visitor")
            }
            fn visit_seq<V>(self, mut seq: V) -> Result<Self::Value, V::Error> where V: SeqAccess<'de> {
                // Report a missing element as an error instead of unwrapping.
                let bytes: Vec<u8> = seq.next_element()?.ok_or_else(|| de::Error::invalid_length(0, &self))?;
                Ok(deserialize_ffi::<PLACEHOLDER>(&bytes))
            }
        }
        const FIELDS: &'static [&'static str] = &["bytes"];
        deserializer.deserialize_struct("PLACEHOLDER", FIELDS, PLACEHOLDER_visitor)
    }
} |
Hi, really sorry for being this late to the discussion. I think there are a couple of aspects we have to consider. Some of them have what appears to be a straightforward/widely accepted approach already, but I'm going to try and mention them all just to make sure we're on the same page.
use std::{mem, ptr, slice};
trait Foo: Sized + Copy {
    fn from_byte_slice(bytes: &[u8]) -> Self {
        // size_of is a function and must be called; this check guards the read below.
        assert_eq!(mem::size_of::<Self>(), bytes.len());
        unsafe { ptr::read_unaligned(bytes.as_ptr() as *const Self) }
    }
    fn as_byte_slice(&self) -> &[u8] {
        unsafe { slice::from_raw_parts(self as *const Self as *const u8, mem::size_of::<Self>()) }
    }
}
The advantage of using a trait is that we can add the implementations
impl<T: Foo> Serialize for T {
...
}
impl<T: Foo> Deserialize for T {
...
}
And then, since all the
|
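The raw-byte idea from the two comments above can be tried without serde at all. The following is a minimal sketch, not code from this PR: `RawBytes` and `ExampleRegs` are hypothetical names, and marking the trait `unsafe` is an assumption added here so that each implementor vouches that the type has no padding bytes and accepts any bit pattern.

```rust
use std::{mem, ptr, slice};

/// Hypothetical marker trait for FFI-safe, plain-old-data types.
///
/// Safety: implementors must be `#[repr(C)]`, contain no padding bytes,
/// and treat every bit pattern as a valid value.
pub unsafe trait RawBytes: Copy + Sized {
    /// Borrow the value's memory as a byte slice.
    fn as_byte_slice(&self) -> &[u8] {
        // SAFETY: the trait contract guarantees every byte of `Self` is initialized.
        unsafe { slice::from_raw_parts(self as *const Self as *const u8, mem::size_of::<Self>()) }
    }

    /// Rebuild a value from bytes; returns `None` if the length does not match.
    fn from_byte_slice(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != mem::size_of::<Self>() {
            return None;
        }
        // SAFETY: the length was checked, `read_unaligned` accepts any alignment,
        // and the trait contract says any bit pattern is a valid `Self`.
        Some(unsafe { ptr::read_unaligned(bytes.as_ptr() as *const Self) })
    }
}

/// Hypothetical stand-in for a generated binding: 8 + 4 + 4 bytes, no padding.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct ExampleRegs {
    pub a: u64,
    pub b: u32,
    pub c: u32,
}

unsafe impl RawBytes for ExampleRegs {}
```

Returning `Option` rather than asserting lets a caller turn a short or oversized buffer into a deserialization error instead of a panic. Note that the resulting bytes use the host's layout and endianness, so they are only meaningful to a reader built for the same architecture.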
Serde is pulled in as a conditional dependency, activated by the with_serde feature. If active, this feature will also generate sources that implement serde::Serialize and serde::Deserialize for all generated bindings. The implementation (de)serializes each struct/union's raw bytes, as all of them are FFI-safe.
Signed-off-by: Alexandra Iordache <[email protected]>
@jiangliu @alexandruag please take another look. I changed...well, everything.
|
Hi all,
|
It seems we could use proc_macro_derive() to achieve the same goal as build_serde.rs; it may be simpler and could be reused by other crates. Should we give it a try? |
@aghecenco @alexandruag
I have pushed the new vmm-serde crate to my personal repo at: https://github.com/jiangliu/vmm-serde
With all this ready, it becomes easy to serialize/deserialize data structs generated by bindgen.
#[cfg(feature = "serde_derive_ffi")]
#[test]
fn ffi_test_ffi_fam_struct() {
#[repr(C)]
#[derive(Default, Debug, SerializeFfi, DeserializeFfi)]
pub struct __IncompleteArrayField<T>(::std::marker::PhantomData<T>, [T; 0]);
impl<T> __IncompleteArrayField<T> {
#[inline]
pub fn new() -> Self {
__IncompleteArrayField(::std::marker::PhantomData, [])
}
}
#[repr(C)]
#[derive(Debug, Default, SerializeFfi, DeserializeFfiFam)]
pub struct kvm_msrs {
pub nmsrs: u32,
pub pad: u32,
pub entries: __IncompleteArrayField<u64>,
}
impl SizeofFamStruct for kvm_msrs {
fn size_of(&self) -> usize {
self.nmsrs as usize * std::mem::size_of::<u64>() + std::mem::size_of::<Self>()
}
}
let data = vec![
kvm_msrs {
nmsrs: 1,
pad: 0,
entries: __IncompleteArrayField::new(),
},
kvm_msrs {
nmsrs: 0x1,
pad: 0x2,
entries: __IncompleteArrayField::new(),
},
];
let ser = serde_json::to_string(&data[0]).unwrap();
let mut deserializer = serde_json::Deserializer::from_str(&ser);
let content: Vec<kvm_msrs> = kvm_msrs::deserialize(&mut deserializer).unwrap();
// let decoded: FamStructWrapper<kvm_msrs> = content.into();
assert_eq!(content[0].nmsrs, 1);
assert_eq!(content[0].pad, 0);
}
And the Cargo.toml looks like:
|
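The size rule in the test above (fixed header plus `nmsrs` trailing entries) is what makes flexible-array-member structs awkward to serialize. As a rough illustration only, and not the vmm-serde API, the sketch below encodes the same layout with safe code: `fam_size`, `encode`, and `decode` are hypothetical helpers, and entries are plain `u64` values as in the test rather than real MSR entries.

```rust
use std::mem;

/// Bytes taken by the fixed header: `nmsrs: u32` followed by `pad: u32`.
const HEADER_LEN: usize = 2 * mem::size_of::<u32>();

/// Total size of the header plus `nmsrs` trailing `u64` entries.
pub fn fam_size(nmsrs: u32) -> usize {
    (nmsrs as usize)
        .saturating_mul(mem::size_of::<u64>())
        .saturating_add(HEADER_LEN)
}

/// Write the header and then every entry, all in native byte order.
/// Assumes `entries.len()` fits in a `u32`.
pub fn encode(pad: u32, entries: &[u64]) -> Vec<u8> {
    let nmsrs = entries.len() as u32;
    let mut out = Vec::with_capacity(fam_size(nmsrs));
    out.extend_from_slice(&nmsrs.to_ne_bytes());
    out.extend_from_slice(&pad.to_ne_bytes());
    for entry in entries {
        out.extend_from_slice(&entry.to_ne_bytes());
    }
    out
}

/// Read one native-endian `u32` from the first four bytes of `bytes`.
fn read_u32(bytes: &[u8]) -> u32 {
    let mut word = [0u8; 4];
    word.copy_from_slice(&bytes[..4]);
    u32::from_ne_bytes(word)
}

/// Return `(pad, entries)` if the buffer length matches what the header claims.
pub fn decode(bytes: &[u8]) -> Option<(u32, Vec<u64>)> {
    if bytes.len() < HEADER_LEN {
        return None;
    }
    let nmsrs = read_u32(&bytes[0..4]);
    let pad = read_u32(&bytes[4..8]);
    if bytes.len() != fam_size(nmsrs) {
        return None;
    }
    let entries = bytes[HEADER_LEN..]
        .chunks_exact(mem::size_of::<u64>())
        .map(|chunk| {
            let mut word = [0u8; 8];
            word.copy_from_slice(chunk);
            u64::from_ne_bytes(word)
        })
        .collect();
    Some((pad, entries))
}
```

The length check in `decode` is the important part: the entry count comes from the serialized data itself, so it has to be validated against the buffer before any entries are read.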
Closing in favor of #11. |
This RFC PR adds #[derive(Serialize, Deserialize)] to kvm-bindings structs where possible. A Python script prepends the macro to the struct definitions, leaving out several "blacklisted" structs and all unions. For these, Serialize and Deserialize are implemented by leveraging the byte representation of the FFI-safe objects (i.e. their bytes are serialized as a &[u8]).
The serialization code is autogenerated at build time and kernel/arch-agnostic. It is based on a template and a list of "special" data structures (__IncompleteArrayField, for instance) that have a blanket serialization format and depend on the consumer to add a sane implementation.
Serialization for bindings has a part to play in snapshotting Firecracker and other VMMs that use this crate. It might also prove useful in live migration of rust-vmm components.
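The PR's Python script is not shown in this thread. As a hedged illustration of the preprocessing step it describes, here is a hypothetical Rust rendition: `add_serde_derive` and its `skipped` list are invented names, and the line-based matching assumes bindgen-style output where each definition starts with `pub struct` or `pub union` on its own line.

```rust
/// The attribute line to prepend to eligible struct definitions.
const DERIVE: &str = "#[derive(Serialize, Deserialize)]";

/// Prepend `DERIVE` to every `pub struct` whose name is not in `skipped`.
/// Unions never match the `pub struct ` prefix, so they are left untouched.
pub fn add_serde_derive(source: &str, skipped: &[&str]) -> String {
    let mut out = String::new();
    for line in source.lines() {
        let trimmed = line.trim_start();
        if let Some(rest) = trimmed.strip_prefix("pub struct ") {
            // The struct name is the leading run of identifier characters.
            let name = rest
                .split(|c: char| !(c.is_alphanumeric() || c == '_'))
                .next()
                .unwrap_or("");
            if !skipped.contains(&name) {
                // Reuse the definition's indentation for the new attribute line.
                let indent = &line[..line.len() - trimmed.len()];
                out.push_str(indent);
                out.push_str(DERIVE);
                out.push('\n');
            }
        }
        out.push_str(line);
        out.push('\n');
    }
    out
}
```

Types that end up in the skip list, along with all unions, would then need the hand-written byte-level implementations the description mentions.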
Known missing items from this PR:
Fixes: #5