Skip to content

Conversation

weihanglo
Copy link
Member

@weihanglo weihanglo commented Oct 17, 2025

What does this PR try to resolve?

This was found during experimenting -Zbuild-analysis with ndjson.

From me tracing the code with cargo expand, basically there shouldn't have any significant performance difference between serde(flatten) and inlining all the fields. Here the differences between them

  • flatten one calls Serialize::serialize_map without fields size hint so cannot pre-allocate Vec with Vec::with_capacity, whereas inline case calls Serialize::serialize_struct with a known length of fields.
  • flatten would end up calling Serializer::serialize_map and line calls Serializer::serialize_struct. And in serde_json serializer serialize_struct actually call serailze_map. So no difference on serializer side.
  • There might be some function calls not inlined I like FlatMapSerializer but I doubt it is costly than allocation.

Here is the cargo-expand'd result:

#[derive(Serialize)]
pub struct Foo<D: Serialize> {
    id: u8,
    #[serde(flatten)]
    data: D,
}

#[derive(Serialize)]
struct Bar {
    a: bool,
}

// Expand to

extern crate serde as _serde;
impl<D: Serialize> _serde::Serialize for Foo<D>
where
    D: _serde::Serialize,
{
    fn serialize<__S>(
        &self,
        __serializer: __S,
    ) -> _serde::__private228::Result<__S::Ok, __S::Error>
    where
        __S: _serde::Serializer,
    {
        let mut __serde_state = _serde::Serializer::serialize_map(
            __serializer,
            _serde::__private228::None,
        )?;
        _serde::ser::SerializeMap::serialize_entry(
            &mut __serde_state,
            "id",
            &self.id,
        )?;
        _serde::Serialize::serialize(
            &&self.data,
            _serde::__private228::ser::FlatMapSerializer(&mut __serde_state),
        )?;
        _serde::ser::SerializeMap::end(__serde_state)
    }
}

impl _serde::Serialize for Bar {
    fn serialize<__S>(
        &self,
        __serializer: __S,
    ) -> _serde::__private228::Result<__S::Ok, __S::Error>
    where
        __S: _serde::Serializer,
    {
        let mut __serde_state = _serde::Serializer::serialize_struct(
            __serializer,
            "Bar",
            false as usize + 1,
        )?;
        _serde::ser::SerializeStruct::serialize_field(
            &mut __serde_state,
            "a",
            &self.a,
        )?;
        _serde::ser::SerializeStruct::end(__serde_state)
    }
}
#[derive(Serialize)]
pub struct Foo<D: Serialize> {
    id: u8,
    a: bool,
}

// Expand to

impl<D: Serialize> _serde::Serialize for Foo<D> {
    fn serialize<__S>(
        &self,
        __serializer: __S,
    ) -> _serde::__private228::Result<__S::Ok, __S::Error>
    where
        __S: _serde::Serializer,
    {
        let mut __serde_state = _serde::Serializer::serialize_struct(
            __serializer,
            "Foo",
            false as usize + 1 + 1,
        )?;
        _serde::ser::SerializeStruct::serialize_field(
            &mut __serde_state,
            "id",
            &self.id,
        )?;
        _serde::ser::SerializeStruct::serialize_field(
            &mut __serde_state,
            "a",
            &self.a,
        )?;
        _serde::ser::SerializeStruct::end(__serde_state)
    }
}

How to test and review this PR?

CI passing.

One change is that reason will no longer be the first field. We could activate the preserve_order feature in serde_json if we want, though that may get more perf loess than the gain.

See the benchmark result in #16130 (comment)

This was found during experimenting `-Zbuild-analysis` with ndjson.

From me tracing the code with `cargo expand`, basically there shouldn't
have any significant performance difference between `serde(flatten)`
and inlining all the fields. Here the differences between them

* flatten one calls `Serialize::serialize_map` without fields size
  hint so cannot pre-allocate Vec with `Vec::with_capacity`,
  whereas inline case calls `Serialize::serialize_struct` with a
  known length of fields.
* flatten would end up calling `Serializer::serialize_map`
  and line calls `Serializer::serialize_struct`. And in serde_json
  serializer `serialize_struct` actually call `serailze_map`.
  So no difference on serializer side.
* There might be some function calls not inlined I like
  `FlatMapSerializer` but I doubt it is costly than allocation.

Here is the `cargo-expand`'d result:

```rust
#[derive(Serialize)]
pub struct Foo<D: Serialize> {
    id: u8,
    #[serde(flatten)]
    data: D,
}

#[derive(Serialize)]
struct Bar {
    a: bool,
}

// Expand to

extern crate serde as _serde;
impl<D: Serialize> _serde::Serialize for Foo<D>
where
    D: _serde::Serialize,
{
    fn serialize<__S>(
        &self,
        __serializer: __S,
    ) -> _serde::__private228::Result<__S::Ok, __S::Error>
    where
        __S: _serde::Serializer,
    {
        let mut __serde_state = _serde::Serializer::serialize_map(
            __serializer,
            _serde::__private228::None,
        )?;
        _serde::ser::SerializeMap::serialize_entry(
            &mut __serde_state,
            "id",
            &self.id,
        )?;
        _serde::Serialize::serialize(
            &&self.data,
            _serde::__private228::ser::FlatMapSerializer(&mut __serde_state),
        )?;
        _serde::ser::SerializeMap::end(__serde_state)
    }
}

impl _serde::Serialize for Bar {
    fn serialize<__S>(
        &self,
        __serializer: __S,
    ) -> _serde::__private228::Result<__S::Ok, __S::Error>
    where
        __S: _serde::Serializer,
    {
        let mut __serde_state = _serde::Serializer::serialize_struct(
            __serializer,
            "Bar",
            false as usize + 1,
        )?;
        _serde::ser::SerializeStruct::serialize_field(
            &mut __serde_state,
            "a",
            &self.a,
        )?;
        _serde::ser::SerializeStruct::end(__serde_state)
    }
}
```

```rust
#[derive(Serialize)]
pub struct Foo<D: Serialize> {
    id: u8,
    a: bool,
}

// Expand to

impl<D: Serialize> _serde::Serialize for Foo<D> {
    fn serialize<__S>(
        &self,
        __serializer: __S,
    ) -> _serde::__private228::Result<__S::Ok, __S::Error>
    where
        __S: _serde::Serializer,
    {
        let mut __serde_state = _serde::Serializer::serialize_struct(
            __serializer,
            "Foo",
            false as usize + 1 + 1,
        )?;
        _serde::ser::SerializeStruct::serialize_field(
            &mut __serde_state,
            "id",
            &self.id,
        )?;
        _serde::ser::SerializeStruct::serialize_field(
            &mut __serde_state,
            "a",
            &self.a,
        )?;
        _serde::ser::SerializeStruct::end(__serde_state)
    }
}
```
@rustbot rustbot added A-console-output Area: Terminal output, colors, progress bar, etc. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 17, 2025
@rustbot
Copy link
Collaborator

rustbot commented Oct 17, 2025

r? @epage

rustbot has assigned @epage.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@weihanglo
Copy link
Member Author

Just used Claude Haiku 4.5 to generate a criterioin benchamrk script for me. The benchmark was run under opt-level=3 and debug=false.

Here is the result:

Message Type master d80156f serde(flatten) Improvement
BuildFinished 173.91 ns 89.814 ns 48.5%
Artifact 1.4953 µs 1.3515 µs 9.6%

BTW, it is interesting that I can directly pull in local path dependencies in -Zscript.

Click to expand the AI-generated benchmark script (reviewed by human)

#!/usr/bin/env -S cargo +nightly -Zscript
---cargo
[package]
name = "bench_message_serialization"
edition = "2024"

[dependencies]
cargo = { path = "." }
cargo-util-schemas = { path = "crates/cargo-util-schemas" }
criterion = { version = "0.5" }
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["raw_value"] }

[profile.dev]
opt-level = 3
debug = false
---

//! # Machine Message Serialization Benchmark (Criterion-based)
//!
//! ## Quick Start
//! ```sh
//! cargo +nightly -Zscript bench_message_serialization.rs
//! ```
//!
//! ## Overview
//!
//! This benchmark uses criterion.rs for statistically rigorous benchmarking of
//! real message types from the cargo crate (`cargo::util::machine_message`):
//! - BuildFinished (simple)
//! - Artifact (complex)
//!
//! Compares:
//! - Original: Double serialization + string manipulation
//! - Optimized: Single serialization using `#[serde(flatten)]`
//!
//! ## Features
//!
//! - Multiple samples (100) for statistical reliability
//! - Automatic outlier detection and handling
//! - Confidence intervals and variance analysis
//! - Statistical significance testing

use cargo::core::Edition;
use cargo::core::Target;
use cargo::core::compiler::CrateType;
use cargo::util::machine_message::{Artifact, ArtifactDebuginfo, ArtifactProfile, BuildFinished};
use cargo_util_schemas::core::PackageIdSpec;
use criterion::{Criterion, black_box};
use serde::Serialize;
use std::path::PathBuf;
use std::time::Duration;

// ============================================================================
// Trait with Both Original and Optimized Methods
// ============================================================================

pub trait Message: Serialize {
    fn reason(&self) -> &str;

    fn to_json_string_original(&self) -> String {
        // Before: 2 serializations + string concatenation
        let json = serde_json::to_string(self).unwrap();
        assert!(json.starts_with("{\""));
        let reason = serde_json::json!(self.reason());
        format!("{{\"reason\":{},{}", reason, &json[1..])
    }

    fn to_json_string_optimized(&self) -> String {
        // After: 1 serialization using flatten
        #[derive(Serialize)]
        struct WithReason<'a, S: Serialize> {
            reason: &'a str,
            #[serde(flatten)]
            msg: &'a S,
        }
        let with_reason = WithReason {
            reason: self.reason(),
            msg: &self,
        };
        serde_json::to_string(&with_reason).unwrap()
    }
}

impl Message for BuildFinished {
    fn reason(&self) -> &str {
        "build-finished"
    }
}

impl<'a> Message for Artifact<'a> {
    fn reason(&self) -> &str {
        "compiler-artifact"
    }
}

// ============================================================================
// Test Data Builders
// ============================================================================

fn build_finished() -> BuildFinished {
    BuildFinished { success: true }
}

fn artifact<'a>() -> Artifact<'a> {
    let target = Box::leak(Box::new(Target::lib_target(
        "my_package",
        vec![CrateType::Rlib],
        PathBuf::from("/home/user/projects/my-package/src/lib.rs"),
        Edition::Edition2021,
    )));

    Artifact {
        package_id: PackageIdSpec::parse("my-package").unwrap(),
        manifest_path: PathBuf::from("/home/user/projects/my-package/Cargo.toml"),
        target,
        profile: ArtifactProfile {
            opt_level: "3",
            debuginfo: Some(ArtifactDebuginfo::Int(2)),
            debug_assertions: false,
            overflow_checks: false,
            test: false,
        },
        features: vec![
            "default".to_string(),
            "experimental".to_string(),
            "feature3".to_string(),
            "feature4".to_string(),
            "feature5".to_string(),
            "feature6".to_string(),
            "feature7".to_string(),
        ],
        filenames: vec![
            PathBuf::from("/home/user/projects/my-package/target/debug/deps/libmy_package.rlib"),
            PathBuf::from("/home/user/projects/my-package/target/debug/deps/libmy_package.d"),
        ],
        executable: None,
        fresh: false,
    }
}

// ============================================================================
// Benchmark Helper
// ============================================================================

fn benchmark_message<M: Message>(criterion: &mut Criterion, name: &str, msg: M) {
    let mut group = criterion.benchmark_group(name);
    group.sample_size(100);
    group.measurement_time(Duration::from_secs(5));
    group.warm_up_time(Duration::from_secs(3));

    group.bench_function("original", |b| {
        b.iter(|| black_box(&msg).to_json_string_original())
    });

    group.bench_function("optimized", |b| {
        b.iter(|| black_box(&msg).to_json_string_optimized())
    });

    group.finish();
}

// ============================================================================
// Main
// ============================================================================

fn main() {
    let mut criterion = Criterion::default()
        .measurement_time(Duration::from_secs(5))
        .sample_size(100)
        .warm_up_time(Duration::from_secs(3))
        .with_plots();

    eprintln!();
    eprintln!("╔══════════════════════════════════════════════════════════╗");
    eprintln!("║   Machine Message Serialization Benchmark (Criterion)    ║");
    eprintln!("║   Statistical Analysis • Real Cargo Types                ║");
    eprintln!("╚══════════════════════════════════════════════════════════╝");
    eprintln!();

    // Verify correctness before benchmarking
    {
        let bf = build_finished();
        let orig = bf.to_json_string_original();
        let opt = bf.to_json_string_optimized();
        assert_eq!(orig, opt, "BuildFinished outputs don't match!");

        let art = artifact();
        let orig = art.to_json_string_original();
        let opt = art.to_json_string_optimized();
        assert_eq!(orig, opt, "Artifact outputs don't match!");

        eprintln!("✓ Correctness check passed\n");
    }

    eprintln!("📊 Benchmarking BuildFinished (simple message)...");
    eprintln!();
    benchmark_message(&mut criterion, "BuildFinished", build_finished());
    eprintln!();

    eprintln!("📊 Benchmarking Artifact (complex message)...");
    eprintln!();
    benchmark_message(&mut criterion, "Artifact", artifact());
    eprintln!();

    criterion.final_summary();

    eprintln!();
    eprintln!("╔════════════════════════════════════════╗");
    eprintln!("║   Benchmark Complete                   ║");
    eprintln!("║   Results saved to: target/criterion/  ║");
    eprintln!("╚════════════════════════════════════════╝");
    eprintln!();
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-console-output Area: Terminal output, colors, progress bar, etc. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants