Add support for custom bakes to databake #6576

sffc · 2025-05-11T00:50:30Z

gemini-code-assist · 2025-05-11T00:50:34Z

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

sffc · 2025-05-11T08:19:03Z

Things I'd like feedback on:

Names of things
The two traits (safe and unsafe) and how they do/don't interact with each other
The unusual safety requirement on the unsafe trait
The new overload on the macro and whether it has any risk of being a breaking change

robertbastian · 2025-05-12T14:02:15Z

utils/databake/derive/src/lib.rs

+/// To bake to a different type than this, use `custom_bake`
+/// and implement `CustomBake`.


I think having a trait for this is overkill if you can provide a method to the macro

sffc · 2025-05-13

Reasons I made it a trait:

Gives a place to enforce the strange safety requirement in the unsafe version

And since we have it for unsafe, we can also use it for safe

How do you suggest handling the safety requirement without a trait?

robertbastian · 2025-05-12T14:02:50Z

utils/databake/derive/src/lib.rs

+///
+/// #[derive(Bake)]
+/// #[databake(path = bar::module)]
+/// #[databake(path = custom_bake)]


Suggested change

/// #[databake(path = custom_bake)]

/// #[databake(custom_bake = Self::bake_to_bytes)]

Manishearth · 2025-05-21T00:12:20Z

So, the promised "complicated thoughts":

By and large I disprefer adding new types and traits. On the other hand, adding attributes/toggles/etc to a derive is something I think is a good way to achieve goals. As such, the original proposed design seemed pretty good to me.

So my first reaction to this PR was "we should go back to the original proposal that we agreed on". Or use From.

So basically something where we can specify to/from functions or use a preexisting trait. So #[databake(path = ..., custom_bake = Foo)] using From/Into or custom_bake = (type = &[u8], to = Foo::to_bytes, from = Foo::from_bytes).
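To make the safe case concrete, here is a minimal sketch of the kind of to/from pair such an attribute could point at. Only the names `Foo`, `to_bytes`, and `from_bytes` come from the comment above; the field, the ASCII invariant, and the `BAKED_FOO` constant are assumptions for illustration, and no databake API is used.

```rust
/// Hypothetical type with the invariant "ASCII bytes only".
#[derive(Debug, PartialEq)]
pub struct Foo<'a> {
    bytes: &'a [u8],
}

impl<'a> Foo<'a> {
    /// Safe "to" direction: the value that would be written into baked code.
    pub const fn to_bytes(&self) -> &'a [u8] {
        self.bytes
    }

    /// Safe "from" direction: re-validates its input, so no `unsafe` is
    /// needed. It is a `const fn` so that it can run inside a `const` item.
    pub const fn from_bytes(bytes: &'a [u8]) -> Self {
        let mut i = 0;
        // `for` loops are not allowed in `const fn`, hence the `while`.
        while i < bytes.len() {
            assert!(bytes[i].is_ascii(), "baked bytes must be ASCII");
            i += 1;
        }
        Foo { bytes }
    }
}

/// Roughly what generated code could contain for a baked `Foo` (assumed shape).
pub const BAKED_FOO: Foo<'static> = Foo::from_bytes(b"en-US");
```

Because `from_bytes` validates, the round trip needs no `unsafe` anywhere; the cost is that validation runs again whenever the constructor is evaluated.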

But that works for safe conversions. I agree that this is not as good for unsafe conversions. For an unsafe conversion the bare minimum is that the macro should mention unsafe somewhere, but ideally you have unsafe {} or unsafe impl somewhere.

But if you have to implement a custom trait, I once again go back to comparing it with the motivation of reducing boilerplate: isn't the whole idea to remove custom impls? I thought about it more, and concluded that replacing a TokenStream-universe custom impl with a value-universe custom impl is still valuable. (this type of question is why I am so insistent on fully understanding the motivation before talking too deeply about solutions)

Putting all of this together, I end up with:

I think we probably should use a trait for the unsafe conversions
I'm not convinced we should use a trait for the safe conversions, macro magic seems better, even if it ends up with two somewhat different ways of doing things. I'm overall fine with unsafe stuff being different.

Looking at the existing trait I'm not really a fan of the nonlocality of the guarantees, referencing the existence of an inherent method. How about a single trait:

/// Safety: implementation is valid if from_baked is always safe when fed values from to_baked
unsafe trait CustomBakeConversions {
   type Baked<'a>: Bake;
   /// Allowed to panic
   fn to_baked<'a>(&'a self) -> Self::Baked<'a>;
   /// Safety: must only be called on values produced by to_baked
   unsafe fn from_baked<'a>(baked: Self::Baked<'a>) -> Self;
}

invoked with databake(..., custom_bake(type = &[u8], unsafe))

And then for the "safe" bake we have MVP #[databake(..., custom_bake = &[u8])] where we assume the existence of safe to_baked/from_baked functions OR From/Into functions (dealer's choice). We can add customizability here when desired.

Thoughts?
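As a rough illustration of how a type might implement a trait shaped like the one proposed above, here is a self-contained sketch. It is a simplified stand-in, not a databake API: the lifetime parameter on `Baked` and the `Bake` bound are dropped to keep it short, and `Label` is a hypothetical type.

```rust
/// Simplified stand-in for the proposed trait (assumption: no `Bake` bound,
/// no lifetime parameter on `Baked`).
///
/// # Safety
/// Implementations must ensure `from_baked` is sound for every value that
/// `to_baked` can return.
pub unsafe trait CustomBakeConversions: Sized {
    type Baked;
    /// Allowed to panic.
    fn to_baked(&self) -> Self::Baked;
    /// # Safety
    /// `baked` must have been produced by `to_baked`.
    unsafe fn from_baked(baked: Self::Baked) -> Self;
}

/// Hypothetical type whose invariant is "always valid UTF-8".
#[derive(Debug, PartialEq)]
pub struct Label<'a>(&'a str);

// SAFETY: `to_baked` only ever returns the bytes of a valid `&str`, which is
// exactly what `from_utf8_unchecked` requires.
unsafe impl<'a> CustomBakeConversions for Label<'a> {
    type Baked = &'a [u8];

    fn to_baked(&self) -> &'a [u8] {
        self.0.as_bytes()
    }

    unsafe fn from_baked(baked: &'a [u8]) -> Self {
        // SAFETY: the caller promises `baked` came from `to_baked`.
        Label(unsafe { core::str::from_utf8_unchecked(baked) })
    }
}
```

The `unsafe impl` is where the implementer takes responsibility for the round-trip guarantee, and the `unsafe fn` is where the caller does; that is the "unsafe hygiene" the attribute-only form cannot express.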

robertbastian · 2025-05-21T06:41:55Z

can't use a trait for const construction, which is the actual unsafe part (to_baked is safe)

sffc · 2025-05-21T15:42:32Z

> But if you have to implement a custom trait, I once again go back to comparing it with the motivation of reducing boilerplate: isn't the whole idea to remove custom impls? I thought about it more, and concluded that replacing a TokenStream-universe custom impl with a value-universe custom impl is still valuable.

This has been my position and I appreciate your eloquence. ❤️

> I think we probably should use a trait for the unsafe conversions

> I'm not convinced we should use a trait for the safe conversions, macro magic seems better, even if it ends up with two somewhat different ways of doing things. I'm overall fine with unsafe stuff being different.

My position is that since we need a trait for the unsafe, then it's harmless to support it in safe mode. The unsafe trait can just be an extension on the safe trait, as proposed in this PR. At least, it can be the default behavior when custom_bake is used without any arguments in the derive.

> Looking at the existing trait I'm not really a fan of the nonlocality of the guarantees, referencing the existence of an inherent method. How about a single trait:

I would prefer a single trait, but the constructor can't be on a trait, at least not until we have const traits.
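The indirection being discussed can be sketched as follows. Trait methods cannot be `const fn` on stable Rust, so the constructor has to be an inherent function that the trait's safety contract merely refers to. All names here (`UnsafeCustomBake`, `to_custom_bake`, `from_custom_bake`, `Label`, `BAKED_LABEL`) are hypothetical and are not taken from the PR's actual code.

```rust
/// Hypothetical trait. Its safety contract is "nonlocal": it refers to an
/// inherent function that the trait itself cannot name or check.
///
/// # Safety
/// The implementing type must provide an inherent
/// `const unsafe fn from_custom_bake(baked)` that is sound for every value
/// returned by `to_custom_bake`.
pub unsafe trait UnsafeCustomBake {
    type Baked;
    fn to_custom_bake(&self) -> Self::Baked;
}

/// Hypothetical type whose invariant is "always valid UTF-8".
pub struct Label<'a>(&'a str);

impl<'a> Label<'a> {
    /// The constructor lives outside the trait so that it can be `const`.
    ///
    /// # Safety
    /// `baked` must be valid UTF-8, e.g. a value from `to_custom_bake`.
    pub const unsafe fn from_custom_bake(baked: &'a [u8]) -> Self {
        // SAFETY: upheld by the caller, see above.
        Label(unsafe { core::str::from_utf8_unchecked(baked) })
    }

    pub const fn as_str(&self) -> &'a str {
        self.0
    }
}

// SAFETY: `to_custom_bake` returns the bytes of a valid `&str`, which is what
// the inherent `from_custom_bake` requires.
unsafe impl<'a> UnsafeCustomBake for Label<'a> {
    type Baked = &'a [u8];

    fn to_custom_bake(&self) -> &'a [u8] {
        self.0.as_bytes()
    }
}

/// Roughly what baked output could look like (assumed shape): a `const`
/// built through the inherent constructor.
pub const BAKED_LABEL: Label<'static> = unsafe { Label::from_custom_bake(b"und") };
```

Nothing ties the trait impl to the inherent function except documentation, which is the nonlocality objected to earlier in the thread.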

Manishearth · 2025-05-21T16:15:02Z

> can't use a trait for const construction, which is the actual unsafe part (to_baked is safe)

argh. okay, fine, the indirect const function is acceptable, but I still don't like it.

> My position is that since we need a trait for the unsafe, then it's harmless to support it in safe mode. The unsafe trait can just be an extension on the safe trait, as proposed in this PR. At least, it can be the default behavior when custom_bake is used without any arguments in the derive.

I think this is a bad trait. It has a strange nonlocal guarantee¹ and it's mimicking existing Rust conversion traits. We need it for proper unsafe hygiene, which makes me marginally okay with having it: unfortunately the bad trait is the best we can do. We do not need it for the other things. I do not want to introduce a second bad trait that is only there for consistency.

In the long run, we can probably have the first trait be const From/const Into.

If the options on the table are two traits or not doing this at all I prefer not doing this at all. I accept the motivation of this change, I do not accept that it overrides all other concerns, and I think having a largely extraneous trait is where I draw the line. I would prefer to solve this without new traits at all, but safety hygiene forces us to have at least one, which I begrudgingly accept. I don't want to stretch that to two traits.

¹ Yes, in safe mode it's not a safety guarantee, but it's still a guarantee.

sffc · 2025-05-21T19:26:24Z

I don't think I agree with From/Into being a long-term goal we want to work toward.

<reasoning>
I've tried, multiple times, to get my Rust trait frameworks to sit on top of From/Into, and I run into various types of issues:

No great way to implement From<&T> for &U, and generally the traits get messy with borrowed things. They work much better with owned-to-owned conversions.
Sometimes we don't want the serialized/baked repr to be the canonical representation in the target type, like &str or &[u8]. For example, if we want to bake Pattern, we might bake it as bytes, but the canonical bytes should perhaps be the UTF-8 unparsed pattern.

Traits are cheap, clean, and easy to understand. I've moved much more toward "favor use-case-specific traits" than "try and shoehorn some existing trait into a use case that doesn't exactly match".
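A toy sketch of the second point, under loudly stated assumptions: `Pattern`, its one-byte-per-item encoding, the toy grammar, and the `ToBaked` trait are all hypothetical and only illustrate why the baked bytes may differ from what a `From<&[u8]>` impl would naturally mean.

```rust
/// Hypothetical pattern type that stores pre-parsed items, one byte each.
#[derive(Debug, PartialEq)]
pub struct Pattern {
    items: Vec<u8>,
}

/// The "canonical" byte conversion: parse UTF-8 pattern text.
/// Toy grammar: each ASCII letter becomes the item `letter - b'a'`
/// (case-insensitive); everything else is ignored.
impl From<&[u8]> for Pattern {
    fn from(text: &[u8]) -> Self {
        let items = text
            .iter()
            .filter(|b| b.is_ascii_alphabetic())
            .map(|b| b.to_ascii_lowercase() - b'a')
            .collect();
        Pattern { items }
    }
}

/// Use-case-specific trait for the baked form, which is the pre-parsed items
/// rather than the pattern text.
pub trait ToBaked {
    fn to_baked(&self) -> &[u8];
}

impl ToBaked for Pattern {
    fn to_baked(&self) -> &[u8] {
        &self.items
    }
}

impl Pattern {
    /// Inverse of `to_baked`: takes items, not pattern text.
    pub fn from_baked(items: &[u8]) -> Self {
        Pattern { items: items.to_vec() }
    }
}
```

Feeding the baked bytes back through `From<&[u8]>` would reparse them as pattern text and produce a different value, which is why a separate trait keeps the two meanings apart.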
</reasoning>

However, I still think that this is cleaner using some trait, rather than making the proc macro more complicated. As you know, I would rather us work toward getting rid of proc macros. There seemed to be consensus at RustWeek that proc macros are bad, because they pull in Syn, require running code at build time, are hard for tooling like rust-analyzer, etc. This is a theme that came up again and again. There was desire to eventually move toward macro_rules for derives, but the team acknowledged that there's still a long way before we get there. However, what we can do for now is avoid over-complicating our proc macro.

sffc · 2025-05-21T19:36:04Z

If we wait for const traits to land, which I'm told by the lang team should be some time in the not-too-distant future, would you approve this with a trait-based solution?

You listed two reasons you don't prefer traits:

1. Because they have an indirect reference to the constructor
2. Because they look similar to From/Into

Const traits will fix (1). See my previous post for why I think we should not aim for (2).

Manishearth · 2025-05-21T21:09:16Z

> As you know, I would rather us work toward getting rid of proc macros. There seemed to be consensus at RustWeek that proc macros are bad, because they pull in Syn, require running code at build time, are hard for tooling like rust-analyzer, etc. This is a theme that came up again and again. There was desire to eventually move toward macro_rules for derives, but the team acknowledged that there's still a long way before we get there. However, what we can do for now is avoid over-complicating our proc macro.

This is an extremely long-term goal and I do not think we will get close to this any time soon (in the next five years or so). I've seen a lot of this desire, but the actual design work for this is extremely nascent¹, and macro work has historically taken ages to occur. We still do not have "Macros 2.0", a nine-year-old feature proposal that is still actively desired and occasionally worked on.

The flexibility of the macro system overall makes it very tricky to evolve: I do not begrudge the Rust team their time in working on this, but I also expect very little when it comes to large macro system improvements.

Given that, I disagree with "what we can do for now is avoid overcomplicating our proc macro". Something that is >5 years out in the future, potentially even 10 years out in the future, is not something I find it useful (or even possible?) to design towards. When that time comes near, we can perform a proper holistic redesign of databake. Until then I don't find it useful to prevent ourselves from certain design patterns because they will need to change at that level; we cannot truly predict what will and won't be complicated in that future. Furthermore I think designing proc macros with good UX now would be helpful in informing what use cases the lang team should consider when designing a declarative macro future.

I very explicitly did not try and design yoke for a potential future GAT world. I knew it was coming soon, I could have designed it differently with expectations of it fitting in better with GATs, I decided not to. It's good that I did: the way GATs ended up working was not how I had envisioned them as working wrt yoke, and trying to "prepare" for that might have actually made the crate worse. There's still some stuff that I'd like to experiment with there, but in this case there's no rush.

I feel the same way about our proc macros and a potential future with more powerful decl macros. For zerovec, I am interested in ways to supplement zerovec with currently-possible decl macros to improve the dependency situation. But databake is not a normal runtime dep so I'm not super interested in databake decl macros unless we can replace it completely with decl macros (with decent UX), and I don't think decl macros are currently at that point. Eventually when we have fancy decl macros that can do this type of thing well, I'd love to try and use them, and revisit decisions like these. 5+ years is a wonderful time to perform a new holistic design.

> If we wait for const traits to land, which I'm told by the lang team should be some time in the not-too-distant future, would you approve this with a trait-based solution?

I'd still be hesitant. My preference in databake and zerovec is if we are already using a proc macro, then we should use proc macro attribute configs as much as possible before adding new items to the public API. We have more flexibility with those attributes, and can play around with it and arrive at better holistic designs much more easily. This is a reason I have not yet stated in this thread, but I have stated before when it comes to additions to zerovec.

I'll also note: the problem with a full bidirectional trait without the indirectness is that the crate now needs an unconditional runtime dependency on databake. This is compile time infra, I'd love for it to stay compile time infra, which to me means solving it in the proc macro world.

I recognize that I've both expressed a dislike for the indirectness and just now expressed why not having the indirectness is also bad; but this is why I disprefer traits here.

Given the runtime dependency problem I think my preferred path forward is to have an "indirect" unsafe trait for the unsafe construction and use proc macro attributes for the safe construction, even in the presence of const traits. I don't have a strong desire to use From/Into here, I just don't want to introduce new traits, but recognize a strong reason to do so for unsafe. In the long run I'd love to redesign this when macros are better.

¹ In terms of progress made. I remember writing down ideas for custom derives that didn't need an AST library before we had the concept of tokenstream-based custom derives. People have been thinking about this problem since before Rust 1.0.

Add support for custom bakes to databake

1b5aaf7

sffc marked this pull request as ready for review May 11, 2025 08:11

sffc requested review from robertbastian and Manishearth as code owners May 11, 2025 08:11

sffc added 2 commits May 11, 2025 10:29

The unsafe trait should inherit from the safe one

6b5ff38

Clippy

d9dcc1f

robertbastian reviewed May 12, 2025

sffc mentioned this pull request May 15, 2025

DataBake: split serialized form from runtime form #2452

Open
