Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for -Zembed-metadata #15378

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Kobzol
Copy link
Contributor

@Kobzol Kobzol commented Apr 2, 2025

What does this PR try to resolve?

This PR adds Cargo integration for the new unstable -Zembed-metadata rustc flag, which was implemented in rust-lang/rust#137535 (tracking issue).

This rustc flag can reduce disk usage of compiled artifacts, and also the size of Rust dynamic library artifacts shipped to users. However, it is not enough to just pass this flag through RUSTFLAGS; it needs to be integrated within Cargo, because it interacts with how the --emit flag is passed to rustc, and also how --extern args are passed to the final linked artifact build by Cargo. Furthermore, using the flag for all crates in a crate graph compiled by Cargo would be suboptimal (this will all be described below).

When you pass -Zembed-metadata=no to rustc, it will not store Rust metadata into the compiled artifact. This is important when compiling libs/rlibs/dylibs, since it reduces their size on disk. However, this also means that everytime we use this flag, we have to make sure that we also:

  • Include metadata in the --emit flag to generate a .rmeta file, otherwise no metadata would be generated whatsoever, which would mean that the artifact wouldn't be usable as a dependency.
  • Pass also --extern <dep>=<path>.rmeta when compiling the final linkable artifact. Before, Cargo would only pass --extern <dep>=<path>.[rlib|so|dll]. Since with -Zembed-metadata=no, the metadata is only in the .rmeta file and not in the rlib/dylib, this is needed to help rustc find out where the metadata lies.
    • Note: this essentially doubles the cmdline length when compiling the final linked artifact. Not sure if that is a concern.

The two points above is what this PR implements, and why this rustc flag needs Cargo integration.

The -Zembed-metadata flag is only passed to libs, rlibs and dylibs. It does not seem to make sense for other crate types. The one situation where it might make sense are proc macros, but according to @bjorn3 (who initially came up with the idea for -Zembed-metadata, it isn't really worth it).

Here is a table that summarizes the changes in passed flags and generated files on disk for rlibs and dylibs:

Crate type Flags Generated files Disk usage
Rlib/Lib (before) --emit=dep-info,metadata,link .rlib (with metadata), .rmeta (for pipelining) -
Rlib/Lib (after) --emit=dep-info,metadata,link -Zembed-metadata=no .rlib (without metadata), .rmeta (for metadata/pipelining) Reduced (metadata no longer duplicated)
Dylib (before) --emit=dep-info,link [.so|.dll] (with metadata) -
Dylib (after) --emit=dep-info,metadata,link -Zembed-metadata=no [.so|.dll] (without metadata), .rmeta Unchanged, but split between two files

Behavior for other target kinds/crate types should be unchanged.

From the table above, we can see two benefits of using -Zembed-metadata=no:

  • For rlibs/dylibs, we no longer store their metadata twice in the target directory, thus reducing target directory size.
  • For dylibs, we store esssentially the same amount of data on disk, but the benefit is that the metadata is now in a separate .rmeta file. This means that you can ship the dylib (.so/.dll) to users without also shipping the metadata. This would slightly reduce e.g. the size of the shipped rustc toolchains (note that the size reduction here is after the toolchain has been already heavily compressed).

Note that if this behavior ever becomes the default, it should be possible to simplify the code quite a bit, and essentially merge the requires_upstream_objects and benefits_from_split_metadata functions.

I did a very simple initial benchmark to evaluate the space savings on cargo itself and hyperqueue (a mid-size crate from my work) using cargo build and cargo build --release with and without -Zembed-metadata=no:
image

For debug/incremental builds, the effect is smaller, as the artifact disk usage is dwarfed by incremental artifacts and debuginfo. But for (non-incremental) release builds, the disk savings (and also performed I/O operations) are significantly reduced.

How should we test and review this PR?

I wrote two basic tests. The second one tests a situation where a crate depends on a dylib dependency, which is quite rare, but the behavior of this has actually changed in this PR (see comparison table above). Testing this on various real-world projects (or even trying to enable it by default across the whole Cargo suite?) might be beneficial.

Unresolved questions

Should we gate this behind an unstable Cargo flag?

Originally, I wanted to always apply -Zembed-metadata=no in nightly cargo/rustc (which is what this PR does). But after implementing it, I wonder if we should perhaps add an unstable flag for it, because:

  • The changes are not completely trivial. It might be better to have the option to test it using a feature flag, so that we can evaluate it effect more easily, rather than just doing it by default for all nightly users. It could also make it a bit easier to dogfood it when compiling rustc itself, although I expect this will also be resolved by the next beta bump, as then the stage0 compiler will understand -Zembed-metadata.
  • It might be tricky to synchronize changes in the compiler's behavior of the -Zembed-metadata flag. If we find some issues and we need to change the behavior, we might have to also make some changes to Cargo. Without making the change at the same time in rustc and Cargo (which we cannot really do currently, because Cargo is not a subtree), it could break nightly toolchains.
  • I guess that a similar concern exists for when/if we actually want to stabilize this functionality in the future, but that would probably be easier than the nightly situation.

Is this a breaking change?

This question only becomes relevant once we start doing this by default on stable (for nightly users, it is relevant immediately if we don't use a cargo flag for it).

With this new behavior, dylibs and rlibs will no longer contain metadata. If they are compiled with Cargo, that shouldn't matter, but other build systems might have to adapt.

Should this become the default?

I think that in terms of disk size usage and performed I/O operations, it is a pure win. It should either generate less disk data (for rlibs) or the ~same amount of data for dylibs (the data will be a bit larger, because the dylib will still contain a metadata stub header, but that's like 50 bytes and doesn't scale with the size of the dylib, so it's negligible).

So I think that eventually, we should just do this by default in Cargo, unless some concerns are found. I suppose that before stabilizing we should also benchmark the effect on compilation performance.

@rustbot
Copy link
Collaborator

rustbot commented Apr 2, 2025

r? @ehuss

rustbot has assigned @ehuss.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-build-execution Area: anything dealing with executing the compiler A-cfg-expr Area: Platform cfg expressions A-crate-types Area: crate-type declaration (lib, staticlib, dylib, cdylib, etc.) A-manifest Area: Cargo.toml issues S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 2, 2025
@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 2, 2025

I have already discussed this before with Ed, so:

r? @epage

@rustbot rustbot assigned epage and unassigned ehuss Apr 2, 2025
@ehuss
Copy link
Contributor

ehuss commented Apr 2, 2025

One of the concerns we had about this is how cargo sometimes "uplifts" rlibs to the output directory. I'm not really sure that's a good thing that cargo does it, but it is some historical baggage that we carry. If I understand correctly with this, those rlibs would not be usable. I think that's something we'll probably need to examine on what the best path forward would be.

@bjorn3
Copy link
Member

bjorn3 commented Apr 2, 2025

If I understand correctly with this, those rlibs would not be usable.

The uplifted copy alone already wouldn't be usable if the rlib has any dependencies.

@ehuss
Copy link
Contributor

ehuss commented Apr 2, 2025

The uplifted copy alone already wouldn't be usable if the rlib has any dependencies.

I believe there are some users who use -L with the deps directory which resolves that issue, but AIUI, doesn't work with this.

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 3, 2025

The uplifted copy alone already wouldn't be usable if the rlib has any dependencies.

I believe there are some users who use -L with the deps directory which resolves that issue, but AIUI, doesn't work with this.

That should be fine. As long as the .rmeta file is next to the .rlib file in the deps directory (which it should, as Cargo AFAIK always passes --emit=metadata for rlibs), then rustc should find the metadata.

@ehuss
Copy link
Contributor

ehuss commented Apr 3, 2025

Try this:

cargo new --lib foo
cd foo
cargo add empty-library
rm src/lib.rs
echo "extern crate empty_library;" > src/lib.rs
cargo b
echo "extern crate foo;" > a.rs
rustc a.rs --crate-type=rlib --extern foo=target/debug/libfoo.rlib -L target/debug/deps

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 3, 2025

When you use --extern, all -L flags are actually ignored. So the above should be the same as rustc --crate-type=lib --extern foo=target/debug/libfoo.rlib. This invocation now needs to be rustc --crate-type=lib --extern foo=target/debug/libfoo.rlib --extern foo=target/debug/deps/libfoo-<hash>.rmeta, which works (well, sort of, because this invocation actually fails to compile because you also need to pass the path to empty_library).

We could uplift also the .rmeta file (without the hash) to the output directory, same as cargo does for the .rlib, to make the manual invocation a bit easier.

@ehuss
Copy link
Contributor

ehuss commented Apr 8, 2025

We were wondering if it would be possible to avoid using -Zembed-metadata for rlibs when they are uplifted? This might be a temporary solution until we make a more concrete decision on how we handle those.

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 8, 2025

That should be possible, I can take a look.

I would like to understand the concern better though. Suppose that we make it so that both the .rlib and .rmeta files are uplifted, and any previous manual rustc invocations still work (even if you don't pass --extern=.rmeta, we could do that with a rustc heuristic). Would that resolve the concern? Or do you also consider it problematic that the contents of the uplifted rlib file have changed, if someone was looking into the file manually?

@ehuss
Copy link
Contributor

ehuss commented Apr 8, 2025

The concern is that if someone is using cargo build to obtain the .rlib or .dylib, then they are likely doing something with it, but we don't know for sure what that is. They could be moving the file elsewhere, or doing something with it we're not sure about. If those tools need to also do something with the .rmeta file too, then their tool would get broken (since they aren't expecting a separate file).

Uplifting the .rmeta file is another strong candidate for how to solve this. Assuming someone is just referencing the .rlib via --extern or -L, then I think that should be sufficient. We just have uncertainty if that is all people are doing.

We also have uncertainty about how much this use case should really be supported. To what degree of breakage would we be willing to do?

If it looks like it will be complicated to detect if something is uplifted, then I think I would be fine with just including the .rmeta in the uplift (since that should be easier). I would like to be responsive if anyone reports it as a regression, though.

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 9, 2025

We also have uncertainty about how much this use case should really be supported. To what degree of breakage would we be willing to do?

The way I see it, the contents of the .rlib/.dylib files are unstable and we don't really promise anything about them. Furthermore, if someone uses Cargo, they could depend on the .rlib/dylib file being uplifted, but not on its contents. If someone is using a different build system, then they would have to opt into -Zembed-metadata=no, so this change shouldn't affect them. That being said, I of course acknowledge Hyrum's law and all, so I understand that you're not willing to do this change lightly!

If it looks like it will be complicated to detect if something is uplifted, then I think I would be fine with just including the .rmeta in the uplift (since that should be easier). I would like to be responsive if anyone reports it as a regression, though.

I think that it shouldn't be that complicated, we could not apply the optimization to the roots. However, this will remove the disk savings effect for dylibs (one of the two motivations mentioned in the PR description) that people ship externally. It's also a bit weird that we apply -Zembed-metadata=no based on the fact if the thing you're building is the root unit being build or not.

Note that just uplifting .rmeta wouldn't solve the issue on its own, we would have to add a heuristic to rustc to try to lookup .rmeta in the same directory as .rlib (it currently doesn't do that, so without --extern=.rmeta it wouldn't work).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-build-execution Area: anything dealing with executing the compiler A-cfg-expr Area: Platform cfg expressions A-crate-types Area: crate-type declaration (lib, staticlib, dylib, cdylib, etc.) A-manifest Area: Cargo.toml issues S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants