Add support for -Zembed-metadata
#15378
base: master
Conversation
I have already discussed this with Ed, so: r? @epage
One of the concerns we had about this is how cargo sometimes "uplifts" rlibs to the output directory. I'm not really sure it's a good thing that cargo does this, but it is historical baggage that we carry. If I understand correctly, with this change those rlibs would not be usable. I think we'll need to examine what the best path forward would be.
The uplifted copy alone already wouldn't be usable if the rlib has any dependencies.
I believe there are some users who use […]
That should be fine. As long as the .rmeta file is next to the .rlib file in the deps directory (which it should, as Cargo AFAIK always passes […]
Try this:
When you use […] We could also uplift the […]
We were wondering if it would be possible to avoid using […]
That should be possible, I can take a look. I would like to understand the concern better though. Suppose that we make it so that both the .rlib and .rmeta files are uplifted, and any previous manual rustc invocations still work (even if you don't pass […]
The concern is that if someone is using […] Uplifting the […] We also have uncertainty about how much this use case should really be supported. What degree of breakage would we be willing to accept? If it looks like it will be complicated to detect if something is uplifted, then I think I would be fine with just including the […]
The way I see it, the contents of the […]
I think that it shouldn't be that complicated, we could not apply the optimization to the roots. However, this will remove the disk savings effect for dylibs (one of the two motivations mentioned in the PR description) that people ship externally. It's also a bit weird that we apply […] Note that just uplifting […]
What does this PR try to resolve?
This PR adds Cargo integration for the new unstable `-Zembed-metadata` rustc flag, which was implemented in rust-lang/rust#137535 (tracking issue).
This rustc flag can reduce disk usage of compiled artifacts, and also the size of Rust dynamic library artifacts shipped to users. However, it is not enough to just pass this flag through `RUSTFLAGS`; it needs to be integrated within Cargo, because it interacts with how the `--emit` flag is passed to rustc, and also how `--extern` args are passed to the final linked artifact built by Cargo. Furthermore, using the flag for all crates in a crate graph compiled by Cargo would be suboptimal (this will all be described below).
When you pass `-Zembed-metadata=no` to rustc, it will not store Rust metadata into the compiled artifact. This is important when compiling libs/rlibs/dylibs, since it reduces their size on disk. However, this also means that every time we use this flag, we have to make sure that we also:
- Include `metadata` in the `--emit` flag to generate a `.rmeta` file, otherwise no metadata would be generated whatsoever, which would mean that the artifact wouldn't be usable as a dependency.
- Pass `--extern <dep>=<path>.rmeta` when compiling the final linkable artifact. Before, Cargo would only pass `--extern <dep>=<path>.[rlib|so|dll]`. Since with `-Zembed-metadata=no` the metadata is only in the `.rmeta` file and not in the rlib/dylib, this is needed to help rustc find out where the metadata lies.
The two points above are what this PR implements, and why this rustc flag needs Cargo integration.
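The second point can be sketched in a few lines of Rust. This is only an illustration, not Cargo's actual code: `extern_args` is a hypothetical helper, and it assumes that the `.rmeta` file sits next to the `.rlib` with the same file stem, and that the `.rmeta` path is passed in addition to (not instead of) the rlib path.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of Cargo): builds the `--extern` arguments
/// for one rlib dependency of the final linkable artifact.
fn extern_args(dep_name: &str, rlib: &Path, split_metadata: bool) -> Vec<String> {
    // The linkable artifact itself is always passed.
    let mut args = vec![
        "--extern".to_string(),
        format!("{}={}", dep_name, rlib.display()),
    ];
    if split_metadata {
        // With `-Zembed-metadata=no` the rlib carries no metadata, so rustc
        // also has to be told where the sibling `.rmeta` file is.
        // Assumption: same directory and file stem as the rlib.
        let rmeta: PathBuf = rlib.with_extension("rmeta");
        args.push("--extern".to_string());
        args.push(format!("{}={}", dep_name, rmeta.display()));
    }
    args
}

fn main() {
    // The path and hash below are made up for the example.
    let rlib = Path::new("target/debug/deps/libfoo-1234.rlib");
    for arg in extern_args("foo", rlib, true) {
        println!("{arg}");
    }
}
```

With `split_metadata` set to `false`, the helper produces only the single rlib argument, which matches the previous behavior described above.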
The `-Zembed-metadata` flag is only passed to libs, rlibs and dylibs. It does not seem to make sense for other crate types. The one situation where it might make sense is proc macros, but according to @bjorn3 (who initially came up with the idea for `-Zembed-metadata`), it isn't really worth it.
Here is a table that summarizes the changes in passed flags and generated files on disk for rlibs and dylibs:

| Crate type | | Flags | Generated files |
|---|---|---|---|
| rlib | Before | `--emit=dep-info,metadata,link` | `.rlib` (with metadata), `.rmeta` (for pipelining) |
| rlib | After | `--emit=dep-info,metadata,link -Zembed-metadata=no` | `.rlib` (without metadata), `.rmeta` (for metadata/pipelining) |
| dylib | Before | `--emit=dep-info,link` | `.so`/`.dll` (with metadata) |
| dylib | After | `--emit=dep-info,metadata,link -Zembed-metadata=no` | `.so`/`.dll` (without metadata), `.rmeta` |

Behavior for other target kinds/crate types should be unchanged.
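The table can also be read as a small lookup function. The sketch below just restates it in Rust; `CrateType`, `Plan` and `plan` are names invented for this example and do not correspond to Cargo internals.

```rust
/// Crate types covered by the table (hypothetical enum for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum CrateType {
    Rlib,
    Dylib,
}

/// Flags passed to rustc and files expected on disk (hypothetical type).
#[derive(Debug, PartialEq)]
struct Plan {
    rustc_flags: Vec<&'static str>,
    outputs: Vec<&'static str>,
}

/// Restates the table: `split_metadata == true` is the "After" column.
fn plan(crate_type: CrateType, split_metadata: bool) -> Plan {
    match (crate_type, split_metadata) {
        (CrateType::Rlib, false) => Plan {
            rustc_flags: vec!["--emit=dep-info,metadata,link"],
            outputs: vec![".rlib (with metadata)", ".rmeta (for pipelining)"],
        },
        (CrateType::Rlib, true) => Plan {
            rustc_flags: vec!["--emit=dep-info,metadata,link", "-Zembed-metadata=no"],
            outputs: vec![".rlib (without metadata)", ".rmeta (for metadata/pipelining)"],
        },
        (CrateType::Dylib, false) => Plan {
            // Previously no `.rmeta` was emitted for dylibs.
            rustc_flags: vec!["--emit=dep-info,link"],
            outputs: vec![".so/.dll (with metadata)"],
        },
        (CrateType::Dylib, true) => Plan {
            rustc_flags: vec!["--emit=dep-info,metadata,link", "-Zembed-metadata=no"],
            outputs: vec![".so/.dll (without metadata)", ".rmeta"],
        },
    }
}

fn main() {
    for ty in [CrateType::Rlib, CrateType::Dylib] {
        for split in [false, true] {
            println!("{:?} split={}: {:?}", ty, split, plan(ty, split));
        }
    }
}
```

Note that the dylib row is the only one where the `--emit` kinds themselves change, which is why the dylib case gets its own test.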
From the table above, we can see two benefits of using `-Zembed-metadata=no`:
- Reduced disk usage for rlibs: the metadata is stored only once (in the `.rmeta` file), and no longer also in the `.rlib`.
- Dynamic libraries (`.so`/`.dll`) can be shipped to users without also shipping the metadata. This would slightly reduce e.g. the size of the shipped rustc toolchains (note that the size reduction here is after the toolchain has been already heavily compressed).
Note that if this behavior ever becomes the default, it should be possible to simplify the code quite a bit, and essentially merge the `requires_upstream_objects` and `benefits_from_split_metadata` functions.
I did a very simple initial benchmark to evaluate the space savings on cargo itself and hyperqueue (a mid-size crate from my work) using `cargo build` and `cargo build --release` with and without `-Zembed-metadata=no`.
For debug/incremental builds, the effect is smaller, as the artifact disk usage is dwarfed by incremental artifacts and debuginfo. But for (non-incremental) release builds, disk usage (and the number of performed I/O operations) is significantly reduced.
How should we test and review this PR?
I wrote two basic tests. The second one tests a situation where a crate depends on a dylib dependency, which is quite rare, but the behavior of this has actually changed in this PR (see comparison table above). Testing this on various real-world projects (or even trying to enable it by default across the whole Cargo suite?) might be beneficial.
Unresolved questions
Should we gate this behind an unstable Cargo flag?
Originally, I wanted to always apply `-Zembed-metadata=no` in nightly cargo/rustc (which is what this PR does). But after implementing it, I wonder if we should perhaps add an unstable flag for it, because:
- […] `-Zembed-metadata`.
- […] `-Zembed-metadata` flag. If we find some issues and we need to change the behavior, we might have to also make some changes to Cargo. Without making the change at the same time in rustc and Cargo (which we cannot really do currently, because Cargo is not a subtree), it could break nightly toolchains.
Is this a breaking change?
This question only becomes relevant once we start doing this by default on stable (for nightly users, it is relevant immediately if we don't use a cargo flag for it).
With this new behavior, dylibs and rlibs will no longer contain metadata. If they are compiled with Cargo, that shouldn't matter, but other build systems might have to adapt.
Should this become the default?
I think that in terms of disk size usage and performed I/O operations, it is a pure win. It should either generate less disk data (for rlibs) or the ~same amount of data for dylibs (the data will be a bit larger, because the dylib will still contain a metadata stub header, but that's like 50 bytes and doesn't scale with the size of the dylib, so it's negligible).
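The size argument can be written out as simple arithmetic. The sketch below is a simplified model, not a measurement: the byte counts are invented, and it assumes that the `.rmeta` file is as large as the metadata that would otherwise be embedded, and that only dylibs keep a stub header.

```rust
/// Hypothetical sizes in bytes for one crate.
struct Sizes {
    code: u64,     // object code in the rlib/dylib
    metadata: u64, // Rust metadata (embedded copy or `.rmeta` file)
    stub: u64,     // metadata stub header left in a dylib
}

/// Disk usage of an rlib plus its `.rmeta` file.
fn rlib_disk_usage(s: &Sizes, split_metadata: bool) -> u64 {
    if split_metadata {
        // `.rlib` without metadata + `.rmeta`
        s.code + s.metadata
    } else {
        // `.rlib` with metadata + `.rmeta` for pipelining
        (s.code + s.metadata) + s.metadata
    }
}

/// Disk usage of a dylib plus its `.rmeta` file (if any).
fn dylib_disk_usage(s: &Sizes, split_metadata: bool) -> u64 {
    if split_metadata {
        // dylib with only a stub header + separate `.rmeta`
        (s.code + s.stub) + s.metadata
    } else {
        // dylib with embedded metadata, no `.rmeta`
        s.code + s.metadata
    }
}

fn main() {
    // Made-up numbers; the stub size follows the "like 50 bytes" estimate.
    let s = Sizes { code: 1_000_000, metadata: 200_000, stub: 50 };
    println!("rlib:  {} -> {}", rlib_disk_usage(&s, false), rlib_disk_usage(&s, true));
    println!("dylib: {} -> {}", dylib_disk_usage(&s, false), dylib_disk_usage(&s, true));
}
```

Under these assumptions the rlib case saves exactly one copy of the metadata, while the dylib case grows only by the constant stub size, independent of the size of the dylib.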
So I think that eventually, we should just do this by default in Cargo, unless some concerns are found. I suppose that before stabilizing we should also benchmark the effect on compilation performance.