Commit 7511627

committed
Merge remote-tracking branch 'acrichto/crate-versioning'
2 parents c9c0fdb + 4878811 commit 7511627

1 file changed

+214
-0
lines changed

active/0000-remove-crate-id.md

Lines changed: 214 additions & 0 deletions
@@ -0,0 +1,214 @@
1+
- Start Date: (fill me in with today's date, YYYY-MM-DD)
2+
- RFC PR #: (leave this empty)
3+
- Rust Issue #: (leave this empty)
4+
5+
# Summary
6+
7+
* Remove the `crate_id` attribute and knowledge of versions from rustc.
8+
* Add a `#[crate_name]` attribute similar to the old `#[crate_id]` attribute
9+
* Filenames will no longer have versions, nor will symbols
10+
* A new flag, `--extern`, will be used to override searching for external crates
11+
* A new flag, `-C metadata=foo`, used when hashing symbols
12+
13+
# Motivation
14+
15+
The intent of CrateId and its support has become unclear over time as the
16+
initial impetus, `rustpkg`, has faded. With `cargo` on the horizon,
17+
doubts have been cast on the compiler's support for dealing with crate
18+
versions and friends. The goal of this RFC is to simplify the compiler's
19+
knowledge about the identity of a crate to allow cargo to do all the necessary
20+
heavy lifting.
21+
22+
This new crate identification is designed to not compromise on the usability of
23+
the compiler independent of cargo. Additionally, all use cases supported today
24+
with a CrateId should still be supported.
25+
26+
# Detailed design
27+
28+
A new `#[crate_name]` attribute will be accepted by the compiler, which is the
29+
equivalent of the old `#[crate_id]` attribute, except without the "crate id"
30+
support. This new attribute takes a string value describing a valid crate name.
31+
32+
A crate name must be a valid rust identifier with the exception of allowing the
33+
`-` character after the first character.
34+
35+
```rust
36+
#![crate_name = "foo"]
37+
#![crate_type = "lib"]
38+
39+
pub fn foo() { /* ... */ }
40+
```
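
The crate-name rule above can be sketched as a small check. This is an illustration only: `is_valid_crate_name` is a hypothetical helper, not part of rustc, and it approximates "valid rust identifier" with ASCII letters, digits, and `_`, which is narrower than what the compiler may actually accept.

```rust
/// Returns true if `name` looks like a valid crate name under the rule above.
/// Assumption: identifiers are approximated as ASCII letters, digits and `_`.
fn is_valid_crate_name(name: &str) -> bool {
    let mut chars = name.chars();
    // The first character must start an identifier, so `-` and digits are rejected.
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    // Later characters may be identifier characters or `-`.
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
```

Under this sketch, `super-fast-json` is accepted while `-foo` and the empty string are rejected.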
41+
42+
## Naming library filenames
43+
44+
Currently, rustc creates filenames for libraries following this pattern:
45+
46+
```
47+
lib<name>-<version>-<hash>.rlib
48+
```
49+
50+
The current scheme defines `<hash>` to be the hash of the CrateId value. This
51+
naming scheme achieves a number of goals:
52+
53+
* Libraries of the same name can exist next to one another if they have
54+
different versions.
55+
* Libraries of the same name and version, but from different sources, can exist
56+
next to one another due to having different hashes.
57+
* Rust libraries can have very privileged names such as `core` and `std` without
58+
worrying about polluting the global namespace of other system libraries.
59+
60+
One drawback of this scheme is that the output filename of the compiler is
61+
unknown due to the `<hash>` component. One must query `rustc` itself to
62+
determine the name of the library output.
63+
64+
Under this new scheme, the new output filenames by the compiler would be:
65+
66+
```
67+
lib<name>.rlib
68+
```
69+
70+
Note that both the `<version>` and the `<hash>` are missing by default. The
71+
`<version>` was removed because the compiler no longer knows about the version,
72+
and the `<hash>` was removed to make the output filename predictable.
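
To make the difference between the two patterns concrete, here is a minimal sketch. The function names are hypothetical and the hash value used in the usage note is a placeholder, not the output of any real hashing scheme.

```rust
/// Old scheme: lib<name>-<version>-<hash>.rlib
fn old_rlib_filename(name: &str, version: &str, hash: &str) -> String {
    format!("lib{}-{}-{}.rlib", name, version, hash)
}

/// Proposed scheme: lib<name>.rlib, predictable from the crate name alone.
fn new_rlib_filename(name: &str) -> String {
    format!("lib{}.rlib", name)
}
```

With these helpers, `new_rlib_filename("foo")` yields `libfoo.rlib`, whereas the old form needs a version and a hash that the caller cannot know without asking the compiler.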
73+
74+
The three original goals can still be satisfied with this simplified naming
75+
scheme. As explained in the next section, the compiler's "glob pattern" when
76+
searching for a crate named `foo` will be `libfoo*.rlib`, which will help
77+
rationalize some of these conclusions.
78+
79+
* Libraries of the same name can exist next to one another because they can be
80+
manually renamed to have extra data after the `libfoo`, such as the version.
81+
* Libraries of the same name and version, but different source, can also exist
82+
by modifying what comes after `libfoo`, such as including a hash.
83+
* Rust does not need to occupy a privileged namespace as the default rust
84+
installation would include hashes in all the filenames as necessary. More on
85+
this later.
86+
87+
Additionally, with a predictable filename output external tooling should be
88+
easier to write.
89+
90+
## Loading crates
91+
92+
The goal of the crate loading phase of the compiler is to map a set of `extern
93+
crate` statements to (dylib,rlib) pairs that are present on the filesystem. To
94+
do this, the current system matches dependencies via the CrateId syntax:
95+
96+
```rust
97+
extern crate json = "super-fast-json#0.1.0";
98+
```
99+
100+
In today's compiler, this directive indicates that a filename of the form
101+
`libsuper-fast-json-0.1.0-<hash>.rlib` must be found to be a candidate. Further
102+
checking happens once a candidate is found to ensure that it is indeed a rust
103+
library.
104+
105+
Concerns have been raised that this key point of dependency management is where
106+
the compiler is doing work that is not necessarily its prerogative. In a
107+
cargo-driven world, versions are primarily managed in an external manifest, in
108+
addition to doing other various actions such as renaming packages at compile
109+
time.
110+
111+
One solution would be to add more version management to the compiler, but this
112+
is seen as the compiler delving too far outside what it was initially tasked to
113+
do. With this in mind, this is the new proposal for the `extern crate` syntax:
114+
115+
```rust
116+
extern crate json = "super-fast-json";
117+
```
118+
119+
Notably, the CrateId is removed entirely, along with the version and path
120+
associated with it. The string value of the `extern crate` directive is still
121+
optional (defaulting to the identifier), and the string must be a valid crate
122+
name (as defined above).
123+
124+
The compiler's searching and file matching logic would be altered to only match
125+
crates based on name. If two versions of a crate are found, the compiler will
126+
unconditionally emit an error. It will be up to the user to move the two
127+
libraries on the filesystem and control the `-L` flags to the compiler to enable
128+
disambiguation.
129+
130+
This implies that when the compiler is searching for the crate named `foo`, it
131+
will search all of the lookup paths for files which match the pattern
132+
`libfoo*.{so,rlib}`. This is likely to return many false positives, but they
133+
will be easily weeded out once the compiler realizes that there is no metadata
134+
in the library.
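
The filename filter described above might look roughly like the following. This is a sketch, not rustc's implementation: `is_candidate` and `candidates` are hypothetical names, only the `.so` and `.rlib` extensions from the pattern are considered, and the later metadata check that weeds out false positives is not modeled.

```rust
/// Returns true if `file_name` matches the pattern `lib<crate_name>*.{so,rlib}`.
fn is_candidate(file_name: &str, crate_name: &str) -> bool {
    let prefix = format!("lib{}", crate_name);
    // Strip the `lib<name>` prefix, then require one of the two extensions
    // somewhere at the end of what remains.
    match file_name.strip_prefix(prefix.as_str()) {
        Some(rest) => rest.ends_with(".so") || rest.ends_with(".rlib"),
        None => false,
    }
}

/// Filters a directory listing down to the files worth inspecting further.
fn candidates<'a>(file_names: &[&'a str], crate_name: &str) -> Vec<&'a str> {
    file_names
        .iter()
        .copied()
        .filter(|f| is_candidate(f, crate_name))
        .collect()
}
```

Note that `libfoobar.rlib` passes this filter when searching for `foo`, which is exactly the kind of false positive the metadata check is expected to reject.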
135+
136+
This scheme is strictly less powerful than the previous, but it moves a good
137+
deal of logic from the compiler to cargo.
138+
139+
### Manually specifying dependencies
140+
141+
Cargo is often seen as "expert mode" in its usage of the compiler. Cargo will
142+
always have prior knowledge about what exact versions of a library will be used
143+
for any particular dependency, as well as where the outputs are located.
144+
145+
If the compiler provided no support for loading crates beyond matching
146+
filenames, it would limit many of cargo's use cases. For example, cargo could
147+
not compile a crate with two different versions of an upstream crate.
148+
Additionally, cargo could not substitute `libfast-json` for `libslow-json` at
149+
compile time (assuming they have the same API).
150+
151+
To accommodate an "expert mode" in rustc, the compiler will grow a new command
152+
line flag of the form:
153+
154+
```
155+
--extern json=path/to/libjson
156+
```
157+
158+
This directive will indicate that the library `json` can be found at
159+
`path/to/libjson`. The file extension is not specified, and it is assumed that
160+
the rlib/dylib pair are located next to one another at this location (`libjson`
161+
is the file stem).
162+
163+
This will enable cargo to drive how the compiler loads crates by manually
164+
specifying where files are located and exactly what corresponds to what.
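
As an illustration of how a `name=stem` value could be interpreted, consider the sketch below. `ExternSpec` and `parse_extern` are hypothetical names, and the `.so` extension is an assumption borrowed from the glob pattern earlier in this document; the dylib extension differs by platform.

```rust
use std::path::PathBuf;

/// The crate name plus the rlib/dylib pair assumed to sit next to each other.
#[derive(Debug, PartialEq)]
struct ExternSpec {
    name: String,
    rlib: PathBuf,
    dylib: PathBuf,
}

/// Parses the value of `--extern`, e.g. `json=path/to/libjson`.
/// Returns `None` if there is no `=` or either side is empty.
fn parse_extern(arg: &str) -> Option<ExternSpec> {
    let (name, stem) = arg.split_once('=')?;
    if name.is_empty() || stem.is_empty() {
        return None;
    }
    Some(ExternSpec {
        name: name.to_string(),
        // The file stem is given without an extension, so append one for each
        // half of the pair (assumption: `.so` for the dylib).
        rlib: PathBuf::from(format!("{}.rlib", stem)),
        dylib: PathBuf::from(format!("{}.so", stem)),
    })
}
```

For `json=path/to/libjson`, this yields the pair `path/to/libjson.rlib` and `path/to/libjson.so`.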
165+
166+
## Symbol mangling
167+
168+
Today, mangled symbols contain the version number at the end of the symbol
169+
itself. This was originally intended to tie into Linux's ability to version
170+
symbols, but in retrospect this is generally viewed as over-ambitious as the
171+
support is not currently there, nor does it work on Windows or OS X.
172+
173+
Symbols would no longer contain the version number anywhere within them. The
174+
hash at the end of each symbol would only include the crate name and metadata
175+
from the command line. Metadata from the command line will be passed via a new
176+
command line flag, `-C metadata=foo`, which specifies a string to hash.
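
The idea of hashing only the crate name and the command-line metadata can be sketched as follows. This RFC does not specify a hash algorithm, so the use of the standard library's `DefaultHasher` here is purely an assumption for illustration, and `symbol_hash` is a hypothetical name.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Combines the crate name with every `-C metadata=...` string into one hash.
/// Assumption: `DefaultHasher` stands in for whatever the compiler really uses.
fn symbol_hash(crate_name: &str, metadata: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    crate_name.hash(&mut hasher);
    for item in metadata {
        item.hash(&mut hasher);
    }
    hasher.finish()
}
```

Two builds of a crate named `foo` with different metadata strings would then produce different symbol hashes, which is what lets an external tool keep otherwise identically named crates apart.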
177+
178+
## The standard rust distribution
179+
180+
The standard distribution would continue to put hashes in filenames manually
181+
because the libraries are intended to occupy a privileged space on the system.
182+
The build system would manually move a file after it was compiled to the correct
183+
destination filename.
184+
185+
# Drawbacks
186+
187+
* The compiler is able to operate fairly well independently of cargo today, and
188+
this scheme would hamstring the compiler by limiting the number of "it just
189+
works" use cases. If cargo is not being used, build systems will likely have
190+
to start using `--extern` to specify dependencies if name conflicts or version
191+
conflicts arise between crates.
192+
193+
* This scheme still has redundancy in the list of dependencies with the external
194+
cargo manifest. The source code would no longer list versions, but the cargo
195+
manifest will contain the same identifier for each dependency that the source
196+
code will contain.
197+
198+
# Alternatives
199+
200+
* The compiler could go in the opposite direction of this proposal, enhancing
201+
`extern crate` instead of simplifying it. The compiler could learn about
202+
things like version ranges and friends, while still maintaining flags to fine
203+
tune its behavior. It is unclear whether this increase in complexity will be
204+
paired with a large enough gain in usability of the compiler independent of
205+
cargo.
206+
207+
# Unresolved questions
208+
209+
* An implementation for the more advanced features of cargo does not currently
210+
exist, so it is unknown whether `--extern` will be powerful enough for cargo
211+
to satisfy all its use cases.
212+
213+
* Are the string literal parts of `extern crate` justified? Allowing a string
214+
literal just for the `-` character may be overkill.
