Commit 84dd8aa

Merge pull request #49 from alexcrichton/pipelining-wg
Add a pipelining WG
2 parents 3b404d5 + 2c7d41e commit 84dd8aa

5 files changed

+273
-4
lines changed

README.md

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -58,6 +58,7 @@ Name | Status | Short
58 58
[Parallel-rustc](working-groups/parallel-rustc/) | Active | Making parallel compilation the default for rustc | [#t-compiler/wg-parallel-rustc][parallel-rustc_stream]
59 59
[Profile-Guided Optimization](working-groups/pgo/) | Active | Implementing profile-guided optimization for rustc | [#t-compiler/wg-profile-guided-optimization][pgo_stream]
60 60
[MIR Optimizations](working-groups/mir-opt/) | Active | Write MIR optimizations and refactor the MIR to be more optimizable. | [#t-compiler/wg-mir-opt][mir-opt-stream]
61+
[Rustc pipelining](working-groups/pipelining/) | Active | Enable Cargo to invoke rustc in a pipelined fashion, speeding up crate graph compiles. | [#t-compiler/wg-pipelining][pipelining-stream]
61 62

62 63
[nikomatsakis]: https://github.com/nikomatsakis
63 64
[cramertj]: https://github.com/cramertj
@@ -84,6 +85,7 @@ Name | Status | Short
84 85
[parallel-rustc_stream]: https://rust-lang.zulipchat.com/#narrow/stream/187679-t-compiler.2Fwg-parallel-rustc
85 86
[rfc-2229-stream]: https://rust-lang.zulipchat.com/#narrow/stream/189812-t-compiler.2Fwg-rfc-2229
86 87
[mir-opt-stream]: https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt
88+
[pipelining-stream]: https://rust-lang.zulipchat.com/#narrow/stream/195180-t-compiler.2Fwg-pipelining
87 89

88 90
## Expert Map
89 91

about/triage-meeting.md

Lines changed: 6 additions & 4 deletions
@@ -15,10 +15,11 @@ This section contains the scheduled check-ins for working groups:
15 15
- **2019-03-21:** [wg-nll], [wg-traits]
16 16
- **2019-03-28:** [wg-parallel-rustc], [wg-pgo]
17 17
- **2019-04-04:** [wg-rls-2.0], [wg-meta]
18-
- **2019-04-11:** [wg-mir-opt], [wg-async-await]
19-
- **2019-04-18:** [wg-llvm], [wg-self-profile]
20-
- **2019-04-25:** [wg-rfc-2229], [wg-rls-2.0]
21-
- **2019-05-02:** [wg-meta], [wg-nll]
18+
- **2019-04-11:** [wg-mir-opt], [wg-pipelining]
19+
- **2019-04-18:** [wg-llvm], [wg-async-await]
20+
- **2019-04-25:** [wg-rfc-2229], [wg-self-profile]
21+
- **2019-05-02:** [wg-meta], [wg-rls-2.0]
22+
- **2019-05-09:** [wg-nll]
22 23

23 24
Looking for a meeting that isn't listed above? Make a PR and extend the list to include that
24 25
meeting.
@@ -41,3 +42,4 @@ This section lists check-ins from triage meetings before the check-in schedule w
41 42
[wg-parallel-rustc]: ../working-groups/parallel-rustc
42 43
[wg-pgo]: ../working-groups/pgo
43 44
[wg-mir-opt]: ../working-groups/mir-opt
45+
[wg-pipelining]: ../working-groups/pipelining

working-groups/pipelining/FAQ.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
# Frequently Asked Questions (FAQ)
2+
3+
Looks like no questions have been asked yet! If you have a question, feel free to file an issue or ask in the working group's Zulip stream.

working-groups/pipelining/NOTES.md

Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
1+
# pipelining Meeting Notes
2+
3+
## Video 2019-04-05
4+
5+
[@alexcrichton] and [@nnethercote] met on video for ~30m and talked about various
6+
aspects of implementing pipelining in the compiler.
7+
8+
[@alexcrichton]: https://github.com/alexcrichton
9+
[@nnethercote]: https://github.com/nnethercote
10+
11+
#### What are metadata/rlibs?
12+
13+
First we talked a bit about what rlibs/metadata files are and how this is all
14+
going to be put together. The recap is:
15+
16+
* Rustc can produce metadata files (`--emit metadata`). These metadata files are
17+
like header files for Rust crates. They're internally a compiler-specific
18+
binary format and cannot be inspected.
19+
20+
* Rustc can also produce rlibs (`--emit link --crate-type lib`). An rlib is an
21+
archive (a `*.a` file) which contains three things:
22+
23+
  * Object code (`*.o`)
24+
  * Compressed bytecode (`*.bc.z`)
25+
  * Metadata (`metadata.bin`)
26+
27+
  The metadata included here is the same as what `--emit metadata` produces.
28+
29+
When you type `cargo build`, Cargo is likely building an `rlib` for almost all
30+
library crates in use. When you type `cargo check` Cargo produces metadata files
31+
for all crates.
32+
33+
#### How are we going to pipeline?
34+
35+
The goal of pipelining is similar to CPU instruction pipelining, which is to
36+
fill up available hardware with as much work as possible. This can increase
37+
overall throughput without actually speeding up the intermediate operations. For
38+
example let's say your compile looks like this:
39+
40+
41+
```
42+
        meta                meta
43+
[-libA----|--------][-libB----|--------][-binary-----------]
44+
0s        5s        10s       15s       20s              30s
45+
```
46+
47+
Here we have a `binary` which depends on `libB` which depends on `libA`. The
48+
whole compile currently takes 30s, but as noted here the metadata files for
49+
libraries are available before the compilation is finished.
50+
51+
Fundamentally all rustc needs to produce an rlib is the `metadata.bin` file from
52+
upstream crates. In other words, to compile `libB`, all we need is the metadata
53+
from `libA`, not the entire rlib. We can theoretically restructure the
54+
compilation like so:
55+
56+
```
57+
[-libA----|--------]
58+
          [-libB----|--------]
59+
                    [-binary-----------]
60+
0s        5s        10s       15s    20s
61+
```
62+
63+
By starting subsequent compilations as soon as metadata is available, we shaved
64+
10 seconds off this compilation. We also did that for free! Furthermore
65+
we're able to use 2 parallel rustc processes at times instead of having
66+
everything be serial.
67+
68+
69+
#### Compromise: linking is hard
70+
71+
Although we can shave 10s off compilation as shown above, it's likely going to
72+
be very difficult to get the full wins there. There's a caveat when compiling
73+
`binary` that we do actually need the `*.rlib` files to link. We don't need them
74+
to typecheck and such, but the linking phase needs them.
75+
76+
Now linking is the final stage of the compiler, so it's only very late that we
77+
end up needing all of the dependencies. This would require some degree of
78+
synchronization still, though, where rustc needs to know it cannot proceed until
79+
Cargo instructs it to.
80+
81+
As a result, the current thinking is to compromise here and simply ignore
82+
pipelining for "linkable" crates. Crates that produce binaries, dylibs,
83+
proc-macros, etc, will all wait for all their dependencies to finish before
84+
proceeding, even if they could get some work done ahead of time. As a result the
85+
target compilation timeline for our example above looks like:
86+
87+
```
88+
[-libA----|--------]
89+
          [-libB----|--------]
90+
                              [-binary-----------]
91+
0s        5s        10s       15s              25s
92+
```
93+
94+
but we're still saving time! Typically a dependency graph in Rust is far deeper
95+
than three crates, so the compile time wins are expected to be much larger.
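
The three timelines above can be reproduced with a little arithmetic. The following is only a sketch under simplifying assumptions: a purely linear dependency chain, fixed per-crate timings, and unlimited parallelism. The `Crate`, `Strategy`, `finish_time`, and `example_chain` names are made up for illustration and are not part of rustc or Cargo.

```rust
/// One crate in a linear dependency chain: each crate depends on the one before it.
struct Crate {
    meta_secs: u32,  // time from start until its metadata is written
    total_secs: u32, // time from start until its final artifact is written
    linkable: bool,  // binary/dylib/proc-macro: needs the upstream rlibs to link
}

#[derive(Clone, Copy)]
enum Strategy {
    Serial,         // today: wait for every upstream crate to finish
    FullPipelining, // idealized: start once upstream metadata exists
    SkipLinkable,   // compromise: pipeline rlibs, but linkable crates wait
}

/// Wall-clock seconds until the last crate in the chain finishes.
fn finish_time(chain: &[Crate], strategy: Strategy) -> u32 {
    let mut upstream_meta = 0; // when the previous crate's metadata is ready
    let mut upstream_done = 0; // when every earlier crate has fully finished
    for c in chain {
        let start = match strategy {
            Strategy::Serial => upstream_done,
            Strategy::FullPipelining => upstream_meta,
            Strategy::SkipLinkable if c.linkable => upstream_done,
            Strategy::SkipLinkable => upstream_meta,
        };
        upstream_meta = start + c.meta_secs;
        upstream_done = upstream_done.max(start + c.total_secs);
    }
    upstream_done
}

/// The libA -> libB -> binary example from the notes (5s to metadata, 10s total).
fn example_chain() -> Vec<Crate> {
    vec![
        Crate { meta_secs: 5, total_secs: 10, linkable: false },
        Crate { meta_secs: 5, total_secs: 10, linkable: false },
        Crate { meta_secs: 10, total_secs: 10, linkable: true },
    ]
}
```

Calling `finish_time(&example_chain(), ...)` with `Strategy::Serial`, `Strategy::FullPipelining`, and `Strategy::SkipLinkable` yields 30, 20, and 25 seconds respectively, matching the diagrams. Adding more non-linkable crates to the chain widens the gap between the serial and pipelined results.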
96+
97+
#### Step 1: What architecture is used to pipeline rustc?
98+
99+
The first thing we then talked about was how rustc was going to be invoked in a
100+
pipelined fashion. There were two primary candidates we figured could be
101+
implemented:
102+
103+
##### (a) Run rustc twice
104+
105+
One option is to literally run `rustc --emit metadata foo.rs` and then
106+
subsequently execute `rustc --emit link foo.rs`. The second command is in theory
107+
accelerated by incremental compilation artifacts produced by the first command.
108+
109+
**Pros**:
110+
111+
* Feels "pure" from a build system perspective as it keeps rustc in line with
112+
basically all other build tools: you run it to completion and don't care about
113+
what happens in the middle.
114+
115+
**Cons**:
116+
117+
* We're unlikely to reap full benefits from this strategy. The second `rustc`
118+
command has to redo quite a bit of work to get back to the point the first
119+
command was at, and it's not an instantaneous piece of work even with
120+
incremental. As a result this may run a risk of slowing down compiles because
121+
the second command takes so long to start up.
122+
123+
##### (b) Signal Cargo when metadata is ready
124+
125+
The second option is for rustc to continue in-process after it produces metadata
126+
and go on to produce the final rlib. The compiler would, however, send a signal
127+
to Cargo (somehow) that metadata is ready to go.
128+
129+
**Pros**:
130+
131+
* This should get us the full speed of pipelined compilation. There's no
132+
"startup time" for the work involved in producing the rlib since it's already
133+
all in-process in rustc.
134+
135+
**Cons**:
136+
137+
* This is going to be significantly more difficult for other build systems to
138+
integrate with (those that aren't Cargo).
139+
140+
Overall we decided that this option was the route to pursue due to the speed
141+
wins likely to be gained.
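
To make the shape of option (b) concrete, here is a rough sketch that models each rustc invocation as a thread and the "signal" as a channel message. This is purely an analogy: real rustc invocations are separate processes, and the `Event`, `compile`, and `drive` names are hypothetical. The point is only that the worker keeps running after it reports metadata, while the driver starts the dependent crate right away.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Messages a stand-in "rustc" reports to the stand-in "Cargo" (hypothetical).
#[derive(Debug, PartialEq)]
enum Event {
    MetadataReady(&'static str),
    Finished(&'static str),
}

/// Stand-in for one rustc invocation: write metadata, signal, then keep going.
fn compile(name: &'static str, tx: Sender<Event>) {
    // ... type checking would happen here, ending with the metadata file ...
    tx.send(Event::MetadataReady(name)).unwrap();
    // ... codegen continues in the same "process", with no restart cost ...
    tx.send(Event::Finished(name)).unwrap();
}

/// Stand-in for Cargo: start libB as soon as libA's metadata is ready.
fn drive() -> Vec<Event> {
    let (tx, rx) = channel();
    let mut handles = vec![thread::spawn({
        let tx = tx.clone();
        move || compile("libA", tx)
    })];
    // The remaining sender is handed to libB, so the loop below ends once
    // both workers have finished and dropped their senders.
    let mut spare_tx = Some(tx);
    let mut log = Vec::new();
    for event in rx {
        if event == Event::MetadataReady("libA") {
            if let Some(tx) = spare_tx.take() {
                handles.push(thread::spawn(move || compile("libB", tx)));
            }
        }
        log.push(event);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    log
}
```

The log returned by `drive()` always begins with `MetadataReady("libA")`, but the relative order of libA finishing and libB making progress varies from run to run, which is exactly the overlap pipelining is after.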
142+
143+
#### Step 2: work with only metadata as input
144+
145+
@alexcrichton claimed that rustc cannot produce an rlib today with only
146+
`*.rmeta` files as input. After some testing, it was found that this was a false
147+
claim. Invocations like so can produce working rlibs:
148+
149+
150+
```
151+
$ rustc libA.rs --emit metadata,link --crate-type lib
152+
$ rm liblibA.rlib
153+
$ rustc libB.rs --emit metadata,link --crate-type lib --extern libA=liblibA.rmeta
154+
```
155+
156+
So that means this step is already done! The compiler is already capable of
157+
implementing the pipelining showed above where it can be invoked in parallel by
158+
Cargo.
159+
160+
#### Step 3: telling Cargo when metadata is ready
161+
162+
The next (and final) piece of implementation needed in rustc is that the
163+
compiler has to somehow tell Cargo when metadata is available on the filesystem.
164+
Cargo needs some mechanism to know when to start spawning more rustc processes
165+
(if possible), and it currently has none without watching the filesystem.
166+
167+
There are two primary ways we could implement this:
168+
169+
##### (a) Use a TCP server
170+
171+
A simple option would be for Cargo to start a small TCP server locally whenever
172+
it builds. The compiler would then connect to this server whenever a metadata
173+
file is ready to go and tell Cargo that it can proceed.
174+
175+
**Pros**:
176+
177+
* Relatively simple to implement in Cargo and rustc
178+
* Should work on all platforms
179+
180+
**Cons**:
181+
182+
* The compiler has to somehow tell Cargo which compiler it is (disambiguating
183+
from other parallel invocations)
184+
* This is a weird interface without really much precedent. It's unclear how
185+
other build systems would take advantage of it easily. It just "feels wrong"
186+
and "icky".
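
For a sense of what the TCP option would involve, here is a minimal sketch using only the standard library. It assumes Cargo binds an ephemeral loopback port and hands the address to each compiler, and that the compiler identifies itself with a crate id in a one-line message. The `metadata-ready <id>` line format and both function names are invented for this example. Compilers are modeled as threads to keep the sketch self-contained.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// "rustc" side: connect to the address Cargo handed over and announce which
/// crate's metadata is ready. The line format is hypothetical.
fn notify_metadata_ready(addr: SocketAddr, crate_id: &str) -> std::io::Result<()> {
    let mut stream = TcpStream::connect(addr)?;
    writeln!(stream, "metadata-ready {}", crate_id)
}

/// "Cargo" side: bind a local port, let each compiler report in, and collect
/// the crate ids that were announced (sorted for a stable result).
fn collect_notifications(crates: &[&'static str]) -> std::io::Result<Vec<String>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let workers: Vec<_> = crates
        .iter()
        .map(|&id| thread::spawn(move || notify_metadata_ready(addr, id)))
        .collect();
    let mut ready = Vec::new();
    for _ in crates {
        let (stream, _) = listener.accept()?;
        let mut line = String::new();
        BufReader::new(stream).read_line(&mut line)?;
        // The id in the message is what disambiguates parallel invocations.
        if let Some(id) = line.trim_end().strip_prefix("metadata-ready ") {
            ready.push(id.to_string());
        }
    }
    for worker in workers {
        worker.join().unwrap()?;
    }
    ready.sort();
    Ok(ready)
}
```

Note that the crate id has to travel inside the message itself, which is the disambiguation problem listed in the cons above.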
187+
188+
##### (b) Print a JSON message when metadata is ready
189+
190+
An alternative solution proposed by @ehuss is that the compiler could print a
191+
message on stdout/stderr to Cargo whenever a file has been produced. Cargo
192+
already does this, for example, when invoked with `--message-format=json`. The
193+
compiler already emits errors as JSON blobs with `--error-format=json`, although
194+
the compiler doesn't emit other information via JSON right now.
195+
196+
**Pros**:
197+
198+
* Feels like a clean solution. No need for Cargo to figure out which rustc is
199+
printing what (it knows that from which process the output came).
200+
* Pretty easy to implement in rustc, just another JSON message somewhere.
201+
* Should be somewhat usable by other build systems as it's pretty standard to
202+
listen to stderr/stdout from spawned processes.
203+
204+
**Cons**:
205+
206+
* Cargo would have to always invoke the compiler with `--error-format=json`.
207+
Cargo does not currently do this to ensure that compiler error diagnostics are
208+
rendered to the screen correctly (aka are colorized and formatted correctly).
209+
A [recent PR to rustc](https://github.com/rust-lang/rust/pull/59128) shows
210+
hope for Cargo to be able to do this, although it may take time to implement
211+
and stabilize that. This would become a required blocker to enabling pipelined
212+
compilation.
213+
* A JSON message format for rustc would need to be designed. There's no
214+
precedent to draw from in rustc yet to emit arbitrary JSON messages about
215+
progress so far. There's likely some desire to do so though!
216+
217+
We decided this is the route to go as it seems the most viable for
218+
stabilization.
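
As an illustration of how a build tool might consume such a message, the sketch below scans a compiler's output line by line. The message format had not been designed when these notes were written, so the `{"artifact":"<path>","emit":"metadata"}` shape used here is purely an assumption, as are the function names. The string matching is deliberately naive (fixed key order, no escape handling); a real consumer would use a proper JSON parser.

```rust
use std::io::BufRead;

/// Pull the artifact path out of a line shaped like
/// `{"artifact":"<path>","emit":"metadata"}` (hypothetical message shape).
fn metadata_artifact(line: &str) -> Option<&str> {
    let rest = line.trim().strip_prefix("{\"artifact\":\"")?;
    let (path, tail) = rest.split_once('"')?;
    if tail == ",\"emit\":\"metadata\"}" {
        Some(path)
    } else {
        None
    }
}

/// Scan a compiler's output stream and return the first metadata artifact
/// it announces, ignoring diagnostics and any other lines.
fn wait_for_metadata<R: BufRead>(output: R) -> Option<String> {
    for line in output.lines() {
        let line = line.ok()?;
        if let Some(path) = metadata_artifact(&line) {
            return Some(path.to_string());
        }
    }
    None
}
```

Because each spawned compiler has its own output pipe, the reader already knows which invocation a message belongs to, so no crate id is needed in the message.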

working-groups/pipelining/README.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
1+
# pipelining Working Group
2+
![working group status: active][status]
3+
4+
- **Leads:** [@alexcrichton][alexcrichton], [@nnethercote][nnethercote]
5+
- **Meeting Notes:** [All](NOTES.md)
6+
7+
[nnethercote]: https://github.com/nnethercote
8+
[alexcrichton]: https://github.com/alexcrichton
9+
[status]: https://img.shields.io/badge/status-active-brightgreen.svg?style=for-the-badge
10+
11+
## What is the goal of this working group?
12+
This working group aims to accomplish the following:
13+
14+
- Enable Cargo to invoke rustc in a pipelined fashion, specifically enabling
15+
Cargo to start compiling dependants of a crate as soon as metadata becomes
16+
available, rather than waiting until the entire compilation is done before
17+
proceeding.
18+
19+
## How can I get involved?
20+
21+
If you are interested in getting involved in this working group, you are welcome
22+
to introduce yourself in the Zulip stream. This working group is relatively
23+
small in scope, though, so we may not have work for everyone to take on!
24+
25+
- **Desired experience level:** Experienced
26+
- **Relevant repositories:** [`rust-lang/rust`][repo]
27+
- **Zulip stream:** [`#t-compiler/wg-pipelining`][zulip] on Zulip
28+
29+
[repo]: https://github.com/rust-lang/rust
30+
[zulip]: https://rust-lang.zulipchat.com/#narrow/stream/195180-t-compiler.2Fwg-pipelining
31+
32+
## What if I don't have much time?
33+
34+
We're not quite at the point where we can test out support in rustc, but when
35+
that's ready we can fill in some instructions here!
36+
37+
## Are there any resources so I can get up to speed?
38+
39+
There are some resources available for those interested in contributing to get
40+
some background and context:
41+
42+
- [Initial Cargo issue](https://github.com/rust-lang/cargo/issues/6660)
43+
- [Tracking issue on rustc side of things](https://github.com/rust-lang/rust/issues/58465)
44+
- [Old cargo tracking issue](https://github.com/rust-lang/cargo/issues/4831)
