Commit 487bdeb

Improve ordering and naming of CGUs for non-incremental builds.
Currently there are two problems.

First, the CGUs don't end up in size order. The merging loop does sort by size on each iteration, but we don't sort after the final merge, so typically there is one CGU out of place. (And sometimes we don't enter the merging loop at all, in which case they end up in random order.)

Second, we then assign names that differ only by a numeric suffix, and then we sort them lexicographically by name, giving us an order like this:

  regex.f10ba03eb5ec7975-cgu.1
  regex.f10ba03eb5ec7975-cgu.10
  regex.f10ba03eb5ec7975-cgu.11
  regex.f10ba03eb5ec7975-cgu.12
  regex.f10ba03eb5ec7975-cgu.13
  regex.f10ba03eb5ec7975-cgu.14
  regex.f10ba03eb5ec7975-cgu.15
  regex.f10ba03eb5ec7975-cgu.2
  regex.f10ba03eb5ec7975-cgu.3
  regex.f10ba03eb5ec7975-cgu.4
  regex.f10ba03eb5ec7975-cgu.5
  regex.f10ba03eb5ec7975-cgu.6
  regex.f10ba03eb5ec7975-cgu.7
  regex.f10ba03eb5ec7975-cgu.8
  regex.f10ba03eb5ec7975-cgu.9

These two problems are really annoying when debugging and profiling the CGUs.

This commit ensures CGUs are sorted by name *and* reverse sorted by size. This involves (a) one extra sort by size operation, and (b) padding the numeric indices with zeroes, e.g. `regex.f10ba03eb5ec7975-cgu.01`.

(Note that none of this applies for incremental builds, where a different hash-based CGU naming scheme is used.)
1 parent 8084f39 commit 487bdeb

1 file changed, 27 additions and 6 deletions
--- a/compiler/rustc_monomorphize/src/partitioning.rs
+++ b/compiler/rustc_monomorphize/src/partitioning.rs
@@ -368,6 +368,7 @@ fn merge_codegen_units<'tcx>(

     let cgu_name_builder = &mut CodegenUnitNameBuilder::new(cx.tcx);

+    // Rename the newly merged CGUs.
     if cx.tcx.sess.opts.incremental.is_some() {
         // If we are doing incremental compilation, we want CGU names to
         // reflect the path of the source level module they correspond to.
@@ -404,18 +405,38 @@ fn merge_codegen_units<'tcx>(
                 }
             }
         }
+
+        // A sorted order here ensures what follows can be deterministic.
+        codegen_units.sort_by(|a, b| a.name().as_str().cmp(b.name().as_str()));
     } else {
-        // If we are compiling non-incrementally we just generate simple CGU
-        // names containing an index.
+        // When compiling non-incrementally, we rename the CGUS so they have
+        // identical names except for the numeric suffix, something like
+        // `regex.f10ba03eb5ec7975-cgu.N`, where `N` varies.
+        //
+        // It is useful for debugging and profiling purposes if the resulting
+        // CGUs are sorted by name *and* reverse sorted by size. (CGU 0 is the
+        // biggest, CGU 1 is the second biggest, etc.)
+        //
+        // So first we reverse sort by size. Then we generate the names with
+        // zero-padded suffixes, which means they are automatically sorted by
+        // names. The numeric suffix width depends on the number of CGUs, which
+        // is always greater than zero:
+        // - [1,9] CGUS: `0`, `1`, `2`, ...
+        // - [10,99] CGUS: `00`, `01`, `02`, ...
+        // - [100,999] CGUS: `000`, `001`, `002`, ...
+        // - etc.
+        //
+        // If we didn't zero-pad the sorted-by-name order would be `XYZ-cgu.0`,
+        // `XYZ-cgu.1`, `XYZ-cgu.10`, `XYZ-cgu.11`, ..., `XYZ-cgu.2`, etc.
+        codegen_units.sort_by_key(|cgu| cmp::Reverse(cgu.size_estimate()));
+        let num_digits = codegen_units.len().ilog10() as usize + 1;
         for (index, cgu) in codegen_units.iter_mut().enumerate() {
+            let suffix = format!("{index:0num_digits$}");
             let numbered_codegen_unit_name =
-                cgu_name_builder.build_cgu_name_no_mangle(LOCAL_CRATE, &["cgu"], Some(index));
+                cgu_name_builder.build_cgu_name_no_mangle(LOCAL_CRATE, &["cgu"], Some(suffix));
             cgu.set_name(numbered_codegen_unit_name);
         }
     }
-
-    // A sorted order here ensures what follows can be deterministic.
-    codegen_units.sort_by(|a, b| a.name().as_str().cmp(b.name().as_str()));
 }

 fn internalize_symbols<'tcx>(
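
The non-incremental renaming scheme can be tried outside the compiler. The following is a minimal standalone sketch, not the rustc implementation: the `Cgu` struct, `suffix_width`, and `rename_by_size` are hypothetical stand-ins for `CodegenUnit` and `CodegenUnitNameBuilder`, and the name prefix is passed in as a plain string. It mirrors the two steps described in the new comment: reverse sort by size, then assign zero-padded indices whose width comes from `ilog10` of the CGU count.

```rust
use std::cmp::Reverse;

/// Hypothetical stand-in for a codegen unit: only a name and a size estimate.
#[derive(Debug)]
struct Cgu {
    name: String,
    size_estimate: usize,
}

/// Width of the numeric suffix for `count` CGUs, computed the same way as in
/// the diff: 1..=9 gives 1, 10..=99 gives 2, 100..=999 gives 3, and so on.
/// `ilog10` panics on zero, so the count must be positive.
fn suffix_width(count: usize) -> usize {
    assert!(count > 0, "the number of CGUs must be greater than zero");
    count.ilog10() as usize + 1
}

/// Reverse sort by size, then assign names with zero-padded indices so that
/// name order and descending size order coincide.
fn rename_by_size(prefix: &str, cgus: &mut [Cgu]) {
    if cgus.is_empty() {
        // Assumption for this sketch: an empty input is simply left alone.
        return;
    }
    cgus.sort_by_key(|cgu| Reverse(cgu.size_estimate));
    let width = suffix_width(cgus.len());
    for (index, cgu) in cgus.iter_mut().enumerate() {
        cgu.name = format!("{prefix}-cgu.{index:0width$}");
    }
}

fn main() {
    // Twelve CGUs with distinct, shuffled sizes in 1..=12.
    let mut cgus: Vec<Cgu> = (0..12usize)
        .map(|i| Cgu { name: format!("tmp{i}"), size_estimate: (i * 7) % 12 + 1 })
        .collect();

    rename_by_size("regex.f10ba03eb5ec7975", &mut cgus);

    for cgu in &cgus {
        println!("{} (size estimate {})", cgu.name, cgu.size_estimate);
    }
}
```

With twelve units the suffix width is two, so sorting the resulting names lexicographically gives the same order as sorting by descending size estimate.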