Commit a09fbe2

Auto merge of #145910 - saethlin:ignore-intrinsic-calls, r=cjgillot
Ignore intrinsic calls in cross-crate-inlining cost model

I noticed in a side project that a function which just compares two `[u64; 2]` for equality is not cross-crate-inlinable. That was surprising to me because I didn't think that code contained a function call, but of course our array comparisons are lowered to an intrinsic. Intrinsic calls don't make a function no longer a leaf, so it makes sense to add this as an exception to the "only leaves" cross-crate-inline heuristic.

This is the useful compare link: https://perf.rust-lang.org/compare.html?start=7cb1a81145a739c4fd858abe3c624ce8e6e5f9cd&end=c3f0a64dbf9fba4722dacf8e39d2fe00069c995e&stat=instructions%3Au because it disables CGU merging in both commits, so effects that cause changes in the sysroot to perturb partitioning downstream are excluded. Perturbations to what is and isn't cross-crate-inlinable in the sysroot have chaotic effects on what items are in which CGUs after merging. It looks like before this PR, by sheer luck, some of the CGUs dirtied by the patch in eza incr-unchanged happened to be merged together, and with this PR they are not.

The perf runs on this PR point to a nice runtime performance improvement.
2 parents 2f3f27b + ab91a63 commit a09fbe2

6 files changed, 25 additions and 1 deletion

compiler/rustc_mir_transform/src/cross_crate_inline.rs

Lines changed: 10 additions & 1 deletion
@@ -135,7 +135,16 @@ impl<'tcx> Visitor<'tcx> for CostChecker<'_, 'tcx> {
                     }
                 }
             }
-            TerminatorKind::Call { unwind, .. } => {
+            TerminatorKind::Call { ref func, unwind, .. } => {
+                // We track calls because they make our function not a leaf (and in theory, the
+                // number of calls indicates how likely this function is to perturb other CGUs).
+                // But intrinsics don't have a body that gets assigned to a CGU, so they are
+                // ignored.
+                if let Some((fn_def_id, _)) = func.const_fn_def()
+                    && self.tcx.has_attr(fn_def_id, sym::rustc_intrinsic)
+                {
+                    return;
+                }
                 self.calls += 1;
                 if let UnwindAction::Cleanup(_) = unwind {
                     self.landing_pads += 1;
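
The counting rule in this hunk can be sketched outside the compiler. The following is a minimal, standalone model and not rustc code: `Call`, `CostSketch`, and `is_leaf_body` are hypothetical names, the two boolean fields stand in for the `rustc_intrinsic` attribute lookup and the `UnwindAction::Cleanup` match, and `is_leaf` assumes the "only leaves" check reduces to "no counted calls". The real pass may weigh other factors that are not shown in this diff.

```rust
/// Hypothetical stand-in for a MIR call terminator; not a rustc type.
#[derive(Clone, Copy)]
pub struct Call {
    /// Models `tcx.has_attr(fn_def_id, sym::rustc_intrinsic)`.
    pub callee_is_intrinsic: bool,
    /// Models `unwind` being `UnwindAction::Cleanup(_)`.
    pub has_cleanup: bool,
}

/// Hypothetical counters mirroring the two fields touched in the hunk.
#[derive(Default)]
pub struct CostSketch {
    pub calls: usize,
    pub landing_pads: usize,
}

impl CostSketch {
    /// Count one call terminator, skipping intrinsics entirely.
    pub fn visit_call(&mut self, call: Call) {
        // Early return, as in the patch: an intrinsic call bumps neither
        // `calls` nor `landing_pads`.
        if call.callee_is_intrinsic {
            return;
        }
        self.calls += 1;
        if call.has_cleanup {
            self.landing_pads += 1;
        }
    }

    /// Assumption: a body is a leaf when no calls were counted.
    pub fn is_leaf(&self) -> bool {
        self.calls == 0
    }
}

/// Run the sketch over every call in a (modelled) function body.
pub fn is_leaf_body(calls: &[Call]) -> bool {
    let mut checker = CostSketch::default();
    for &call in calls {
        checker.visit_call(call);
    }
    checker.is_leaf()
}
```

Under this model, a body whose only call targets an intrinsic still counts as a leaf, while a single ordinary call makes it a non-leaf. Because the return happens before both increments, an intrinsic call with a cleanup edge adds no landing pad either.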

tests/assembly-llvm/breakpoint.rs

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@
 // CHECK-LABEL: use_bp
 // aarch64: brk #0xf000
 // x86_64: int3
+#[inline(never)]
 pub fn use_bp() {
     core::arch::breakpoint();
 }

tests/assembly-llvm/simd/reduce-fadd-unordered.rs

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ use std::simd::*;
 // It would emit about an extra fadd, depending on the architecture.
 
 // CHECK-LABEL: reduce_fadd_negative_zero
+#[inline(never)]
 pub unsafe fn reduce_fadd_negative_zero(v: f32x4) -> f32 {
     // x86_64: addps
     // x86_64-NEXT: movshdup

tests/codegen-llvm/cross-crate-inlining/auxiliary/leaf.rs

Lines changed: 5 additions & 0 deletions
@@ -18,3 +18,8 @@ pub fn stem_fn() -> String {
 fn inner() -> String {
     String::from("test")
 }
+
+// This function's optimized MIR contains a call, but it is to an intrinsic.
+pub fn leaf_with_intrinsic(a: &[u64; 2], b: &[u64; 2]) -> bool {
+    a == b
+}

tests/codegen-llvm/cross-crate-inlining/leaf-inlining.rs

Lines changed: 7 additions & 0 deletions
@@ -18,3 +18,10 @@ pub fn stem_outer() -> String {
     // CHECK: call {{.*}}stem_fn
     leaf::stem_fn()
 }
+
+// Check that we inline functions that call intrinsics
+#[no_mangle]
+pub fn leaf_with_intrinsic_outer(a: &[u64; 2], b: &[u64; 2]) -> bool {
+    // CHECK-NOT: call {{.*}}leaf_with_intrinsic
+    leaf::leaf_with_intrinsic(a, b)
+}
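
For readers who want to try the pair of test functions without the two-crate setup, here is a single-file sketch. It assumes the auxiliary `leaf` crate can be approximated by an inline module, which means it cannot show cross-crate inlining at all. It only pins down the observable behaviour that inlining has to preserve. The FileCheck directives and `#[no_mangle]` from the real test are left out on purpose.

```rust
/// Stand-in for the auxiliary `leaf` crate. In the real test this is a
/// separate crate; a module is used here only so the sketch fits in one file.
pub mod leaf {
    /// Same body as the function added to `auxiliary/leaf.rs`.
    pub fn leaf_with_intrinsic(a: &[u64; 2], b: &[u64; 2]) -> bool {
        a == b
    }
}

/// Same shape as the caller added to `leaf-inlining.rs`.
pub fn leaf_with_intrinsic_outer(a: &[u64; 2], b: &[u64; 2]) -> bool {
    leaf::leaf_with_intrinsic(a, b)
}
```

Whether or not the callee is inlined, the outer function has to return the same result as a direct array comparison.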

tests/codegen-llvm/default-visibility.rs

Lines changed: 1 addition & 0 deletions
@@ -32,6 +32,7 @@ pub static tested_symbol: [u8; 6] = *b"foobar";
 // INTERPOSABLE: @{{.*}}default_visibility{{.*}}tested_symbol{{.*}} = constant
 // DEFAULT: @{{.*}}default_visibility{{.*}}tested_symbol{{.*}} = constant
 
+#[inline(never)]
 pub fn do_memcmp(left: &[u8], right: &[u8]) -> i32 {
     left.cmp(right) as i32
 }
