set subsections_via_symbols for ld64 helper sections #139752

Merged
merged 1 commit into from
Apr 25, 2025

Conversation

usamoi
Contributor

@usamoi usamoi commented Apr 13, 2025

closes #139744
cc @madsmtm

@rustbot
Collaborator

rustbot commented Apr 13, 2025

r? @Nadrieril

rustbot has assigned @Nadrieril.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 13, 2025
@rustbot
Collaborator

rustbot commented Apr 13, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

@rustbot rustbot added the A-run-make Area: port run-make Makefiles to rmake.rs label Apr 13, 2025
Contributor

@madsmtm madsmtm left a comment

Thanks for the ping!

@rustbot label O-apple O-linkage

Comment on lines 224 to 231
if binary_format == BinaryFormat::MachO {
    file.set_subsections_via_symbols();
}
Contributor

It's not clear to me that this needs to be in create_object_file? Maybe it would be better to only use it in add_linked_symbol_object?

Also, MH_SUBSECTIONS_VIA_SYMBOLS is vastly under-documented, I'd really like to see a comment here explaining why it's safe for us to use.

Contributor Author

Object::set_subsections_via_symbols says it should be called before add_section or add_subsection, so I feel it is better to put it here (less error-prone).

I don't know whether it's safe or not, either. I searched online and found that this appears to be the only way to enable GC on Mach-O.

Contributor

My problem with having it in create_object_file is that it may negatively affect the other object files we create (which again is hard to tell for sure, since the docs around it are so limited).

Contributor Author

Ah! I didn't notice that this function has multiple callers.

I will add a parameter to control this behavior.

Contributor

@madsmtm madsmtm Apr 22, 2025

Eh, I'd still have preferred to just change add_linked_symbol_object. It'd be unexpected if create_object_file added any sections or subsections; it's literally prefixed "create", as in "only create, do not fill".

But will leave it up to the compiler reviewer to decide, this is a nit anyhow.
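
To illustrate the ordering concern in this thread, here is a minimal, self-contained sketch. The types `ObjectFile` and `BinaryFormat`, and this `create_object_file` signature, are hypothetical stand-ins, not the real `object` crate or `rustc_codegen_ssa` API. The `assert!` only models the "call before add_section" requirement quoted above. The one value taken from the platform is `MH_SUBSECTIONS_VIA_SYMBOLS = 0x2000` from `mach-o/loader.h`.

```rust
// Hypothetical model; NOT the real `object` crate or rustc types.

/// Mach-O header flag value, as defined in mach-o/loader.h.
const MH_SUBSECTIONS_VIA_SYMBOLS: u32 = 0x2000;

#[derive(Clone, Copy, PartialEq, Debug)]
enum BinaryFormat {
    Elf,
    MachO,
}

#[derive(Debug)]
struct ObjectFile {
    header_flags: u32,
    sections: Vec<String>,
}

impl ObjectFile {
    fn new() -> Self {
        ObjectFile { header_flags: 0, sections: Vec::new() }
    }

    /// Models the requirement that the flag is set before any section exists.
    fn set_subsections_via_symbols(&mut self) {
        assert!(self.sections.is_empty(), "must be called before adding sections");
        self.header_flags |= MH_SUBSECTIONS_VIA_SYMBOLS;
    }

    fn add_section(&mut self, name: &str) {
        self.sections.push(name.to_string());
    }
}

/// Only creates the file; each caller decides whether to opt in to the flag.
fn create_object_file(format: BinaryFormat, set_subsections_via_symbols: bool) -> ObjectFile {
    let mut file = ObjectFile::new();
    if format == BinaryFormat::MachO && set_subsections_via_symbols {
        file.set_subsections_via_symbols();
    }
    file
}

fn main() {
    // The linked-symbols helper object opts in, then adds its section.
    let mut helper = create_object_file(BinaryFormat::MachO, true);
    helper.add_section("__data");

    // Other callers pass `false`, so their objects are unaffected.
    let other = create_object_file(BinaryFormat::MachO, false);

    // Non-Mach-O formats never get the flag, whatever the parameter says.
    let elf = create_object_file(BinaryFormat::Elf, true);

    println!("{:#x} {:#x} {:#x}", helper.header_flags, other.header_flags, elf.header_flags);
}
```

With an explicit parameter, the flag is still set before any section is added, and callers that do not ask for it see no change.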

Comment on lines 1 to 8
unsafe extern "C" {
    unsafe static UNDEFINED: usize;
}

#[unsafe(no_mangle)]
pub fn used() {
    println!("UNDEFINED = {}", unsafe { UNDEFINED });
}
Contributor

Honestly, I'm kind of surprised that this worked in the past. Could you describe the use-case?

I'd also be interested, does this pattern work on other platforms?

Contributor Author

@usamoi usamoi Apr 13, 2025

It's used by https://github.com/pgcentralfoundation/pgrx, a framework for developing PostgreSQL extensions. The trick lets a single library hold hybrid code (dylib code and SQL-generation code). The SQL-generation code can be compiled into an executable, because the dylib-related code is GC-ed.

This should be reasonable usage. Please see #95604 and #95363 (comment).

Yes. It works for Linux, FreeBSD, and Windows, too.

Contributor

Thanks for the answer. Might make sense to leave a comment in the file about this, and perhaps link to #139744, stating that it is a regression test for that?

Contributor

Also, is the expected behaviour for the symbol to still be present, or do we expect the linker to completely strip it out? E.g. would dlsym(RTLD_DEFAULT, "used") work?

Contributor Author

The symbol is present in the dynamic library. It's GC-ed iff the final artifact is an executable.

Contributor

An idea then would perhaps be to add another test (or use UI-test revisions) with #![crate_type = "dylib"] and //@ dont-check-compiler-stderr that expectedly fails?

Contributor Author

Contributor

No, I meant having a test that fails when trying to link undefined symbols in a dylib.

Contributor Author

It is added.

Contributor

Thanks!

@rustbot rustbot added the O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS) label Apr 13, 2025
@jieyouxu
Member

cc @bjorn3 @petrochenkov in case this is problematic

#[unsafe(no_mangle)]
pub unsafe fn used() {
    println!("THIS_SYMBOL_SHOULD_BE_UNDEFINED = {}", unsafe { THIS_SYMBOL_SHOULD_BE_UNDEFINED });
}
Contributor

Nit: I know that #[no_mangle] implies it, but could we also add an explicit test for #[used]?

Contributor Author

@usamoi usamoi Apr 14, 2025

On macOS, #[used] emits errors, while #[no_mangle] does not.

I don't know why.

Edit: #[used] also emits errors with PR + lld, nightly + lld, stable + lld, and stable + ld64.

Contributor Author

@usamoi usamoi Apr 14, 2025

I know why now.

On macOS, #[used] emits llvm.used, so #[used(compiler)] is needed here. Is this the expected behavior?

Member

@bjorn3 bjorn3 Apr 14, 2025

TIL that #[used] is an alias for #[used(linker)] on some platforms:

// Unfortunately, unconditionally using `llvm.used` causes
// issues in handling `.init_array` with the gold linker,
// but using `llvm.compiler.used` caused a nontrivial amount
// of unintentional ecosystem breakage -- particularly on
// Mach-O targets.
//
// As a result, we emit `llvm.compiler.used` only on ELF
// targets. This is somewhat ad-hoc, but actually follows
// our pre-LLVM 13 behavior (prior to the ecosystem
// breakage), and seems to match `clang`'s behavior as well
// (both before and after LLVM 13), possibly because they
// have similar compatibility concerns to us. See
// https://github.com/rust-lang/rust/issues/47384#issuecomment-1019080146
// and following comments for some discussion of this, as
// well as the comments in `rustc_codegen_llvm` where these
// flags are handled.
//
// Anyway, to be clear: this is still up in the air
// somewhat, and is subject to change in the future (which
// is a good thing, because this would ideally be a bit
// more firmed up).
let is_like_elf = !(tcx.sess.target.is_like_darwin
    || tcx.sess.target.is_like_windows
    || tcx.sess.target.is_like_wasm);
codegen_fn_attrs.flags |= if is_like_elf {
    CodegenFnAttrFlags::USED
} else {
    CodegenFnAttrFlags::USED_LINKER
};

Eventually #[used] should be changed to #[used(linker)] unconditionally. Gold is deprecated upstream and broken with current rustc versions anyway: #139425
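
The selection logic in the quoted snippet can be modeled in isolation. This is a rough sketch: `Target`, `UsedKind`, and `plain_used_kind` are hypothetical names made up for illustration, not rustc types. The mapping of the two variants to `llvm.compiler.used` and `llvm.used` follows the quoted comment ("we emit `llvm.compiler.used` only on ELF targets").

```rust
/// Hypothetical, simplified stand-in for the target options read by the quoted snippet.
#[derive(Default)]
struct Target {
    is_like_darwin: bool,
    is_like_windows: bool,
    is_like_wasm: bool,
}

/// Hypothetical enum mirroring the two flags chosen between in the quoted snippet.
#[derive(Debug, PartialEq)]
enum UsedKind {
    /// Stands in for CodegenFnAttrFlags::USED (`llvm.compiler.used`).
    Compiler,
    /// Stands in for CodegenFnAttrFlags::USED_LINKER (`llvm.used`).
    Linker,
}

/// What a bare `#[used]` resolves to, following the quoted condition.
fn plain_used_kind(target: &Target) -> UsedKind {
    let is_like_elf =
        !(target.is_like_darwin || target.is_like_windows || target.is_like_wasm);
    if is_like_elf {
        UsedKind::Compiler
    } else {
        UsedKind::Linker
    }
}

fn main() {
    let linux = Target::default();
    let macos = Target { is_like_darwin: true, ..Target::default() };
    println!("linux: {:?}, macos: {:?}", plain_used_kind(&linux), plain_used_kind(&macos));
}
```

Under this model, a bare `#[used]` on a Darwin-like target keeps the symbol alive through the linker, which matches the observation above that `#[used(compiler)]` is needed in the test on macOS.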

@usamoi usamoi force-pushed the macos-used branch 2 times, most recently from 6e175fc to 7d10cf7 Compare April 14, 2025 05:34
@jieyouxu jieyouxu added the A-linkage Area: linking into static, shared libraries and binaries label Apr 14, 2025
@jieyouxu jieyouxu removed the A-run-make Area: port run-make Makefiles to rmake.rs label Apr 14, 2025
@Nadrieril
Member

r? codegen

@rustbot rustbot assigned saethlin and unassigned Nadrieril Apr 16, 2025
@usamoi
Contributor Author

usamoi commented Apr 22, 2025

I hope this gets a beta backport to prevent pgrx from being broken on macOS.

@rustbot label +beta-nominated

@rustbot rustbot added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Apr 22, 2025
@@ -221,6 +224,11 @@ pub(crate) fn create_object_file(sess: &Session) -> Option<write::Object<'static

file.set_macho_build_version(macho_object_build_version_for_target(sess))
}
if binary_format == BinaryFormat::MachO {
if set_subsections_via_symbols {
file.set_subsections_via_symbols();
Contributor

Still needs a documentation comment IMO explaining why we do this.

@saethlin
Member

Looks good! Thanks @madsmtm for also reviewing ❤️

@bors r=saethlin,madsmtm

@bors
Collaborator

bors commented Apr 23, 2025

📌 Commit e18e599 has been approved by saethlin,madsmtm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 24, 2025
@jieyouxu
Member

This was discussed in today's compiler triage meeting. The compiler team will revisit this beta backport decision next week after this PR merges.

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 24, 2025
set subsections_via_symbols for ld64 helper sections

closes rust-lang#139744
cc `@madsmtm`
@bors
Collaborator

bors commented Apr 24, 2025

⌛ Testing commit 64ba44f with merge 467cebc...

@bors
Collaborator

bors commented Apr 24, 2025

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 24, 2025
@usamoi
Contributor Author

usamoi commented Apr 24, 2025

On windows-gnu, the linker removes statics marked with #[used], but does not remove functions marked with #[used] or those referenced by statics marked with #[used]. I'm not interested in this target, so I've marked the test as ignore-windows-gnu for now.

@saethlin
Member

Can you add the reason windows-gnu is special for this test to the magic comment?

@saethlin
Member

@bors r=saethlin,madsmtm

@bors
Collaborator

bors commented Apr 24, 2025

📌 Commit b1a3831 has been approved by saethlin,madsmtm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 24, 2025
@bors
Collaborator

bors commented Apr 24, 2025

⌛ Testing commit b1a3831 with merge 847e3ee...

@bors
Collaborator

bors commented Apr 25, 2025

☀️ Test successful - checks-actions
Approved by: saethlin,madsmtm
Pushing 847e3ee to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 25, 2025
@bors bors merged commit 847e3ee into rust-lang:master Apr 25, 2025
7 checks passed
@rustbot rustbot added this to the 1.88.0 milestone Apr 25, 2025

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing d7ea436 (parent) -> 847e3ee (this PR)

Test differences

Show 6 test diffs

Stage 1

  • [ui] tests/ui/linking/cdylib-no-mangle.rs: [missing] -> ignore (only executed when the target vendor is Apple) (J1)
  • [ui] tests/ui/linking/executable-no-mangle-strip.rs: [missing] -> pass (J1)

Stage 2

  • [ui] tests/ui/linking/executable-no-mangle-strip.rs: [missing] -> pass (J0)
  • [ui] tests/ui/linking/cdylib-no-mangle.rs: [missing] -> pass (J2)
  • [ui] tests/ui/linking/executable-no-mangle-strip.rs: [missing] -> ignore (ignored when the operating system and target environment are windows-gnu (only statics marked with used can be GC-ed on windows-gnu)) (J3)
  • [ui] tests/ui/linking/cdylib-no-mangle.rs: [missing] -> ignore (only executed when the target vendor is Apple) (J4)

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 847e3ee6b0e614937eee4e6d8f61094411eadcc0 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-aarch64-linux: 5203.0s -> 7914.2s (52.1%)
  2. dist-x86_64-apple: 9270.3s -> 11170.0s (20.5%)
  3. dist-loongarch64-linux: 6264.6s -> 7264.5s (16.0%)
  4. aarch64-apple: 4275.1s -> 3670.3s (-14.1%)
  5. dist-x86_64-linux: 4945.4s -> 5538.0s (12.0%)
  6. x86_64-apple-2: 4757.0s -> 5185.7s (9.0%)
  7. x86_64-gnu-aux: 5920.6s -> 6338.0s (7.1%)
  8. dist-x86_64-mingw: 7686.2s -> 8201.0s (6.7%)
  9. x86_64-gnu-distcheck: 4707.8s -> 4422.4s (-6.1%)
  10. dist-x86_64-netbsd: 4991.9s -> 5292.1s (6.0%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.
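
The percentages in the list above appear to be plain relative changes, (after - before) / before. A tiny sketch that reproduces the first entry follows; the helper name `percent_change` is made up for illustration and is not part of citool.

```rust
/// Relative change from `before` to `after`, in percent.
/// Hypothetical helper for illustration only.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // dist-aarch64-linux: 5203.0s -> 7914.2s
    println!("{:.1}%", percent_change(5203.0, 7914.2));
    // aarch64-apple: 4275.1s -> 3670.3s (a speed-up, so the value is negative)
    println!("{:.1}%", percent_change(4275.1, 3670.3));
}
```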

@rust-timer
Collaborator

Finished benchmarking commit (847e3ee): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.3%] | 2 |
| Regressions ❌ (secondary) | 0.3% | [0.2%, 0.6%] | 30 |
| Improvements ✅ (primary) | -0.2% | [-0.2%, -0.2%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.1% | [-0.2%, 0.3%] | 3 |

Max RSS (memory usage)

Results (primary -0.4%, secondary -1.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
| Regressions ❌ (primary) | 2.1% | [1.7%, 2.4%] | 2 |
| Regressions ❌ (secondary) | 2.1% | [1.6%, 2.3%] | 3 |
| Improvements ✅ (primary) | -1.2% | [-3.6%, -0.4%] | 6 |
| Improvements ✅ (secondary) | -3.9% | [-7.8%, -2.0%] | 5 |
| All ❌✅ (primary) | -0.4% | [-3.6%, 2.4%] | 8 |

Cycles

Results (primary 0.7%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
| Regressions ❌ (primary) | 0.7% | [0.5%, 0.9%] | 4 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.7% | [0.5%, 0.9%] | 4 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 777.966s -> 776.153s (-0.23%)
Artifact size: 365.11 MiB -> 365.17 MiB (0.02%)

@rustbot rustbot added the perf-regression Performance regression. label Apr 25, 2025
@rylev
Member

rylev commented Apr 29, 2025

I don't really understand how this change could have the performance impact seen here. Some of the regressions do seem like pure noise (with the next run returning back to the previous baseline), but not all of them.

@usamoi @saethlin would you agree that the regressions here are unlikely to be actually caused by this change?

@madsmtm
Contributor

madsmtm commented Apr 29, 2025

This should only affect macOS/Apple platforms, and I don't think the benchmarks are run on those?

@rylev
Member

rylev commented Apr 29, 2025

That is correct. I'll mark this as triaged

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Apr 29, 2025
Labels
A-linkage Area: linking into static, shared libraries and binaries beta-nominated Nominated for backporting to the compiler in the beta channel. merged-by-bors This PR was explicitly merged by bors. O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS) perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

functions marked with #[no_mangle] cannot be GC-ed on MacOS