Overview
On macOS, we currently generate debug information combined into the executable. This is not Apple's convention, and it's been difficult to make the platform toolchain happy with the combined debug info. With Apple's new linker to be released in Xcode 15, it is even harder. We propose to generate debug info in a separate file on macOS, following the system convention.
Background and context
Platform conventions
On macOS, the Go toolchain currently generates DWARF debug information in the executable as a __DWARF segment, similar to what we do on other platforms. However, this is not Apple's convention for its C toolchain. Instead, the C toolchain often creates debug info in a separate file/directory.
Specifically, for C compilation with debug info enabled,
- the C compiler generates object files that contain debug info in debug_* sections
- the C linker links the object files into an executable without debug info, but with STAB symbols referencing the object files
- optionally, another program, dsymutil, can be run on the executable; it extracts the debug info from the object files and stores it in either a dSYM directory or a single file. If this is done, the STAB symbols and object files are no longer needed, and can be stripped/deleted.
Combined DWARF in Go
For the Go toolchain, we currently generate debug information combined into the executable, similar to what we do on other platforms. In internal linking mode, the Go linker directly produces a binary with debug info combined into the executable as a __DWARF segment. In external linking mode, the Go linker
- passes Go and C object files with debug info to the C linker, which produces an executable with STAB symbols
- runs dsymutil to extract the debug info
- strips the STAB symbols (which contain object file paths, which are nondeterministic)
- post-edits the executable, combining the debug info into the executable
- (deletes the temp directory containing Go and C objects)
While it is simple in internal linking mode, in external linking mode this process is a bit convoluted.
Combining DWARF into the executable requires post-editing the executable to add a __DWARF segment, which in turn requires editing the program header and some other data. The Mach-O loaders in the platform's static linker and dynamic linker have a number of integrity checks for the program, and generally do not like an extra unmapped segment. The code in the Go linker that adds the segment has been revised several times to make the dynamic linker happy.
With Apple's new linker, to be released in Xcode 15, there are even more checks, and it is hard to work around all the requirements. Currently, if one builds Go code into a c-shared object and then links it with C code using Apple's linker, the linker rejects the shared object produced by the Go toolchain (see also #61229). We could potentially try harder to work around more checks (if possible). But this may get harder and harder in the future, and we may eventually be forced to change.
Debugger support
On the debugger side, the system's default debugger, LLDB, understands the C toolchain's convention. When debugging an executable (say x),
- it can automatically find debug info combined in the executable
- it can automatically find debug info from object files referenced by the STAB symbols
- it can automatically find debug info from the dSYM directory x.dSYM
- or the debug info file can be specified with the target create --symfile command.
Notably, LLDB doesn't understand compressed DWARF, which we generate by default. So currently Go programs do not work out of the box with LLDB. (An easy workaround is -ldflags=-compressdwarf=0.)
Delve, a commonly used debugger for Go programs, understands DWARF combined in the executable, and also compressed DWARF. So Delve works for Go programs out of the box.
Proposed changes
We propose that the Go toolchain switch to generating split DWARF on macOS, following the platform convention. This would make the Go toolchain more consistent with Apple's convention, and behave more like the system C toolchain. We would no longer need to "fight against" the checks in the Mach-O loaders of the system static and dynamic linkers, so the toolchain would be more forward compatible with platform updates.
Naming convention
Following the system convention, for an executable named x we will generate a directory named x.dSYM, which contains a DWARF file at x.dSYM/Contents/Resources/DWARF/x. In the system convention, there are other files in the dSYM directory (an Info.plist file and a relocation file), which are irrelevant to DWARF. We may skip them for now, and could consider generating them if needed in the future. (For c-archive build mode, as we produce C objects, which contain combined DWARF in the C toolchain's convention, we will continue to do so.)
We could also consider using a different naming convention, e.g. for an executable named x we would generate a single DWARF file named x.dwarf. LLDB would not load it automatically; one would need to pass the --symfile flag. But as LLDB already does not work out of the box (due to compression), maybe this is not too bad. Feedback welcome.
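As an illustration of the dSYM convention proposed above, here is a minimal sketch that computes where the DWARF file would live for a given executable path. The helper name dsymDWARFPath is hypothetical and is not part of the Go toolchain; it only encodes the path layout described in this section.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// dsymDWARFPath returns the expected location of the DWARF file for the
// executable at exePath, following the layout
// <exe>.dSYM/Contents/Resources/DWARF/<base name of exe>.
// Hypothetical helper for illustration only.
func dsymDWARFPath(exePath string) string {
	return filepath.Join(exePath+".dSYM", "Contents", "Resources", "DWARF", filepath.Base(exePath))
}

func main() {
	fmt.Println(dsymDWARFPath("x"))
	fmt.Println(dsymDWARFPath("/tmp/bin/x"))
}
```

A debugger or other tool could use the same computation to locate the DWARF file next to an executable that has not been moved or renamed.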
Go linker
The Go linker will generate split DWARF on macOS.
- In internal linking mode the Go linker will emit an executable (without DWARF) and a separate DWARF file.
- In external linking mode the Go linker will invoke the C linker to emit an executable and invoke dsymutil to generate a DWARF file; this is the same as before, but the Go linker will not post-edit the executable to combine the DWARF back into the executable.
The go command
The go command needs to understand that we now generate two output files, the executable and the DWARF file (in the case of c-shared build mode, three files: the shared object, the C header file, and the DWARF file). It needs to copy them from the temporary directory where the build is performed to the output directory. Specifically, for file naming,
- go build without the -o flag will generate executable <exe> (which is the default name matching the main module or .go file name) and a DWARF file in <exe>.dSYM
- go build -o <exe> will generate executable <exe> and a DWARF file in <exe>.dSYM
- go build -o <dir> will generate executable <dir>/<exe> and a DWARF file in <dir>/<exe>.dSYM (where <exe> is the default name based on the main module or .go file name)
- a special case for go build -o /dev/null, which generates no file

go test -c will follow a similar naming convention.
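The naming rules above can be sketched as a small function. This is only an illustration under stated assumptions: outputPaths and its parameters are hypothetical names, and whether the -o value is a directory is passed in as a flag rather than checked on disk, which the real go command would have to determine itself.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// outputPaths sketches the proposed file naming for go build.
// outFlag is the value of -o ("" when the flag is absent), defaultName is the
// default executable name derived from the main module or .go file, and
// outIsDir reports whether outFlag names a directory.
// ok is false for -o /dev/null, where no file is generated.
// Hypothetical helper for illustration only.
func outputPaths(outFlag, defaultName string, outIsDir bool) (exe, dsym string, ok bool) {
	switch {
	case outFlag == "/dev/null":
		return "", "", false
	case outFlag == "":
		exe = defaultName
	case outIsDir:
		exe = filepath.Join(outFlag, defaultName)
	default:
		exe = outFlag
	}
	return exe, exe + ".dSYM", true
}

func main() {
	fmt.Println(outputPaths("", "hello", false))
	fmt.Println(outputPaths("out/app", "hello", false))
	fmt.Println(outputPaths("bin", "hello", true))
	fmt.Println(outputPaths("/dev/null", "hello", false))
}
```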
In order not to clutter directories that contain installed binaries, like $HOME/bin, we propose that go install have DWARF disabled by default (by passing the -w flag to the linker). One can still explicitly ask for DWARF by passing -ldflags=-w=0 (the -w flag disables DWARF; -w=0 negates it).
There is prior art for emitting two output files: in c-shared build mode the go build command generates a C shared object (usually named with .so) and a C header file (usually named with .h). So outputting two files isn't completely new. Maybe it could be implemented similarly.
go clean will also understand the naming convention, and remove the DWARF file if it is invoked to remove the executable file.
Build cache
Executables are not cached, so the DWARF file will not be cached either. However, for executables the go command checks whether the output file already exists and contains the expected build ID; if so, it assumes the file is up to date and does not relink it. With split DWARF, we propose that it also check whether the DWARF file is up to date (the DWARF file will probably also contain the build ID so it can be checked; details TBD). If either the executable or the DWARF file is not up to date, it will relink and generate both.
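The relink decision described above could look roughly like the following sketch. It assumes that the DWARF file does carry a build ID (which is still TBD in this proposal) and that both IDs have already been read; needsRelink is a hypothetical name, not an existing function in the go command.

```go
package main

import "fmt"

// needsRelink reports whether both outputs must be regenerated.
// wantID is the expected build ID; exeID and dwarfID are the build IDs read
// from the existing executable and DWARF file, or "" if the file is missing
// or its ID could not be read. Assumption: the DWARF file embeds a build ID.
// Hypothetical helper for illustration only.
func needsRelink(wantID, exeID, dwarfID string) bool {
	// Relink if either file is stale; relinking always regenerates both.
	return exeID != wantID || dwarfID != wantID
}

func main() {
	fmt.Println(needsRelink("abc", "abc", "abc"))
	fmt.Println(needsRelink("abc", "abc", ""))
	fmt.Println(needsRelink("abc", "old", "abc"))
}
```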
Debugger support
With this change, LLDB, which understands the naming convention, should still be able to load the DWARF info automatically (if it is not compressed). If either the executable or the DWARF file is moved or renamed, the DWARF can still be loaded with the --symfile flag.
Delve will need to be updated to understand the naming convention, finding the DWARF file in the dSYM directory. We suggest it also provide a way (e.g. a command line flag, if it does not already have one) to explicitly specify the DWARF file's location, in case the user wants to move or rename the file.
debug/macho package
Currently, for a Mach-O executable with combined DWARF, the debug/macho.(*File).DWARF method can load the debug information. With split DWARF, the binary will not contain DWARF, so it cannot be loaded from the same macho.File. One could open another macho.File for the DWARF file.
If the macho.File is from an OS file (e.g. opened with macho.Open), the macho package could automatically try to find the split DWARF in the DWARF file following the naming convention. Then the user would not need to open another file. On the other hand, automatically opening another file seems a bit magic. Feedback welcome.
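For the explicit approach, where the user opens a second macho.File for the DWARF file, a minimal sketch might look like this. It uses the existing macho.Open and (*File).DWARF APIs; the helper loadSplitDWARF and the assumption that the DWARF file sits at the dSYM path proposed above are illustrative only.

```go
package main

import (
	"debug/dwarf"
	"debug/macho"
	"fmt"
	"os"
	"path/filepath"
)

// loadSplitDWARF opens the DWARF file that would accompany the executable at
// exePath under the proposed dSYM naming convention and loads its debug info.
// Hypothetical helper for illustration only.
func loadSplitDWARF(exePath string) (*dwarf.Data, error) {
	dwarfPath := filepath.Join(exePath+".dSYM", "Contents", "Resources", "DWARF", filepath.Base(exePath))
	f, err := macho.Open(dwarfPath)
	if err != nil {
		return nil, fmt.Errorf("opening split DWARF file: %w", err)
	}
	// DWARF reads the section contents into memory, so the file can be
	// closed once it returns.
	defer f.Close()
	return f.DWARF()
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: loaddwarf <executable>")
		os.Exit(2)
	}
	if _, err := loadSplitDWARF(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("loaded DWARF for", os.Args[1])
}
```

If the macho package were to do this lookup automatically, the same path computation would presumably move inside the package; that design question is left open here.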
If accepted, we plan to implement this in Go 1.22.
Thanks.
cc @golang/compiler @rsc @bcmills @aarzilli @derekparker @archanaravindar