Skip to content

Incorrect line number in debuginfo when duplicate callsites are merged together at -O1 #65667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wesleywiser opened this issue Sep 7, 2023 · 9 comments

Comments

@wesleywiser
Copy link
Member

When the following IR is compiled with llc -O1 -trap-unreachable, an incorrect line number is generated in the debuginfo for the call to panic_test2::do_panic:

IR
source_filename = "panic_test2.6f3b49f66efc8fc7-cgu.0"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@alloc_14c43fe6be9850e9c6ac099b83b2e4e2 = private unnamed_addr constant <{ [7 x i8] }> <{ [7 x i8] c"unknown" }>, align 1
@alloc_1a2b9f3efbe1a8edd339fa75af2334ed = private unnamed_addr constant <{ [5 x i8] }> <{ [5 x i8] c"hello" }>, align 1

; Function Attrs: inlinehint noreturn nounwind nonlazybind
define fastcc void @bad_debug_info(ptr noalias nocapture noundef readonly align 8 dereferenceable(16) %_1) unnamed_addr #4 !dbg !85 {
start:
  %_11 = alloca { ptr, i64 }, align 8
  %_7 = alloca { ptr, i64 }, align 8
  %0 = getelementptr inbounds { ptr, i64 }, ptr %_1, i64 0, i32 1, !dbg !87
  %_13.1 = load i64, ptr %0, align 8, !dbg !87, !noundef !14
; call panic_test2::condition
  %1 = tail call fastcc ptr @_ZN11panic_test29condition17haa518b986e55a3c9E(i64 noundef %_13.1) #9, !dbg !89
  %.not = icmp eq ptr %1, null, !dbg !90
  br i1 %.not, label %bb3, label %bb2, !dbg !90

bb2:                                              ; preds = %start
  call void @llvm.lifetime.start.p0(i64 16, ptr nonnull %_7), !dbg !91
  store ptr %1, ptr %_7, align 8, !dbg !91
  %2 = getelementptr inbounds { ptr, i64 }, ptr %_7, i64 0, i32 1, !dbg !91
  store i64 5, ptr %2, align 8, !dbg !91
; call panic_test2::do_panic
  call fastcc void @_ZN11panic_test28do_panic17h2c0ead66774c0583E(ptr noalias noundef nonnull readonly align 8 dereferenceable(16) %_7) #10, !dbg !92
  unreachable, !dbg !92

bb3:                                              ; preds = %start
  call void @llvm.lifetime.start.p0(i64 16, ptr nonnull %_11), !dbg !93
  store ptr @alloc_14c43fe6be9850e9c6ac099b83b2e4e2, ptr %_11, align 8, !dbg !93
  %3 = getelementptr inbounds { ptr, i64 }, ptr %_11, i64 0, i32 1, !dbg !93
  store i64 7, ptr %3, align 8, !dbg !93
; call panic_test2::do_panic
  call fastcc void @_ZN11panic_test28do_panic17h2c0ead66774c0583E(ptr noalias noundef nonnull readonly align 8 dereferenceable(16) %_11) #10, !dbg !94
  unreachable, !dbg !94
}

; panic_test2::condition
; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind nonlazybind willreturn memory(none)
define internal fastcc ptr @_ZN11panic_test29condition17haa518b986e55a3c9E(i64 noundef %s.1) unnamed_addr #6 {
start:
  %0 = icmp eq i64 %s.1, 0
  %alloc_1a2b9f3efbe1a8edd339fa75af2334ed. = select i1 %0, ptr @alloc_1a2b9f3efbe1a8edd339fa75af2334ed, ptr null
  ret ptr %alloc_1a2b9f3efbe1a8edd339fa75af2334ed.
}

; panic_test2::do_panic
; Function Attrs: noreturn nounwind nonlazybind
define internal fastcc void @_ZN11panic_test28do_panic17h2c0ead66774c0583E(ptr noalias noundef readonly align 8 dereferenceable(16) %h) unnamed_addr #5 {
start:
  unreachable
}

; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(argmem: readwrite)
declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture) #8

attributes #1 = { nounwind nonlazybind "probe-stack"="inline-asm" "target-cpu"="x86-64" }
attributes #5 = { noreturn nounwind nonlazybind "probe-stack"="inline-asm" "target-cpu"="x86-64" }
attributes #6 = { mustprogress nofree noinline norecurse nosync nounwind nonlazybind willreturn memory(none) "probe-stack"="inline-asm" "target-cpu"="x86-64" }
attributes #8 = { mustprogress nocallback nofree nosync nounwind willreturn memory(argmem: readwrite) }
attributes #9 = { nounwind }
attributes #10 = { noreturn nounwind }

!llvm.module.flags = !{!3, !4}
!llvm.ident = !{!5}
!llvm.dbg.cu = !{!6}

!3 = !{i32 2, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{!"rustc version 1.73.0-beta.4 (9f37cd4f7 2023-09-01)"}
!6 = distinct !DICompileUnit(language: DW_LANG_Rust, file: !7, producer: "clang LLVM (rustc version 1.73.0-beta.4 (9f37cd4f7 2023-09-01))", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false)
!7 = !DIFile(filename: "src/main.rs/@/panic_test2.6f3b49f66efc8fc7-cgu.0", directory: "/tmp/panic_test")
!13 = !DISubroutineType(types: !14)
!14 = !{}
!80 = !DIFile(filename: "src/main.rs", directory: "/tmp/panic_test", checksumkind: CSK_MD5, checksum: "7cd0751a66038391ee52be036e31f480")
!81 = !DINamespace(name: "panic_test2", scope: null)
!85 = distinct !DISubprogram(name: "bad_debug_info", linkageName: "bad_debug_info", scope: !86, file: !80, line: 14, type: !13, scopeLine: 14, flags: DIFlagPrototyped | DIFlagNoReturn, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !6, templateParams: !14)
!86 = !DINamespace(name: "never_returns", scope: !81)
!87 = !DILocation(line: 15, column: 36, scope: !88)
!88 = distinct !DILexicalBlock(scope: !85, file: !80, line: 15, column: 39)
!89 = !DILocation(line: 15, column: 26, scope: !88)
!90 = !DILocation(line: 15, column: 16, scope: !88)
!91 = !DILocation(line: 16, column: 23, scope: !88)
!92 = !DILocation(line: 16, column: 13, scope: !88)
!93 = !DILocation(line: 18, column: 23, scope: !85)
!94 = !DILocation(line: 18, column: 13, scope: !85)

godbolt

Notice that in the source IR, both calls to do_panic have the correct line numbers:

...
bb2:                                              ; preds = %start
...
; call panic_test2::do_panic
  call fastcc void @_ZN11panic_test28do_panic17h2c0ead66774c0583E(ptr noalias noundef nonnull readonly align 8 dereferenceable(16) %_7) #10, !dbg !92
...

bb3:                                              ; preds = %start
...
; call panic_test2::do_panic
  call fastcc void @_ZN11panic_test28do_panic17h2c0ead66774c0583E(ptr noalias noundef nonnull readonly align 8 dereferenceable(16) %_11) #10, !dbg !94
...
!85 = distinct !DISubprogram(name: "bad_debug_info", linkageName: "bad_debug_info", scope: !86, file: !80, line: 14, type: !13, scopeLine: 14, flags: DIFlagPrototyped | DIFlagNoReturn, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition | DISPFlagOptimized, unit: !6, templateParams: !14)
!88 = distinct !DILexicalBlock(scope: !85, file: !80, line: 15, column: 39)
!92 = !DILocation(line: 16, column: 13, scope: !88)
!94 = !DILocation(line: 18, column: 13, scope: !85)

However, when lowering to machine code, these line numbers are erased and replaced with line 0:

...
        .loc    1 0 23 is_stmt 0                # src/main.rs:0:23
        leaq    8(%rsp), %rdi
        callq   panic_test2::do_panic::h2c0ead66774c0583
        ud2
...

Looking at the optimization passes in godbolt, it seems that the "Control Flow Optimizer (branch-folder)" pass is responsible for merging the calls together and removing the correct line numbers.

@llvmbot
Copy link
Member

llvmbot commented Sep 7, 2023

@llvm/issue-subscribers-debuginfo

@jmorse
Copy link
Member

jmorse commented Sep 7, 2023

Thanks for pinning down the issue; unfortunately this is the correct debug-info behaviour for the given optimisation. Branch-folding is merging together two identical function tails (the LEA, call, and trap being identical and terminal) and there isn't a correct single source line that can be attributed to those instructions. If either path is picked (line 16 or 18) then when taking the other path through the program a false + misleading line number would be displayed in the stack trace. Line zero, signalling that the location is compiler generated and not a "real" source line, is the least worst output in this scenario.

The main blocker on improving this is that there aren't many good ways of describing ambiguous source locations in DWARF, or conventions on how they should be interpreted by consumers, which is an ongoing pain point.

@bjorn3
Copy link

bjorn3 commented Sep 7, 2023

Why is the line set to 0 but the column to 23 which doesn't match either call site?

@pogo59
Copy link
Collaborator

pogo59 commented Sep 8, 2023

Most likely, if you look at the last .loc directive before the line-0 directive, it had column 23. We don't reset the column number because it takes up extra space in the DWARF encoding of the line table.

@wesleywiser
Copy link
Member Author

Thanks @jmorse! I suspected that was the case when filing and couldn't think of a good resolution other than simply picking one of the lines.

I'll note that gdb seems to pick a nearby, non-zero line from the line table where as lldb just reports zero. It might be reasonable to pick one of the lines rather than zero as users debugging optimized code are already used to the current line jumping around somewhat unexpectedly. That might be more intuitive for them than the current line simply disappearing entirely but the exact experience there is debugger specific.

@nnethercote
Copy link

It might be reasonable to pick one of the lines rather than zero

I filed an upstream issue related to this problem almost four years ago. The context for that is profiling rather than debugging, and I agree that picking one of the lines would be much better in that context. Even if the chosen line is slightly wrong it will still be in the same function. In contrast, zero only gives file-level data, which is much less useful when profiling.

@jmorse
Copy link
Member

jmorse commented Sep 11, 2023

The complete lack of source information for line-zero is indeed a serious problem -- I think (98% confidence) that we include the instruction in the lexical scope that's most-common to the merged instructions, so technically a debugger would be able to present something, however I know very little about the consumer side of things. We've steered clear of presenting sometimes-false information to developers in the past (when the compiler makes decisions) as we feel that it'll undermine developers' confidence in the accuracy of debug-info, doing more harm than good.

I'm going to close this as it's a known problem, but please do re-open if there's more to say. There's a sort-of vague plan of how we might address this in the future: we've got a prototype for the two-level-line-tables proposal that might be in DWARF6, that can express the ambiguous source-location problem, and there's the possibility that consumers could use the last-branch-record feature of modern processors to disambiguate merged source locations like these. It would require much more work, but in theory could address these issues.

@jmorse jmorse closed this as completed Sep 11, 2023
@pogo59
Copy link
Collaborator

pogo59 commented Sep 11, 2023

You could try -fdebug-info-for-profiling although offhand I don't remember exactly what that does.

@dwblaikie
Copy link
Collaborator

You could try -fdebug-info-for-profiling although offhand I don't remember exactly what that does.

Nothing helpful in this regard. (it adds things like discriminators to the line table and start lines to subprogram definitions even in -gmlt - to help account for code drift/unrelated source changes by computing function relative line numbers with that)

Both for users and profiles we generally consider it worse/bad to pick one of the locations of instructions merged from basic blocks - for interactive users, they may be mislead into believing some code is reachable when it isn't, and for profile guided optimizations, similarly - it'd be bad if they were lead to believe that one branch was 100% hot and the other 100% cold, despite that not being true, just because they sampled a merged instruction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants