@CarlosAlbertoEnciso CarlosAlbertoEnciso commented Apr 11, 2025

Add support for the LLVM IR format so that logical views can be generated from it.

Both the textual representation (.ll) and the bitcode (.bc) format are supported.

llvmbot commented Apr 11, 2025

@llvm/pr-subscribers-debuginfo

Author: Carlos Alberto Enciso (CarlosAlbertoEnciso)

Changes

Add support for the LLVM IR format so that logical views can be generated from it.

Both the textual representation (.ll) and the bitcode (.bc) format are supported.

Note: This patch requires #135349 ("Add DebugSSAUpdater class to track debug value liveness").

Patch is 226.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135440.diff

35 Files Affected:

  • (modified) llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst (+165-4)
  • (modified) llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h (+25-1)
  • (modified) llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h (+13)
  • (modified) llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h (+9-3)
  • (modified) llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h (+1-11)
  • (modified) llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h (-10)
  • (added) llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h (+300)
  • (modified) llvm/lib/DebugInfo/LogicalView/CMakeLists.txt (+2)
  • (modified) llvm/lib/DebugInfo/LogicalView/Core/LVElement.cpp (+8-4)
  • (modified) llvm/lib/DebugInfo/LogicalView/Core/LVReader.cpp (+271)
  • (modified) llvm/lib/DebugInfo/LogicalView/LVReaderHandler.cpp (+37-8)
  • (modified) llvm/lib/DebugInfo/LogicalView/Readers/LVBinaryReader.cpp (-24)
  • (modified) llvm/lib/DebugInfo/LogicalView/Readers/LVDWARFReader.cpp (+6-218)
  • (added) llvm/lib/DebugInfo/LogicalView/Readers/LVIRReader.cpp (+2348)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/01-ir-compare-logical-elements.test (+121)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/01-ir-print-basic-details.test (+76)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/01-ir-select-logical-elements.test (+122)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/02-ir-logical-lines.test (+60)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/03-ir-incorrect-lexical-scope-typedef.test (+131)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/04-ir-missing-nested-enumerators.test (+144)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/05-ir-incorrect-lexical-scope-variable.test (+113)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/06-ir-full-logical-view.test (+118)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/07-ir-debug-formats.test (+48)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/Inputs/hello-world-clang.ll (+49)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/Inputs/pr-43860-clang.ll (+92)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/Inputs/pr-44884-clang.ll (+98)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/Inputs/pr-46466-clang.ll (+44)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/Inputs/test-clang-intrinsics.ll (+87)
  • (added) llvm/test/tools/llvm-debuginfo-analyzer/IR/Inputs/test-clang.ll (+83)
  • (modified) llvm/tools/llvm-debuginfo-analyzer/CMakeLists.txt (+2)
  • (modified) llvm/unittests/DebugInfo/LogicalView/CMakeLists.txt (+3)
  • (added) llvm/unittests/DebugInfo/LogicalView/IRReaderTest.cpp (+355)
  • (modified) llvm/unittests/DebugInfo/LogicalView/Inputs/README.md (+11)
  • (added) llvm/unittests/DebugInfo/LogicalView/Inputs/test-clang.bc ()
  • (added) llvm/unittests/DebugInfo/LogicalView/Inputs/test-clang.ll (+83)
diff --git a/llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst b/llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst
index 60fa024db5e99..c5d3293c02f57 100644
--- a/llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst
+++ b/llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst
@@ -13,10 +13,11 @@ SYNOPSIS
 DESCRIPTION
 -----------
 :program:`llvm-debuginfo-analyzer` parses debug and text sections in
-binary object files and prints their contents in a logical view, which
-is a human readable representation that closely matches the structure
-of the original user source code. Supported object file formats include
-ELF, Mach-O, WebAssembly, PDB and COFF.
+binary object files and textual IR representations and prints their
+contents in a logical view, which is a human readable representation
+that closely matches the structure of the original user source code.
+Supported object file formats include ELF, Mach-O, WebAssembly, PDB,
+COFF, IR (textual representation and bitcode).
 
 The **logical view** abstracts the complexity associated with the
 different low-level representations of the debugging information that
@@ -2124,6 +2125,138 @@ layout and given the number of matches.
   -----------------------------
   Total           71          8
 
+IR (Textual representation and bitcode) SUPPORT
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The example below shows the IR output generated by
+:program:`llvm-debuginfo-analyzer`. We compiled the example for an
+IR 64-bit target with Clang (-O0 -g --target=x86_64-linux):
+
+.. code-block:: c++
+
+  1  using INTPTR = const int *;
+  2  int foo(INTPTR ParamPtr, unsigned ParamUnsigned, bool ParamBool) {
+  3    if (ParamBool) {
+  4      typedef int INTEGER;
+  5      const INTEGER CONSTANT = 7;
+  6      return CONSTANT;
+  7    }
+  8    return ParamUnsigned;
+  9  }
+
+PRINT BASIC DETAILS
+^^^^^^^^^^^^^^^^^^^
+The following command prints basic details for all the logical elements
+sorted by the debug information internal offset; it includes their lexical
+level and the debug info format.
+
+.. code-block:: none
+
+  llvm-debuginfo-analyzer --attribute=level,format
+                          --output-sort=offset
+                          --print=scopes,symbols,types,lines,instructions
+                          test-clang.ll
+
+or
+
+.. code-block:: none
+
+  llvm-debuginfo-analyzer --attribute=level,format
+                          --output-sort=offset
+                          --print=elements
+                          test-clang.ll
+
+Each row represents an element that is present within the debug
+information. The first column represents the scope level, followed by
+the associated line number (if any), and finally the description of
+the element.
+
+.. code-block:: none
+
+  Logical View:
+  [000]           {File} 'test-clang.ll' -> Textual IR
+
+  [001]             {CompileUnit} 'test.cpp'
+  [002]     2         {Function} extern not_inlined 'foo' -> 'int'
+  [003]                 {Block}
+  [004]     5             {Variable} 'CONSTANT' -> 'const INTEGER'
+  [004]     5             {Line}
+  [004]                   {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg !32'
+  [004]     6             {Line}
+  [004]                   {Code} 'store i32 7, ptr %retval, align 4, !dbg !33'
+  [004]     6             {Line}
+  [004]                   {Code} 'br label %return, !dbg !33'
+  [003]     2           {Parameter} 'ParamPtr' -> 'INTPTR'
+  [003]     2           {Parameter} 'ParamUnsigned' -> 'unsigned int'
+  [003]     2           {Parameter} 'ParamBool' -> 'bool'
+  [003]     4           {TypeAlias} 'INTEGER' -> 'int'
+  [003]     2           {Line}
+  [003]                 {Code} '%retval = alloca i32, align 4'
+  [003]                 {Code} '%ParamPtr.addr = alloca ptr, align 8'
+  [003]                 {Code} '%ParamUnsigned.addr = alloca i32, align 4'
+  [003]                 {Code} '%ParamBool.addr = alloca i8, align 1'
+  [003]                 {Code} '%CONSTANT = alloca i32, align 4'
+  [003]                 {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 8'
+  [003]                 {Code} 'store i32 %ParamUnsigned, ptr %ParamUnsigned.addr, align 4'
+  [003]                 {Code} '%storedv = zext i1 %ParamBool to i8'
+  [003]                 {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 1'
+  [003]     8           {Line}
+  [003]                 {Code} '%1 = load i32, ptr %ParamUnsigned.addr, align 4, !dbg !34'
+  [003]     8           {Line}
+  [003]                 {Code} 'store i32 %1, ptr %retval, align 4, !dbg !35'
+  [003]     8           {Line}
+  [003]                 {Code} 'br label %return, !dbg !35'
+  [003]     9           {Line}
+  [003]                 {Code} '%2 = load i32, ptr %retval, align 4, !dbg !36'
+  [003]     9           {Line}
+  [003]                 {Code} 'ret i32 %2, !dbg !36'
+  [003]     3           {Line}
+  [003]     3           {Line}
+  [003]     3           {Line}
+  [003]                 {Code} 'br i1 %loadedv, label %if.then, label %if.end, !dbg !26'
+  [002]     1         {TypeAlias} 'INTPTR' -> '* const int'
+
+SELECT LOGICAL ELEMENTS
+^^^^^^^^^^^^^^^^^^^^^^^
+The following prints all *instructions*, *symbols* and *types* that
+contain **'LOAD'** or **'store'** (case-insensitive) in their names or
+types, using a tab layout and giving the number of matches.
+
+.. code-block:: none
+
+  llvm-debuginfo-analyzer --attribute=level
+                          --select-nocase --select-regex
+                          --select=LOAD --select=store
+                          --report=list
+                          --print=symbols,types,instructions,summary
+                          test-clang.ll
+
+  Logical View:
+  [000]           {File} 'test-clang.ll'
+
+  [001]           {CompileUnit} 'test.cpp'
+  [003]           {Code} '%0 = load i8, ptr %ParamBool.addr, align 1, !dbg !26'
+  [003]           {Code} '%1 = load i32, ptr %ParamUnsigned.addr, align 4, !dbg !34'
+  [003]           {Code} '%2 = load i32, ptr %retval, align 4, !dbg !36'
+  [004]           {Code} '%loadedv = trunc i8 %0 to i1, !dbg !26'
+  [003]           {Code} '%storedv = zext i1 %ParamBool to i8'
+  [003]           {Code} 'br i1 %loadedv, label %if.then, label %if.end, !dbg !26'
+  [003]           {Code} 'store i32 %1, ptr %retval, align 4, !dbg !35'
+  [003]           {Code} 'store i32 %ParamUnsigned, ptr %ParamUnsigned.addr, align 4'
+  [004]           {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg !32'
+  [004]           {Code} 'store i32 7, ptr %retval, align 4, !dbg !33'
+  [003]           {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 1'
+  [003]           {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 8'
+
+  -----------------------------
+  Element      Total    Printed
+  -----------------------------
+  Scopes           5          0
+  Symbols          4          0
+  Types            2          0
+  Lines           22         12
+  -----------------------------
+  Total           33         12
+
 COMPARISON MODE
 ^^^^^^^^^^^^^^^
 Given the previous example we found the above debug information issue
@@ -2197,6 +2330,34 @@ giving more context by swapping the reference and target object files.
 The output shows the merging view path (reference and target) with the
 missing and added elements.
 
+.. code-block:: none
+
+  llvm-debuginfo-analyzer --attribute=level,format
+                          --compare=types
+                          --report=view
+                          --print=symbols,types
+                          test-clang.bc test-dwarf-gcc.o
+
+  Reference: 'test-clang.bc'
+  Target:    'test-dwarf-gcc.o'
+
+  Logical View:
+   [000]           {File} 'test-clang.bc' -> Bitcode IR
+
+   [001]             {CompileUnit} 'test.cpp'
+   [002]     1         {TypeAlias} 'INTPTR' -> '* const int'
+   [002]     2         {Function} extern not_inlined 'foo' -> 'int'
+   [003]                 {Block}
+   [004]     5             {Variable} 'CONSTANT' -> 'const INTEGER'
+  +[004]     4             {TypeAlias} 'INTEGER' -> 'int'
+   [003]     2           {Parameter} 'ParamBool' -> 'bool'
+   [003]     2           {Parameter} 'ParamPtr' -> 'INTPTR'
+   [003]     2           {Parameter} 'ParamUnsigned' -> 'unsigned int'
+  -[003]     4           {TypeAlias} 'INTEGER' -> 'int'
+
+The output above is the same comparison, this time between the Clang
+bitcode and the binary object (DWARF) generated by GCC.
+
 LOGICAL ELEMENTS
 """"""""""""""""
 It compares individual logical elements without considering if their
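
The SELECT LOGICAL ELEMENTS example in the documentation hunk above combines `--select-nocase` and `--select-regex` with two patterns. As a rough illustration of that kind of filtering, here is a minimal standard-library sketch. The function name `selectInstructions` is hypothetical, and the sketch assumes substring-style regex search; how the tool matches internally is not shown in this excerpt.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of the patch): keep the instructions that
// match at least one pattern, ignoring case.
std::vector<std::string>
selectInstructions(const std::vector<std::string> &Instructions,
                   const std::vector<std::string> &Patterns) {
  std::vector<std::regex> Compiled;
  for (const std::string &Pattern : Patterns)
    Compiled.emplace_back(Pattern, std::regex::ECMAScript | std::regex::icase);

  std::vector<std::string> Selected;
  for (const std::string &Instruction : Instructions) {
    for (const std::regex &Regex : Compiled) {
      if (std::regex_search(Instruction, Regex)) {
        Selected.push_back(Instruction);
        break; // Report each instruction only once.
      }
    }
  }
  return Selected;
}
```

With the patterns `LOAD` and `store`, a search like this also keeps lines such as `%storedv = zext i1 %ParamBool to i8`, which is consistent with the sample output in the documentation.
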
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
index 9ce26398e48df..c848ea44f63a7 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
@@ -56,7 +56,7 @@ class LVSplitContext final {
 
 /// The logical reader owns of all the logical elements created during
 /// the debug information parsing. For its creation it uses a specific
-///  bump allocator for each type of logical element.
+/// bump allocator for each type of logical element.
 class LVReader {
   LVBinaryType BinaryType;
 
@@ -121,7 +121,24 @@ class LVReader {
 
 #undef LV_OBJECT_ALLOCATOR
 
+  // Scopes with ranges for current compile unit. It is used to find a line
+  // giving its exact or closest address. To support comdat functions, all
+  // addresses for the same section are recorded in the same map.
+  using LVSectionRanges = std::map<LVSectionIndex, std::unique_ptr<LVRange>>;
+  LVSectionRanges SectionRanges;
+
 protected:
+  // Current elements during the processing of a DIE/MDNode.
+  LVElement *CurrentElement = nullptr;
+  LVScope *CurrentScope = nullptr;
+  LVSymbol *CurrentSymbol = nullptr;
+  LVType *CurrentType = nullptr;
+  LVLine *CurrentLine = nullptr;
+  LVOffset CurrentOffset = 0;
+
+  // Address ranges collected for current DIE/MDNode/AST Node.
+  std::vector<LVAddressRange> CurrentRanges;
+
   LVScopeRoot *Root = nullptr;
   std::string InputFilename;
   std::string FileFormatName;
@@ -132,11 +149,18 @@ class LVReader {
   // Only for ELF format. The CodeView is handled in a different way.
   LVSectionIndex DotTextSectionIndex = UndefinedSectionIndex;
 
+  void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope);
+  void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope,
+                       LVAddress LowerAddress, LVAddress UpperAddress);
+  LVRange *getSectionRanges(LVSectionIndex SectionIndex);
+
   // Record Compilation Unit entry.
   void addCompileUnitOffset(LVOffset Offset, LVScopeCompileUnit *CompileUnit) {
     CompileUnits.emplace(Offset, CompileUnit);
   }
 
+  LVElement *createElement(dwarf::Tag Tag);
+
   // Create the Scope Root.
   virtual Error createScopes() {
     Root = createScopeRoot();
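
The `SectionRanges` member moved into `LVReader` above maps each section index to one owned range container, so that all addresses for a section end up in the same place. A minimal sketch of that get-or-create pattern follows. `SectionIndex`, `RangeList` and `SectionRangeTable` are hypothetical stand-ins for `LVSectionIndex`, `LVRange` and the reader, and creating the container on first use is an assumption, since the implementation of `getSectionRanges` is not part of this excerpt.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for LVSectionIndex and LVRange.
using SectionIndex = std::uint64_t;
struct RangeList {
  std::vector<std::pair<std::uint64_t, std::uint64_t>> Entries;
};

class SectionRangeTable {
  std::map<SectionIndex, std::unique_ptr<RangeList>> Ranges;

public:
  // Return the range list for the section, creating it on first use
  // (assumed behavior, see the note above).
  RangeList *getOrCreate(SectionIndex Index) {
    std::unique_ptr<RangeList> &Slot = Ranges[Index];
    if (!Slot)
      Slot = std::make_unique<RangeList>();
    return Slot.get();
  }

  // Record the address range [Lower, Upper] for the given section.
  void add(SectionIndex Index, std::uint64_t Lower, std::uint64_t Upper) {
    getOrCreate(Index)->Entries.emplace_back(Lower, Upper);
  }

  std::size_t sectionCount() const { return Ranges.size(); }
};
```

Keeping one container per section means two functions placed in the same section share a lookup structure, which matches the comdat remark in the comment.
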
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
index 8269344fe6efe..28f6dbcf4e2ae 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
@@ -99,6 +99,19 @@ template <typename T> class LVProperties {
 #define KIND_3(ENUM, FIELD, F1, F2, F3)                                        \
   BOOL_BIT_3(Kinds, ENUM, FIELD, F1, F2, F3)
 
+const int DEC_WIDTH = 8;
+inline FormattedNumber decValue(uint64_t N, unsigned Width = DEC_WIDTH) {
+  return format_decimal(N, Width);
+}
+
+// Output the decimal representation of 'Value'.
+inline std::string decString(uint64_t Value, size_t Width = DEC_WIDTH) {
+  std::string String;
+  raw_string_ostream Stream(String);
+  Stream << decValue(Value, Width);
+  return Stream.str();
+}
+
 const int HEX_WIDTH = 12;
 inline FormattedNumber hexValue(uint64_t N, unsigned Width = HEX_WIDTH,
                                 bool Upper = false) {
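
The new `decString` helper above formats a number as fixed-width decimal text through LLVM's `format_decimal`. For readers without an LLVM build at hand, the following standard-library sketch approximates the same idea. The name `decStringSketch` is hypothetical, and the sketch assumes right-justified output padded with spaces.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical std-only counterpart of decString: right-justify the decimal
// form of Value in a field of at least Width characters, padded with spaces.
inline std::string decStringSketch(std::uint64_t Value, unsigned Width = 8) {
  std::ostringstream Stream;
  Stream << std::setw(static_cast<int>(Width)) << Value;
  return Stream.str();
}
```

Values wider than the field are not truncated; the width is only a minimum.
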
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h b/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
index bf30501d00c1f..c743e517e371b 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
@@ -17,6 +17,7 @@
 #include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
 #include "llvm/DebugInfo/PDB/Native/PDBFile.h"
 #include "llvm/Object/Archive.h"
+#include "llvm/Object/IRObjectFile.h"
 #include "llvm/Object/MachOUniversal.h"
 #include "llvm/Object/ObjectFile.h"
 #include "llvm/Support/MemoryBuffer.h"
@@ -29,7 +30,9 @@ namespace logicalview {
 
 using LVReaders = std::vector<std::unique_ptr<LVReader>>;
 using ArgVector = std::vector<std::string>;
-using PdbOrObj = PointerUnion<object::ObjectFile *, pdb::PDBFile *>;
+using PdbOrObjOrIr =
+    PointerUnion<object::ObjectFile *, pdb::PDBFile *, object::IRObjectFile *,
+                 MemoryBufferRef *, StringRef *>;
 
 // This class performs the following tasks:
 // - Creates a logical reader for every binary file in the command line,
@@ -60,9 +63,12 @@ class LVReaderHandler {
                      object::Binary &Binary);
   Error handleObject(LVReaders &Readers, StringRef Filename, StringRef Buffer,
                      StringRef ExePath);
+  Error handleObject(LVReaders &Readers, StringRef Filename,
+                     MemoryBufferRef Buffer);
 
-  Error createReader(StringRef Filename, LVReaders &Readers, PdbOrObj &Input,
-                     StringRef FileFormatName, StringRef ExePath = {});
+  Error createReader(StringRef Filename, LVReaders &Readers,
+                     PdbOrObjOrIr &Input, StringRef FileFormatName,
+                     StringRef ExePath = {});
 
 public:
   LVReaderHandler() = delete;
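
The `PdbOrObjOrIr` alias above widens the input union so that one `createReader` entry point can receive object files, PDB files, bitcode and textual IR. The sketch below shows the same dispatch idea with `std::variant` (C++17) instead of LLVM's `PointerUnion`. All type names are hypothetical stand-ins, the `StringRef *` alternative is left out, and the way the patch actually selects a reader is not shown in this excerpt.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for the real input kinds.
struct ObjectFile {};
struct PDBFile {};
struct IRObjectFile {};
struct TextBuffer {};

using ReaderInput =
    std::variant<ObjectFile *, PDBFile *, IRObjectFile *, TextBuffer *>;

// Pick a reader kind from whichever alternative the input currently holds.
inline std::string readerKindFor(const ReaderInput &Input) {
  if (std::holds_alternative<ObjectFile *>(Input))
    return "binary object";
  if (std::holds_alternative<PDBFile *>(Input))
    return "PDB";
  if (std::holds_alternative<IRObjectFile *>(Input))
    return "bitcode IR";
  return "textual IR";
}
```

A tagged union keeps the handler signature stable when a new input kind is added; only the dispatch needs another branch.
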
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
index 9cda64e33ddf7..bd897a9944bf7 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
@@ -25,6 +25,7 @@
 #include "llvm/MC/MCSubtargetInfo.h"
 #include "llvm/MC/TargetRegistry.h"
 #include "llvm/Object/COFF.h"
+#include "llvm/Object/IRObjectFile.h"
 #include "llvm/Object/ObjectFile.h"
 
 namespace llvm {
@@ -93,12 +94,6 @@ class LVBinaryReader : public LVReader {
       SectionAddresses.emplace(Section.getAddress(), Section);
   }
 
-  // Scopes with ranges for current compile unit. It is used to find a line
-  // giving its exact or closest address. To support comdat functions, all
-  // addresses for the same section are recorded in the same map.
-  using LVSectionRanges = std::map<LVSectionIndex, std::unique_ptr<LVRange>>;
-  LVSectionRanges SectionRanges;
-
   // Image base and virtual address for Executable file.
   uint64_t ImageBaseAddress = 0;
   uint64_t VirtualAddress = 0;
@@ -179,11 +174,6 @@ class LVBinaryReader : public LVReader {
   Expected<std::pair<LVSectionIndex, object::SectionRef>>
   getSection(LVScope *Scope, LVAddress Address, LVSectionIndex SectionIndex);
 
-  void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope);
-  void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope,
-                       LVAddress LowerAddress, LVAddress UpperAddress);
-  LVRange *getSectionRanges(LVSectionIndex SectionIndex);
-
   void includeInlineeLines(LVSectionIndex SectionIndex, LVScope *Function);
 
   Error createInstructions();
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
index fdc97249d8e5a..3fb099924b5f2 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
@@ -39,12 +39,6 @@ class LVDWARFReader final : public LVBinaryReader {
   LVAddress CUBaseAddress = 0;
   LVAddress CUHighAddress = 0;
 
-  // Current elements during the processing of a DIE.
-  LVElement *CurrentElement = nullptr;
-  LVScope *CurrentScope = nullptr;
-  LVSymbol *CurrentSymbol = nullptr;
-  LVType *CurrentType = nullptr;
-  LVOffset CurrentOffset = 0;
   LVOffset CurrentEndOffset = 0;
 
   // In DWARF v4, the files are 1-indexed.
@@ -52,9 +46,6 @@ class LVDWARFReader final : public LVBinaryReader {
   // The DWARF reader expects the indexes as 1-indexed.
   bool IncrementFileIndex = false;
 
-  // Address ranges collected for current DIE.
-  std::vector<LVAddressRange> CurrentRanges;
-
   // Symbols with locations for current compile unit.
   LVSymbols SymbolsWithLocations;
 
@@ -82,7 +73,6 @@ class LVDWARFReader final : public LVBinaryReader {
 
   void mapRangeAddress(const object::ObjectFile &Obj) override;
 
-  LVElement *createElement(dwarf::Tag Tag);
   void traverseDieAndChildren(DWARFDie &DIE, LVScope *Parent,
                               DWARFDie &SkeletonDie);
   // Process the attributes for the given DIE.
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h
new file mode 100644
index 0000000000000..5c70ec9077791
--- /dev/null
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h
@@ -0,0 +1,300 @@
+//===-- LVIRReader.h --------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the LVIRReader class, which is used to describe an
+// LLVM IR reader.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+
+#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
+#include "llvm/Transforms/Utils/DebugSSAUpdater.h"
+
+namespace llvm {
+class DIFile;
+class DINode;
+class DILocation;
+class DIScope;
+class DISubprogram;
+class DIVariable;
+class BasicBlock;
+
+namespace object {
+class IRObjectFile;
+}
+
+namespace logicalview {
+
+class LVElement;
+class LVLine;
+class LVScopeCompileUnit;
+class LVSymbol;
+class LVType;
+
+class LVIRReader final : public LVReader {
+  object::IRObjectFile *BitCodeIR = nullptr;
+  MemoryBufferRef *TextualIR = nullptr;
+
+  // Symbols with locations for current compile unit.
+  LVSymbols SymbolsWithLocations;
+
+  LVSectionIndex SectionIndex = 0;
+
+  const DICompileUnit *CUNode = nullptr;
+
+  // The Dwarf Version (from the module flags).
+  unsigned DwarfVersion;
+
+  // Location index for global variables.
+  uint64_t PoolAddressIndex = 0;
+
+  // Whether to emit all linkage names, or just abstract subprograms.
+  bool UseAllLinkageNames = true;
+
+  // Dependencies on external options (llc, etc).
+  bool includeMinimalInlineScopes() const;
+  bool useAllLinkageNames() const { return UseAllLinkageNames; }
+
+  bool LanguageIsFortran = false;
+  void mapFortranLanguage(unsigned DWLang);
+  bool moduleIsInFortran() const { return LanguageIsFortran; }
+
+  // Generate logical debug line before prologue.
+  bool GenerateLineBeforePrologue = true;
+
+  // We assume a constant increase between instructions.
+  const unsigned OffsetIncrease = 4;
+  void updateLineOffset() { CurrentOffset += OffsetIncrease; }
+
+  // An anonymous type for index type.
+  LVType *NodeIndexType = nullptr;
+
+  std::unique_ptr<DbgValueRangeTable> DbgValueRanges;
+
+  // Record the last assigned file index for each compile unit.
+  using LVIndexFiles = std::map<LVScopeCompileUnit *, size_t>;
+  LVIndexFiles IndexFiles;
+
+  void updateFileIndex(LVScopeCompileUnit *CompileUnit, size_t FileIndex) {
+    LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+    if (Iter == IndexFiles.end())
+      IndexFiles.emplace(CompileUnit, FileIndex);
+    else
+      Iter->second = FileIndex;
+  }
+
+  // Get the currently assigned file index for the given compile unit.
+  size_t getFileIndex(LVScopeCompileUnit *CompileUnit) {
+    size_t FileIndex = 0;
+    LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+    if (Iter != IndexFiles.end())
+      FileIndex = Iter->second;
+    return FileIndex;
+  }
+
+  // Collect the compile unit metadata files.
+  using LVCompileUnitFiles = std::map<const DIFile *, size_t>;
+  LVCompileUnitFiles CompileUnitFiles;
+
+  size_t getOrCreateSourceID(cons...
[truncated]
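
The `LVIRReader` header above advances `CurrentOffset` by a fixed `OffsetIncrease` of 4 for each instruction, since IR instructions carry no addresses of their own. A small standalone sketch of that numbering scheme follows. The function name `assignSyntheticOffsets` is hypothetical, and starting at offset 0 and advancing after each instruction are assumptions, because the reader's implementation file is not part of this excerpt.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: pair each instruction with a synthetic offset that
// grows by a fixed step, standing in for a real address.
std::vector<std::pair<std::uint64_t, std::string>>
assignSyntheticOffsets(const std::vector<std::string> &Instructions,
                       std::uint64_t Step = 4) {
  std::vector<std::pair<std::uint64_t, std::string>> Result;
  std::uint64_t CurrentOffset = 0; // Assumed starting value.
  for (const std::string &Instruction : Instructions) {
    Result.emplace_back(CurrentOffset, Instruction);
    CurrentOffset += Step; // Same step idea as updateLineOffset().
  }
  return Result;
}
```

Monotonic offsets of this kind give options such as `--output-sort=offset` a stable ordering even though no machine code has been generated.
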

llvmbot commented Apr 11, 2025

@llvm/pr-subscribers-llvm-binary-utilities

+  [004]           {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg !32'
+  [004]           {Code} 'store i32 7, ptr %retval, align 4, !dbg !33'
+  [003]           {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 1'
+  [003]           {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 8'
+
+  -----------------------------
+  Element      Total    Printed
+  -----------------------------
+  Scopes           5          0
+  Symbols          4          0
+  Types            2          0
+  Lines           22         12
+  -----------------------------
+  Total           33         12
+
 COMPARISON MODE
 ^^^^^^^^^^^^^^^
 Given the previous example we found the above debug information issue
@@ -2197,6 +2330,34 @@ giving more context by swapping the reference and target object files.
 The output shows the merging view path (reference and target) with the
 missing and added elements.
 
+.. code-block:: none
+
+  llvm-debuginfo-analyzer --attribute=level,format
+                          --compare=types
+                          --report=view
+                          --print=symbols,types
+                          test-clang.bc test-dwarf-gcc.o
+
+  Reference: 'test-clang.bc'
+  Target:    'test-dwarf-gcc.o'
+
+  Logical View:
+   [000]           {File} 'test-clang.bc' -> Bitcode IR
+
+   [001]             {CompileUnit} 'test.cpp'
+   [002]     1         {TypeAlias} 'INTPTR' -> '* const int'
+   [002]     2         {Function} extern not_inlined 'foo' -> 'int'
+   [003]                 {Block}
+   [004]     5             {Variable} 'CONSTANT' -> 'const INTEGER'
+  +[004]     4             {TypeAlias} 'INTEGER' -> 'int'
+   [003]     2           {Parameter} 'ParamBool' -> 'bool'
+   [003]     2           {Parameter} 'ParamPtr' -> 'INTPTR'
+   [003]     2           {Parameter} 'ParamUnsigned' -> 'unsigned int'
+  -[003]     4           {TypeAlias} 'INTEGER' -> 'int'
+
+The same output but this time comparing the Clang bitcode with the
+binary object (DWARF) generated by GCC.
+
 LOGICAL ELEMENTS
 """"""""""""""""
 It compares individual logical elements without considering if their
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
index 9ce26398e48df..c848ea44f63a7 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
@@ -56,7 +56,7 @@ class LVSplitContext final {
 
 /// The logical reader owns of all the logical elements created during
 /// the debug information parsing. For its creation it uses a specific
-///  bump allocator for each type of logical element.
+/// bump allocator for each type of logical element.
 class LVReader {
   LVBinaryType BinaryType;
 
@@ -121,7 +121,24 @@ class LVReader {
 
 #undef LV_OBJECT_ALLOCATOR
 
+  // Scopes with ranges for current compile unit. It is used to find a line
+  // giving its exact or closest address. To support comdat functions, all
+  // addresses for the same section are recorded in the same map.
+  using LVSectionRanges = std::map<LVSectionIndex, std::unique_ptr<LVRange>>;
+  LVSectionRanges SectionRanges;
+
 protected:
+  // Current elements during the processing of a DIE/MDNode.
+  LVElement *CurrentElement = nullptr;
+  LVScope *CurrentScope = nullptr;
+  LVSymbol *CurrentSymbol = nullptr;
+  LVType *CurrentType = nullptr;
+  LVLine *CurrentLine = nullptr;
+  LVOffset CurrentOffset = 0;
+
+  // Address ranges collected for current DIE/MDNode/AST Node.
+  std::vector<LVAddressRange> CurrentRanges;
+
   LVScopeRoot *Root = nullptr;
   std::string InputFilename;
   std::string FileFormatName;
@@ -132,11 +149,18 @@ class LVReader {
   // Only for ELF format. The CodeView is handled in a different way.
   LVSectionIndex DotTextSectionIndex = UndefinedSectionIndex;
 
+  void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope);
+  void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope,
+                       LVAddress LowerAddress, LVAddress UpperAddress);
+  LVRange *getSectionRanges(LVSectionIndex SectionIndex);
+
   // Record Compilation Unit entry.
   void addCompileUnitOffset(LVOffset Offset, LVScopeCompileUnit *CompileUnit) {
     CompileUnits.emplace(Offset, CompileUnit);
   }
 
+  LVElement *createElement(dwarf::Tag Tag);
+
   // Create the Scope Root.
   virtual Error createScopes() {
     Root = createScopeRoot();
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
index 8269344fe6efe..28f6dbcf4e2ae 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
@@ -99,6 +99,19 @@ template <typename T> class LVProperties {
 #define KIND_3(ENUM, FIELD, F1, F2, F3)                                        \
   BOOL_BIT_3(Kinds, ENUM, FIELD, F1, F2, F3)
 
+const int DEC_WIDTH = 8;
+inline FormattedNumber decValue(uint64_t N, unsigned Width = DEC_WIDTH) {
+  return format_decimal(N, Width);
+}
+
+// Output the decimal representation of 'Value'.
+inline std::string decString(uint64_t Value, size_t Width = DEC_WIDTH) {
+  std::string String;
+  raw_string_ostream Stream(String);
+  Stream << decValue(Value, Width);
+  return Stream.str();
+}
+
 const int HEX_WIDTH = 12;
 inline FormattedNumber hexValue(uint64_t N, unsigned Width = HEX_WIDTH,
                                 bool Upper = false) {
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h b/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
index bf30501d00c1f..c743e517e371b 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
@@ -17,6 +17,7 @@
 #include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
 #include "llvm/DebugInfo/PDB/Native/PDBFile.h"
 #include "llvm/Object/Archive.h"
+#include "llvm/Object/IRObjectFile.h"
 #include "llvm/Object/MachOUniversal.h"
 #include "llvm/Object/ObjectFile.h"
 #include "llvm/Support/MemoryBuffer.h"
@@ -29,7 +30,9 @@ namespace logicalview {
 
 using LVReaders = std::vector<std::unique_ptr<LVReader>>;
 using ArgVector = std::vector<std::string>;
-using PdbOrObj = PointerUnion<object::ObjectFile *, pdb::PDBFile *>;
+using PdbOrObjOrIr =
+    PointerUnion<object::ObjectFile *, pdb::PDBFile *, object::IRObjectFile *,
+                 MemoryBufferRef *, StringRef *>;
 
 // This class performs the following tasks:
 // - Creates a logical reader for every binary file in the command line,
@@ -60,9 +63,12 @@ class LVReaderHandler {
                      object::Binary &Binary);
   Error handleObject(LVReaders &Readers, StringRef Filename, StringRef Buffer,
                      StringRef ExePath);
+  Error handleObject(LVReaders &Readers, StringRef Filename,
+                     MemoryBufferRef Buffer);
 
-  Error createReader(StringRef Filename, LVReaders &Readers, PdbOrObj &Input,
-                     StringRef FileFormatName, StringRef ExePath = {});
+  Error createReader(StringRef Filename, LVReaders &Readers,
+                     PdbOrObjOrIr &Input, StringRef FileFormatName,
+                     StringRef ExePath = {});
 
 public:
   LVReaderHandler() = delete;
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
index 9cda64e33ddf7..bd897a9944bf7 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
@@ -25,6 +25,7 @@
 #include "llvm/MC/MCSubtargetInfo.h"
 #include "llvm/MC/TargetRegistry.h"
 #include "llvm/Object/COFF.h"
+#include "llvm/Object/IRObjectFile.h"
 #include "llvm/Object/ObjectFile.h"
 
 namespace llvm {
@@ -93,12 +94,6 @@ class LVBinaryReader : public LVReader {
       SectionAddresses.emplace(Section.getAddress(), Section);
   }
 
-  // Scopes with ranges for current compile unit. It is used to find a line
-  // giving its exact or closest address. To support comdat functions, all
-  // addresses for the same section are recorded in the same map.
-  using LVSectionRanges = std::map<LVSectionIndex, std::unique_ptr<LVRange>>;
-  LVSectionRanges SectionRanges;
-
   // Image base and virtual address for Executable file.
   uint64_t ImageBaseAddress = 0;
   uint64_t VirtualAddress = 0;
@@ -179,11 +174,6 @@ class LVBinaryReader : public LVReader {
   Expected<std::pair<LVSectionIndex, object::SectionRef>>
   getSection(LVScope *Scope, LVAddress Address, LVSectionIndex SectionIndex);
 
-  void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope);
-  void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope,
-                       LVAddress LowerAddress, LVAddress UpperAddress);
-  LVRange *getSectionRanges(LVSectionIndex SectionIndex);
-
   void includeInlineeLines(LVSectionIndex SectionIndex, LVScope *Function);
 
   Error createInstructions();
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
index fdc97249d8e5a..3fb099924b5f2 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
@@ -39,12 +39,6 @@ class LVDWARFReader final : public LVBinaryReader {
   LVAddress CUBaseAddress = 0;
   LVAddress CUHighAddress = 0;
 
-  // Current elements during the processing of a DIE.
-  LVElement *CurrentElement = nullptr;
-  LVScope *CurrentScope = nullptr;
-  LVSymbol *CurrentSymbol = nullptr;
-  LVType *CurrentType = nullptr;
-  LVOffset CurrentOffset = 0;
   LVOffset CurrentEndOffset = 0;
 
   // In DWARF v4, the files are 1-indexed.
@@ -52,9 +46,6 @@ class LVDWARFReader final : public LVBinaryReader {
   // The DWARF reader expects the indexes as 1-indexed.
   bool IncrementFileIndex = false;
 
-  // Address ranges collected for current DIE.
-  std::vector<LVAddressRange> CurrentRanges;
-
   // Symbols with locations for current compile unit.
   LVSymbols SymbolsWithLocations;
 
@@ -82,7 +73,6 @@ class LVDWARFReader final : public LVBinaryReader {
 
   void mapRangeAddress(const object::ObjectFile &Obj) override;
 
-  LVElement *createElement(dwarf::Tag Tag);
   void traverseDieAndChildren(DWARFDie &DIE, LVScope *Parent,
                               DWARFDie &SkeletonDie);
   // Process the attributes for the given DIE.
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h
new file mode 100644
index 0000000000000..5c70ec9077791
--- /dev/null
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h
@@ -0,0 +1,300 @@
+//===-- LVIRReader.h --------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the LVIRReader class, which is used to describe an
+// LLVM IR reader.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+
+#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
+#include "llvm/Transforms/Utils/DebugSSAUpdater.h"
+
+namespace llvm {
+class DIFile;
+class DINode;
+class DILocation;
+class DIScope;
+class DISubprogram;
+class DIVariable;
+class BasicBlock;
+
+namespace object {
+class IRObjectFile;
+}
+
+namespace logicalview {
+
+class LVElement;
+class LVLine;
+class LVScopeCompileUnit;
+class LVSymbol;
+class LVType;
+
+class LVIRReader final : public LVReader {
+  object::IRObjectFile *BitCodeIR = nullptr;
+  MemoryBufferRef *TextualIR = nullptr;
+
+  // Symbols with locations for current compile unit.
+  LVSymbols SymbolsWithLocations;
+
+  LVSectionIndex SectionIndex = 0;
+
+  const DICompileUnit *CUNode = nullptr;
+
+  // The Dwarf Version (from the module flags).
+  unsigned DwarfVersion;
+
+  // Location index for global variables.
+  uint64_t PoolAddressIndex = 0;
+
+  // Whether to emit all linkage names, or just abstract subprograms.
+  bool UseAllLinkageNames = true;
+
+  // Dependencies on external options (llc, etc).
+  bool includeMinimalInlineScopes() const;
+  bool useAllLinkageNames() const { return UseAllLinkageNames; }
+
+  bool LanguageIsFortran = false;
+  void mapFortranLanguage(unsigned DWLang);
+  bool moduleIsInFortran() const { return LanguageIsFortran; }
+
+  // Generate logical debug line before prologue.
+  bool GenerateLineBeforePrologue = true;
+
+  // We assume a constant increase between instructions.
+  const unsigned OffsetIncrease = 4;
+  void updateLineOffset() { CurrentOffset += OffsetIncrease; }
+
+  // An anonymous type for index type.
+  LVType *NodeIndexType = nullptr;
+
+  std::unique_ptr<DbgValueRangeTable> DbgValueRanges;
+
+  // Record the last assigned file index for each compile unit.
+  using LVIndexFiles = std::map<LVScopeCompileUnit *, size_t>;
+  LVIndexFiles IndexFiles;
+
+  void updateFileIndex(LVScopeCompileUnit *CompileUnit, size_t FileIndex) {
+    LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+    if (Iter == IndexFiles.end())
+      IndexFiles.emplace(CompileUnit, FileIndex);
+    else
+      Iter->second = FileIndex;
+  }
+
+  // Get the current assigned index file for the given compile unit.
+  size_t getFileIndex(LVScopeCompileUnit *CompileUnit) {
+    size_t FileIndex = 0;
+    LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+    if (Iter != IndexFiles.end())
+      FileIndex = Iter->second;
+    return FileIndex;
+  }
+
+  // Collect the compile unit metadata files.
+  using LVCompileUnitFiles = std::map<const DIFile *, size_t>;
+  LVCompileUnitFiles CompileUnitFiles;
+
+  size_t getOrCreateSourceID(cons...
[truncated]

@jalopezg-git
Contributor

@CarlosAlbertoEnciso If this needs another pair of eyes, do not hesitate to add me as reviewer. Cheers!

@CarlosAlbertoEnciso
Member Author

> @CarlosAlbertoEnciso If this needs another pair of eyes, do not hesitate to add me as reviewer. Cheers!

@jalopezg-git Thanks; that would be great.

Member

@jmorse left a comment

Some initial comments; I made it to about 1000 lines into LVIRReader.cpp.

@CarlosAlbertoEnciso
Member Author

Rebased branch to take into account #142740 (Move some functionality to LVReader).

@CarlosAlbertoEnciso
Member Author

Uploaded a new patch to address @jmorse's comments.

@jalopezg-git jalopezg-git self-requested a review June 12, 2025 10:46
@CarlosAlbertoEnciso CarlosAlbertoEnciso changed the base branch from users/SLTozer/debug-ssa-updater to main September 16, 2025 20:43
@CarlosAlbertoEnciso CarlosAlbertoEnciso force-pushed the debuginfo-analyzer-ir-reader branch from 0bd9352 to 89741b2 Compare September 16, 2025 21:13
@github-actions

github-actions bot commented Sep 16, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@CarlosAlbertoEnciso
Member Author

The IR Reader depends on "Add DebugSSAUpdater class to track debug value liveness" (#135349), which has landed.
This updated patch is now based on main.

@CarlosAlbertoEnciso CarlosAlbertoEnciso force-pushed the debuginfo-analyzer-ir-reader branch 3 times, most recently from b79dd90 to c4064d8 Compare September 16, 2025 21:56
@CarlosAlbertoEnciso CarlosAlbertoEnciso force-pushed the debuginfo-analyzer-ir-reader branch from 57ffe50 to e83dc1a Compare September 27, 2025 20:03
Add support for the LLVM IR format and be able to generate
logical views.

Both textual representation (.ll) and bitcode (.bc) format
are supported.

Note: This patch requires:

  Add DebugSSAUpdater class to track debug value liveness
  llvm#135349

Add support for the LLVM IR format and be able to generate
logical views.

Both textual representation (.ll) and bitcode (.bc) format
are supported.

Note: This patch requires:

Add DebugSSAUpdater class to track debug value liveness
llvm#135349

Note: Address reviewers comments.

Fixed compilation issues after latest DebugSSAUpdater rebase.

Updated test cases that failed due to an upstream fix related
to the high-pc value being now exclusive.

  llvm#153318

Updated test cases that failed due to the DebugSSAUpdater rebase.

Add a missing space between the offset and 'ptr' for a
better layout.

  [004] 9 {Variable} 'CONSTANT' -> 'const INTEGER'
  [006]     {Location}
  [007]       {Entry} bregx 3 ptr %CONSTANT+0

Fix issue reported by Clang format.

Fix issue with C++98 error:

  extra ';' outside of a function is incompatible with C++98
  [-Werror,-Wc++98-compat-extra-semi]

Changes required after rebase with upstream to include:

  Remove LVScope::Children container.
  llvm#144750
@CarlosAlbertoEnciso CarlosAlbertoEnciso force-pushed the debuginfo-analyzer-ir-reader branch from e83dc1a to c3adfbc Compare September 27, 2025 20:08
@CarlosAlbertoEnciso
Member Author

CarlosAlbertoEnciso commented Sep 27, 2025

Rebased to include "Remove LVScope::Children container."

Contributor

@jalopezg-git left a comment

Many thanks for this huge contribution, @CarlosAlbertoEnciso 👍!

I went through the whole PR except for llvm/lib/DebugInfo/LogicalView/Readers/LVIRReader.cpp (which I will review in a second pass next week).

I have attached some comments that I have collected.

is a human-readable representation that closely matches the structure
of the original user source code. Supported object file formats include
ELF, Mach-O, WebAssembly, PDB and COFF.
binary object files and textual IR representations and prints their

According to the implementation in LVIRReader, it can take both textual and bitcode representations.

Suggested change
binary object files and textual IR representations and prints their
binary object files or LLVM IR textual / bitcode representation and prints their

contents in a logical view, which is a human readable representation
that closely matches the structure of the original user source code.
Supported object file formats include ELF, Mach-O, WebAssembly, PDB,
COFF, IR (textual representation and bitcode).

Suggested change
COFF, IR (textual representation and bitcode).
COFF, and IR (textual representation and bitcode).

Comment on lines +2135 to +2136
IR (Textual representation and bitcode) SUPPORT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suggested change
IR (Textual representation and bitcode) SUPPORT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LLVM IR (textual / bitcode representation) SUPPORT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

PRINT BASIC DETAILS
^^^^^^^^^^^^^^^^^^^
The following command prints basic details for all the logical elements
sorted by the debug information internal offset; it includes its lexical

Does it make sense to use --output-sort=id now?

SELECT LOGICAL ELEMENTS
^^^^^^^^^^^^^^^^^^^^^^^
The following prints all *instructions*, *symbols* and *types* that
contain **'block'** or **'.store'** in their names or types, using a tab

Suggested change
contain **'block'** or **'.store'** in their names or types, using a tab
contain **'LOAD'** or **'store'** in their names or types, using a tab


const DICompileUnit *CUNode = nullptr;

// The Dwarf Version (from the module flags).

Suggested change
// The Dwarf Version (from the module flags).
// The DWARF version (from the module flags).

Comment on lines +115 to +116
if (MDObjects.find(MD) == MDObjects.end())
MDObjects.emplace(MD, Element);

Consider replacing find() + emplace() by try_emplace().

Suggested change
if (MDObjects.find(MD) == MDObjects.end())
MDObjects.emplace(MD, Element);
MDObjects.try_emplace(MD, Element);
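
To illustrate why the two forms in the suggestion above behave the same, here is a minimal sketch using `std::map`. The `Registry` alias, the `int`/`std::string` key and value types, and both function names are hypothetical stand-ins; the actual type of `MDObjects` is not visible in this excerpt, so this only assumes a map-like container that provides `try_emplace` (C++17).

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the reader's metadata-to-element map.
using Registry = std::map<int, std::string>;

// Original pattern: one lookup in find(), a second one inside emplace().
// Returns true when a new entry was inserted.
bool addWithFindThenEmplace(Registry &Objects, int Key,
                            const std::string &Value) {
  if (Objects.find(Key) == Objects.end()) {
    Objects.emplace(Key, Value);
    return true;
  }
  return false;
}

// Suggested pattern: try_emplace (C++17) does a single lookup and leaves an
// existing entry untouched. Returns true when a new entry was inserted.
bool addWithTryEmplace(Registry &Objects, int Key, const std::string &Value) {
  return Objects.try_emplace(Key, Value).second;
}
```

Both helpers keep the first value stored for a key and ignore later attempts to insert the same key, so the replacement should not change behaviour; it only saves the redundant lookup.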

IMHO, we could add at least some brief documentation to some of the most important member functions in LVIRReader.


#define DEBUG_TYPE "Reader"

// Traverse all the logical elements and print its basic information.

Suggested change
// Traverse all the logical elements and print its basic information.
// Traverse all the logical elements and print their basic information.

#define DEBUG_TYPE "Reader"

// Traverse all the logical elements and print its basic information.
void printCollectedElements(LVScope *Root) {

This shares some code (at least the form) with https://github.com/llvm/llvm-project/pull/135440/files#diff-3763f99a47de5b744d37fdc5a1af9e2ddb2dc3bb57fae42bea2a35f0dddafcb3R72. I wonder if we could do something about it 🤔.

Contributor

@jalopezg-git left a comment

Thank you for this huge work, @CarlosAlbertoEnciso 👍! I have finalized the second (and final) part of my review and I have attached some comments below.

I would appreciate it, though, if an expert in the IR DebugInfo representation could take an additional look at llvm/lib/DebugInfo/LogicalView/Readers/LVIRReader.cpp.

//===----------------------------------------------------------------------===//
//
// This implements the LVIRReader class.
// It supports LLVM text IR and bitcode format.

Suggested change
// It supports LLVM text IR and bitcode format.
// It supports LLVM textual and bitcode IR format.


#define DEBUG_TYPE "IRReader"

// These flavours of DINodes are not implemented but technically possible:

Suggested change
// These flavours of DINodes are not implemented but technically possible:
// These flavours of `DINode`s are not implemented but technically possible:

Comment on lines +58 to +60
StringRef Name = getMDName(DN);
if (!Name.empty())
Element->setName(Name);

Suggested change
StringRef Name = getMDName(DN);
if (!Name.empty())
Element->setName(Name);
if (StringRef Name = getMDName(DN); !Name.empty())
Element->setName(Name);
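
The suggestion above relies on the C++17 `if` statement with an initializer, which limits the scope of `Name` to the `if` itself. A small self-contained sketch of the two forms follows; `Element`, `lookupName`, and the two `nameElement*` functions are hypothetical names used only for illustration and are not part of the patch.

```cpp
#include <string>
#include <string_view>

// Hypothetical element with a settable name.
struct Element {
  std::string Name;
  void setName(std::string_view NewName) { Name = NewName; }
};

// Hypothetical lookup: only id 1 has a name.
std::string_view lookupName(int Id) {
  return Id == 1 ? std::string_view("foo") : std::string_view();
}

// Original form: Name stays in scope after the check.
void nameElementClassic(Element &E, int Id) {
  std::string_view Name = lookupName(Id);
  if (!Name.empty())
    E.setName(Name);
}

// Suggested form: Name is visible only inside the if statement.
void nameElementScoped(Element &E, int Id) {
  if (std::string_view Name = lookupName(Id); !Name.empty())
    E.setName(Name);
}
```

The two functions have the same effect; the scoped form simply prevents accidental reuse of `Name` later in the enclosing function.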

if (const DIFile *File = getMDFile(DN))
getOrCreateSourceID(File);
}

Suggested change

return Element;
}

void LVIRReader::mapFortranLanguage(unsigned DWLang) {

After the introduction of LVSourceLanguage, I wonder if we could (somehow) benefit from it here 🤔.

LVScope *Parent = Scope ? Scope : getParentScope(Ty);
Parent->addElement(Element);

if (/*const DIBasicType *BT =*/dyn_cast<DIBasicType>(Ty)) {

Suggested change
if (/*const DIBasicType *BT =*/dyn_cast<DIBasicType>(Ty)) {
if (isa<DIBasicType>(Ty)) {

if (N > 1) {
const DIType *Ty = Args[N - 1];
if (!Ty)
AddUnspecifiedParameters = true;

Ditto (about DW_TAG_unspecified_parameters) -- I am not sure about this one though.

BitCodeIR ? BitCodeIR->getMemoryBufferRef() : *TextualIR, Err, Context);
if (!M)
return createStringError(errc::invalid_argument,
"Could not create IR information: %s",

Suggested change
"Could not create IR information: %s",
"Could not create IR module for: %s",

Comment on lines +2039 to +2040
if (options().getAttributeProducer())
CompileUnit->setProducer(CU->getProducer());

I think we should also set the CompileUnit source language, i.e.

Suggested change
if (options().getAttributeProducer())
CompileUnit->setProducer(CU->getProducer());
if (options().getAttributeLanguage())
CompileUnit->setSourceLanguage(LVSourceLanguage{static_cast<llvm::dwarf::SourceLanguage>(CU->getSourceLanguage())});
if (options().getAttributeProducer())
CompileUnit->setProducer(CU->getProducer());

Comment on lines +2263 to +2264
// During the IR Reader development, traverse all the logical elements
// to check if they have been properly constructed (finalized).

This comment is a bit confusing; maybe you meant

Suggested change
// During the IR Reader development, traverse all the logical elements
// to check if they have been properly constructed (finalized).
// During the IR-to-logical-view construction, traverse all the logical elements
// to check if they have been properly constructed (finalized).

?

@CarlosAlbertoEnciso
Member Author

@jalopezg-git Thanks very much for your feedback.
