[llvm-debuginfo-analyzer] Add support for LLVM IR format. #135440
base: users/SLTozer/debug-ssa-updater
Add support for the LLVM IR format so that logical views can be generated from it. Both the textual representation (.ll) and the bitcode (.bc) format are supported. Note: this patch requires "Add DebugSSAUpdater class to track debug value liveness" (llvm#135349).
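Since the two accepted inputs differ only in encoding, it may help to see how a buffer could be told apart before a reader is chosen. The sketch below is not taken from the patch: `IRKind` and `classifyIRBuffer` are hypothetical names, and it only relies on the fact that raw LLVM bitcode begins with the magic bytes `'B' 'C' 0xC0 0xDE`.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical classification used only for this sketch.
enum class IRKind { Bitcode, Textual };

// Raw LLVM bitcode starts with the magic bytes 'B' 'C' 0xC0 0xDE.
// Anything else is treated as textual IR here; a real tool would still
// have to parse the text to confirm it is valid IR. Bitcode wrapped in
// an additional header is not handled by this sketch.
inline IRKind classifyIRBuffer(std::string_view Buffer) {
  constexpr unsigned char Magic[] = {'B', 'C', 0xC0, 0xDE};
  if (Buffer.size() < sizeof(Magic))
    return IRKind::Textual;
  for (std::size_t I = 0; I < sizeof(Magic); ++I)
    if (static_cast<unsigned char>(Buffer[I]) != Magic[I])
      return IRKind::Textual;
  return IRKind::Bitcode;
}
```

Under these assumptions, a buffer read from `test-clang.bc` would be classified as `IRKind::Bitcode` and one read from `test-clang.ll` as `IRKind::Textual`.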
@llvm/pr-subscribers-debuginfo
Author: Carlos Alberto Enciso (CarlosAlbertoEnciso)
Changes
Add support for the LLVM IR format and be able to generate logical views. Both textual representation (.ll) and bitcode (.bc) format are supported. Note: This patch requires: Add DebugSSAUpdater class to track debug value liveness
Patch is 226.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135440.diff
35 Files Affected:
diff --git a/llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst b/llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst
index 60fa024db5e99..c5d3293c02f57 100644
--- a/llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst
+++ b/llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst
@@ -13,10 +13,11 @@ SYNOPSIS
DESCRIPTION
-----------
:program:`llvm-debuginfo-analyzer` parses debug and text sections in
-binary object files and prints their contents in a logical view, which
-is a human readable representation that closely matches the structure
-of the original user source code. Supported object file formats include
-ELF, Mach-O, WebAssembly, PDB and COFF.
+binary object files and textual IR representations and prints their
+contents in a logical view, which is a human readable representation
+that closely matches the structure of the original user source code.
+Supported object file formats include ELF, Mach-O, WebAssembly, PDB,
+COFF and IR (textual representation and bitcode).
The **logical view** abstracts the complexity associated with the
different low-level representations of the debugging information that
@@ -2124,6 +2125,138 @@ layout and given the number of matches.
-----------------------------
Total 71 8
+IR (Textual representation and bitcode) SUPPORT
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The example below shows the IR output generated by
+:program:`llvm-debuginfo-analyzer`. We compiled the example to IR for
+a 64-bit target with Clang (-O0 -g --target=x86_64-linux):
+
+.. code-block:: c++
+
+ 1 using INTPTR = const int *;
+ 2 int foo(INTPTR ParamPtr, unsigned ParamUnsigned, bool ParamBool) {
+ 3 if (ParamBool) {
+ 4 typedef int INTEGER;
+ 5 const INTEGER CONSTANT = 7;
+ 6 return CONSTANT;
+ 7 }
+ 8 return ParamUnsigned;
+ 9 }
+
+PRINT BASIC DETAILS
+^^^^^^^^^^^^^^^^^^^
+The following command prints basic details for all the logical elements
+sorted by the debug information internal offset; it includes its lexical
+level and debug info format.
+
+.. code-block:: none
+
+ llvm-debuginfo-analyzer --attribute=level,format
+ --output-sort=offset
+ --print=scopes,symbols,types,lines,instructions
+ test-clang.ll
+
+or
+
+.. code-block:: none
+
+ llvm-debuginfo-analyzer --attribute=level,format
+ --output-sort=offset
+ --print=elements
+ test-clang.ll
+
+Each row represents an element that is present within the debug
+information. The first column represents the scope level, followed by
+the associated line number (if any), and finally the description of
+the element.
+
+.. code-block:: none
+
+ Logical View:
+ [000] {File} 'test-clang.ll' -> Textual IR
+
+ [001] {CompileUnit} 'test.cpp'
+ [002] 2 {Function} extern not_inlined 'foo' -> 'int'
+ [003] {Block}
+ [004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
+ [004] 5 {Line}
+ [004] {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg !32'
+ [004] 6 {Line}
+ [004] {Code} 'store i32 7, ptr %retval, align 4, !dbg !33'
+ [004] 6 {Line}
+ [004] {Code} 'br label %return, !dbg !33'
+ [003] 2 {Parameter} 'ParamPtr' -> 'INTPTR'
+ [003] 2 {Parameter} 'ParamUnsigned' -> 'unsigned int'
+ [003] 2 {Parameter} 'ParamBool' -> 'bool'
+ [003] 4 {TypeAlias} 'INTEGER' -> 'int'
+ [003] 2 {Line}
+ [003] {Code} '%retval = alloca i32, align 4'
+ [003] {Code} '%ParamPtr.addr = alloca ptr, align 8'
+ [003] {Code} '%ParamUnsigned.addr = alloca i32, align 4'
+ [003] {Code} '%ParamBool.addr = alloca i8, align 1'
+ [003] {Code} '%CONSTANT = alloca i32, align 4'
+ [003] {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 8'
+ [003] {Code} 'store i32 %ParamUnsigned, ptr %ParamUnsigned.addr, align 4'
+ [003] {Code} '%storedv = zext i1 %ParamBool to i8'
+ [003] {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 1'
+ [003] 8 {Line}
+ [003] {Code} '%1 = load i32, ptr %ParamUnsigned.addr, align 4, !dbg !34'
+ [003] 8 {Line}
+ [003] {Code} 'store i32 %1, ptr %retval, align 4, !dbg !35'
+ [003] 8 {Line}
+ [003] {Code} 'br label %return, !dbg !35'
+ [003] 9 {Line}
+ [003] {Code} '%2 = load i32, ptr %retval, align 4, !dbg !36'
+ [003] 9 {Line}
+ [003] {Code} 'ret i32 %2, !dbg !36'
+ [003] 3 {Line}
+ [003] 3 {Line}
+ [003] 3 {Line}
+ [003] {Code} 'br i1 %loadedv, label %if.then, label %if.end, !dbg !26'
+ [002] 1 {TypeAlias} 'INTPTR' -> '* const int'
+
+SELECT LOGICAL ELEMENTS
+^^^^^^^^^^^^^^^^^^^^^^^
+The following prints all *instructions*, *symbols* and *types* that
+contain **'load'** or **'store'** (ignoring case) in their names or
+types, using a tab layout and giving the number of matches.
+
+.. code-block:: none
+
+ llvm-debuginfo-analyzer --attribute=level
+ --select-nocase --select-regex
+ --select=LOAD --select=store
+ --report=list
+ --print=symbols,types,instructions,summary
+ test-clang.ll
+
+ Logical View:
+ [000] {File} 'test-clang.ll'
+
+ [001] {CompileUnit} 'test.cpp'
+ [003] {Code} '%0 = load i8, ptr %ParamBool.addr, align 1, !dbg !26'
+ [003] {Code} '%1 = load i32, ptr %ParamUnsigned.addr, align 4, !dbg !34'
+ [003] {Code} '%2 = load i32, ptr %retval, align 4, !dbg !36'
+ [004] {Code} '%loadedv = trunc i8 %0 to i1, !dbg !26'
+ [003] {Code} '%storedv = zext i1 %ParamBool to i8'
+ [003] {Code} 'br i1 %loadedv, label %if.then, label %if.end, !dbg !26'
+ [003] {Code} 'store i32 %1, ptr %retval, align 4, !dbg !35'
+ [003] {Code} 'store i32 %ParamUnsigned, ptr %ParamUnsigned.addr, align 4'
+ [004] {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg !32'
+ [004] {Code} 'store i32 7, ptr %retval, align 4, !dbg !33'
+ [003] {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 1'
+ [003] {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 8'
+
+ -----------------------------
+ Element Total Printed
+ -----------------------------
+ Scopes 5 0
+ Symbols 4 0
+ Types 2 0
+ Lines 22 12
+ -----------------------------
+ Total 33 12
+
COMPARISON MODE
^^^^^^^^^^^^^^^
Given the previous example we found the above debug information issue
@@ -2197,6 +2330,34 @@ giving more context by swapping the reference and target object files.
The output shows the merging view path (reference and target) with the
missing and added elements.
+.. code-block:: none
+
+ llvm-debuginfo-analyzer --attribute=level,format
+ --compare=types
+ --report=view
+ --print=symbols,types
+ test-clang.bc test-dwarf-gcc.o
+
+ Reference: 'test-clang.bc'
+ Target: 'test-dwarf-gcc.o'
+
+ Logical View:
+ [000] {File} 'test-clang.bc' -> Bitcode IR
+
+ [001] {CompileUnit} 'test.cpp'
+ [002] 1 {TypeAlias} 'INTPTR' -> '* const int'
+ [002] 2 {Function} extern not_inlined 'foo' -> 'int'
+ [003] {Block}
+ [004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
+ +[004] 4 {TypeAlias} 'INTEGER' -> 'int'
+ [003] 2 {Parameter} 'ParamBool' -> 'bool'
+ [003] 2 {Parameter} 'ParamPtr' -> 'INTPTR'
+ [003] 2 {Parameter} 'ParamUnsigned' -> 'unsigned int'
+ -[003] 4 {TypeAlias} 'INTEGER' -> 'int'
+
+The same output but this time comparing the Clang bitcode with the
+binary object (DWARF) generated by GCC.
+
LOGICAL ELEMENTS
""""""""""""""""
It compares individual logical elements without considering if their
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
index 9ce26398e48df..c848ea44f63a7 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h
@@ -56,7 +56,7 @@ class LVSplitContext final {
/// The logical reader owns of all the logical elements created during
/// the debug information parsing. For its creation it uses a specific
-/// bump allocator for each type of logical element.
+/// bump allocator for each type of logical element.
class LVReader {
LVBinaryType BinaryType;
@@ -121,7 +121,24 @@ class LVReader {
#undef LV_OBJECT_ALLOCATOR
+ // Scopes with ranges for current compile unit. It is used to find a line
+ // giving its exact or closest address. To support comdat functions, all
+ // addresses for the same section are recorded in the same map.
+ using LVSectionRanges = std::map<LVSectionIndex, std::unique_ptr<LVRange>>;
+ LVSectionRanges SectionRanges;
+
protected:
+ // Current elements during the processing of a DIE/MDNode.
+ LVElement *CurrentElement = nullptr;
+ LVScope *CurrentScope = nullptr;
+ LVSymbol *CurrentSymbol = nullptr;
+ LVType *CurrentType = nullptr;
+ LVLine *CurrentLine = nullptr;
+ LVOffset CurrentOffset = 0;
+
+ // Address ranges collected for current DIE/MDNode/AST Node.
+ std::vector<LVAddressRange> CurrentRanges;
+
LVScopeRoot *Root = nullptr;
std::string InputFilename;
std::string FileFormatName;
@@ -132,11 +149,18 @@ class LVReader {
// Only for ELF format. The CodeView is handled in a different way.
LVSectionIndex DotTextSectionIndex = UndefinedSectionIndex;
+ void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope);
+ void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope,
+ LVAddress LowerAddress, LVAddress UpperAddress);
+ LVRange *getSectionRanges(LVSectionIndex SectionIndex);
+
// Record Compilation Unit entry.
void addCompileUnitOffset(LVOffset Offset, LVScopeCompileUnit *CompileUnit) {
CompileUnits.emplace(Offset, CompileUnit);
}
+ LVElement *createElement(dwarf::Tag Tag);
+
// Create the Scope Root.
virtual Error createScopes() {
Root = createScopeRoot();
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
index 8269344fe6efe..28f6dbcf4e2ae 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h
@@ -99,6 +99,19 @@ template <typename T> class LVProperties {
#define KIND_3(ENUM, FIELD, F1, F2, F3) \
BOOL_BIT_3(Kinds, ENUM, FIELD, F1, F2, F3)
+const int DEC_WIDTH = 8;
+inline FormattedNumber decValue(uint64_t N, unsigned Width = DEC_WIDTH) {
+ return format_decimal(N, Width);
+}
+
+// Output the decimal representation of 'Value'.
+inline std::string decString(uint64_t Value, size_t Width = DEC_WIDTH) {
+ std::string String;
+ raw_string_ostream Stream(String);
+ Stream << decValue(Value, Width);
+ return Stream.str();
+}
+
const int HEX_WIDTH = 12;
inline FormattedNumber hexValue(uint64_t N, unsigned Width = HEX_WIDTH,
bool Upper = false) {
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h b/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
index bf30501d00c1f..c743e517e371b 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h
@@ -17,6 +17,7 @@
#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
#include "llvm/DebugInfo/PDB/Native/PDBFile.h"
#include "llvm/Object/Archive.h"
+#include "llvm/Object/IRObjectFile.h"
#include "llvm/Object/MachOUniversal.h"
#include "llvm/Object/ObjectFile.h"
#include "llvm/Support/MemoryBuffer.h"
@@ -29,7 +30,9 @@ namespace logicalview {
using LVReaders = std::vector<std::unique_ptr<LVReader>>;
using ArgVector = std::vector<std::string>;
-using PdbOrObj = PointerUnion<object::ObjectFile *, pdb::PDBFile *>;
+using PdbOrObjOrIr =
+ PointerUnion<object::ObjectFile *, pdb::PDBFile *, object::IRObjectFile *,
+ MemoryBufferRef *, StringRef *>;
// This class performs the following tasks:
// - Creates a logical reader for every binary file in the command line,
@@ -60,9 +63,12 @@ class LVReaderHandler {
object::Binary &Binary);
Error handleObject(LVReaders &Readers, StringRef Filename, StringRef Buffer,
StringRef ExePath);
+ Error handleObject(LVReaders &Readers, StringRef Filename,
+ MemoryBufferRef Buffer);
- Error createReader(StringRef Filename, LVReaders &Readers, PdbOrObj &Input,
- StringRef FileFormatName, StringRef ExePath = {});
+ Error createReader(StringRef Filename, LVReaders &Readers,
+ PdbOrObjOrIr &Input, StringRef FileFormatName,
+ StringRef ExePath = {});
public:
LVReaderHandler() = delete;
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
index 9cda64e33ddf7..bd897a9944bf7 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h
@@ -25,6 +25,7 @@
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Object/COFF.h"
+#include "llvm/Object/IRObjectFile.h"
#include "llvm/Object/ObjectFile.h"
namespace llvm {
@@ -93,12 +94,6 @@ class LVBinaryReader : public LVReader {
SectionAddresses.emplace(Section.getAddress(), Section);
}
- // Scopes with ranges for current compile unit. It is used to find a line
- // giving its exact or closest address. To support comdat functions, all
- // addresses for the same section are recorded in the same map.
- using LVSectionRanges = std::map<LVSectionIndex, std::unique_ptr<LVRange>>;
- LVSectionRanges SectionRanges;
-
// Image base and virtual address for Executable file.
uint64_t ImageBaseAddress = 0;
uint64_t VirtualAddress = 0;
@@ -179,11 +174,6 @@ class LVBinaryReader : public LVReader {
Expected<std::pair<LVSectionIndex, object::SectionRef>>
getSection(LVScope *Scope, LVAddress Address, LVSectionIndex SectionIndex);
- void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope);
- void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope,
- LVAddress LowerAddress, LVAddress UpperAddress);
- LVRange *getSectionRanges(LVSectionIndex SectionIndex);
-
void includeInlineeLines(LVSectionIndex SectionIndex, LVScope *Function);
Error createInstructions();
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
index fdc97249d8e5a..3fb099924b5f2 100644
--- a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
@@ -39,12 +39,6 @@ class LVDWARFReader final : public LVBinaryReader {
LVAddress CUBaseAddress = 0;
LVAddress CUHighAddress = 0;
- // Current elements during the processing of a DIE.
- LVElement *CurrentElement = nullptr;
- LVScope *CurrentScope = nullptr;
- LVSymbol *CurrentSymbol = nullptr;
- LVType *CurrentType = nullptr;
- LVOffset CurrentOffset = 0;
LVOffset CurrentEndOffset = 0;
// In DWARF v4, the files are 1-indexed.
@@ -52,9 +46,6 @@ class LVDWARFReader final : public LVBinaryReader {
// The DWARF reader expects the indexes as 1-indexed.
bool IncrementFileIndex = false;
- // Address ranges collected for current DIE.
- std::vector<LVAddressRange> CurrentRanges;
-
// Symbols with locations for current compile unit.
LVSymbols SymbolsWithLocations;
@@ -82,7 +73,6 @@ class LVDWARFReader final : public LVBinaryReader {
void mapRangeAddress(const object::ObjectFile &Obj) override;
- LVElement *createElement(dwarf::Tag Tag);
void traverseDieAndChildren(DWARFDie &DIE, LVScope *Parent,
DWARFDie &SkeletonDie);
// Process the attributes for the given DIE.
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h
new file mode 100644
index 0000000000000..5c70ec9077791
--- /dev/null
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h
@@ -0,0 +1,300 @@
+//===-- LVIRReader.h --------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the LVIRReader class, which is used to describe an
+// LLVM IR reader.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+
+#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
+#include "llvm/Transforms/Utils/DebugSSAUpdater.h"
+
+namespace llvm {
+class DIFile;
+class DINode;
+class DILocation;
+class DIScope;
+class DISubprogram;
+class DIVariable;
+class BasicBlock;
+
+namespace object {
+class IRObjectFile;
+}
+
+namespace logicalview {
+
+class LVElement;
+class LVLine;
+class LVScopeCompileUnit;
+class LVSymbol;
+class LVType;
+
+class LVIRReader final : public LVReader {
+ object::IRObjectFile *BitCodeIR = nullptr;
+ MemoryBufferRef *TextualIR = nullptr;
+
+ // Symbols with locations for current compile unit.
+ LVSymbols SymbolsWithLocations;
+
+ LVSectionIndex SectionIndex = 0;
+
+ const DICompileUnit *CUNode = nullptr;
+
+ // The Dwarf Version (from the module flags).
+ unsigned DwarfVersion;
+
+ // Location index for global variables.
+ uint64_t PoolAddressIndex = 0;
+
+ // Whether to emit all linkage names, or just abstract subprograms.
+ bool UseAllLinkageNames = true;
+
+ // Dependencies on external options (llc, etc).
+ bool includeMinimalInlineScopes() const;
+ bool useAllLinkageNames() const { return UseAllLinkageNames; }
+
+ bool LanguageIsFortran = false;
+ void mapFortranLanguage(unsigned DWLang);
+ bool moduleIsInFortran() const { return LanguageIsFortran; }
+
+ // Generate logical debug line before prologue.
+ bool GenerateLineBeforePrologue = true;
+
+ // We assume a constant increase between instructions.
+ const unsigned OffsetIncrease = 4;
+ void updateLineOffset() { CurrentOffset += OffsetIncrease; }
+
+ // An anonymous type for index type.
+ LVType *NodeIndexType = nullptr;
+
+ std::unique_ptr<DbgValueRangeTable> DbgValueRanges;
+
+ // Record the last assigned file index for each compile unit.
+ using LVIndexFiles = std::map<LVScopeCompileUnit *, size_t>;
+ LVIndexFiles IndexFiles;
+
+ void updateFileIndex(LVScopeCompileUnit *CompileUnit, size_t FileIndex) {
+ LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+ if (Iter == IndexFiles.end())
+ IndexFiles.emplace(CompileUnit, FileIndex);
+ else
+ Iter->second = FileIndex;
+ }
+
+ // Get the current assigned index file for the given compile unit.
+ size_t getFileIndex(LVScopeCompileUnit *CompileUnit) {
+ size_t FileIndex = 0;
+ LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+ if (Iter != IndexFiles.end())
+ FileIndex = Iter->second;
+ return FileIndex;
+ }
+
+ // Collect the compile unit metadata files.
+ using LVCompileUnitFiles = std::map<const DIFile *, size_t>;
+ LVCompileUnitFiles CompileUnitFiles;
+
+ size_t getOrCreateSourceID(cons...
[truncated]
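IR instructions have no addresses, and the `OffsetIncrease` / `updateLineOffset()` members in `LVIRReader.h` suggest that the reader assigns each one a synthetic offset with a fixed step. The following is a minimal sketch of that idea, not the patch's implementation: `SyntheticOffsets` and `assign` are hypothetical, only the names `OffsetIncrease` and `CurrentOffset` come from the diff, and starting at offset 0 is an assumption.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the reader state; only the names
// OffsetIncrease and CurrentOffset come from the patch.
struct SyntheticOffsets {
  static constexpr unsigned OffsetIncrease = 4;
  std::uint64_t CurrentOffset = 0; // Assumed starting offset.

  // Pair each instruction with the current offset, then advance the
  // offset by the fixed step, mirroring what updateLineOffset() does.
  std::vector<std::pair<std::uint64_t, std::string>>
  assign(const std::vector<std::string> &Instructions) {
    std::vector<std::pair<std::uint64_t, std::string>> Result;
    for (const std::string &Text : Instructions) {
      Result.emplace_back(CurrentOffset, Text);
      CurrentOffset += OffsetIncrease;
    }
    return Result;
  }
};
```

With this scheme, three consecutive instructions receive offsets 0, 4 and 8, which is enough to give `--output-sort=offset` a stable ordering even though no machine code exists yet.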
@llvm/pr-subscribers-llvm-binary-utilities
- LVElement *CurrentElement = nullptr;
- LVScope *CurrentScope = nullptr;
- LVSymbol *CurrentSymbol = nullptr;
- LVType *CurrentType = nullptr;
- LVOffset CurrentOffset = 0;
LVOffset CurrentEndOffset = 0;
// In DWARF v4, the files are 1-indexed.
@@ -52,9 +46,6 @@ class LVDWARFReader final : public LVBinaryReader {
// The DWARF reader expects the indexes as 1-indexed.
bool IncrementFileIndex = false;
- // Address ranges collected for current DIE.
- std::vector<LVAddressRange> CurrentRanges;
-
// Symbols with locations for current compile unit.
LVSymbols SymbolsWithLocations;
@@ -82,7 +73,6 @@ class LVDWARFReader final : public LVBinaryReader {
void mapRangeAddress(const object::ObjectFile &Obj) override;
- LVElement *createElement(dwarf::Tag Tag);
void traverseDieAndChildren(DWARFDie &DIE, LVScope *Parent,
DWARFDie &SkeletonDie);
// Process the attributes for the given DIE.
diff --git a/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h
new file mode 100644
index 0000000000000..5c70ec9077791
--- /dev/null
+++ b/llvm/include/llvm/DebugInfo/LogicalView/Readers/LVIRReader.h
@@ -0,0 +1,300 @@
+//===-- LVIRReader.h --------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the LVIRReader class, which is used to describe an
+// LLVM IR reader.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+#define LLVM_DEBUGINFO_LOGICALVIEW_READERS_LVIRREADER_H
+
+#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
+#include "llvm/Transforms/Utils/DebugSSAUpdater.h"
+
+namespace llvm {
+class DIFile;
+class DINode;
+class DILocation;
+class DIScope;
+class DISubprogram;
+class DIVariable;
+class BasicBlock;
+
+namespace object {
+class IRObjectFile;
+}
+
+namespace logicalview {
+
+class LVElement;
+class LVLine;
+class LVScopeCompileUnit;
+class LVSymbol;
+class LVType;
+
+class LVIRReader final : public LVReader {
+ object::IRObjectFile *BitCodeIR = nullptr;
+ MemoryBufferRef *TextualIR = nullptr;
+
+ // Symbols with locations for current compile unit.
+ LVSymbols SymbolsWithLocations;
+
+ LVSectionIndex SectionIndex = 0;
+
+ const DICompileUnit *CUNode = nullptr;
+
+ // The Dwarf Version (from the module flags).
+ unsigned DwarfVersion;
+
+ // Location index for global variables.
+ uint64_t PoolAddressIndex = 0;
+
+ // Whether to emit all linkage names, or just abstract subprograms.
+ bool UseAllLinkageNames = true;
+
+ // Dependencies on external options (llc, etc).
+ bool includeMinimalInlineScopes() const;
+ bool useAllLinkageNames() const { return UseAllLinkageNames; }
+
+ bool LanguageIsFortran = false;
+ void mapFortranLanguage(unsigned DWLang);
+ bool moduleIsInFortran() const { return LanguageIsFortran; }
+
+ // Generate logical debug line before prologue.
+ bool GenerateLineBeforePrologue = true;
+
+ // We assume a constant increase between instructions.
+ const unsigned OffsetIncrease = 4;
+ void updateLineOffset() { CurrentOffset += OffsetIncrease; }
+
+ // An anonymous type for index type.
+ LVType *NodeIndexType = nullptr;
+
+ std::unique_ptr<DbgValueRangeTable> DbgValueRanges;
+
+ // Record the last assigned file index for each compile unit.
+ using LVIndexFiles = std::map<LVScopeCompileUnit *, size_t>;
+ LVIndexFiles IndexFiles;
+
+ void updateFileIndex(LVScopeCompileUnit *CompileUnit, size_t FileIndex) {
+ LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+ if (Iter == IndexFiles.end())
+ IndexFiles.emplace(CompileUnit, FileIndex);
+ else
+ Iter->second = FileIndex;
+ }
+
+ // Get the currently assigned file index for the given compile unit.
+ size_t getFileIndex(LVScopeCompileUnit *CompileUnit) {
+ size_t FileIndex = 0;
+ LVIndexFiles::iterator Iter = IndexFiles.find(CompileUnit);
+ if (Iter != IndexFiles.end())
+ FileIndex = Iter->second;
+ return FileIndex;
+ }
+
+ // Collect the compile unit metadata files.
+ using LVCompileUnitFiles = std::map<const DIFile *, size_t>;
+ LVCompileUnitFiles CompileUnitFiles;
+
+ size_t getOrCreateSourceID(cons...
[truncated]
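The per-compile-unit file index bookkeeping visible in `LVIRReader.h` (`updateFileIndex` / `getFileIndex`) can be illustrated outside LLVM. The following is only a minimal standalone sketch, not the patch's code: `CompileUnit` and `FileIndexTable` are hypothetical names standing in for `LVScopeCompileUnit` and the reader's private members, and `std::map::insert_or_assign` (C++17) is swapped in for the explicit find-then-emplace sequence used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for logicalview::LVScopeCompileUnit; only the
// object's address is used as the map key.
struct CompileUnit {};

// Hypothetical helper mirroring the IndexFiles map in the patch.
class FileIndexTable {
  std::map<const CompileUnit *, std::size_t> IndexFiles;

public:
  // Record the last file index assigned for CU, overwriting any
  // previously recorded value.
  void update(const CompileUnit *CU, std::size_t FileIndex) {
    IndexFiles.insert_or_assign(CU, FileIndex);
  }

  // Return the last recorded index for CU, or 0 if none was recorded.
  std::size_t get(const CompileUnit *CU) const {
    auto It = IndexFiles.find(CU);
    return It == IndexFiles.end() ? 0 : It->second;
  }
};
```

As in the patch, a lookup for a compile unit that has never been updated yields 0, so callers cannot tell "no files yet" apart from "last index is 0" without extra state.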
@CarlosAlbertoEnciso If this needs another pair of eyes, do not hesitate to add me as reviewer. Cheers!
@jalopezg-git Thanks; that would be great.