Skip to content

[LLDB] Add type summaries for MSVC STL strings #143177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Nerixyz
Copy link
Contributor

@Nerixyz Nerixyz commented Jun 6, 2025

This PR adds type summaries for std::{string,wstring,u8string,u16string,u32string} from the MSVC STL.

See #24834 for the MSVC STL issue.

The following changes were made:

  • dotest.py now detects if the MSVC STL is available. It does so by looking at the target triple, which is an additional argument passed from Lit. It specifically checks for windows-msvc to not match on windows-gnu (i.e. MinGW/Cygwin).
  • (The main part): Added support for summarizing std::(w)string from MSVC's STL. Because the type names from the libstdc++ (pre C++ 11) string types are the same as on MSVC's STL, CXXCompositeSummaryFormat is used with two entries, one for MSVC's STL and one for libstdc++.
    With MSVC's STL, std::u{8,16,32}string is also handled. These aren't handled for libstdc++, so I put them in LoadMsvcStlFormatters.

Copy link

github-actions bot commented Jun 6, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch 7 times, most recently from 938bc84 to 00e7e4c Compare June 11, 2025 13:43
@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch 3 times, most recently from 519f089 to d953276 Compare June 14, 2025 19:51
Michael137 pushed a commit that referenced this pull request Jun 17, 2025
As part of #143177, I moved the
non-libc++ specific formatting of `std::string`s out to `CxxStringTypes`
as MSVC's STL `std::string` can also be thought of a pointer+size pair.
I named this kind of string "string buffer".

This PR picks that change, so the MSVC PR can be smaller.
Unfortunately, libstdc++'s `std::string` does not fit this (it also uses
a different string printer function).

This resolves two FIXMEs in the libc++ tests, where empty u16 and u32
strings didn't have any prefix (u/U).
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 17, 2025
As part of llvm/llvm-project#143177, I moved the
non-libc++ specific formatting of `std::string`s out to `CxxStringTypes`
as MSVC's STL `std::string` can also be thought of a pointer+size pair.
I named this kind of string "string buffer".

This PR picks that change, so the MSVC PR can be smaller.
Unfortunately, libstdc++'s `std::string` does not fit this (it also uses
a different string printer function).

This resolves two FIXMEs in the libc++ tests, where empty u16 and u32
strings didn't have any prefix (u/U).
@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch from d953276 to 3033eb6 Compare June 17, 2025 19:55
@Nerixyz
Copy link
Contributor Author

Nerixyz commented Jun 17, 2025

I would love some general feedback on this PR. I updated the description to reflect the current state.

For some reason, the Windows CI doesn't run on this PR, but I ran the STL tests locally.

@Nerixyz Nerixyz changed the title [LLDB] WIP: Add type summaries for MSVC STL strings [LLDB] Add type summaries for MSVC STL strings Jun 17, 2025
@Nerixyz Nerixyz marked this pull request as ready for review June 17, 2025 20:57
@Nerixyz Nerixyz requested a review from JDevlieghere as a code owner June 17, 2025 20:57
@llvmbot llvmbot added the lldb label Jun 17, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 17, 2025

@llvm/pr-subscribers-lldb

Author: nerix (Nerixyz)

Changes

This PR adds type summaries for std::{string,wstring,u8string,u16string,u32string} from the MSVC STL.

See #24834 for the MSVC STL issue.

The following changes were made:

  • Added a new type summary kind CXXCompositeSummaryFormat. It holds a list of child summaries paired with a validator function that checks if a ValueObject can be formatted by a child. This feels like it should be done in another PR, but there aren't any tests for it other than the STL formatters (since it's only in C++ it can't be used in Python).
    • As part of this, the formatter data (callback/format string) from CXXFunctionSummaryFormat and StringSummaryFormat was extracted to CXXFunctionSummaryData and StringSummaryData respectively. This allows the composite format to use the same data and FormatObject of said data.
  • dotest.py now detects if the MSVC STL is available. It does so by looking at the target triple, which is an additional argument passed from Lit. It specifically checks for windows-msvc to not match on windows-gnu (i.e. MinGW/Cygwin).
  • (The main part): Added support for summarizing std::(w)string from MSVC's STL. Because the type names from the libstdc++ (pre C++ 11) string types are the same as on MSVC's STL, CXXCompositeSummaryFormat is used with two entries, one for MSVC's STL and one for libstdc++.
    With MSVC's STL, std::u{8,16,32}string is also handled. These aren't handled for libstdc++, so I put them in LoadMsvcStlFormatters.

Patch is 53.05 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143177.diff

21 Files Affected:

  • (modified) lldb/include/lldb/DataFormatters/StringPrinter.h (+15)
  • (modified) lldb/include/lldb/DataFormatters/TypeSummary.h (+91-11)
  • (modified) lldb/include/lldb/DataFormatters/ValueObjectPrinter.h (+1-1)
  • (modified) lldb/packages/Python/lldbsuite/test/configuration.py (+2)
  • (modified) lldb/packages/Python/lldbsuite/test/dotest.py (+23)
  • (modified) lldb/packages/Python/lldbsuite/test/dotest_args.py (+6)
  • (modified) lldb/packages/Python/lldbsuite/test/test_categories.py (+1)
  • (modified) lldb/source/API/SBTypeSummary.cpp (+10-1)
  • (modified) lldb/source/Commands/CommandObjectType.cpp (+4-3)
  • (modified) lldb/source/DataFormatters/TypeSummary.cpp (+112-31)
  • (modified) lldb/source/Plugins/Language/CPlusPlus/CMakeLists.txt (+1)
  • (modified) lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp (+115-31)
  • (added) lldb/source/Plugins/Language/CPlusPlus/MsvcStl.cpp (+140)
  • (added) lldb/source/Plugins/Language/CPlusPlus/MsvcStl.h (+33)
  • (added) lldb/test/API/functionalities/data-formatter/data-formatter-stl/msvcstl/string/Makefile (+4)
  • (added) lldb/test/API/functionalities/data-formatter/data-formatter-stl/msvcstl/string/TestDataFormatterStdString.py (+118)
  • (added) lldb/test/API/functionalities/data-formatter/data-formatter-stl/msvcstl/string/main.cpp (+40)
  • (added) lldb/test/API/functionalities/data-formatter/data-formatter-stl/msvcstl/u8string/Makefile (+4)
  • (added) lldb/test/API/functionalities/data-formatter/data-formatter-stl/msvcstl/u8string/TestDataFormatterStdU8String.py (+31)
  • (added) lldb/test/API/functionalities/data-formatter/data-formatter-stl/msvcstl/u8string/main.cpp (+14)
  • (modified) lldb/test/API/lit.cfg.py (+3)
diff --git a/lldb/include/lldb/DataFormatters/StringPrinter.h b/lldb/include/lldb/DataFormatters/StringPrinter.h
index 4169f53e63f38..4ebe712be60e1 100644
--- a/lldb/include/lldb/DataFormatters/StringPrinter.h
+++ b/lldb/include/lldb/DataFormatters/StringPrinter.h
@@ -152,6 +152,21 @@ class StringPrinter {
   template <StringElementType element_type>
   static bool
   ReadBufferAndDumpToStream(const ReadBufferAndDumpToStreamOptions &options);
+
+  template <StringElementType element_type>
+  static constexpr uint64_t ElementByteSize() {
+    switch (element_type) {
+    case StringElementType::ASCII:
+    case StringElementType::UTF8:
+      return 1;
+    case StringElementType::UTF16:
+      return 2;
+    case StringElementType::UTF32:
+      return 3;
+    default:
+      return 0;
+    }
+  }
 };
 
 } // namespace formatters
diff --git a/lldb/include/lldb/DataFormatters/TypeSummary.h b/lldb/include/lldb/DataFormatters/TypeSummary.h
index 589f68c2ce314..ea32b31c0b2ae 100644
--- a/lldb/include/lldb/DataFormatters/TypeSummary.h
+++ b/lldb/include/lldb/DataFormatters/TypeSummary.h
@@ -48,7 +48,14 @@ class TypeSummaryOptions {
 
 class TypeSummaryImpl {
 public:
-  enum class Kind { eSummaryString, eScript, eBytecode, eCallback, eInternal };
+  enum class Kind {
+    eSummaryString,
+    eScript,
+    eBytecode,
+    eCallback,
+    eComposite,
+    eInternal
+  };
 
   virtual ~TypeSummaryImpl() = default;
 
@@ -292,18 +299,29 @@ class TypeSummaryImpl {
   const TypeSummaryImpl &operator=(const TypeSummaryImpl &) = delete;
 };
 
-// simple string-based summaries, using ${var to show data
-struct StringSummaryFormat : public TypeSummaryImpl {
+struct StringSummaryData {
   std::string m_format_str;
   FormatEntity::Entry m_format;
   Status m_error;
 
+  StringSummaryData(const char *f);
+  void SetFormat(const char *f);
+
+  bool FormatObject(ValueObject *valobj, std::string &dest,
+                    const TypeSummaryOptions &options,
+                    const TypeSummaryImpl &summary);
+};
+
+// simple string-based summaries, using ${var to show data
+struct StringSummaryFormat : public TypeSummaryImpl {
+  StringSummaryData m_data;
+
   StringSummaryFormat(const TypeSummaryImpl::Flags &flags, const char *f,
                       uint32_t ptr_match_depth = 1);
 
   ~StringSummaryFormat() override = default;
 
-  const char *GetSummaryString() const { return m_format_str.c_str(); }
+  const char *GetSummaryString() const { return m_data.m_format_str.c_str(); }
 
   void SetSummaryString(const char *f);
 
@@ -323,15 +341,23 @@ struct StringSummaryFormat : public TypeSummaryImpl {
   const StringSummaryFormat &operator=(const StringSummaryFormat &) = delete;
 };
 
-// summaries implemented via a C++ function
-struct CXXFunctionSummaryFormat : public TypeSummaryImpl {
+struct CXXFunctionSummaryData {
   // we should convert these to SBValue and SBStream if we ever cross the
   // boundary towards the external world
-  typedef std::function<bool(ValueObject &, Stream &,
-                             const TypeSummaryOptions &)>
-      Callback;
+  using Callback =
+      std::function<bool(ValueObject &, Stream &, const TypeSummaryOptions &)>;
 
   Callback m_impl;
+
+  bool FormatObject(ValueObject *valobj, std::string &dest,
+                    const TypeSummaryOptions &options,
+                    const TypeSummaryImpl &summary);
+};
+
+// summaries implemented via a C++ function
+struct CXXFunctionSummaryFormat : public TypeSummaryImpl {
+  using Callback = CXXFunctionSummaryData::Callback;
+  CXXFunctionSummaryData m_data;
   std::string m_description;
 
   CXXFunctionSummaryFormat(const TypeSummaryImpl::Flags &flags, Callback impl,
@@ -340,11 +366,13 @@ struct CXXFunctionSummaryFormat : public TypeSummaryImpl {
 
   ~CXXFunctionSummaryFormat() override = default;
 
-  Callback GetBackendFunction() const { return m_impl; }
+  Callback GetBackendFunction() const { return m_data.m_impl; }
 
   const char *GetTextualInfo() const { return m_description.c_str(); }
 
-  void SetBackendFunction(Callback cb_func) { m_impl = std::move(cb_func); }
+  void SetBackendFunction(Callback cb_func) {
+    m_data.m_impl = std::move(cb_func);
+  }
 
   void SetTextualInfo(const char *descr) {
     if (descr)
@@ -372,6 +400,58 @@ struct CXXFunctionSummaryFormat : public TypeSummaryImpl {
   operator=(const CXXFunctionSummaryFormat &) = delete;
 };
 
+/// A summary that supports multiple type layouts for the same type name.
+///
+/// This summary maintains a list of child summaries, each paired with a
+/// validator function. An associated validator is used to determine which child
+/// is appropriate for formatting a given ValueObject.
+struct CXXCompositeSummaryFormat : public TypeSummaryImpl {
+  using Validator = bool(ValueObject &valobj);
+  using ChildData = std::variant<CXXFunctionSummaryData, StringSummaryData>;
+  using Child = std::pair<Validator *, ChildData>;
+  using Children = llvm::SmallVector<Child, 2>;
+
+  Children m_children;
+  std::string m_description;
+
+  CXXCompositeSummaryFormat(const TypeSummaryImpl::Flags &flags,
+                            const char *description, Children impls = {},
+                            uint32_t ptr_match_depth = 1);
+
+  ~CXXCompositeSummaryFormat() override = default;
+
+  const char *GetTextualInfo() const { return m_description.c_str(); }
+
+  CXXCompositeSummaryFormat *Append(Validator *validator, ChildData impl);
+
+  void SetTextualInfo(const char *descr) {
+    if (descr)
+      m_description.assign(descr);
+    else
+      m_description.clear();
+  }
+
+  Children CopyChildren() const;
+
+  bool FormatObject(ValueObject *valobj, std::string &dest,
+                    const TypeSummaryOptions &options) override;
+
+  std::string GetDescription() override;
+
+  static bool classof(const TypeSummaryImpl *S) {
+    return S->GetKind() == Kind::eComposite;
+  }
+
+  std::string GetName() override;
+
+  using SharedPointer = std::shared_ptr<CXXCompositeSummaryFormat>;
+
+private:
+  CXXCompositeSummaryFormat(const CXXCompositeSummaryFormat &) = delete;
+  const CXXCompositeSummaryFormat &
+  operator=(const CXXCompositeSummaryFormat &) = delete;
+};
+
 // Python-based summaries, running script code to show data
 struct ScriptSummaryFormat : public TypeSummaryImpl {
   std::string m_function_name;
diff --git a/lldb/include/lldb/DataFormatters/ValueObjectPrinter.h b/lldb/include/lldb/DataFormatters/ValueObjectPrinter.h
index f9deb6ebf8d6d..1f20c3180d37a 100644
--- a/lldb/include/lldb/DataFormatters/ValueObjectPrinter.h
+++ b/lldb/include/lldb/DataFormatters/ValueObjectPrinter.h
@@ -165,7 +165,7 @@ class ValueObjectPrinter {
   std::string m_error;
   bool m_val_summary_ok;
 
-  friend struct StringSummaryFormat;
+  friend struct StringSummaryData;
 
   ValueObjectPrinter(const ValueObjectPrinter &) = delete;
   const ValueObjectPrinter &operator=(const ValueObjectPrinter &) = delete;
diff --git a/lldb/packages/Python/lldbsuite/test/configuration.py b/lldb/packages/Python/lldbsuite/test/configuration.py
index b2d91fd211477..4ec892bef69f9 100644
--- a/lldb/packages/Python/lldbsuite/test/configuration.py
+++ b/lldb/packages/Python/lldbsuite/test/configuration.py
@@ -134,6 +134,8 @@
 libcxx_include_target_dir = None
 libcxx_library_dir = None
 
+target_triple = None
+
 # A plugin whose tests will be enabled, like intel-pt.
 enabled_plugins = []
 
diff --git a/lldb/packages/Python/lldbsuite/test/dotest.py b/lldb/packages/Python/lldbsuite/test/dotest.py
index d7f274ac4f60e..dd6fbdf8daed4 100644
--- a/lldb/packages/Python/lldbsuite/test/dotest.py
+++ b/lldb/packages/Python/lldbsuite/test/dotest.py
@@ -299,6 +299,8 @@ def parseOptionsAndInitTestdirs():
     configuration.libcxx_library_dir = args.libcxx_library_dir
     configuration.cmake_build_type = args.cmake_build_type.lower()
 
+    configuration.target_triple = args.target_triple
+
     if args.channels:
         lldbtest_config.channels = args.channels
 
@@ -831,6 +833,26 @@ def checkLibstdcxxSupport():
     configuration.skip_categories.append("libstdcxx")
 
 
+def canRunMsvcStlTests():
+    if "windows-msvc" in configuration.target_triple:
+        return True, "MSVC STL is present on *windows-msvc*"
+    return (
+        False,
+        f"Don't know how to build with MSVC's STL on {configuration.target_triple}",
+    )
+
+
+def checkMsvcStlSupport():
+    result, reason = canRunMsvcStlTests()
+    if result:
+        return  # msvcstl supported
+    if "msvcstl" in configuration.categories_list:
+        return  # msvcstl category explicitly requested, let it run.
+    if configuration.verbose:
+        print(f"msvcstl tests will not be run because: {reason}")
+    configuration.skip_categories.append("msvcstl")
+
+
 def canRunWatchpointTests():
     from lldbsuite.test import lldbplatformutil
 
@@ -1044,6 +1066,7 @@ def run_suite():
 
     checkLibcxxSupport()
     checkLibstdcxxSupport()
+    checkMsvcStlSupport()
     checkWatchpointSupport()
     checkDebugInfoSupport()
     checkDebugServerSupport()
diff --git a/lldb/packages/Python/lldbsuite/test/dotest_args.py b/lldb/packages/Python/lldbsuite/test/dotest_args.py
index e9c21388bc213..f47cfc7db357e 100644
--- a/lldb/packages/Python/lldbsuite/test/dotest_args.py
+++ b/lldb/packages/Python/lldbsuite/test/dotest_args.py
@@ -116,6 +116,12 @@ def create_parser():
             "The location of llvm tools used for testing (yaml2obj, FileCheck, etc.)."
         ),
     )
+    group.add_argument(
+        "--target-triple",
+        help=textwrap.dedent(
+            "The target triple the tests will run on (e.g. x86_64-pc-windows-msvc)."
+        ),
+    )
 
     # Test filtering options
     group = parser.add_argument_group("Test filtering options")
diff --git a/lldb/packages/Python/lldbsuite/test/test_categories.py b/lldb/packages/Python/lldbsuite/test/test_categories.py
index b585f695adeab..1f6e8a78e0c0d 100644
--- a/lldb/packages/Python/lldbsuite/test/test_categories.py
+++ b/lldb/packages/Python/lldbsuite/test/test_categories.py
@@ -33,6 +33,7 @@
     "lldb-server": "Tests related to lldb-server",
     "lldb-dap": "Tests for the Debug Adapter Protocol with lldb-dap",
     "llgs": "Tests for the gdb-server functionality of lldb-server",
+    "msvcstl": "Test for MSVC STL data formatters",
     "pexpect": "Tests requiring the pexpect library to be available",
     "objc": "Tests related to the Objective-C programming language support",
     "pyapi": "Tests related to the Python API",
diff --git a/lldb/source/API/SBTypeSummary.cpp b/lldb/source/API/SBTypeSummary.cpp
index 58ec068ab9600..b62e786f522d4 100644
--- a/lldb/source/API/SBTypeSummary.cpp
+++ b/lldb/source/API/SBTypeSummary.cpp
@@ -370,6 +370,9 @@ bool SBTypeSummary::IsEqualTo(lldb::SBTypeSummary &rhs) {
     if (IsSummaryString() != rhs.IsSummaryString())
       return false;
     return GetOptions() == rhs.GetOptions();
+  case TypeSummaryImpl::Kind::eComposite:
+    return llvm::dyn_cast<CXXCompositeSummaryFormat>(m_opaque_sp.get()) ==
+           llvm::dyn_cast<CXXCompositeSummaryFormat>(rhs.m_opaque_sp.get());
   case TypeSummaryImpl::Kind::eInternal:
     return (m_opaque_sp.get() == rhs.m_opaque_sp.get());
   }
@@ -406,7 +409,7 @@ bool SBTypeSummary::CopyOnWrite_Impl() {
   if (CXXFunctionSummaryFormat *current_summary_ptr =
           llvm::dyn_cast<CXXFunctionSummaryFormat>(m_opaque_sp.get())) {
     new_sp = TypeSummaryImplSP(new CXXFunctionSummaryFormat(
-        GetOptions(), current_summary_ptr->m_impl,
+        GetOptions(), current_summary_ptr->m_data.m_impl,
         current_summary_ptr->m_description.c_str()));
   } else if (ScriptSummaryFormat *current_summary_ptr =
                  llvm::dyn_cast<ScriptSummaryFormat>(m_opaque_sp.get())) {
@@ -417,6 +420,12 @@ bool SBTypeSummary::CopyOnWrite_Impl() {
                  llvm::dyn_cast<StringSummaryFormat>(m_opaque_sp.get())) {
     new_sp = TypeSummaryImplSP(new StringSummaryFormat(
         GetOptions(), current_summary_ptr->GetSummaryString()));
+  } else if (CXXCompositeSummaryFormat *current_summary_ptr =
+                 llvm::dyn_cast<CXXCompositeSummaryFormat>(m_opaque_sp.get())) {
+    new_sp = TypeSummaryImplSP(new CXXCompositeSummaryFormat(
+        GetOptions(), current_summary_ptr->GetTextualInfo(),
+        current_summary_ptr->CopyChildren(),
+        current_summary_ptr->GetPtrMatchDepth()));
   }
 
   SetSP(new_sp);
diff --git a/lldb/source/Commands/CommandObjectType.cpp b/lldb/source/Commands/CommandObjectType.cpp
index 19cd3ff2972e9..bd6138a860caa 100644
--- a/lldb/source/Commands/CommandObjectType.cpp
+++ b/lldb/source/Commands/CommandObjectType.cpp
@@ -1397,9 +1397,10 @@ bool CommandObjectTypeSummaryAdd::Execute_StringSummary(
     result.AppendError("summary creation failed");
     return false;
   }
-  if (string_format->m_error.Fail()) {
-    result.AppendErrorWithFormat("syntax error: %s",
-                                 string_format->m_error.AsCString("<unknown>"));
+  if (string_format->m_data.m_error.Fail()) {
+    result.AppendErrorWithFormat(
+        "syntax error: %s",
+        string_format->m_data.m_error.AsCString("<unknown>"));
     return false;
   }
   lldb::TypeSummaryImplSP entry(string_format.release());
diff --git a/lldb/source/DataFormatters/TypeSummary.cpp b/lldb/source/DataFormatters/TypeSummary.cpp
index 6aa290698cd12..4830b2c28901a 100644
--- a/lldb/source/DataFormatters/TypeSummary.cpp
+++ b/lldb/source/DataFormatters/TypeSummary.cpp
@@ -57,21 +57,17 @@ std::string TypeSummaryImpl::GetSummaryKindName() {
     return "python";
   case Kind::eInternal:
     return "c++";
+  case Kind::eComposite:
+    return "composite";
   case Kind::eBytecode:
     return "bytecode";
   }
   llvm_unreachable("Unknown type kind name");
 }
 
-StringSummaryFormat::StringSummaryFormat(const TypeSummaryImpl::Flags &flags,
-                                         const char *format_cstr,
-                                         uint32_t ptr_match_depth)
-    : TypeSummaryImpl(Kind::eSummaryString, flags, ptr_match_depth),
-      m_format_str() {
-  SetSummaryString(format_cstr);
-}
+StringSummaryData::StringSummaryData(const char *f) { SetFormat(f); }
 
-void StringSummaryFormat::SetSummaryString(const char *format_cstr) {
+void StringSummaryData::SetFormat(const char *format_cstr) {
   m_format.Clear();
   if (format_cstr && format_cstr[0]) {
     m_format_str = format_cstr;
@@ -82,8 +78,19 @@ void StringSummaryFormat::SetSummaryString(const char *format_cstr) {
   }
 }
 
-bool StringSummaryFormat::FormatObject(ValueObject *valobj, std::string &retval,
-                                       const TypeSummaryOptions &options) {
+StringSummaryFormat::StringSummaryFormat(const TypeSummaryImpl::Flags &flags,
+                                         const char *format_cstr,
+                                         uint32_t ptr_match_depth)
+    : TypeSummaryImpl(Kind::eSummaryString, flags, ptr_match_depth),
+      m_data(format_cstr) {}
+
+void StringSummaryFormat::SetSummaryString(const char *format_cstr) {
+  m_data.SetFormat(format_cstr);
+}
+
+bool StringSummaryData::FormatObject(ValueObject *valobj, std::string &retval,
+                                     const TypeSummaryOptions &options,
+                                     const TypeSummaryImpl &summary) {
   if (!valobj) {
     retval.assign("NULL ValueObject");
     return false;
@@ -96,12 +103,12 @@ bool StringSummaryFormat::FormatObject(ValueObject *valobj, std::string &retval,
   if (frame)
     sc = frame->GetSymbolContext(lldb::eSymbolContextEverything);
 
-  if (IsOneLiner()) {
+  if (summary.IsOneLiner()) {
     // We've already checked the case of a NULL valobj above.  Let's put in an
     // assert here to make sure someone doesn't take that out:
     assert(valobj && "Must have a valid ValueObject to summarize");
     ValueObjectPrinter printer(*valobj, &s, DumpValueObjectOptions());
-    printer.PrintChildrenOneLiner(HideNames(valobj));
+    printer.PrintChildrenOneLiner(summary.HideNames(valobj));
     retval = std::string(s.GetString());
     return true;
   } else {
@@ -117,40 +124,52 @@ bool StringSummaryFormat::FormatObject(ValueObject *valobj, std::string &retval,
   }
 }
 
+bool StringSummaryFormat::FormatObject(ValueObject *valobj, std::string &retval,
+                                       const TypeSummaryOptions &options) {
+  return m_data.FormatObject(valobj, retval, options, *this);
+}
+
 std::string StringSummaryFormat::GetDescription() {
   StreamString sstr;
 
-  sstr.Printf("`%s`%s%s%s%s%s%s%s%s%s ptr-match-depth=%u", m_format_str.c_str(),
-              m_error.Fail() ? " error: " : "",
-              m_error.Fail() ? m_error.AsCString() : "",
-              Cascades() ? "" : " (not cascading)",
-              !DoesPrintChildren(nullptr) ? "" : " (show children)",
-              !DoesPrintValue(nullptr) ? " (hide value)" : "",
-              IsOneLiner() ? " (one-line printout)" : "",
-              SkipsPointers() ? " (skip pointers)" : "",
-              SkipsReferences() ? " (skip references)" : "",
-              HideNames(nullptr) ? " (hide member names)" : "",
-              GetPtrMatchDepth());
+  sstr.Printf(
+      "`%s`%s%s%s%s%s%s%s%s%s ptr-match-depth=%u", m_data.m_format_str.c_str(),
+      m_data.m_error.Fail() ? " error: " : "",
+      m_data.m_error.Fail() ? m_data.m_error.AsCString() : "",
+      Cascades() ? "" : " (not cascading)",
+      !DoesPrintChildren(nullptr) ? "" : " (show children)",
+      !DoesPrintValue(nullptr) ? " (hide value)" : "",
+      IsOneLiner() ? " (one-line printout)" : "",
+      SkipsPointers() ? " (skip pointers)" : "",
+      SkipsReferences() ? " (skip references)" : "",
+      HideNames(nullptr) ? " (hide member names)" : "", GetPtrMatchDepth());
   return std::string(sstr.GetString());
 }
 
-std::string StringSummaryFormat::GetName() { return m_format_str; }
+std::string StringSummaryFormat::GetName() { return m_data.m_format_str; }
+
+bool CXXFunctionSummaryData::FormatObject(ValueObject *valobj,
+                                          std::string &dest,
+                                          const TypeSummaryOptions &options,
+                                          const TypeSummaryImpl &summary) {
+  dest.clear();
+  StreamString stream;
+  if (!m_impl || !m_impl(*valobj, stream, options))
+    return false;
+  dest = std::string(stream.GetString());
+  return true;
+}
 
 CXXFunctionSummaryFormat::CXXFunctionSummaryFormat(
     const TypeSummaryImpl::Flags &flags, Callback impl, const char *description,
     uint32_t ptr_match_depth)
-    : TypeSummaryImpl(Kind::eCallback, flags, ptr_match_depth), m_impl(impl),
+    : TypeSummaryImpl(Kind::eCallback, flags, ptr_match_depth), m_data{impl},
       m_description(description ? description : "") {}
 
 bool CXXFunctionSummaryFormat::FormatObject(ValueObject *valobj,
                                             std::string &dest,
                                             const TypeSummaryOptions &options) {
-  dest.clear();
-  StreamString stream;
-  if (!m_impl || !m_impl(*valobj, stream, options))
-    return false;
-  dest = std::string(stream.GetString());
-  return true;
+  return m_data.FormatObject(valobj, dest, options, *this);
 }
 
 std::string CXXFunctionSummaryFormat::GetDescription() {
@@ -169,6 +188,68 @@ std::string CXXFunctionSummaryFormat::GetDescription() {
 
 std::string CXXFunctionSummaryFormat::GetName() { return m_description; }
 
+CXXCompositeSummaryFormat::CXXCompositeSummaryFormat(
+    const TypeSummaryImpl::Flags &flags, const char *description,
+    Children children, uint32_t ptr_match_depth)
+    : TypeSummaryImpl(Kind::eComposite, flags, ptr_match_depth),
+      m_children(std::move(children)) {}
+
+CXXCompositeSummaryFormat *
+CXXCompositeSummaryFormat::Append(Validator *validator, ChildData child) {
+  m_children.emplace_back(validator, std::move(child));
+  return this;
+}
+
+CXXCompositeSummaryFormat::Children
+CXXCompositeSummaryFormat::CopyChildren() const {
+  auto copy = [](const Child &entry) -> Child {
+    return {entry.first,
+            std::visit(llvm::makeVisitor(
+                           [](const CXXFunctionSummaryData &fn) -> ChildData {
+                             return fn;
+                           },
+                           [](const StringSummaryData &string) -> ChildData {
+                             return StringS...
[truncated]

@DavidSpickett
Copy link
Collaborator

Added a new type summary kind CXXCompositeSummaryFormat. It holds a list of child summaries paired with a validator function that checks if a ValueObject can be formatted by a child. This feels like it should be done in another PR, but there aren't any tests for it other than the STL formatters (since it's only in C++ it can't be used in Python).

I had a similar problem with another formatting option, so I added https://github.com/llvm/llvm-project/blob/main/lldb/unittests/ValueObject/DumpValueObjectOptionsTests.cpp.

Maybe you could extend or take inspiration from that.

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! The general idea of having composite formats seems reasonable to me (though others, CC @jimingham, might have opinions). If libc++ ever decided to majorly overhaul some types we might be able to provide backwards compatibility for old layouts this way. Will give it a more thorough review later though.

@labath
Copy link
Collaborator

labath commented Jun 18, 2025

The composite formats are not an unreasonable idea, but I can't help but wonder if we even need to touch the generic data formatter infrastructure. I mean we already have the ability to do callback-based summaries, so what if we just did the stl switch inside the callback? Being able to do this declaratively is sort of nice, but this already isn't completely declarative since you have callbacks for type validation.

Maybe this means that we can't specify the summary for std::string via summary strings (or maybe we can -- I haven't checked how the code is structured), but I don't think that's such a disaster. Implementing ${var._M_dataplus._M_p} in c++ is not that hard, and this is probably one of our more complex summaries (containers just say size=#children), which doesn't even need to be customized for different STLs.

fschlimb pushed a commit to fschlimb/llvm-project that referenced this pull request Jun 18, 2025
As part of llvm#143177, I moved the
non-libc++ specific formatting of `std::string`s out to `CxxStringTypes`
as MSVC's STL `std::string` can also be thought of a pointer+size pair.
I named this kind of string "string buffer".

This PR picks that change, so the MSVC PR can be smaller.
Unfortunately, libstdc++'s `std::string` does not fit this (it also uses
a different string printer function).

This resolves two FIXMEs in the libc++ tests, where empty u16 and u32
strings didn't have any prefix (u/U).
@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch from 3033eb6 to 87f304c Compare June 18, 2025 19:58
@Nerixyz
Copy link
Contributor Author

Nerixyz commented Jun 18, 2025

The composite formats are not an _un_reasonable idea, but I can't help but wonder if we even need to touch the generic data formatter infrastructure. I mean we already have the ability to do callback-based summaries, so what if we just did the stl switch inside the callback? Being able to do this declaratively is sort of nice, but this already isn't completely declarative since you have callbacks for type validation.

That's a good point. I removed the CXXCompositeSummaryFormat and implemented the selection in a lambda.

Implementing ${var._M_dataplus._M_p} in c++ is not that hard, and this is probably one of our more complex summaries (containers just say size=#children), which doesn't even need to be customized for different STLs.

Right, I did that at the beginning, but then saw more types using string summaries without looking at what they're doing, so I wanted to support both.

Without the new summary type this PR is much more manageable.

@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch 2 times, most recently from 570732b to 9df3961 Compare June 19, 2025 14:05
@Nerixyz
Copy link
Contributor Author

Nerixyz commented Jun 19, 2025

Another aside: Does anyone know why the libstdc++ std::__cxx11::string summaries weren't using the string summary? They were initially added in D13964, but it doesn't provide a reason to do so. Note that the summary for a std::string v = "foo" was the format string one (${var._M_dataplus._M_p}). Only when printing a reference or pointer to a string, LibStdcppStringSummaryProvider was called. I tried to use this function for all libstdc++ strings, but this broke when using -D_GLIBCXX_USE_CXX11_ABI=0 (on MinGW). On the other hand, ${var._M_dataplus._M_p} works with both ABIs.

@labath
Copy link
Collaborator

labath commented Jun 20, 2025

Another aside: Does anyone know why the libstdc++ std::__cxx11::string summaries weren't using the string summary? They were initially added in D13964, but it doesn't provide a reason to do so. Note that the summary for a std::string v = "foo" was the format string one (${var._M_dataplus._M_p}). Only when printing a reference or pointer to a string, LibStdcppStringSummaryProvider was called. I tried to use this function for all libstdc++ strings, but this broke when using -D_GLIBCXX_USE_CXX11_ABI=0 (on MinGW). On the other hand, ${var._M_dataplus._M_p} works with both ABIs.

Well, the commit says (I haven't verified this) that the std::string formatter at the time was written in python, so this could have been a way to avoid the python dependency. As for the implementation, a possible reason (again, I don't know if that's the case) could be to make the formatter work in the case where you don't have debug info for std::string -- it looks like it avoids looking at any type information, which means it could work even with a forward declaration -- but that it would break when the ABI changes, as you've noticed.

I don't think we need to concern ourselves with that though. Your new implementation is fine.

Comment on lines 302 to 303
configuration.target_triple = args.target_triple

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd very much like to avoid adding this argument. Maybe you could do something like canRunLibcxxTests, which tries to compile?

It'd also be nice if there was a way to run these tests even if the default clang configuration does not use it. Is there any equivalent to -stdlib=libc++ which would force the usage of the MSVC STL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no -stdlib, but detection through _MSVC_STL_VERSION works.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's better, though it would be even better if we could run these tests even if msvcstl was not the default clang target (similar to how we can run libc++ tests even when clang targets libstdc++ by default).

From what I gather, this behavior is controlled by the target triple. Would manually changing the target triple by appending something to CFLAGS (take self.dbg.GetSelectedPlatform().GetTriple() and append "-msvc" or something) be a terrible idea? (FWIW, this is where we append libc++ flags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the MSVC toolchain on Clang, there's an open FIXME for supporting libc++. I feel hesitant changing the target there.

If I understand you correctly, then the test would check if it's running on *-pc-windows-* and change it to *-pc-windows-msvc. In which case, it assumes to always run with MSVC's STL. That could work. Clang on MinGW uses x86_64-w64-windows-gnu so this wouldn't get changed, which is good.

Maybe it's better to get that fixme fixed in Clang (i.e. introduce something like -stdlib=libc++/-stdlib=msvcstl). As far as I know, MSVC's STL only works on the *-msvc targets.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the MSVC toolchain on Clang, there's an open FIXME for supporting libc++. I feel hesitant changing the target there.

Yeah, that's not what I'm suggesting.

If I understand you correctly, then the test would check if it's running on *-pc-windows-* and change it to *-pc-windows-msvc. In which case, it assumes to always run with MSVC's STL. That could work. Clang on MinGW uses x86_64-w64-windows-gnu so this wouldn't get changed, which is good.

Yes, something like that.

Maybe it's better to get that fixme fixed in Clang (i.e. introduce something like -stdlib=libc++/-stdlib=msvcstl). As far as I know, MSVC's STL only works on the *-msvc targets.

Maybe -- I don't know. It may be there is a reason this isn't implemented that way. It may have something to do with the fact that the triple also controls other things (such as the name mangling). Maybe MSVC STL doesn't work with the itanium mangling (why should it)?

This is why I'm not totally convinced that this is a good idea, though I also believe that it's important that these tests do get run -- somewhere. If you can confirm that the tests would run on the existing windows bots (we can check with bot owners if needed), then maybe this wouldn't be necessary.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...unless someone has a mingw compiler and wants to use MSVC STL but I think that's A: asking for trouble and B: defeating part of the point of mingw? Hmm.

Anyway, the tests will run on a buildbot that will be running for a few years to come, that's the main thing here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did apply your commit onto main, rather than building the PR branch so it could be a different change causing that.

The unexpected passes are due to this change specifically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are also some unexpected passes, which is probably a good thing:

That's great! I removed the XFAIL now. Aside: #25146 should be labeled as platform:windows .

Do I understand correctly?

Yes, that's what the tests currently do.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why I'm not totally convinced that this is a good idea, though I also believe that it's important that these tests do get run -- somewhere. If you can confirm that the tests would run on the existing windows bots (we can check with bot owners if needed), then maybe this wouldn't be necessary.

@labath are you ok with the test predicate as is, knowing that the tests will run on Linaro's buildbot?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it's fine. FWIW, we also don't add any special flags when the compiler is gcc and we're running libstdc++ tests, only that fact is obscured by a lot of ifdefs.

@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch 2 times, most recently from ad7f551 to d806942 Compare June 20, 2025 14:32
Copy link

github-actions bot commented Jun 20, 2025

✅ With the latest revision this PR passed the Python code formatter.

@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch from d806942 to 9f95794 Compare June 20, 2025 14:37
@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch 2 times, most recently from f201388 to 7815eeb Compare June 26, 2025 13:56
@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch from 7815eeb to 66bc2ad Compare June 26, 2025 19:31
@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch 2 times, most recently from 0f4c8b8 to f17eea0 Compare June 27, 2025 20:07
Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the tests are fine. @Michael137, @jimingham, do you have anything to say about the implementation?

}
}
return false;
ValueObjectSP dataplus = valobj.GetChildMemberWithName("_M_dataplus");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be implemented the same way as LibStdcppStringSummaryProvider (maybe even by the same function)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, this worked for me in MinGW, I'll see what the CI says.

Comment on lines 302 to 303
configuration.target_triple = args.target_triple

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it's fine. FWIW, we also don't add any special flags when the compiler is gcc and we're running libstdc++ tests, only that fact is obscured by a lot of ifdefs.

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with some minor nits

# coding=utf8
"""
Test std::*string summaries with MSVC's STL.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might've missed some discussion on this:

Not a blocker for this PR but why do we have separate API test directories for the various STLs? The only difference in them being USE_LIBCPP vs. USE_LIBSTCPP in the Makefiles. And now a third one with no apparent difference in the Makefile.

We should probably consolidate them all into the data-formatter/data-formatter-stl and just pass the necessary makefile flags from within the python tests (via self.build)?

@labath @DavidSpickett anything stopping us from doing that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's mainly historical. We have data-formatter/data-formatter-stl/generic/, but it's a fairly recent addition. I think it would be nice to move everything towards that, in particular because it encourages a consistent presentation of different implementations -- not just for humans, but this also makes it easier to write data formatters for other data structures (which contain std:: types internally) -- it's nice when the data formatter works regardless of the STL version your code is compiled with.

In theory there could be tests that only make sense on a particular (e.g. a test for what happens when a particular internal pointer is corrupted), but those could be handled with ifdefs, or with separate tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Will give it a shot later this week

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch from f17eea0 to df58984 Compare July 1, 2025 14:00
Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once #143177 (comment) is merged separately.

Michael137 pushed a commit that referenced this pull request Jul 2, 2025
From #143177. This combines the summaries for the pre- and post C++ 11
`std::string` as well as `std::wstring`. In all cases, the data pointer
is reachable through `_M_dataplus._M_p`. It has the correct type (i.e.
`char*`/`wchar_t*`) and it's null terminated, so LLDB knows how to
format it as expected when using `GetSummaryAsCString`.
DavidSpickett pushed a commit that referenced this pull request Jul 3, 2025
#146562 made
[TestCallStdStringFunction](https://github.com/llvm/llvm-project/blob/bd6cd92984e7a30cb91e4f069a0bacc5c582a234/lldb/test/API/commands/expression/call-function/TestCallStdStringFunction.py)
unexpectedly pass on Windows.

The test now passes, because `expression str` now prints the "raw"
string object, which happens to include the string "Hello world".
Previously, this resulted in an error:

```
(lldb) expression str
(std::string) $0 = {
  _Mypair = {
    _Myval2 = {
      _Bx = (_Buf = "Hello world", _Ptr = "", _Alias = "Hello world")
      _Mysize = 11
      _Myres = 15
    }
  }
}
(lldb) type summary add std::string --summary-string "${var._M_dataplus._M_p}"
^^^ previous summary ^^^
(lldb) expression str
(std::string) $1 = error: summary string parsing error
```

#143177 will eventually add the correct summary for MSVC STL strings.

Relates to #22139
@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch from df58984 to c230ad6 Compare July 3, 2025 14:14
@Nerixyz Nerixyz force-pushed the feat/lldb-msvc-formatters branch from c230ad6 to cf6e5cb Compare July 3, 2025 14:15
@Michael137
Copy link
Member

@labath @jimingham any final comments? Otherwise happy to merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants