Skip to content

[LLDB] Consolidate C++ string buffer summaries #144258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 17, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
102 changes: 78 additions & 24 deletions lldb/source/Plugins/Language/CPlusPlus/CxxStringTypes.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -116,15 +116,7 @@ bool lldb_private::formatters::WCharStringSummaryProvider(
return false;

// Get a wchar_t basic type from the current type system
CompilerType wchar_compiler_type =
valobj.GetCompilerType().GetBasicTypeFromAST(lldb::eBasicTypeWChar);

if (!wchar_compiler_type)
return false;

// Safe to pass nullptr for exe_scope here.
std::optional<uint64_t> size =
llvm::expectedToOptional(wchar_compiler_type.GetBitSize(nullptr));
std::optional<uint64_t> size = GetWCharByteSize(valobj);
if (!size)
return false;
const uint32_t wchar_size = *size;
Expand All @@ -136,13 +128,13 @@ bool lldb_private::formatters::WCharStringSummaryProvider(
options.SetPrefixToken("L");

switch (wchar_size) {
case 8:
case 1:
return StringPrinter::ReadStringAndDumpToStream<StringElementType::UTF8>(
options);
case 16:
case 2:
return StringPrinter::ReadStringAndDumpToStream<StringElementType::UTF16>(
options);
case 32:
case 4:
return StringPrinter::ReadStringAndDumpToStream<StringElementType::UTF32>(
options);
default:
Expand Down Expand Up @@ -177,15 +169,7 @@ bool lldb_private::formatters::WCharSummaryProvider(
return false;

// Get a wchar_t basic type from the current type system
CompilerType wchar_compiler_type =
valobj.GetCompilerType().GetBasicTypeFromAST(lldb::eBasicTypeWChar);

if (!wchar_compiler_type)
return false;

// Safe to pass nullptr for exe_scope here.
std::optional<uint64_t> size =
llvm::expectedToOptional(wchar_compiler_type.GetBitSize(nullptr));
std::optional<uint64_t> size = GetWCharByteSize(valobj);
if (!size)
return false;
const uint32_t wchar_size = *size;
Expand All @@ -199,13 +183,13 @@ bool lldb_private::formatters::WCharSummaryProvider(
options.SetBinaryZeroIsTerminator(false);

switch (wchar_size) {
case 8:
case 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm so this switch was always wrong in the past? Is this why the test changes had to be made?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this wasn't wrong. It used the bit size before, and the libc++ summary used the byte size. I'm not sure which is better (I suppose 8 bits/byte is true for all targets supported by LLDB).

This reminds me of another difference: The *CharSummaryProviders used valobj.GetCompilerType().GetBasicTypeFromAST() and the libc++ formatters used ScratchTypeSystemClang::GetBasicType(). Is there a big difference between them, and is one preferred over the other?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this wasn't wrong. It used the bit size before, and the libc++ summary used the byte size. I'm not sure which is better (I suppose 8 bits/byte is true for all targets supported by LLDB).

Ah I see. Making them consistent makes sense

The *CharSummaryProviders used valobj.GetCompilerType().GetBasicTypeFromAST() and the libc++ formatters used ScratchTypeSystemClang::GetBasicType(). Is there a big difference between them, and is one preferred over the other?

The ScratchTypeSystemClang is the AST that is shared between all expressions run in the target. Whereas the TypeSystem underlying a CompilerType only contains the AST nodes created when the ValueObject and its type were created. It shouldn't make a difference for primitive types. And especially shouldn't make a difference when all we need to do is get the bytesize. That being said, I'd prefer using valobj.GetCompilerType().GetBasicTypeFromAST() (assuming that doesn't break anything). Reaching into the scratch typesystem feels weird here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored it to use valobj.GetCompilerType().GetBasicTypeFromAST() now.

return StringPrinter::ReadBufferAndDumpToStream<StringElementType::UTF8>(
options);
case 16:
case 2:
return StringPrinter::ReadBufferAndDumpToStream<StringElementType::UTF16>(
options);
case 32:
case 4:
return StringPrinter::ReadBufferAndDumpToStream<StringElementType::UTF32>(
options);
default:
Expand All @@ -214,3 +198,73 @@ bool lldb_private::formatters::WCharSummaryProvider(
}
return true;
}

std::optional<uint64_t>
lldb_private::formatters::GetWCharByteSize(ValueObject &valobj) {
return llvm::expectedToOptional(
valobj.GetCompilerType()
.GetBasicTypeFromAST(lldb::eBasicTypeWChar)
.GetByteSize(nullptr));
}

template <StringPrinter::StringElementType element_type>
bool lldb_private::formatters::StringBufferSummaryProvider(
Stream &stream, const TypeSummaryOptions &summary_options,
lldb::ValueObjectSP location_sp, uint64_t size, std::string prefix_token) {

if (size == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah i see, this is where the tests get fixed

stream.PutCString(prefix_token);
stream.PutCString("\"\"");
return true;
}

if (!location_sp)
return false;

StringPrinter::ReadBufferAndDumpToStreamOptions options(*location_sp);

if (summary_options.GetCapping() == TypeSummaryCapping::eTypeSummaryCapped) {
const auto max_size =
location_sp->GetTargetSP()->GetMaximumSizeOfStringSummary();
if (size > max_size) {
size = max_size;
options.SetIsTruncated(true);
}
}

{
DataExtractor extractor;
const size_t bytes_read = location_sp->GetPointeeData(extractor, 0, size);
if (bytes_read < size)
return false;

options.SetData(std::move(extractor));
}
options.SetStream(&stream);
if (prefix_token.empty())
options.SetPrefixToken(nullptr);
else
options.SetPrefixToken(prefix_token);
options.SetQuote('"');
options.SetSourceSize(size);
options.SetBinaryZeroIsTerminator(false);
return StringPrinter::ReadBufferAndDumpToStream<element_type>(options);
}

// explicit instantiations for all string element types
template bool
lldb_private::formatters::StringBufferSummaryProvider<StringElementType::ASCII>(
Stream &, const TypeSummaryOptions &, lldb::ValueObjectSP, uint64_t,
std::string);
template bool
lldb_private::formatters::StringBufferSummaryProvider<StringElementType::UTF8>(
Stream &, const TypeSummaryOptions &, lldb::ValueObjectSP, uint64_t,
std::string);
template bool
lldb_private::formatters::StringBufferSummaryProvider<StringElementType::UTF16>(
Stream &, const TypeSummaryOptions &, lldb::ValueObjectSP, uint64_t,
std::string);
template bool
lldb_private::formatters::StringBufferSummaryProvider<StringElementType::UTF32>(
Stream &, const TypeSummaryOptions &, lldb::ValueObjectSP, uint64_t,
std::string);
29 changes: 29 additions & 0 deletions lldb/source/Plugins/Language/CPlusPlus/CxxStringTypes.h
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
#ifndef LLDB_SOURCE_PLUGINS_LANGUAGE_CPLUSPLUS_CXXSTRINGTYPES_H
#define LLDB_SOURCE_PLUGINS_LANGUAGE_CPLUSPLUS_CXXSTRINGTYPES_H

#include "lldb/DataFormatters/StringPrinter.h"
#include "lldb/DataFormatters/TypeSummary.h"
#include "lldb/Utility/Stream.h"
#include "lldb/ValueObject/ValueObject.h"
Expand Down Expand Up @@ -43,6 +44,34 @@ bool Char32SummaryProvider(ValueObject &valobj, Stream &stream,
bool WCharSummaryProvider(ValueObject &valobj, Stream &stream,
const TypeSummaryOptions &options); // wchar_t

std::optional<uint64_t> GetWCharByteSize(ValueObject &valobj);

/// Print a summary for a string buffer to \a stream.
///
/// \param[in] stream
/// The output stream to print the summary to.
///
/// \param[in] summary_options
/// Options for printing the string contents. This function respects the
/// capping.
///
/// \param[in] location_sp
/// ValueObject of a pointer to the string being printed.
///
/// \param[in] size
/// The size of the buffer pointed to by \a location_sp.
///
/// \param[in] prefix_token
/// A prefix before the double quotes (e.g. 'u' results in u"...").
///
/// \return
/// Returns whether the string buffer was successfully printed.
template <StringPrinter::StringElementType element_type>
bool StringBufferSummaryProvider(Stream &stream,
const TypeSummaryOptions &summary_options,
lldb::ValueObjectSP location_sp, uint64_t size,
std::string prefix_token);

} // namespace formatters
} // namespace lldb_private

Expand Down
147 changes: 34 additions & 113 deletions lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
#include "lldb/ValueObject/ValueObject.h"
#include "lldb/ValueObject/ValueObjectConstResult.h"

#include "Plugins/Language/CPlusPlus/CxxStringTypes.h"
#include "Plugins/LanguageRuntime/CPlusPlus/CPPLanguageRuntime.h"
#include "Plugins/TypeSystem/Clang/TypeSystemClang.h"
#include "lldb/lldb-enumerations.h"
Expand Down Expand Up @@ -535,70 +536,6 @@ ExtractLibcxxStringInfo(ValueObject &valobj) {
return std::make_pair(size, location_sp);
}

static bool
LibcxxWStringSummaryProvider(ValueObject &valobj, Stream &stream,
const TypeSummaryOptions &summary_options,
ValueObjectSP location_sp, size_t size) {
if (size == 0) {
stream.Printf("L\"\"");
return true;
}
if (!location_sp)
return false;

StringPrinter::ReadBufferAndDumpToStreamOptions options(valobj);
if (summary_options.GetCapping() == TypeSummaryCapping::eTypeSummaryCapped) {
const auto max_size = valobj.GetTargetSP()->GetMaximumSizeOfStringSummary();
if (size > max_size) {
size = max_size;
options.SetIsTruncated(true);
}
}

DataExtractor extractor;
const size_t bytes_read = location_sp->GetPointeeData(extractor, 0, size);
if (bytes_read < size)
return false;

// std::wstring::size() is measured in 'characters', not bytes
TypeSystemClangSP scratch_ts_sp =
ScratchTypeSystemClang::GetForTarget(*valobj.GetTargetSP());
if (!scratch_ts_sp)
return false;

auto wchar_t_size =
scratch_ts_sp->GetBasicType(lldb::eBasicTypeWChar).GetByteSize(nullptr);
if (!wchar_t_size)
return false;

options.SetData(std::move(extractor));
options.SetStream(&stream);
options.SetPrefixToken("L");
options.SetQuote('"');
options.SetSourceSize(size);
options.SetBinaryZeroIsTerminator(false);

switch (*wchar_t_size) {
case 1:
return StringPrinter::ReadBufferAndDumpToStream<
lldb_private::formatters::StringPrinter::StringElementType::UTF8>(
options);
break;

case 2:
return StringPrinter::ReadBufferAndDumpToStream<
lldb_private::formatters::StringPrinter::StringElementType::UTF16>(
options);
break;

case 4:
return StringPrinter::ReadBufferAndDumpToStream<
lldb_private::formatters::StringPrinter::StringElementType::UTF32>(
options);
}
return false;
}

bool lldb_private::formatters::LibcxxWStringSummaryProvider(
ValueObject &valobj, Stream &stream,
const TypeSummaryOptions &summary_options) {
Expand All @@ -609,52 +546,22 @@ bool lldb_private::formatters::LibcxxWStringSummaryProvider(
ValueObjectSP location_sp;
std::tie(size, location_sp) = *string_info;

return ::LibcxxWStringSummaryProvider(valobj, stream, summary_options,
location_sp, size);
}

template <StringPrinter::StringElementType element_type>
static bool
LibcxxStringSummaryProvider(ValueObject &valobj, Stream &stream,
const TypeSummaryOptions &summary_options,
std::string prefix_token, ValueObjectSP location_sp,
uint64_t size) {

if (size == 0) {
stream.Printf("\"\"");
return true;
}

if (!location_sp)
auto wchar_t_size = GetWCharByteSize(valobj);
if (!wchar_t_size)
return false;

StringPrinter::ReadBufferAndDumpToStreamOptions options(valobj);

if (summary_options.GetCapping() == TypeSummaryCapping::eTypeSummaryCapped) {
const auto max_size = valobj.GetTargetSP()->GetMaximumSizeOfStringSummary();
if (size > max_size) {
size = max_size;
options.SetIsTruncated(true);
}
}

{
DataExtractor extractor;
const size_t bytes_read = location_sp->GetPointeeData(extractor, 0, size);
if (bytes_read < size)
return false;

options.SetData(std::move(extractor));
switch (*wchar_t_size) {
case 1:
return StringBufferSummaryProvider<StringPrinter::StringElementType::UTF8>(
stream, summary_options, location_sp, size, "L");
case 2:
return StringBufferSummaryProvider<StringPrinter::StringElementType::UTF16>(
stream, summary_options, location_sp, size, "L");
case 4:
return StringBufferSummaryProvider<StringPrinter::StringElementType::UTF32>(
stream, summary_options, location_sp, size, "L");
}
options.SetStream(&stream);
if (prefix_token.empty())
options.SetPrefixToken(nullptr);
else
options.SetPrefixToken(prefix_token);
options.SetQuote('"');
options.SetSourceSize(size);
options.SetBinaryZeroIsTerminator(false);
return StringPrinter::ReadBufferAndDumpToStream<element_type>(options);
return false;
}

template <StringPrinter::StringElementType element_type>
Expand All @@ -669,8 +576,8 @@ LibcxxStringSummaryProvider(ValueObject &valobj, Stream &stream,
ValueObjectSP location_sp;
std::tie(size, location_sp) = *string_info;

return LibcxxStringSummaryProvider<element_type>(
valobj, stream, summary_options, prefix_token, location_sp, size);
return StringBufferSummaryProvider<element_type>(
stream, summary_options, location_sp, size, prefix_token);
}
template <StringPrinter::StringElementType element_type>
static bool formatStringImpl(ValueObject &valobj, Stream &stream,
Expand Down Expand Up @@ -742,8 +649,8 @@ static bool formatStringViewImpl(ValueObject &valobj, Stream &stream,
return true;
}

return LibcxxStringSummaryProvider<element_type>(
valobj, stream, summary_options, prefix_token, dataobj, size);
return StringBufferSummaryProvider<element_type>(stream, summary_options,
dataobj, size, prefix_token);
}

bool lldb_private::formatters::LibcxxStringViewSummaryProviderASCII(
Expand Down Expand Up @@ -781,8 +688,22 @@ bool lldb_private::formatters::LibcxxWStringViewSummaryProvider(
return true;
}

return ::LibcxxWStringSummaryProvider(valobj, stream, summary_options,
dataobj, size);
auto wchar_t_size = GetWCharByteSize(valobj);
if (!wchar_t_size)
return false;

switch (*wchar_t_size) {
case 1:
return StringBufferSummaryProvider<StringPrinter::StringElementType::UTF8>(
stream, summary_options, dataobj, size, "L");
case 2:
return StringBufferSummaryProvider<StringPrinter::StringElementType::UTF16>(
stream, summary_options, dataobj, size, "L");
case 4:
return StringBufferSummaryProvider<StringPrinter::StringElementType::UTF32>(
stream, summary_options, dataobj, size, "L");
}
return false;
}

static bool
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,11 +65,9 @@ def cleanup():
'(%s::wstring) IHaveEmbeddedZerosToo = L"hello world!\\0てざ ル゜䋨ミ㠧槊 きゅへ狦穤襩 じゃ馩リョ 䤦監"'
% ns,
'(%s::u16string) u16_string = u"ß水氶"' % ns,
# FIXME: This should have a 'u' prefix.
'(%s::u16string) u16_empty = ""' % ns,
'(%s::u16string) u16_empty = u""' % ns,
'(%s::u32string) u32_string = U"🍄🍅🍆🍌"' % ns,
# FIXME: This should have a 'U' prefix.
'(%s::u32string) u32_empty = ""' % ns,
'(%s::u32string) u32_empty = U""' % ns,
"(%s::string *) null_str = nullptr" % ns,
],
)
Expand Down Expand Up @@ -123,7 +121,7 @@ def cleanup():
% ns,
'(%s::u16string) u16_string = u"ß水氶"' % ns,
'(%s::u32string) u32_string = U"🍄🍅🍆🍌"' % ns,
'(%s::u32string) u32_empty = ""' % ns,
'(%s::u32string) u32_empty = U""' % ns,
"(%s::string *) null_str = nullptr" % ns,
],
)
Expand Down
Loading
Loading