Description
Background and Motivation
Today, many SG authors attempt to create code using the SyntaxModel. This is both arduous and extraordinarily slow. Code has to figure out how to properly create C# syntax constructs (nontrivial in many cases, such as indentation and doc comments), or avoid trivia entirely and then use NormalizeWhitespace to normalize the result. This is extremely slow in practice and consumes lots of memory for all the intermediary constructs.
Empirical testing with the real-world ComputeSharp and MVVM Toolkit generators shows a gain of nearly 100x in CPU usage and 100x in memory usage from having the generators write directly to a string builder.
However, using a StringBuilder directly is slightly unpleasant, because the user has to keep track of the most common construct that makes code legible: indentation.
Proposed API
The sketch of what I'm proposing is a dedicated IndentingStringBuilder that one could use to generate indented code, designed to be good at that job, and otherwise as bare metal as possible.
The major design points are:
- Newlines and indentation are handled automatically for the user. Whenever the buffer is at the start of a line, the indentation is written before the content.
- Fast by default. Most calls do almost nothing but some simple checks and passing data through to the buffer.
- Easy to add multiple lines in a single call (potentially in an opt-in, more costly fashion).
- Easy to open blocks { ... } and other indented regions of code.
- Low overhead interpolation support.
- Common proven patterns for constructs are easy (or provided by extensions). Especially writing out lists of things.
- Lowest overhead ability to get a SourceText directly (since we can internally just pass the data around, and avoid intermediary reallocs).
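To make the first few design points concrete, here is a minimal sketch over a plain StringBuilder. It is not the proposed implementation: `MiniIndentWriter`, `MiniIndentWriterDemo`, and all of their members are hypothetical names, and the sketch assumes the end-of-line sequence always ends in `\n`. It only illustrates the at-start-of-line check and the opt-in line splitting.

```csharp
using System;
using System.Text;

// Hypothetical illustration only; not the proposed IndentingStringBuilder.
public sealed class MiniIndentWriter
{
    private readonly StringBuilder _buffer = new StringBuilder();
    private readonly string _indentation;
    private readonly string _endOfLine;
    private int _level;

    // Assumption: endOfLine ends with '\n' (for example "\n" or "\r\n").
    public MiniIndentWriter(string indentation = "    ", string endOfLine = "\n")
    {
        _indentation = indentation;
        _endOfLine = endOfLine;
    }

    public void IncreaseIndent() => _level++;

    public void DecreaseIndent() => _level = Math.Max(0, _level - 1);

    // Fast path: content is assumed to contain no end-of-line characters.
    // Opt-in path (splitContent: true): content is split and written line by line.
    public MiniIndentWriter Write(string content, bool splitContent = false)
    {
        if (!splitContent)
        {
            WriteRaw(content);
            return this;
        }

        string[] lines = content.Replace("\r\n", "\n").Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            if (i > 0)
            {
                _buffer.Append(_endOfLine);
            }
            WriteRaw(lines[i]);
        }
        return this;
    }

    public MiniIndentWriter WriteLine(string content = "", bool splitContent = false)
    {
        Write(content, splitContent);
        _buffer.Append(_endOfLine);
        return this;
    }

    public override string ToString() => _buffer.ToString();

    private bool AtStartOfLine =>
        _buffer.Length == 0 || _buffer[_buffer.Length - 1] == '\n';

    private void WriteRaw(string text)
    {
        if (text.Length == 0)
        {
            return; // empty lines never receive indentation
        }
        if (AtStartOfLine)
        {
            for (int i = 0; i < _level; i++)
            {
                _buffer.Append(_indentation);
            }
        }
        _buffer.Append(text);
    }
}

public static class MiniIndentWriterDemo
{
    public static string Run()
    {
        var writer = new MiniIndentWriter();
        writer.WriteLine("class C");
        writer.WriteLine("{");
        writer.IncreaseIndent();
        // Multi-line content needs the opt-in flag so every line is indented.
        writer.WriteLine("int x;\nint y;", splitContent: true);
        writer.DecreaseIndent();
        writer.WriteLine("}");
        return writer.ToString();
    }
}
```

In this sketch the default path does nothing beyond one length check and one start-of-line check before appending, which is the "fast by default" property the list describes. Only callers that pass `splitContent: true` pay for scanning and splitting the content.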
Note: We should move our own Syntax-generator to this, and work with the community to try out our prototypes to help refine the final shape. As such, we should likely ship this as experimental so we have some time to revise.
Sketch:
namespace Microsoft.CodeAnalysis.CSharp
{
+public struct IndentingStringBuilder : IDisposable
+{
+ public IndentingStringBuilder(string indentationString, string endOfLine);
+ public static IndentingStringBuilder Create(string indentation = DefaultIndentation, string endOfLine = DefaultEndOfLine);
+
+ // Releases resources for efficiency
+ public void Dispose();
+ /// <summary>
+ /// Increases the current indentation level, increasing the amount of indentation written at the start of a
+ /// new line when content is written to this.
+ /// </summary>
+ public void IncreaseIndent();
+ /// <summary>
+ /// Decreases the current indentation level, decreasing the amount of indentation written at the start of a
+ /// new line when content is written to it.
+ /// </summary>
+ public void DecreaseIndent();
+ /// Returns a new SourceText with the contents of this writer, and resets this writer to its initial state.
+ public SourceText GetSourceTextAndClear();
+ /// <summary>
+ /// Writes content to the underlying buffer. If the buffer is at the start of a line, then indentation will be
+ /// appended first before the content. By default, for performance reasons, the content is assumed to contain no
+ /// end of line characters in it. If the content may contain end of line characters, then <see langword="true"/>
+ /// should be passed in for <paramref name="splitContent"/>. This will cause the provided content to be split into
+ /// constituent lines, with each line being appended one at a time.
+ /// </summary>
+ public IndentingStringBuilder Write(string content, bool splitContent = false);
+ /// <inheritdoc cref="Write(string, bool)"/>
+ public IndentingStringBuilder Write(ReadOnlySpan<char> content, bool splitContent = false);
+ /// <summary>
+ /// Equivalent to <see cref="Write(string, bool)"/> except that a final end of line sequence will be written after
+ /// the content is written.
+ /// </summary>
+ public IndentingStringBuilder WriteLine(string content = "", bool splitContent = false);
+ /// <inheritdoc cref="WriteLine(string, bool)"/>
+ public IndentingStringBuilder WriteLine(ReadOnlySpan<char> content, bool splitContent = false);
+ /// <summary>
+ /// Ensures that the current buffer has at least one blank line between the last written content and the content
+ /// that would be written. Note: a line containing only whitespace/indentation is not considered an empty line.
+ /// Only a line with no content on it counts.
+ /// </summary>
+ /// <returns></returns>
+ public readonly IndentingStringBuilder EnsureEmptyLine();
+ /// <summary>
+ /// Opens a block of code to write new content into, using <c>{</c> and <c>}</c> as the block delimiters. Can be used like so:
+ /// <code>
+ /// using (writer.EnterBlock())
+ /// {
+ ///     writer.WriteLine("...");
+ ///     writer.WriteLine("...");
+ /// }
+ /// </code>
+ /// </summary>
+ public Region EnterBlock();
+ /// <summary>
+ /// Opens an indented region of code to write new content into, without delimiters. Can be used like so:
+ /// <code>
+ /// using (writer.EnterIndentedRegion())
+ /// {
+ ///     writer.WriteLine("...");
+ ///     writer.WriteLine("...");
+ /// }
+ /// </code>
+ /// </summary>
+ public Region EnterIndentedRegion();
+ public readonly struct Region : IDisposable;
+ public readonly IndentingStringBuilder Write(bool splitContent, [InterpolatedStringHandlerArgument("", nameof(splitContent))] WriteInterpolatedStringHandler handler);
+ public readonly IndentingStringBuilder Write([InterpolatedStringHandlerArgument("")] WriteInterpolatedStringHandler handler);
+ public readonly IndentingStringBuilder WriteLine(bool splitContent, [InterpolatedStringHandlerArgument("", nameof(splitContent))] WriteInterpolatedStringHandler handler);
+ public readonly IndentingStringBuilder WriteLine([InterpolatedStringHandlerArgument("")] WriteInterpolatedStringHandler handler);
+ /// <summary>
+ /// Provides a handler used by the language compiler to append interpolated strings into <see cref="IndentingStringBuilder"/> instances.
+ /// </summary>
+ [EditorBrowsable(EditorBrowsableState.Never)]
+ [InterpolatedStringHandler]
+ public readonly ref struct WriteInterpolatedStringHandler
+ {
+ public WriteInterpolatedStringHandler(int literalLength, int formattedCount, IndentingStringBuilder builder, bool splitContent = false);
+ public void AppendLiteral(string literal);
+ public void AppendFormatted<T>(T value);
+ public void AppendFormatted<T>(T value, string format) where T : IFormattable;
+ }
+}
}
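One way to picture how `EnterBlock` and `EnterIndentedRegion` could hand back a disposable `Region` is the sketch below. It is an illustration under stated assumptions, not the draft implementation: `BlockWriter` and `BlockWriterDemo` are hypothetical, the writer is a class rather than a struct to sidestep copy semantics, and the indentation is fixed at four spaces with `\n` line endings.

```csharp
using System;
using System.Text;

// Hypothetical illustration only; names and behavior are assumptions.
public sealed class BlockWriter
{
    private readonly StringBuilder _buffer = new StringBuilder();
    private int _level;

    public BlockWriter WriteLine(string content)
    {
        if (content.Length > 0)
        {
            _buffer.Append(' ', _level * 4); // four spaces per level
        }
        _buffer.Append(content).Append('\n');
        return this;
    }

    // Writes "{", indents, and returns a token that undoes both on Dispose.
    public Region EnterBlock()
    {
        WriteLine("{");
        _level++;
        return new Region(this, closeWithBrace: true);
    }

    // Indents without writing any delimiters.
    public Region EnterIndentedRegion()
    {
        _level++;
        return new Region(this, closeWithBrace: false);
    }

    public override string ToString() => _buffer.ToString();

    public readonly struct Region : IDisposable
    {
        private readonly BlockWriter _writer;
        private readonly bool _closeWithBrace;

        internal Region(BlockWriter writer, bool closeWithBrace)
        {
            _writer = writer;
            _closeWithBrace = closeWithBrace;
        }

        public void Dispose()
        {
            _writer._level--;
            if (_closeWithBrace)
            {
                _writer.WriteLine("}");
            }
        }
    }
}

public static class BlockWriterDemo
{
    public static string Run()
    {
        var writer = new BlockWriter();
        writer.WriteLine("switch (x)");
        using (writer.EnterBlock())
        {
            writer.WriteLine("case 1:");
            using (writer.EnterIndentedRegion())
            {
                writer.WriteLine("break;");
            }
        }
        return writer.ToString();
    }
}
```

The appeal of this shape is that the C# `using` statement mirrors the structure of the generated code, so an unbalanced indent or a forgotten closing brace becomes hard to write by accident.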
Usage Examples
using var builder = IndentingStringBuilder.Create();
builder.WriteLine("using System;");
builder.EnsureEmptyLine();
builder.WriteLine("public class C");
using (builder.EnterBlock())
{
    foreach (var member in members)
    {
        builder.WriteLine($"public {member.ReturnType} {member.Name}()");
        using (builder.EnterBlock())
        {
        }
    }
}
API draft here: #71163
You can also imagine helpers/extensions (or instance members) for common patterns. For example:
public static void WriteBlankLineSeparatedContent<T, TArg>(
    this IndentingStringBuilder writer,
    ReadOnlySpan<T> items,
    Action<IndentingStringBuilder, T, TArg> callback);
public static void WriteSeparatedContent<T, TArg>(
    this IndentingStringBuilder writer,
    ReadOnlySpan<T> items,
    Action<IndentingStringBuilder, T, TArg> callback,
    string separator = ", ");
Which could then be used like so:
using var builder = IndentingStringBuilder.Create();
builder.WriteLine("using System;");
builder.EnsureEmptyLine();
builder.WriteLine("public class C");
using (builder.EnterBlock())
{
    builder.WriteBlankLineSeparatedContent(
        members,
        static (builder, member, arg) =>
        {
            builder.Write($"public {member.ReturnType} {member.Name}(");
            builder.WriteSeparatedContent(
                member.Parameters,
                static (builder, parameter, arg) => ...);
            builder.WriteLine(")");
            using (builder.EnterBlock())
            {
            }
        });
}
This would help for the common cases of wanting blank lines between things, but not at the start or end. And so on.
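The "between things, but not at the start or end" behavior comes down to writing the separator before every item except the first. A minimal sketch of that idea follows, written against a plain StringBuilder rather than the proposed type. The extension method here is hypothetical and simplified: it takes an `IReadOnlyList<T>` and a two-argument callback instead of the `ReadOnlySpan<T>` and `TArg` shapes shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration only; simplified from the signatures in the proposal.
public static class SeparatedContentExtensions
{
    // Invokes the callback once per item, writing the separator between items only.
    public static StringBuilder WriteSeparatedContent<T>(
        this StringBuilder builder,
        IReadOnlyList<T> items,
        Action<StringBuilder, T> callback,
        string separator = ", ")
    {
        for (int i = 0; i < items.Count; i++)
        {
            if (i > 0)
            {
                builder.Append(separator);
            }
            callback(builder, items[i]);
        }
        return builder;
    }
}

public static class SeparatedContentDemo
{
    public static string Run()
    {
        var parameters = new[] { "int x", "string y" };
        return new StringBuilder()
            .Append("void M(")
            .WriteSeparatedContent(parameters, (b, p) => b.Append(p))
            .Append(")")
            .ToString();
    }
}
```

A blank-line-separated variant could plausibly be the same loop with a separator made of two end-of-line sequences, which is why both helpers appear side by side above.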
Other helpers the community has found useful are:
+ public IndentingStringBuilder WriteIf(bool condition, string content, bool splitContent = false);
+ public IndentingStringBuilder WriteLineIf(bool condition, string? content = "", bool splitContent = false);
These allow the common pattern of writing data only if a condition holds, which helps avoid lots of if (...) tests in code before writing things out. It's likely we want this, especially as it would allow us to provide the interpolation-handler form ourselves, without forcing users to provide it.
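As a rough picture of what the conditional helpers buy at the call site, here is a sketch using extension methods on a plain StringBuilder. The names mirror the proposed `WriteIf`/`WriteLineIf`, but these extensions are hypothetical stand-ins: they ignore indentation and `splitContent`, and they assume `\n` line endings.

```csharp
using System.Text;

// Hypothetical stand-ins for the proposed helpers, over a plain StringBuilder.
public static class ConditionalWriteExtensions
{
    // Appends the content only when the condition holds; otherwise a no-op.
    public static StringBuilder WriteIf(this StringBuilder builder, bool condition, string content)
        => condition ? builder.Append(content) : builder;

    // Same as WriteIf, followed by a line break when the condition holds.
    public static StringBuilder WriteLineIf(this StringBuilder builder, bool condition, string content = "")
        => condition ? builder.Append(content).Append('\n') : builder;
}

public static class ConditionalWriteDemo
{
    public static string Run(bool isStatic, bool isPartial)
    {
        // One fluent chain instead of two separate if statements.
        return new StringBuilder()
            .Append("public ")
            .WriteIf(isStatic, "static ")
            .WriteIf(isPartial, "partial ")
            .Append("class C")
            .ToString();
    }
}
```

One trade-off worth noting with a plain `string` overload is that the argument is evaluated even when the condition is false. An interpolation-handler overload, as the text suggests, could skip the formatting work in that case.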