Skip to content

Introduce RegexRunnerPool to reuse multiple RegexRunner instances at once under contention #116184

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@
<Compile Include="System\Text\RegularExpressions\RegexRunner.cs" />
<Compile Include="System\Text\RegularExpressions\RegexRunnerFactory.cs" />
<Compile Include="System\Text\RegularExpressions\RegexRunnerMode.cs" />
<Compile Include="System\Text\RegularExpressions\RegexRunnerPool.cs" />
<Compile Include="System\Text\RegularExpressions\RegexTree.cs" />
<Compile Include="System\Text\RegularExpressions\RegexTreeAnalyzer.cs" />
<Compile Include="System\Text\RegularExpressions\RegexWriter.cs" />
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ public partial class Regex : ISerializable

private WeakReference<RegexReplacement?>? _replref; // cached parsed replacement pattern
private volatile RegexRunner? _runner; // cached runner
private RegexRunnerPool? _runnerPool; // pool of cached runners to spill into

#if DEBUG
// These members aren't used from Regex(), but we want to keep them in debug builds for now,
Expand Down Expand Up @@ -421,7 +422,7 @@ protected void InitializeReferences()
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.length, ExceptionResource.LengthNotNegative);
}

RegexRunner runner = Interlocked.Exchange(ref _runner, null) ?? CreateRunner();
RegexRunner runner = RentOrCreateRunner();
try
{
runner.InitializeTimeout(internalMatchTimeout);
Expand Down Expand Up @@ -453,7 +454,7 @@ protected void InitializeReferences()
finally
{
runner.runtext = null; // drop reference to text to avoid keeping it alive in a cache.
_runner = runner;
ReturnRunner(runner);
}
}

Expand All @@ -466,7 +467,7 @@ protected void InitializeReferences()
// that takes in startat.
Debug.Assert(startat <= input.Length);

RegexRunner runner = Interlocked.Exchange(ref _runner, null) ?? CreateRunner();
RegexRunner runner = RentOrCreateRunner();
try
{
runner.InitializeTimeout(internalMatchTimeout);
Expand Down Expand Up @@ -513,7 +514,7 @@ protected void InitializeReferences()
}
finally
{
_runner = runner;
ReturnRunner(runner);
}
}

Expand All @@ -529,7 +530,7 @@ private void RunAllMatchesWithCallback<TState>(string? inputString, ReadOnlySpan
Debug.Assert(inputString is null || inputSpan.SequenceEqual(inputString));
Debug.Assert((uint)startat <= (uint)inputSpan.Length);

RegexRunner runner = Interlocked.Exchange(ref _runner, null) ?? CreateRunner();
RegexRunner runner = RentOrCreateRunner();
try
{
runner.runtext = inputString;
Expand Down Expand Up @@ -599,7 +600,7 @@ private void RunAllMatchesWithCallback<TState>(string? inputString, ReadOnlySpan
finally
{
runner.runtext = null; // drop reference to string to avoid keeping it alive in a cache.
_runner = runner;
ReturnRunner(runner);
}
}

Expand Down Expand Up @@ -643,11 +644,43 @@ private void RunAllMatchesWithCallback<TState>(string? inputString, ReadOnlySpan
return RegularExpressions.Match.Empty;
}

/// <summary>Creates a new runner instance.</summary>
private RegexRunner CreateRunner() =>
// The factory needs to be set by the ctor. `factory` is a protected field, so it's possible a derived
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private RegexRunner RentOrCreateRunner()
{
RegexRunner? runner = Interlocked.Exchange(ref _runner, null);
if (runner != null)
{
return runner;
}

RegexRunnerPool? pool = _runnerPool;
if (pool != null && pool.TryGet(out runner))
{
return runner;
}

// The factory needs to be set by the ctor. `factory` is a protected field, so it's possible a derived
// type nulls out the factory after we've set it, but that's the nature of the design.
factory!.CreateInstance();
return factory!.CreateInstance();
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void ReturnRunner(RegexRunner runner)
{
if (_runner is null)
{
// If we don't have a runner, then we can just store this one.
_runner = runner;
}
else
{
// If we reached here, it means that another operation has won the race and already stored a runner.
// Use this condition to detect contended runner usage and initialize a pool to store the runner in.
// Here, we may also lose the race and create more than one pool. This is acceptable as the goal is to
// reduce the number of ammortized allocations at reasonable cost rather than eliminating every single one.
(_runnerPool ??= new()).Return(runner);
}
}

/// <summary>True if the <see cref="RegexOptions.Compiled"/> option was set.</summary>
[Obsolete(Obsoletions.RegexExtensibilityImplMessage, DiagnosticId = Obsoletions.RegexExtensibilityDiagId, UrlFormat = Obsoletions.SharedUrlFormat)]
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Collections.Concurrent;
using System.Diagnostics.CodeAnalysis;
using System.Threading;

namespace System.Text.RegularExpressions
{
internal sealed class RegexRunnerPool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any value in attmepting to use an existing pool such as something out of Microsoft.Extensions.ObjectPool

Copy link
Contributor Author

@neon-sunset neon-sunset Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a queue/stack pool to existing cached runner in a field essentially mimics DefaultObjectPool<T>: https://github.com/dotnet/aspnetcore/blob/main/src/ObjectPool/src/DefaultObjectPool.cs

However, I didn't see an easy way to bring it over and ObjectPool opts for using ConcurrentQueue<T> instead, which incurs >800B in allocations compared to ConcurrentStack<T> which is 56B to allocate + add one element. I wanted to keep memory usage to a minimum and avoid increasing assembly size more than necessary - regex patterns may still be fairly shortlived even if shared by multiple threads, or there can be thousands of them kept for a long time (which matches the workload of the application I found the initial issue in). This is still just an attempt - it needs a less noisy benchmark and I need to look into test failures first. Thank you for reviewing this!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for working on it.

{
private static readonly int s_maxCapacity = Math.Max(4, Environment.ProcessorCount);

private ConcurrentStack<RegexRunner> _storage = new();
private int _storedCount;

public bool TryGet([NotNullWhen(true)] out RegexRunner? runner)
{
if (_storage.TryPop(out runner))
{
Interlocked.Decrement(ref _storedCount);
return true;
}

return false;
}

public void Return(RegexRunner runner)
{
if (Interlocked.Increment(ref _storedCount) <= s_maxCapacity)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you keep this code you could add some Debug.Assert that it wasn't already returned

{
_storage.Push(runner);
}
else
{
Interlocked.Decrement(ref _storedCount);
}
}
}
}
Loading