Optimized cross-platform binary (de)serialization #729
Conversation
…-platform. Also using array pooling to reduce allocations when (de)serializing data from the NetworkStream.
This should be back-portable to 5.x as well if that's wanted.
This passes all tests and seems reasonable. Are there any memory usage benchmarks you can share to demonstrate the difference?
I'm adding a couple more optimizations, then I'll post benchmarks :)
Benchmark app is opening two connections, sending 50,000 messages with a 4k payload on one connection and receiving the same messages on another connection.

App Code:

using System;
using System.Threading;
using System.Threading.Tasks;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

namespace DeadlockRabbitMQ
{
    class Program
    {
        private static int messagesSent = 0;
        private static int messagesReceived = 0;
        private static int batchesToSend = 100;
        private static int itemsPerBatch = 500;

        static async Task Main(string[] args)
        {
            var connectionString = new Uri("amqp://guest:guest@localhost/");
            var connectionFactory = new ConnectionFactory() { DispatchConsumersAsync = true, Uri = connectionString };
            var connection = connectionFactory.CreateConnection();
            var connection2 = connectionFactory.CreateConnection();
            var publisher = connection.CreateModel();
            var subscriber = connection2.CreateModel();
            publisher.ConfirmSelect();
            //subscriber.ConfirmSelect();
            publisher.ExchangeDeclare("test", ExchangeType.Topic, true);
            subscriber.QueueDeclare("testqueue", true, false, true);
            var asyncListener = new AsyncEventingBasicConsumer(subscriber);
            asyncListener.Received += AsyncListener_Received;
            subscriber.QueueBind("testqueue", "test", "myawesome.routing.key");
            subscriber.BasicConsume("testqueue", true, "testconsumer", asyncListener);
            byte[] payload = new byte[4096];

            var batchPublish = Task.Run(() =>
            {
                while (messagesSent < batchesToSend * itemsPerBatch)
                {
                    var batch = publisher.CreateBasicPublishBatch();
                    for (int i = 0; i < itemsPerBatch; i++)
                    {
                        var properties = publisher.CreateBasicProperties();
                        properties.AppId = "testapp";
                        properties.CorrelationId = Guid.NewGuid().ToString();
                        batch.Add("test", "myawesome.routing.key", false, properties, payload);
                    }
                    batch.Publish();
                    messagesSent += itemsPerBatch;
                    publisher.WaitForConfirmsOrDie();
                }
            });

            var sentTask = Task.Run(async () =>
            {
                while (messagesSent < batchesToSend * itemsPerBatch)
                {
                    Console.WriteLine($"Messages sent: {messagesSent}");
                    await Task.Delay(500);
                }
                Console.WriteLine("Done sending messages!");
            });

            var receivedTask = Task.Run(async () =>
            {
                while (messagesReceived < batchesToSend * itemsPerBatch)
                {
                    Console.WriteLine($"Messages received: {messagesReceived}");
                    await Task.Delay(500);
                }
                Console.WriteLine("Done receiving all messages.");
            });

            await Task.WhenAll(batchPublish, sentTask, receivedTask);

            publisher.Dispose();
            subscriber.Dispose();
            connection.Dispose();
            connection2.Dispose();
        }

        private static Task AsyncListener_Received(object sender, BasicDeliverEventArgs @event)
        {
            // Doing things in parallel here is what will eventually trigger the deadlock,
            // probably due to a race condition in AsyncConsumerWorkService.Loop, although
            // I've had trouble pinpointing it exactly. Due to how the code in there uses
            // a TaskCompletionSource, and elsewhere overrides it, it might cause Enqueue and Loop
            // to eventually be working with different references, or that's at least the current theory.
            // Moving to better synchronization constructs solves the issue, and using the ThreadPool
            // is standard practice as well to maximize core utilization and reduce the overhead of Thread creation.
            Interlocked.Increment(ref messagesReceived);
            return Task.CompletedTask;
        }
    }
}

By far the biggest reduction is in the number of bytes allocated.
By the way, this can be taken a lot further with some more optimizations, which I'll spend time on adding later if this PR is approved.
I will investigate the test failures this morning - https://ci.appveyor.com/project/rabbitmq/rabbitmq-dotnet-client/builds/30966174
Found the bugs. Fixed now :) Let's see the test run.
That was fast!
@bording I assume that a new package reference is allowed in 5.2.0
(for when I backport this)
temp = bytes[3];
bytes[3] = bytes[4];
bytes[4] = temp;
return TemporaryBinaryReader(bytes).ReadDouble();
It's SO NICE to see this sort of code disappear 🎉
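For context, this is a minimal sketch (not the PR's actual implementation) of what replaces that kind of code: System.Buffers.Binary.BinaryPrimitives reads bytes in an explicitly stated order, so the manual swap-then-BinaryReader dance disappears. The type and method names here are from the BCL; the WireFormat wrapper class is illustrative.

```csharp
using System;
using System.Buffers.Binary;

static class WireFormat
{
    // AMQP encodes doubles big-endian on the wire. BinaryPrimitives reads
    // the bytes in the stated order regardless of host endianness, so no
    // conditional byte swapping is needed. ReadInt64BigEndian plus
    // Int64BitsToDouble is also available down-level via System.Memory.
    public static double ReadDouble(ReadOnlySpan<byte> source)
    {
        long bits = BinaryPrimitives.ReadInt64BigEndian(source);
        return BitConverter.Int64BitsToDouble(bits);
    }
}
```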
These are solid improvements in terms of peak memory allocations. I'll take it together with less code we reinvented (admittedly a long time ago) 👍 Thanks for your substantial contributions @stebet! 💪🤓
default:
{
    string message =
        string.Format("Invalid boolean value in StreamMessage: {0}", value);
For better performance, string.Concat can be used:
string.Concat("Invalid boolean value in StreamMessage: ", value.ToString())
Note that I also suggest value.ToString() to avoid boxing the value type.
Optimized cross-platform binary (de)serialization (cherry picked from commit 5a681f8)
Merge pull request #729 from stebet/crossPlatformSerialization
@lukebakken Well, in general that would be fine I think, but this is one of the packages that I had previously been saying that I thought would be better holding off on for the 6.0 release, and then using in 7.0. So, I'm not sure it's a good idea at all to use this in 5.2.0, but I see you've already backported it.
I understand your concern, and indeed earlier versions of this package were problematic with binding redirects, but it now has proper targets for earlier framework versions as well as netstandard1.x and is in active use by very big packages such as Npgsql and Google.Protobuf, as I mentioned above. Perhaps @terrajobst can chime in with some details or point to the source, or correct me if I'm wrong.
The latest version of the System.Memory package looks like: It doesn't have any .NET Framework targets. It also has dependencies on several other packages, so it's not just this one package, it's the entire package hierarchy. And sure, this stuff can work on .NET Framework, I just think there are enough landmines that it's better just to avoid them completely.
That's System.Buffers. Here is the System.Memory package (which does reference System.Buffers, with the correct redirects): https://www.nuget.org/packages/System.Memory/ I added the System.Memory package reference. I can create some test projects targeting different framework versions to verify that it binds correctly.
@stebet That screenshot is of the System.Memory package, not System.Buffers. But System.Memory also brings in System.Buffers.
Test projects aren't going to be enough. There are weird edge cases that you can get into based on which version of the framework you're using and which other dependencies you might also have in your project, which can lead to odd binding redirect problems. Ultimately, the issue isn't when you yourself have a reference to these packages. It's when you have references to these packages transitively through multiple other dependencies in your project, and then they depend on different versions, leading to binding redirect hell.
@bording it's very easy for me to revert these changes in
I'm sure someone could look at what I'm advocating for and claim I'm being too conservative, but I've seen enough people hit the sharp edges around them that it makes me wary to add them as dependencies for older versions of the .NET Framework. It just makes the consumers of your packages have to deal with a lot of pain. Nick Craver from Stack Overflow recently wrote up a good article that talks about a lot of the problems, and you'll notice that System.Memory is one of the problematic packages: https://nickcraver.com/blog/2020/02/11/binding-redirects/
Thanks, that's good enough evidence for me. What we see all the time are users blindly or automatically updating client libraries - frequently in production.
Reverted in
The question that remains now, is the change in this PR something you want to take for 6.0? Since 6.0 is still targeting .NET Framework, the only truly safe way to take the dependency is to target .NET Core only and prevent people from using it with the .NET Framework entirely.
It'd be interesting to run a prerelease package with the changes under 6.0 and see if they behave as they should. The fact that these packages are used by big ones like Google.Protobuf gives me added confidence that the latest versions are correctly configured. The problems with these packages were almost always a side effect of only targeting netstandard2.0 and not having proper lower targets for .NET Framework versions. I'd like to verify with people that really know their stuff around this, as I'd consider the gains (near elimination of memory churn) to be worth it in a major version bump, if that's ok with everyone. By the way, I do like your skepticism @bording, as strange as it may sound. It's always good to have a voice of reason to double check these things.
And just to give an example, we actually have several in-house projects using these libraries running on .NET Framework 4.6.2, 4.7.1 and 4.8, including WebForms, MVC 5, and a WCF app, so I do have personal experience with projects targeting them. That also includes StackExchange.Redis, which uses Pipelines, which in turn references these libraries as well.
I think the benefits that @stebet has been providing outweigh any downsides for version
It's not just that, though. It's also all the other packages you might be referencing in your project, and the versions of the problematic packages that they might be referencing where you'll find yourself running into painful problems. This library can do all the due diligence it can around them, but someone can still easily run into pain because of the combination of packages they use, and conflicting versions of these packages.
I also think that it's worth a major version bump for these kinds of improvements. I'm just thinking that it makes more sense to release 6.0 without them, and then begin working on a 7.0 that brings these improvements in and also drops .NET Framework support at the same time. @lukebakken You are of course free to move ahead with adopting these in the 6.0 timeframe, just wanted to make sure you're doing so knowing the potential risks and support issues it could cause!
We will have at least one RC for version
Proposed Changes
This PR makes the (de)serialization of data to/from the network stream cross-platform, taking the endianness of the running system into account and using array pooling for the serialization work, reducing memory allocations. This will also unlock the ability to get rid of NetworkBinary(Reader/Writer) and simply create a class that can work cross-platform directly on bytes/spans/memory provided by Streams/Sockets/Pipelines, which would decouple the serialization from the transport layer.
This also resolves #449.
Contains changes to the public API (methods that are no longer used internally)
Please note that this adds a reference to the System.Memory NuGet package. This package should be safe to use for .NET Framework targets; for example, it is currently referenced by very big packages such as Google.Protobuf, Npgsql and System.Diagnostics.DiagnosticSource, so it is well tested.
Tagging @bording for visibility.
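As a rough sketch of the array-pooling pattern described above (the PooledWriter class, frameSize parameter, and serialize callback are illustrative names, not the PR's actual API):

```csharp
using System;
using System.Buffers;
using System.IO;

static class PooledWriter
{
    // Illustrative only: rent a reusable buffer from the shared pool,
    // serialize into it, write it to the stream, and return it, so a
    // busy connection doesn't allocate a fresh byte[] for every frame.
    public static void WritePooled(Stream stream, int frameSize, Action<byte[]> serialize)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(frameSize);
        try
        {
            serialize(buffer);                 // hypothetical serializer callback
            stream.Write(buffer, 0, frameSize);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```

Note that Rent may return a buffer larger than requested, which is why the write uses frameSize rather than buffer.Length.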
Types of Changes
Checklist
CONTRIBUTING.md document