Improve async string perf and fix reading chars with initial offset. #3377
base: main
Conversation
…decrease memcopy and gc time spent
@@ -13133,8 +13133,7 @@ bool writeDataSizeToSnapshot
    totalCharsRead = (startOffsetByteCount >> 1);
    charsLeft -= totalCharsRead;
    offst = totalCharsRead;
This is all that's required to fix #3331, done in both netcore and netfx.
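The offset arithmetic in the hunk above can be sketched as follows. This is a hedged illustration in Python (the real code is C# inside TdsParser); the function name `apply_start_offset` is hypothetical, but the variable names mirror the diff.

```python
def apply_start_offset(start_offset_byte_count: int, chars_requested: int):
    # UTF-16 chars are two bytes wide, so the byte offset halved
    # gives the number of chars already consumed
    total_chars_read = start_offset_byte_count >> 1
    chars_left = chars_requested - total_chars_read  # fewer chars remain to read
    offst = total_chars_read                         # resume writing after the offset
    return total_chars_read, chars_left, offst

# e.g. a 10-byte initial offset into a 100-char read
print(apply_start_offset(10, 100))  # (5, 95, 5)
```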
@@ -13088,7 +13088,7 @@ bool writeDataSizeToSnapshot
        ||
        (buff.Length >= offst + len)
        ||
-       (buff.Length == (startOffsetByteCount >> 1) + 1),
+       (buff.Length >= (startOffsetByteCount >> 1) + 1),
This change simply allows the target buffer we're copying into to be a bit larger than strictly necessary. This is required so that we can make the later change, which grows the buffer by more than is immediately needed.
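As a sketch, the relaxed precondition now accepts any sufficiently large buffer rather than one of exactly the expected size. This is an illustrative Python rendition of the assert condition, not the real C# code; `buffer_is_valid` is a hypothetical name.

```python
def buffer_is_valid(buff, offst, length, start_offset_byte_count) -> bool:
    # mirrors the assert in ReadPlpUnicodeChars: the buffer may be absent,
    # large enough for the requested range, or at least large enough to
    # hold the chars implied by the starting byte offset (>= rather than ==)
    return (buff is None
            or len(buff) >= offst + length
            or len(buff) >= (start_offset_byte_count >> 1) + 1)
```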
@@ -13152,7 +13152,10 @@ bool writeDataSizeToSnapshot
    }
    else
    {
-       newbuf = new char[offst + charsRead];
+       // grow by an arbitrary number of packets to avoid needing to reallocate
As this comment explains, we grow the buffer in a slightly unusual way. Typically buffer sizes are doubled or grown by some reasonable fraction (like 1.5x). That can be unhelpful when we know we're likely to run in low-memory conditions, such as containers, and that the data could be very large. It would be very easy to have a 512 MiB buffer and be unable to resize it to 1 GiB because of limits.
The reason we need to resize the buffer at all is that we've encountered more data and we aren't certain of the total data size. In the case of a multi-packet string we would keep encountering more data with each packet and keep creating new buffers, copying data, and then doing it all again, resulting in huge amounts of memory throughput and time spent in memcpy; see the profile traces in #3274 (comment).
This strategy allows the buffer to be slightly oversized each time more space is needed. The multiple of packet size means that we can accept at least that number of packets before needing another resize (because packet data cannot be entirely made up of char data), which reduces the reallocation frequency by the multiplier. 8 is an entirely arbitrary choice guided by the opposing needs to 1) be conservative about growth and 2) reduce the frequency of reallocation.
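The growth strategy described above can be sketched as follows. This is an illustrative Python model (the real code is C# allocating a new char[] and copying with Array.Copy); the names `grow_buffer` and `PACKET_MULTIPLIER` are hypothetical stand-ins.

```python
PACKET_MULTIPLIER = 8  # the arbitrary packet multiple discussed above

def grow_buffer(buff: list, offst: int, chars_read: int, packet_size_bytes: int) -> list:
    needed = offst + chars_read
    if len(buff) >= needed:
        return buff  # already large enough, no copy required
    # a packet holds at most packet_size_bytes / 2 chars, so this extra
    # headroom absorbs at least PACKET_MULTIPLIER more packets before
    # the next reallocation and copy
    capacity = needed + PACKET_MULTIPLIER * (packet_size_bytes >> 1)
    return buff + ['\0'] * (capacity - len(buff))  # stand-in for new char[] + copy
```

With an 8 KiB packet size, each resize buys roughly 32,000 chars of headroom, so a long multi-packet string triggers a copy once per eight packets instead of once per packet.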
@@ -13096,9 +13096,9 @@ bool writeDataSizeToSnapshot
    // If total length is known up front, the length isn't specified as unknown
    // and the caller doesn't pass int.max/2 indicating that it doesn't know the length
    // allocate the whole buffer in one shot instead of realloc'ing and copying over each time
-   if (buff == null && stateObj._longlen != TdsEnums.SQL_PLP_UNKNOWNLEN && len < (int.MaxValue >> 1))
+   if (buff == null && stateObj._longlen != TdsEnums.SQL_PLP_UNKNOWNLEN && stateObj._longlen < (int.MaxValue >> 1))
I've found this line of code suspicious for quite a while but haven't been confident about changing it. Understanding what it's doing requires understanding both the internals of the code and how PLP data is represented in the TDS protocol; please see TDS spec 2.2.5.2.3:
A value sent as PLP can be either NULL, a length followed by chunks (as defined by PLP_CHUNK), or an unknown length token followed by chunks.
The important part is that the total length can be sent up front. In practice, when we encounter a PLP data stream we read the length and store the value in stateObj._longlen. The len parameter is not the same thing: it tells the function how much data we're expecting it to read, and if we're reading chunks we usually don't know that.
Now look at the check: it tests the buffer, then the PLP total length to see if it's unknown, then the length we're asking for. Why does it matter if we're asking for "as much as possible", which is what int.MaxValue / 2 is interpreted as? As far as I can see it doesn't matter in this context. The context is deciding how large a buffer to allocate.
Now imagine we change it like this: if we have no buff yet, and we know that stateObj._longlen (which is what the protocol gave us as the known total length) is not the specific value for unknown, and the value of stateObj._longlen is less than the largest possible char buffer the runtime allows us to allocate, then we can allocate a buffer of that size.
With that change all tests continue to pass and we're more likely to allocate correctly sized buffers for PLP data at the start of the read process. This avoids later copying, resizing, etc.
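The revised condition can be sketched like this. The constant values are assumptions for illustration (SQL_PLP_UNKNOWNLEN stands for the protocol's unknown-length sentinel, and the cap approximates the largest char count the check permits), and `should_allocate_upfront` is a hypothetical name, not the real C# method.

```python
SQL_PLP_UNKNOWNLEN = 0xFFFFFFFFFFFFFFFF  # assumed sentinel for "total length unknown"
MAX_CHAR_LEN = (2**31 - 1) >> 1          # int.MaxValue >> 1

def should_allocate_upfront(buff, longlen: int) -> bool:
    # allocate the whole buffer in one shot only when there is no buffer yet,
    # the protocol told us the total length, and that length fits in a char[]
    return (buff is None
            and longlen != SQL_PLP_UNKNOWNLEN
            and longlen < MAX_CHAR_LEN)
```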
It looks like we're still requesting/creating a buffer of length Min(stateObj._longlen, len) below, so will this actually change the behavior?
When we're chunking PLP data, what values are passed in for len? Is it just a guess? Based on packet size? Something else?
len will be the unknown-length sentinel, which is int.MaxValue / 2 as the original comment mentions, so all valid PLP longlens will be less than it in that case. If we are passed a len value which is less than the chunk size, it'll read the chunk length and then read the len bytes that it wants from that chunk. The Min(len, _longlen) still makes sense, which is why I didn't change it.
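The surviving Min(len, _longlen) sizing might be sketched like this; `initial_buffer_length` is a hypothetical stand-in for the C# allocation site, not the actual method.

```python
UNKNOWN_CHAR_LEN = (2**31 - 1) >> 1  # sentinel passed as len when the caller doesn't know

def initial_buffer_length(longlen: int, len_requested: int) -> int:
    # when len is the sentinel, any valid longlen is smaller and wins;
    # when the caller asks for less than is available, its request wins
    return min(longlen, len_requested)

assert initial_buffer_length(500, UNKNOWN_CHAR_LEN) == 500
assert initial_buffer_length(10_000, 100) == 100
```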
I'm a bit suspicious of this change ... SQL_PLP_UNKNOWNLEN is a magic number that indicates the PLP data is of unknown length, but (int.MaxValue >> 1) is a magic number that indicates the caller doesn't know the length. Also, _longlen is a ulong but len is just an int, so I'm confused why the magic int.MaxValue / 2 value would be useful for _longlen. I.e., there's a wide range between SQL_PLP_UNKNOWNLEN and int.MaxValue >> 1 that isn't being accounted for, afaict.
Either way, rewriting the comment above the check and giving int.MaxValue >> 1 a named constant would probably help with understanding what's going on here.
My interpretation of the int.MaxValue >> 1 value is that it's simply a derivation of the SQL_PLP_UNKNOWNLEN constant: a byte count converted to a char count, since chars are twice as long. As far as I know the protocol only deals with byte-count lengths. If you're reading bytes you'd expect the unknown value, and if you're working in chars you divide by 2; there are a number of other places in the codebase where this operation is done.
I've spent a lot of time debugging through this code over a few years because of the async-continue work and other improvements before that. When doing so, I've observed that the piece of code I'm changing here is almost never executed because len is the sentinel int.MaxValue / 2 value. Inspecting the state on several of those occasions, it wasn't clear to me why the check was written the way it is.
As I mentioned in #3377 (comment) I found this suspicious, but only suspicious. It's not clearly incorrect, and the code functions, but it does so less well than it possibly can. I appreciate the need for confidence in the change, but without being the original author I can't identify the original intention of the check or whether it guards against some obscure condition we can't identify. I can't claim complete knowledge of why it is the way it is, so I also can't say that the change is entirely safe; it's just too complicated for that level of assertion.
Without this change everything continues to work: if we don't allocate the entire array up front, we carry on into the code path where we repeatedly resize the array, which I've changed in this PR. If you'd prefer to drop this particular change from the PR and keep all the other changes, I'd be fine with that; it's still a bug-fix and net-gain PR.
/cc @Tornhoof, @AlexanderKot this should fix your issues. @dotnet/sqlclientdevteam can I get a CI kick? I'd really like to get this into preview 2 because it fixes all the known issues with async-continue.
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
##             main    #3377      +/-   ##
==========================================
- Coverage   67.04%   58.83%    -8.22%
==========================================
  Files         300      292        -8
  Lines       65376    65214      -162
==========================================
- Hits        43831    38366     -5465
- Misses      21545    26848     +5303
Pull Request Overview
This PR targets performance improvements in asynchronous string processing and corrects an issue with reading characters using an initial offset.
- Added a new sequential character reading test ("CanGetCharsSequentially") to verify correct behavior.
- Introduced helper methods in tests to use pooled buffers, ensuring buffers are resized appropriately.
- Updated buffer management logic in TdsParser (both netfx and netcore) to use the correct offset and to improve buffer reallocation efficiency.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/Microsoft.Data.SqlClient/tests/ManualTests/SQL/DataReaderTest/DataReaderTest.cs | Adds a new test and helper method for sequential character reading using pooled buffers. |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsParserStateObject.cs | Introduces a GetPacketSize() method and a property to calculate the current packet index. |
| src/Microsoft.Data.SqlClient/netfx/src/Microsoft/Data/SqlClient/TdsParser.cs | Adjusts buffer sizing logic and offset handling to improve performance and correctness. |
| src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParser.cs | Mirrors changes in netfx for improved buffer management and offset accumulation. |
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
I can't say I know much about the internals here, so I appreciate the callouts on what changes are being made. However, I'm not understanding what the _longlen changes are really doing. Please take a look at my comments and I'll approve if you can explain it without code changes :)
@@ -13088,17 +13088,17 @@ bool writeDataSizeToSnapshot
        ||
        (buff.Length >= offst + len)
        ||
-       (buff.Length == (startOffsetByteCount >> 1) + 1),
+       (buff.Length >= (startOffsetByteCount >> 1) + 1),
        "Invalid length sent to ReadPlpUnicodeChars()!"
    );
    charsLeft = len;

    // If total length is known up front, the length isn't specified as unknown
    // and the caller doesn't pass int.max/2 indicating that it doesn't know the length
I think with this change, the comment will need to be changed as well - for this check we no longer care what the caller requested.
What would you like it changed to?
Description
For the performance-related changes, see the background information in #3274 (comment) and the preceding conversation. Reading a string in async mode was slower than expected, so I investigated. I will annotate each change with an explanation.
Issues
fixes #3331
fixes #3274
Testing
Tests have been added which cover both issues that are being fixed.