Improved N-1 join query performance for DW SQL #2631
/azp run
Azure Pipelines successfully started running 6 pipeline(s).
Overall looks good, but I'm looking for better testing, and for assurance that we continue to test with the feature flag off, because that's the default behavior today.
Always turning it on by default in testing leaves us at risk of regression in the false scenario.
LGTM, thanks for addressing all comments and answering the questions! We should look at better DwSql testing (e.g., not using a SQL2019 instance) as a follow-up to this PR.
Why make this change?
A year ago, DW SQL did not support JSON PATH for converting the execution result to JSON format, so we had to use STRING_AGG as a workaround.
Recently, we noticed that JSON PATH is now supported for the outer query in DW, so we can use JSON_OBJECT + JSON PATH to handle the JSON conversion for N-to-1 relations, which improves performance.
For N-N relations, we are still looking for a solution, as JSON_ARRAYAGG does not provide much of a performance improvement.
For other scenarios, where no joins are needed for a simple SELECT, we will use JSON PATH instead of STRING_AGG for better performance.
What is this change?
This PR covers:
JSON_OBJECT to generate the columns for sub-queries, with JSON PATH applied to the outer query, which fully replaces the need for STRING_AGG.
JSON PATH to replace STRING_AGG in queries without joins, for better performance as well. This affects aggregations, non-join queries, and pagination.
How was this tested?
Unit Tests
Integration Tests
Manual Testing - Join Scenarios
Query 1-1 relation - As expected, optimization applied
Query N-1 relation - As expected, optimization applied
Query 1-N Relation - As expected, optimization not applied
Query N-N Relation - As expected, optimization not applied
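The four join results above can be read as a rule on relationship cardinality: the optimization applies only when the related side contributes at most one row per parent. Below is a minimal Python sketch of that rule and of the query shape it selects. Everything named here is hypothetical (the `Cardinality` enum, both function names, and the `books`/`publishers` tables and their columns), and the SQL text only illustrates the JSON_OBJECT plus `FOR JSON PATH` idea described in this PR (the T-SQL clause behind what the description calls JSON PATH). It has not been run against a DW SQL instance and is not the query the engine actually emits.

```python
from enum import Enum


class Cardinality(Enum):
    """Relationship cardinality as seen from the entity being queried."""
    ONE_TO_ONE = "1-1"
    MANY_TO_ONE = "N-1"
    ONE_TO_MANY = "1-N"
    MANY_TO_MANY = "N-N"


def optimization_applies(cardinality: Cardinality) -> bool:
    """Return True when the related entity yields at most one row per parent.

    Mirrors the manual test results: 1-1 and N-1 are optimized,
    1-N and N-N are not.
    """
    return cardinality in (Cardinality.ONE_TO_ONE, Cardinality.MANY_TO_ONE)


def build_n_to_1_query() -> str:
    """Illustrative query text (assumed schema, not verified on DW SQL).

    The sub-query builds the single related row with JSON_OBJECT, and the
    outer query serializes the whole result with FOR JSON PATH.
    """
    return (
        "SELECT b.id, b.title,\n"
        "       (SELECT JSON_OBJECT('id': p.id, 'name': p.name)\n"
        "          FROM publishers AS p\n"
        "         WHERE p.id = b.publisher_id) AS publisher\n"
        "  FROM books AS b\n"
        "   FOR JSON PATH, INCLUDE_NULL_VALUES"
    )


if __name__ == "__main__":
    for c in Cardinality:
        print(c.value, "optimized" if optimization_applies(c) else "not optimized")
    print(build_n_to_1_query())
```

Keeping the cardinality check separate from query construction makes it straightforward to extend the rule later, for example if a workable approach for N-N relations is found.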
Other Scenarios
We've applied JSON PATH when there is no join in the query, to replace STRING_AGG for better performance.
Aggregation
Non-Join Query
Pagination
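For the non-join scenarios listed above, the difference between the two approaches can be sketched as two query builders selected by a flag. This is a simplified illustration under stated assumptions: the function names and the `use_json_path` parameter are hypothetical, the STRING_AGG variant treats every value as a string and ignores NULL handling, and neither query has been verified against a DW SQL instance.

```python
def json_path_select(table, columns):
    """New shape: let the engine serialize rows with FOR JSON PATH."""
    return (
        "SELECT " + ", ".join(columns)
        + " FROM " + table
        + " FOR JSON PATH, INCLUDE_NULL_VALUES"
    )


def string_agg_select(table, columns):
    """Workaround shape: assemble each row's JSON by hand, then aggregate.

    Simplified: every value is emitted as a JSON string and NULLs are
    not handled.
    """
    pairs = []
    for col in columns:
        value = "STRING_ESCAPE(CAST(" + col + " AS NVARCHAR(MAX)), 'json')"
        pairs.append("'\"" + col + "\":\"' + " + value + " + '\"'")
    row = "'{' + " + " + ',' + ".join(pairs) + " + '}'"
    return "SELECT '[' + STRING_AGG(" + row + ", ',') + ']' FROM " + table


def build_select(table, columns, use_json_path):
    """Pick the query shape; `use_json_path` stands in for a feature flag."""
    if use_json_path:
        return json_path_select(table, columns)
    return string_agg_select(table, columns)


if __name__ == "__main__":
    # Exercise both flag states so the default (off) path stays covered.
    for flag in (False, True):
        print(flag, build_select("books", ["id", "title"], flag))
```

Looping over both flag values in the usage section reflects the review feedback that the flag-off path, which is the current default, should remain under test.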
N to 1, total items: 3


N to N


