-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Remove redundant statistics from FileScanConfig #14955
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f3eb12a to
c819b16
Compare
|
Looks like there are some CI issues to address Note that @blaginin fixed some issues recently, so if you merge up from main it might be better now |
c819b16 to
bdfb9ac
Compare
|
i feel it may be easier if we fix #14936 first. I was planning to do it this week, but feel free to take over (just take the issue then) |
bdfb9ac to
6743003
Compare
|
hey @Standing-Man #15352 just got merged so this pr may be easier to finish fyi 🌻 |
Signed-off-by: Alan Tang <[email protected]>
9397785 to
2d6e298
Compare
Signed-off-by: Alan Tang <[email protected]> chore: fix fmt and clippy errors
cbeb95e to
b99e658
Compare
|
Hi @blaginin, Thanks for your valuable contributions! I’ll continue working on fixing this issue. |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Standing-Man and @blaginin
FYI @xudong963 perhaps you can review this too as you are also working on statistics
xudong963
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the statistics generated in line 884 will be lost.
I think the idea is that the DataSource has the statistics rather than the FileScanConfig 🤔 |
yeah i know, but i don't see the statistics is set to the datasource. I will check the code again later. |
xudong963
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the confusion, I saw this line: https://github.com/apache/datafusion/pull/14955/files#diff-6ddd45bba34238ec207080fd56251fffeee8ef263a3ae25be172362bab753a5cL401, so the statistics isn't lost.
Thank you for your contribution!
|
Thanks again @Standing-Man and @xudong963 |
* enhance: Remove redundant statistics from FileScanConfig Signed-off-by: Alan Tang <[email protected]> * chore: fix some ci error Signed-off-by: Alan Tang <[email protected]> chore: fix fmt and clippy errors --------- Signed-off-by: Alan Tang <[email protected]>

Which issue does this PR close?
Rationale for this change
Both
FileScanConfigandDataSourcehas same statistics, it make that statistics may have out-of-sync bugs.What changes are included in this PR?
Remove redundant statistics from
FileScanConfigand only a single statistics that held on theDataSource.Are these changes tested?
Are there any user-facing changes?