Patched DataFusion version 45.0.0 #54

Open: wiedld wants to merge 14 commits into base: base-df-upgrade-ver45

wiedld (Collaborator) commented Jan 28, 2025

Follow on to

This brings us up to the version 45 release, this Feb 3rd commit here and this apache branch.

Patches due to tech debt (all slated as lower priority, and are lingering):

Patches due to upstream bugs:

Patches because we haven't caught up yet:

  • Fixed (in ver 46):
  • Feb 5: fix(ci): build error with wasm
  • Feb 18: Specify rust toolchain explicitly, document how to change it (#14655)
  • Feb 21: Bump MSRV to 1.82, toolchain to 1.85 (#14811)
  • March 11: chore: get wasm to build in CI.
    • This is a temporary patch of ours, instead of pulling in a lot of code to properly build wasm for parquet. The actual fix requires two PRs together: 6f285d6 and the earlier (large) PR, which has most of the code but also introduced a bug that was fixed on March 11.
  • April 1: Disable sccache action to fix gh cache issue (https://github.com/apache/datafusion/pull/15536)
  • April 5: test: update sqllogictest errors.
    • This was updated on apache main with a much larger PR for improved spill (5c31692).
  • 3 weeks ago: chore: fix clippy::large_enum_variant for DataFusionError

Patches due to iox needs (unclear if upstream action TBD):

  • fix: handle when the left side of the union has no fields (e.g. an empty projection)

wiedld force-pushed the iox-13280/upgrade-df-ver45 branch from 1353531 to 6405b35 (February 4, 2025 16:21)
alamb mentioned this pull request Feb 5, 2025
alamb changed the title from "Upgrade to patched datafusion version 45.0.0" to "Patched DataFusion version 45.0.0" Feb 5, 2025
alamb force-pushed the iox-13280/upgrade-df-ver45 branch from b1c4a07 to c915be1 (February 5, 2025 19:16)
alamb (Collaborator) commented Feb 7, 2025

I re-reviewed this PR and patches and it looks good to me 👍

Thanks @wiedld

github-actions bot commented Apr 9, 2025

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

github-actions bot added the Stale label Apr 9, 2025
github-actions bot closed this Apr 16, 2025
wiedld (Collaborator, Author) commented May 15, 2025

This auto-closed.
We need this branch. It's our next DF upgrade to ver 45.

wiedld reopened this May 15, 2025
…rceDistribution) which later causes an error during EnforceSort (without our patch). The next DataFusion version 46 upgrade does the proper fix, which is to not insert the coalesce in the first place.

test: recreating the iox plan:
* demonstrate the insertion of coalesce after the use of column estimates, and the removal of the test scenario's forcing of rr repartitioning

test: reproducer of SanityCheck failure after EnforceSorting removes the coalesce added in the EnforceDistribution

fix: special case to not remove the needed coalesce
wiedld force-pushed the iox-13280/upgrade-df-ver45 branch from b7d3c03 to 4f816b8 (May 21, 2025 06:28)
* Bump sccache version to latest to fix gh cache issue.

* version blocked, trying with a hash

* disable sccache.
wiedld (Collaborator, Author) commented May 21, 2025

The security audit CI will fail due to:

Crate:     pyo3
Version:   0.23.5,
Title:     Risk of buffer overflow in `PyString::from_object`
Date:      2025-04-01
ID:        RUSTSEC-2025-0020
URL:       https://rustsec.org/advisories/RUSTSEC-2025-0020
Solution:  Upgrade to >=0.24.1

But the pyo3 upgrade cannot be done until the arrow upgrade (Apr 4 commit: b717723).
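
If the audit job needed to stay green until the arrow upgrade lands, one option would be to acknowledge the advisory explicitly instead of leaving the check red. The fragment below is a minimal sketch, assuming the CI job runs `cargo audit` and picks up a `.cargo/audit.toml` from the repository root. Nothing in this PR adds such a file, so the path and its contents are illustrative only.

```toml
# .cargo/audit.toml (hypothetical file, not part of this PR)
[advisories]
# pyo3 0.23.5 is flagged by RUSTSEC-2025-0020; the fix (>=0.24.1)
# is blocked on the arrow upgrade, so the advisory is ignored for now.
ignore = ["RUSTSEC-2025-0020"]
```

An entry like this should be removed as soon as pyo3 can be upgraded, so the audit resumes reporting the advisory if the dependency regresses.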

wiedld (Collaborator, Author) commented May 21, 2025

I'm not going to debug the last few CI items, since they do not impact us.


5 participants