don't write deletion vector entry in the log #3211

Open
djouallah opened this issue Feb 12, 2025 · 6 comments
Labels
bug Something isn't working

Comments

@djouallah

I'm getting errors from Snowflake when I try to read a Delta table there. It seems delta-rs adds an entry "deletionVector": null to the log, which gets misinterpreted as meaning the file has deletion vectors. I noticed that Spark doesn't write that entry in the log, hence it works just fine.
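
To check whether a commit file actually carries the entry described above, a small scan along these lines might help. This is a minimal sketch, not a delta-rs API: it assumes the commit file is newline-delimited JSON with one action per line, `find_null_dv_adds` is a hypothetical helper name, and the sample lines are made up for illustration rather than copied from a real table.

```python
import json


def find_null_dv_adds(commit_lines):
    """Return the paths of add actions that carry an explicit "deletionVector": null."""
    paths = []
    for line in commit_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        action = json.loads(line)
        add = action.get("add")
        if add is None:
            continue  # not an add action (e.g. metaData, protocol, remove)
        # The key is present, but its value is JSON null (None in Python).
        if "deletionVector" in add and add["deletionVector"] is None:
            paths.append(add.get("path"))
    return paths


# Illustrative commit content (hypothetical, not taken from a real log).
sample_commit = [
    '{"add": {"path": "part-0001.parquet", "size": 100, "dataChange": true, "deletionVector": null}}',
    '{"add": {"path": "part-0002.parquet", "size": 200, "dataChange": true}}',
]

print(find_null_dv_adds(sample_commit))
```

In practice the lines would come from a file under the table's `_delta_log` directory instead of the hard-coded list.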

@djouallah djouallah added the bug Something isn't working label Feb 12, 2025
@ion-elgreco
Collaborator

Seems like an easy fix for Snowflake: if it's null, then obviously they can ignore it.

@scovich

scovich commented Feb 13, 2025

JSON null is a weird beast... some systems treat it as non-NULL (because something is there), and the spark/parquet variant spec mandates that it is NOT SQL NULL.

That said, regardless of whether it's a "proper" SQL NULL or not, it's clearly not a DV and clients should ignore it.
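
The difference between "the key is there" and "there is a deletion vector" can be sketched as follows. This is only an illustration of the point above, assuming an add action parsed into a Python dict; both function names are hypothetical, and the "naive" check is a guess at the kind of test that could misfire, not a description of what any particular reader does.

```python
import json


def naive_has_deletion_vector(add_action):
    # Treats mere presence of the key as "a DV exists", even if the value is null.
    return "deletionVector" in add_action


def has_deletion_vector(add_action):
    # dict.get returns None both when the key is absent and when it is JSON null,
    # so the two cases are handled identically.
    return add_action.get("deletionVector") is not None


with_null = json.loads('{"path": "a.parquet", "deletionVector": null}')
without_key = json.loads('{"path": "a.parquet"}')

for action in (with_null, without_key):
    print(naive_has_deletion_vector(action), has_deletion_vector(action))
```

The naive check disagrees with itself across the two inputs, while the tolerant one reports no deletion vector in both cases.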

@Zan-L

Zan-L commented Feb 14, 2025

I asked the same question before (#3055), but the Delta protocol does allow a null deletion vector field. I guess delta-rs is just not going to be compatible with Snowflake unless someone asks Snowflake to update their code.

@ion-elgreco
Collaborator

ion-elgreco commented Feb 14, 2025

> I asked the same question before (#3055), but the Delta protocol does allow a null deletion vector field. I guess delta-rs is just not going to be compatible with Snowflake unless someone asks Snowflake to update their code.

You mean "snowflake is not going to be compatible with delta-rs and the delta protocol".

@Zan-L

Zan-L commented Feb 14, 2025

> > I asked the same question before (#3055), but the Delta protocol does allow a null deletion vector field. I guess delta-rs is just not going to be compatible with Snowflake unless someone asks Snowflake to update their code.
>
> You mean "snowflake is not going to be compatible with delta-rs and the delta protocol".

I see what you mean. Yes, the fault is not with delta-rs. However, the fact that Snowflake claims to support Delta implies that the native Delta implementation (Databricks?) must have chosen one of the two routes specified in the protocol, namely omitting the deletion vector field when it's not enabled, and Snowflake must have blindly followed Databricks. Unfortunately, Snowflake and Databricks are the more dominant players.
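
The two routes mentioned here (write the field as null, or leave it out) can be illustrated with a small serialization sketch. It assumes a simplified add action held as a Python dict; `serialize_add` and its `omit_nulls` flag are hypothetical and do not correspond to any delta-rs or Spark option.

```python
import json


def serialize_add(add, omit_nulls):
    """Serialize an add action, optionally dropping top-level fields whose value is None."""
    if omit_nulls:
        add = {key: value for key, value in add.items() if value is not None}
    return json.dumps({"add": add}, sort_keys=True)


# Simplified, made-up add action for a file without a deletion vector.
add = {"path": "part-0001.parquet", "dataChange": True, "deletionVector": None}

explicit = serialize_add(add, omit_nulls=False)  # keeps "deletionVector": null
omitted = serialize_add(add, omit_nulls=True)    # drops the key entirely

print(explicit)
print(omitted)
```

A reader that looks the field up with `.get("deletionVector")` sees `None` for both outputs, which is why the two routes are equivalent for a tolerant client.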

@ion-elgreco
Collaborator

Nothing is keeping Snowflake from fixing this issue though 🤷. They are a billion-dollar company with enough money and devs to address this on their end.


4 participants