Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Implement option 'delete_rows' of argument 'if_exists' in 'DataFrame.to_sql' API. #60376
feat: Implement option 'delete_rows' of argument 'if_exists' in 'DataFrame.to_sql' API. #60376
Changes from all commits
a52d2e2
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in the case the table does not exist we are just ignoring any instruction to perform a DELETE? I'm somewhat wary of assuming a user may not want an error here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, me neither.
We can remove this check letting the driver error and eventually raise a
DatabaseError
.What you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought I think this is OK. It follows the same pattern as
replace
which will create the table if it does not existThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still a bit unclear why we are calling
.close()
in this implementation but not in the othersThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a good / recommended practice to close a cursor after usage because you release resources. I believe that's a fact we agree ?
Ok...but putting it aside for a moment the answer is the adbc driver raises an error in case a cursor is not explicitly closed, causing some tests to fail. There's no such check in sqlalchemy / sqlite3, that's why a missing

close
is "overlooked" there. Example:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So do sqlalchemy and sqlite3 just leak the cursor then? Or should they be adding this?
I am just confused as to why we are offering potentially different cursor lifetime management across the implementations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case pandas is leaking the cursor because sqlalchemy and sqlite3 do not provide a cool and friendly message alerting the developer. On the other hand it is standard practice to always close it.
I'm in favor of adding
.close
(so we guarantee there's no leak) to all calls but we had this discussion previously.Please let me know if I can help with more context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh OK - I thought that previously you tried to
.close
on those as well but they would cause other test failures, as the lifecycle of the cursor was tied to the class.We shouldn't be leaking resources across any of these implementations, but the challenge is that we also need to stay consistent in how we are managing lifecycles. If it is as simple as calling .close (or pref using a context manager) for all implementations, then let's do that. If calling .close on everything but ADBC is causing an error, then we need to align the cursor lifecycle management of the ADBC subclass with the others
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There you go @WillAyd :). Sorry for not sending it as a separate commit. I'm keeping a close eye on CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good. CI passed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK great. And none of these offer a context manager right? And we don't want to be using
self.pd._sql.execute
either?I'm still a little unclear as to the difference in usage of
self.execute
versusself.pd_sql.execute
within this PR, but I also don't want to bikeshed if that's the path it takes us downThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sqlite3 implementation for a cursor does not provide a close object. This is one of the reasons I decided to standardize to
.close
calls. Of course there's a workaround for that (contextlib.closing) and we can discuss if you find it worth it.No worries, I don't think you're bikeshedding. It is a bit confusing for sure. Let me try to break it down:
self.pd_sql.execute
should be used only atSQLiteTable
orSQLTable
classes. These classes don't implement theexecute
method asSQLiteDatabase
andSQLDatabase
do, respectively and that's why we gotta useself.pd_sql
.con
orcursor
objects directly...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK that's helpful. Sounds like
self.pd_sql.execute
is just poorly named, but that's not a problem for this PR to fix