Skip to content

Fix dead locks in connection pool #499

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

dkropachev
Copy link
Collaborator

Investigating problems with scaling test for #473 I have found dead locks in both pools implementations: HostConnection and HostConnectionPool

All these deadlocks caused by following reasons:

  1. Pool and connection are locked together
  2. Connection.close is called on locked connection

Solution to all of them is to avoid locking pool and session at the same time.
Tests were not added because for most cases there could be no reliable test possible.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

and pool

Locking connection and pool at the same time can lead to deadlock.
Consider that connection.close also locks connection.
pool

Locking connection and pool can lead to deadlock.
Consider that connection.close locks connection.
@dkropachev dkropachev requested a review from Lorak-mmk July 8, 2025 10:32
@dkropachev dkropachev self-assigned this Jul 8, 2025
Copy link

@Lorak-mmk Lorak-mmk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, I don't understand this code well enough to say with confidence that those changes are correct. At first glance they look correct, left just one comment about improving commit message.
I wish it was written in a more dev-friendly way: one thread/task/whatever than handles connection / pool, receives events etc. Then there is basically no need for any locks, since things only have one owner.

and pool

Locking connection a pool can lead to a deadlock.
Consider that connection.close also locks connection.
This commit also fixes condition for closing connection right away.
Instead of checking there is no in-flight requests it now checks if
in-flight requests less or equal than orphaned requests, since orphaned
requests still counts as in-flight and can be safely disregarded when connection
is dropped.
and connection

Locking connection and pool at the same time lead to dead lock.
Consider connection.close locks connection.
@dkropachev dkropachev force-pushed the dk/fix-dead-lock-in-connection-pool branch from 161c996 to 729f9cc Compare July 8, 2025 18:31
@dkropachev dkropachev requested a review from Lorak-mmk July 8, 2025 18:39
@dkropachev
Copy link
Collaborator Author

dkropachev commented Jul 8, 2025

Eh, I don't understand this code well enough to say with confidence that those changes are correct. At first glance they look correct, left just one comment about improving commit message. I wish it was written in a more dev-friendly way: one thread/task/whatever than handles connection / pool, receives events etc. Then there is basically no need for any locks, since things only have one owner.

We could have rewritten it, but I think it would be way better to invest into the wrapper around rust-driver.

@dkropachev dkropachev merged commit 4f2312f into scylladb:master Jul 11, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants