Conversation
|
I already have something like this included in #3582 if you'd like to take a look. |
I checked but still don't see any timeout around ping(). The code I modify seem unchanged. So pings can hang indefinitely, which can affect return_to_pool and test_before_acquire? |
|
sqlx/sqlx-core/src/pool/connection.rs Line 154 in 5d4a5dc This covers both the On the acquire side, the entire call to sqlx/sqlx-core/src/pool/inner.rs Line 177 in 5d4a5dc |
Ok, then. Seems it is covered. Will close this then. Looking forward for your pool changes to merge. Seems they have been in progress for a while. Thanks for working on cleaning that up. Async rust can be hard! |
Does your PR solve an issue?
The PR adds a optional and configurable ping timeout. Without ping timeout, some connections can get stuck indefinitely in rare circumstances, where a TCP connection is blackholed. I don't a reliable repro, but this is a clearly a problem that can happen, especially in cloud environments with live VM migrations, etc. This can affect the connection pool in two ways:
test_before_acquireis set to true. Unlike return_to_pool, the duration of this is bounded by the acquire timeout. However, if we trigger the acquire connection timeout, the caller will observe an error. However, if we configure a tighter ping timeout, the caller will be able to establish a new connection within the acquire timeout.To add this, add a configurable and optional ping timeout.
Is this a breaking change?
No. The default ping timeout is None, which means no timeout. I would personally set this to tens of seconds or less in a production environment, but kept the default to None so the change is opt-in.
Testing
Added tests that exercise the success and error (timeout) on acquire and return_to_pool. Duration::ZERO always results in timeout to facilitate making the tests non-flaky and deterministic.