[SignalR] [Java] Fix NPE when closing hub connection during negotiation #62319
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix NPE when closing hub connection during negotiation
Fixes a
NullPointerException
thrown ifHubConnection.close()
orHubConnection.stop()
is called on a HubConnection while the negotiate request is still ongoing, and eventually fails.The cause and the effect
The following must happen for the NPE to trigger:
HubConnectionState.CONNECTING
connectionState.transport
connectionState.transport.stop()
, butconnectionState.transport
is null.This NPE is really insidious, because it cannot be caught as usual. RxJava will directly call
Thread.uncaughtException()
(throughRxJavaPlugins.onError()
), and theHubConnection.stop()
Completable will never complete.Alternatively
HubConnection.close()
will block forever (it callsstop().blockingAwait()
with no timeout).The fix
Simply checking if transport is null before trying to close it will avoid this from happening. The start task which errored is guaranteed to be finished at that point, so there should be no leaks of the transport either, it won't be set after the error was thrown.
I also added a test for this specific scenario, it was failing before this fix, and now works as expected.
It unfortunately relies on timing of concurrently running start and stop, I chose a 100ms time window, which should hopefully not be flaky, and is in line with some other tests which also wait for 100 ms, so the test should not run for an unnecessarily long time.
Fixes #52907