@jsvisa jsvisa commented Oct 23, 2025

Found in https://github.com/ethereum/go-ethereum/actions/runs/17803828253/job/50611300621?pr=32585

--- FAIL: TestClientCancelWebsocket (0.33s)
panic: read tcp 127.0.0.1:36048->127.0.0.1:38643: read: connection reset by peer [recovered, repanicked]

goroutine 15 [running]:
testing.tRunner.func1.2({0x98dd20, 0xc0005b0100})
	/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/testing/testing.go:1872 +0x237
testing.tRunner.func1()
	/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/testing/testing.go:1875 +0x35b
panic({0x98dd20?, 0xc0005b0100?})
	/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/runtime/panic.go:783 +0x132
github.com/ethereum/go-ethereum/rpc.httpTestClient(0xc0001dc1c0?, {0x9d5e40, 0x2}, 0xc0002bc1c0)
	/opt/actions-runner/_work/go-ethereum/go-ethereum/rpc/client_test.go:932 +0x2b1
github.com/ethereum/go-ethereum/rpc.testClientCancel({0x9d5e40, 0x2}, 0xc0001dc1c0)
	/opt/actions-runner/_work/go-ethereum/go-ethereum/rpc/client_test.go:356 +0x15f
github.com/ethereum/go-ethereum/rpc.TestClientCancelWebsocket(0xc0001dc1c0?)
	/opt/actions-runner/_work/go-ethereum/go-ethereum/rpc/client_test.go:319 +0x25
testing.tRunner(0xc0001dc1c0, 0xa07370)
	/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/testing/testing.go:1934 +0xea
created by testing.(*T).Run in goroutine 1
	/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/testing/testing.go:1997 +0x465
FAIL	github.com/ethereum/go-ethereum/rpc	0.371s

In testClientCancel we wrap the server listener in flakeyListener, which schedules an unconditional close of every accepted connection after a random delay. If the random delay is zero, the timer fires immediately, and the HTTP client panics with a "connection reset by peer" error.

This PR adds a minimum delay of 10ms to ensure the timer won't fire immediately.

@jwasinger

Good catch. I can confirm that I was able to locally reproduce this flakey test, and that this PR fixes it. My exact stack trace was a bit different but it was the same underlying cause of raciness due to the occasional generation of a small timeout value for the test.

@jwasinger jwasinger merged commit 0413af4 into ethereum:master Oct 23, 2025
4 of 6 checks passed
@jsvisa jsvisa deleted the rpc-flaky-test branch October 23, 2025 13:16

jsvisa commented Oct 23, 2025

> Good catch. I can confirm that I was able to locally reproduce this flakey test, and that this PR fixes it. My exact stack trace was a bit different but it was the same underlying cause of raciness due to the occasional generation of a small timeout value for the test.

Yes, I just found that we can reduce the timeout to make it easier to reproduce.
