ethmonitor: add poc for goroutine leak during shutdown #197
Changes from all commits
b8be9dc
8734b3b
218cfec
9f78a00
2003fb6
@@ -0,0 +1,66 @@
package ethmonitor_test

//go:generate mockgen -destination=internal/mocks/mock_provider.go -package=mocks github.com/0xsequence/ethkit/ethrpc RawInterface

import (
	"context"
	"fmt"
	"math/big"
	"runtime"
	"testing"
	"time"

	"github.com/0xsequence/ethkit/ethmonitor"
	"github.com/0xsequence/ethkit/ethmonitor/internal/mocks"
	"go.uber.org/mock/gomock"
)

// TestMonitorShutdownNoGoroutineLeak verifies that the monitor shuts down cleanly without leaking goroutines.
func TestMonitorShutdownNoGoroutineLeak(t *testing.T) {
	if testing.Short() {
		t.Skip("Skipping in short mode")
	}

	ctrl := gomock.NewController(t)
	defer ctrl.Finish()

	provider := mocks.NewMockRawInterface(ctrl)
	provider.EXPECT().ChainID(gomock.Any()).Return(big.NewInt(1), nil).AnyTimes()
	provider.EXPECT().IsStreamingEnabled().Return(false).AnyTimes()
	provider.EXPECT().RawBlockByNumber(gomock.Any(), gomock.Any()).
		Return(nil, fmt.Errorf("simulated network error")).AnyTimes()

	baseline := runtime.NumGoroutine()

	opts := ethmonitor.DefaultOptions
	opts.PollingInterval = 10 * time.Millisecond
	opts.Timeout = 50 * time.Millisecond

	monitor, err := ethmonitor.NewMonitor(provider, opts)
	if err != nil {
		t.Fatalf("failed to create monitor: %v", err)
	}

	ctx, cancel := context.WithCancel(context.Background())

	done := make(chan error, 1)
	go func() {
		done <- monitor.Run(ctx)
	}()

	time.Sleep(200 * time.Millisecond)

	time.Sleep(200 * time.Millisecond)
	waitCtx, waitCancel := context.WithTimeout(context.Background(), time.Second)
	defer waitCancel()
	for {
		if monitor.IsRunning() {
			break
		}
		select {
		case <-waitCtx.Done():
			t.Fatalf("monitor did not start running within timeout: %v", waitCtx.Err())
		case <-time.After(10 * time.Millisecond):
		}
	}
Not worth it in this context.
Copilot AI, Jan 21, 2026
The test doesn't explicitly call runtime.GC() and give goroutines time to be garbage collected before checking for leaks. The Go runtime may not immediately clean up exited goroutines. Consider adding runtime.GC() and a small sleep (e.g., 100ms) before the leak check to ensure more accurate results.
	// Force garbage collection and give it a moment to complete before checking for leaks.
	runtime.GC()
	time.Sleep(100 * time.Millisecond)
Copilot AI, Jan 21, 2026
The sleep duration of 100ms after GC may be insufficient for goroutines to fully clean up. Goroutine leak detection is inherently racy, and a more robust approach would be to poll runtime.NumGoroutine() in a loop with a reasonable timeout (e.g., 1 second) until the count returns to baseline or the timeout expires.
	time.Sleep(100 * time.Millisecond)
	// Poll for goroutine count to return to baseline with a reasonable timeout
	deadline := time.Now().Add(1 * time.Second)
	for time.Now().Before(deadline) {
		if runtime.NumGoroutine() <= baseline {
			break
		}
		time.Sleep(10 * time.Millisecond)
	}
The second select statement at lines 493-497 is redundant and creates an unnecessary race condition. After successfully receiving from the nextBlock channel (lines 488-492), the code immediately tries to send on ch. However, if the context is cancelled between these two select statements, the goroutine will return without sending on ch, which could cause the receiver to miss the block number. Consider combining these into a single select that handles both the receive and the send atomically, or ensure that the value is sent before checking context cancellation again.
I think the comment is accurate in the sense that the goroutine returns without sending on ch, but if we're shutting down that's acceptable. I don't think we should act on this comment, but I'd defer that decision to the reviewers.