Skip to content

Conversation

@cce
Copy link
Contributor

@cce cce commented Oct 21, 2025

Summary

Occasionally TestP2PHTTPHandler causes the logger to be called after it finishes. The logging occurs inside a goroutine started by go-libp2p-kad-dht which is created when NewP2PNetwork calls p2p.MakeCapabilitiesDiscovery which calls MakeDHT. Calling P2PNetwork.Stop() calls dht.Close() and should shut down these goroutines.

=== FAIL: network  (0.00s)
PASS
panic: Log in goroutine after TestP2PHTTPHandler has completed: time="2025-10-20T18:00:54.980905 +0000" level=warning msg="failed when refreshing routing table2 errors occurred:\n\t* failed to query for self, err=failed to find any peer in table\n\t* failed to refresh cpl=0, err=failed to find any peer in table\n\n" file=rt_refresh_manager.go function="github.com/libp2p/go-libp2p-kad-dht/rtrefresh.(*RtRefreshManager).loop" libp2p=dht/RtRefreshManager line=187

goroutine 10525 [running]:
testing.(*common).log(0xc002740a80, {0xc00272fba0, 0x190})
	/opt/hostedtoolcache/go/1.25.3/x64/src/testing/testing.go:1030 +0x1df
testing.(*common).Log(0xc002740a80, {0xc0030151e0, 0x1, 0x1})
	/opt/hostedtoolcache/go/1.25.3/x64/src/testing/testing.go:1180 +0x77
github.com/algorand/go-algorand/logging.TestLogWriter.Write({{0x2e25150?, 0xc002740a80?}}, {0xc003052280, 0x190, 0x4966c9?})
	/home/runner/work/go-algorand/go-algorand/logging/testingLogger.go:40 +0x104
github.com/sirupsen/logrus.(*Entry).write(0xc0007461c0)
	/home/runner/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:286 +0x262
github.com/sirupsen/logrus.(*Entry).log(0xc000746150, 0x3, {0xc003270b40, 0xbb})
	/home/runner/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:251 +0x976
github.com/sirupsen/logrus.(*Entry).Log(0xc000746150, 0x3, {0xc002ff3ae8, 0x1, 0x1})
	/home/runner/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:293 +0x8c
github.com/algorand/go-algorand/network/p2p.(*loggingCore).Write(0xc0016bec80, {0x1, {0xc235bc15ba76b315, 0x18a1265d3, 0x3c46720}, {0x27f0bcb, 0x14}, {0xc003270a80, 0xbb}, {0x1, ...}, ...}, ...)
	/home/runner/work/go-algorand/go-algorand/network/p2p/logger.go:136 +0x6ba
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc002fd1380, {0x0, 0x0, 0x0})
	/home/runner/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:253 +0x1ed
go.uber.org/zap.(*SugaredLogger).log(0xc0006e14a0, 0x1, {0x0, 0x0}, {0xc002ff3f28, 0x2, 0x2}, {0x0, 0x0, 0x0})
	/home/runner/go/pkg/mod/go.uber.org/[email protected]/sugar.go:355 +0x12d
go.uber.org/zap.(*SugaredLogger).Warn(...)
	/home/runner/go/pkg/mod/go.uber.org/[email protected]/sugar.go:161
github.com/libp2p/go-libp2p-kad-dht/rtrefresh.(*RtRefreshManager).loop(0xc0000ff680)
	/home/runner/go/pkg/mod/github.com/libp2p/[email protected]/rtrefresh/rt_refresh_manager.go:187 +0x1f4
created by github.com/libp2p/go-libp2p-kad-dht/rtrefresh.(*RtRefreshManager).Start in goroutine 4610
	/home/runner/go/pkg/mod/github.com/libp2p/[email protected]/rtrefresh/rt_refresh_manager.go:93 +0xa5
FAIL	github.com/algorand/go-algorand/network	6.671s

DONE 14328 tests, 2328 skipped, 1 failure in 253.834s

Test Plan

Test should stop having this rare flakiness.

@codecov
Copy link

codecov bot commented Oct 21, 2025

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
106981 1 106980 2539
View the top 1 failed test(s) by shortest run time
github.com/algorand/go-algorand/ledger::TestCatchpointTrackerWaitNotBlocking
Stack Traces | 0.14s run time
=== RUN   TestCatchpointTrackerWaitNotBlocking
    catchpointtracker_test.go:1103: 
        	Error Trace:	.../go-algorand/ledger/catchpointtracker_test.go:1103
        	Error:      	Wait(16) is blocked
        	Test:       	TestCatchpointTrackerWaitNotBlocking
--- FAIL: TestCatchpointTrackerWaitNotBlocking (0.14s)

To view more test analytics, go to the Test Analytics Dashboard
📋 Got 3 mins? Take this short survey to help us improve Test Analytics.

// zero clients allowed, rate limiting window (10s) is greater than queue deadline (1s)
netB, err := NewP2PNetwork(log, cfg, "", nil, GenesisInfo{genesisID, config.Devtestnet}, &nopeNodeInfo{}, nil, nil)
require.NoError(t, err)
defer netB.Stop() // even though netB.Start is not called, NewP2PNetwork creates goroutines to stop
Copy link
Contributor

@algorandskiy algorandskiy Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which goroutines exactly? I only see that MakeCapabilitiesDiscovery can run gorotines indeed DHT is not needed for this test.
Upd: PR description mentions that. Read the test, EnableDHTProviders is needed for this test.

// zero clients allowed, rate limiting window (10s) is greater than queue deadline (1s)
netB, err := NewP2PNetwork(log, cfg, "", nil, GenesisInfo{genesisID, config.Devtestnet}, &nopeNodeInfo{}, nil, nil)
require.NoError(t, err)
defer netB.Stop() // even though netB.Start is not called, NewP2PNetwork creates goroutines to stop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
defer netB.Stop() // even though netB.Start is not called, NewP2PNetwork creates goroutines to stop
defer netB.Stop() // even though netB.Start is not called, NewP2PNetwork creates DHT goroutines to stop

Copy link
Contributor

@algorandskiy algorandskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well probably DHT start initialization needs to be reworked to have Start running goroutines, not New

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants