-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A bunch of improvements #479
Conversation
WalkthroughThe changes introduce timeout parameters into various shutdown methods across multiple components. The Changes
Sequence Diagram(s)sequenceDiagram
participant Main
participant ReplicationServer
participant ApiServer
Main->>ReplicationServer: WaitForShutdown(10 * time.Second)
ReplicationServer->>ApiServer: Shutdown(10 * time.Second)
alt Graceful Shutdown
ApiServer->>ApiServer: gracefulShutdown(timeout)
else Immediate Shutdown
ApiServer->>ApiServer: Stop gRPC server
end
ApiServer-->>ReplicationServer: Shutdown complete
ReplicationServer-->>Main: Shutdown complete
sequenceDiagram
participant SyncWorker
participant recvChan
participant errChan
SyncWorker->>recvChan: Start listening for messages
SyncWorker->>errChan: Start listening for errors concurrently
alt Message Received
recvChan->>SyncWorker: Process message
else Error Encountered
errChan->>SyncWorker: Log and manage error
end
Possibly related PRs
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (6)
pkg/sync/syncWorker.go (1)
327-363
: Good use of goroutines and channels for non-blocking streaming.This approach nicely separates message reception from processing. However, if there is a high volume of envelopes, consider using a buffered channel or managing backpressure to avoid potential blocking scenarios. Otherwise, the concurrency logic looks correct and well-structured.
pkg/testutils/store.go (2)
23-33
: Consider sanitizing or further trimming the function name.
getCallerName
is straightforward and helpful for generating context-based names. However, if test function names contain unusual characters or become lengthy, database creation might fail in Postgres (with a 63-character limit on identifiers). Consider trimming or replacing non-alphanumeric characters to ensure reliability.
49-50
: Verify length and characters of generated database names.Prefixing
"test_"
plus the function name plus a 12-character random string could exceed Postgres’s identifier length limit or include invalid characters (if any appear in the test name). To prevent failures, you could truncate or sanitize the final name.Example idea:
-dbName := "test_" + getCallerName(3) + "_" + RandomStringLower(12) +rawName := getCallerName(3) +safeName := sanitizeDBName(rawName) // user-implemented function +dbName := "test_" + safeName + "_" + RandomStringLower(12)cmd/replication/main.go (1)
141-141
: Consider making the shutdown timeout configurable.Hardcoding
s.WaitForShutdown(10 * time.Second)
is functional, but allowing configuration or command-line parameters could make it more flexible for different environments or testing scenarios.pkg/api/server.go (1)
149-167
: Consider adding error handling for graceful shutdown.While the implementation is good, it might be helpful to return an error from the Close method to indicate whether the shutdown was graceful or forced.
Apply this diff to enhance error handling:
-func (s *ApiServer) Close(timeout time.Duration) { +func (s *ApiServer) Close(timeout time.Duration) error { s.log.Debug("closing") if s.grpcServer != nil { if timeout != 0 { s.gracefulShutdown(timeout) } else { s.grpcServer.Stop() } } if s.grpcListener != nil { if err := s.grpcListener.Close(); err != nil && !isErrUseOfClosedConnection(err) { s.log.Error("Error while closing grpc listener", zap.Error(err)) + return fmt.Errorf("failed to close grpc listener: %w", err) } s.grpcListener = nil } s.wg.Wait() s.log.Debug("closed") + return nil }pkg/server/server.go (1)
264-269
: Consider adding signal handling timeout.While propagating the timeout to Shutdown is good, the signal handling itself could benefit from a timeout to prevent hanging indefinitely.
Apply this diff to add signal handling timeout:
func (s *ReplicationServer) WaitForShutdown(timeout time.Duration) { termChannel := make(chan os.Signal, 1) signal.Notify(termChannel, syscall.SIGINT, syscall.SIGTERM, syscall.SIGHUP, syscall.SIGQUIT) - <-termChannel + select { + case <-termChannel: + case <-time.After(timeout): + s.log.Warn("Shutdown timeout reached while waiting for signal") + } s.Shutdown(timeout) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
cmd/replication/main.go
(2 hunks)pkg/api/server.go
(1 hunks)pkg/server/server.go
(3 hunks)pkg/server/server_test.go
(2 hunks)pkg/sync/syncWorker.go
(2 hunks)pkg/testutils/api/api.go
(1 hunks)pkg/testutils/store.go
(3 hunks)
🧰 Additional context used
🪛 golangci-lint (1.62.2)
pkg/sync/syncWorker.go
204-204: ineffectual assignment to err
(ineffassign)
🪛 GitHub Check: Lint-Go
pkg/sync/syncWorker.go
[failure] 204-204:
ineffectual assignment to err (ineffassign)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Test (Node)
🔇 Additional comments (6)
pkg/api/server.go (1)
132-147
: LGTM! Well-structured graceful shutdown implementation.The implementation correctly handles both graceful and forced shutdown scenarios using concurrent goroutines and context cancellation for coordination.
pkg/testutils/api/api.go (1)
167-167
: LGTM! Appropriate use of immediate shutdown for test cleanup.Using
timeout=0
is correct for test scenarios where we want immediate cleanup without waiting for graceful shutdown.pkg/server/server.go (1)
271-292
: LGTM! Well-structured component shutdown sequence.The shutdown sequence is correct, closing components in a logical order and propagating the timeout to the API server.
pkg/server/server_test.go (3)
143-146
: LGTM! Proper test cleanup with immediate shutdown.Using
timeout=0
ensures quick and deterministic test cleanup.
122-131
: LGTM! Improved node change notification handling.The separate channels with individual cleanup functions improve test isolation and prevent cross-contamination of notifications between nodes.
136-136
: LGTM! Added mock Stop method.The Stop method mock correctly implements the registry interface.
Hard to break these apart as they are all kinda entangled.
We know that
TestReadOwnWritesGuarantee
occasionally fails (Test failure TestReadOwnWritesGuarantee #478) but the output of that test is mangled with the output from the previous test.The mangling is caused by unclean shutdown. As those resources get cleaned up, the output gets printed all over the place.
In tests, we don't need soft shutdown, so this PR introduces a hard shutdown to speed up tests.
The sync worker was leaking connections
Recv() is blocking and does not listen to ctx Done. A separate mechanism has to be introduced to handle shutdowns cleanly. This is the same pattern that we use in
nodeCursorTracker
Create DBs with the name of the test in them. Makes SQL debugging easier.
Summary by CodeRabbit