Hitless upgrade: Support initial implementation for synchronous Redis client - no handshake, no failing over notifications support. #3713
base: feat/hitless-upgrade-sync-standalone
Pull Request Overview
This PR implements hitless upgrade support for the synchronous Redis client by adding maintenance events handling. The implementation allows the Redis client to gracefully handle cluster rebalancing and node migration operations without losing connections or data.
Key changes:
- Added comprehensive maintenance events handling infrastructure for MOVING, MIGRATING, and MIGRATED events
- Implemented connection pool management for hitless upgrades during cluster maintenance
- Enhanced Redis client with maintenance events configuration and handlers
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| redis/maintenance_events.py | Core maintenance events classes and handlers for processing cluster rebalancing notifications |
| redis/connection.py | Enhanced connection and pool classes with maintenance state management and proactive reconnection support |
| redis/client.py | Updated Redis client to support maintenance events configuration and handle connection reconnection during maintenance |
| redis/_parsers/base.py | Extended parsers to handle maintenance-related push notifications (MOVING, MIGRATING, MIGRATED) |
| redis/_parsers/resp3.py | Updated RESP3 parser to properly handle push notifications during maintenance events |
| redis/_parsers/hiredis.py | Updated Hiredis parser to support maintenance event push notifications |
| tests/test_maintenance_events_handling.py | Comprehensive integration tests for maintenance events handling with mocked Redis protocol |
| tests/test_maintenance_events.py | Unit tests for maintenance event classes and handlers |
| tests/test_connection_pool.py | Minor test updates to support new connection pool functionality |
Comments suppressed due to low confidence (2)
redis/connection.py:385
- The parser is being set before protocol-specific configurations are applied. Line 407 shows that RESP3Parser is set when protocol is 3, but the initial parser setup happens before this check. This could lead to incorrect parser being used for maintenance events.
self.health_check_interval = health_check_interval
redis/connection.py:381
- This line appears to be indented incorrectly. The if statement at line 380 should control this block, but the indentation suggests it's not properly nested within the conditional block.
# Update the retry's supported errors with the specified errors
        # disconnect them later.
        self._connections = []
    finally:
        if self._locked:
I think this flag is redundant. If you try to release a lock that wasn't acquired by this thread, it will raise a RuntimeError; if you try to release a lock that was acquired by another thread, the result is the same. Since you surround the release() call with try/except anyway, you will either release the lock if that's possible or suppress the exception and move on.
That's true, but it will be slower.
We will be in the True case very rarely - only when we have active moving maintenance.
If I remove the check, the exception will be raised all the time, and it is a much more resource-consuming operation than the self._locked check.
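The trade-off discussed above can be sketched as follows. This is a minimal illustration, not the PR's code: `GuardedPool`, `lock`, `release_checked`, and `release_unchecked` are hypothetical names, and a plain `threading.Lock` is assumed.

```python
import threading


class GuardedPool:
    """Hypothetical stand-in for a pool that only rarely holds its lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._locked = False  # True only while this object holds the lock

    def lock(self) -> None:
        self._lock.acquire()
        self._locked = True

    def release_checked(self) -> None:
        # Flag check: the common (unlocked) path is a single attribute read.
        if self._locked:
            self._locked = False
            self._lock.release()

    def release_unchecked(self) -> None:
        # No flag: releasing an unlocked threading.Lock raises RuntimeError,
        # so the common path pays for raising and catching an exception.
        try:
            self._lock.release()
        except RuntimeError:
            pass
```

Both methods are safe to call on an unlocked instance; they differ only in what the common path costs. To compare them on a given interpreter, time each method with `timeit` on an unlocked instance.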
def remove_expired_notifications(self):
    with self._lock:
        for notification in tuple(self._processed_events):
Why do you need to convert it into a tuple? A set is also iterable.
Because I remove elements from the set when any of the stored notifications have expired, and I can't iterate over a set while changing its elements inside the loop.
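A small sketch of why the snapshot matters. The function names and the use of integer expiry times are assumptions made for illustration; the PR stores notification objects, not integers.

```python
def remove_expired(expiries: set, now: int) -> None:
    # Iterate over a tuple snapshot so the set can be mutated safely.
    for expiry in tuple(expiries):
        if expiry <= now:
            expiries.remove(expiry)


def remove_expired_unsafe(expiries: set, now: int) -> None:
    # Iterating the set directly: once an element is removed, the next
    # iteration step raises RuntimeError (set changed size during iteration).
    for expiry in expiries:
        if expiry <= now:
            expiries.remove(expiry)
```

Calling `remove_expired` leaves only the unexpired entries, while `remove_expired_unsafe` raises `RuntimeError` as soon as it has removed an element.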
if isinstance(notification, NodeMovingEvent):
    return self.handle_node_moving_event(notification)
else:
    logging.error(f"Unhandled notification type: {notification}")
If we don't expect to handle any events other than NodeMovingEvent, why does the signature allow its parent class (MaintenanceEvent) to be passed?
Because we don't know if this will be the only one - it is for now.
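The design choice in this thread, accepting the base type while handling only one subclass today, might look roughly like this. The classes below are hypothetical stand-ins that reuse names from the diff; `SomeFutureEvent`, `PoolHandler`, and the return values are invented for the example and are not redis-py's API.

```python
import logging
from typing import Optional


class MaintenanceEvent:
    """Hypothetical base class for maintenance notifications."""


class NodeMovingEvent(MaintenanceEvent):
    """Hypothetical stand-in for the event type named in the diff."""


class SomeFutureEvent(MaintenanceEvent):
    """Hypothetical event type that the handler does not support yet."""


class PoolHandler:
    def handle_event(self, notification: MaintenanceEvent) -> Optional[str]:
        # Accept the base type so new subclasses need no signature change.
        if isinstance(notification, NodeMovingEvent):
            return self.handle_node_moving_event(notification)
        logging.error("Unhandled notification type: %s", notification)
        return None

    def handle_node_moving_event(self, notification: NodeMovingEvent) -> str:
        return "moving handled"
```

Supporting another event later means adding one more `isinstance` branch; callers that already pass `MaintenanceEvent` instances stay unchanged.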
… tests for maintenance_events.py file
…tion pool - this should be a separate PR
… Refactored the maintenance events tests not to be multithreaded - we don't need it for those tests.
…ot processed in in Moving state. Tests are updated
…ply them during connect
…isting ones more generic
…e better retry_on_error handling on connection initialization.
Pull Request check-list
Please make sure to review and check all of these items:
NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.
Description of change
Hitless upgrade support implementation for synchronous Redis client.