Hitless upgrade: Support initial implementation for synchronous Redis client - no handshake, no failing over notifications support. #3713

Open: wants to merge 22 commits into base: feat/hitless-upgrade-sync-standalone

petyaslavova
Collaborator

Pull Request check-list

Please make sure to review and check all of these items:

  • Do tests and lints pass with this change?
  • Do the CI tests pass with this change (enable it first in your forked repo and wait for the github action build to finish)?
  • Is the new or changed code fully tested?
  • Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?
  • Is there an example added to the examples folder (if applicable)?

NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.

Description of change

Hitless upgrade support implementation for synchronous Redis client.

@petyaslavova petyaslavova force-pushed the ps_hitless_upgrade_sync_redis branch 4 times, most recently from 0304da5 to ee27bd2 Compare July 22, 2025 16:51
@petyaslavova petyaslavova force-pushed the ps_hitless_upgrade_sync_redis branch 2 times, most recently from 0f9734c to 6d496f0 Compare July 24, 2025 13:45
@petyaslavova petyaslavova requested a review from Copilot July 26, 2025 10:11
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements hitless upgrade support for the synchronous Redis client by adding maintenance events handling. The implementation allows the Redis client to gracefully handle cluster rebalancing and node migration operations without losing connections or data.

Key changes:

  • Added comprehensive maintenance events handling infrastructure for MOVING, MIGRATING, and MIGRATED events
  • Implemented connection pool management for hitless upgrades during cluster maintenance
  • Enhanced Redis client with maintenance events configuration and handlers

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| redis/maintenance_events.py | Core maintenance events classes and handlers for processing cluster rebalancing notifications |
| redis/connection.py | Enhanced connection and pool classes with maintenance state management and proactive reconnection support |
| redis/client.py | Updated Redis client to support maintenance events configuration and handle connection reconnection during maintenance |
| redis/_parsers/base.py | Extended parsers to handle maintenance-related push notifications (MOVING, MIGRATING, MIGRATED) |
| redis/_parsers/resp3.py | Updated RESP3 parser to properly handle push notifications during maintenance events |
| redis/_parsers/hiredis.py | Updated Hiredis parser to support maintenance event push notifications |
| tests/test_maintenance_events_handling.py | Comprehensive integration tests for maintenance events handling with mocked Redis protocol |
| tests/test_maintenance_events.py | Unit tests for maintenance event classes and handlers |
| tests/test_connection_pool.py | Minor test updates to support new connection pool functionality |

Comments suppressed due to low confidence (2)

redis/connection.py:385

  • The parser is set before the protocol-specific configuration is applied. Line 407 shows that RESP3Parser is set when protocol is 3, but the initial parser setup happens before this check. This could lead to an incorrect parser being used for maintenance events.
        self.health_check_interval = health_check_interval

redis/connection.py:381

  • This line appears to be indented incorrectly. The if statement at line 380 should control this block, but the indentation suggests it's not properly nested within the conditional block.
            # Update the retry's supported errors with the specified errors

@petyaslavova petyaslavova changed the base branch from master to feat/hitless-upgrade-sync-standalone August 13, 2025 04:58
@petyaslavova petyaslavova marked this pull request as ready for review August 13, 2025 04:58
@petyaslavova petyaslavova changed the title from "Hitless upgrade support implementation for synchronous Redis client." to "Hitless upgrade support initial implementation for synchronous Redis client - no handshake, no failing over notifications support." Aug 14, 2025

        # disconnect them later.
        self._connections = []
    finally:
        if self._locked:

Collaborator

I think this flag is redundant. If you try to release a lock that this thread did not acquire, whether it is unlocked or held by another thread, release() raises a RuntimeError. Since you already wrap the release() call in try/except, you will either release the lock if that is possible or suppress the exception and move on.

Collaborator Author

That's true, but it will be slower.
We will hit the True case very rarely - only while a moving maintenance is active.
If I remove the check, the exception will be raised on every other call, and raising and catching it costs far more than the self._locked check.
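
The trade-off discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the class `GuardedLockHolder` and its method names are hypothetical, and it assumes the lock is a `threading.RLock`, whose `release()` raises `RuntimeError` when the calling thread does not own it.

```python
import threading


class GuardedLockHolder:
    """Hypothetical stand-in for the pool; only the locking pattern matters."""

    def __init__(self):
        self._lock = threading.RLock()
        # Assumption: True only while a maintenance window holds the lock.
        self._locked = False

    def begin_maintenance(self):
        self._lock.acquire()
        self._locked = True

    def release_with_flag(self):
        # Common path: a single attribute read, no exception is raised.
        if self._locked:
            try:
                self._lock.release()
            except RuntimeError:
                pass
            self._locked = False

    def release_without_flag(self):
        # Reviewer's alternative: always attempt the release and swallow
        # the error. Returns True if the lock was actually released.
        try:
            self._lock.release()
            return True
        except RuntimeError:
            return False


holder = GuardedLockHolder()
holder.release_with_flag()                    # no lock held: nothing raised
not_held = holder.release_without_flag()      # raises and catches RuntimeError
holder.begin_maintenance()
holder.release_with_flag()                    # lock held: released once
```

Both variants are correct; the flag only avoids paying for a raised and caught exception when no maintenance is in progress.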


def remove_expired_notifications(self):
    with self._lock:
        for notification in tuple(self._processed_events):

Collaborator

Why do you need to convert it into a tuple? A set is also iterable.

Collaborator Author

Because I remove elements from the set when any of the stored notifications have expired - I can't iterate over a set and change its contents inside the loop.
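
A small sketch of why the snapshot is needed. The tuples and function names here are hypothetical placeholders, not the PR's notification objects; the point is only that CPython raises `RuntimeError` when a set changes size while it is being iterated, whereas iterating over a `tuple` copy is safe.

```python
def remove_expired_snapshot(processed, now):
    # Iterate over a snapshot so the set can be mutated inside the loop.
    for name, expires_at in tuple(processed):
        if expires_at <= now:
            processed.discard((name, expires_at))


def remove_expired_direct(processed, now):
    # Mutating the set while iterating over it directly.
    for name, expires_at in processed:
        if expires_at <= now:
            processed.discard((name, expires_at))


# Hypothetical (name, expiry timestamp) pairs standing in for notifications.
events = {("a", 1.0), ("b", 2.0), ("c", 50.0)}
remove_expired_snapshot(events, now=10.0)

direct_failed = False
try:
    remove_expired_direct({("a", 1.0), ("b", 2.0), ("c", 3.0)}, now=10.0)
except RuntimeError:
    # "Set changed size during iteration"
    direct_failed = True
```

The copy costs one pass over the set, which is presumably acceptable as long as the number of processed notifications stays small.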

if isinstance(notification, NodeMovingEvent):
    return self.handle_node_moving_event(notification)
else:
    logging.error(f"Unhandled notification type: {notification}")

Collaborator

If we don't expect to handle any events other than NodeMovingEvent, why does the signature allow its parent class (MaintenanceEvent) to be passed?

Collaborator Author

Because we don't know if this will be the only one - it is for now.
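
The design choice in this thread might look roughly like the sketch below. Only the names `MaintenanceEvent`, `NodeMovingEvent`, and `handle_node_moving_event` come from the snippet under review; the fields, the `OtherEvent` class, the `PoolEventHandler` class, and the return values are assumptions made for illustration.

```python
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class MaintenanceEvent:
    id: int  # hypothetical field


@dataclass(frozen=True)
class NodeMovingEvent(MaintenanceEvent):
    new_node_host: str  # hypothetical field


@dataclass(frozen=True)
class OtherEvent(MaintenanceEvent):
    """Hypothetical future event type the handler does not support yet."""


class PoolEventHandler:
    def handle_event(self, notification: MaintenanceEvent):
        # Accepting the base class keeps the signature stable if more
        # event types need to be dispatched here later.
        if isinstance(notification, NodeMovingEvent):
            return self.handle_node_moving_event(notification)
        logging.error(f"Unhandled notification type: {notification}")
        return None

    def handle_node_moving_event(self, notification: NodeMovingEvent):
        # Placeholder result; the real handler would act on the pool.
        return f"moving:{notification.id}"


handler = PoolEventHandler()
moved = handler.handle_event(NodeMovingEvent(1, "10.0.0.2"))
ignored = handler.handle_event(OtherEvent(2))
```

With this shape, supporting another event type would mean adding one more `isinstance` branch rather than changing every caller's signature.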

@petyaslavova petyaslavova changed the title from "Hitless upgrade support initial implementation for synchronous Redis client - no handshake, no failing over notifications support." to "Hitless upgrade: Support initial implementation for synchronous Redis client - no handshake, no failing over notifications support." Aug 15, 2025
@petyaslavova petyaslavova force-pushed the ps_hitless_upgrade_sync_redis branch from e8643cd to 4c6eb44 Compare August 15, 2025 14:35
2 participants