Session not reconnected after rolling upgrade  #375

Open
@Jadw1

Observed in https://github.com/scylladb/scylla-enterprise/pull/4634#issuecomment-2333883650.

The test runs with force_gossip_topology_changes: true, so auth is not managed via Raft and auth data is stored in the system_auth keyspace with the default replication factor of 1. The test fails about once every several runs.
It performs a rolling upgrade, but sometimes the driver is not connected to some of the nodes after the rolling upgrade has finished (all nodes are up).

Reproducer:

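import time
import pytest

# Test-framework helpers; these import paths are assumed from the scylladb repository layout.
from test.pylib.manager_client import ManagerClient
from test.pylib.util import wait_for_cql_and_get_hosts

@pytest.mark.asyncio
async def test_rolling_restart_with_auth(manager: ManagerClient):
    config = {
        'force_gossip_topology_changes': True,
    }
    servers = [await manager.server_add(config=config) for _ in range(3)]
    cql = manager.get_cql()
    # Wait until the driver can reach all three nodes before restarting them.
    hosts = await wait_for_cql_and_get_hosts(cql, servers, time.time() + 60)

    # Restart the nodes one at a time; afterwards the driver is sometimes not connected to all of them.
    await manager.rolling_restart(servers)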
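
The reproducer stops right after the rolling restart, so the symptom has to be observed separately. Below is a minimal sketch of one way to list nodes that are up but have no connection pool in the session. It assumes `cql` behaves like a Python-driver `Session` exposing `cluster.metadata.all_hosts()` and `get_pool_state()`. The helper name `hosts_missing_from_pool` and the `Fake*` stand-ins are hypothetical and exist only so the sketch runs without a cluster.

```python
from dataclasses import dataclass


def hosts_missing_from_pool(session):
    """Return hosts reported as up that the session holds no connection pool for."""
    # Assumption: get_pool_state() returns a dict keyed by host objects,
    # and each host has a truthy `is_up` attribute when it is considered up.
    pooled = set(session.get_pool_state().keys())
    return [h for h in session.cluster.metadata.all_hosts()
            if h.is_up and h not in pooled]


# --- Stand-ins below are hypothetical; they only mimic the attributes used above. ---
@dataclass(frozen=True)
class FakeHost:
    address: str
    is_up: bool = True


class FakeMetadata:
    def __init__(self, hosts):
        self._hosts = hosts

    def all_hosts(self):
        return list(self._hosts)


class FakeCluster:
    def __init__(self, hosts):
        self.metadata = FakeMetadata(hosts)


class FakeSession:
    def __init__(self, hosts, pooled):
        self.cluster = FakeCluster(hosts)
        self._pooled = pooled

    def get_pool_state(self):
        return {h: {} for h in self._pooled}


hosts = [FakeHost("127.0.0.1"), FakeHost("127.0.0.2"), FakeHost("127.0.0.3")]
# Simulate the reported state: all nodes are up, but one has no pool.
session = FakeSession(hosts, pooled=hosts[:2])
missing = hosts_missing_from_pool(session)
print([h.address for h in missing])
```

In the reproducer, a check like this could run after `manager.rolling_restart(servers)` with the real `cql` session, asserting that the returned list eventually becomes empty.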

I ran the reproducer in the test/auth_cluster suite (which enables authentication): https://github.com/scylladb/scylladb/blob/master/test/auth_cluster/suite.yaml

During the upgrade, the driver cannot authenticate while the replica that owns the part of the token ring holding the user data is down (system_auth has RF=1). However, the driver does not reconnect after that node comes back up.
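
To illustrate why RF=1 makes authentication fail during exactly one step of a rolling restart, here is a toy model of single-replica ownership. It is only a sketch: the hash function is not ScyllaDB's partitioner, each node gets a single token, and the node names and the user name `cassandra` are illustrative assumptions.

```python
import bisect
import hashlib


def token(key: str) -> int:
    # Toy hash for illustration only; NOT the partitioner ScyllaDB uses.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


def owner(ring, key):
    """With RF=1 the only replica is the first node whose token is >= the key's token (wrapping)."""
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token(key)) % len(ring)
    return ring[i][1]


def can_authenticate(ring, up_nodes, user):
    # Reading the user's row needs its single replica to be up.
    return owner(ring, user) in up_nodes


nodes = ["node1", "node2", "node3"]
ring = sorted((token(n), n) for n in nodes)

# Rolling restart: take one node down at a time and check whether auth can succeed.
blocked = [down for down in nodes
           if not can_authenticate(ring, set(nodes) - {down}, "cassandra")]
print("authentication is impossible while this node is down:", blocked)
```

In this model authentication is blocked only while the single owner of the user's row is restarting, which matches the window described above. The reported bug is that the driver stays disconnected after that window has closed.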

pytest.log

Labels

bug, triage
