Avoid failing requests when re-establishing connections to the cluster #273

@kostja

Currently the Python driver frivolously fails requests if, for whatever reason, there is no connection to the cluster or the socket is dead. However, there may be many valid reasons why a connection is temporarily unavailable, and if the request is not yet in progress, it should not be dropped.

Observed in scylladb/scylladb#16110 and in scylladb/scylladb#14746, where a belated notification about a node restart leads to a spurious test failure.

How the driver should behave:

  • if there is no viable connection in the pool, wait until there is an available connection, and then try to execute the request.
  • if there is a write failure when writing the request data to the socket, read(0 bytes) from the socket to see if there is an EOF. If there is an EOF, i.e. the physical connection is dead, try to open another connection and retry the request.
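
To illustrate the first point, here is a minimal sketch of a pool whose borrowers wait for a connection instead of failing right away. `WaitingPool`, `NoConnectionAvailable`, `add` and `borrow` are hypothetical names made up for this example; they are not the driver's actual classes or API.

```python
import threading


class NoConnectionAvailable(Exception):
    """Raised only after the wait deadline passes (hypothetical name)."""


class WaitingPool:
    """Toy pool (not the driver's real pool): borrowers block until a
    connection becomes available or the timeout expires."""

    def __init__(self):
        self._cond = threading.Condition()
        self._idle = []

    def add(self, conn):
        # Called when a (re)connect succeeds or a borrower returns a connection.
        with self._cond:
            self._idle.append(conn)
            self._cond.notify()

    def borrow(self, timeout):
        # Wait for a viable connection instead of failing the request at once.
        with self._cond:
            if not self._cond.wait_for(lambda: self._idle, timeout=timeout):
                raise NoConnectionAvailable(f"no connection after {timeout}s")
            return self._idle.pop()
```

The request only fails if no connection shows up within `timeout`, so a short reconnect window during a node restart would no longer surface as an error to the caller.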

The two steps above should significantly reduce the number of spurious failures on topology changes.
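
The EOF check after a write failure could look roughly like the sketch below. It assumes a plain TCP socket, and `peer_closed`, `send_with_retry` and the `reconnect` callable are hypothetical helpers, not driver API. One deviation from the description: as far as I can tell, `sock.recv(0)` in Python returns `b""` whether or not the peer has closed, so the sketch does a non-blocking one-byte `MSG_PEEK` instead, which leaves any pending data in the receive buffer.

```python
import socket


def peer_closed(sock):
    """Return True if the peer has closed the connection (EOF or reset)."""
    prev_timeout = sock.gettimeout()
    sock.setblocking(False)
    try:
        # Peek so that any pending response bytes stay in the receive buffer.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except BlockingIOError:
        return False  # nothing to read, but no EOF either
    except ConnectionError:
        return True  # reset, aborted, or broken pipe
    finally:
        sock.settimeout(prev_timeout)


def send_with_retry(sock, payload, reconnect):
    """Send payload; if the write fails and the peer is gone, reconnect and
    retry once. `reconnect` is a hypothetical callable that returns a fresh,
    connected socket. Returns the socket the payload was sent on."""
    try:
        sock.sendall(payload)
        return sock
    except OSError:
        if not peer_closed(sock):
            raise  # the failure was not a dead connection; surface it
        sock.close()
        fresh = reconnect()
        fresh.sendall(payload)
        return fresh
```

Retrying here is only safe because the write itself failed; a request that may already have reached the server would still need to go through the usual retry policy.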

Labels: enhancement (New feature or request)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions