After the entire cluster was replaced (decommission -> add new node) with new nodes, c-s continues to use an old node that was provided in the "-host" parameter #259

Open
@aleksbykov

The test runs two operations: add a new node and decommission a random node. The cluster started with 3 nodes. On each iteration, a new node was added and one random node was decommissioned. All operations went fine until the last remaining node from the initial cluster, node1, started decommissioning; at that point cassandra-stress terminated with errors:
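The test loop described above can be sketched as a small membership model. This is only an illustration, not the actual test code: the function name, the `new-node-N` naming, and the seed are hypothetical, and it models cluster membership only, not Scylla or driver behavior. It shows at which iteration the last node of the initial set leaves the cluster, which is the moment the errors below were observed.

```python
import random


def simulate_rolling_replacement(initial_nodes, seed=0, max_iterations=1000):
    """Model the test loop: add one new node, then decommission a random node.

    Returns (iteration, cluster), where iteration is the step at which the
    last node of the initial set was decommissioned, or None if that did
    not happen within max_iterations.
    """
    rng = random.Random(seed)
    cluster = list(initial_nodes)
    remaining_initial = set(initial_nodes)

    for iteration in range(1, max_iterations + 1):
        # Step 1: add a new node (hypothetical naming scheme).
        cluster.append(f"new-node-{iteration}")
        # Step 2: decommission one random node from the current cluster.
        victim = rng.choice(cluster)
        cluster.remove(victim)
        remaining_initial.discard(victim)
        # Once no initial node is left, a client that only knows the
        # initial addresses has nothing left to connect to.
        if not remaining_initial:
            return iteration, cluster

    return None, cluster


if __name__ == "__main__":
    step, nodes = simulate_rolling_replacement(["node1", "node2", "node3"])
    print("last initial node decommissioned at iteration:", step)
    print("cluster members at that point:", nodes)
```

The model keeps the cluster size constant at 3 after each iteration, matching the description of one node added and one removed per step.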

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.3.0.133:9042 (com.datastax.driver.core.exceptions.ConnectionException: [/10.3.0.133:9042] Write attempt on defunct connection), ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
        at org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:264)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.3.0.133:9042 (com.datastax.driver.core.exceptions.ConnectionException: [/10.3.0.133:9042] Write attempt on defunct connection), ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
        at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:473)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.3.0.133:9042 (com.datastax.driver.core.exceptions.ConnectionException: [/10.3.0.133:9042] Write attempt on defunct connection), ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
java.io.IOException: Operation x10 on key(s) [395038363034324c4b30]: Error executing: (NoHostAvailableException): All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.3.0.133:9042 (com.datastax.driver.core.exceptions.ConnectionException: [/10.3.0.133:9042] Write attempt on defunct connection), ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))

com.datastax.driver.core.exceptions.TransportException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Error writing
        at org.apache.cassandra.stress.Operation.error(Operation.java:141)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
        at org.apache.cassandra.stress.Operation.timeWithRetry(Operation.java:119)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
        at org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:101)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
        at org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:109)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
        at org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:264)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
        at org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:473)
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))
java.io.IOException: Operation x10 on key(s) [34354f324c3450503130]: Error executing: (NoHostAvailableException): All host(s) tried for query failed (tried: /10.3.0.133:9042 (com.datastax.driver.core.exceptions.ConnectionException: [/10.3.0.133:9042] Write attempt on defunct connection))
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionException: [ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042] Write attempt on defunct connection))

com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: ip-10-3-1-72.eu-west-2.compute.internal/10.3.1.72:9042 (com.datastax.driver.core.exceptions.ConnectionExce

The issue happened periodically (not very often), and attempts to reproduce it or to determine the exact steps were unsuccessful.

There are a couple of thoughts on why it could have happened:

  1. C-S failed with java.io.IOException: Operation x10 on key(s) once latest node from initial set was starting decommissioned scylladb#15803 (comment)
  2. C-S failed with java.io.IOException: Operation x10 on key(s) once latest node from initial set was starting decommissioned scylladb#15803 (comment)
  3. C-S failed with java.io.IOException: Operation x10 on key(s) once latest node from initial set was starting decommissioned scylladb#15803 (comment)

More details can be found in issue: scylladb/scylladb#15803

Scylla version (or git commit hash): 5.5.0~dev-20231108.a4aeef2eb0aa

Labels

bug: Something isn't working
