Issue description
- This issue is a regression.
- It is unknown if this issue is a regression.
Toward the end of the stress command, one node (10.0.1.17) went away, and the driver shutdown appears to be stuck:
total, 4425734, 19327, 19327, 19327, 4.1, 3.9, 6.9, 9.9, 24.0, 58.8, 325.0, 0.03100, 0, 0, 0, 0, 0, 0
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
total, 4508016, 16456, 16456, 16456, 4.8, 4.0, 9.6, 19.7, 91.6, 218.0, 330.0, 0.03065, 0, 0, 0, 0, 0, 0
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Channel has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Channel has been closed
total, 4583006, 14998, 14998, 14998, 5.3, 4.5, 11.2, 18.1, 31.7, 62.0, 335.0, 0.03018, 0, 0, 0, 0, 0, 0
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Channel has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
total, 4677544, 18908, 18908, 18908, 4.2, 4.0, 6.4, 9.0, 14.4, 20.2, 340.0, 0.03007, 0, 0, 0, 0, 0, 0
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
total, 4780430, 20577, 20577, 20577, 3.9, 3.7, 6.6, 8.5, 13.4, 16.5, 345.0, 0.03022, 0, 0, 0, 0, 0, 0
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
total, 4882713, 20457, 20457, 20457, 3.9, 3.7, 6.6, 9.1, 14.4, 26.8, 350.0, 0.03032, 0, 0, 0, 0, 0, 0
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
total, 4985711, 20600, 20600, 20600, 3.9, 3.7, 6.6, 9.0, 13.9, 23.9, 355.0, 0.03040, 0, 0, 0, 0, 0, 0
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
total, 5048570, 20374, 20374, 20374, 3.9, 3.7, 6.7, 9.2, 19.2, 24.3, 358.1, 0.03041, 0, 0, 0, 0, 0, 0
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Error writing
Results:
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Op rate : 14,099 op/s [WRITE: 14,099 op/s]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Partition rate : 14,099 pk/s [WRITE: 14,099 pk/s]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Row rate : 14,099 row/s [WRITE: 14,099 row/s]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Latency mean : 5.6 ms [WRITE: 5.6 ms]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Latency median : 4.8 ms [WRITE: 4.8 ms]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Latency 95th percentile : 11.6 ms [WRITE: 11.6 ms]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Latency 99th percentile : 20.7 ms [WRITE: 20.7 ms]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Latency 99.9th percentile : 41.6 ms [WRITE: 41.6 ms]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Latency max : 365.7 ms [WRITE: 365.7 ms]
com.datastax.driver.core.exceptions.TransportException: [ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] Connection has been closed
Total partitions : 5,048,570 [WRITE: 5,048,570]
WARN 00:44:50,156 Error creating netty channel to ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042
Total errors : 0 [WRITE: 0]
com.datastax.shaded.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042
Total GC count : 0
Caused by: java.net.ConnectException: Connection refused
Total GC memory : 0.000 KiB
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
Total GC time : 0.0 seconds
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
Avg GC time : NaN ms
at com.datastax.shaded.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
StdDev GC time : 0.0 ms
at com.datastax.shaded.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
Total operation time : 00:05:58
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
END
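For triage, the exception lines in a log like the one above can be tallied per node and message, which makes it easier to see whether every error points at the same host. The following is only a minimal sketch: the function name `summarize_transport_errors` and the regular expression are illustrative assumptions, not part of SCT or cassandra-stress, and the pattern is derived solely from the line format shown in this report.

```python
import re
from collections import Counter

# Matches driver exception lines of the form seen in this report, e.g.:
#   com.datastax.driver.core.exceptions.TransportException: [host/10.0.1.17:9042] Error writing
# The pattern is an assumption based on these log lines only.
EXC_RE = re.compile(
    r"^(?P<exc>[\w.$]+Exception): "
    r"\[(?P<host>[^/\]]*)/(?P<ip>[\d.]+):(?P<port>\d+)\] "
    r"(?P<msg>.+)$"
)


def summarize_transport_errors(lines):
    """Count (ip, message) pairs for matching exception lines; skip all other lines."""
    counts = Counter()
    for line in lines:
        match = EXC_RE.match(line.strip())
        if match:
            counts[(match.group("ip"), match.group("msg"))] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the log above; a real run would read the loader log file.
    prefix = "com.datastax.driver.core.exceptions.TransportException: "
    node = "[ip-10-0-1-17.eu-north-1.compute.internal/10.0.1.17:9042] "
    sample = [
        prefix + node + "Connection has been closed",
        prefix + node + "Error writing",
        prefix + node + "Connection has been closed",
        "Total errors : 0 [WRITE: 0]",
    ]
    for (ip, msg), n in summarize_transport_errors(sample).most_common():
        print(f"{ip}\t{n}\t{msg}")
```

Stress summary lines and the netty stack-trace lines do not match the pattern and are skipped, so the tally covers only the driver's per-connection errors.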
Impact
The stress command gets stuck and does not exit.
How frequently does it reproduce?
Seen once so far in the nightly runs.
Installation details
Kernel Version: 5.15.0-1049-aws
Scylla version (or git commit hash): 5.5.0~dev-20231113.7b08886e8dd8
with build-id 7548c48606c5b58f282fc2b596019226de0df6ed
Cluster size: 3 nodes (i4i.large)
Scylla Nodes used in this run:
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-9 (13.49.80.230 | 10.0.1.175) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-8 (51.20.109.226 | 10.0.3.74) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-7 (16.16.201.232 | 10.0.1.202) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-6 (13.51.48.183 | 10.0.1.141) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-5 (51.20.123.34 | 10.0.2.227) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-4 (13.51.178.61 | 10.0.1.211) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-3 (16.171.236.237 | 10.0.1.17) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-2 (51.20.144.28 | 10.0.1.61) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-17 (16.171.239.81 | 10.0.2.188) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-16 (16.171.54.15 | 10.0.1.41) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-15 (51.20.143.39 | 10.0.3.169) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-14 (16.16.64.126 | 10.0.0.194) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-13 (16.170.157.28 | 10.0.2.96) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-12 (51.20.66.232 | 10.0.2.150) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-11 (13.48.26.181 | 10.0.1.121) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-10 (51.20.106.233 | 10.0.1.211) (shards: 2)
- longevity-5gb-1h-NodeTerminateAndRe-db-node-dfb19859-1 (16.171.110.198 | 10.0.2.74) (shards: 2)
OS / Image: ami-02ea62bee686ef03b (aws: undefined_region)
Test: longevity-5gb-1h-NodeTerminateAndReplace-aws-test
Test id: dfb19859-9fbb-4408-a2be-a42c7400f7cb
Test name: scylla-master/nemesis/longevity-5gb-1h-NodeTerminateAndReplace-aws-test
Test config file(s):
Logs and commands
- Restore Monitor Stack command:
$ hydra investigate show-monitor dfb19859-9fbb-4408-a2be-a42c7400f7cb
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs dfb19859-9fbb-4408-a2be-a42c7400f7cb
Logs:
- db-cluster-dfb19859.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/dfb19859-9fbb-4408-a2be-a42c7400f7cb/20231114_120128/db-cluster-dfb19859.tar.gz
- sct-runner-events-dfb19859.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/dfb19859-9fbb-4408-a2be-a42c7400f7cb/20231114_120128/sct-runner-events-dfb19859.tar.gz
- sct-dfb19859.log.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/dfb19859-9fbb-4408-a2be-a42c7400f7cb/20231114_120128/sct-dfb19859.log.tar.gz
- loader-set-dfb19859.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/dfb19859-9fbb-4408-a2be-a42c7400f7cb/20231114_120128/loader-set-dfb19859.tar.gz
- monitor-set-dfb19859.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/dfb19859-9fbb-4408-a2be-a42c7400f7cb/20231114_120128/monitor-set-dfb19859.tar.gz