This repository was archived by the owner on Nov 6, 2020. It is now read-only.

Client doesn't shutdown gracefully on Ctrl-C #9807

Closed
ordian opened this issue Oct 24, 2018 · 12 comments · Fixed by #9886
Labels
F2-bug 🐞 The client fails to follow expected behavior. M4-core ⛓ Core client code / Rust. P2-asap 🌊 No need to stop dead in your tracks, however issue should be addressed as soon as possible.

Comments

@ordian
Member

ordian commented Oct 24, 2018

  • Parity Ethereum version: master fe84718
  • Operating system: Linux
  • Installation: built from source
  • Fully synchronized: yes, but shouldn't be relevant
  • Network: kovan, but shouldn't be relevant
  • Restarted: yes

Works fine with b8da38f, so the regression was introduced by #9657:

^C2018-10-24 15:45:38  main WARN parity_ethereum::run  Shutdown is taking longer than expected.
2018-10-24 15:49:38  main WARN parity_ethereum::run  Shutdown timeout reached, exiting uncleanly.
@ordian ordian added P5-sometimesoon 🌲 Issue is worth doing soon. F3-annoyance 💩 The client behaves within expectations, however this “expected behaviour” itself is at issue. labels Oct 24, 2018
@ordian ordian added this to the 2.2 milestone Oct 24, 2018
@EBGToo

EBGToo commented Oct 27, 2018

A 15-minute shutdown:

2018-10-26 16:33:08  Imported #6589794 0x146d…b8c1 (59 txs, 3.10 Mgas, 1181 ms, 13.09 KiB) + another 1 block(s) containing 26 tx(s)
2018-10-26 16:33:22  Imported #6589795 0x0a86…59e8 (86 txs, 7.96 Mgas, 5502 ms, 14.35 KiB)
2018-10-26 16:33:48  Imported #6589798 0xf339…b808 (29 txs, 7.98 Mgas, 6286 ms, 5.22 KiB) + another 2 block(s) containing 102 tx(s)
2018-10-26 16:34:02  Imported #6589799 0xabd6…391a (5 txs, 7.95 Mgas, 130 ms, 13.99 KiB)
2018-10-26 16:34:12     8/25 peers   216 MiB chain 27 MiB db 0 bytes queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:34:12     8/25 peers   216 MiB chain 27 MiB db 0 bytes queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:36:01  Syncing #6589803 0x9613…e93a     0.04 blk/s    0.8 tx/s    0 Mgas/s      0+    8 Qed  #6589811   19/25 peers   217 MiB chain 27 MiB db 371 KiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:36:12  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+    9 Qed  #6589812   21/25 peers   217 MiB chain 27 MiB db 392 KiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
^C2018-10-26 16:36:13  Finishing work, please wait...
2018-10-26 16:36:39  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   11 Qed  #6589813   15/25 peers   217 MiB chain 27 MiB db 535 KiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:37:55  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   17 Qed  #6589819   19/25 peers   219 MiB chain 27 MiB db 858 KiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:40:55  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   31 Qed  #6589831   24/25 peers   225 MiB chain 27 MiB db 1 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:42:42  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   36 Qed  #6589836   26/50 peers   226 MiB chain 27 MiB db 2 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
^C
2018-10-26 16:44:05  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   46 Qed  #6589845   25/25 peers   227 MiB chain 27 MiB db 2 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:45:12  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   51 Qed  #6589850   25/25 peers   229 MiB chain 27 MiB db 3 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:46:28  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   60 Qed  #6589857   25/25 peers   230 MiB chain 27 MiB db 3 MiB queue 252 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:47:41  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   71 Qed  #6589865   25/25 peers   231 MiB chain 27 MiB db 4 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:48:30  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   73 Qed  #6589867   24/25 peers   231 MiB chain 27 MiB db 4 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:49:20  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   79 Qed  #6589873   25/25 peers   232 MiB chain 27 MiB db 4 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
^C^C^C
2018-10-26 16:49:53  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   83 Qed  #6589875   24/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:17  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   86 Qed  #6589878   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:28  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   86 Qed  #6589878   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:34  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   86 Qed  #6589878   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:41  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   87 Qed  #6589879   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:47  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   88 Qed  #6589879   26/50 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:54  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   88 Qed  #6589879   26/50 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:00  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   90 Qed  #6589880   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:06  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   90 Qed  #6589880   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:12  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   92 Qed  #6589882   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:19  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   93 Qed  #6589883   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:25  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   93 Qed  #6589883   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:32  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   93 Qed  #6589883   26/50 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:38  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   93 Qed  #6589883   26/50 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
<shutdown>

@5chdn 5chdn modified the milestones: 2.2, 2.3 Oct 29, 2018
@c0gent
Contributor

c0gent commented Oct 29, 2018

This appears to be an Arc<Client> that isn't being dropped somewhere. #9657 must have uncovered a pre-existing circular reference that was probably still dropped under the old reactor but isn't with the new Runtime. I'll look more deeply into this as soon as I can.
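
To make the suspected failure mode concrete, here is a minimal, self-contained sketch (hypothetical types, not Parity code) of how a strong-reference cycle between two Arc-owned structures keeps both strong counts above zero, so neither value is ever dropped:

use std::sync::{Arc, Mutex};

// Hypothetical stand-ins for the real types; only the ownership shape matters.
struct Client {
    // Holding the service behind a strong Arc is what closes the cycle;
    // a Weak<Service> here (or an explicit teardown step) would break it.
    service: Mutex<Option<Arc<Service>>>,
}

struct Service {
    client: Arc<Client>,
}

impl Drop for Client {
    fn drop(&mut self) {
        println!("Client dropped"); // never printed once the cycle exists
    }
}

fn main() {
    let client = Arc::new(Client { service: Mutex::new(None) });
    let service = Arc::new(Service { client: Arc::clone(&client) });
    *client.service.lock().unwrap() = Some(Arc::clone(&service));

    // Dropping the local handles still leaves one strong reference on each
    // side of the cycle, so both allocations leak and Client::drop never runs.
    drop(service);
    drop(client);
}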

@5chdn 5chdn added F2-bug 🐞 The client fails to follow expected behavior. P2-asap 🌊 No need to stop dead in your tracks, however issue should be addressed as soon as possible. M4-core ⛓ Core client code / Rust. and removed F3-annoyance 💩 The client behaves within expectations, however this “expected behaviour” itself is at issue. P5-sometimesoon 🌲 Issue is worth doing soon. labels Nov 2, 2018
@5chdn
Contributor

5chdn commented Nov 2, 2018

Bumping the priority here. This makes our GitLab pipelines fail:

2018-11-02 01:40:38 UTC Starting Parity-Ethereum/v2.3.0-unstable-3a6e04b-20181101/x86_64-linux-gnu/rustc1.30.0
2018-11-02 01:40:38 UTC Keys path /root/.local/share/io.parity.ethereum/keys/ethereum
2018-11-02 01:40:38 UTC DB path /root/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d
2018-11-02 01:40:38 UTC State DB configuration: fast
2018-11-02 01:40:38 UTC Operating mode: active
2018-11-02 01:40:39 UTC Configured for Foundation using Ethash engine
2018-11-02 01:40:43 UTC Updated conversion rate to Ξ1 = US$200.12 (23795246 wei/gas)
2018-11-02 01:40:48 UTC Finishing work, please wait...
2018-11-02 01:40:48 UTC Syncing       #0 0xd4e5…8fa3     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+    0 Qed        #0    4/25 peers      8 KiB chain    3 MiB db  0 bytes queue   11 KiB sync  RPC:  0 conn,    0 req/s,  203 µs
2018-11-02 01:40:51 UTC Public node URL: enode://3f0efae32a2999cbdfeff9aeab52db616cd8432d22a9181d6ef7221c8ae405a06ee892d7ad8eb6bc716a635890ef2ec9e2e5746d6d1281f2b51b518c82441d72@172.17.0.3:30303
2018-11-02 01:41:48 UTC Shutdown is taking longer than expected.
2018-11-02 01:45:48 UTC Shutdown timeout reached, exiting uncleanly.
pure virtual method called
terminate called without an active exception
RPC output: {"jsonrpc":"2.0","result":{"hash":"0x3a6e04ba150d4bc2badb707cdc0efdfc5159aef6","track":"null","version":{"major":2,"minor":3,"patch":0}},"id":1}Aborted (core dumped)
ERROR: Job failed: exit code 1

https://gitlab.parity.io/parity/parity-ethereum/-/jobs/107223

Can you look at this, @c0gent?

@c0gent
Contributor

c0gent commented Nov 2, 2018

I'll work on it today.

@c0gent
Contributor

c0gent commented Nov 5, 2018

Somewhere, somehow, an Arc<Client> isn't being dropped as a result of the change from the old tokio-core reactor + futures-cpupool CpuPool to the new tokio Runtime (an all-in-one reactor + pool).

I've spent a fair bit of time trying to figure out where and why this Arc<Client> isn't being dropped the same way it was before. I'm coming up with no clear answers.

There must be someone else with more intuition than I have about where client references might be hiding (behind trait objects, for example). Perhaps that person could help shed some light on this.

I'll keep investigating regardless...
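
As a rough illustration of why the executor swap can matter, here is a hedged sketch (hypothetical types; plain std threads stand in for the Runtime's worker threads): any long-lived background task that captures an Arc<Client> keeps the strong count above zero until that task actually finishes, so a shutdown that waits for the count to reach zero hangs unless every such task is stopped first.

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::{thread, time::Duration};

// Hypothetical stand-in for the real client type.
struct Client;

fn main() {
    let client = Arc::new(Client);
    let stop = Arc::new(AtomicBool::new(false));

    let worker = {
        let client = Arc::clone(&client); // a strong reference moves into the worker
        let stop = Arc::clone(&stop);
        thread::spawn(move || {
            while !stop.load(Ordering::Relaxed) {
                let _ = &*client; // pretend to use the client
                thread::sleep(Duration::from_millis(50));
            }
            // The captured Arc is only released when the worker returns.
        })
    };

    println!("strong refs before stop: {}", Arc::strong_count(&client)); // 2
    stop.store(true, Ordering::Relaxed);
    worker.join().unwrap();
    println!("strong refs after join: {}", Arc::strong_count(&client)); // 1
}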

@folsen
Contributor

folsen commented Nov 5, 2018

This has happened once before and it took ages to track it down :(
I have no intuition or advice to give other than some words of encouragement and thanks for even trying.

Hopefully someone else (whoever fixed it last time) might be able to show up with some better advice.

@niklasad1 niklasad1 self-assigned this Nov 5, 2018
@eira-fransham
Contributor

eira-fransham commented Nov 6, 2018

Maybe run memcheck on the binary? http://valgrind.org/docs/manual/mc-manual.html#mc-manual.leaks

This will only help if there's a reference cycle somewhere, not if it's being held on the stack.

@niklasad1
Collaborator

Sure, but I found yesterday that we have 24 strong references to the Client by just naively printing it out from the Arc. Thus, it never gets dropped!
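
For context, the count referred to here is what Arc::strong_count reports. A tiny, self-contained example of the technique (hypothetical type; the 23 extra handles are chosen only to mirror the figure above):

use std::sync::Arc;

// Hypothetical stand-in for the real client type.
struct Client;

fn main() {
    let client = Arc::new(Client);
    // 23 extra handles parked in long-lived structures would look like this:
    let held: Vec<Arc<Client>> = (0..23).map(|_| Arc::clone(&client)).collect();
    println!("strong refs: {}", Arc::strong_count(&client)); // prints 24
    drop(held);
    println!("strong refs: {}", Arc::strong_count(&client)); // prints 1
}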

@mbr

mbr commented Nov 6, 2018

Sure, but I found yesterday that we have 24 strong references to the Client by just naively printing it out from the Arc. Thus, it never gets dropped!

I found the same; I guess I should have pointed it out here earlier. I am currently trying to debug this with a tracking-Arc implementation. If someone wants to chip in, let me know.
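
A rough sketch of the general idea (not the actual snarc-rs code): wrap Arc so that every clone and drop is logged together with the current strong count, which helps narrow down which call sites still hold references at shutdown.

use std::ops::Deref;
use std::sync::Arc;

// Minimal illustration only; see the snarc-rs draft linked in the next comment
// for the real attempt.
pub struct TrackedArc<T> {
    inner: Arc<T>,
    label: &'static str,
}

impl<T> TrackedArc<T> {
    pub fn new(value: T, label: &'static str) -> Self {
        TrackedArc { inner: Arc::new(value), label }
    }
}

impl<T> Clone for TrackedArc<T> {
    fn clone(&self) -> Self {
        let cloned = TrackedArc { inner: Arc::clone(&self.inner), label: self.label };
        eprintln!("clone {}: strong_count = {}", self.label, Arc::strong_count(&self.inner));
        cloned
    }
}

impl<T> Drop for TrackedArc<T> {
    fn drop(&mut self) {
        // The count still includes this handle; it decrements right after this runs.
        eprintln!("drop {}: strong_count = {}", self.label, Arc::strong_count(&self.inner));
    }
}

impl<T> Deref for TrackedArc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &*self.inner
    }
}

fn main() {
    let client = TrackedArc::new("client state", "Client");
    let handle = client.clone(); // logs: clone Client: strong_count = 2
    println!("deref works: {}", *handle);
    drop(handle); // logs: drop Client: strong_count = 2 (drops to 1 afterwards)
    drop(client); // logs: drop Client: strong_count = 1 (value freed afterwards)
}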

@mbr

mbr commented Nov 6, 2018

For those interested, here's a draft: https://github.com/mbr/snarc-rs

peterjgilbert added a commit to oasislabs/parity-ethereum that referenced this issue Nov 8, 2018
peterjgilbert added a commit to oasislabs/parity-ethereum that referenced this issue Nov 8, 2018
peterjgilbert added a commit to oasislabs/parity-ethereum that referenced this issue Nov 10, 2018
@c0deright

c0deright commented Dec 17, 2018

I'm in the process of upgrading my 2.2.3-beta test node to 2.2.5-beta, and upon stopping 2.2.3-beta it gets stuck with the same message:

2018-12-17 17:44:26  Verifier #1 INFO import  Imported #6904209 0x5f0c…05c8 (126 txs, 7.99 Mgas, 682 ms, 26.00 KiB)
2018-12-17 17:44:30  IO Worker #0 INFO import    11/15 peers    140 MiB chain  401 MiB db  0 bytes queue  250 KiB sync  RPC:  0 conn,    0 req/s,   18 µs
2018-12-17 17:44:44  main INFO parity_ethereum::run  Finishing work, please wait...
2018-12-17 17:45:44  main WARN parity_ethereum::run  Shutdown is taking longer than expected.
2018-12-17 17:49:44  main WARN parity_ethereum::run  Shutdown timeout reached, exiting uncleanly.

Running as a daemon here, logging to a logfile.

Starting 2.2.5-beta afterwards didn't log anything unusual.

@c0deright

This might not be fully fixed: #9101 (comment)
