Script to monitor Tendermint-based nodes and optionally send out alerts using healthchecks.
Written by ZENODE and licensed under the MIT license (see LICENSE).
- jq
- curl
- Will give an error message if the node is unreachable (code 1).
- Will give an error message if the node is catching up (code 2).
- Will give an error message if the node has been stale (no new block) for longer than a configurable number of seconds (code 3).
- Can optionally ping and send logs to a cronjob monitoring service such as healthchecks.io.
- Not just limited to local nodes.
- Keeps logs for historical data. Each monitoring instance's log is capped at 1MB; when the cap is reached, the log is archived as a .old file, so you can hold up to 2MB of data before you start to lose historical data.
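The log rotation described in the last item could look roughly like the sketch below. This is not the script's actual code: the function name `rotate_log`, the `<log>.old` naming and the exact 1MB byte count are assumptions made for illustration.

```shell
# Hypothetical helper: archive a log once it reaches a size cap.
# $1 = path to the log file, $2 = cap in bytes (assumed default: 1MB).
rotate_log() {
  local log_file="$1" max_bytes="${2:-1048576}" size

  # Nothing to rotate if the log does not exist yet.
  [ -f "$log_file" ] || return 0

  # Arithmetic expansion strips any whitespace that wc may emit.
  size=$(( $(wc -c <"$log_file") ))

  if [ "$size" -ge "$max_bytes" ]; then
    # Overwrites the previous archive, so at most two files are kept.
    mv -f "$log_file" "$log_file.old"
  fi
}
```

Calling `rotate_log "/path/to/node.log"` before each write keeps at most the current log plus one archive, which matches the 2MB ceiling mentioned above.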
git clone https://github.com/zenodeapp/monitor-node.git
Warning
Make sure not to set the threshold too low; your local server's time might be inconsistent with the node.
bash monitor_node.sh [title] [rpc_url] [hc_id] [threshold_in_secs]
[title] works as an identifier for this monitoring instance (e.g. "gaia_node", "namada_node") [default: "node"].
[rpc_url] is the rpc endpoint for your node [default: "http://localhost:26657"].
[hc_id] is optional. Insert a healthchecks id here if you want to receive alerts.
[threshold_in_secs] is the stale block threshold: the number of seconds without a new block after which the monitor concludes the node has halted [default: 300].
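As an illustration of how the four positional arguments and their defaults fit together, here is a minimal sketch using bash parameter expansion. The function and variable names (`parse_args`, `TITLE`, `RPC_URL`, `HC_ID`, `THRESHOLD`) are hypothetical and may not match what monitor_node.sh uses internally.

```shell
# Hypothetical argument handling; defaults are taken from the descriptions above.
parse_args() {
  TITLE="${1:-node}"                       # identifier for this monitoring instance
  RPC_URL="${2:-http://localhost:26657}"   # RPC endpoint of the node
  HC_ID="${3:-}"                           # optional healthchecks ID; empty disables alerts
  THRESHOLD="${4:-300}"                    # stale block threshold in seconds
}
```

Because `${n:-default}` treats an empty string the same as a missing argument, passing `""` for [hc_id] leaves alerts disabled while still allowing a later argument such as the threshold to be set.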
bash monitor_node.sh "namada" "http://localhost:26657" "" 600
This won't ping healthchecks and will consider the node halted if no new block has appeared for 10 minutes.
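To make the three error codes concrete, the sketch below classifies a response from the node's `/status` RPC endpoint using jq. It is an assumption about how such a check can be written, not the script's actual implementation; the function name `classify_status` is hypothetical. It relies on the `result.sync_info.catching_up` and `result.sync_info.latest_block_time` fields of the Tendermint RPC response.

```shell
# Hypothetical helper, not taken from monitor_node.sh.
# $1 = JSON body of <rpc_url>/status (empty if the request failed)
# $2 = stale block threshold in seconds
# $3 = current Unix time (e.g. from `date +%s`)
# Returns 0 (healthy), 1 (unreachable or unparsable), 2 (catching up), 3 (stale).
classify_status() {
  local body="$1" threshold="$2" now="$3" catching_up block_epoch

  # Code 1: no response at all.
  [ -n "$body" ] || return 1

  catching_up=$(jq -r '.result.sync_info.catching_up' <<<"$body" 2>/dev/null) || return 1
  if [ "$catching_up" = "true" ]; then
    return 2
  elif [ "$catching_up" != "false" ]; then
    return 1   # field missing or unexpected
  fi

  # Drop fractional seconds so jq can parse the timestamp as ISO 8601.
  block_epoch=$(jq -r '.result.sync_info.latest_block_time
                       | sub("\\.[0-9]+Z$"; "Z") | fromdateiso8601' \
                <<<"$body" 2>/dev/null) || return 1

  # Code 3: the latest block is older than the threshold.
  if [ $(( now - block_epoch )) -gt "$threshold" ]; then
    return 3
  fi
  return 0
}
```

A call might look like `classify_status "$(curl -s -m 10 "http://localhost:26657/status")" 600 "$(date +%s)"`. Since the comparison uses the local clock, a skewed server time shifts the result, which is why a very low threshold is risky.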
Running the script as a cronjob is recommended. Use the command:
crontab -e
If this is your first time opening crontab, you may get asked to choose which editor you wish to use. Use whichever you're most comfortable with.
At the end of the file, add a line similar to the one below.
*/5 * * * * /bin/bash /full/path/to/monitor_node.sh [title] [rpc_url] [hc_id] [threshold_in_secs]
This will run the script every 5 minutes.
*/10 * * * * /bin/bash /full/path/to/monitor_node.sh "gaia" "http://localhost:26657" "4ee83d8e-adad-4716-9b1b-7e1f759552f9"
This will run the script every 10 minutes with the title "gaia". It will ping healthchecks at ID "4ee83d8e-adad-4716-9b1b-7e1f759552f9" (dummy ID) and default to a threshold of 300 seconds (5 minutes).
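For readers curious what a healthchecks ping involves, here is a minimal sketch. It assumes the hosted healthchecks.io service, whose ping endpoint accepts an exit status appended to the check's URL (0 signals success, anything else failure) and stores a POST body as the log. The helper names `hc_ping_url` and `hc_ping` are hypothetical, and the script itself may send its pings differently.

```shell
# Hypothetical helpers, assuming the hosted healthchecks.io ping endpoint.

# Build the ping URL for a check ID and an exit status (default 0 = success).
hc_ping_url() {
  local hc_id="$1" code="${2:-0}"
  printf 'https://hc-ping.com/%s/%s\n' "$hc_id" "$code"
}

# Send the ping with a log message as the request body.
# $1 = healthchecks ID, $2 = exit status, $3 = message to attach
hc_ping() {
  local hc_id="$1" code="$2" message="$3"

  # No ID given: alerts are disabled, so do nothing.
  [ -n "$hc_id" ] || return 0

  curl -fsS -m 10 --retry 3 -o /dev/null \
    --data-raw "$message" "$(hc_ping_url "$hc_id" "$code")"
}
```

With this shape, `hc_ping "$HC_ID" 3 "node is stale"` would report a failure together with its reason, while an empty ID makes the call a no-op.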
— ZEN
Copyright (c) 2024 ZENODE