Skip to content

Commit

Permalink
Document actions for diverged state
Browse files Browse the repository at this point in the history
  • Loading branch information
pakhomov-dfinity committed Sep 15, 2022
1 parent fbed807 commit e59789e
Showing 1 changed file with 28 additions and 0 deletions.
28 changes: 28 additions & 0 deletions testnet/docs/alerts/IC_Replica_DivergedState.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
= IC_Replica_DivergedState
:icons: font
ifdef::env-github,env-browser[:outfilesuffix:.adoc]

== Triggered by

A replica's state has diverged from that certified by the subnet. When the
replicated state diverges, the replica saves the checkpoint into the
`diverged_checkpoints` dir and crashes; upon restart it reports the timestamp
of the latest diverged state, if any.

== Impact

Indicates a serious condition that is never supposed to happen.
Requires immediate debugging to prevent a potential subnet stall, if more
than one third of replicas diverge (e.g. as a result of increased load).

== Possible causes (non-exhaustive)

* A bug that leads to non-determinism in the Deterministic State Machine.

== Troubleshooting and remediation

* Inform the Message Routing team (at
https://dfinity.slack.com/archives/CKXPC1928[`#eng-messaging`] or via
`@team-messaging`) to start investigating the root cause and find a
permanent solution
* Silence the alert as it won't stop on its own

0 comments on commit e59789e

Please sign in to comment.