Commit 5ee84b0

Integrate Nick's draft

1 parent b235c18 commit 5ee84b0

File tree (1 file changed: +118 −14 lines)

docs/website/contents/for-developers/Whitepaper.md (+118 −14)

@@ -1,11 +1,26 @@
In this document, we describe the necessary components comprising the Consensus layer of a Cardano blockchain node. The main goal of this report is to provide guidance to software engineers who intend to implement the Consensus layer of a Cardano node from scratch.

+This document is a work in progress. We strive to provide a set of requirements for and responsibilities of the Consensus layer that is agnostic of any particular implementation. However, we are still very much informed by the Haskell implementation in [ouroboros-consensus](https://github.com/IntersectMBO/ouroboros-consensus), as it is the only one.
+
# Introduction

# What do Consensus and Storage need to do: responsibilities and requirements

## Responsibilities of the Consensus Layer

+```mermaid
+flowchart TD
+    A("Consensus") --> B("Mint Blocks")
+    A --> C("Select From Candidate Chains")
+    A --> D("Accept Transactions into Mempool")
+    A --> E("Store and Serve Historical Chain")
+
+    B --> B1("Ask Ledger to Validate Transactions")
+
+    C --> A1("Fetch Chains from Peers")
+```
+
The Consensus layer is responsible for choosing among the different chains that might
co-exist on the network. The candidate chains may arise both from honest and adversarial participation.
The Consensus layer must implement the consensus protocol of the current Cardano era to replicate the blockchain across all Cardano nodes.
@@ -59,17 +74,94 @@ for blocks to arrive on time to the next block minters.

Transmit chains as fast as possible so that blocks arrive at the next forger in time.

-# Single-era Consensus Layer
+In Cardano, transactions are distributed twice: once as pending transactions that exist outside of a block, and again when the transaction is directly included within some minted block.

-This section describes the components of the Consensus layer of the Cardano Node, as if Cardano only ever had one era. While this assumption greatly simplifies the implementation of the Consensus layer, one must keep that if such an assumption is made, in may not be straightforward or even possible to implement a node that is capable of syncing with Cardano mainnet. However, we still argue that this section is useful as an educational material.
+A block is distributed once among the caught-up nodes when it's minted, and then potentially any number of times later to nodes that are trying to join or catch back up to the network after this block was minted.

-We need to take both advantage of the rigorous structure we have in the code base, but at the same time take care not to expose the non-Haskell target audience to the full power of abstraction it provides. We can achieve it by instantiating the abstractions we have at concrete types. We may also want to monomorphise, i.e. remove the typeclass constraints as much as possible after instantiating them. We may then proceed to obfuscate the Haskell syntax as something else.
+In today's Cardano network, moreover, block headers diffuse before their blocks, so that nodes only download the blocks they prefer at that moment.

-The key typeclass we need to instantiate is ConsensusProtocol.
+# Single-era Consensus Layer

## Outline of the Consensus components

-The section 5.2 of the `network-design` document can be imported into the whitepaper almost wholesale. It gives a good outline of the tasks that the consensus layer is supposed to be able to perform.
+### ChainSync client
+
+Each ChainSync client maintains an upstream peer's candidate chain.
+
+The protocol state is also independently maintained alongside each candidate chain by ChainSync.
+
+#### Details
+
+It disconnects if the peer violates the mini protocol, if the peer sends an invalid header, or if switching to their chain would require rolling back more than kcp blocks of the local selection.
+(Here kcp is the common prefix security parameter, called k in the Praos paper; scg, used below, is the chain growth stability window.)
+
+It's able to validate their headers past the intersection with the local selection because the parts of the ledger state necessary to validate a header were completely determined some positive number of slots ago on that header's chain; see _forecasting_ within the ledger rules and _snapshots_ within the ledger state.
+
+ChainSync blocks while the candidate chain is past the forecast range of the local selection.
+The forecast range must be great enough for the peer to rescue the local node from the worst-case scenario.
+The Praos paper bounds this to needing at most kcp+1 headers from the peer, which that paper also bounds to requiring at most scg slots of forecast range.
+The Genesis paper is also satisfied by a forecast range of scg.
+(The ChainSync server is much simpler than the client; see _followers_ below.)
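+
+To make this concrete, here is a minimal sketch in Haskell of the client's per-header decision. Every name below is a hypothetical simplification rather than the actual ouroboros-consensus API; the real client interleaves these checks with the mini protocol state machine.
+
+```haskell
+import Data.Word (Word64)
+
+-- What the ChainSync client does with the next header from a peer.
+data Verdict
+  = Disconnect Reason   -- drop the peer
+  | AwaitForecastable   -- block until the selection's forecast range catches up
+  | ExtendCandidate     -- header is valid; grow this peer's candidate chain
+  deriving Show
+
+-- Protocol violations are detected by the mini protocol layer itself.
+data Reason = MiniProtocolViolation | InvalidHeader | RollbackBeyondKcp
+  deriving Show
+
+onHeader ::
+     Word64   -- ^ kcp, the common prefix security parameter
+  -> Word64   -- ^ how far below the selection's tip the candidate intersects
+  -> Bool     -- ^ is the header within the forecast range of the selection?
+  -> Bool     -- ^ does the header validate against the forecasted view?
+  -> Verdict
+onHeader kcp rollbackDepth inForecastRange headerValid
+  | rollbackDepth > kcp = Disconnect RollbackBeyondKcp
+  | not inForecastRange = AwaitForecastable
+  | not headerValid     = Disconnect InvalidHeader
+  | otherwise           = ExtendCandidate
+```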
+
+### ChainSync server
+
+The ChainSync server provides an iterator into the ChainDB for downstream peers to be able to download headers and blocks.
+
+#### Details
+
+Moreover — because the node must serve the whole chain and not only the historical chain — each ChainSync server actually uses a _follower_ abstraction, which is implemented via iterators and additionally supports the fact that ChainSel might have to roll back a follower if it's already within kcp of the local selection's tip.
+(Even the pipelining signaling from ChainSel to ChainSync clients is implemented via a follower, one that follows the so-called _tentative chain_ instead of just the actual local selection.)
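+
+A follower can be pictured as an iterator whose position ChainSel may also rewind. A rough sketch of such an interface, with hypothetical simplified types (the real ChainDB API differs):
+
+```haskell
+import Control.Concurrent.STM (STM)
+
+newtype Point = Point (Int, String)   -- hypothetical stand-in for a (slot, hash) pair
+
+-- Exactly the instructions the ChainSync server relays downstream as
+-- roll-forward / roll-backward messages.
+data FollowerInstruction hdr
+  = RollForward hdr      -- the selection grew past the follower's position
+  | RollBackward Point   -- ChainSel switched to a fork within kcp of the tip
+
+-- followerNext blocks until the next instruction is available.
+newtype Follower hdr = Follower { followerNext :: STM (FollowerInstruction hdr) }
+```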
+
+### BlockFetch client and client coordinator
+
+The client side of the BlockFetch mini protocol comprises the client itself and the centralised logic that coordinates multiple clients.
+Based on the set of all candidate chains and the local selection, the centralised BlockFetch logic (one instance, no matter how many peers) decides which of the candidate chains to fetch and which particular blocks to fetch from which peers.
+It instructs the corresponding BlockFetch clients (one per upstream peer) to fetch those blocks and add them to the ChainDB.
+The client disconnects if the peer violates the mini protocol or if it sends a block that doesn't match (eg different hash) the requested portion of the snapshot of its candidate chain that led to that request.
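+
+A toy version of the coordinator's decision, under heavily simplified hypothetical types: a block is identified by its number alone, and all missing blocks are requested from whichever single peer offers the longest improving candidate. The real logic additionally spreads load across peers, tracks in-flight requests, and weighs timeliness:
+
+```haskell
+import Data.List (maximumBy)
+import Data.Map.Strict (Map)
+import qualified Data.Map.Strict as Map
+import Data.Ord (comparing)
+
+newtype PeerId = PeerId Int deriving (Eq, Ord, Show)
+type Block = Int      -- hypothetical stand-in: a block is just its number
+type Chain = [Block]
+
+-- One instance of this logic runs, no matter how many peers there are.
+fetchDecisions :: Chain -> Map PeerId Chain -> Map PeerId [Block]
+fetchDecisions selection candidates
+  | Map.null better = Map.empty                       -- nothing beats our selection
+  | otherwise       = Map.singleton bestPeer missing  -- ask one peer for the gap
+  where
+    better           = Map.filter (\c -> length c > length selection) candidates
+    (bestPeer, best) = maximumBy (comparing (length . snd)) (Map.toList better)
+    missing          = filter (`notElem` selection) best
+```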
+
+### BlockFetch server
+
+The BlockFetch server uses a mere iterator instead of a follower because each fetch request is so specific; the only corner case involves garbage collection discarding a block while a corresponding fetch request is being served.
+
+### ChainSel
+
+The ChainDB's ChainSel logic persists each fetched block to the ChainDB and then uses that block to improve the local selection if possible.
+(No other component persists blocks or mutates the selection, only ChainSel.)
+Improving the selection requires validation of the fetched block (and maybe more, if blocks arrived out of order).
+If the fetched block is invalid, ChainSel disconnects from the upstream peer who sent it, unless that block may have been pipelined; see the specific pipelining rules.
+In turn, if a fetched block should be pipelined, ChainSel signals the ChainSync servers to send that header just before it starts validating that block.
+If it turns out to be invalid, ChainSel promptly signals the ChainSync servers to send the corresponding MsgRollBack.
+
+The combined ledger and protocol state is maintained alongside the local selection by ChainSel, so that blocks can be validated.
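+
+The "improve if possible" comparison is the protocol's chain order. A sketch of the Praos order, under hypothetical simplified types (the real comparison goes through the ConsensusProtocol typeclass, compares richer views, and also breaks ties):
+
+```haskell
+type BlockNo = Int   -- hypothetical stand-in
+
+-- Praos prefers the strictly longer chain; ChainSel additionally never
+-- switches to a fork that would roll back more than kcp blocks.
+preferCandidate ::
+     BlockNo   -- ^ block number at the local selection's tip
+  -> BlockNo   -- ^ block number at the candidate's tip
+  -> Bool
+preferCandidate selTip candTip = candTip > selTip
+```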
+
+### ChainDB
+
+A Praos node must not introduce unnecessary delays between receiving a block and forwarding it along.
+It is therefore an important observation that the ChainDB does not require the Durability property of ACID: upstream peers will always be available to replace any blocks a node loses.
+
+### LedgerDB
+
+In both ChainSel and ChainSync, rollbacks require maintenance of, and access to, the past kcp+1 ledger states, not only the tip's state — access to any such state must be fast enough to avoid disrupting the semantics of the worst-case delay parameter Delta assumed in the Praos paper's security argument.
+
+In addition to validation in ChainSel and ChainSync, these ledger states are how the node handles a fixed set of queries used by wallets, CLI tools, etc via the LocalStateQuery mini protocol.
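+
+A minimal sketch of such a structure, using hypothetical simplified types. The newest state sits at the head of a short list, so any rollback within kcp is a cheap drop:
+
+```haskell
+-- Holds the ledger states of the last kcp+1 blocks of the selection.
+data LedgerDB st = LedgerDB
+  { ldbStates :: [st]   -- newest first; never longer than kcp+1
+  , ldbKcp    :: Int
+  }
+
+-- Adopting a new block pushes its resulting state and trims the oldest.
+push :: st -> LedgerDB st -> LedgerDB st
+push st db = db { ldbStates = take (ldbKcp db + 1) (st : ldbStates db) }
+
+-- Rolling back n blocks just re-exposes an older state; rolling back
+-- deeper than kcp is impossible, which is why kcp+1 states are kept.
+rollback :: Int -> LedgerDB st -> Maybe (LedgerDB st)
+rollback n db
+  | n < length (ldbStates db) = Just db { ldbStates = drop n (ldbStates db) }
+  | otherwise                 = Nothing
+```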
+
+### Mempool & TxSubmission
+
+The Mempool maintains a sequence of transactions that could inhabit a hypothetical block that is valid and extends the current local selection.
+The Mempool is bounded via a multi-dimensional notion of size such that it never contains more transactions than could fit in N blocks.
+Peers synchronize their Mempools via the TxSubmission protocol.
+This mini protocol leverages the fact that the Mempool is a sequence as opposed to an unordered set; a simple integer suffices as the iterator state for the TxSubmission client.
+(Recall that transactions flow from client to server, since the orientation is determined by the flow of blocks, and pending transactions naturally flow opposite of blocks.)
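+
+A sketch of why a single integer suffices, under the simplifying (hypothetical) assumption that the sender only ever appends to its sequence:
+
+```haskell
+import Data.Foldable (toList)
+import Data.Sequence (Seq)
+import qualified Data.Sequence as Seq
+
+type TicketNo = Int   -- count of transactions already handed to the peer
+
+-- Yield the not-yet-sent suffix of the Mempool and the updated state.
+txsAfter :: TicketNo -> Seq tx -> ([tx], TicketNo)
+txsAfter sent mempool =
+  ( toList (Seq.drop sent mempool)
+  , Seq.length mempool
+  )
+```
+
+(In reality the Mempool also removes transactions as blocks are adopted, so an implementation would assign each transaction a monotonically increasing ticket number rather than counting positions; the principle of a single-number iterator state is the same.)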
+
+### The Mint aka Block Forge
+
+Whenever the wall clock enters a new slot, the node checks whether its configured minting credentials (if any) were elected by the protocol state and forecasted ledger state to lead this slot.
+If so, it mints a block that extends its local selection (or its immediate predecessor if the tip somehow already occupies this slot or greater) and contains the longest prefix of the Mempool's transactions that can fit.
+That block is then sent to ChainSel, as if it had been fetched.
+
+When (if) the node selects that block, the Mempool will react as it does for any change in the local selection: it discards any transactions that are no longer valid in the updated hypothetical block the Mempool continuously represents.
+Because every Ouroboros transaction _consumes_ at least one UTxO, the transactions in a newly minted and selected block will definitely be discarded.
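+
+A sketch of the per-slot check, with hypothetical names and types (the real check consults the protocol state and the forecasted ledger state, as described above):
+
+```haskell
+type SlotNo = Int
+type Tx     = String
+data Block  = Block { blockSlot :: SlotNo, blockTxs :: [Tx] } deriving Show
+
+mintIfLeader ::
+     (SlotNo -> Bool)   -- ^ were our credentials elected to lead this slot?
+  -> ([Tx] -> [Tx])     -- ^ longest prefix of the Mempool that fits a block
+  -> [Tx]               -- ^ the Mempool's current sequence of transactions
+  -> SlotNo             -- ^ the slot the wall clock just entered
+  -> Maybe Block        -- ^ the block to hand to ChainSel, if any
+mintIfLeader isLeader longestFittingPrefix mempoolTxs slot
+  | isLeader slot = Just (Block slot (longestFittingPrefix mempoolTxs))
+  | otherwise     = Nothing
+```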

## Interaction with the Networking layer

@@ -81,8 +173,23 @@ The implementations of the mini-protocols are here: `ouroboros-consensus/src/our

This section should describe the concepts necessary to implement the storage subsystem of a Cardano node. The description should focus on the things that are necessary to keep track of to implement, eg, Praos, but should not go into the details of how these things are stored. There is no need to discuss in-memory vs on-disk storage or mutable vs persistent data structures, as we have separate documents for those.

+## Some Important Optimizations
+
+Both blocks and transactions can be applied much more quickly if they are known to have been previously validated.
+For blocks, the outcome will be the exact same, since a block can only validly extend a single chain.
+A transaction, though, might be validated against a different chain/ledger state than the first time, so many checks still need to happen.
+But many "static" checks don't need to be repeated, since their outcome doesn't depend on the ledger state.
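+
+A sketch of the resulting interface split, with hypothetical simplified types:
+
+```haskell
+-- Full application runs every check and can fail; reapplication of a
+-- block already known to be valid is total and skips the expensive
+-- static checks (eg signature verification).
+data LedgerAPI st blk err = LedgerAPI
+  { applyBlock   :: st -> blk -> Either err st
+  , reapplyBlock :: st -> blk -> st
+  }
+```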
+
+Despite block reapplication being much faster than initial validation, the node should not need to reapply its entire historical chain whenever it is restarted.
+The node instead occasionally persists its oldest ledger state (ie the kcp+1st).
+On startup, the node only needs to deserialize that snapshotted ledger state and then replay the best chain among the persisted blocks it has that extends this ledger state, in order to re-establish its kcp+1 ledger states.
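+
+The replay itself is then just a left fold of reapplication over those stored blocks; a sketch reusing the hypothetical reapplyBlock shape from above:
+
+```haskell
+-- Rebuild the tip state from the newest snapshot (the kcp+1st-oldest
+-- state) and the stored blocks that extend it, oldest first.
+replayFromSnapshot :: (st -> blk -> st) -> st -> [blk] -> st
+replayFromSnapshot reapplyBlock = foldl reapplyBlock
+```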
+
# Multi-era Considerations

+Ledger rules and consensus protocols evolve over time.
+In Cardano, the on-chain governance ultimately decides when to adopt backwards-incompatible changes, by incrementing the major component of the protocol version.
+The first such _era_ transition switched to the Praos protocol.
+
| Responsibility | Timing | Description |
|:-------------------------|:---------------------------------------------------|:------------------------------------------------------------------------------------------------------|
| Process historical chain | slot number of the era boundaries known statically | switch and translate between the statically known historical sequence of revisions of block formats, |
@@ -111,22 +218,19 @@ Cons: abstraction is a double-edged sword and may be difficult to encode in some

## Replaying history of multiple eras

-When replaying the chain in `ouroboros-consensus`, as I understand, we use the same code of the HardFork combinator that is used when actually producing/validating these blocks. This is nice because we do not have to have separate code path for the historic chain and caught-up protocol participation.
-
-But obviously we don not have to use the HFC. What would be a reasonable way to enable a non-HFC node to catch up with the chain? Would this hypothetical node be able to enjoy code reuse for the historic and non-historic blocks? In the worst case, we should be able to hard-code the historical eras.
-
-## The Hard Fork Combinator: a uniform way to support multiple eras}
-
-Ideally, this would be a short section that should outline the core ideas behind the HFC without a single line of Haskell. The purpose should be to demonstrate the benefits of having an abstract interface for describing mixed-era blocks. The interested reader would be then referred to an extended, Haskell-enabled document that described the HFC in its full glory.
Cardano has the peculiarity of being a multi-era network, in which at given
points in the chain, new backwards-incompatible features were added to the
ledger. Consensus, as it needs to be able to replay the whole chain, needs to
implement some mechanism to switch the logic used for each of the eras; in
particular, the Ledger layer exposes different implementations for each one of
the ledger eras.

-*Hard Fork Combinator (HFC)* is mechanism that handles era transitions, including changes to ledger rules, block formats and even consensus protocols. It automates translations between different eras and provides a minimal interface for defining specific translations that obey rigorous laws.
+## The Hard Fork Combinator: a uniform way to support multiple eras
+
+The *Hard Fork Combinator (HFC)* is a mechanism that handles era transitions, including changes to ledger rules, block formats, and even consensus protocols. It automates translations between different eras and provides a minimal interface for defining specific translations that obey rigorous laws. The HFC was primarily introduced to handle the fundamental complexity that arises when the translation between wall clocks and slot onsets depends on the ledger state (since on-chain governance determines when an era transition happens).
+It also handles the comparatively simple bookkeeping of the protocol, ledger, codecs, and so on changing at the era boundary — ie a variety of blocks, transactions, etc co-existing on the same chain in a pre-specified sequence but not according to a pre-specified schedule.
+Lastly, it handles the fact that ticking and forecasting can cross era boundaries, which requires translation of ledger states, protocol states, and pending transactions from one era to the next.
+The HFC cannot automatically infer the implementation of these necessities, but it automates as much as possible against a minimal interface that requires the definition of the specific translations, etc.
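+
+A rough sketch of the kind of minimal interface meant here, with hypothetical, heavily simplified types (the real HFC interface in ouroboros-consensus is far richer, and its translations must obey the laws mentioned above):
+
+```haskell
+type EpochNo = Int   -- hypothetical stand-in
+
+-- What a single era transition must provide; the HFC automates the rest.
+data EraTranslation prevLedger prevTx nextLedger nextTx = EraTranslation
+  { translateLedgerState :: prevLedger -> nextLedger
+  , translateTx          :: prevTx -> Maybe nextTx      -- a pending tx may not survive
+  , transitionEpoch      :: prevLedger -> Maybe EpochNo -- Nothing until governance decides
+  }
+```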

| Component | Responsibility | Description |
|:--------------------|:-------------------------|:-----------------------------------------------------------------------------------------------------|
