Markdown linter (#1309)

* Add markdown linting

- add linter default rules
- adapt rules to current code
- fix the code for linting to pass
- add CI check

fix #1243

* Fix markdown for Substrate
* Fix tooling install
* Fix workflow
* Add documentation
* Remove trailing spaces
* Update .github/.markdownlint.yaml

Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* Fix mangled markdown/lists
* Fix capitalization issues on known words
Chevdor
2023-09-04 11:02:32 +02:00
committed by GitHub
parent 830fde2a60
commit a30092ab42
271 changed files with 6289 additions and 4450 deletions
@@ -2,30 +2,49 @@
## Design Goals
* Modularity: Components of the system should be as self-contained as possible. Communication boundaries between components should be well-defined and mockable. This is key to creating testable, easily reviewable code.
* Minimizing side effects: Components of the system should aim to minimize side effects and to communicate with other components via message-passing.
* Operational Safety: The software will be managing signing keys where conflicting messages can lead to large amounts of value to be slashed. Care should be taken to ensure that no messages are signed incorrectly or in conflict with each other.
* Modularity: Components of the system should be as self-contained as possible. Communication boundaries between
components should be well-defined and mockable. This is key to creating testable, easily reviewable code.
* Minimizing side effects: Components of the system should aim to minimize side effects and to communicate with other
components via message-passing.
* Operational Safety: The software will be managing signing keys where conflicting messages can lead to large amounts of
value to be slashed. Care should be taken to ensure that no messages are signed incorrectly or in conflict with each
other.
The architecture of the node-side behavior aims to embody the Rust principles of ownership and message-passing to create clean, isolatable code. Each resource should have a single owner, with minimal sharing where unavoidable.
The architecture of the node-side behavior aims to embody the Rust principles of ownership and message-passing to create
clean, isolatable code. Each resource should have a single owner, with minimal sharing where unavoidable.
Many operations that need to be carried out involve the network, which is asynchronous. This asynchrony affects all core subsystems that rely on the network as well. The approach of hierarchical state machines is well-suited to this kind of environment.
Many operations that need to be carried out involve the network, which is asynchronous. This asynchrony affects all core
subsystems that rely on the network as well. The approach of hierarchical state machines is well-suited to this kind of
environment.
We introduce
## Components
The node architecture consists of the following components:
* The Overseer (and subsystems): A hierarchy of state machines where an overseer supervises subsystems. Subsystems can contain their own internal hierarchy of jobs. This is elaborated on in the next section on Subsystems.
* The Overseer (and subsystems): A hierarchy of state machines where an overseer supervises subsystems. Subsystems can
contain their own internal hierarchy of jobs. This is elaborated on in the next section on Subsystems.
* A block proposer: Logic triggered by the consensus algorithm of the chain when the node should author a block.
* A GRANDPA voting rule: A strategy for selecting chains to vote on in the GRANDPA algorithm to ensure that only valid parachain candidates appear in finalized relay-chain blocks.
* A GRANDPA voting rule: A strategy for selecting chains to vote on in the GRANDPA algorithm to ensure that only valid
parachain candidates appear in finalized relay-chain blocks.
## Assumptions
The Node-side code comes with a set of assumptions that we build upon. These assumptions encompass most of the fundamental blockchain functionality.
The Node-side code comes with a set of assumptions that we build upon. These assumptions encompass most of the
fundamental blockchain functionality.
We assume the following constraints regarding provided basic functionality:
* The underlying **consensus** algorithm, whether it is BABE or SASSAFRAS, is implemented.
* There is a **chain synchronization** protocol which will search for and download the longest available chains at all times.
* The **state** of all blocks at the head of the chain is available. There may be **state pruning** such that the state of the last `k` blocks behind the last finalized block is available, as well as the state of all their descendants. This assumption implies that the state of all active leaves and their last `k` ancestors is available. The underlying implementation is expected to support `k` of a few hundred blocks, but we reduce this to a very conservative `k=5` for our purposes.
* There is an underlying **networking** framework which provides **peer discovery** services which will provide us with peers and will not create "loopback" connections to our own node. The number of peers we will have is assumed to be bounded at 1000.
* There is a **transaction pool** and a **transaction propagation** mechanism which maintains a set of current transactions and distributes to connected peers. Current transactions are those which are not outdated relative to some "best" fork of the chain, which is part of the active heads, and have not been included in the best fork.
* There is a **chain synchronization** protocol which will search for and download the longest available chains at all
times.
* The **state** of all blocks at the head of the chain is available. There may be **state pruning** such that the
state of the last `k` blocks behind the last finalized block is available, as well as the state of all their
descendants. This assumption implies that the state of all active leaves and their last `k` ancestors is available.
The underlying implementation is expected to support `k` of a few hundred blocks, but we reduce this to a very
conservative `k=5` for our purposes.
* There is an underlying **networking** framework which provides **peer discovery** services which will provide us
with peers and will not create "loopback" connections to our own node. The number of peers we will have is assumed
to be bounded at 1000.
* There is a **transaction pool** and a **transaction propagation** mechanism which maintains a set of current
transactions and distributes to connected peers. Current transactions are those which are not outdated relative to
some "best" fork of the chain, which is part of the active heads, and have not been included in the best fork.
@@ -2,6 +2,9 @@
The approval subsystems implement the node-side of the [Approval Protocol](../../protocol-approval.md).
We make a divide between the [assignment/voting logic](approval-voting.md) and the [distribution logic](approval-distribution.md) that distributes assignment certifications and approval votes. The logic in the assignment and voting also informs the GRANDPA voting rule on how to vote.
We make a divide between the [assignment/voting logic](approval-voting.md) and the [distribution
logic](approval-distribution.md) that distributes assignment certifications and approval votes. The logic in the
assignment and voting also informs the GRANDPA voting rule on how to vote.
These subsystems are intended to flag issues and begin participating in live disputes. Dispute subsystems also track all observed votes (backing, approval, and dispute-specific) by all validators on all candidates.
These subsystems are intended to flag issues and begin participating in live disputes. Dispute subsystems also track all
observed votes (backing, approval, and dispute-specific) by all validators on all candidates.
@@ -2,50 +2,73 @@
A subsystem for the distribution of assignments and approvals for approval checks on candidates over the network.
The [Approval Voting](approval-voting.md) subsystem is responsible for active participation in a protocol designed to select a sufficient number of validators to check each and every candidate which appears in the relay chain. Statements of participation in this checking process are divided into two kinds:
- **Assignments** indicate that validators have been selected to do checking
- **Approvals** indicate that validators have checked and found the candidate satisfactory.
The [Approval Voting](approval-voting.md) subsystem is responsible for active participation in a protocol designed to
select a sufficient number of validators to check each and every candidate which appears in the relay chain. Statements
of participation in this checking process are divided into two kinds:
* **Assignments** indicate that validators have been selected to do checking
* **Approvals** indicate that validators have checked and found the candidate satisfactory.
The [Approval Voting](approval-voting.md) subsystem handles all the issuing and tallying of this protocol, but this subsystem is responsible for the disbursal of statements among the validator-set.
The [Approval Voting](approval-voting.md) subsystem handles all the issuing and tallying of this protocol, but this
subsystem is responsible for the disbursal of statements among the validator-set.
The inclusion pipeline of candidates concludes after availability, and only after inclusion do candidates actually get pushed into the approval checking pipeline. As such, this protocol deals with the candidates _made available by_ particular blocks, as opposed to the candidates which actually appear within those blocks, which are the candidates _backed by_ those blocks. Unless stated otherwise, whenever we reference a candidate partially by block hash, we are referring to the set of candidates _made available by_ those blocks.
The inclusion pipeline of candidates concludes after availability, and only after inclusion do candidates actually get
pushed into the approval checking pipeline. As such, this protocol deals with the candidates _made available by_
particular blocks, as opposed to the candidates which actually appear within those blocks, which are the candidates
_backed by_ those blocks. Unless stated otherwise, whenever we reference a candidate partially by block hash, we are
referring to the set of candidates _made available by_ those blocks.
We implement this protocol as a gossip protocol, and like other parachain-related gossip protocols our primary concerns are about ensuring fast message propagation while maintaining an upper bound on the number of messages any given node must store at any time.
We implement this protocol as a gossip protocol, and like other parachain-related gossip protocols our primary concerns
are about ensuring fast message propagation while maintaining an upper bound on the number of messages any given node
must store at any time.
Approval messages should always follow assignments, so we need to be able to discern two pieces of information based on our [View](../../types/network.md#universal-types):
Approval messages should always follow assignments, so we need to be able to discern two pieces of information based on
our [View](../../types/network.md#universal-types):
1. Is a particular assignment relevant under a given `View`?
2. Is a particular approval relevant to any assignment in a set?
For our own local view, these two queries must not yield false negatives. When applied to our peers' views, it is acceptable for them to yield false negatives. The reason for that is that our peers' views may be beyond ours, and we are not capable of fully evaluating them. Once we have caught up, we can check again for false negatives to continue distributing.
For our own local view, these two queries must not yield false negatives. When applied to our peers' views, it is
acceptable for them to yield false negatives. The reason for that is that our peers' views may be beyond ours, and we
are not capable of fully evaluating them. Once we have caught up, we can check again for false negatives to continue
distributing.
For assignments, what we need to be checking is whether we are aware of the (block, candidate) pair that the assignment references. For approvals, we need to be aware of an assignment by the same validator which references the candidate being approved.
For assignments, what we need to be checking is whether we are aware of the (block, candidate) pair that the assignment
references. For approvals, we need to be aware of an assignment by the same validator which references the candidate
being approved.
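The two relevance checks described above can be sketched as simple set lookups. This is a minimal illustration, not the subsystem's actual data model: the `Awareness` struct, its fields, and the type aliases are hypothetical stand-ins introduced only for this example.

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for the real hash and index types.
type BlockHash = [u8; 32];
type CandidateIndex = u32;
type ValidatorIndex = u32;

/// Hypothetical record of what we are locally aware of.
#[derive(Default)]
struct Awareness {
    /// (block, candidate) pairs we know about.
    candidates: HashSet<(BlockHash, CandidateIndex)>,
    /// Accepted assignments, keyed by the pair and the assigning validator.
    assignments: HashSet<(BlockHash, CandidateIndex, ValidatorIndex)>,
}

impl Awareness {
    /// An assignment is relevant if we know the (block, candidate) pair it references.
    fn assignment_relevant(&self, block: BlockHash, candidate: CandidateIndex) -> bool {
        self.candidates.contains(&(block, candidate))
    }

    /// An approval is relevant if the same validator has an assignment
    /// referencing the candidate being approved.
    fn approval_relevant(
        &self,
        block: BlockHash,
        candidate: CandidateIndex,
        validator: ValidatorIndex,
    ) -> bool {
        self.assignments.contains(&(block, candidate, validator))
    }
}
```

Both lookups are exact for our own view, so they cannot yield false negatives locally. Applied to a peer's view they would only see what we know ourselves, which is where false negatives can appear.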
However, awareness on its own of a (block, candidate) pair would imply that even ancient candidates all the way back to the genesis are relevant. We are actually not interested in anything before finality.
However, awareness on its own of a (block, candidate) pair would imply that even ancient candidates all the way back to
the genesis are relevant. We are actually not interested in anything before finality.
We gossip assignments along a grid topology produced by the [Gossip Support Subsystem](../utility/gossip-support.md) and also to a few random peers. The first time we accept an assignment or approval, regardless of the source, which originates from a validator peer in a shared dimension of the grid, we propagate the message to validator peers in the unshared dimension as well as a few random peers.
We gossip assignments along a grid topology produced by the [Gossip Support Subsystem](../utility/gossip-support.md) and
also to a few random peers. The first time we accept an assignment or approval, regardless of the source, which
originates from a validator peer in a shared dimension of the grid, we propagate the message to validator peers in the
unshared dimension as well as a few random peers.
But, in case these mechanisms don't work on their own, we need to trade bandwidth for protocol liveness by introducing aggression.
But, in case these mechanisms don't work on their own, we need to trade bandwidth for protocol liveness by introducing
aggression.
Aggression has 3 levels:
Aggression Level 0: The basic behaviors described above.
Aggression Level 1: The originator of a message sends to all peers. Other peers follow the rules above.
Aggression Level 2: All peers send all messages to all their row and column neighbors. This means that each validator will, on average, receive each message approximately 2*sqrt(n) times.
* Aggression Level 0: The basic behaviors described above.
* Aggression Level 1: The originator of a message sends to all peers. Other peers follow the rules above.
* Aggression Level 2: All peers send all messages to all their row and column neighbors. This means that each validator
will, on average, receive each message approximately 2*sqrt(n) times.
These aggression levels are chosen based on how long a block has taken to finalize: assignments and approvals related to the unfinalized block will be propagated with more aggression. In particular, it's only the earliest unfinalized blocks that aggression should be applied to, because descendants may be unfinalized only by virtue of being descendants.
These aggression levels are chosen based on how long a block has taken to finalize: assignments and approvals related to
the unfinalized block will be propagated with more aggression. In particular, it's only the earliest unfinalized blocks
that aggression should be applied to, because descendants may be unfinalized only by virtue of being descendants.
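As a rough sketch of how a level could be picked from the age of the earliest unfinalized block: the thresholds, the `AggressionConfig` struct, and the unit of `age` are assumptions made for this example and are not specified by the text above.

```rust
/// The three aggression levels described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AggressionLevel {
    L0,
    L1,
    L2,
}

/// Hypothetical thresholds; the text does not fix concrete values or units.
struct AggressionConfig {
    l1_threshold: u32,
    l2_threshold: u32,
}

/// Pick a level from how long the earliest unfinalized block has been waiting.
/// `age` is in whatever unit the thresholds use (an assumption of this sketch).
fn aggression_level(cfg: &AggressionConfig, age: u32) -> AggressionLevel {
    if age >= cfg.l2_threshold {
        AggressionLevel::L2
    } else if age >= cfg.l1_threshold {
        AggressionLevel::L1
    } else {
        AggressionLevel::L0
    }
}

/// Approximate number of copies of each message a validator receives at
/// level 2, following the 2*sqrt(n) estimate given above.
fn level2_expected_copies(n_validators: u32) -> f64 {
    2.0 * f64::from(n_validators).sqrt()
}
```

With 1024 validators, for instance, the level 2 estimate works out to 2 * 32 = 64 copies of each message per validator.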
## Protocol
Input:
- `ApprovalDistributionMessage::NewBlocks`
- `ApprovalDistributionMessage::DistributeAssignment`
- `ApprovalDistributionMessage::DistributeApproval`
- `ApprovalDistributionMessage::NetworkBridgeUpdate`
- `OverseerSignal::BlockFinalized`
* `ApprovalDistributionMessage::NewBlocks`
* `ApprovalDistributionMessage::DistributeAssignment`
* `ApprovalDistributionMessage::DistributeApproval`
* `ApprovalDistributionMessage::NetworkBridgeUpdate`
* `OverseerSignal::BlockFinalized`
Output:
- `ApprovalVotingMessage::CheckAndImportAssignment`
- `ApprovalVotingMessage::CheckAndImportApproval`
- `NetworkBridgeMessage::SendValidationMessage::ApprovalDistribution`
* `ApprovalVotingMessage::CheckAndImportAssignment`
* `ApprovalVotingMessage::CheckAndImportApproval`
* `NetworkBridgeMessage::SendValidationMessage::ApprovalDistribution`
## Functionality
@@ -134,28 +157,37 @@ Iterate over every `BlockEntry` and remove `PeerId` from it.
#### `NetworkBridgeEvent::OurViewChange`
Remove entries in `pending_known` for all hashes not present in the view.
Ensure a vector is present in `pending_known` for each hash in the view that does not have an entry in `blocks`.
Remove entries in `pending_known` for all hashes not present in the view. Ensure a vector is present in `pending_known`
for each hash in the view that does not have an entry in `blocks`.
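The two steps for `OurViewChange` might look roughly like this. `PendingMessage` and the empty `BlockEntry` are placeholders invented for the sketch, and the view is simplified to a slice of hashes.

```rust
use std::collections::{HashMap, HashSet};

type BlockHash = [u8; 32];

/// Hypothetical placeholder for a buffered peer message.
#[derive(Debug, Clone, PartialEq)]
struct PendingMessage(Vec<u8>);

/// Placeholder; the real entry tracks candidates and peer knowledge.
struct BlockEntry;

fn handle_our_view_change(
    pending_known: &mut HashMap<BlockHash, Vec<PendingMessage>>,
    blocks: &HashMap<BlockHash, BlockEntry>,
    view: &[BlockHash],
) {
    let in_view: HashSet<BlockHash> = view.iter().copied().collect();

    // Drop buffered messages for hashes that are no longer in the view.
    pending_known.retain(|hash, _| in_view.contains(hash));

    // Make sure every block in the view that we do not know yet has a buffer.
    for hash in view {
        if !blocks.contains_key(hash) {
            pending_known.entry(*hash).or_default();
        }
    }
}
```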
#### `NetworkBridgeEvent::PeerViewChange`
Invoke `unify_with_peer(peer, view)` to catch them up to messages we have.
We also need to use the `view.finalized_number` to remove the `PeerId` from any blocks that it won't be wanting information about anymore. Note that we have to be on guard for peers doing crazy stuff like jumping their `finalized_number` forward 10 trillion blocks to try and get us stuck in a loop for ages.
We also need to use the `view.finalized_number` to remove the `PeerId` from any blocks that it won't be wanting
information about anymore. Note that we have to be on guard for peers doing crazy stuff like jumping their
`finalized_number` forward 10 trillion blocks to try and get us stuck in a loop for ages.
One of the safeguards we can implement is to reject view updates from peers where the new `finalized_number` is less than the previous.
One of the safeguards we can implement is to reject view updates from peers where the new `finalized_number` is less
than the previous.
We augment that by defining `constrain(x)` to output the x bounded by the first and last numbers in `state.blocks_by_number`.
We augment that by defining `constrain(x)` to output the x bounded by the first and last numbers in
`state.blocks_by_number`.
From there, we can loop backwards from `constrain(view.finalized_number)` until `constrain(last_view.finalized_number)` is reached, removing the `PeerId` from all `BlockEntry`s referenced at that height. We can break the loop early if we ever exit the bound supplied by the first block in `state.blocks_by_number`.
From there, we can loop backwards from `constrain(view.finalized_number)` until `constrain(last_view.finalized_number)`
is reached, removing the `PeerId` from all `BlockEntry`s referenced at that height. We can break the loop early if we
ever exit the bound supplied by the first block in `state.blocks_by_number`.
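A sketch of the safeguards above, assuming `blocks_by_number` is an ordered map. The `State` layout, the `PeerId` alias, and the name `handle_finalized_update` are hypothetical, and both ends of the range are treated as inclusive, which is one reading of the text.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

type BlockNumber = u32;
type BlockHash = [u8; 32];
type PeerId = u64; // hypothetical stand-in for the network peer id

#[derive(Default)]
struct BlockEntry {
    known_by: HashSet<PeerId>,
}

#[derive(Default)]
struct State {
    blocks_by_number: BTreeMap<BlockNumber, Vec<BlockHash>>,
    blocks: HashMap<BlockHash, BlockEntry>,
}

impl State {
    /// Bound `x` by the first and last numbers in `blocks_by_number`.
    /// Returns `None` when no blocks are tracked.
    fn constrain(&self, x: BlockNumber) -> Option<BlockNumber> {
        let first = *self.blocks_by_number.keys().next()?;
        let last = *self.blocks_by_number.keys().next_back()?;
        Some(x.clamp(first, last))
    }

    /// Returns `false` if the update is rejected because the finalized number
    /// went backwards; otherwise removes the peer from the newly finalized heights.
    fn handle_finalized_update(&mut self, peer: PeerId, old: BlockNumber, new: BlockNumber) -> bool {
        if new < old {
            return false;
        }
        let (Some(lo), Some(hi)) = (self.constrain(old), self.constrain(new)) else {
            return true; // nothing tracked, nothing to prune
        };
        // Walk from `hi` down to `lo`. The range only visits heights we store,
        // so a huge jump in `new` cannot make this loop long.
        let mut to_clear = Vec::new();
        for (number, hashes) in self.blocks_by_number.range(lo..=hi).rev() {
            // Clamping can move a bound upwards; skip heights the peer has not finalized.
            if *number > new {
                continue;
            }
            to_clear.extend(hashes.iter().copied());
        }
        for hash in to_clear {
            if let Some(entry) = self.blocks.get_mut(&hash) {
                entry.known_by.remove(&peer);
            }
        }
        true
    }
}
```

Because the loop runs over the clamped range of stored heights, its cost is bounded by the number of tracked blocks rather than by the numbers a peer claims.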
#### `NetworkBridgeEvent::PeerMessage`
If the block hash referenced by the message exists in `pending_known`, add it to the vector of pending messages and return.
If the block hash referenced by the message exists in `pending_known`, add it to the vector of pending messages and
return.
If the message is of type `ApprovalDistributionV1Message::Assignment(assignment_cert, claimed_index)`, then call `import_and_circulate_assignment(MessageSource::Peer(sender), assignment_cert, claimed_index)`
If the message is of type `ApprovalDistributionV1Message::Assignment(assignment_cert, claimed_index)`, then call
`import_and_circulate_assignment(MessageSource::Peer(sender), assignment_cert, claimed_index)`
If the message is of type `ApprovalDistributionV1Message::Approval(approval_vote)`, then call `import_and_circulate_approval(MessageSource::Peer(sender), approval_vote)`
If the message is of type `ApprovalDistributionV1Message::Approval(approval_vote)`, then call
`import_and_circulate_approval(MessageSource::Peer(sender), approval_vote)`
### Subsystem Updates
@@ -164,7 +196,8 @@ If the message is of type `ApprovalDistributionV1Message::Approval(approval_vote
Create `BlockEntry` and `CandidateEntries` for all blocks.
For all entries in `pending_known`:
* If there is now an entry under `blocks` for the block hash, drain all messages and import with `import_and_circulate_assignment` and `import_and_circulate_approval`.
* If there is now an entry under `blocks` for the block hash, drain all messages and import with
`import_and_circulate_assignment` and `import_and_circulate_approval`.
For all peers:
* Compute `view_intersection` as the intersection of the peer's view blocks with the hashes of the new blocks.
@@ -180,7 +213,8 @@ Call `import_and_circulate_approval` with `MessageSource::Local`.
#### `OverseerSignal::BlockFinalized`
Prune all lists from `blocks_by_number` with number less than or equal to `finalized_number`. Prune all the `BlockEntry`s referenced by those lists.
Prune all lists from `blocks_by_number` with number less than or equal to `finalized_number`. Prune all the
`BlockEntry`s referenced by those lists.
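Assuming `blocks_by_number` is a `BTreeMap`, the pruning step could be sketched as below. The function name `prune_finalized` and the empty `BlockEntry` are placeholders for this example.

```rust
use std::collections::{BTreeMap, HashMap};

type BlockNumber = u32;
type BlockHash = [u8; 32];

/// Placeholder; the real entry carries candidate and peer data.
struct BlockEntry;

fn prune_finalized(
    blocks_by_number: &mut BTreeMap<BlockNumber, Vec<BlockHash>>,
    blocks: &mut HashMap<BlockHash, BlockEntry>,
    finalized_number: BlockNumber,
) {
    // `split_off(&k)` returns the entries with key >= k, so splitting at
    // `finalized_number + 1` yields exactly the heights we want to keep.
    let keep = match finalized_number.checked_add(1) {
        Some(next) => blocks_by_number.split_off(&next),
        None => BTreeMap::new(), // everything is at or below the maximum number
    };
    let pruned = std::mem::replace(blocks_by_number, keep);

    // Remove every `BlockEntry` referenced by the pruned lists.
    for hash in pruned.into_values().flatten() {
        blocks.remove(&hash);
    }
}
```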
### Utility
@@ -192,9 +226,14 @@ enum MessageSource {
}
```
#### `import_and_circulate_assignment(source: MessageSource, assignment: IndirectAssignmentCert, claimed_candidate_index: CandidateIndex)`
#### `import_and_circulate_assignment(...)`
Imports an assignment cert referenced by block hash and candidate index. As a postcondition, if the cert is valid, it will have distributed the cert to all peers who have the block in their view, with the exclusion of the peer referenced by the `MessageSource`.
`import_and_circulate_assignment(source: MessageSource, assignment: IndirectAssignmentCert, claimed_candidate_index:
CandidateIndex)`
Imports an assignment cert referenced by block hash and candidate index. As a postcondition, if the cert is valid, it
will have distributed the cert to all peers who have the block in their view, with the exclusion of the peer referenced
by the `MessageSource`.
We maintain a few invariants:
* we only send an assignment to a peer after we add its fingerprint to our knowledge
@@ -202,61 +241,84 @@ We maintain a few invariants:
The algorithm is the following:
* Load the `BlockEntry` using `assignment.block_hash`. If it does not exist, report the source if it is `MessageSource::Peer` and return.
* Load the `BlockEntry` using `assignment.block_hash`. If it does not exist, report the source if it is
`MessageSource::Peer` and return.
* Compute a fingerprint for the `assignment` using `claimed_candidate_index`.
* If the source is `MessageSource::Peer(sender)`:
* check if `peer` appears under `known_by` and whether the fingerprint is in the knowledge of the peer. If the peer does not know the block, report for providing data out-of-view and proceed. If the peer does know the block and the `sent` knowledge contains the fingerprint, report for providing replicate data and return, otherwise, insert into the `received` knowledge and return.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost,
add the fingerprint to the peer's knowledge only if it knows about the block and return.
Note that we must do this after checking for out-of-view and if the peer knows about the block to avoid being spammed.
If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* check if `peer` appears under `known_by` and whether the fingerprint is in the knowledge of the peer. If the peer
does not know the block, report for providing data out-of-view and proceed. If the peer does know the block and
the `sent` knowledge contains the fingerprint, report for providing replicate data and return, otherwise, insert
into the `received` knowledge and return.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation
boost, add the fingerprint to the peer's knowledge only if it knows about the block and return. Note that we must do
this after checking for out-of-view and if the peer knows about the block to avoid being spammed. If we did this
check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* Dispatch `ApprovalVotingMessage::CheckAndImportAssignment(assignment)` and wait for the response.
* If the result is `AssignmentCheckResult::Accepted`
* If the vote was accepted but not duplicate, give the peer a positive reputation boost
* add the fingerprint to both our and the peer's knowledge in the `BlockEntry`. Note that we only do this after making sure we have the right fingerprint.
* If the result is `AssignmentCheckResult::AcceptedDuplicate`, add the fingerprint to the peer's knowledge if it knows about the block and return.
* add the fingerprint to both our and the peer's knowledge in the `BlockEntry`. Note that we only do this after
making sure we have the right fingerprint.
* If the result is `AssignmentCheckResult::AcceptedDuplicate`, add the fingerprint to the peer's knowledge if it
knows about the block and return.
* If the result is `AssignmentCheckResult::TooFarInFuture`, mildly punish the peer and return.
* If the result is `AssignmentCheckResult::Bad`, punish the peer and return.
* If the source is `MessageSource::Local(CandidateIndex)`
* check if the fingerprint appears under the `BlockEntry`'s knowledge. If not, add it.
* Load the candidate entry for the given candidate index. It should exist unless there is a logic error in the approval voting subsystem.
* Set the approval state for the validator index to `ApprovalState::Assigned` unless the approval state is set already. This should not happen as long as the approval voting subsystem instructs us to ignore duplicate assignments.
* Dispatch a `ApprovalDistributionV1Message::Assignment(assignment, candidate_index)` to all peers in the `BlockEntry`'s `known_by` set, excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add the fingerprint of the assignment to the knowledge of each peer.
* Load the candidate entry for the given candidate index. It should exist unless there is a logic error in the
approval voting subsystem.
* Set the approval state for the validator index to `ApprovalState::Assigned` unless the approval state is set
already. This should not happen as long as the approval voting subsystem instructs us to ignore duplicate
assignments.
* Dispatch a `ApprovalDistributionV1Message::Assignment(assignment, candidate_index)` to all peers in the
`BlockEntry`'s `known_by` set, excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add
the fingerprint of the assignment to the knowledge of each peer.
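The final dispatch step, together with the invariant that a fingerprint is recorded before anything is sent, can be sketched as follows. The types here are simplified and hypothetical: `Fingerprint` is reduced to an integer, `MessageSource::Local` carries no payload, and `circulate` returns the list of recipients instead of sending on the network.

```rust
use std::collections::{HashMap, HashSet};

type PeerId = u64; // hypothetical stand-in
type Fingerprint = u64; // hypothetical stand-in for a message fingerprint

/// Simplified version of the message source for this sketch.
enum MessageSource {
    Peer(PeerId),
    Local,
}

#[derive(Default)]
struct PeerKnowledge {
    /// Fingerprints of messages we have sent to this peer.
    sent: HashSet<Fingerprint>,
}

struct BlockEntry {
    known_by: HashMap<PeerId, PeerKnowledge>,
}

/// Returns the peers the message should be sent to, in ascending order.
fn circulate(entry: &mut BlockEntry, source: &MessageSource, fingerprint: Fingerprint) -> Vec<PeerId> {
    let mut targets = Vec::new();
    for (peer, knowledge) in entry.known_by.iter_mut() {
        // Never echo a message back to the peer it came from.
        if matches!(source, MessageSource::Peer(sender) if sender == peer) {
            continue;
        }
        // Record the fingerprint first, then schedule the send.
        knowledge.sent.insert(fingerprint);
        targets.push(*peer);
    }
    targets.sort_unstable();
    targets
}
```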
#### `import_and_circulate_approval(source: MessageSource, approval: IndirectSignedApprovalVote)`
Imports an approval signature referenced by block hash and candidate index:
* Load the `BlockEntry` using `approval.block_hash` and the candidate entry using `approval.candidate_entry`. If either does not exist, report the source if it is `MessageSource::Peer` and return.
* Load the `BlockEntry` using `approval.block_hash` and the candidate entry using `approval.candidate_entry`. If
either does not exist, report the source if it is `MessageSource::Peer` and return.
* Compute a fingerprint for the approval.
* Compute a fingerprint for the corresponding assignment. If the `BlockEntry`'s knowledge does not contain that fingerprint, then report the source if it is `MessageSource::Peer` and return. All references to a fingerprint after this refer to the approval's, not the assignment's.
* Compute a fingerprint for the corresponding assignment. If the `BlockEntry`'s knowledge does not contain that
fingerprint, then report the source if it is `MessageSource::Peer` and return. All references to a fingerprint after
this refer to the approval's, not the assignment's.
* If the source is `MessageSource::Peer(sender)`:
* check if `peer` appears under `known_by` and whether the fingerprint is in the knowledge of the peer. If the peer does not know the block, report for providing data out-of-view and proceed. If the peer does know the block and the `sent` knowledge contains the fingerprint, report for providing replicate data and return, otherwise, insert into the `received` knowledge and return.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost,
add the fingerprint to the peer's knowledge only if it knows about the block and return.
Note that we must do this after checking for out-of-view to avoid being spammed. If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* check if `peer` appears under `known_by` and whether the fingerprint is in the knowledge of the peer. If the peer
does not know the block, report for providing data out-of-view and proceed. If the peer does know the block and
the `sent` knowledge contains the fingerprint, report for providing replicate data and return, otherwise, insert
into the `received` knowledge and return.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation
boost, add the fingerprint to the peer's knowledge only if it knows about the block and return. Note that we must do
this after checking for out-of-view to avoid being spammed. If we did this check earlier, a peer could provide data
out-of-view repeatedly and be rewarded for it.
* Dispatch `ApprovalVotingMessage::CheckAndImportApproval(approval)` and wait for the response.
* If the result is `VoteCheckResult::Accepted(())`:
* Give the peer a positive reputation boost and add the fingerprint to both our and the peer's knowledge.
* If the result is `VoteCheckResult::Bad`:
* Report the peer and return.
* Load the candidate entry for the given candidate index. It should exist unless there is a logic error in the approval voting subsystem.
* Set the approval state for the validator index to `ApprovalState::Approved`. It should already be in the `Assigned` state as our `BlockEntry` knowledge contains a fingerprint for the assignment.
* Dispatch an `ApprovalDistributionV1Message::Approval(approval)` to all peers in the `BlockEntry`'s `known_by` set, excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add the fingerprint of the approval to the knowledge of each peer. Note that this obeys the politeness conditions:
* Load the candidate entry for the given candidate index. It should exist unless there is a logic error in the
approval voting subsystem.
* Set the approval state for the validator index to `ApprovalState::Approved`. It should already be in the `Assigned`
state as our `BlockEntry` knowledge contains a fingerprint for the assignment.
* Dispatch an `ApprovalDistributionV1Message::Approval(approval)` to all peers in the `BlockEntry`'s `known_by` set,
excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add the fingerprint of the
approval to the knowledge of each peer. Note that this obeys the politeness conditions:
* We guarantee elsewhere that all peers within `known_by` are aware of all assignments relative to the block.
* We've checked that this specific approval has a corresponding assignment within the `BlockEntry`.
* Thus, all peers are aware of the assignment or have a message to them in-flight which will make them so.
#### `unify_with_peer(peer: PeerId, view)`:
#### `unify_with_peer(peer: PeerId, view)`
1. Initialize a set `missing_knowledge = {}`
For each block in the view:
2. Load the `BlockEntry` for the block. If the block is unknown, or the number is less than or equal to the view's finalized number go to step 6.
3. Inspect the `known_by` set of the `BlockEntry`. If the peer already knows all assignments/approvals, go to step 6.
4. Add the peer to `known_by` and add the hash and missing knowledge of the block to `missing_knowledge`.
5. Return to step 2 with the ancestor of the block.
1. Load the `BlockEntry` for the block. If the block is unknown, or the number is less than or equal to the view's
finalized number go to step 6.
1. Inspect the `known_by` set of the `BlockEntry`. If the peer already knows all assignments/approvals, go to step 6.
1. Add the peer to `known_by` and add the hash and missing knowledge of the block to `missing_knowledge`.
1. Return to step 2 with the ancestor of the block.
6. For each block in `missing_knowledge`, send all assignments and approvals for all candidates in those blocks to the peer.
1. For each block in `missing_knowledge`, send all assignments and approvals for all candidates in those blocks to the
peer.
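
The walk described in the steps above can be sketched as follows. This is a minimal illustration, not the subsystem's
actual code: `Hash` and `PeerId` are integer stand-ins, the `BlockEntry` fields are reduced to what the walk needs, and
"the peer already knows all assignments/approvals" is approximated by the peer already being present in `known_by`.

```rust
use std::collections::{HashMap, HashSet};

type Hash = u64; // hypothetical stand-in for a block hash
type PeerId = u32; // hypothetical stand-in for a network peer id

struct BlockEntry {
    number: u32,
    parent: Hash,
    known_by: HashSet<PeerId>,
}

/// Walks back from every head in the peer's view and returns the blocks
/// whose assignments and approvals still have to be sent to the peer.
fn unify_with_peer(
    blocks: &mut HashMap<Hash, BlockEntry>,
    peer: PeerId,
    view_heads: &[Hash],
    finalized_number: u32,
) -> Vec<Hash> {
    let mut missing_knowledge = Vec::new();
    for &head in view_heads {
        let mut current = head;
        loop {
            // Stop at unknown blocks and at blocks at or below the finalized number.
            let entry = match blocks.get_mut(&current) {
                Some(e) if e.number > finalized_number => e,
                _ => break,
            };
            // `insert` returns false when the peer was already recorded,
            // which this sketch treats as "peer already knows this block".
            if !entry.known_by.insert(peer) {
                break;
            }
            missing_knowledge.push(current);
            current = entry.parent;
        }
    }
    missing_knowledge
}
```

The caller would then send all assignments and approvals for the returned blocks to the peer.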
@@ -1,35 +1,61 @@
# Approval Voting
Reading the [section on the approval protocol](../../protocol-approval.md) will likely be necessary to understand the aims of this subsystem.
Reading the [section on the approval protocol](../../protocol-approval.md) will likely be necessary to understand the
aims of this subsystem.
Approval votes are split into two parts: Assignments and Approvals. Validators first broadcast their assignment to indicate intent to check a candidate. Upon successfully checking, they broadcast an approval vote. If a validator doesn't broadcast their approval vote shortly after issuing an assignment, this is an indication that they are being prevented from recovering or validating the block data and that more validators should self-select to check the candidate. This is known as a "no-show".
Approval votes are split into two parts: Assignments and Approvals. Validators first broadcast their assignment to
indicate intent to check a candidate. Upon successfully checking, they broadcast an approval vote. If a validator
doesn't broadcast their approval vote shortly after issuing an assignment, this is an indication that they are being
prevented from recovering or validating the block data and that more validators should self-select to check the
candidate. This is known as a "no-show".
The core of this subsystem is a Tick-based timer loop, where Ticks are 500ms. We also reason about time in terms of `DelayTranche`s, which measure the number of ticks elapsed since a block was produced. We track metadata for all un-finalized but included candidates. We compute our local assignments to check each candidate, as well as which `DelayTranche` those assignments may be minimally triggered at. As the same candidate may appear in more than one block, we must produce our potential assignments for each (Block, Candidate) pair. The timing loop is based on waiting for assignments to become no-shows or waiting to broadcast and begin our own assignment to check.
The core of this subsystem is a Tick-based timer loop, where Ticks are 500ms. We also reason about time in terms of
`DelayTranche`s, which measure the number of ticks elapsed since a block was produced. We track metadata for all
un-finalized but included candidates. We compute our local assignments to check each candidate, as well as which
`DelayTranche` those assignments may be minimally triggered at. As the same candidate may appear in more than one block,
we must produce our potential assignments for each (Block, Candidate) pair. The timing loop is based on waiting for
assignments to become no-shows or waiting to broadcast and begin our own assignment to check.
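
As a rough illustration of the time units above, the following sketch converts wall-clock milliseconds into ticks and
derives the tranche reached for a block. The function names and the use of Unix milliseconds as the time base are
assumptions made for this example; only the 500ms tick length comes from the text.

```rust
/// Length of one tick, as stated above.
const TICK_DURATION_MILLIS: u64 = 500;

type Tick = u64;
type DelayTranche = u32;

/// Hypothetical helper: the tick containing the given Unix timestamp (in milliseconds).
fn tick_at(unix_millis: u64) -> Tick {
    unix_millis / TICK_DURATION_MILLIS
}

/// Hypothetical helper: number of ticks elapsed since the block was produced.
/// Saturates at zero if the local clock is behind the block's tick.
fn tranche_now(block_tick: Tick, now: Tick) -> DelayTranche {
    now.saturating_sub(block_tick) as DelayTranche
}
```

For example, a block produced at tick 10 is in tranche 4 once the local clock reaches tick 14.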
Another main component of this subsystem is the logic for determining when a (Block, Candidate) pair has been approved and when to broadcast and trigger our own assignment. Once a (Block, Candidate) pair has been approved, we mark a corresponding bit in the `BlockEntry` that indicates the candidate has been approved under the block. When we trigger our own assignment, we broadcast it via Approval Distribution, begin fetching the data from Availability Recovery, and then pass it through to the Candidate Validation. Once these steps are successful, we issue our approval vote. If any of these steps fail, we don't issue any vote and will "no-show" from the perspective of other validators in addition a dispute is raised via the dispute-coordinator, by sending `IssueLocalStatement`.
Another main component of this subsystem is the logic for determining when a (Block, Candidate) pair has been approved
and when to broadcast and trigger our own assignment. Once a (Block, Candidate) pair has been approved, we mark a
corresponding bit in the `BlockEntry` that indicates the candidate has been approved under the block. When we trigger
our own assignment, we broadcast it via Approval Distribution, begin fetching the data from Availability Recovery, and
then pass it through to the Candidate Validation. Once these steps are successful, we issue our approval vote. If any of
these steps fail, we don't issue any vote and will "no-show" from the perspective of other validators. In addition, a
dispute is raised via the dispute-coordinator by sending `IssueLocalStatement`.
Where this all fits into Polkadot is via block finality. Our goal is to not finalize any block containing a candidate that is not approved. We provide a hook for a custom GRANDPA voting rule - GRANDPA makes requests of the form (target, minimum) consisting of a target block (i.e. longest chain) that it would like to finalize, and a minimum block which, due to the rules of GRANDPA, must be voted on. The minimum is typically the last finalized block, but may be beyond it, in the case of having a last-round-estimate beyond the last finalized. Thus, our goal is to inform GRANDPA of some block between target and minimum which we believe can be finalized safely. We do this by iterating backwards from the target to the minimum and finding the longest continuous chain from minimum where all candidates included by those blocks have been approved.
Where this all fits into Polkadot is via block finality. Our goal is to not finalize any block containing a candidate
that is not approved. We provide a hook for a custom GRANDPA voting rule - GRANDPA makes requests of the form (target,
minimum) consisting of a target block (i.e. longest chain) that it would like to finalize, and a minimum block which,
due to the rules of GRANDPA, must be voted on. The minimum is typically the last finalized block, but may be beyond it,
in the case of having a last-round-estimate beyond the last finalized. Thus, our goal is to inform GRANDPA of some block
between target and minimum which we believe can be finalized safely. We do this by iterating backwards from the target
to the minimum and finding the longest continuous chain from minimum where all candidates included by those blocks have
been approved.
## Protocol
Input:
- `ApprovalVotingMessage::CheckAndImportAssignment`
- `ApprovalVotingMessage::CheckAndImportApproval`
- `ApprovalVotingMessage::ApprovedAncestor`
* `ApprovalVotingMessage::CheckAndImportAssignment`
* `ApprovalVotingMessage::CheckAndImportApproval`
* `ApprovalVotingMessage::ApprovedAncestor`
Output:
- `ApprovalDistributionMessage::DistributeAssignment`
- `ApprovalDistributionMessage::DistributeApproval`
- `RuntimeApiMessage::Request`
- `ChainApiMessage`
- `AvailabilityRecoveryMessage::Recover`
- `CandidateExecutionMessage::ValidateFromExhaustive`
* `ApprovalDistributionMessage::DistributeAssignment`
* `ApprovalDistributionMessage::DistributeApproval`
* `RuntimeApiMessage::Request`
* `ChainApiMessage`
* `AvailabilityRecoveryMessage::Recover`
* `CandidateValidationMessage::ValidateFromExhaustive`
## Functionality
The approval voting subsystem is responsible for casting votes and determining approval of candidates and as a result, blocks.
The approval voting subsystem is responsible for casting votes and determining approval of candidates and, as a result,
blocks.
This subsystem wraps a database which is used to store metadata about unfinalized blocks and the candidates within them. Candidates may appear in multiple blocks, and assignment criteria are chosen differently based on the hash of the block they appear in.
This subsystem wraps a database which is used to store metadata about unfinalized blocks and the candidates within them.
Candidates may appear in multiple blocks, and assignment criteria are chosen differently based on the hash of the block
they appear in.
## Database Schema
@@ -150,16 +176,22 @@ struct State {
}
```
This guide section makes no explicit references to writes to or reads from disk. Instead, it handles them implicitly, with the understanding that updates to block, candidate, and approval entries are persisted to disk.
This guide section makes no explicit references to writes to or reads from disk. Instead, it handles them implicitly,
with the understanding that updates to block, candidate, and approval entries are persisted to disk.
[`SessionInfo`](../../runtime/session_info.md)
On start-up, we clear everything currently stored by the database. This is done by loading the `StoredBlockRange`, iterating through each block number, iterating through each block hash, and iterating through each candidate referenced by each block. Although this is `O(o*n*p)`, we don't expect to have more than a few unfinalized blocks at any time and in extreme cases, a few thousand. The clearing operation should be relatively fast as a result.
On start-up, we clear everything currently stored by the database. This is done by loading the `StoredBlockRange`,
iterating through each block number, iterating through each block hash, and iterating through each candidate referenced
by each block. Although this is `O(o*n*p)`, we expect only a few unfinalized blocks at any time and, in extreme cases, a
few thousand. The clearing operation should be relatively fast as a result.
Main loop:
* Each iteration, select over all of
* The next `Tick` in `wakeups`: trigger `wakeup_process` for each `(Hash, Hash)` pair scheduled under the `Tick` and then remove all entries under the `Tick`.
* The next message from the overseer: handle the message as described in the [Incoming Messages section](#incoming-messages)
* The next `Tick` in `wakeups`: trigger `wakeup_process` for each `(Hash, Hash)` pair scheduled under the `Tick` and
then remove all entries under the `Tick`.
* The next message from the overseer: handle the message as described in the [Incoming Messages
section](#incoming-messages)
* The next approval vote request from `background_rx`
* If this is an `ApprovalVoteRequest`, [Issue an approval vote](#issue-approval-vote).
@@ -167,41 +199,84 @@ Main loop:
#### `OverseerSignal::BlockFinalized`
On receiving an `OverseerSignal::BlockFinalized(h)`, we fetch the block number `b` of that block from the `ChainApi` subsystem. We update our `StoredBlockRange` to begin at `b+1`. Additionally, we remove all block entries and candidates referenced by them up to and including `b`. Lastly, we prune out all descendants of `h` transitively: when we remove a `BlockEntry` with number `b` that is not equal to `h`, we recursively delete all the `BlockEntry`s referenced as children. We remove the `block_assignments` entry for the block hash and if `block_assignments` is now empty, remove the `CandidateEntry`. We also update each of the `BlockNumber -> Vec<Hash>` keys in the database to reflect the blocks at that height, clearing if empty.
On receiving an `OverseerSignal::BlockFinalized(h)`, we fetch the block number `b` of that block from the `ChainApi`
subsystem. We update our `StoredBlockRange` to begin at `b+1`. Additionally, we remove all block entries and candidates
referenced by them up to and including `b`. Lastly, we prune out all descendants of `h` transitively: when we remove a
`BlockEntry` with number `b` that is not equal to `h`, we recursively delete all the `BlockEntry`s referenced as
children. We remove the `block_assignments` entry for the block hash and if `block_assignments` is now empty, remove the
`CandidateEntry`. We also update each of the `BlockNumber -> Vec<Hash>` keys in the database to reflect the blocks at
that height, clearing if empty.
#### `OverseerSignal::ActiveLeavesUpdate`
On receiving an `OverseerSignal::ActiveLeavesUpdate(update)`:
* We determine the set of new blocks that were not in our previous view. This is done by querying the ancestry of all new items in the view and contrasting against the stored `BlockNumber`s. Typically, there will be only one new block. We fetch the headers and information on these blocks from the `ChainApi` subsystem. Stale leaves in the update can be ignored.
* We determine the set of new blocks that were not in our previous view. This is done by querying the ancestry of all
new items in the view and contrasting against the stored `BlockNumber`s. Typically, there will be only one new
block. We fetch the headers and information on these blocks from the `ChainApi` subsystem. Stale leaves in the
update can be ignored.
* We update the `StoredBlockRange` and the `BlockNumber` maps.
* We use the `RuntimeApiSubsystem` to determine information about these blocks. It is generally safe to assume that runtime state is available for recent, unfinalized blocks. In the case that it isn't, it means that we are catching up to the head of the chain and needn't worry about assignments to those blocks anyway, as the security assumption of the protocol tolerates nodes being temporarily offline or out-of-date.
* We fetch the set of candidates included by each block by dispatching a `RuntimeApiRequest::CandidateEvents` and checking the `CandidateIncluded` events.
* We fetch the session of the block by dispatching a `session_index_for_child` request with the parent-hash of the block.
* If the `session index - APPROVAL_SESSIONS > state.earliest_session`, then bump `state.earliest_sessions` to that amount and prune earlier sessions.
* If the session isn't in our `state.session_info`, load the session info for it and for all sessions since the earliest-session, including the earliest-session, if that is missing. And it can be, just after pruning, if we've done a big jump forward, as is the case when we've just finished chain synchronization.
* We use the `RuntimeApiSubsystem` to determine information about these blocks. It is generally safe to assume that
runtime state is available for recent, unfinalized blocks. In the case that it isn't, it means that we are catching
up to the head of the chain and needn't worry about assignments to those blocks anyway, as the security assumption
of the protocol tolerates nodes being temporarily offline or out-of-date.
* We fetch the set of candidates included by each block by dispatching a `RuntimeApiRequest::CandidateEvents` and
checking the `CandidateIncluded` events.
* We fetch the session of the block by dispatching a `session_index_for_child` request with the parent-hash of the
block.
* If the `session index - APPROVAL_SESSIONS > state.earliest_session`, then bump `state.earliest_session` to that
amount and prune earlier sessions.
* If the session isn't in our `state.session_info`, load the session info for it and for all sessions since the
earliest-session, including the earliest-session, if that is missing. The earliest session can be missing just after
pruning if we've made a big jump forward, as is the case when we've just finished chain synchronization.
* If any of the runtime API calls fail, we just warn and skip the block.
* We use the `RuntimeApiSubsystem` to determine the set of candidates included in these blocks and use BABE logic to determine the slot number and VRF of the blocks.
* We also note how late we appear to have received the block. We create a `BlockEntry` for each block and a `CandidateEntry` for each candidate obtained from `CandidateIncluded` events after making a `RuntimeApiRequest::CandidateEvents` request.
* For each candidate, if the amount of needed approvals is more than the validators remaining after the backing group of the candidate is subtracted, then the candidate is insta-approved as approval would be impossible otherwise. If all candidates in the block are insta-approved, or there are no candidates in the block, then the block is insta-approved. If the block is insta-approved, a [`ChainSelectionMessage::Approved`][CSM] should be sent for the block.
* Ensure that the `CandidateEntry` contains a `block_assignments` entry for the block, with the correct backing group set.
* We use the `RuntimeApiSubsystem` to determine the set of candidates included in these blocks and use BABE logic to
determine the slot number and VRF of the blocks.
* We also note how late we appear to have received the block. We create a `BlockEntry` for each block and a
`CandidateEntry` for each candidate obtained from `CandidateIncluded` events after making a
`RuntimeApiRequest::CandidateEvents` request.
* For each candidate, if the amount of needed approvals is more than the validators remaining after the backing group
of the candidate is subtracted, then the candidate is insta-approved as approval would be impossible otherwise. If
all candidates in the block are insta-approved, or there are no candidates in the block, then the block is
insta-approved. If the block is insta-approved, a [`ChainSelectionMessage::Approved`][CSM] should be sent for the
block.
* Ensure that the `CandidateEntry` contains a `block_assignments` entry for the block, with the correct backing group
set.
* If we are a validator in this session, compute and assign `our_assignment` for the `block_assignments`
* Only if not a member of the backing group.
* Run `RelayVRFModulo` and `RelayVRFDelay` according to the [the approvals protocol section](../../protocol-approval.md#assignment-criteria). Ensure that the assigned core derived from the output is covered by the auxiliary signature aggregated in the `VRFPRoof`.
* [Handle Wakeup](#handle-wakeup) for each new candidate in each new block - this will automatically broadcast a 0-tranche assignment, kick off approval work, and schedule the next delay.
* Run `RelayVRFModulo` and `RelayVRFDelay` according to [the approvals protocol
section](../../protocol-approval.md#assignment-criteria). Ensure that the assigned core derived from the output is
covered by the auxiliary signature aggregated in the `VRFProof`.
* [Handle Wakeup](#handle-wakeup) for each new candidate in each new block - this will automatically broadcast a
0-tranche assignment, kick off approval work, and schedule the next delay.
* Dispatch an `ApprovalDistributionMessage::NewBlocks` with the meta information filled out for each new block.
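
The insta-approval rule in the list above reduces to a small comparison. The sketch below is illustrative only: the
function names and the plain `usize` parameters are assumptions, not the subsystem's actual types.

```rust
/// A candidate is insta-approved when there are not enough validators left
/// outside its backing group to ever reach the needed number of approvals.
fn candidate_insta_approved(
    needed_approvals: usize,
    n_validators: usize,
    backing_group_size: usize,
) -> bool {
    needed_approvals > n_validators.saturating_sub(backing_group_size)
}

/// A block is insta-approved when every candidate in it is insta-approved.
/// `all` returns true for an empty slice, which covers the case of a block
/// with no candidates.
fn block_insta_approved(
    needed_approvals: usize,
    n_validators: usize,
    backing_group_sizes: &[usize],
) -> bool {
    backing_group_sizes
        .iter()
        .all(|&group| candidate_insta_approved(needed_approvals, n_validators, group))
}
```

With 32 validators, a backing group of 5, and 30 needed approvals, only 27 validators remain, so the candidate is
insta-approved.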
#### `ApprovalVotingMessage::CheckAndImportAssignment`
On receiving a `ApprovalVotingMessage::CheckAndImportAssignment` message, we check the assignment cert against the block entry. The cert itself contains information necessary to determine the candidate that is being assigned-to. In detail:
* Load the `BlockEntry` for the relay-parent referenced by the message. If there is none, return `AssignmentCheckResult::Bad`.
On receiving an `ApprovalVotingMessage::CheckAndImportAssignment` message, we check the assignment cert against the block
entry. The cert itself contains information necessary to determine the candidate that is being assigned-to. In detail:
* Load the `BlockEntry` for the relay-parent referenced by the message. If there is none, return
`AssignmentCheckResult::Bad`.
* Fetch the `SessionInfo` for the session of the block
* Determine the assignment key of the validator based on that.
* Determine the claimed core index by looking up the candidate with given index in `block_entry.candidates`. Return `AssignmentCheckResult::Bad` if missing.
* Determine the claimed core index by looking up the candidate with given index in `block_entry.candidates`. Return
`AssignmentCheckResult::Bad` if missing.
* Check the assignment cert
* If the cert kind is `RelayVRFModulo`, then the certificate is valid as long as `sample < session_info.relay_vrf_samples` and the VRF is valid for the validator's key with the input `block_entry.relay_vrf_story ++ sample.encode()` as described with [the approvals protocol section](../../protocol-approval.md#assignment-criteria). We set `core_index = vrf.make_bytes().to_u32() % session_info.n_cores`. If the `BlockEntry` causes inclusion of a candidate at `core_index`, then this is a valid assignment for the candidate at `core_index` and has delay tranche 0. Otherwise, it can be ignored.
* If the cert kind is `RelayVRFDelay`, then we check if the VRF is valid for the validator's key with the input `block_entry.relay_vrf_story ++ cert.core_index.encode()` as described in [the approvals protocol section](../../protocol-approval.md#assignment-criteria). The cert can be ignored if the block did not cause inclusion of a candidate on that core index. Otherwise, this is a valid assignment for the included candidate. The delay tranche for the assignment is determined by reducing `(vrf.make_bytes().to_u64() % (session_info.n_delay_tranches + session_info.zeroth_delay_tranche_width)).saturating_sub(session_info.zeroth_delay_tranche_width)`.
* We also check that the core index derived by the output is covered by the `VRFProof` by means of an auxiliary signature.
* If the cert kind is `RelayVRFModulo`, then the certificate is valid as long as `sample <
session_info.relay_vrf_samples` and the VRF is valid for the validator's key with the input
`block_entry.relay_vrf_story ++ sample.encode()` as described in [the approvals protocol
section](../../protocol-approval.md#assignment-criteria). We set `core_index = vrf.make_bytes().to_u32() %
session_info.n_cores`. If the `BlockEntry` causes inclusion of a candidate at `core_index`, then this is a valid
assignment for the candidate at `core_index` and has delay tranche 0. Otherwise, it can be ignored.
* If the cert kind is `RelayVRFDelay`, then we check if the VRF is valid for the validator's key with the input
`block_entry.relay_vrf_story ++ cert.core_index.encode()` as described in [the approvals protocol
section](../../protocol-approval.md#assignment-criteria). The cert can be ignored if the block did not cause
inclusion of a candidate on that core index. Otherwise, this is a valid assignment for the included candidate. The
delay tranche for the assignment is determined by reducing `(vrf.make_bytes().to_u64() %
(session_info.n_delay_tranches +
session_info.zeroth_delay_tranche_width)).saturating_sub(session_info.zeroth_delay_tranche_width)`.
* We also check that the core index derived by the output is covered by the `VRFProof` by means of an auxiliary
signature.
* If the delay tranche is too far in the future, return `AssignmentCheckResult::TooFarInFuture`.
* Import the assignment.
* Load the candidate in question and access the `approval_entry` for the block hash the cert references.
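
The two reductions used in the certificate checks above can be written out as plain arithmetic. In this sketch the VRF
output is assumed to have already been converted to an integer (the `vrf.make_bytes().to_u32()` and `to_u64()` steps),
and the function names are hypothetical.

```rust
/// `RelayVRFModulo`: map the VRF output onto a core index.
/// Panics if `n_cores` is zero.
fn relay_vrf_modulo_core(vrf_output: u32, n_cores: u32) -> u32 {
    vrf_output % n_cores
}

/// `RelayVRFDelay`: map the VRF output onto a delay tranche.
/// Outputs that fall inside the zeroth tranche width collapse to tranche 0,
/// which makes tranche 0 proportionally wider than the others.
/// Panics if both `n_delay_tranches` and the width are zero.
fn relay_vrf_delay_tranche(
    vrf_output: u64,
    n_delay_tranches: u32,
    zeroth_delay_tranche_width: u32,
) -> u32 {
    let width = u64::from(zeroth_delay_tranche_width);
    let span = u64::from(n_delay_tranches) + width;
    (vrf_output % span).saturating_sub(width) as u32
}
```

With 40 delay tranches and a zeroth-tranche width of 10, outputs that reduce to 0 through 10 all land in tranche 0,
while an output that reduces to 25 lands in tranche 15.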
@@ -217,32 +292,41 @@ On receiving a `ApprovalVotingMessage::CheckAndImportAssignment` message, we che
On receiving a `CheckAndImportApproval(indirect_approval_vote, response_channel)` message:
* Fetch the `BlockEntry` from the indirect approval vote's `block_hash`. If none, return `ApprovalCheckResult::Bad`.
* Fetch the `CandidateEntry` from the indirect approval vote's `candidate_index`. If the block did not trigger inclusion of enough candidates, return `ApprovalCheckResult::Bad`.
* Construct a `SignedApprovalVote` using the candidate hash and check against the validator's approval key, based on the session info of the block. If invalid or no such validator, return `ApprovalCheckResult::Bad`.
* Fetch the `CandidateEntry` from the indirect approval vote's `candidate_index`. If the block did not trigger
inclusion of enough candidates, return `ApprovalCheckResult::Bad`.
* Construct a `SignedApprovalVote` using the candidate hash and check against the validator's approval key, based on
the session info of the block. If invalid or no such validator, return `ApprovalCheckResult::Bad`.
* Send `ApprovalCheckResult::Accepted`
* [Import the checked approval vote](#import-checked-approval)
#### `ApprovalVotingMessage::ApprovedAncestor`
On receiving an `ApprovedAncestor(Hash, BlockNumber, response_channel)`:
* Iterate over the ancestry of the hash all the way back to block number given, starting from the provided block hash. Load the `CandidateHash`es from each block entry.
* Iterate over the ancestry of the hash, starting from the provided block hash, all the way back to the given block
number. Load the `CandidateHash`es from each block entry.
* Keep track of an `all_approved_max: Option<(Hash, BlockNumber, Vec<(Hash, Vec<CandidateHash>)>)>`.
* For each block hash encountered, load the `BlockEntry` associated. If any are not found, return `None` on the response channel and conclude.
* If the block entry's `approval_bitfield` has all bits set to 1 and `all_approved_max == None`, set `all_approved_max = Some((current_hash, current_number))`.
* For each block hash encountered, load the `BlockEntry` associated. If any are not found, return `None` on the
response channel and conclude.
* If the block entry's `approval_bitfield` has all bits set to 1 and `all_approved_max == None`, set `all_approved_max
= Some((current_hash, current_number, Vec::new()))`.
* If the block entry's `approval_bitfield` has any 0 bits, set `all_approved_max = None`.
* If `all_approved_max` is `Some`, push the current block hash and candidate hashes onto the list of blocks and candidates `all_approved_max`.
* If `all_approved_max` is `Some`, push the current block hash and candidate hashes onto the list of blocks and
candidates in `all_approved_max`.
* After iterating all ancestry, return `all_approved_max`.
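
A minimal sketch of this ancestry walk is shown below. It assumes integer stand-ins for `Hash` and `CandidateHash`, a
reduced `BlockEntry`, and a single `all_approved` flag in place of checking that every bit of the block's bitfield is
set; none of these are the subsystem's real types.

```rust
use std::collections::HashMap;

type Hash = u64; // hypothetical stand-in
type BlockNumber = u32;
type CandidateHash = u64; // hypothetical stand-in

struct BlockEntry {
    parent: Hash,
    number: BlockNumber,
    candidates: Vec<CandidateHash>,
    all_approved: bool, // stands in for "all bits of the bitfield are 1"
}

/// Walks from `target` down to (but excluding) `lower_bound` and returns the
/// highest block from which everything down to the lower bound is approved.
fn approved_ancestor(
    blocks: &HashMap<Hash, BlockEntry>,
    target: Hash,
    lower_bound: BlockNumber,
) -> Option<(Hash, BlockNumber, Vec<(Hash, Vec<CandidateHash>)>)> {
    let mut all_approved_max = None;
    let mut current = target;
    loop {
        // A missing entry aborts the whole query with `None`.
        let entry = blocks.get(&current)?;
        if entry.number <= lower_bound {
            break;
        }
        if !entry.all_approved {
            // An unapproved block invalidates everything seen above it.
            all_approved_max = None;
        } else if all_approved_max.is_none() {
            all_approved_max = Some((current, entry.number, Vec::new()));
        }
        if let Some((_, _, chain)) = all_approved_max.as_mut() {
            chain.push((current, entry.candidates.clone()));
        }
        if entry.number == lower_bound + 1 {
            break;
        }
        current = entry.parent;
    }
    all_approved_max
}
```

Because the walk runs from the target downwards, resetting to `None` on an unapproved block leaves only the approved
run that is contiguous with the lower bound.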
### Updates and Auxiliary Logic
#### Import Checked Approval
* Import an approval vote which we can assume to have passed signature checks and correspond to an imported assignment.
* Import an approval vote which we can assume to have passed signature checks and correspond to an imported
assignment.
* Requires `(BlockEntry, CandidateEntry, ValidatorIndex)`
* Set the corresponding bit of the `approvals` bitfield in the `CandidateEntry` to `1`. If already `1`, return.
* Checks the approval state of a candidate under a specific block, and updates the block and candidate entries accordingly.
* Checks the approval state of a candidate under a specific block, and updates the block and candidate entries
accordingly.
* Checks the `ApprovalEntry` for the block.
* [Determine the tranches to inspect](#determine-required-tranches) of the candidate.
* [the candidate is approved under the block](#check-approval), set the corresponding bit in the `block_entry.approved_bitfield`.
* If [the candidate is approved under the block](#check-approval), set the corresponding bit in the
`block_entry.approved_bitfield`.
* If the block is now fully approved and was not before, send a [`ChainSelectionMessage::Approved`][CSM].
* Otherwise, [schedule a wakeup of the candidate](#schedule-wakeup)
* If the approval vote originates locally, set the `our_approval_sig` in the candidate entry.
@@ -250,13 +334,19 @@ On receiving an `ApprovedAncestor(Hash, BlockNumber, response_channel)`:
#### Handling Wakeup
* Handle a previously-scheduled wakeup of a candidate under a specific block.
* Requires `(relay_block, candidate_hash)`
* Load the `BlockEntry` and `CandidateEntry` from disk. If either is not present, this may have lost a race with finality and can be ignored. Also load the `ApprovalEntry` for the block and candidate.
* Load the `BlockEntry` and `CandidateEntry` from disk. If either is not present, this may have lost a race with
finality and can be ignored. Also load the `ApprovalEntry` for the block and candidate.
* [Determine the `RequiredTranches` of the candidate](#determine-required-tranches).
* Determine if we should trigger our assignment.
* If we've already triggered or `OurAssignment` is `None`, we do not trigger.
* If we have `RequiredTranches::All`, then we trigger if the candidate is [not approved](#check-approval). We have no next wakeup as we assume that other validators are doing the same and we will be implicitly woken up by handling new votes.
* If we have `RequiredTranches::Pending { considered, next_no_show, uncovered, maximum_broadcast, clock_drift }`, then we trigger if our assignment's tranche is less than or equal to `maximum_broadcast` and the current tick, with `clock_drift` applied, is at least the tick of our tranche.
* If we have `RequiredTranches::Exact { .. }` then we do not trigger, because this value indicates that no new assignments are needed at the moment.
* If we have `RequiredTranches::All`, then we trigger if the candidate is [not approved](#check-approval). We have
no next wakeup as we assume that other validators are doing the same and we will be implicitly woken up by
handling new votes.
* If we have `RequiredTranches::Pending { considered, next_no_show, uncovered, maximum_broadcast, clock_drift }`,
then we trigger if our assignment's tranche is less than or equal to `maximum_broadcast` and the current tick,
with `clock_drift` applied, is at least the tick of our tranche.
* If we have `RequiredTranches::Exact { .. }` then we do not trigger, because this value indicates that no new
assignments are needed at the moment.
* If we should trigger our assignment
* Import the assignment to the `ApprovalEntry`
* Broadcast on network with an `ApprovalDistributionMessage::DistributeAssignment`.
@@ -265,26 +355,39 @@ On receiving an `ApprovedAncestor(Hash, BlockNumber, response_channel)`:
#### Schedule Wakeup
* Requires `(approval_entry, candidate_entry)` which effectively denotes a `(Block Hash, Candidate Hash)` pair - the candidate, along with the block it appears in.
* Requires `(approval_entry, candidate_entry)` which effectively denotes a `(Block Hash, Candidate Hash)` pair - the
candidate, along with the block it appears in.
* Also requires `RequiredTranches`
* If the `approval_entry` is approved, this doesn't need to be woken up again.
* If `RequiredTranches::All` - no wakeup. We assume other incoming votes will trigger wakeup and potentially re-schedule.
* If `RequiredTranches::Pending { considered, next_no_show, uncovered, maximum_broadcast, clock_drift }` - schedule at the lesser of the next no-show tick, or the tick, offset positively by `clock_drift` of the next non-empty tranche we are aware of after `considered`, including any tranche containing our own unbroadcast assignment. This can lead to no wakeup in the case that we have already broadcast our assignment and there are no pending no-shows; that is, we have approval votes for every assignment we've received that is not already a no-show. In this case, we will be re-triggered by other validators broadcasting their assignments.
* If `RequiredTranches::Exact { next_no_show, latest_assignment_tick, .. }` - set a wakeup for the earlier of the next no-show tick or the latest assignment tick + `APPROVAL_DELAY`.
* If `RequiredTranches::All` - no wakeup. We assume other incoming votes will trigger wakeup and potentially
re-schedule.
* If `RequiredTranches::Pending { considered, next_no_show, uncovered, maximum_broadcast, clock_drift }` - schedule at
the lesser of the next no-show tick, or the tick, offset positively by `clock_drift`, of the next non-empty tranche
we are aware of after `considered`, including any tranche containing our own unbroadcast assignment. This can lead
to no wakeup in the case that we have already broadcast our assignment and there are no pending no-shows; that is,
we have approval votes for every assignment we've received that is not already a no-show. In this case, we will be
re-triggered by other validators broadcasting their assignments.
* If `RequiredTranches::Exact { next_no_show, latest_assignment_tick, .. }` - set a wakeup for the earlier of the next
no-show tick or the latest assignment tick + `APPROVAL_DELAY`.
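
The scheduling rules above can be condensed into one function that returns the tick to wake up at, or `None` when no
wakeup is needed. This is a sketch under several assumptions: `RequiredTranches` is reduced to the fields the rules
use, the optional ticks are modelled as `Option<Tick>`, the tick of the next non-empty tranche is passed in
precomputed, and the value of `APPROVAL_DELAY` is a placeholder.

```rust
type Tick = u64;

/// Placeholder value for illustration; the real constant is defined elsewhere.
const APPROVAL_DELAY: Tick = 2;

/// Reduced form of `RequiredTranches`; fields not needed for scheduling are omitted.
enum RequiredTranches {
    All,
    Pending { next_no_show: Option<Tick>, clock_drift: Tick },
    Exact { next_no_show: Option<Tick>, latest_assignment_tick: Option<Tick> },
}

/// The earlier of two optional ticks, treating `None` as "no such tick".
fn min_opt(a: Option<Tick>, b: Option<Tick>) -> Option<Tick> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x.min(y)),
        (x, None) => x,
        (None, y) => y,
    }
}

/// `next_tranche_tick` is the tick of the next non-empty tranche after
/// `considered`, including one holding our own unbroadcast assignment.
fn next_wakeup(
    approved: bool,
    required: &RequiredTranches,
    next_tranche_tick: Option<Tick>,
) -> Option<Tick> {
    if approved {
        return None;
    }
    match required {
        RequiredTranches::All => None,
        RequiredTranches::Pending { next_no_show, clock_drift } => {
            min_opt(*next_no_show, next_tranche_tick.map(|t| t + *clock_drift))
        }
        RequiredTranches::Exact { next_no_show, latest_assignment_tick } => {
            min_opt(*next_no_show, latest_assignment_tick.map(|t| t + APPROVAL_DELAY))
        }
    }
}
```

The `Pending` case returning `None` corresponds to the situation described above, where our assignment has been
broadcast and no no-shows are pending.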
#### Launch Approval Work
* Requires `(SessionIndex, SessionInfo, CandidateReceipt, ValidatorIndex, backing_group, block_hash, candidate_index)`
* Extract the public key of the `ValidatorIndex` from the `SessionInfo` for the session.
* Issue an `AvailabilityRecoveryMessage::RecoverAvailableData(candidate, session_index, Some(backing_group), response_sender)`
* Load the historical validation code of the parachain by dispatching a `RuntimeApiRequest::ValidationCodeByHash(descriptor.validation_code_hash)` against the state of `block_hash`.
* Issue an `AvailabilityRecoveryMessage::RecoverAvailableData(candidate, session_index, Some(backing_group),
response_sender)`
* Load the historical validation code of the parachain by dispatching a
`RuntimeApiRequest::ValidationCodeByHash(descriptor.validation_code_hash)` against the state of `block_hash`.
* Spawn a background task with a clone of `background_tx`
* Wait for the available data
* Issue a `CandidateValidationMessage::ValidateFromExhaustive` message with `APPROVAL_EXECUTION_TIMEOUT` as the timeout parameter.
* Issue a `CandidateValidationMessage::ValidateFromExhaustive` message with `APPROVAL_EXECUTION_TIMEOUT` as the
timeout parameter.
* Wait for the result of validation
* Check that the result of validation, if valid, matches the commitments in the receipt.
* If valid, issue a message on `background_tx` detailing the request.
* If any of the data, the candidate, or the commitments are invalid, issue on `background_tx` a [`DisputeCoordinatorMessage::IssueLocalStatement`](../../types/overseer-protocol.md#dispute-coordinator-message) with `valid = false` to initiate a dispute.
* If any of the data, the candidate, or the commitments are invalid, issue on `background_tx` a
[`DisputeCoordinatorMessage::IssueLocalStatement`](../../types/overseer-protocol.md#dispute-coordinator-message)
with `valid = false` to initiate a dispute.
#### Issue Approval Vote
* Fetch the block entry and candidate entry. Ignore if `None` - we've probably just lost a race with finality.
@@ -297,14 +400,22 @@ On receiving an `ApprovedAncestor(Hash, BlockNumber, response_channel)`:
#### Determine Required Tranches
This logic is for inspecting an approval entry that tracks the assignments received, along with information on which assignments have corresponding approval votes. Inspection also involves the current time and expected requirements and is used to help the higher-level code determine the following:
This logic is for inspecting an approval entry that tracks the assignments received, along with information on which
assignments have corresponding approval votes. Inspection also involves the current time and expected requirements and
is used to help the higher-level code determine the following:
* Whether to broadcast the local assignment
* Whether to check that the candidate entry has been completely approved.
* If the candidate is waiting on approval, when to schedule the next wakeup of the `(candidate, block)` pair at a point where the state machine could be advanced.
* If the candidate is waiting on approval, when to schedule the next wakeup of the `(candidate, block)` pair at a
point where the state machine could be advanced.
These routines are pure functions which only depend on the environmental state. The expectation is that this determination is re-run every time we attempt to update an approval entry: either when we trigger a wakeup to advance the state machine based on a no-show or our own broadcast, or when we receive further assignments or approvals from the network.
These routines are pure functions which only depend on the environmental state. The expectation is that this
determination is re-run every time we attempt to update an approval entry: either when we trigger a wakeup to advance
the state machine based on a no-show or our own broadcast, or when we receive further assignments or approvals from the
network.
Thus it may be that at some point in time, we consider that tranches 0..X are required to be considered, but as we receive more information, we might require fewer tranches. Or votes that we perceived to be missing and require replacement are filled in and change our view.
Thus it may be that at some point in time, we consider that tranches 0..X are required to be considered, but as we
receive more information, we might require fewer tranches. Or votes that we perceived to be missing and require
replacement are filled in and change our view.
Requires `(approval_entry, approvals_received, tranche_now, block_tick, no_show_duration, needed_approvals)`
@@ -327,7 +438,8 @@ enum RequiredTranches {
/// as though it is `clock_drift` ticks earlier.
clock_drift: Tick,
},
// An exact number of required tranches and a number of no-shows. This indicates that the amount of `needed_approvals` are assigned and additionally all no-shows are covered.
// An exact number of required tranches and a number of no-shows. This indicates that the amount of `needed_approvals`
// are assigned and additionally all no-shows are covered.
Exact {
/// The tranche to inspect up to.
needed: DelayTranche,
@@ -345,13 +457,21 @@ enum RequiredTranches {
**Clock-drift and Tranche-taking**
Our vote-counting procedure depends heavily on how we interpret time based on the presence of no-shows - assignments which have no corresponding approval after some time.
Our vote-counting procedure depends heavily on how we interpret time based on the presence of no-shows - assignments
which have no corresponding approval after some time.
This is because of how we handle no-shows: we keep track of the depth of no-shows we are covering.
As an example: there may be initial no-shows in tranche 0. It'll take `no_show_duration` ticks before those are considered no-shows. Then, we don't want to immediately take `no_show_duration` more tranches. Instead, we want to take one tranche for each uncovered no-show. However, as we take those tranches, there may be further no-shows. Since these depth-1 no-shows should have only been triggered after the depth-0 no-shows were already known to be no-shows, we need to discount the local clock by `no_show_duration` to see whether these should be considered no-shows or not. There may be malicious parties who broadcast their assignment earlier than they were meant to, who shouldn't be counted as instant no-shows. We continue onwards to cover all depth-1 no-shows which may lead to depth-2 no-shows and so on.
As an example: there may be initial no-shows in tranche 0. It'll take `no_show_duration` ticks before those are
considered no-shows. Then, we don't want to immediately take `no_show_duration` more tranches. Instead, we want to take
one tranche for each uncovered no-show. However, as we take those tranches, there may be further no-shows. Since these
depth-1 no-shows should have only been triggered after the depth-0 no-shows were already known to be no-shows, we need
to discount the local clock by `no_show_duration` to see whether these should be considered no-shows or not. There may
be malicious parties who broadcast their assignment earlier than they were meant to, who shouldn't be counted as instant
no-shows. We continue onwards to cover all depth-1 no-shows which may lead to depth-2 no-shows and so on.
Likewise, when considering how many tranches to take, the no-show depth should be used to apply a depth-discount or clock drift to the `tranche_now`.
Likewise, when considering how many tranches to take, the no-show depth should be used to apply a depth-discount or
clock drift to the `tranche_now`.
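As a rough illustration of the depth discount described above, the following sketch decides whether an assignment counts as a no-show at a given depth. The function names and the rule that the drift equals `depth * no_show_duration` are assumptions for the example; the text only states that the local clock is discounted by `no_show_duration` per level of depth.

```rust
type Tick = u64;

/// Assumed drift rule: one `no_show_duration` per level of no-show depth.
fn clock_drift(depth: u32, no_show_duration: Tick) -> Tick {
    u64::from(depth) * no_show_duration
}

/// An assignment is a no-show if it has no approval and `no_show_duration`
/// ticks have passed on the discounted clock (`tick_now - drift`).
fn is_no_show(
    assignment_tick: Tick,
    approved: bool,
    tick_now: Tick,
    drift: Tick,
    no_show_duration: Tick,
) -> bool {
    // Written additively to avoid underflow when `tick_now < drift`.
    !approved && assignment_tick + no_show_duration + drift <= tick_now
}
```

With `no_show_duration = 10`, an unapproved depth-1 assignment broadcast early at tick 5 is not treated as a no-show at tick 20, because the discounted clock only reads 10 at that point.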
**Procedure**
@@ -360,21 +480,35 @@ Likewise, when considering how many tranches to take, the no-show depth should b
* Take tranches up to `tranche_now - clock_drift` until all needed assignments are met.
* Keep track of the `next_no_show` according to the clock drift, as we go.
* Keep track of the `last_assignment_tick` as we go.
* If running out of tranches before then, return `Pending { considered, next_no_show, maximum_broadcast, clock_drift }`
* If running out of tranches before then, return `Pending { considered, next_no_show, maximum_broadcast, clock_drift
}`
* If there are no no-shows, return `Exact { needed, tolerated_missing, next_no_show, last_assignment_tick }`
* `maximum_broadcast` is either `DelayTranche::max_value()` at tranche 0 or otherwise the last considered tranche + the number of uncovered no-shows at this point.
* If there are no-shows, return to the beginning, incrementing `depth` and attempting to cover the number of no-shows. Each no-show must be covered by a non-empty tranche, that is, a tranche that has at least one assignment. Each non-empty tranche covers exactly one no-show.
* If at any point, it seems that all validators are required, do an early return with `RequiredTranches::All` which indicates that everyone should broadcast.
* `maximum_broadcast` is either `DelayTranche::max_value()` at tranche 0 or otherwise the last considered tranche +
the number of uncovered no-shows at this point.
* If there are no-shows, return to the beginning, incrementing `depth` and attempting to cover the number of no-shows.
Each no-show must be covered by a non-empty tranche, that is, a tranche that has at least one assignment. Each
non-empty tranche covers exactly one no-show.
* If at any point, it seems that all validators are required, do an early return with `RequiredTranches::All` which
indicates that everyone should broadcast.
#### Check Approval
* Check whether a candidate is approved under a particular block.
* Requires `(block_entry, candidate_entry, approval_entry, n_tranches)`
* If we have `3 * n_approvals > n_validators`, return true. This is because any set with f+1 validators must have at least one honest validator, who has approved the candidate.
* If we have `3 * n_approvals > n_validators`, return true. This is because any set with f+1 validators must have at
least one honest validator, who has approved the candidate.
* If `n_tranches` is `RequiredTranches::Pending`, return false
* If `n_tranches` is `RequiredTranches::All`, return false.
* If `n_tranches` is `RequiredTranches::Exact { tranche, tolerated_missing, latest_assignment_tick, .. }`, then we return whether all assigned validators up to `tranche` less `tolerated_missing` have approved and `latest_assignment_tick + APPROVAL_DELAY <= tick_now`.
* e.g. if we had 5 tranches and 1 tolerated missing, we would accept only if all but 1 of assigned validators in tranches 0..=5 have approved. In that example, we also accept all validators in tranches 0..=5 having approved, but that would indicate that the `RequiredTranches` value was incorrectly constructed, so it is not realistic. `tolerated_missing` actually represents covered no-shows. If there are more missing approvals than there are tolerated missing, that indicates that there are some assignments which are not yet no-shows, but may become no-shows, and we should wait for the validators to either approve or become no-shows.
* e.g. If the above passes and the `latest_assignment_tick` was 5 and the current tick was 6, then we'd return false.
* If `n_tranches` is `RequiredTranches::Exact { tranche, tolerated_missing, latest_assignment_tick, .. }`, then we
return whether all assigned validators up to `tranche` less `tolerated_missing` have approved and
`latest_assignment_tick + APPROVAL_DELAY <= tick_now`.
* e.g. if we had 5 tranches and 1 tolerated missing, we would accept only if all but 1 of assigned validators in
tranches 0..=5 have approved. In that example, we also accept all validators in tranches 0..=5 having approved,
but that would indicate that the `RequiredTranches` value was incorrectly constructed, so it is not realistic.
`tolerated_missing` actually represents covered no-shows. If there are more missing approvals than there are
tolerated missing, that indicates that there are some assignments which are not yet no-shows, but may become
no-shows, and we should wait for the validators to either approve or become no-shows.
* e.g. If the above passes and the `latest_assignment_tick` was 5 and the current tick was 6, then we'd return
false.
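The check above can be sketched as follows. This is a simplified model, not the subsystem's code: assignments are flattened into `(tranche, has_approved)` pairs, the `Exact` variant uses the `needed` field name from the enum definition, and the value of `APPROVAL_DELAY` is a placeholder. The delay comparison is written so that it agrees with the second example (latest assignment tick 5, current tick 6, result `false`).

```rust
type Tick = u64;
type DelayTranche = u32;

/// Hypothetical value for illustration only.
const APPROVAL_DELAY: Tick = 2;

/// Simplified stand-in for the `RequiredTranches` enum.
enum RequiredTranches {
    All,
    Pending,
    Exact { needed: DelayTranche, tolerated_missing: usize, latest_assignment_tick: Option<Tick> },
}

/// `assignments` holds one `(tranche, has_approved)` pair per assigned validator.
fn check_approval(
    n_validators: usize,
    assignments: &[(DelayTranche, bool)],
    required: &RequiredTranches,
    tick_now: Tick,
) -> bool {
    // More than one third approving means at least one honest validator approved.
    let n_approvals = assignments.iter().filter(|(_, approved)| *approved).count();
    if 3 * n_approvals > n_validators {
        return true;
    }
    match required {
        RequiredTranches::All | RequiredTranches::Pending => false,
        RequiredTranches::Exact { needed, tolerated_missing, latest_assignment_tick } => {
            // Count assigned and approving validators in tranches `0..=needed`.
            let (assigned, approved) = assignments
                .iter()
                .filter(|(tranche, _)| tranche <= needed)
                .fold((0usize, 0usize), |(a, ok), (_, approved)| {
                    (a + 1, ok + usize::from(*approved))
                });
            // The approval delay must have elapsed since the latest assignment.
            let delay_passed =
                (*latest_assignment_tick).map_or(true, |t| t + APPROVAL_DELAY <= tick_now);
            approved + *tolerated_missing >= assigned && delay_passed
        }
    }
}
```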
### Time
@@ -1,3 +1,7 @@
# Availability Subsystems
The availability subsystems are responsible for ensuring that Proofs of Validity of backed candidates are widely available within the validator set, without requiring every node to retain a full copy. They accomplish this by broadly distributing erasure-coded chunks of the PoV, keeping track of which validator has which chunk by means of signed bitfields. They are also responsible for reassembling a complete PoV when required, e.g. when an approval checker needs to validate a parachain block.
The availability subsystems are responsible for ensuring that Proofs of Validity of backed candidates are widely
available within the validator set, without requiring every node to retain a full copy. They accomplish this by broadly
distributing erasure-coded chunks of the PoV, keeping track of which validator has which chunk by means of signed
bitfields. They are also responsible for reassembling a complete PoV when required, e.g. when an approval checker needs
to validate a parachain block.
@@ -1,31 +1,26 @@
# Availability Distribution
This subsystem is responsible for distributing availability data to peers.
Availability data are chunks, `PoV`s and `AvailableData` (which is `PoV` +
`PersistedValidationData`). It does so via request response protocols.
This subsystem is responsible for distributing availability data to peers. Availability data are chunks, `PoV`s and
`AvailableData` (which is `PoV` + `PersistedValidationData`). It does so via request response protocols.
In particular this subsystem is responsible for:
- Respond to network requests requesting availability data by querying the
[Availability Store](../utility/availability-store.md).
- Request chunks from backing validators to put them in the local `Availability
Store` whenever we find an occupied core on any fresh leaf,
this is to ensure availability by at least 2/3+ of all validators, this
happens after a candidate is backed.
- Fetch `PoV` from validators, when requested via `FetchPoV` message from
backing (`pov_requester` module).
- Respond to network requests requesting availability data by querying the [Availability
Store](../utility/availability-store.md).
- Request chunks from backing validators to put them in the local `Availability Store` whenever we find an occupied core
on any fresh leaf, this is to ensure availability by at least 2/3+ of all validators, this happens after a candidate
is backed.
- Fetch `PoV` from validators, when requested via `FetchPoV` message from backing (`pov_requester` module).
The backing subsystem is responsible for making available data available in the
local `Availability Store` upon validation. This subsystem will serve any
network requests by querying that store.
The backing subsystem is responsible for making available data available in the local `Availability Store` upon
validation. This subsystem will serve any network requests by querying that store.
## Protocol
This subsystem does not handle any peer set messages, but the `pov_requester`
does connect to validators of the same backing group on the validation peer
set, to ensure fast propagation of statements between those validators and for
ensuring already established connections for requesting `PoV`s. Other than that
this subsystem drives request/response protocols.
This subsystem does not handle any peer set messages, but the `pov_requester` does connect to validators of the same
backing group on the validation peer set, to ensure fast propagation of statements between those validators and for
ensuring already established connections for requesting `PoV`s. Other than that this subsystem drives request/response
protocols.
Input:
@@ -48,51 +43,42 @@ Output:
### PoV Requester
The PoV requester in the `pov_requester` module takes care of staying connected
to validators of the current backing group of this very validator on the `Validation`
peer set and it will handle `FetchPoV` requests by issuing network requests to
those validators. It will check the hash of the received `PoV`, but will not do any
further validation. That needs to be done by the original `FetchPoV` sender
(backing subsystem).
The PoV requester in the `pov_requester` module takes care of staying connected to validators of the current backing
group of this very validator on the `Validation` peer set and it will handle `FetchPoV` requests by issuing network
requests to those validators. It will check the hash of the received `PoV`, but will not do any further validation. That
needs to be done by the original `FetchPoV` sender (backing subsystem).
### Chunk Requester
After a candidate is backed, the availability of the PoV block must be confirmed
by 2/3+ of all validators. The chunk requester is responsible for making that
availability a reality.
After a candidate is backed, the availability of the PoV block must be confirmed by 2/3+ of all validators. The chunk
requester is responsible for making that availability a reality.
It does that by checking occupied cores for all active leaves. For each
occupied core it will spawn a task fetching the erasure chunk which has the
`ValidatorIndex` of the node. For this a `ChunkFetchingRequest` is issued, via
substrate's generic request/response protocol.
It does that by checking occupied cores for all active leaves. For each occupied core it will spawn a task
fetching the erasure chunk which has the `ValidatorIndex` of the node. For this a `ChunkFetchingRequest` is issued, via
Substrate's generic request/response protocol.
The spawned task will start trying to fetch the chunk from validators in
the responsible group of the occupied core, in a random order. For ensuring that we
use already open TCP connections wherever possible, the requester maintains a
cache and preserves that random order for the entire session.
The spawned task will start trying to fetch the chunk from validators in the responsible group of the occupied core, in
a random order. For ensuring that we use already open TCP connections wherever possible, the requester maintains a
cache and preserves that random order for the entire session.
Note however that, because not all validators in a group have to be actual
backers, not all of them are required to have the needed chunk. This in turn
could lead to low throughput, as we have to wait for fetches to fail,
before reaching a validator finally having our chunk. We do rank back validators
not delivering our chunk, but as backers could vary from block to block on a
perfectly legitimate basis, this is still not ideal. See issues [2509](https://github.com/paritytech/polkadot/issues/2509) and [2512](https://github.com/paritytech/polkadot/issues/2512)
for more information.
Note however that, because not all validators in a group have to be actual backers, not all of them are required to have
the needed chunk. This in turn could lead to low throughput, as we have to wait for fetches to fail, before reaching a
validator finally having our chunk. We do rank back validators not delivering our chunk, but as backers could vary from
block to block on a perfectly legitimate basis, this is still not ideal. See issues
[2509](https://github.com/paritytech/polkadot/issues/2509) and
[2512](https://github.com/paritytech/polkadot/issues/2512) for more information.
The current implementation also only fetches chunks for occupied cores in blocks
in active leaves. This means though, if active leaves skip a block or we are
particularly slow in fetching our chunk, we might not fetch our chunk if
availability reached 2/3 fast enough (slot becomes free). This is not desirable
as we would like as many validators as possible to have their chunk. See this
[issue](https://github.com/paritytech/polkadot/issues/2513) for more details.
The current implementation also only fetches chunks for occupied cores in blocks in active leaves. This means though, if
active leaves skip a block or we are particularly slow in fetching our chunk, we might not fetch our chunk if
availability reached 2/3 fast enough (slot becomes free). This is not desirable as we would like as many validators as
possible to have their chunk. See this [issue](https://github.com/paritytech/polkadot/issues/2513) for more details.
### Serving
On the other side the subsystem will listen for incoming `ChunkFetchingRequest`s
and `PoVFetchingRequest`s from the network bridge and will respond to queries,
by looking the requested chunks and `PoV`s up in the availability store, this
happens in the `responder` module.
On the other side the subsystem will listen for incoming `ChunkFetchingRequest`s and `PoVFetchingRequest`s from the
network bridge and will respond to queries, by looking the requested chunks and `PoV`s up in the availability store,
this happens in the `responder` module.
We rely on the backing subsystem to make available data available locally in the
`Availability Store` after it has validated it.
We rely on the backing subsystem to make available data available locally in the `Availability Store` after it has
validated it.
@@ -1,8 +1,13 @@
# Availability Recovery
This subsystem is the inverse of the [Availability Distribution](availability-distribution.md) subsystem: validators will serve the availability chunks kept in the availability store to nodes who connect to them. And the subsystem will also implement the other side: the logic for nodes to connect to validators, request availability pieces, and reconstruct the `AvailableData`.
This subsystem is the inverse of the [Availability Distribution](availability-distribution.md) subsystem: validators
will serve the availability chunks kept in the availability store to nodes who connect to them. And the subsystem will
also implement the other side: the logic for nodes to connect to validators, request availability pieces, and
reconstruct the `AvailableData`.
This version of the availability recovery subsystem is based off of direct connections to validators. In order to recover any given `AvailableData`, we must recover at least `f + 1` pieces from validators of the session. Thus, we will connect to and query randomly chosen validators until we have received `f + 1` pieces.
This version of the availability recovery subsystem is based off of direct connections to validators. In order to
recover any given `AvailableData`, we must recover at least `f + 1` pieces from validators of the session. Thus, we will
connect to and query randomly chosen validators until we have received `f + 1` pieces.
## Protocol
@@ -10,18 +15,20 @@ This version of the availability recovery subsystem is based off of direct conne
Input:
- `NetworkBridgeUpdate(update)`
- `AvailabilityRecoveryMessage::RecoverAvailableData(candidate, session, backing_group, response)`
* `NetworkBridgeUpdate(update)`
* `AvailabilityRecoveryMessage::RecoverAvailableData(candidate, session, backing_group, response)`
Output:
- `NetworkBridge::SendValidationMessage`
- `NetworkBridge::ReportPeer`
- `AvailabilityStore::QueryChunk`
* `NetworkBridge::SendValidationMessage`
* `NetworkBridge::ReportPeer`
* `AvailabilityStore::QueryChunk`
## Functionality
We hold a state which tracks the currently ongoing recovery tasks, as well as which request IDs correspond to which task. A recovery task is a structure encapsulating all recovery tasks with the network necessary to recover the available data in respect to one candidate.
We hold a state which tracks the currently ongoing recovery tasks, as well as which request IDs correspond to which
task. A recovery task is a structure encapsulating all recovery tasks with the network necessary to recover the
available data in respect to one candidate.
```rust
struct State {
@@ -87,17 +94,22 @@ On `Conclude`, shut down the subsystem.
1. Check the `availability_lru` for the candidate and return the data if it is present.
1. Check if there is already a recovery handle for the request. If so, add the response handle to it.
1. Otherwise, load the session info for the given session under the state of `live_block_hash`, and initiate a recovery task with *`launch_recovery_task`*. Add a recovery handle to the state and add the response channel to it.
1. Otherwise, load the session info for the given session under the state of `live_block_hash`, and initiate a recovery
task with *`launch_recovery_task`*. Add a recovery handle to the state and add the response channel to it.
1. If the session info is not available, return `RecoveryError::Unavailable` on the response channel.
### Recovery logic
#### `launch_recovery_task(session_index, session_info, candidate_receipt, candidate_hash, Option<backing_group_index>)`
1. Compute the threshold from the session info. It should be `f + 1`, where `n = 3f + k`, where `k in {1, 2, 3}`, and `n` is the number of validators.
1. Set the various fields of `RecoveryParams` based on the validator lists in `session_info` and information about the candidate.
1. If the `backing_group_index` is `Some`, start in the `RequestFromBackers` phase with a shuffling of the backing group validator indices and a `None` requesting value.
1. Otherwise, start in the `RequestChunksFromValidators` source with `received_chunks`, `requesting_chunks`, and `next_shuffling` all empty.
1. Compute the threshold from the session info. It should be `f + 1`, where `n = 3f + k`, where `k in {1, 2, 3}`, and
`n` is the number of validators.
1. Set the various fields of `RecoveryParams` based on the validator lists in `session_info` and information about the
candidate.
1. If the `backing_group_index` is `Some`, start in the `RequestFromBackers` phase with a shuffling of the backing group
validator indices and a `None` requesting value.
1. Otherwise, start in the `RequestChunksFromValidators` source with `received_chunks`, `requesting_chunks`, and
`next_shuffling` all empty.
1. Set the `to_subsystems` sender to be equal to a clone of the `SubsystemContext`'s sender.
1. Initialize `received_chunks` to an empty set, as well as `requesting_chunks`.
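The threshold in the first step can be worked out directly from the validator count. A minimal sketch, assuming a standalone helper named `recovery_threshold` (a name chosen for this example) that returns `None` for an empty validator set:

```rust
/// Computes `f + 1`, where `n = 3f + k` with `k` in `{1, 2, 3}`.
/// Solving for `f` gives `f = (n - 1) / 3` using integer division.
fn recovery_threshold(n_validators: usize) -> Option<usize> {
    if n_validators == 0 {
        // No decomposition `n = 3f + k` with `k >= 1` exists for `n = 0`.
        return None;
    }
    let f = (n_validators - 1) / 3;
    Some(f + 1)
}
```

For example, 10 validators decompose as `3 * 3 + 1`, so `f = 3` and four pieces are needed; 6 validators decompose as `3 * 1 + 3`, so two pieces are needed.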
@@ -115,19 +127,24 @@ const N_PARALLEL: usize = 50;
* Loop:
* If the `requesting_pov` is `Some`, poll for updates on it. If it concludes, set `requesting_pov` to `None`.
* If the `requesting_pov` is `None`, take the next backer off the `shuffled_backers`.
* If the backer is `Some`, issue a `NetworkBridgeMessage::Requests` with a network request for the `AvailableData` and wait for the response.
* If the backer is `Some`, issue a `NetworkBridgeMessage::Requests` with a network request for the
`AvailableData` and wait for the response.
* If it concludes with a `None` result, return to beginning.
* If it concludes with available data, attempt a re-encoding.
* If it has the correct erasure-root, break and issue a `Ok(available_data)`.
* If it has an incorrect erasure-root, return to beginning.
* Send the result to each member of `awaiting`.
* If the backer is `None`, set the source to `RequestChunksFromValidators` with a random shuffling of validators and empty `received_chunks`, and `requesting_chunks` and break the loop.
* If the backer is `None`, set the source to `RequestChunksFromValidators` with a random shuffling of validators
and empty `received_chunks`, and `requesting_chunks` and break the loop.
* If the task contains `RequestChunksFromValidators`:
* Request `AvailabilityStoreMessage::QueryAllChunks`. For each chunk that exists, add it to `received_chunks` and remove the validator from `shuffling`.
* Request `AvailabilityStoreMessage::QueryAllChunks`. For each chunk that exists, add it to `received_chunks` and
remove the validator from `shuffling`.
* Loop:
* If `received_chunks + requesting_chunks + shuffling` lengths are less than the threshold, break and return `Err(Unavailable)`.
* Poll for new updates from `requesting_chunks`. Check merkle proofs of any received chunks. If the request simply fails due to network issues, insert into the front of `shuffling` to be retried.
* If `received_chunks + requesting_chunks + shuffling` lengths are less than the threshold, break and return
`Err(Unavailable)`.
* Poll for new updates from `requesting_chunks`. Check merkle proofs of any received chunks. If the request simply
fails due to network issues, insert into the front of `shuffling` to be retried.
* If `received_chunks` has at least `threshold` entries, attempt to recover the data.
* If that fails, return `Err(RecoveryError::Invalid)`
* If correct:
@@ -135,5 +152,6 @@ const N_PARALLEL: usize = 50;
* break and issue `Ok(available_data)`
* Send the result to each member of `awaiting`.
* While there are fewer than `N_PARALLEL` entries in `requesting_chunks`,
* Pop the next item from `shuffling`. If it's empty and `requesting_chunks` is empty, return `Err(RecoveryError::Unavailable)`.
* Pop the next item from `shuffling`. If it's empty and `requesting_chunks` is empty, return
`Err(RecoveryError::Unavailable)`.
* Issue a `NetworkBridgeMessage::Requests` and wait for the response in `requesting_chunks`.
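One iteration of the chunk-requesting bookkeeping can be sketched synchronously. This is an illustrative model only: the real task is asynchronous and tracks in-flight requests rather than counts, and the function name `next_requests`, the trimmed `RecoveryError`, and the use of plain `u32` validator indices are assumptions. `N_PARALLEL` is taken from the constant shown in the hunk header above.

```rust
use std::collections::VecDeque;

const N_PARALLEL: usize = 50;

#[derive(Debug, PartialEq)]
enum RecoveryError {
    Unavailable,
}

/// Decides which validators to query next.
/// `received` and `requesting` are the current sizes of `received_chunks`
/// and `requesting_chunks`; `shuffling` holds validators not yet queried.
fn next_requests(
    received: usize,
    requesting: usize,
    shuffling: &mut VecDeque<u32>,
    threshold: usize,
) -> Result<Vec<u32>, RecoveryError> {
    // Even if every outstanding and future request succeeds, we cannot
    // reach the threshold: give up early.
    if received + requesting + shuffling.len() < threshold {
        return Err(RecoveryError::Unavailable);
    }
    // Top up the in-flight requests to at most `N_PARALLEL`.
    let mut launched = Vec::new();
    while requesting + launched.len() < N_PARALLEL {
        match shuffling.pop_front() {
            Some(validator) => launched.push(validator),
            None => break,
        }
    }
    Ok(launched)
}
```

A request that fails for network reasons would be pushed back with `shuffling.push_front(validator)` so that it is retried first, matching the "insert into the front of `shuffling`" step.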
@@ -1,34 +1,40 @@
# Bitfield Distribution
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based on a 2/3+ quorum.
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a
single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based
on a 2/3+ quorum.
## Protocol
`PeerSet`: `Validation`
Input:
[`BitfieldDistributionMessage`](../../types/overseer-protocol.md#bitfield-distribution-message) which are gossiped to all peers, no matter if validator or not.
Input: [`BitfieldDistributionMessage`](../../types/overseer-protocol.md#bitfield-distribution-message) which are
gossiped to all peers, no matter if validator or not.
Output:
- `NetworkBridge::SendValidationMessage([PeerId], message)` gossip a verified incoming bitfield on to interested subsystems within this validator node.
- `NetworkBridge::ReportPeer(PeerId, cost_or_benefit)` improve or penalize the reputation of peers based on the messages that are received relative to the current view.
- `ProvisionerMessage::ProvisionableData(ProvisionableData::Bitfield(relay_parent, SignedAvailabilityBitfield))` pass
on the bitfield to the other submodules via the overseer.
- `NetworkBridge::SendValidationMessage([PeerId], message)` gossip a verified incoming bitfield on to interested
subsystems within this validator node.
- `NetworkBridge::ReportPeer(PeerId, cost_or_benefit)` improve or penalize the reputation of peers based on the messages
that are received relative to the current view.
- `ProvisionerMessage::ProvisionableData(ProvisionableData::Bitfield(relay_parent, SignedAvailabilityBitfield))` pass on
the bitfield to the other submodules via the overseer.
## Functionality
This is implemented as a gossip system.
It is necessary to track peer connection, view change, and disconnection events, in order to maintain an index of which peers are interested in which relay parent bitfields.
It is necessary to track peer connection, view change, and disconnection events, in order to maintain an index of which
peers are interested in which relay parent bitfields.
Before gossiping incoming bitfields, they must be checked to be signed by one of the validators
of the validator set relevant to the current relay parent.
Only accept bitfields relevant to our current view and only distribute bitfields to other peers when relevant to their most recent view.
Accept and distribute only one bitfield per validator.
Before gossiping incoming bitfields, they must be checked to be signed by one of the validators of the validator set
relevant to the current relay parent. Only accept bitfields relevant to our current view and only distribute bitfields
to other peers when relevant to their most recent view. Accept and distribute only one bitfield per validator.
When receiving a bitfield either from the network or from a `DistributeBitfield` message, forward it along to the block authorship (provisioning) subsystem for potential inclusion in a block.
When receiving a bitfield either from the network or from a `DistributeBitfield` message, forward it along to the block
authorship (provisioning) subsystem for potential inclusion in a block.
Valid bitfield gossip messages must be cached so that they can be sent to peers that connect or re-connect after the messages were received.
Valid bitfield gossip messages must be cached so that they can be sent to peers that connect or re-connect after those
messages were received.
@@ -1,12 +1,15 @@
# Bitfield Signing
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based on a 2/3+ quorum.
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a
single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based
on a 2/3+ quorum.
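As a rough illustration of how such bitfields could be tallied, the sketch below counts the set bits per candidate slot. It assumes that "2/3+" means strictly more than two thirds of all validators; the function name `available_candidates` and the `Vec<bool>` representation are hypothetical.

```rust
// Returns, for each bit position, whether strictly more than 2/3 of
// `n_validators` set that bit (assumed reading of "2/3+").
fn available_candidates(bitfields: &[Vec<bool>], n_validators: usize) -> Vec<bool> {
    let n_bits = bitfields.iter().map(|b| b.len()).max().unwrap_or(0);
    (0..n_bits)
        .map(|bit| {
            let votes = bitfields
                .iter()
                .filter(|b| b.get(bit).copied().unwrap_or(false))
                .count();
            // Integer form of votes / n_validators > 2/3.
            3 * votes > 2 * n_validators
        })
        .collect()
}
```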
## Protocol
Input:
There is no dedicated input mechanism for bitfield signing. Instead, Bitfield Signing produces a bitfield representing the current state of availability on `StartWork`.
There is no dedicated input mechanism for bitfield signing. Instead, Bitfield Signing produces a bitfield representing
the current state of availability on `StartWork`.
Output:
@@ -15,15 +18,20 @@ Output:
## Functionality
Upon receipt of an `ActiveLeavesUpdate`, launch bitfield signing job for each `activated` head referring to a fresh leaf. Stop the job for each `deactivated` head.
Upon receipt of an `ActiveLeavesUpdate`, launch bitfield signing job for each `activated` head referring to a fresh
leaf. Stop the job for each `deactivated` head.
## Bitfield Signing Job
Localized to a specific relay-parent `r`
If not running as a validator, do nothing.
Localized to a specific relay-parent `r`. If not running as a validator, do nothing.
- For each fresh leaf, begin by waiting a fixed period of time so availability distribution has the chance to make candidates available.
- Determine our validator index `i`, the set of backed candidates pending availability in `r`, and which bit of the bitfield each corresponds to.
- Start with an empty bitfield. For each bit in the bitfield, if there is a candidate pending availability, query the [Availability Store](../utility/availability-store.md) for whether we have the availability chunk for our validator index. The `OccupiedCore` struct contains the candidate hash so the full candidate does not need to be fetched from runtime.
- For each fresh leaf, begin by waiting a fixed period of time so availability distribution has the chance to make
candidates available.
- Determine our validator index `i`, the set of backed candidates pending availability in `r`, and which bit of the
bitfield each corresponds to.
- Start with an empty bitfield. For each bit in the bitfield, if there is a candidate pending availability, query the
[Availability Store](../utility/availability-store.md) for whether we have the availability chunk for our validator
index. The `OccupiedCore` struct contains the candidate hash so the full candidate does not need to be fetched from
runtime.
- For all chunks we have, set the corresponding bit in the bitfield.
- Sign the bitfield and dispatch a `BitfieldDistribution::DistributeBitfield` message.
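The bitfield construction steps above can be sketched as follows. This is a simplified model, not the subsystem's code: `CoreState` here is a hypothetical two-variant enum, and the `have_chunk` closure stands in for the query to the Availability Store. Waiting, signing and dispatch are left out.

```rust
type CandidateHash = [u8; 32];

// Hypothetical, reduced view of a core: either free or occupied by a
// candidate that is pending availability.
#[derive(Clone, Debug)]
enum CoreState {
    Free,
    Occupied { candidate_hash: CandidateHash },
}

// One bit per core, set only if the core is occupied and we hold our chunk
// for the occupying candidate.
fn construct_availability_bitfield(
    cores: &[CoreState],
    have_chunk: impl Fn(&CandidateHash) -> bool,
) -> Vec<bool> {
    cores
        .iter()
        .map(|core| match core {
            CoreState::Occupied { candidate_hash } => have_chunk(candidate_hash),
            CoreState::Free => false,
        })
        .collect()
}
```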
@@ -1,10 +1,15 @@
# Backing Subsystems
The backing subsystems, when conceived as a black box, receive an arbitrary quantity of parablock candidates and associated proofs of validity from arbitrary untrusted collators. From these, they produce a bounded quantity of backable candidates which relay chain block authors may choose to include in a subsequent block.
The backing subsystems, when conceived as a black box, receive an arbitrary quantity of parablock candidates and
associated proofs of validity from arbitrary untrusted collators. From these, they produce a bounded quantity of
backable candidates which relay chain block authors may choose to include in a subsequent block.
In broad strokes, the flow operates like this:
- **Candidate Selection** winnows the field of parablock candidates, selecting up to one of them to second.
- **Candidate Backing** ensures that a seconding candidate is valid, then generates the appropriate `Statement`. It also keeps track of which candidates have received the backing of a quorum of other validators.
- **Statement Distribution** is the networking component which ensures that all validators receive each others' statements.
- **PoV Distribution** is the networking component which ensures that validators considering a candidate can get the appropriate PoV.
- **Candidate Backing** ensures that a seconding candidate is valid, then generates the appropriate `Statement`. It also
keeps track of which candidates have received the backing of a quorum of other validators.
- **Statement Distribution** is the networking component which ensures that all validators receive each other's
  statements.
- **PoV Distribution** is the networking component which ensures that validators considering a candidate can get the
appropriate PoV.
@@ -1,12 +1,20 @@
# Candidate Backing
The Candidate Backing subsystem ensures every parablock considered for relay block inclusion has been seconded by at least one validator, and approved by a quorum. Parablocks for which not enough validators will assert correctness are discarded. If the block later proves invalid, the initial backers are slashable; this gives polkadot a rational threat model during subsequent stages.
The Candidate Backing subsystem ensures every parablock considered for relay block inclusion has been seconded by at
least one validator, and approved by a quorum. Parablocks for which not enough validators will assert correctness are
discarded. If the block later proves invalid, the initial backers are slashable; this gives Polkadot a rational threat
model during subsequent stages.
Its role is to produce backable candidates for inclusion in new relay-chain blocks. It does so by issuing signed [`Statement`s][Statement] and tracking received statements signed by other validators. Once enough statements are received, they can be combined into backing for specific candidates.
Its role is to produce backable candidates for inclusion in new relay-chain blocks. It does so by issuing signed
[`Statement`s][Statement] and tracking received statements signed by other validators. Once enough statements are
received, they can be combined into backing for specific candidates.
Note that though the candidate backing subsystem attempts to produce as many backable candidates as possible, it does _not_ attempt to choose a single authoritative one. The choice of which actually gets included is ultimately up to the block author, by whatever metrics it may use; those are opaque to this subsystem.
Note that though the candidate backing subsystem attempts to produce as many backable candidates as possible, it does
_not_ attempt to choose a single authoritative one. The choice of which actually gets included is ultimately up to the
block author, by whatever metrics it may use; those are opaque to this subsystem.
Once a sufficient quorum has agreed that a candidate is valid, this subsystem notifies the [Provisioner][PV], which in turn engages block production mechanisms to include the parablock.
Once a sufficient quorum has agreed that a candidate is valid, this subsystem notifies the [Provisioner][PV], which in
turn engages block production mechanisms to include the parablock.
## Protocol
@@ -14,33 +22,49 @@ Input: [`CandidateBackingMessage`][CBM]
Output:
- [`CandidateValidationMessage`][CVM]
- [`RuntimeApiMessage`][RAM]
- [`CollatorProtocolMessage`][CPM]
- [`ProvisionerMessage`][PM]
- [`AvailabilityDistributionMessage`][ADM]
- [`StatementDistributionMessage`][SDM]
* [`CandidateValidationMessage`][CVM]
* [`RuntimeApiMessage`][RAM]
* [`CollatorProtocolMessage`][CPM]
* [`ProvisionerMessage`][PM]
* [`AvailabilityDistributionMessage`][ADM]
* [`StatementDistributionMessage`][SDM]
## Functionality
The [Collator Protocol][CP] subsystem is the primary source of non-overseer messages into this subsystem. That subsystem generates appropriate [`CandidateBackingMessage`s][CBM] and passes them to this subsystem.
The [Collator Protocol][CP] subsystem is the primary source of non-overseer messages into this subsystem. That subsystem
generates appropriate [`CandidateBackingMessage`s][CBM] and passes them to this subsystem.
This subsystem requests validation from the [Candidate Validation][CV] and generates an appropriate [`Statement`][Statement]. All `Statement`s are then passed on to the [Statement Distribution][SD] subsystem to be gossiped to peers. When [Candidate Validation][CV] decides that a candidate is invalid, and it was recommended to us to second by our own [Collator Protocol][CP] subsystem, a message is sent to the [Collator Protocol][CP] subsystem with the candidate's hash so that the collator which recommended it can be penalized.
This subsystem requests validation from the [Candidate Validation][CV] and generates an appropriate
[`Statement`][Statement]. All `Statement`s are then passed on to the [Statement Distribution][SD] subsystem to be
gossiped to peers. When [Candidate Validation][CV] decides that a candidate is invalid, and it was recommended to us to
second by our own [Collator Protocol][CP] subsystem, a message is sent to the [Collator Protocol][CP] subsystem with the
candidate's hash so that the collator which recommended it can be penalized.
The subsystem should maintain a set of handles to Candidate Backing Jobs that are currently live, as well as the relay-parent to which they correspond.
The subsystem should maintain a set of handles to Candidate Backing Jobs that are currently live, as well as the
relay-parent to which they correspond.
### On Overseer Signal
* If the signal is an [`OverseerSignal`][OverseerSignal]`::ActiveLeavesUpdate`:
* spawn a Candidate Backing Job for each `activated` head referring to a fresh leaf, storing a bidirectional channel with the Candidate Backing Job in the set of handles.
* spawn a Candidate Backing Job for each `activated` head referring to a fresh leaf, storing a bidirectional channel
with the Candidate Backing Job in the set of handles.
* cease the Candidate Backing Job for each `deactivated` head, if any.
* If the signal is an [`OverseerSignal`][OverseerSignal]`::Conclude`: Forward conclude messages to all jobs, wait a small amount of time for them to join, and then exit.
* If the signal is an [`OverseerSignal`][OverseerSignal]`::Conclude`: Forward conclude messages to all jobs, wait a
small amount of time for them to join, and then exit.
### On Receiving `CandidateBackingMessage`
* If the message is a [`CandidateBackingMessage`][CBM]`::GetBackedCandidates`, get all backable candidates from the statement table and send them back.
* If the message is a [`CandidateBackingMessage`][CBM]`::Second`, sign and dispatch a `Seconded` statement only if we have not seconded any other candidate and have not signed a `Valid` statement for the requested candidate. Signing both a `Seconded` and `Valid` message is a double-voting misbehavior with a heavy penalty, and this could occur if another validator has seconded the same candidate and we've received their message before the internal seconding request.
* If the message is a [`CandidateBackingMessage`][CBM]`::Statement`, count the statement to the quorum. If the statement in the message is `Seconded` and it contains a candidate that belongs to our assignment, request the corresponding `PoV` from the backing node via `AvailabilityDistribution` and launch validation. Issue our own `Valid` or `Invalid` statement as a result.
* If the message is a [`CandidateBackingMessage`][CBM]`::GetBackedCandidates`, get all backable candidates from the
statement table and send them back.
* If the message is a [`CandidateBackingMessage`][CBM]`::Second`, sign and dispatch a `Seconded` statement only if we
have not seconded any other candidate and have not signed a `Valid` statement for the requested candidate. Signing
both a `Seconded` and `Valid` message is a double-voting misbehavior with a heavy penalty, and this could occur if
another validator has seconded the same candidate and we've received their message before the internal seconding
request.
* If the message is a [`CandidateBackingMessage`][CBM]`::Statement`, count the statement to the quorum. If the statement
in the message is `Seconded` and it contains a candidate that belongs to our assignment, request the corresponding
`PoV` from the backing node via `AvailabilityDistribution` and launch validation. Issue our own `Valid` or `Invalid`
statement as a result.
If the seconding node did not provide us with the `PoV` we will retry fetching from other backing validators.
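The guard on `Second` requests described above might look like the sketch below. `JobState`, `try_second` and `note_valid` are hypothetical names; actual signing and dispatch are omitted. Re-requesting a candidate that was already seconded is treated as a no-op here, which is an assumption.

```rust
use std::collections::HashSet;

type CandidateHash = [u8; 32];

#[derive(Default)]
struct JobState {
    // The single candidate we have seconded at this relay parent, if any.
    seconded: Option<CandidateHash>,
    // Candidates for which we already issued a `Valid` statement.
    issued_valid: HashSet<CandidateHash>,
}

impl JobState {
    // Returns true if a `Seconded` statement may be signed for `candidate`.
    fn try_second(&mut self, candidate: CandidateHash) -> bool {
        // Seconding after a `Valid` vote on the same candidate would be a
        // double vote, and we second at most one candidate.
        if self.seconded.is_some() || self.issued_valid.contains(&candidate) {
            return false;
        }
        self.seconded = Some(candidate);
        true
    }

    fn note_valid(&mut self, candidate: CandidateHash) {
        self.issued_valid.insert(candidate);
    }
}
```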
@@ -51,19 +75,25 @@ If the seconding node did not provide us with the `PoV` we will retry fetching f
> * Allow inclusion of _old_ parachain candidates validated by _current_ validators.
> * Allow inclusion of _old_ parachain candidates validated by _old_ validators.
>
> This will probably blur the lines between jobs, will probably require inter-job communication and a short-term memory of recently backable, but not backed candidates.
> This will probably blur the lines between jobs, will probably require inter-job communication and a short-term memory
> of recently backable, but not backed candidates.
## Candidate Backing Job
The Candidate Backing Job represents the work a node does for backing candidates with respect to a particular relay-parent.
The Candidate Backing Job represents the work a node does for backing candidates with respect to a particular
relay-parent.
The goal of a Candidate Backing Job is to produce as many backable candidates as possible. This is done via signed [`Statement`s][STMT] by validators. If a candidate receives a majority of supporting Statements from the Parachain Validators currently assigned, then that candidate is considered backable.
The goal of a Candidate Backing Job is to produce as many backable candidates as possible. This is done via signed
[`Statement`s][STMT] by validators. If a candidate receives a majority of supporting Statements from the Parachain
Validators currently assigned, then that candidate is considered backable.
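A minimal sketch of the backability check, assuming "majority" means strictly more than half of the assigned group. `StatementTable`, `import_supporting` and `is_backable` are illustrative names rather than the subsystem's actual API, and signature checks are assumed to have happened earlier.

```rust
use std::collections::{HashMap, HashSet};

type ValidatorIndex = u32;
type CandidateHash = [u8; 32];

struct StatementTable {
    // Validators currently assigned to the parachain.
    group: HashSet<ValidatorIndex>,
    // Distinct validators with a supporting (`Seconded` or `Valid`) statement.
    support: HashMap<CandidateHash, HashSet<ValidatorIndex>>,
}

impl StatementTable {
    fn new(group: impl IntoIterator<Item = ValidatorIndex>) -> Self {
        Self { group: group.into_iter().collect(), support: HashMap::new() }
    }

    // Records a supporting statement and reports whether the candidate is
    // backable afterwards. Statements from outside the group are ignored.
    fn import_supporting(&mut self, candidate: CandidateHash, validator: ValidatorIndex) -> bool {
        if self.group.contains(&validator) {
            self.support.entry(candidate).or_default().insert(validator);
        }
        self.is_backable(&candidate)
    }

    fn is_backable(&self, candidate: &CandidateHash) -> bool {
        let votes = self.support.get(candidate).map_or(0, |v| v.len());
        // Strict majority of the assigned group (assumed threshold).
        2 * votes > self.group.len()
    }
}
```

Storing validators in a set means a repeated statement from the same validator does not inflate the count.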
### On Startup
* Fetch current validator set, validator -> parachain assignments from [`Runtime API`][RA] subsystem using [`RuntimeApiRequest::Validators`][RAM] and [`RuntimeApiRequest::ValidatorGroups`][RAM]
* Fetch current validator set, validator -> parachain assignments from [`Runtime API`][RA] subsystem using
[`RuntimeApiRequest::Validators`][RAM] and [`RuntimeApiRequest::ValidatorGroups`][RAM]
* Determine if the node controls a key in the current validator set. Call this the local key if so.
* If the local key exists, extract the parachain head and validation function from the [`Runtime API`][RA] for the parachain the local key is assigned to by issuing a [`RuntimeApiRequest::Validators`][RAM]
* If the local key exists, extract the parachain head and validation function from the [`Runtime API`][RA] for the
parachain the local key is assigned to by issuing a [`RuntimeApiRequest::Validators`][RAM]
* Issue a [`RuntimeApiRequest::SigningContext`][RAM] message to get a context that will later be used upon signing.
### On Receiving New Candidate Backing Message
@@ -91,15 +121,17 @@ match msg {
}
```
Add `Seconded` statements and `Valid` statements to a quorum. If the quorum reaches a pre-defined threshold, send a [`ProvisionerMessage`][PM]`::ProvisionableData(ProvisionableData::BackedCandidate(CandidateReceipt))` message.
`Invalid` statements that conflict with already witnessed `Seconded` and `Valid` statements for the given candidate, statements that are double-votes, self-contradictions and so on, should result in issuing a [`ProvisionerMessage`][PM]`::MisbehaviorReport` message for each newly detected case of this kind.
Add `Seconded` statements and `Valid` statements to a quorum. If the quorum reaches a pre-defined threshold, send a
[`ProvisionerMessage`][PM]`::ProvisionableData(ProvisionableData::BackedCandidate(CandidateReceipt))` message. `Invalid`
statements that conflict with already witnessed `Seconded` and `Valid` statements for the given candidate, statements
that are double-votes, self-contradictions and so on, should result in issuing a
[`ProvisionerMessage`][PM]`::MisbehaviorReport` message for each newly detected case of this kind.
Backing does not need to concern itself with providing statements to the dispute
coordinator as the dispute coordinator scrapes them from chain. This way the
import is batched and contains only statements that actually made it on some
Backing does not need to concern itself with providing statements to the dispute coordinator as the dispute coordinator
scrapes them from chain. This way the import is batched and contains only statements that actually made it on some
chain.
### Validating Candidates.
### Validating Candidates
```rust
fn spawn_validation_work(candidate, parachain head, validation function) {
@@ -119,14 +151,16 @@ fn spawn_validation_work(candidate, parachain head, validation function) {
### Fetch PoV Block
Create a `(sender, receiver)` pair.
Dispatch a [`AvailabilityDistributionMessage`][ADM]`::FetchPoV{ validator_index, pov_hash, candidate_hash, tx, } and listen on the passed receiver for a response. Availability distribution will send the request to the validator specified by `validator_index`, which might not be serving it for whatever reasons, therefore we need to retry with other backing validators in that case.
Create a `(sender, receiver)` pair. Dispatch an [`AvailabilityDistributionMessage`][ADM]`::FetchPoV{ validator_index,
pov_hash, candidate_hash, tx, }` and listen on the passed receiver for a response. Availability distribution will send
the request to the validator specified by `validator_index`, which might not serve it for whatever reason; in that case
we need to retry with other backing validators.
### Validate PoV Block
Create a `(sender, receiver)` pair.
Dispatch a `CandidateValidationMessage::Validate(validation function, candidate, pov, BACKING_EXECUTION_TIMEOUT, sender)` and listen on the receiver for a response.
Create a `(sender, receiver)` pair. Dispatch a `CandidateValidationMessage::Validate(validation function, candidate,
pov, BACKING_EXECUTION_TIMEOUT, sender)` and listen on the receiver for a response.
### Distribute Signed Statement
@@ -1,18 +1,16 @@
# Statement Distribution (Legacy)
This describes the legacy, backwards-compatible version of the Statement
Distribution subsystem.
This describes the legacy, backwards-compatible version of the Statement Distribution subsystem.
**Note:** All the V1 (legacy) code was extracted out to a `legacy_v1` module of
the `statement-distribution` crate, which doesn't alter any logic. V2 (new
protocol) peers also run `legacy_v1` and communicate with V1 peers using V1
messages and with V2 peers using V2 messages. Once the runtime upgrade goes
through on all networks, this `legacy_v1` code will no longer be triggered and
will be vestigial and can be removed.
**Note:** All the V1 (legacy) code was extracted out to a `legacy_v1` module of the `statement-distribution` crate,
which doesn't alter any logic. V2 (new protocol) peers also run `legacy_v1` and communicate with V1 peers using V1
messages and with V2 peers using V2 messages. Once the runtime upgrade goes through on all networks, this `legacy_v1`
code will no longer be triggered and will be vestigial and can be removed.
## Overview
The Statement Distribution Subsystem is responsible for distributing statements about seconded candidates between validators.
The Statement Distribution Subsystem is responsible for distributing statements about seconded candidates between
validators.
## Protocol
@@ -31,89 +29,133 @@ Output:
## Functionality
Implemented as a gossip protocol. Handles updates to our view and peers' views. Neighbor packets are used to inform peers which chain heads we are interested in data for.
Implemented as a gossip protocol. Handles updates to our view and peers' views. Neighbor packets are used to inform
peers which chain heads we are interested in data for.
The Statement Distribution Subsystem is responsible for distributing signed statements that we have generated and for forwarding statements generated by other validators. It also detects a variety of Validator misbehaviors for reporting to the [Provisioner Subsystem](../utility/provisioner.md). During the Backing stage of the inclusion pipeline, Statement Distribution is the main point of contact with peer nodes. On receiving a signed statement from a peer in the same backing group, assuming the peer receipt state machine is in an appropriate state, it sends the Candidate Receipt to the [Candidate Backing subsystem](candidate-backing.md) to handle the validator's statement. On receiving `StatementDistributionMessage::Share` we make sure to send messages to our backing group in addition to random other peers, to ensure a fast backing process and getting all statements quickly for distribution.
The Statement Distribution Subsystem is responsible for distributing signed statements that we have generated and for
forwarding statements generated by other validators. It also detects a variety of Validator misbehaviors for reporting
to the [Provisioner Subsystem](../utility/provisioner.md). During the Backing stage of the inclusion pipeline, Statement
Distribution is the main point of contact with peer nodes. On receiving a signed statement from a peer in the same
backing group, assuming the peer receipt state machine is in an appropriate state, it sends the Candidate Receipt to the
[Candidate Backing subsystem](candidate-backing.md) to handle the validator's statement. On receiving
`StatementDistributionMessage::Share` we make sure to send messages to our backing group in addition to random other
peers, to ensure a fast backing process and getting all statements quickly for distribution.
This subsystem tracks equivocating validators and stops accepting information from them. It establishes a data-dependency order:
This subsystem tracks equivocating validators and stops accepting information from them. It establishes a
data-dependency order:
- In order to receive a `Seconded` message we must have the corresponding chain head in our view.
- In order to receive a `Valid` message we must have received the corresponding `Seconded` message.
And respect this data-dependency order from our peers by respecting their views. This subsystem is responsible for checking message signatures.
And respect this data-dependency order from our peers by respecting their views. This subsystem is responsible for
checking message signatures.
The Statement Distribution subsystem sends statements to peer nodes.
## Peer Receipt State Machine
There is a very simple state machine which governs which messages we are willing to receive from peers. Not depicted in the state machine: on initial receipt of any [`SignedFullStatement`](../../types/backing.md#signed-statement-type), validate that the provided signature does in fact sign the included data. Note that each individual parablock candidate gets its own instance of this state machine; it is perfectly legal to receive a `Valid(X)` before a `Seconded(Y)`, as long as a `Seconded(X)` has been received.
There is a very simple state machine which governs which messages we are willing to receive from peers. Not depicted in
the state machine: on initial receipt of any [`SignedFullStatement`](../../types/backing.md#signed-statement-type),
validate that the provided signature does in fact sign the included data. Note that each individual parablock candidate
gets its own instance of this state machine; it is perfectly legal to receive a `Valid(X)` before a `Seconded(Y)`, as
long as a `Seconded(X)` has been received.
A: Initial State. Receive `SignedFullStatement(Statement::Second)`: extract `Statement`, forward to Candidate Backing, proceed to B. Receive any other `SignedFullStatement` variant: drop it.
A: Initial State. Receive `SignedFullStatement(Statement::Second)`: extract `Statement`, forward to Candidate Backing,
proceed to B. Receive any other `SignedFullStatement` variant: drop it.
B: Receive any `SignedFullStatement`: check signature and determine whether the statement is new to us. if new, forward to Candidate Backing and circulate to other peers. Receive `OverseerMessage::StopWork`: proceed to C.
B: Receive any `SignedFullStatement`: check signature and determine whether the statement is new to us. If new, forward
to Candidate Backing and circulate to other peers. Receive `OverseerMessage::StopWork`: proceed to C.
C: Receive any message for this block: drop it.
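The three states above can be written down as a transition function. This is a sketch: `State`, `Event`, `Action` and `step` are hypothetical names, the signature check is assumed to have passed before `step` is called, and moving from A directly to C on `StopWork` is an assumption, since the description only names that transition for B.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum State {
    A,
    B,
    C,
}

#[derive(Clone, Copy, Debug)]
enum Event {
    // A signed statement; `is_new` means we have not seen it before.
    Statement { is_seconded: bool, is_new: bool },
    StopWork,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Action {
    Drop,
    ForwardToBacking,
    ForwardAndCirculate,
}

fn step(state: State, event: Event) -> (State, Action) {
    match (state, event) {
        // A: only a `Seconded` statement gets us out of the initial state.
        (State::A, Event::Statement { is_seconded: true, .. }) => {
            (State::B, Action::ForwardToBacking)
        }
        (State::A, Event::Statement { .. }) => (State::A, Action::Drop),
        // B: forward and circulate anything that is new to us.
        (State::B, Event::Statement { is_new: true, .. }) => {
            (State::B, Action::ForwardAndCirculate)
        }
        (State::B, Event::Statement { .. }) => (State::B, Action::Drop),
        // StopWork ends the instance (from A as well, by assumption).
        (_, Event::StopWork) => (State::C, Action::Drop),
        // C: everything for this block is dropped.
        (State::C, _) => (State::C, Action::Drop),
    }
}
```

One instance of this machine would be kept per parablock candidate, matching the note that `Valid(X)` may arrive before `Seconded(Y)`.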
For large statements (see below), we also keep track of the total received large
statements per peer and have a hard limit on that number for flood protection.
This is necessary as in the current code we only forward statements once we have
all the data, therefore flood protection for large statement is a bit more
subtle. This will become an obsolete problem once [off chain code
upgrades](https://github.com/paritytech/polkadot/issues/2979) are implemented.
For large statements (see below), we also keep track of the total received large statements per peer and have a hard
limit on that number for flood protection. This is necessary as in the current code we only forward statements once we
have all the data; therefore, flood protection for large statements is a bit more subtle. This will become an obsolete
problem once [off chain code upgrades](https://github.com/paritytech/polkadot/issues/2979) are implemented.
## Peer Knowledge Tracking
The peer receipt state machine implies that for parsimony of network resources, we should model the knowledge of our peers, and help them out. For example, let's consider a case with peers A, B, and C, validators X and Y, and candidate M. A sends us a `Statement::Second(M)` signed by X. We've double-checked it, and it's valid. While we're checking it, we receive a copy of X's `Statement::Second(M)` from `B`, along with a `Statement::Valid(M)` signed by Y.
The peer receipt state machine implies that for parsimony of network resources, we should model the knowledge of our
peers, and help them out. For example, let's consider a case with peers A, B, and C, validators X and Y, and candidate
M. A sends us a `Statement::Second(M)` signed by X. We've double-checked it, and it's valid. While we're checking it, we
receive a copy of X's `Statement::Second(M)` from `B`, along with a `Statement::Valid(M)` signed by Y.
Our response to A is just the `Statement::Valid(M)` signed by Y. However, we haven't heard anything about this from C. Therefore, we send it everything we have: first a copy of X's `Statement::Second`, then Y's `Statement::Valid`.
Our response to A is just the `Statement::Valid(M)` signed by Y. However, we haven't heard anything about this from C.
Therefore, we send it everything we have: first a copy of X's `Statement::Second`, then Y's `Statement::Valid`.
This system implies a certain level of duplication of messages--we received X's `Statement::Second` from both our peers, and C may experience the same--but it minimizes the degree to which messages are simply dropped.
This system implies a certain level of duplication of messages--we received X's `Statement::Second` from both our peers,
and C may experience the same--but it minimizes the degree to which messages are simply dropped.
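The example with peers A, B and C can be replayed with a small model of per-peer knowledge. The types below (`PeerKnowledge`, `Statement`, `PeerId`) are hypothetical and heavily simplified: statements carry single-character validator and candidate labels, and no signatures are involved.

```rust
use std::collections::{HashMap, HashSet};

type PeerId = &'static str;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Statement {
    Second { validator: char, candidate: char },
    Valid { validator: char, candidate: char },
}

#[derive(Default)]
struct PeerKnowledge {
    // Statements we hold, in the order we first received them.
    ours: Vec<Statement>,
    // What each peer is known to have: sent by them or sent by us.
    peers: HashMap<PeerId, HashSet<Statement>>,
}

impl PeerKnowledge {
    fn on_receive(&mut self, peer: PeerId, statement: Statement) {
        self.peers.entry(peer).or_default().insert(statement.clone());
        if !self.ours.contains(&statement) {
            self.ours.push(statement);
        }
    }

    // Statements the peer does not know yet, in receipt order so that a
    // `Second` precedes the `Valid` that depends on it. Marks them as known.
    fn take_pending(&mut self, peer: PeerId) -> Vec<Statement> {
        let known = self.peers.entry(peer).or_default();
        let pending: Vec<Statement> =
            self.ours.iter().filter(|s| !known.contains(*s)).cloned().collect();
        known.extend(pending.iter().cloned());
        pending
    }
}
```

With A having sent X's `Second` and B having sent both statements, `take_pending("A")` yields only Y's `Valid`, while `take_pending("C")` yields both statements in dependency order.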
And respect this data-dependency order from our peers. This subsystem is responsible for checking message signatures.
No jobs. We follow view changes from the [`NetworkBridge`](../utility/network-bridge.md), which in turn is updated by the overseer.
No jobs. We follow view changes from the [`NetworkBridge`](../utility/network-bridge.md), which in turn is updated by
the overseer.
## Equivocations and Flood Protection
An equivocation is a double-vote by a validator. The [Candidate Backing](candidate-backing.md) Subsystem is better-suited than this one to detect equivocations as it adds votes to quorum trackers.
An equivocation is a double-vote by a validator. The [Candidate Backing](candidate-backing.md) Subsystem is
better-suited than this one to detect equivocations as it adds votes to quorum trackers.
At this level, we are primarily concerned about flood-protection, and to some extent, detecting equivocations is a part of that. In particular, we are interested in detecting equivocations of `Seconded` statements. Since every other statement is dependent on `Seconded` statements, ensuring that we only ever hold a bounded number of `Seconded` statements is sufficient for flood-protection.
At this level, we are primarily concerned about flood-protection, and to some extent, detecting equivocations is a part
of that. In particular, we are interested in detecting equivocations of `Seconded` statements. Since every other
statement is dependent on `Seconded` statements, ensuring that we only ever hold a bounded number of `Seconded`
statements is sufficient for flood-protection.
The simple approach is to say that we only receive up to two `Seconded` statements per validator per chain head. However, the marginal cost of equivocation, conditional on having already equivocated, is close to 0, since a single double-vote offence is counted as all double-vote offences for a particular chain-head. Even if it were not, there is some amount of equivocations that can be done such that the marginal cost of issuing further equivocations is close to 0, as there would be an amount of equivocations necessary to be completely and totally obliterated by the slashing algorithm. We fear the validator with nothing left to lose.
The simple approach is to say that we only receive up to two `Seconded` statements per validator per chain head.
However, the marginal cost of equivocation, conditional on having already equivocated, is close to 0, since a single
double-vote offence is counted as all double-vote offences for a particular chain-head. Even if it were not, there is
some amount of equivocations that can be done such that the marginal cost of issuing further equivocations is close to
0, as there would be an amount of equivocations necessary to be completely and totally obliterated by the slashing
algorithm. We fear the validator with nothing left to lose.
With that in mind, this simple approach has a caveat worth digging deeper into.
First: We may be aware of two equivocated `Seconded` statements issued by a validator. A totally honest peer of ours can also be aware of one or two different `Seconded` statements issued by the same validator. And yet another peer may be aware of one or two _more_ `Seconded` statements. And so on. This interacts badly with pre-emptive sending logic. Upon sending a `Seconded` statement to a peer, we will want to pre-emptively follow up with all statements relative to that candidate. Waiting for acknowledgment introduces latency at every hop, so that is best avoided. What can happen is that upon receipt of the `Seconded` statement, the peer will discard it as it falls beyond the bound of 2 that it is allowed to store. It cannot store anything in memory about discarded candidates as that would introduce a DoS vector. Then, the peer would receive from us all of the statements pertaining to that candidate, which, from its perspective, would be undesired - they are data-dependent on the `Seconded` statement we sent them, but they have erased all record of that from their memory. Upon receiving a potential flood of undesired statements, this 100% honest peer may choose to disconnect from us. In this way, an adversary may be able to partition the network with careful distribution of equivocated `Seconded` statements.
First: We may be aware of two equivocated `Seconded` statements issued by a validator. A totally honest peer of ours can
also be aware of one or two different `Seconded` statements issued by the same validator. And yet another peer may be
aware of one or two _more_ `Seconded` statements. And so on. This interacts badly with pre-emptive sending logic. Upon
sending a `Seconded` statement to a peer, we will want to pre-emptively follow up with all statements relative to that
candidate. Waiting for acknowledgment introduces latency at every hop, so that is best avoided. What can happen is that
upon receipt of the `Seconded` statement, the peer will discard it as it falls beyond the bound of 2 that it is allowed
to store. It cannot store anything in memory about discarded candidates as that would introduce a DoS vector. Then, the
peer would receive from us all of the statements pertaining to that candidate, which, from its perspective, would be
undesired - they are data-dependent on the `Seconded` statement we sent them, but they have erased all record of that
from their memory. Upon receiving a potential flood of undesired statements, this 100% honest peer may choose to
disconnect from us. In this way, an adversary may be able to partition the network with careful distribution of
equivocated `Seconded` statements.
The fix is to track, per-peer, the hashes of up to 4 candidates per validator (per relay-parent) that the peer is aware of. It is 4 because we may send them 2 and they may send us 2 different ones. We track the data that they are aware of as the union of things we have sent them and things they have sent us. If we receive a 1st or 2nd `Seconded` statement from a peer, we note it in the peer's known candidates even if we do disregard the data locally. And then, upon receipt of any data dependent on that statement, we do not reduce that peer's standing in our eyes, as the data was not undesired.
The fix is to track, per-peer, the hashes of up to 4 candidates per validator (per relay-parent) that the peer is aware
of. It is 4 because we may send them 2 and they may send us 2 different ones. We track the data that they are aware of
as the union of things we have sent them and things they have sent us. If we receive a 1st or 2nd `Seconded` statement
from a peer, we note it in the peer's known candidates even if we do disregard the data locally. And then, upon receipt
of any data dependent on that statement, we do not reduce that peer's standing in our eyes, as the data was not
undesired.
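
The per-peer bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the subsystem's actual code: `PeerKnowledge`, `note_known`, `is_expected`, and the simplified `ValidatorIndex` and `CandidateHash` aliases are hypothetical names, and one instance is assumed to exist per peer and per relay-parent.

```rust
use std::collections::{HashMap, HashSet};

// Simplified stand-ins for the real types (assumptions for this sketch).
type ValidatorIndex = u32;
type CandidateHash = [u8; 32];

/// Up to 2 candidates we sent plus up to 2 different ones the peer sent us.
const MAX_KNOWN_PER_VALIDATOR: usize = 4;

/// What one peer is known to be aware of at one relay-parent (hypothetical type).
#[derive(Default)]
struct PeerKnowledge {
    known: HashMap<ValidatorIndex, HashSet<CandidateHash>>,
}

impl PeerKnowledge {
    /// Record that the peer is aware of `candidate`, seconded by `validator`.
    /// Called both when we send a `Seconded` statement and when we receive one,
    /// even if we disregard the data locally.
    /// Returns `false` if the bound is exhausted and nothing was recorded.
    fn note_known(&mut self, validator: ValidatorIndex, candidate: CandidateHash) -> bool {
        let set = self.known.entry(validator).or_default();
        if set.contains(&candidate) {
            return true;
        }
        if set.len() >= MAX_KNOWN_PER_VALIDATOR {
            return false;
        }
        set.insert(candidate);
        true
    }

    /// Data depending on a known candidate is not "undesired", so receiving it
    /// should not reduce the peer's standing.
    fn is_expected(&self, validator: ValidatorIndex, candidate: &CandidateHash) -> bool {
        self.known
            .get(&validator)
            .map_or(false, |set| set.contains(candidate))
    }
}
```

Keeping the set bounded at 4 entries per validator is what keeps the memory cost finite regardless of how many equivocations an adversary issues.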
There is another caveat to the fix: we don't want to allow the peer to flood us because it has set things up in a way that it knows we will drop all of its traffic.
We also track how many statements we have received per peer, per candidate, and per chain-head. This is any statement concerning a particular candidate: `Seconded`, `Valid`, or `Invalid`. If we ever receive a statement from a peer which would push any of these counters beyond twice the amount of validators at the chain-head, we begin to lower the peer's standing and eventually disconnect. This bound is a massive overestimate and could be reduced to twice the number of validators in the corresponding validator group. It is worth noting that the goal at the time of writing is to ensure any finite bound on the amount of stored data, as any equivocation results in a large slash.
There is another caveat to the fix: we don't want to allow the peer to flood us because it has set things up in a way
that it knows we will drop all of its traffic. We also track how many statements we have received per peer, per
candidate, and per chain-head. This is any statement concerning a particular candidate: `Seconded`, `Valid`, or
`Invalid`. If we ever receive a statement from a peer which would push any of these counters beyond twice the amount of
validators at the chain-head, we begin to lower the peer's standing and eventually disconnect. This bound is a massive
overestimate and could be reduced to twice the number of validators in the corresponding validator group. It is worth
noting that the goal at the time of writing is to ensure any finite bound on the amount of stored data, as any
equivocation results in a large slash.
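
The counter bound above can be sketched like this. It is an illustrative sketch under stated assumptions: `StatementCounter` and its methods are hypothetical names, `CandidateHash` is a simplified alias, and one counter instance is assumed per peer and per chain-head.

```rust
use std::collections::HashMap;

// Simplified stand-in for the real candidate hash type (assumption).
type CandidateHash = [u8; 32];

/// Counts statements received from one peer at one chain-head (hypothetical type).
struct StatementCounter {
    /// Twice the number of validators at the chain-head.
    limit: usize,
    received: HashMap<CandidateHash, usize>,
}

impl StatementCounter {
    fn new(n_validators: usize) -> Self {
        Self { limit: 2 * n_validators, received: HashMap::new() }
    }

    /// Count one `Seconded`, `Valid`, or `Invalid` statement about `candidate`.
    /// Returns `false` if it would push the counter beyond the limit, in which
    /// case the caller would begin lowering the peer's standing.
    fn note_received(&mut self, candidate: CandidateHash) -> bool {
        let count = self.received.entry(candidate).or_insert(0);
        if *count >= self.limit {
            return false;
        }
        *count += 1;
        true
    }
}
```

Tightening the bound to twice the size of the validator group, as the text suggests, would only change the value passed to `new`.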
## Large statements
Seconded statements can become quite large on parachain runtime upgrades for
example. For this reason, there exists a `LargeStatement` constructor for the
`StatementDistributionMessage` wire message, which only contains light metadata
of a statement. The actual candidate data is not included. This message type is
used whenever a message is deemed large. The receiver of such a message needs to
request the actual payload via request/response by means of a
Seconded statements can become quite large on parachain runtime upgrades for example. For this reason, there exists a
`LargeStatement` constructor for the `StatementDistributionMessage` wire message, which only contains light metadata of
a statement. The actual candidate data is not included. This message type is used whenever a message is deemed large.
The receiver of such a message needs to request the actual payload via request/response by means of a
`StatementFetchingV1` request.
This is necessary as distribution of a large payload (megabytes) via gossip
would make the network collapse and timely distribution of statements would no
longer be possible. By using request/response it is ensured that each peer only
transfers large data once. We only need to take care to detect an overloaded
peer early and immediately move on to a different peer for fetching the data.
This mechanism should result in a good load distribution and therefore a rather
This is necessary as distribution of a large payload (megabytes) via gossip would make the network collapse and timely
distribution of statements would no longer be possible. By using request/response it is ensured that each peer only
transfers large data once. We only need to take care to detect an overloaded peer early and immediately move on to a
different peer for fetching the data. This mechanism should result in a good load distribution and therefore a rather
optimal distribution path.
With these optimizations, distribution of payloads in the size of up to 3 to 4
MB should work with Kusama validator specifications. For scaling up even more,
runtime upgrades and message passing should be done off chain at some point.
With these optimizations, distribution of payloads in the size of up to 3 to 4 MB should work with Kusama validator
specifications. For scaling up even more, runtime upgrades and message passing should be done off chain at some point.
Flood protection considerations: To make DoS attacks slightly harder on this
subsystem, nodes will only respond to large statement requests when they have
previously notified that peer via gossip about that statement. So it is not
possible to DoS nodes at scale by requesting candidate data over and over
again.
Flood protection considerations: To make DoS attacks slightly harder on this subsystem, nodes will only respond to
large statement requests when they have previously notified that peer via gossip about that statement. So it is not
possible to DoS nodes at scale by requesting candidate data over and over again.
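
A rough sketch of this flood-protection check follows. The names `LargeStatementResponder`, `note_advertised`, and `should_answer`, as well as the simplified `PeerId` and `CandidateHash` aliases, are assumptions made for illustration and are not taken from the subsystem itself.

```rust
use std::collections::HashSet;

// Simplified stand-ins for the real types (assumptions for this sketch).
type PeerId = u64;
type CandidateHash = [u8; 32];

/// Remembers which peers we told about which large statements (hypothetical type).
#[derive(Default)]
struct LargeStatementResponder {
    advertised: HashSet<(PeerId, CandidateHash)>,
}

impl LargeStatementResponder {
    /// Called when we gossip the light metadata of a large statement to `peer`.
    fn note_advertised(&mut self, peer: PeerId, candidate: CandidateHash) {
        self.advertised.insert((peer, candidate));
    }

    /// A request for the full payload is only answered if we previously
    /// notified that same peer about that same statement.
    fn should_answer(&self, peer: PeerId, candidate: &CandidateHash) -> bool {
        self.advertised.contains(&(peer, *candidate))
    }
}
```

Because the lookup is keyed by both peer and candidate, an arbitrary peer cannot trigger payload transfers for candidates it was never told about.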
@@ -1,158 +1,127 @@
# Statement Distribution
This subsystem is responsible for distributing signed statements that we have generated and forwarding statements generated by our peers. Received candidate receipts and statements are passed to the [Candidate Backing subsystem](candidate-backing.md) to handle producing local statements. On receiving `StatementDistributionMessage::Share`, this subsystem distributes the message across the network with redundancy to ensure a fast backing process.
This subsystem is responsible for distributing signed statements that we have generated and forwarding statements
generated by our peers. Received candidate receipts and statements are passed to the [Candidate Backing
subsystem](candidate-backing.md) to handle producing local statements. On receiving
`StatementDistributionMessage::Share`, this subsystem distributes the message across the network with redundancy to
ensure a fast backing process.
## Overview
**Goal:** every well-connected node is aware of every next potential parachain
block.
**Goal:** every well-connected node is aware of every next potential parachain block.
Validators can either:
- receive parachain block from collator, check block, and gossip statement.
- receive statements from other validators, check the parachain block if it
originated within their own group, gossip forward statement if valid.
- receive statements from other validators, check the parachain block if it originated within their own group, gossip
forward statement if valid.
Validators must have statements, candidates, and persisted validation from all
other validators. This is because we need to store statements from validators
who've checked the candidate on the relay chain, so we know who to hold
accountable in case of disputes. Any validator can be selected as the next
relay-chain block author, and this is not revealed in advance for security
reasons. As a result, all validators must have an up-to-date view of all possible
parachain candidates + backing statements that could be placed on-chain in the
next block.
Validators must have statements, candidates, and persisted validation from all other validators. This is because we need
to store statements from validators who've checked the candidate on the relay chain, so we know who to hold accountable
in case of disputes. Any validator can be selected as the next relay-chain block author, and this is not revealed in
advance for security reasons. As a result, all validators must have an up-to-date view of all possible parachain
candidates + backing statements that could be placed on-chain in the next block.
[This blog post](https://polkadot.network/blog/polkadot-v1-0-sharding-and-economic-security)
puts it another way: "Validators who aren't assigned to the parachain still
listen for the attestations [statements] because whichever validator ends up
being the author of the relay-chain block needs to bundle up attested parachain
blocks for several parachains and place them into the relay-chain block."
[This blog post](https://polkadot.network/blog/polkadot-v1-0-sharding-and-economic-security) puts it another way:
"Validators who aren't assigned to the parachain still listen for the attestations [statements] because whichever
validator ends up being the author of the relay-chain block needs to bundle up attested parachain blocks for several
parachains and place them into the relay-chain block."
Backing-group quorum (that is, enough backing group votes) must be reached
before the block author will consider the candidate. Therefore, validators need
to consider _all_ seconded candidates within their own group, because that's
what they're assigned to work on. Validators only need to consider _backable_
candidates from other groups. This informs the design of the statement
distribution protocol to have separate phases for in-group and out-group
distribution, respectively called "cluster" and "grid" mode (see below).
Backing-group quorum (that is, enough backing group votes) must be reached before the block author will consider the
candidate. Therefore, validators need to consider _all_ seconded candidates within their own group, because that's what
they're assigned to work on. Validators only need to consider _backable_ candidates from other groups. This informs the
design of the statement distribution protocol to have separate phases for in-group and out-group distribution,
respectively called "cluster" and "grid" mode (see below).
### With Async Backing
Asynchronous backing changes the runtime to accept parachain candidates from a
certain allowed range of historic relay-parents. These candidates must be backed
by the group assigned to the parachain as-of their corresponding relay parents.
Asynchronous backing changes the runtime to accept parachain candidates from a certain allowed range of historic
relay-parents. These candidates must be backed by the group assigned to the parachain as-of their corresponding relay
parents.
## Protocol
To address the concern of dealing with large numbers of spam candidates or
statements, the overall design approach is to combine a focused "clustering"
protocol for legitimate fresh candidates with a broad-distribution "grid"
protocol to quickly get backed candidates into the hands of many validators.
Validators do not eagerly send each other heavy `CommittedCandidateReceipt`,
but instead request these lazily through request/response protocols.
To address the concern of dealing with large numbers of spam candidates or statements, the overall design approach is to
combine a focused "clustering" protocol for legitimate fresh candidates with a broad-distribution "grid" protocol to
quickly get backed candidates into the hands of many validators. Validators do not eagerly send each other heavy
`CommittedCandidateReceipt`, but instead request these lazily through request/response protocols.
A high-level description of the protocol follows:
### Messages
Nodes can send each other a few kinds of messages: `Statement`,
`BackedCandidateManifest`, `BackedCandidateAcknowledgement`.
Nodes can send each other a few kinds of messages: `Statement`, `BackedCandidateManifest`,
`BackedCandidateAcknowledgement`.
- `Statement` messages contain only a signed compact statement, without full
candidate info.
- `BackedCandidateManifest` messages advertise a description of a backed
candidate and stored statements.
- `BackedCandidateAcknowledgement` messages acknowledge that a backed candidate
is fully known.
- `Statement` messages contain only a signed compact statement, without full candidate info.
- `BackedCandidateManifest` messages advertise a description of a backed candidate and stored statements.
- `BackedCandidateAcknowledgement` messages acknowledge that a backed candidate is fully known.
### Request/response protocol
Nodes can request the full `CommittedCandidateReceipt` and
`PersistedValidationData`, along with statements, over a request/response
protocol. This is the `AttestedCandidateRequest`; the response is
`AttestedCandidateResponse`.
Nodes can request the full `CommittedCandidateReceipt` and `PersistedValidationData`, along with statements, over a
request/response protocol. This is the `AttestedCandidateRequest`; the response is `AttestedCandidateResponse`.
### Importability and the Hypothetical Frontier
The **prospective parachains** subsystem maintains prospective "fragment trees"
which can be used to determine whether a particular parachain candidate could
possibly be included in the future. Candidates which either are within a
fragment tree or _would be_ part of a fragment tree if accepted are said to be
in the "hypothetical frontier".
The **prospective parachains** subsystem maintains prospective "fragment trees" which can be used to determine whether a
particular parachain candidate could possibly be included in the future. Candidates which either are within a fragment
tree or _would be_ part of a fragment tree if accepted are said to be in the "hypothetical frontier".
The **statement-distribution** subsystem keeps track of all candidates, and
updates its knowledge of the hypothetical frontier based on events such as new
relay parents, new confirmed candidates, and newly backed candidates.
The **statement-distribution** subsystem keeps track of all candidates, and updates its knowledge of the hypothetical
frontier based on events such as new relay parents, new confirmed candidates, and newly backed candidates.
We only consider statements as "importable" when the corresponding candidate is
part of the hypothetical frontier, and only send "importable" statements to the
backing subsystem itself.
We only consider statements as "importable" when the corresponding candidate is part of the hypothetical frontier, and
only send "importable" statements to the backing subsystem itself.
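
One way to picture the importability rule is the sketch below. It is illustrative only: `ImportGate` and its methods are hypothetical, statements are represented as plain strings, and holding back non-importable statements for later is an assumption of this sketch rather than something the text above prescribes.

```rust
use std::collections::{HashMap, HashSet};

// Simplified stand-in for the real candidate hash type (assumption).
type CandidateHash = [u8; 32];

/// Forwards statements to backing only for candidates in the hypothetical
/// frontier (hypothetical type; statements are plain strings here).
#[derive(Default)]
struct ImportGate {
    in_frontier: HashSet<CandidateHash>,
    held_back: HashMap<CandidateHash, Vec<String>>,
}

impl ImportGate {
    /// Returns the statement if it is importable now, otherwise stores it.
    fn on_statement(&mut self, candidate: CandidateHash, statement: String) -> Option<String> {
        if self.in_frontier.contains(&candidate) {
            Some(statement)
        } else {
            self.held_back.entry(candidate).or_default().push(statement);
            None
        }
    }

    /// Called when the candidate becomes part of the hypothetical frontier,
    /// for example after a new relay parent. Returns the statements that have
    /// just become importable.
    fn on_frontier_update(&mut self, candidate: CandidateHash) -> Vec<String> {
        self.in_frontier.insert(candidate);
        self.held_back.remove(&candidate).unwrap_or_default()
    }
}
```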
### Cluster Mode
- Validator nodes are partitioned into groups (with some exceptions), and
validators within a group at a relay-parent can send each other `Statement`
messages for any candidates within that group and based on that relay-parent.
- Validator nodes are partitioned into groups (with some exceptions), and validators within a group at a relay-parent
can send each other `Statement` messages for any candidates within that group and based on that relay-parent.
- This is referred to as the "cluster" mode.
- Right now these are the same as backing groups, though "cluster"
specifically refers to the set of nodes communicating with each other in the
first phase of distribution.
- Right now these are the same as backing groups, though "cluster" specifically refers to the set of nodes
communicating with each other in the first phase of distribution.
- `Seconded` statements must be sent before `Valid` statements.
- `Seconded` statements may only be sent to other members of the group when the
candidate is fully known by the local validator.
- "Fully known" means the validator has the full `CommittedCandidateReceipt`
and `PersistedValidationData`, which it receives on request from other
validators or from a collator.
- The reason for this is that sending a statement (which is always a
`CompactStatement` carrying nothing but a hash and signature) to the
cluster, is also a signal that the sending node is available to request the
candidate from.
- This makes the protocol easier to reason about, while also reducing network
messages about candidates that don't really exist.
- Validators in a cluster receiving messages about unknown candidates request
the candidate (and statements) from other cluster members which have it.
- `Seconded` statements may only be sent to other members of the group when the candidate is fully known by the local
validator.
- "Fully known" means the validator has the full `CommittedCandidateReceipt` and `PersistedValidationData`, which it
receives on request from other validators or from a collator.
- The reason for this is that sending a statement (which is always a `CompactStatement` carrying nothing but a hash
and signature) to the cluster, is also a signal that the sending node is available to request the candidate from.
- This makes the protocol easier to reason about, while also reducing network messages about candidates that don't
really exist.
- Validators in a cluster receiving messages about unknown candidates request the candidate (and statements) from other
cluster members which have it.
- Spam considerations
- The maximum depth of candidates allowed in asynchronous backing determines
the maximum amount of `Seconded` statements originating from a validator V
which each validator in a cluster may send to others. This bounds the number
of candidates.
- There is a small number of validators in each group, which further limits
the amount of candidates.
- We accept candidates which don't fit in the fragment trees of any relay
parents.
- "Accept" means "attempt to request and store in memory until useful or
expired".
- We listen to prospective parachains subsystem to learn of new additions to
the fragment trees.
- The maximum depth of candidates allowed in asynchronous backing determines the maximum amount of `Seconded`
statements originating from a validator V which each validator in a cluster may send to others. This bounds the
number of candidates.
- There is a small number of validators in each group, which further limits the amount of candidates.
- We accept candidates which don't fit in the fragment trees of any relay parents.
- "Accept" means "attempt to request and store in memory until useful or expired".
- We listen to prospective parachains subsystem to learn of new additions to the fragment trees.
- Use this to attempt to import the candidate later.
### Grid Mode
- Every consensus session provides randomness and a fixed validator set, which
is used to build a redundant grid topology.
- It's redundant in the sense that there are 2 paths from every node to every
other node. See "Grid Topology" section for more details.
- This grid topology is used to create a sending path from each validator group
to every validator.
- When a node observes a candidate as backed, it sends a
`BackedCandidateManifest` to their "receiving" nodes.
- Every consensus session provides randomness and a fixed validator set, which is used to build a redundant grid
topology.
- It's redundant in the sense that there are 2 paths from every node to every other node. See "Grid Topology" section
for more details.
- This grid topology is used to create a sending path from each validator group to every validator.
- When a node observes a candidate as backed, it sends a `BackedCandidateManifest` to their "receiving" nodes.
- If receiving nodes don't yet know the candidate, they request it.
- Once they know the candidate, they respond with a
`BackedCandidateAcknowledgement`.
- Once two nodes perform a manifest/acknowledgement exchange, they can send
`Statement` messages directly to each other for any new statements they might
need.
- This limits the amount of statements we'd have to deal with w.r.t.
candidates that don't really exist. See "Manifest Exchange" section.
- There are limitations on the number of candidates that can be advertised by
each peer, similar to those in the cluster. Validators do not request
candidates which exceed these limitations.
- Validators request candidates as soon as they are advertised, but do not
import the statements until the candidate is part of the hypothetical
frontier, and do not re-advertise or acknowledge until the candidate is
considered both backable and part of the hypothetical frontier.
- Note that requesting is not an implicit acknowledgement, and an explicit
acknowledgement must be sent upon receipt.
- Once they know the candidate, they respond with a `BackedCandidateAcknowledgement`.
- Once two nodes perform a manifest/acknowledgement exchange, they can send `Statement` messages directly to each other
for any new statements they might need.
- This limits the amount of statements we'd have to deal with w.r.t. candidates that don't really exist. See "Manifest
Exchange" section.
- There are limitations on the number of candidates that can be advertised by each peer, similar to those in the
cluster. Validators do not request candidates which exceed these limitations.
- Validators request candidates as soon as they are advertised, but do not import the statements until the candidate is
part of the hypothetical frontier, and do not re-advertise or acknowledge until the candidate is considered both
backable and part of the hypothetical frontier.
- Note that requesting is not an implicit acknowledgement, and an explicit acknowledgement must be sent upon receipt.
## Messages
@@ -161,27 +130,23 @@ backing subsystem itself.
- `ActiveLeaves`
- Notification of a change in the set of active leaves.
- `StatementDistributionMessage::Share`
- Notification of a locally-originating statement. That is, this statement
comes from our node and should be distributed to other nodes.
- Sent by the Backing Subsystem after it successfully imports a
locally-originating statement.
- Notification of a locally-originating statement. That is, this statement comes from our node and should be
distributed to other nodes.
- Sent by the Backing Subsystem after it successfully imports a locally-originating statement.
- `StatementDistributionMessage::Backed`
- Notification of a candidate being backed (received enough validity votes
from the backing group).
- Sent by the Backing Subsystem after it successfully imports a statement for
the first time and after sending `Share`.
- Notification of a candidate being backed (received enough validity votes from the backing group).
- Sent by the Backing Subsystem after it successfully imports a statement for the first time and after sending
`Share`.
- `StatementDistributionMessage::NetworkBridgeUpdate`
- See next section.
#### Network bridge events
- v1 compatibility
- Messages for the v1 protocol are routed to the legacy statement
distribution.
- Messages for the v1 protocol are routed to the legacy statement distribution.
- `Statement`
- Notification of a signed statement.
- Sent by a peer's Statement Distribution subsystem when circulating
statements.
- Sent by a peer's Statement Distribution subsystem when circulating statements.
- `BackedCandidateManifest`
- Notification of a backed candidate being known by the sending node.
- For the candidate being requested by the receiving node if needed.
@@ -196,26 +161,23 @@ backing subsystem itself.
### Outgoing
- `NetworkBridgeTxMessage::SendValidationMessages`
- Sends a peer all pending messages / acknowledgements / statements for a
relay parent, either through the cluster or the grid.
- Sends a peer all pending messages / acknowledgements / statements for a relay parent, either through the cluster or
the grid.
- `NetworkBridgeTxMessage::SendValidationMessage`
- Circulates a compact statement to all peers who need it, either through the
cluster or the grid.
- Circulates a compact statement to all peers who need it, either through the cluster or the grid.
- `NetworkBridgeTxMessage::ReportPeer`
- Reports a peer (either good or bad).
- `CandidateBackingMessage::Statement`
- Note a validator's statement about a particular candidate.
- `ProspectiveParachainsMessage::GetHypotheticalFrontier`
- Gets the hypothetical frontier membership of candidates under active leaves'
fragment trees.
- Gets the hypothetical frontier membership of candidates under active leaves' fragment trees.
- `NetworkBridgeTxMessage::SendRequests`
- Sends requests, initiating the request/response protocol.
## Request/Response
We also have a request/response protocol because validators do not eagerly send
each other heavy `CommittedCandidateReceipt`, but instead need to request these
lazily.
We also have a request/response protocol because validators do not eagerly send each other heavy
`CommittedCandidateReceipt`, but instead need to request these lazily.
### Protocol
@@ -225,16 +187,13 @@ lazily.
- Done as needed, when handling incoming manifests/statements.
- `RequestManager::dispatch_requests` sends any queued-up requests.
- Calls `RequestManager::next_request` to completion.
- Creates the `OutgoingRequest`, saves the receiver in
`RequestManager::pending_responses`.
- Does nothing if we have more responses pending than the limit of parallel
requests.
- Creates the `OutgoingRequest`, saves the receiver in `RequestManager::pending_responses`.
- Does nothing if we have more responses pending than the limit of parallel requests.
2. Peer
- Requests come in on a peer on the `IncomingRequestReceiver`.
- Runs in a background responder task which feeds requests to `answer_request`
through `MuxedMessage`.
- Runs in a background responder task which feeds requests to `answer_request` through `MuxedMessage`.
- This responder task has a limit on the number of parallel requests.
- `answer_request` on the peer takes the request and sends a response.
- Does this using the response sender on the request.
@@ -243,8 +202,7 @@ lazily.
- `receive_response` on the original validator yields a response.
- Response was sent on the request's response sender.
- Uses `RequestManager::await_incoming` to await on pending responses in an
unordered fashion.
- Uses `RequestManager::await_incoming` to await on pending responses in an unordered fashion.
- Runs on the `MuxedMessage` receiver.
- `handle_response` handles the response.
@@ -265,25 +223,23 @@ lazily.
## Manifests
A manifest is a message about a known backed candidate, along with a description
of the statements backing it. It can be one of two kinds:
A manifest is a message about a known backed candidate, along with a description of the statements backing it. It can be
one of two kinds:
- `Full`: Contains information about the candidate and should be sent to peers
who may not have the candidate yet. This is also called an `Announcement`.
- `Acknowledgement`: Omits information implicit in the candidate, and should be
sent to peers which are guaranteed to have the candidate already.
- `Full`: Contains information about the candidate and should be sent to peers who may not have the candidate yet. This
is also called an `Announcement`.
- `Acknowledgement`: Omits information implicit in the candidate, and should be sent to peers which are guaranteed to
have the candidate already.
### Manifest Exchange
Manifest exchange is when a receiving node received a `Full` manifest and
replied with an `Acknowledgement`. It indicates that both nodes know the
candidate as valid and backed. This allows the nodes to send `Statement`
messages directly to each other for any new statements.
Manifest exchange is when a receiving node received a `Full` manifest and replied with an `Acknowledgement`. It
indicates that both nodes know the candidate as valid and backed. This allows the nodes to send `Statement` messages
directly to each other for any new statements.
Why? This limits the amount of statements we'd have to deal with w.r.t.
candidates that don't really exist. Limiting out-of-group statement distribution
between peers to only candidates that both peers agree are backed and exist
ensures we only have to store statements about real candidates.
Why? This limits the amount of statements we'd have to deal with w.r.t. candidates that don't really exist. Limiting
out-of-group statement distribution between peers to only candidates that both peers agree are backed and exist ensures
we only have to store statements about real candidates.
In practice, manifest exchange means that one of three things has happened:
@@ -291,36 +247,31 @@ In practice, manifest exchange means that one of three things have happened:
- We announced, they acknowledged.
- We announced, they announced.
Concerning the last case, note that it is possible for two nodes to have each
other in their sending set. Consider:
Concerning the last case, note that it is possible for two nodes to have each other in their sending set. Consider:
```
1 2
3 4
```
If validators 2 and 4 are in group B, then there is a path `2->1->3` and
`4->3->1`. Therefore, 1 and 3 might send each other manifests for the same
candidate at the same time, without having seen the other's yet. This also
counts as a manifest exchange, but is only allowed to occur in this way.
If validators 2 and 4 are in group B, then there is a path `2->1->3` and `4->3->1`. Therefore, 1 and 3 might send each
other manifests for the same candidate at the same time, without having seen the other's yet. This also counts as a
manifest exchange, but is only allowed to occur in this way.
After the exchange is complete, we update pending statements. Pending statements
are those we know locally that the remote node does not.
After the exchange is complete, we update pending statements. Pending statements are those we know locally that the
remote node does not.
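
The exchange condition can be sketched as a small state check. This is a hedged illustration: `ManifestKind` mirrors the two manifest kinds named above, while `ExchangeState` and `is_complete` are hypothetical names, and the rule "both directions have a manifest and at least one of them is `Full`" is this sketch's reading of the cases described in this section.

```rust
/// The two manifest kinds described in this section.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ManifestKind {
    Full,
    Acknowledgement,
}

/// Manifests exchanged with one peer for one candidate (hypothetical type).
#[derive(Default)]
struct ExchangeState {
    sent: Option<ManifestKind>,
    received: Option<ManifestKind>,
}

impl ExchangeState {
    /// The exchange is complete once a manifest has travelled in each
    /// direction and at least one of them was a `Full` announcement.
    /// Only then may `Statement` messages be sent directly to the peer.
    fn is_complete(&self) -> bool {
        match (self.sent, self.received) {
            (Some(sent), Some(received)) => {
                sent == ManifestKind::Full || received == ManifestKind::Full
            }
            _ => false,
        }
    }
}
```

Two acknowledgements without any announcement do not count, since neither side would have advertised the candidate in the first place.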
#### Alternative Paths Through The Topology
Nodes should send a `BackedCandidateAcknowledgement(CandidateHash,
StatementFilter)` notification to any peer which has sent a manifest, and the
candidate has been acquired by other means. This keeps alternative paths through
the topology open, which allows nodes to receive additional statements that come
later, but not after the candidate has been posted on-chain.
Nodes should send a `BackedCandidateAcknowledgement(CandidateHash, StatementFilter)` notification to any peer which has
sent a manifest, and the candidate has been acquired by other means. This keeps alternative paths through the topology
open, which allows nodes to receive additional statements that come later, but not after the candidate has been posted
on-chain.
This is mostly about the limitation that the runtime has no way for block
authors to post statements that come after the parablock is posted on-chain and
ensure those validators still get rewarded. Technically, we only need enough
statements to back the candidate and the manifest + request will provide that.
But more statements might come shortly afterwards, and we want those to end up
on-chain as well to ensure all validators in the group are rewarded.
This is mostly about the limitation that the runtime has no way for block authors to post statements that come after the
parablock is posted on-chain and ensure those validators still get rewarded. Technically, we only need enough statements
to back the candidate and the manifest + request will provide that. But more statements might come shortly afterwards,
and we want those to end up on-chain as well to ensure all validators in the group are rewarded.
For clarity, here is the full timeline:
@@ -333,52 +284,42 @@ For clarity, here is the full timeline:
## Cluster Module
The cluster module provides direct distribution of unbacked candidates within a
group. By utilizing this initial phase of propagating only within
clusters/groups, we bound the number of `Seconded` messages per validator per
relay-parent, helping us prevent spam. Validators can try to circumvent this,
but they would only consume a few KB of memory and it is trivially slashable on
chain.
The cluster module provides direct distribution of unbacked candidates within a group. By utilizing this initial phase
of propagating only within clusters/groups, we bound the number of `Seconded` messages per validator per relay-parent,
helping us prevent spam. Validators can try to circumvent this, but they would only consume a few KB of memory and it is
trivially slashable on chain.
The cluster module determines whether to accept/reject messages from other
validators in the same group. It keeps track of what we have sent to other
validators in the group, and pending statements. For the full protocol, see
"Protocol".
The cluster module determines whether to accept/reject messages from other validators in the same group. It keeps track
of what we have sent to other validators in the group, and pending statements. For the full protocol, see "Protocol".
## Grid Module
The grid module provides distribution of backed candidates and late statements
outside the backing group. For the full protocol, see the "Protocol" section.
The grid module provides distribution of backed candidates and late statements outside the backing group. For the full
protocol, see the "Protocol" section.
### Grid Topology
For distributing outside our cluster (aka backing group) we use a 2D grid
topology. This limits the amount of peers we send messages to, and handles
view updates.
For distributing outside our cluster (aka backing group) we use a 2D grid topology. This limits the number of peers we
send messages to, and handles view updates.
The basic operation of the grid topology is that:
- A validator producing a message sends it to its row-neighbors and its
column-neighbors.
- A validator receiving a message originating from one of its row-neighbors
sends it to its column-neighbors.
- A validator receiving a message originating from one of its column-neighbors
sends it to its row-neighbors.
- A validator producing a message sends it to its row-neighbors and its column-neighbors.
- A validator receiving a message originating from one of its row-neighbors sends it to its column-neighbors.
- A validator receiving a message originating from one of its column-neighbors sends it to its row-neighbors.
This grid approach defines 2 unique paths for every validator to reach every
other validator in at most 2 hops, providing redundancy.
This grid approach defines 2 unique paths for every validator to reach every other validator in at most 2 hops,
providing redundancy.
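The three forwarding rules above can be sketched as a small decision function. This is only an illustration: `Origin`, `Forward`, and `forward_targets` are hypothetical names introduced here and are not taken from the actual subsystem code.

```rust
/// How a message reached this validator in the 2D grid.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Origin {
    /// We produced the message ourselves.
    Local,
    /// The message originated from one of our row-neighbors.
    RowNeighbor,
    /// The message originated from one of our column-neighbors.
    ColumnNeighbor,
}

/// Which neighbor sets the message should be forwarded to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Forward {
    to_row: bool,
    to_column: bool,
}

/// Applies the basic grid rule: producers send along both dimensions,
/// relayers send along the dimension the message did not arrive on.
fn forward_targets(origin: Origin) -> Forward {
    match origin {
        Origin::Local => Forward { to_row: true, to_column: true },
        Origin::RowNeighbor => Forward { to_row: false, to_column: true },
        Origin::ColumnNeighbor => Forward { to_row: true, to_column: false },
    }
}
```

Because a relayed message only ever switches dimension once, any validator is reached in at most 2 hops, either row-then-column or column-then-row.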
Propagation follows these rules:
- Each node has a receiving set and a sending set. These are different for each
group. That is, if a node receives a candidate from group A, it checks if it
is allowed to receive from that node for candidates from group A.
- Each node has a receiving set and a sending set. These are different for each group. That is, if a node receives a
candidate from group A, it checks if it is allowed to receive from that node for candidates from group A.
- For groups that we are in, receive from nobody and send to our X/Y peers.
- For groups that we are not part of:
- We receive from any validator in the group we share a slice with and send to
the corresponding X/Y slice in the other dimension.
- For any validators we don't share a slice with, we receive from the nodes
which share a slice with them.
- We receive from any validator in the group we share a slice with and send to the corresponding X/Y slice in the
other dimension.
- For any validators we don't share a slice with, we receive from the nodes which share a slice with them.
### Example
@@ -391,81 +332,63 @@ For size 11, the matrix would be:
9 10
```
e.g. for index 10, the neighbors would be 1, 4, 7, 9 -- these are the nodes we
could directly communicate with (e.g. either send to or receive from).
For example, for index 10, the neighbors would be 1, 4, 7, 9 -- these are the nodes we could directly communicate with
(i.e. either send to or receive from).
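As a rough check of the neighbor set above, the following sketch computes row and column neighbors for a row-major grid. It assumes the 11 validators are laid out in rows of width 3, which is inferred from the neighbors listed for index 10. `grid_neighbors` is a hypothetical helper, not the function used by the node.

```rust
/// Returns `(row_neighbors, column_neighbors)` of `index` in a row-major grid
/// holding `n` validators with `width` columns.
/// (Hypothetical helper; the row-major layout and the width are assumptions.)
fn grid_neighbors(index: usize, n: usize, width: usize) -> (Vec<usize>, Vec<usize>) {
    assert!(width > 0 && index < n, "index must be inside a non-empty grid");
    let row = index / width;
    let col = index % width;
    // Everyone else in the same row.
    let row_neighbors: Vec<usize> = (0..n)
        .filter(|&i| i != index && i / width == row)
        .collect();
    // Everyone else in the same column.
    let column_neighbors: Vec<usize> = (0..n)
        .filter(|&i| i != index && i % width == col)
        .collect();
    (row_neighbors, column_neighbors)
}
```

Under these assumptions, `grid_neighbors(10, 11, 3)` yields the row-neighbor 9 and the column-neighbors 1, 4 and 7, matching the set given in the text. The last row is incomplete, so index 10 has only one row-neighbor.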
Now, which of these neighbors can 10 receive from? Recall that the
sending/receiving sets for 10 would be different for different groups. Here are
some hypothetical scenarios:
Now, which of these neighbors can 10 receive from? Recall that the sending/receiving sets for 10 would be different for
different groups. Here are some hypothetical scenarios:
- **Scenario 1:** 9 belongs to group A but not 10. Here, 10 can directly receive
candidates from group A from 9. 10 would propagate them to the nodes in {1, 4,
7} that are not in A.
- **Scenario 2:** 6 is in group A instead of 9, and 7 is not in group A. 10 can
receive group A messages from 7 or 9. 10 will try to relay these messages, but
7 and 9 together should have already propagated the message to all x/y
peers of 10. If so, then 10 will just receive acknowledgements in reply rather
than requests.
- **Scenario 3:** 10 itself is in group A. 10 would not receive candidates from
this group from any other nodes through the grid. It would itself send such
candidates to all its neighbors that are not in A.
- **Scenario 1:** 9 belongs to group A but not 10. Here, 10 can directly receive candidates from group A from 9. 10
would propagate them to the nodes in {1, 4, 7} that are not in A.
- **Scenario 2:** 6 is in group A instead of 9, and 7 is not in group A. 10 can receive group A messages from 7 or 9. 10
will try to relay these messages, but 7 and 9 together should have already propagated the message to all x/y peers of
10. If so, then 10 will just receive acknowledgements in reply rather than requests.
- **Scenario 3:** 10 itself is in group A. 10 would not receive candidates from this group from any other nodes through
the grid. It would itself send such candidates to all its neighbors that are not in A.
### Seconding Limit
The seconding limit is a per-validator limit. Before asynchronous backing, we
had a rule that every validator was only allowed to second one candidate per
relay parent. With asynchronous backing, we have a 'maximum depth' which makes
it possible to second multiple candidates per relay parent. The seconding limit
is set to `max depth + 1` to set an upper bound on candidates entering the
system.
The seconding limit is a per-validator limit. Before asynchronous backing, we had a rule that every validator was only
allowed to second one candidate per relay parent. With asynchronous backing, we have a 'maximum depth' which makes it
possible to second multiple candidates per relay parent. The seconding limit is set to `max depth + 1` to set an upper
bound on candidates entering the system.
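A minimal sketch of how such a limit could be enforced for one relay parent is shown below. `SecondingTracker` and the `u32` validator index are stand-ins chosen for this illustration. Only the `max depth + 1` bound comes from the text.

```rust
use std::collections::HashMap;

/// Counts `Seconded` statements per validator for a single relay parent.
/// (Hypothetical type; validators are identified by a plain `u32` here.)
struct SecondingTracker {
    /// The per-validator limit, `max depth + 1`.
    limit: usize,
    /// Number of candidates each validator has seconded so far.
    seconded: HashMap<u32, usize>,
}

impl SecondingTracker {
    fn new(max_depth: usize) -> Self {
        Self { limit: max_depth + 1, seconded: HashMap::new() }
    }

    /// Records a `Seconded` statement if the validator is still under the
    /// limit. Returns `false` (and records nothing) otherwise.
    fn note_seconded(&mut self, validator: u32) -> bool {
        let count = self.seconded.entry(validator).or_insert(0);
        if *count >= self.limit {
            return false;
        }
        *count += 1;
        true
    }
}
```

With `max_depth = 0` this reduces to the rule from before asynchronous backing: one seconded candidate per validator per relay parent.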
## Candidates Module
The candidates module provides a tracker for all known candidates in the view,
whether they are confirmed or not, and how peers have advertised the candidates.
What is a confirmed candidate? It is a candidate for which we have the full
receipt and the persisted validation data. This module gets confirmed candidates
from two sources:
The candidates module provides a tracker for all known candidates in the view, whether they are confirmed or not, and
how peers have advertised the candidates. What is a confirmed candidate? It is a candidate for which we have the full
receipt and the persisted validation data. This module gets confirmed candidates from two sources:
- It can be that a validator fetched a collation directly from the collator and
validated it.
- The first time a validator gets an announcement for an unknown candidate, it
will send a request for the candidate. Upon receiving a response and
validating it (see `UnhandledResponse::validate_response`), it will mark the
candidate as confirmed.
- It can be that a validator fetched a collation directly from the collator and validated it.
- The first time a validator gets an announcement for an unknown candidate, it will send a request for the candidate.
Upon receiving a response and validating it (see `UnhandledResponse::validate_response`), it will mark the candidate
as confirmed.
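The distinction between unconfirmed and confirmed candidates could be modelled roughly as follows. All names (`CandidateState`, `Candidates`, the `u32` peer identifier) are assumptions made for this sketch. The real module tracks considerably more data per candidate.

```rust
use std::collections::HashMap;

/// Stand-in for a candidate hash (assumption: 32 raw bytes).
type CandidateHash = [u8; 32];

#[derive(Debug)]
enum CandidateState {
    /// Peers have advertised it, but we lack the full receipt and the
    /// persisted validation data.
    Unconfirmed { advertised_by: Vec<u32> },
    /// We hold the full receipt and the persisted validation data.
    Confirmed,
}

#[derive(Default)]
struct Candidates {
    known: HashMap<CandidateHash, CandidateState>,
}

impl Candidates {
    /// Remembers that `peer` advertised the candidate.
    fn note_advertised(&mut self, hash: CandidateHash, peer: u32) {
        let state = self
            .known
            .entry(hash)
            .or_insert(CandidateState::Unconfirmed { advertised_by: Vec::new() });
        if let CandidateState::Unconfirmed { advertised_by } = state {
            if !advertised_by.contains(&peer) {
                advertised_by.push(peer);
            }
        }
    }

    /// Marks the candidate as confirmed. Returns `true` if it was not
    /// confirmed before.
    fn confirm(&mut self, hash: CandidateHash) -> bool {
        let previous = self.known.insert(hash, CandidateState::Confirmed);
        !matches!(previous, Some(CandidateState::Confirmed))
    }

    fn is_confirmed(&self, hash: &CandidateHash) -> bool {
        matches!(self.known.get(hash), Some(CandidateState::Confirmed))
    }
}
```

In this sketch both sources of confirmation described above (a collation validated locally, or a validated response to a request) would end in the same `confirm` call.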
## Requests Module
The requests module provides a manager for pending requests for candidate data,
as well as pending responses. See "Request/Response Protocol" for a high-level
description of the flow. See module-docs for full details.
The requests module provides a manager for pending requests for candidate data, as well as pending responses. See
"Request/Response Protocol" for a high-level description of the flow. See module-docs for full details.
## Glossary
- **Acknowledgement:** A partial manifest sent to a validator that already has the
candidate to inform them that the sending node also knows the candidate.
Concludes a manifest exchange.
- **Announcement:** A full manifest indicating that a backed candidate is known by
the sending node. Initiates a manifest exchange.
- **Acknowledgement:** A partial manifest sent to a validator that already has the candidate to inform them that the
sending node also knows the candidate. Concludes a manifest exchange.
- **Announcement:** A full manifest indicating that a backed candidate is known by the sending node. Initiates a
manifest exchange.
- **Attestation:** See "Statement".
- **Backable vs. Backed:**
- Note that we sometimes use "backed" to refer to candidates that are
"backable", but not yet backed on chain.
- **Backed** should technically mean that the parablock candidate and its
backing statements have been added to a relay chain block.
- **Backable** is when the necessary backing statements have been acquired but
those statements and the parablock candidate haven't been backed in a relay
chain block yet.
- **Fragment tree:** A parachain fragment not referenced by the relay-chain.
It is a tree of prospective parachain blocks.
- **Manifest:** A message about a known backed candidate, along with a
description of the statements backing it. There are two kinds of manifest,
`Acknowledgement` and `Announcement`. See "Manifests" section.
- Note that we sometimes use "backed" to refer to candidates that are "backable", but not yet backed on chain.
- **Backed** should technically mean that the parablock candidate and its backing statements have been added to a
relay chain block.
- **Backable** is when the necessary backing statements have been acquired but those statements and the parablock
candidate haven't been backed in a relay chain block yet.
- **Fragment tree:** A parachain fragment not referenced by the relay-chain. It is a tree of prospective parachain
blocks.
- **Manifest:** A message about a known backed candidate, along with a description of the statements backing it. There
are two kinds of manifest, `Acknowledgement` and `Announcement`. See "Manifests" section.
- **Peer:** Another validator that a validator is connected to.
- **Request/response:** A protocol used to lazily request and receive heavy
candidate data when needed.
- **Reputation:** Tracks reputation of peers. Applies annoyance cost and good
behavior benefits.
- **Request/response:** A protocol used to lazily request and receive heavy candidate data when needed.
- **Reputation:** Tracks reputation of peers. Applies annoyance cost and good behavior benefits.
- **Statement:** Signed statements that can be made about parachain candidates.
- **Seconded:** Proposal of a parachain candidate. Implicit validity vote.
- **Valid:** States that a parachain candidate is valid.
@@ -474,6 +397,5 @@ description of the flow. See module-docs for full details.
- **Explicit view** / **immediate view**
- The view a peer has of the relay chain heads and highest finalized block.
- **Implicit view**
- Derived from the immediate view. Composed of active leaves and minimum
relay-parents allowed for candidates of various parachains at those
leaves.
- Derived from the immediate view. Composed of active leaves and minimum relay-parents allowed for candidates of
various parachains at those leaves.
@@ -1,6 +1,8 @@
# Collators
Collators are special nodes which bridge a parachain to the relay chain. They are simultaneously full nodes of the parachain, and at least light clients of the relay chain. Their overall contribution to the system is the generation of Proofs of Validity for parachain candidates.
Collators are special nodes which bridge a parachain to the relay chain. They are simultaneously full nodes of the
parachain, and at least light clients of the relay chain. Their overall contribution to the system is the generation of
Proofs of Validity for parachain candidates.
The **Collation Generation** subsystem triggers collators to produce collations
and then forwards them to **Collator Protocol** to circulate to validators.
The **Collation Generation** subsystem triggers collators to produce collations and then forwards them to **Collator
Protocol** to circulate to validators.
@@ -1,17 +1,18 @@
# Collation Generation
The collation generation subsystem is executed on collator nodes and produces candidates to be distributed to validators. If configured to produce collations for a para, it produces collations and then feeds them to the [Collator Protocol][CP] subsystem, which handles the networking.
The collation generation subsystem is executed on collator nodes and produces candidates to be distributed to
validators. If configured to produce collations for a para, it produces collations and then feeds them to the [Collator
Protocol][CP] subsystem, which handles the networking.
## Protocol
Collation generation for Parachains currently works in the following way:
1. A new relay chain block is imported.
2. The collation generation subsystem checks if the core associated to
the parachain is free and if yes, continues.
3. Collation generation calls our collator callback, if present, to generate a PoV. If none exists, do nothing.
4. Authoring logic determines if the current node should build a PoV.
5. Build new PoV and give it back to collation generation.
1. A new relay chain block is imported.
2. The collation generation subsystem checks if the core associated to the parachain is free and if yes, continues.
3. Collation generation calls our collator callback, if present, to generate a PoV. If none exists, do nothing.
4. Authoring logic determines if the current node should build a PoV.
5. Build new PoV and give it back to collation generation.
## Messages
@@ -22,8 +23,7 @@ Collation generation for Parachains currently works in the following way:
- Triggers collation generation procedure outlined in "Protocol" section.
- `CollationGenerationMessage::Initialize`
- Initializes the subsystem. Carries a config.
- No more than one initialization message should ever be sent to the collation
generation subsystem.
- No more than one initialization message should ever be sent to the collation generation subsystem.
- Sent by a collator to initialize this subsystem.
- `CollationGenerationMessage::SubmitCollation`
- If the subsystem isn't initialized or the relay-parent is too old to be relevant, ignore the message.
@@ -37,7 +37,9 @@ Collation generation for Parachains currently works in the following way:
## Functionality
The process of generating a collation for a parachain is very parachain-specific. As such, the details of how to do so are left beyond the scope of this description. The subsystem should be implemented as an abstract wrapper, which is aware of this configuration:
The process of generating a collation for a parachain is very parachain-specific. As such, the details of how to do so
are left beyond the scope of this description. The subsystem should be implemented as an abstract wrapper, which is
aware of this configuration:
```rust
/// The output of a collator.
@@ -117,30 +119,24 @@ The configuration should be optional, to allow for the case where the node is no
- **Collation (output of a collator)**
- Contains the PoV (proof to verify the state transition of the
parachain) and other data.
- Contains the PoV (proof to verify the state transition of the parachain) and other data.
- **Collation result**
- Contains the collation, and an optional result sender for a
collation-seconded signal.
- Contains the collation, and an optional result sender for a collation-seconded signal.
- **Collation seconded signal**
- The signal that is returned when a collation was seconded by a
validator.
- The signal that is returned when a collation was seconded by a validator.
- **Collation function**
- Called with the relay chain block the parablock will be built on top
of.
- Called with the relay chain block the parablock will be built on top of.
- Called with the validation data.
- Provides information about the state of the parachain on the relay
chain.
- Provides information about the state of the parachain on the relay chain.
- **Collation generation config**
- Contains collator's authentication key, optional collator function, and
parachain ID.
- Contains collator's authentication key, optional collator function, and parachain ID.
[CP]: collator-protocol.md
@@ -1,16 +1,25 @@
# Collator Protocol
The Collator Protocol implements the network protocol by which collators and validators communicate. It is used by collators to distribute collations to validators and used by validators to accept collations by collators.
The Collator Protocol implements the network protocol by which collators and validators communicate. It is used by
collators to distribute collations to validators and used by validators to accept collations by collators.
Collator-to-Validator networking is more difficult than Validator-to-Validator networking because the set of possible collators for any given para is unbounded, unlike the validator set. Validator-to-Validator networking protocols can easily be implemented as gossip because the data can be bounded, and validators can authenticate each other by their `PeerId`s for the purposes of instantiating and accepting connections.
Collator-to-Validator networking is more difficult than Validator-to-Validator networking because the set of possible
collators for any given para is unbounded, unlike the validator set. Validator-to-Validator networking protocols can
easily be implemented as gossip because the data can be bounded, and validators can authenticate each other by their
`PeerId`s for the purposes of instantiating and accepting connections.
Since, at least at the level of the para abstraction, the collator-set for any given para is unbounded, validators need to make sure that they are receiving connections from capable and honest collators and that their bandwidth and time are not being wasted by attackers. Communicating across this trust-boundary is the most difficult part of this subsystem.
Since, at least at the level of the para abstraction, the collator-set for any given para is unbounded, validators need
to make sure that they are receiving connections from capable and honest collators and that their bandwidth and time are
not being wasted by attackers. Communicating across this trust-boundary is the most difficult part of this subsystem.
Validation of candidates is a heavy task, and furthermore, the [`PoV`][PoV] itself is a large piece of data. Empirically, `PoV`s are on the order of 10MB.
Validation of candidates is a heavy task, and furthermore, the [`PoV`][PoV] itself is a large piece of data.
Empirically, `PoV`s are on the order of 10MB.
> TODO: note the incremental validation function Ximin proposes at https://github.com/paritytech/polkadot/issues/1348
As this network protocol serves as a bridge between collators and validators, it communicates primarily with one subsystem on behalf of each. As a collator, this will receive messages from the [`CollationGeneration`][CG] subsystem. As a validator, this will communicate only with the [`CandidateBacking`][CB].
As this network protocol serves as a bridge between collators and validators, it communicates primarily with one
subsystem on behalf of each. As a collator, this will receive messages from the [`CollationGeneration`][CG] subsystem.
As a validator, this will communicate only with the [`CandidateBacking`][CB].
## Protocol
@@ -18,9 +27,9 @@ Input: [`CollatorProtocolMessage`][CPM]
Output:
- [`RuntimeApiMessage`][RAM]
- [`NetworkBridgeMessage`][NBM]
- [`CandidateBackingMessage`][CBM]
* [`RuntimeApiMessage`][RAM]
* [`NetworkBridgeMessage`][NBM]
* [`CandidateBackingMessage`][CBM]
## Functionality
@@ -28,7 +37,8 @@ This network protocol uses the `Collation` peer-set of the [`NetworkBridge`][NB]
It uses the [`CollatorProtocolV1Message`](../../types/network.md#collator-protocol) as its `WireMessage`
Since this protocol functions both for validators and collators, it is easiest to go through the protocol actions for each of them separately.
Since this protocol functions both for validators and collators, it is easiest to go through the protocol actions for
each of them separately.
Validators and collators.
```dot process
@@ -47,24 +57,44 @@ digraph {
### Collators
It is assumed that collators are only collating on a single parachain. Collations are generated by the [Collation Generation][CG] subsystem. We will keep up to one local collation per relay-parent, based on `DistributeCollation` messages. If the para is not scheduled on any core, at the relay-parent, or the relay-parent isn't in the active-leaves set, we ignore the message as it must be invalid in that case - although this indicates a logic error elsewhere in the node.
It is assumed that collators are only collating on a single parachain. Collations are generated by the [Collation
Generation][CG] subsystem. We will keep up to one local collation per relay-parent, based on `DistributeCollation`
messages. If the para is not scheduled on any core, at the relay-parent, or the relay-parent isn't in the active-leaves
set, we ignore the message as it must be invalid in that case - although this indicates a logic error elsewhere in the
node.
We keep track of the Para ID we are collating on as a collator. This starts as `None`, and is updated with each `CollateOn` message received. If the `ParaId` of a collation requested to be distributed does not match the one we expect, we ignore the message.
We keep track of the Para ID we are collating on as a collator. This starts as `None`, and is updated with each
`CollateOn` message received. If the `ParaId` of a collation requested to be distributed does not match the one we
expect, we ignore the message.
As with most other subsystems, we track the active leaves set by following `ActiveLeavesUpdate` signals.
For the purposes of actually distributing a collation, we need to be connected to the validators who are interested in collations on that `ParaId` at this point in time. We assume that there is a discovery API for connecting to a set of validators.
For the purposes of actually distributing a collation, we need to be connected to the validators who are interested in
collations on that `ParaId` at this point in time. We assume that there is a discovery API for connecting to a set of
validators.
As seen in the [Scheduler Module][SCH] of the runtime, validator groups are fixed for an entire session and their rotations across cores are predictable. Collators will want to do these things when attempting to distribute collations at a given relay-parent:
As seen in the [Scheduler Module][SCH] of the runtime, validator groups are fixed for an entire session and their
rotations across cores are predictable. Collators will want to do these things when attempting to distribute collations
at a given relay-parent:
* Determine which core the para collated-on is assigned to.
* Determine the group on that core.
* Issue a discovery request for the validators of the current group with[`NetworkBridgeMessage`][NBM]`::ConnectToValidators`.
* Issue a discovery request for the validators of the current group
with [`NetworkBridgeMessage`][NBM]`::ConnectToValidators`.
Once connected to the relevant peers for the current group assigned to the core (transitively, the para), advertise the collation to any of them which advertise the relay-parent in their view (as provided by the [Network Bridge][NB]). If any respond with a request for the full collation, provide it. However, we only send one collation at a time per relay parent, other requests need to wait. This is done to reduce the bandwidth requirements of a collator and also increases the chance to fully send the collation to at least one validator. From the point where one validator has received the collation and seconded it, it will also start to share this collation with other validators in its backing group. Upon receiving a view update from any of these peers which includes a relay-parent for which we have a collation that they will find relevant, advertise the collation to them if we haven't already.
Once connected to the relevant peers for the current group assigned to the core (transitively, the para), advertise the
collation to any of them which advertise the relay-parent in their view (as provided by the [Network Bridge][NB]). If
any respond with a request for the full collation, provide it. However, we only send one collation at a time per relay
parent; other requests need to wait. This reduces the bandwidth requirements of a collator and also increases the
chance of fully sending the collation to at least one validator. From the point where one validator has received the
collation and seconded it, it will also start to share this collation with other validators in its backing group. Upon
receiving a view update from any of these peers which includes a relay-parent for which we have a collation that they
will find relevant, advertise the collation to them if we haven't already.
### Validators
On the validator side of the protocol, validators need to accept incoming connections from collators. They should keep some peer slots open for accepting new speculative connections from collators and should disconnect from collators who are not relevant.
On the validator side of the protocol, validators need to accept incoming connections from collators. They should keep
some peer slots open for accepting new speculative connections from collators and should disconnect from collators who
are not relevant.
```dot process
digraph G {
@@ -98,32 +128,62 @@ digraph G {
}
```
When peers connect to us, they can `Declare` that they represent a collator with given public key and intend to collate on a specific para ID. Once they've declared that, and we checked their signature, they can begin to send advertisements of collations. The peers should not send us any advertisements for collations that are on a relay-parent outside of our view or for a para outside of the one they've declared.
When peers connect to us, they can `Declare` that they represent a collator with given public key and intend to collate
on a specific para ID. Once they've declared that, and we checked their signature, they can begin to send advertisements
of collations. The peers should not send us any advertisements for collations that are on a relay-parent outside of our
view or for a para outside of the one they've declared.
The protocol tracks advertisements received and the source of the advertisement. The advertisement source is the `PeerId` of the peer who sent the message. We accept one advertisement per collator per source per relay-parent.
The protocol tracks advertisements received and the source of the advertisement. The advertisement source is the
`PeerId` of the peer who sent the message. We accept one advertisement per collator per source per relay-parent.
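The "one advertisement per collator per source per relay-parent" rule amounts to deduplicating on that triple. The sketch below shows one way to express it. The `CollatorId`, `PeerId`, and `RelayParent` aliases are simplified stand-ins rather than the real types, and `Advertisements` is a hypothetical name.

```rust
use std::collections::HashSet;

// Simplified stand-ins for the real identifier types (assumptions).
type CollatorId = u32;
type PeerId = u32;
type RelayParent = [u8; 32];

/// Remembers which advertisements have already been accepted.
#[derive(Default)]
struct Advertisements {
    seen: HashSet<(CollatorId, PeerId, RelayParent)>,
}

impl Advertisements {
    /// Accepts the advertisement if it is the first one for this
    /// (collator, source, relay-parent) triple. Returns `false` for repeats.
    fn accept(&mut self, collator: CollatorId, source: PeerId, relay_parent: RelayParent) -> bool {
        // `HashSet::insert` returns `false` when the value was already present.
        self.seen.insert((collator, source, relay_parent))
    }
}
```

A repeated advertisement from the same source is rejected, while the same collator advertising via a different source or at a different relay-parent is still accepted.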
As a validator, we will handle requests from other subsystems to fetch a collation on a specific `ParaId` and relay-parent. These requests are made with the request response protocol `CollationFetchingRequest` request. To do so, we need to first check if we have already gathered a collation on that `ParaId` and relay-parent. If not, we need to select one of the advertisements and issue a request for it. If we've already issued a request, we shouldn't issue another one until the first has returned.
As a validator, we will handle requests from other subsystems to fetch a collation on a specific `ParaId` and
relay-parent. These requests are made with the `CollationFetchingRequest` request-response protocol. To serve such a
request, we first need to check if we have already gathered a collation on that `ParaId` and relay-parent. If not, we
need to select one of the advertisements and issue a request for it. If we've already issued a request, we shouldn't
issue another one until the first has returned.
When acting on an advertisement, we issue a `Requests::CollationFetchingV1`. However, we only request one collation at a time per relay parent. This reduces the bandwidth requirements and as we can second only one candidate per relay parent, the others are probably not required anyway. If the request times out, we need to note the collator as being unreliable and reduce its priority relative to other collators.
When acting on an advertisement, we issue a `Requests::CollationFetchingV1`. However, we only request one collation at a
time per relay parent. This reduces the bandwidth requirements and as we can second only one candidate per relay parent,
the others are probably not required anyway. If the request times out, we need to note the collator as being unreliable
and reduce its priority relative to other collators.
As a validator, once the collation has been fetched some other subsystem will inspect and do deeper validation of the collation. The subsystem will report to this subsystem with a [`CollatorProtocolMessage`][CPM]`::ReportCollator`. In that case, if we are connected directly to the collator, we apply a cost to the `PeerId` associated with the collator and potentially disconnect or blacklist it. If the collation is seconded, we notify the collator and apply a benefit to the `PeerId` associated with the collator.
As a validator, once the collation has been fetched some other subsystem will inspect and do deeper validation of the
collation. The subsystem will report to this subsystem with a [`CollatorProtocolMessage`][CPM]`::ReportCollator`. In
that case, if we are connected directly to the collator, we apply a cost to the `PeerId` associated with the collator
and potentially disconnect or blacklist it. If the collation is seconded, we notify the collator and apply a benefit to
the `PeerId` associated with the collator.
### Interaction with [Candidate Backing][CB]
As collators advertise the availability, a validator will simply second the first valid parablock candidate per relay head by sending a [`CandidateBackingMessage`][CBM]`::Second`. Note that this message contains the relay parent of the advertised collation, the candidate receipt and the [PoV][PoV].
As collators advertise the availability, a validator will simply second the first valid parablock candidate per relay
head by sending a [`CandidateBackingMessage`][CBM]`::Second`. Note that this message contains the relay parent of the
advertised collation, the candidate receipt and the [PoV][PoV].
Subsequently, once a valid parablock candidate has been seconded, the [`CandidateBacking`][CB] subsystem will send a [`CollatorProtocolMessage`][CPM]`::Seconded`, which will trigger this subsystem to notify the collator at the `PeerId` that first advertised the parablock on the seconded relay head of their successful seconding.
Subsequently, once a valid parablock candidate has been seconded, the [`CandidateBacking`][CB] subsystem will send a
[`CollatorProtocolMessage`][CPM]`::Seconded`, which will trigger this subsystem to notify the collator at the `PeerId`
that first advertised the parablock on the seconded relay head of their successful seconding.
## Future Work
Several approaches have been discussed, but all have some issues:
- The current approach is very straightforward. However, that protocol is vulnerable to a single collator which, as an attack or simply through chance, gets its block candidate to the node more often than its fair share of the time.
- If collators produce blocks via Aura, BABE or in future Sassafras, it may be possible to choose an "Official" collator for the round, but it may be tricky to ensure that the PVF logic is enforced at collator leader election.
- We could use relay-chain BABE randomness to generate some delay `D` on the order of 1 second, +- 1 second. The collator would then second the first valid parablock which arrives after `D`, or in case none has arrived by `2*D`, the last valid parablock which has arrived. This makes it very hard for a collator to game the system to always get its block nominated, but it reduces the maximum throughput of the system by introducing delay into an already tight schedule.
- A variation of that scheme would be to have a fixed acceptance window `D` for parablock candidates and keep track of count `C`: the number of parablock candidates received. At the end of the period `D`, we choose a random number I in the range `[0, C)` and second the block at Index I. Its drawback is the same: it must wait the full `D` period before seconding any of its received candidates, reducing throughput.
- In order to protect against DoS attacks, it may be prudent to run throw out collations from collators that have behaved poorly (whether recently or historically) and subsequently only verify the PoV for the most suitable of collations.
* The current approach is very straightforward. However, that protocol is vulnerable to a single collator which, as an
attack or simply through chance, gets its block candidate to the node more often than its fair share of the time.
* If collators produce blocks via Aura, BABE or in future Sassafras, it may be possible to choose an "Official" collator
for the round, but it may be tricky to ensure that the PVF logic is enforced at collator leader election.
* We could use relay-chain BABE randomness to generate some delay `D` on the order of 1 second, +- 1 second. The
collator would then second the first valid parablock which arrives after `D`, or in case none has arrived by `2*D`,
the last valid parablock which has arrived. This makes it very hard for a collator to game the system to always get
its block nominated, but it reduces the maximum throughput of the system by introducing delay into an already tight
schedule.
* A variation of that scheme would be to have a fixed acceptance window `D` for parablock candidates and keep track of
count `C`: the number of parablock candidates received. At the end of period `D`, we choose a random number `I` in
the range `[0, C)` and second the block at index `I`. Its drawback is the same: it must wait the full `D` period before
seconding any of its received candidates, reducing throughput.
* In order to protect against DoS attacks, it may be prudent to throw out collations from collators that have
behaved poorly (whether recently or historically) and subsequently only verify the PoV for the most suitable of
collations.
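
The fixed-window variation above can be sketched roughly as follows. This is only an illustration: `AcceptanceWindow`, `push` and `select` are hypothetical names, and the random value is supplied by the caller rather than taken from any real randomness source.

```rust
/// Hypothetical collector for the fixed-window scheme: candidates are gathered during
/// the window `D`, and one of them is picked at random once the window closes.
pub struct AcceptanceWindow<T> {
    candidates: Vec<T>,
}

impl<T> AcceptanceWindow<T> {
    pub fn new() -> Self {
        Self { candidates: Vec::new() }
    }

    /// Record a candidate received while the window is open.
    pub fn push(&mut self, candidate: T) {
        self.candidates.push(candidate);
    }

    /// Called at the end of the window. `random` is caller-supplied randomness (an
    /// assumption of this sketch); it is reduced to an index `I` in `[0, C)`.
    /// Returns `None` if no candidate arrived, i.e. `C == 0`.
    pub fn select(self, random: u64) -> Option<T> {
        let count = self.candidates.len() as u64;
        if count == 0 {
            return None;
        }
        let index = (random % count) as usize;
        self.candidates.into_iter().nth(index)
    }
}
```

Nothing can be seconded until `select` runs at the end of the window, which is exactly the throughput cost described in the bullet above.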
[CB]: ../backing/candidate-backing.md
[CBM]: ../../types/overseer-protocol.md#candidate-backing-mesage
@@ -4,12 +4,12 @@ If approval voting finds an invalid candidate, a dispute is raised. The disputes
subsystems are concerned with the following:
1. Disputes can be raised
2. Disputes (votes) get propagated to all other validators
3. Votes get recorded as necessary
3. Nodes will participate in disputes in a sensible fashion
4. Finality is stopped while a candidate is being disputed on chain
5. Chains can be reverted in case a dispute concludes invalid
6. Votes are provided to the provisioner for importing on chain, in order for
1. Disputes (votes) get propagated to all other validators
1. Votes get recorded as necessary
1. Nodes will participate in disputes in a sensible fashion
1. Finality is stopped while a candidate is being disputed on chain
1. Chains can be reverted in case a dispute concludes invalid
1. Votes are provided to the provisioner for importing on chain, in order for
slashing to work.
The dispute-coordinator subsystem interfaces with the provisioner and chain
File diff suppressed because it is too large
@@ -202,8 +202,8 @@ the dispute-coordinator already knows about the dispute.
Goals 3 and 4 are obviously closely related and both can easily be solved via rate
limiting as we shall see below. Rate limits should already be implemented at the
substrate level, but [are not](https://github.com/paritytech/substrate/issues/7750)
at the time of writing. But even if they were, the enforced substrate limits would
Substrate level, but [are not](https://github.com/paritytech/substrate/issues/7750)
at the time of writing. But even if they were, the enforced Substrate limits would
likely not be configurable and thus would still be too high for our needs as we can
rely on the following observations:
@@ -282,10 +282,10 @@ well, we will do the following:
to assume this is concerning a new dispute.
2. We open a batch and start collecting incoming messages for that candidate,
instead of immediately forwarding.
4. We keep collecting votes in the batch until we receive less than
3. We keep collecting votes in the batch until we receive fewer than
`MIN_KEEP_BATCH_ALIVE_VOTES` unique votes in the last `BATCH_COLLECTING_INTERVAL`. This is
important to accommodate goal 5 and also goal 3.
5. We send the whole batch to the dispute-coordinator.
4. We send the whole batch to the dispute-coordinator.
This together with rate limiting explained above ensures we will be able to
process valid disputes: We can limit the number of simultaneous existing batches
@@ -312,8 +312,8 @@ of attackers, each has 10 messages per second, all are needed to maintain the
batches in memory. Therefore we have a hard cap of around 330 (number of
malicious nodes) open batches. Each can be filled with the votes of all malicious
actors. So 330 batches with 330 votes each: let's assume approximately 100
bytes per signature/vote. This results in a worst case memory usage of 330 * 330
* 100 ~= 10 MiB.
bytes per signature/vote. This results in a worst case memory usage of
`330 * 330 * 100 ~= 10 MiB`.
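
The estimate can be checked with a few lines of arithmetic. This is a sketch only; the function names are hypothetical and the figures are the ones assumed in the text (one open batch per malicious node, one vote per malicious node in each batch).

```rust
/// Worst-case memory held in open batches, in bytes: every malicious node keeps one
/// batch open, and each batch holds one vote per malicious node.
pub fn worst_case_batch_bytes(malicious_nodes: u64, bytes_per_vote: u64) -> u64 {
    malicious_nodes * malicious_nodes * bytes_per_vote
}

/// Convert a byte count to MiB for readability.
pub fn to_mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}
```

With 330 malicious nodes and 100 bytes per vote this gives 10 890 000 bytes, a little over 10 MiB. Because the bound grows quadratically, ten times as many malicious nodes (an assumed figure, used only for illustration) cost a hundred times as much memory.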
For 10_000 validators, we are already in the Gigabyte range, which means that
with a validator set that large we might want to be more strict with the rate limit or
@@ -1,10 +1,25 @@
# GRANDPA Voting Rule
Specifics on the motivation and types of constraints we apply to the GRANDPA voting logic as well as the definitions of **viable** and **finalizable** blocks can be found in the [Chain Selection Protocol](../protocol-chain-selection.md) section.
The subsystem which provides us with viable leaves is the [Chain Selection Subsystem](utility/chain-selection.md).
Specifics on the motivation and types of constraints we apply to the GRANDPA voting logic as well as the definitions of
**viable** and **finalizable** blocks can be found in the [Chain Selection Protocol](../protocol-chain-selection.md)
section. The subsystem which provides us with viable leaves is the [Chain Selection
Subsystem](utility/chain-selection.md).
GRANDPA's regular voting rule is for each validator to select the longest chain they are aware of. GRANDPA proceeds in rounds, collecting information from all online validators and determines the blocks that a supermajority of validators all have in common with each other.
GRANDPA's regular voting rule is for each validator to select the longest chain they are aware of. GRANDPA proceeds in
rounds, collecting information from all online validators and determining the blocks that a supermajority of validators
all have in common with each other.
The low-level GRANDPA logic will provide us with a **required block**. We can find the best leaf containing that block in its chain with the [`ChainSelectionMessage::BestLeafContaining`](../types/overseer-protocol.md#chain-selection-message). If the result is `None`, then we will simply cast a vote on the required block.
The low-level GRANDPA logic will provide us with a **required block**. We can find the best leaf containing that block
in its chain with the
[`ChainSelectionMessage::BestLeafContaining`](../types/overseer-protocol.md#chain-selection-message). If the result is
`None`, then we will simply cast a vote on the required block.
The **viable** leaves provided from the chain selection subsystem are not necessarily **finalizable**, so we need to perform further work to discover the finalizable ancestor of the block. The first constraint is to avoid voting on any unapproved block. The highest approved ancestor of a given block can be determined by querying the Approval Voting subsystem via the [`ApprovalVotingMessage::ApprovedAncestor`](../types/overseer-protocol.md#approval-voting) message. If the response is `Some`, we continue and apply the second constraint. The second constraint is to avoid voting on any block containing a candidate undergoing an active dispute. The list of block hashes and candidates returned from `ApprovedAncestor` should be reversed, and passed to the [`DisputeCoordinatorMessage::DetermineUndisputedChain`](../types/overseer-protocol.md#dispute-coordinator-message) to determine the **finalizable** block which will be our eventual vote.
The **viable** leaves provided from the chain selection subsystem are not necessarily **finalizable**, so we need to
perform further work to discover the finalizable ancestor of the block. The first constraint is to avoid voting on any
unapproved block. The highest approved ancestor of a given block can be determined by querying the Approval Voting
subsystem via the [`ApprovalVotingMessage::ApprovedAncestor`](../types/overseer-protocol.md#approval-voting) message. If
the response is `Some`, we continue and apply the second constraint. The second constraint is to avoid voting on any
block containing a candidate undergoing an active dispute. The list of block hashes and candidates returned from
`ApprovedAncestor` should be reversed, and passed to the
[`DisputeCoordinatorMessage::DetermineUndisputedChain`](../types/overseer-protocol.md#dispute-coordinator-message) to
determine the **finalizable** block which will be our eventual vote.
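
The sequence of queries can be sketched as a single function. This is a simplified illustration, not the actual implementation: `Hash` is a stand-in type, the three closures stand in for the `BestLeafContaining`, `ApprovedAncestor` and `DetermineUndisputedChain` requests with simplified argument shapes, and the fallback taken when no approved ancestor exists is an assumption of this sketch.

```rust
/// Hypothetical block identifier; the real subsystems use block hashes and numbers.
pub type Hash = u64;

/// Sketch of the voting rule described above.
pub fn vote_target(
    required: Hash,
    best_leaf_containing: impl Fn(Hash) -> Option<Hash>,
    approved_ancestor: impl Fn(Hash) -> Option<(Hash, Vec<Hash>)>,
    determine_undisputed_chain: impl Fn(Hash, Vec<Hash>) -> Hash,
) -> Hash {
    // No viable leaf contains the required block: vote on the required block itself.
    let Some(leaf) = best_leaf_containing(required) else {
        return required;
    };
    // First constraint: do not vote on any unapproved block.
    // ASSUMPTION: with no approved ancestor we fall back to the required block.
    let Some((_approved, mut descriptions)) = approved_ancestor(leaf) else {
        return required;
    };
    // The list returned from `ApprovedAncestor` is reversed before being passed on.
    descriptions.reverse();
    // Second constraint: stop before any block containing an actively disputed candidate.
    determine_undisputed_chain(required, descriptions)
}
```

Passing the queries in as closures keeps the sketch independent of the message-passing machinery; in the node these would be requests relayed via the overseer.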
@@ -24,27 +24,44 @@ The hierarchy of subsystems:
```
The overseer determines work to do based on block import events and block finalization events. It does this by keeping track of the set of relay-parents for which work is currently being done. This is known as the "active leaves" set. It determines an initial set of active leaves on startup based on the data on-disk, and uses events about blockchain import to update the active leaves. Updates lead to [`OverseerSignal`](../types/overseer-protocol.md#overseer-signal)`::ActiveLeavesUpdate` being sent according to new relay-parents, as well as relay-parents to stop considering. Block import events inform the overseer of leaves that no longer need to be built on, now that they have children, and inform us to begin building on those children. Block finalization events inform us when we can stop focusing on blocks that appear to have been orphaned.
The overseer determines work to do based on block import events and block finalization events. It does this by keeping
track of the set of relay-parents for which work is currently being done. This is known as the "active leaves" set. It
determines an initial set of active leaves on startup based on the data on-disk, and uses events about blockchain import
to update the active leaves. Updates lead to
[`OverseerSignal`](../types/overseer-protocol.md#overseer-signal)`::ActiveLeavesUpdate` being sent according to new
relay-parents, as well as relay-parents to stop considering. Block import events inform the overseer of leaves that no
longer need to be built on, now that they have children, and inform us to begin building on those children. Block
finalization events inform us when we can stop focusing on blocks that appear to have been orphaned.
The overseer is also responsible for tracking the freshness of active leaves. Leaves are fresh when they're encountered for the first time, and stale when they're encountered for subsequent times. This can occur after chain reversions or when the fork-choice rule abandons some chain. This distinction is used to manage **Reversion Safety**. Consensus messages are often localized to a specific relay-parent, and it is often a misbehavior to equivocate or sign two conflicting messages. When reverting the chain, we may begin work on a leaf that subsystems have already signed messages for. Subsystems which need to account for reversion safety should avoid performing work on stale leaves.
The overseer is also responsible for tracking the freshness of active leaves. Leaves are fresh when they're encountered
for the first time, and stale when they're encountered again later. This can occur after chain reversions or
when the fork-choice rule abandons some chain. This distinction is used to manage **Reversion Safety**. Consensus
messages are often localized to a specific relay-parent, and it is often a misbehavior to equivocate or sign two
conflicting messages. When reverting the chain, we may begin work on a leaf that subsystems have already signed messages
for. Subsystems which need to account for reversion safety should avoid performing work on stale leaves.
The overseer's logic can be described with these functions:
## On Startup
* Start all subsystems
* Determine all blocks of the blockchain that should be built on. This should typically be the head of the best fork of the chain we are aware of. Sometimes add recent forks as well.
* Determine all blocks of the blockchain that should be built on. This should typically be the head of the best fork of
the chain we are aware of. Sometimes add recent forks as well.
* Send an `OverseerSignal::ActiveLeavesUpdate` to all subsystems with `activated` containing each of these blocks.
* Begin listening for block import and finality events
## On Block Import Event
* Apply the block import event to the active leaves. A new block should lead to its addition to the active leaves set and its parent being deactivated.
* Mark any stale leaves as stale. The overseer should track all leaves it activates to determine whether leaves are fresh or stale.
* Send an `OverseerSignal::ActiveLeavesUpdate` message to all subsystems containing all activated and deactivated leaves.
* Apply the block import event to the active leaves. A new block should lead to its addition to the active leaves set
and its parent being deactivated.
* Mark any stale leaves as stale. The overseer should track all leaves it activates to determine whether leaves are
fresh or stale.
* Send an `OverseerSignal::ActiveLeavesUpdate` message to all subsystems containing all activated and deactivated
leaves.
* Ensure all `ActiveLeavesUpdate` messages are flushed before resuming activity as a message router.
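
The import steps above can be sketched as a small state update. The types here are simplified, hypothetical stand-ins (a `u64` for a block hash, a reduced `ActiveLeavesUpdate`), intended only to show how activation, deactivation and freshness tracking fit together.

```rust
use std::collections::HashSet;

pub type Hash = u64;

#[derive(Debug, PartialEq, Eq)]
pub enum LeafStatus {
    Fresh,
    Stale,
}

/// Simplified stand-in for the signal payload sent to all subsystems.
#[derive(Debug, PartialEq, Eq)]
pub struct ActiveLeavesUpdate {
    pub activated: Option<(Hash, LeafStatus)>,
    pub deactivated: Vec<Hash>,
}

#[derive(Default)]
pub struct ActiveLeaves {
    active: HashSet<Hash>,
    /// Every leaf ever activated, used to tell fresh leaves from stale ones.
    ever_activated: HashSet<Hash>,
}

impl ActiveLeaves {
    pub fn on_block_import(&mut self, block: Hash, parent: Hash) -> ActiveLeavesUpdate {
        // The parent now has a child, so it no longer needs to be built on.
        let deactivated = if self.active.remove(&parent) { vec![parent] } else { Vec::new() };
        // A leaf is fresh the first time it is activated and stale on any later activation.
        let status = if self.ever_activated.insert(block) {
            LeafStatus::Fresh
        } else {
            LeafStatus::Stale
        };
        self.active.insert(block);
        ActiveLeavesUpdate { activated: Some((block, status)), deactivated }
    }
}
```

A subsystem that cares about reversion safety would inspect the status and skip work on a `Stale` leaf.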
> TODO: in the future, we may want to avoid building on too many sibling blocks at once. the notion of a "preferred head" among many competing sibling blocks would imply changes in our "active leaves" update rules here
> TODO: in the future, we may want to avoid building on too many sibling blocks at once. The notion of a "preferred
> head" among many competing sibling blocks would imply changes in our "active leaves" update rules here.
## On Finalization Event
@@ -54,11 +71,16 @@ The overseer's logic can be described with these functions:
## On Subsystem Failure
Subsystems are essential tasks meant to run as long as the node does. Subsystems can spawn ephemeral work in the form of jobs, but the subsystems themselves should not go down. If a subsystem goes down, it will be because of a critical error that should take the entire node down as well.
Subsystems are essential tasks meant to run as long as the node does. Subsystems can spawn ephemeral work in the form of
jobs, but the subsystems themselves should not go down. If a subsystem goes down, it will be because of a critical error
that should take the entire node down as well.
## Communication Between Subsystems
When a subsystem wants to communicate with another subsystem, or, more typically, a job within a subsystem wants to communicate with its counterpart under another subsystem, that communication must happen via the overseer. Consider this example where a job on subsystem A wants to send a message to its counterpart under subsystem B. This is a realistic scenario, where you can imagine that both jobs correspond to work under the same relay-parent.
When a subsystem wants to communicate with another subsystem, or, more typically, a job within a subsystem wants to
communicate with its counterpart under another subsystem, that communication must happen via the overseer. Consider this
example where a job on subsystem A wants to send a message to its counterpart under subsystem B. This is a realistic
scenario, where you can imagine that both jobs correspond to work under the same relay-parent.
```text
+--------+ +--------+
@@ -78,21 +100,48 @@ When a subsystem wants to communicate with another subsystem, or, more typically
+------------------------------+
```
First, the subsystem that spawned a job is responsible for handling the first step of the communication. The overseer is not aware of the hierarchy of tasks within any given subsystem and is only responsible for subsystem-to-subsystem communication. So the sending subsystem must pass on the message via the overseer to the receiving subsystem, in such a way that the receiving subsystem can further address the communication to one of its internal tasks, if necessary.
First, the subsystem that spawned a job is responsible for handling the first step of the communication. The overseer is
not aware of the hierarchy of tasks within any given subsystem and is only responsible for subsystem-to-subsystem
communication. So the sending subsystem must pass on the message via the overseer to the receiving subsystem, in such a
way that the receiving subsystem can further address the communication to one of its internal tasks, if necessary.
This communication prevents a certain class of race conditions. When the Overseer determines that it is time for subsystems to begin working on top of a particular relay-parent, it will dispatch a `ActiveLeavesUpdate` message to all subsystems to do so, and those messages will be handled asynchronously by those subsystems. Some subsystems will receive those messsages before others, and it is important that a message sent by subsystem A after receiving `ActiveLeavesUpdate` message will arrive at subsystem B after its `ActiveLeavesUpdate` message. If subsystem A maintained an independent channel with subsystem B to communicate, it would be possible for subsystem B to handle the side message before the `ActiveLeavesUpdate` message, but it wouldn't have any logical course of action to take with the side message - leading to it being discarded or improperly handled. Well-architectured state machines should have a single source of inputs, so that is what we do here.
This communication prevents a certain class of race conditions. When the Overseer determines that it is time for
subsystems to begin working on top of a particular relay-parent, it will dispatch an `ActiveLeavesUpdate` message to
all subsystems to do so, and those messages will be handled asynchronously by those subsystems. Some subsystems will
receive those messages before others, and it is important that a message sent by subsystem A after receiving the
`ActiveLeavesUpdate` message will arrive at subsystem B after B's own `ActiveLeavesUpdate` message. If subsystem A
maintained an independent channel with subsystem B to communicate, it would be possible for subsystem B to handle the
side message before the `ActiveLeavesUpdate` message, but it wouldn't have any logical course of action to take with the
side message, leading to it being discarded or improperly handled. Well-architected state machines should have a
single source of inputs, so that is what we do here.
One exception is reasonable to make for responses to requests. A request should be made via the overseer in order to ensure that it arrives after any relevant `ActiveLeavesUpdate` message. A subsystem issuing a request as a result of a `ActiveLeavesUpdate` message can safely receive the response via a side-channel for two reasons:
One exception is reasonable to make for responses to requests. A request should be made via the overseer in order to
ensure that it arrives after any relevant `ActiveLeavesUpdate` message. A subsystem issuing a request as a result of an
`ActiveLeavesUpdate` message can safely receive the response via a side-channel for two reasons:
1. It's impossible for a request to be answered before it arrives, it is provable that any response to a request obeys the same ordering constraint.
1. The request was sent as a result of handling a `ActiveLeavesUpdate` message. Then there is no possible future in which the `ActiveLeavesUpdate` message has not been handled upon the receipt of the response.
1. It's impossible for a request to be answered before it arrives, so it is provable that any response to a request
obeys the same ordering constraint.
1. The request was sent as a result of handling an `ActiveLeavesUpdate` message. Then there is no possible future in
which the `ActiveLeavesUpdate` message has not been handled upon the receipt of the response.
So as a single exception to the rule that all communication must happen via the overseer we allow the receipt of responses to requests via a side-channel, which may be established for that purpose. This simplifies any cases where the outside world desires to make a request to a subsystem, as the outside world can then establish a side-channel to receive the response on.
So as a single exception to the rule that all communication must happen via the overseer we allow the receipt of
responses to requests via a side-channel, which may be established for that purpose. This simplifies any cases where the
outside world desires to make a request to a subsystem, as the outside world can then establish a side-channel to
receive the response on.
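
A minimal sketch of this pattern, using `std::sync::mpsc` channels as a stand-in for the overseer's message bus and for the oneshot response channel. The message enum and both functions are hypothetical and exist only to illustrate the ordering argument.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Messages a subsystem receives from the overseer: its single source of inputs.
pub enum FromOverseer {
    ActiveLeavesUpdate(u64),
    /// A request carrying the sending half of a side-channel for the response.
    Request { relay_parent: u64, response: Sender<bool> },
}

/// Hypothetical subsystem B: it answers whether it already knows the relay-parent.
pub fn run_subsystem_b(inbox: Receiver<FromOverseer>) {
    let mut known = Vec::new();
    // The loop ends once every sender for the inbox has been dropped.
    for msg in inbox {
        match msg {
            FromOverseer::ActiveLeavesUpdate(leaf) => known.push(leaf),
            FromOverseer::Request { relay_parent, response } => {
                // The response bypasses the overseer and goes straight to the requester.
                let _ = response.send(known.contains(&relay_parent));
            }
        }
    }
}

/// The request travels on the same ordered channel as the update, so B always
/// handles the `ActiveLeavesUpdate` first.
pub fn request_via_overseer(relay_parent: u64) -> bool {
    let (to_b, b_inbox) = channel();
    let (response_tx, response_rx) = channel();
    to_b.send(FromOverseer::ActiveLeavesUpdate(relay_parent)).unwrap();
    to_b.send(FromOverseer::Request { relay_parent, response: response_tx }).unwrap();
    drop(to_b); // closing the channel lets B's loop finish
    run_subsystem_b(b_inbox);
    response_rx.recv().unwrap()
}
```

Only the response uses the side-channel; had the request itself bypassed the ordered inbox, B could have seen it before the update and answered `false`.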
It's important to note that the overseer is not aware of the internals of subsystems, and this extends to the jobs that they spawn. The overseer isn't aware of the existence or definition of those jobs, and is only aware of the outer subsystems with which it interacts. This gives subsystem implementations leeway to define internal jobs as they see fit, and to wrap a more complex hierarchy of state machines than having a single layer of jobs for relay-parent-based work. Likewise, subsystems aren't required to spawn jobs. Certain types of subsystems, such as those for shared storage or networking resources, won't perform block-based work but would still benefit from being on the Overseer's message bus. These subsystems can just ignore the overseer's signals for block-based work.
It's important to note that the overseer is not aware of the internals of subsystems, and this extends to the jobs that
they spawn. The overseer isn't aware of the existence or definition of those jobs, and is only aware of the outer
subsystems with which it interacts. This gives subsystem implementations leeway to define internal jobs as they see fit,
and to wrap a more complex hierarchy of state machines than having a single layer of jobs for relay-parent-based work.
Likewise, subsystems aren't required to spawn jobs. Certain types of subsystems, such as those for shared storage or
networking resources, won't perform block-based work but would still benefit from being on the Overseer's message bus.
These subsystems can just ignore the overseer's signals for block-based work.
Furthermore, the protocols by which subsystems communicate with each other should be well-defined irrespective of the implementation of the subsystem. In other words, their interface should be distinct from their implementation. This will prevent subsystems from accessing aspects of each other that are beyond the scope of the communication boundary.
Furthermore, the protocols by which subsystems communicate with each other should be well-defined irrespective of the
implementation of the subsystem. In other words, their interface should be distinct from their implementation. This will
prevent subsystems from accessing aspects of each other that are beyond the scope of the communication boundary.
## On shutdown
Send an `OverseerSignal::Conclude` message to each subsystem and wait some time for them to conclude before hard-exiting.
Send an `OverseerSignal::Conclude` message to each subsystem and wait some time for them to conclude before
hard-exiting.
@@ -1,25 +1,66 @@
# Subsystems and Jobs
In this section we define the notions of Subsystems and Jobs. These are guidelines for how we will employ an architecture of hierarchical state machines. We'll have a top-level state machine which oversees the next level of state machines which oversee another layer of state machines and so on. The next sections will lay out these guidelines for what we've called subsystems and jobs, since this model applies to many of the tasks that the Node-side behavior needs to encompass, but these are only guidelines and some Subsystems may have deeper hierarchies internally.
In this section we define the notions of Subsystems and Jobs. These are
guidelines for how we will employ an architecture of hierarchical state
machines. We'll have a top-level state machine which oversees the next level of
state machines which oversee another layer of state machines and so on. The next
sections will lay out these guidelines for what we've called subsystems and
jobs, since this model applies to many of the tasks that the Node-side behavior
needs to encompass, but these are only guidelines and some Subsystems may have
deeper hierarchies internally.
Subsystems are long-lived worker tasks that are in charge of performing some particular kind of work. All subsystems can communicate with each other via a well-defined protocol. Subsystems can't generally communicate directly, but must coordinate communication through an [Overseer](overseer.md), which is responsible for relaying messages, handling subsystem failures, and dispatching work signals.
Subsystems are long-lived worker tasks that are in charge of performing some
particular kind of work. All subsystems can communicate with each other via a
well-defined protocol. Subsystems can't generally communicate directly, but must
coordinate communication through an [Overseer](overseer.md), which is
responsible for relaying messages, handling subsystem failures, and dispatching
work signals.
Most work that happens on the Node-side is related to building on top of a specific relay-chain block, which is contextually known as the "relay parent". We call it the relay parent to explicitly denote that it is a block in the relay chain and not on a parachain. We refer to the parent because when we are in the process of building a new block, we don't know what that new block is going to be. The parent block is our only stable point of reference, even though it is usually only useful when it is not yet a parent but in fact a leaf of the block-DAG expected to soon become a parent (because validators are authoring on top of it). Furthermore, we are assuming a forkful blockchain-extension protocol, which means that there may be multiple possible children of the relay-parent. Even if the relay parent has multiple children blocks, the parent of those children is the same, and the context in which those children is authored should be the same. The parent block is the best and most stable reference to use for defining the scope of work items and messages, and is typically referred to by its cryptographic hash.
Most work that happens on the Node-side is related to building on top of a
specific relay-chain block, which is contextually known as the "relay parent".
We call it the relay parent to explicitly denote that it is a block in the relay
chain and not on a parachain. We refer to the parent because when we are in the
process of building a new block, we don't know what that new block is going to
be. The parent block is our only stable point of reference, even though it is
usually only useful when it is not yet a parent but in fact a leaf of the
block-DAG expected to soon become a parent (because validators are authoring on
top of it). Furthermore, we are assuming a forkful blockchain-extension
protocol, which means that there may be multiple possible children of the
relay-parent. Even if the relay parent has multiple children blocks, the parent
of those children is the same, and the context in which those children are
authored should be the same. The parent block is the best and most stable
reference to use for defining the scope of work items and messages, and is
typically referred to by its cryptographic hash.
Since this goal of determining when to start and conclude work relative to a specific relay-parent is common to most, if not all subsystems, it is logically the job of the Overseer to distribute those signals as opposed to each subsystem duplicating that effort, potentially being out of synchronization with each other. Subsystem A should be able to expect that subsystem B is working on the same relay-parents as it is. One of the Overseer's tasks is to provide this heartbeat, or synchronized rhythm, to the system.
Since this goal of determining when to start and conclude work relative to a
specific relay-parent is common to most, if not all subsystems, it is logically
the job of the Overseer to distribute those signals as opposed to each subsystem
duplicating that effort, potentially being out of synchronization with each
other. Subsystem A should be able to expect that subsystem B is working on the
same relay-parents as it is. One of the Overseer's tasks is to provide this
heartbeat, or synchronized rhythm, to the system.
The work that subsystems spawn to be done on a specific relay-parent is known as a job. Subsystems should set up and tear down jobs according to the signals received from the overseer. Subsystems may share or cache state between jobs.
The work that subsystems spawn to be done on a specific relay-parent is known as
a job. Subsystems should set up and tear down jobs according to the signals
received from the overseer. Subsystems may share or cache state between jobs.
Subsystems must be robust to spurious exits. The outputs of the set of subsystems as a whole comprises of signed messages and data committed to disk. Care must be taken to avoid issuing messages that are not substantiated. Since subsystems need to be safe under spurious exits, it is the expected behavior that an `OverseerSignal::Conclude` can just lead to breaking the loop and exiting directly as opposed to waiting for everything to shut down gracefully.
Subsystems must be robust to spurious exits. The outputs of the set of
subsystems as a whole consist of signed messages and data committed to disk.
Care must be taken to avoid issuing messages that are not substantiated. Since
subsystems need to be safe under spurious exits, it is the expected behavior
that an `OverseerSignal::Conclude` can just lead to breaking the loop and
exiting directly as opposed to waiting for everything to shut down gracefully.
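
As a rough sketch of such a main loop: the `OverseerSignal` enum below is a simplified, hypothetical stand-in for the real signal type, jobs are reduced to the relay-parent they work on, and `std::sync::mpsc` stands in for the overseer's channel.

```rust
use std::sync::mpsc::Receiver;

/// Simplified stand-in for the signals the overseer sends to a subsystem.
pub enum OverseerSignal {
    ActiveLeavesUpdate { activated: Vec<u64>, deactivated: Vec<u64> },
    Conclude,
}

/// Hypothetical subsystem loop. Returns the relay-parents that still had jobs when the
/// loop ended, purely so the behaviour can be observed in this sketch.
pub fn run(signals: Receiver<OverseerSignal>) -> Vec<u64> {
    let mut jobs: Vec<u64> = Vec::new();
    // A closed channel is treated like `Conclude`: spurious exits must be tolerated.
    while let Ok(signal) = signals.recv() {
        match signal {
            OverseerSignal::ActiveLeavesUpdate { activated, deactivated } => {
                // Tear down jobs for deactivated relay-parents, set up jobs for new ones.
                jobs.retain(|job| !deactivated.contains(job));
                jobs.extend(activated);
            }
            // Break out directly instead of waiting for a graceful shutdown.
            OverseerSignal::Conclude => break,
        }
    }
    jobs
}
```

Anything sent after `Conclude` is never processed, which is acceptable precisely because subsystems are required to be safe under spurious exits.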
## Subsystem Message Traffic
Which subsystems send messages to which other subsystems.
**Note**: This diagram omits the overseer for simplicity. In fact, all messages are relayed via the overseer.
**Note**: This diagram omits the overseer for simplicity. In fact, all messages
are relayed via the overseer.
**Note**: Messages with a filled diamond arrowhead ("♦") include a `oneshot::Sender` which communicates a response from the recipient.
Messages with an open triangle arrowhead ("Δ") do not include a return sender.
**Note**: Messages with a filled diamond arrowhead ("♦") include a
`oneshot::Sender` which communicates a response from the recipient. Messages
with an open triangle arrowhead ("Δ") do not include a return sender.
```dot process
digraph {
@@ -125,14 +166,17 @@ digraph {
## The Path to Inclusion (Node Side)
Let's contextualize that diagram a bit by following a parachain block from its creation through finalization.
Parachains can use completely arbitrary processes to generate blocks. The relay chain doesn't know or care about
the details; each parachain just needs to provide a [collator](collators/collation-generation.md).
Let's contextualize that diagram a bit by following a parachain block from its
creation through finalization. Parachains can use completely arbitrary processes
to generate blocks. The relay chain doesn't know or care about the details; each
parachain just needs to provide a [collator](collators/collation-generation.md).
**Note**: Inter-subsystem communications are relayed via the overseer, but that step is omitted here for brevity.
**Note**: Inter-subsystem communications are relayed via the overseer, but that
step is omitted here for brevity.
**Note**: Dashed lines indicate a request/response cycle, where the response is communicated asynchronously via
a oneshot channel. Adjacent dashed lines may be processed in parallel.
**Note**: Dashed lines indicate a request/response cycle, where the response is
communicated asynchronously via a oneshot channel. Adjacent dashed lines may be
processed in parallel.
```mermaid
sequenceDiagram
@@ -156,11 +200,13 @@ sequenceDiagram
end
```
The `DistributeCollation` message that `CollationGeneration` sends to the `CollatorProtocol` contains
two items: a `CandidateReceipt` and `PoV`. The `CollatorProtocol` is then responsible for distributing
that collation to interested validators. However, not all potential collations are of interest. The
`CandidateSelection` subsystem is responsible for determining which collations are interesting, before
`CollatorProtocol` actually fetches the collation.
The `DistributeCollation` message that `CollationGeneration` sends to the
`CollatorProtocol` contains two items: a `CandidateReceipt` and a `PoV`. The
`CollatorProtocol` is then responsible for distributing that collation to
interested validators. However, not all potential collations are of interest.
The `CandidateSelection` subsystem is responsible for determining which
collations are interesting, before `CollatorProtocol` actually fetches the
collation.
```mermaid
sequenceDiagram
@@ -205,10 +251,11 @@ sequenceDiagram
end
```
Assuming we hit the happy path, flow continues with `CandidateSelection` receiving a `(candidate_receipt, pov)` as
the return value from its
`FetchCollation` request. The only time `CandidateSelection` actively requests a collation is when
it hasn't yet seconded one for some `relay_parent`, and is ready to second.
Assuming we hit the happy path, flow continues with `CandidateSelection`
receiving a `(candidate_receipt, pov)` as the return value from its
`FetchCollation` request. The only time `CandidateSelection` actively requests a
collation is when it hasn't yet seconded one for some `relay_parent`, and is
ready to second.
```mermaid
sequenceDiagram
@@ -243,15 +290,17 @@ sequenceDiagram
end
```
At this point, you'll see that control flows in two directions: to `StatementDistribution` to distribute
the `SignedStatement`, and to `PoVDistribution` to distribute the `PoV`. However, that's largely a mirage:
while the initial implementation distributes `PoV`s by gossip, that's inefficient, and will be replaced
with a system which fetches `PoV`s only when actually necessary.
At this point, you'll see that control flows in two directions: to
`StatementDistribution` to distribute the `SignedStatement`, and to
`PoVDistribution` to distribute the `PoV`. However, that's largely a mirage:
while the initial implementation distributes `PoV`s by gossip, that's
inefficient, and will be replaced with a system which fetches `PoV`s only when
actually necessary.
> TODO: figure out more precisely the current status and plans; write them up
Therefore, we'll follow the `SignedStatement`. The `StatementDistribution` subsystem is largely concerned
with implementing a gossip protocol:
Therefore, we'll follow the `SignedStatement`. The `StatementDistribution`
subsystem is largely concerned with implementing a gossip protocol:
```mermaid
sequenceDiagram
@@ -278,8 +327,8 @@ sequenceDiagram
end
```
But who are these `Listener`s who've asked to be notified about incoming `SignedStatement`s?
Nobody, as yet.
But who are these `Listener`s who've asked to be notified about incoming
`SignedStatement`s? Nobody, as yet.
Let's pick back up with the PoV Distribution subsystem.
@@ -305,11 +354,13 @@ sequenceDiagram
Note over PD,NB: On receipt of a network PoV, PovDistribution forwards it to each Listener.<br/>It also penalizes bad gossipers.
```
Unlike in the case of `StatementDistribution`, there is another subsystem which in various circumstances
already registers a listener to be notified when a new `PoV` arrives: `CandidateBacking`. Note that this
is the second time that `CandidateBacking` has gotten involved. The first instance was from the perspective
of the validator choosing to second a candidate via its `CandidateSelection` subsystem. This time, it's
from the perspective of some other validator, being informed that this foreign `PoV` has been received.
Unlike in the case of `StatementDistribution`, there is another subsystem which
in various circumstances already registers a listener to be notified when a new
`PoV` arrives: `CandidateBacking`. Note that this is the second time that
`CandidateBacking` has gotten involved. The first instance was from the
perspective of the validator choosing to second a candidate via its
`CandidateSelection` subsystem. This time, it's from the perspective of some
other validator, being informed that this foreign `PoV` has been received.
```mermaid
sequenceDiagram
@@ -326,10 +377,11 @@ sequenceDiagram
CB ->> AS: StoreAvailableData
```
At this point, things have gone a bit nonlinear. Let's pick up the thread again with `BitfieldSigning`. As
the `Overseer` activates each relay parent, it starts a `BitfieldSigningJob` which operates on an extremely
simple metric: after creation, it immediately goes to sleep for 1.5 seconds. On waking, it records the state
of the world pertaining to availability at that moment.
At this point, things have gone a bit nonlinear. Let's pick up the thread again
with `BitfieldSigning`. As the `Overseer` activates each relay parent, it starts
a `BitfieldSigningJob` which operates on an extremely simple metric: after
creation, it immediately goes to sleep for 1.5 seconds. On waking, it records
the state of the world pertaining to availability at that moment.
```mermaid
sequenceDiagram
@@ -350,9 +402,10 @@ sequenceDiagram
end
```
`BitfieldDistribution` is, like the other `*Distribution` subsystems, primarily interested in implementing
a peer-to-peer gossip network propagating its particular messages. However, it also serves as an essential
relay passing the message along.
`BitfieldDistribution` is, like the other `*Distribution` subsystems, primarily
interested in implementing a peer-to-peer gossip network propagating its
particular messages. However, it also serves as an essential relay passing the
message along.
```mermaid
sequenceDiagram
@@ -366,12 +419,14 @@ sequenceDiagram
BD ->> NB: SendValidationMessage::BitfieldDistribution::Bitfield
```
We've now seen the message flow to the `Provisioner`: both `CandidateBacking` and `BitfieldDistribution`
contribute provisionable data. Now, let's look at that subsystem.
We've now seen the message flow to the `Provisioner`: both `CandidateBacking`
and `BitfieldDistribution` contribute provisionable data. Now, let's look at
that subsystem.
Much like the `BitfieldSigning` subsystem, the `Provisioner` creates a new job for each newly-activated
leaf, and starts a timer. Unlike `BitfieldSigning`, we won't depict that part of the process, because
the `Provisioner` also has other things going on.
Much like the `BitfieldSigning` subsystem, the `Provisioner` creates a new job
for each newly-activated leaf, and starts a timer. Unlike `BitfieldSigning`, we
won't depict that part of the process, because the `Provisioner` also has other
things going on.
```mermaid
sequenceDiagram
@@ -411,8 +466,9 @@ sequenceDiagram
end
```
In principle, any arbitrary subsystem could send a `RequestInherentData` to the `Provisioner`. In practice,
only the `ParachainsInherentDataProvider` does so.
In principle, any arbitrary subsystem could send a `RequestInherentData` to the
`Provisioner`. In practice, only the `ParachainsInherentDataProvider` does so.
The tuple `(SignedAvailabilityBitfields, BackedCandidates, ParentHeader)` is injected by the `ParachainsInherentDataProvider`
into the inherent data. From that point on, control passes from the node to the runtime.
The tuple `(SignedAvailabilityBitfields, BackedCandidates, ParentHeader)` is
injected by the `ParachainsInherentDataProvider` into the inherent data. From
that point on, control passes from the node to the runtime.
@@ -9,13 +9,20 @@ The two data types:
For each of these data we have pruning rules that determine how long we need to keep that data available.
PoVs hypothetically only need to be kept around until the block where the data was made fully available is finalized. However, disputes can revert finality, so we need to be a bit more conservative and we add a delay. We should keep the PoV until a block that finalized availability of it has been finalized for 1 day + 1 hour.
PoVs hypothetically only need to be kept around until the block where the data was made fully available is finalized.
However, disputes can revert finality, so we need to be a bit more conservative and we add a delay. We should keep the
PoV until a block that finalized availability of it has been finalized for 1 day + 1 hour.
Availability chunks need to be kept available until the dispute period for the corresponding candidate has ended. We can accomplish this by using the same criterion as the above. This gives us a pruning condition of the block finalizing availability of the chunk being final for 1 day + 1 hour.
Availability chunks need to be kept available until the dispute period for the corresponding candidate has ended. We can
accomplish this by using the same criterion as the above. This gives us a pruning condition of the block finalizing
availability of the chunk being final for 1 day + 1 hour.
There is also the case where a validator commits to make a PoV available, but the corresponding candidate is never backed. In this case, we keep the PoV available for 1 hour.
There is also the case where a validator commits to make a PoV available, but the corresponding candidate is never
backed. In this case, we keep the PoV available for 1 hour.
There may be multiple competing blocks all ending the availability phase for a particular candidate. Until finality, it will be unclear which of those is actually the canonical chain, so the pruning records for PoVs and Availability chunks should keep track of all such blocks.
There may be multiple competing blocks all ending the availability phase for a particular candidate. Until finality, it
will be unclear which of those is actually the canonical chain, so the pruning records for PoVs and Availability chunks
should keep track of all such blocks.
## Lifetime of the block data and chunks in storage
@@ -44,7 +51,8 @@ We use an underlying Key-Value database where we assume we have the following op
- `write(key, value)`
- `read(key) -> Option<value>`
- `iter_with_prefix(prefix) -> Iterator<(key, value)>` - gives all keys and values in lexicographical order where the key starts with `prefix`.
- `iter_with_prefix(prefix) -> Iterator<(key, value)>` - gives all keys and values in lexicographical order where the
key starts with `prefix`.
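The three operations above can be modeled with an ordered in-memory map. The sketch below is an illustration only: `MemoryDb` and `numbered_key` are hypothetical names, and it assumes that numeric key components are encoded big-endian so that byte-wise key order matches numeric order.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for the key-value database.
#[derive(Default)]
pub struct MemoryDb {
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl MemoryDb {
    pub fn write(&mut self, key: Vec<u8>, value: Vec<u8>) {
        self.entries.insert(key, value);
    }

    pub fn read(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.entries.get(key)
    }

    /// Yields every entry whose key starts with `prefix`, in lexicographic key order.
    pub fn iter_with_prefix<'a>(
        &'a self,
        prefix: &'a [u8],
    ) -> impl Iterator<Item = (&'a Vec<u8>, &'a Vec<u8>)> + 'a {
        // Keys sharing a prefix are contiguous in a sorted map, so start at the
        // prefix itself and stop at the first key that no longer matches.
        self.entries
            .range(prefix.to_vec()..)
            .take_while(move |(key, _)| key.starts_with(prefix))
    }
}

/// Builds a key from a textual prefix followed by a big-endian number.
pub fn numbered_key(prefix: &str, number: u64) -> Vec<u8> {
    let mut key = prefix.as_bytes().to_vec();
    key.extend_from_slice(&number.to_be_bytes());
    key
}
```

With big-endian encoding, keys for the numbers 2, 256 and 300 iterate in that order. A little-endian encoding would place 256 before 2, because its first byte is `0x00`.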
We use this database to encode the following schema:
@@ -57,7 +65,8 @@ We use this database to encode the following schema:
("prune_by_time", Timestamp, CandidateHash) -> Option<()>
```
Timestamps are the wall-clock seconds since Unix epoch. Timestamps and block numbers are both encoded as big-endian so lexicographic order is ascending.
Timestamps are the wall-clock seconds since Unix epoch. Timestamps and block numbers are both encoded as big-endian so
lexicographic order is ascending.
The meta information that we track per-candidate is defined as the `CandidateMeta` struct
@@ -80,9 +89,12 @@ enum State {
}
```
We maintain the invariant that if a candidate has a meta entry, its available data exists on disk if `data_available` is true. All chunks mentioned in the meta entry are available.
We maintain the invariant that if a candidate has a meta entry, its available data exists on disk if `data_available` is
true. All chunks mentioned in the meta entry are available.
Additionally, there is exactly one `prune_by_time` entry which holds the candidate hash unless the state is `Unfinalized`. There may be zero, one, or many "unfinalized" keys with the given candidate, and this will correspond to the `state` of the meta entry.
Additionally, there is exactly one `prune_by_time` entry which holds the candidate hash unless the state is
`Unfinalized`. There may be zero, one, or many "unfinalized" keys with the given candidate, and this will correspond to
the `state` of the meta entry.
## Protocol
@@ -96,9 +108,15 @@ Output:
For each head in the `activated` list:
- Load all ancestors of the head back to the finalized block so we don't miss anything if import notifications are missed. If a `StoreChunk` message is received for a candidate which has no entry, then we will prematurely lose the data.
- Note any new candidates backed in the head. Update the `CandidateMeta` for each. If the `CandidateMeta` does not exist, create it as `Unavailable` with the current timestamp. Register a `"prune_by_time"` entry based on the current timestamp + 1 hour.
- Note any new candidate included in the head. Update the `CandidateMeta` for each, performing a transition from `Unavailable` to `Unfinalized` if necessary. That includes removing the `"prune_by_time"` entry. Add the head hash and number to the state, if unfinalized. Add an `"unfinalized"` entry for the block and candidate.
- Load all ancestors of the head back to the finalized block so we don't miss anything if import notifications are
missed. If a `StoreChunk` message is received for a candidate which has no entry, then we will prematurely lose the
data.
- Note any new candidates backed in the head. Update the `CandidateMeta` for each. If the `CandidateMeta` does not
exist, create it as `Unavailable` with the current timestamp. Register a `"prune_by_time"` entry based on the current
timestamp + 1 hour.
- Note any new candidate included in the head. Update the `CandidateMeta` for each, performing a transition from
`Unavailable` to `Unfinalized` if necessary. That includes removing the `"prune_by_time"` entry. Add the head hash and
number to the state, if unfinalized. Add an `"unfinalized"` entry for the block and candidate.
- The `CandidateEvent` runtime API can be used for this purpose.
On `OverseerSignal::BlockFinalized(finalized)` events:
@@ -106,17 +124,22 @@ On `OverseerSignal::BlockFinalized(finalized)` events:
- for each key in `iter_with_prefix("unfinalized")`
- Stop if the key is beyond `("unfinalized", finalized)`
- For each block number f that we encounter, load the finalized hash for that block.
- The state of each `CandidateMeta` we encounter here must be `Unfinalized`, since we loaded the candidate from an `"unfinalized"` key.
- The state of each `CandidateMeta` we encounter here must be `Unfinalized`, since we loaded the candidate from an
`"unfinalized"` key.
- For each candidate that we encounter under `f` and the finalized block hash,
- Update the `CandidateMeta` to have `State::Finalized`. Remove all `"unfinalized"` entries from the old `Unfinalized` state.
- Update the `CandidateMeta` to have `State::Finalized`. Remove all `"unfinalized"` entries from the old
`Unfinalized` state.
- Register a `"prune_by_time"` entry for the candidate based on the current time + 1 day + 1 hour.
- For each candidate that we encounter under `f` which is not under the finalized block hash,
- Remove all entries under `f` in the `Unfinalized` state.
- If the `CandidateMeta` has state `Unfinalized` with an empty list of blocks, downgrade to `Unavailable` and re-schedule pruning under the timestamp + 1 hour. We do not prune here as the candidate still may be included in a descendant of the finalized chain.
- If the `CandidateMeta` has state `Unfinalized` with an empty list of blocks, downgrade to `Unavailable` and
re-schedule pruning under the timestamp + 1 hour. We do not prune here as the candidate still may be included in
a descendant of the finalized chain.
- Remove all `"unfinalized"` keys under `f`.
- Update `last_finalized` = finalized.
This is roughly `O(n * m)` where n is the number of blocks finalized since the last update, and `m` is the number of parachains.
This is roughly `O(n * m)` where `n` is the number of blocks finalized since the last update, and `m` is the number of
parachains.
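The per-candidate state transition on finalization can be sketched as a pure function. This is a simplified reading of the steps above, not the subsystem's code: the shape of `State`, the type aliases, the constants and `on_block_finalized` are assumptions made for illustration, and "the timestamp + 1 hour" is interpreted as the timestamp stored in the state.

```rust
pub type Timestamp = u64; // seconds since the Unix epoch
pub type BlockNumber = u32;
pub type Hash = [u8; 32];

pub const KEEP_UNAVAILABLE_FOR: Timestamp = 60 * 60; // 1 hour
pub const KEEP_FINALIZED_FOR: Timestamp = 25 * 60 * 60; // 1 day + 1 hour

/// Assumed shape of the per-candidate state.
#[derive(Debug, Clone, PartialEq)]
pub enum State {
    Unavailable(Timestamp),
    Unfinalized(Timestamp, Vec<(BlockNumber, Hash)>),
    Finalized(Timestamp),
}

/// Applies the finalization of block `(number, hash)` to one candidate's state.
/// Returns the new state and, if a "prune_by_time" entry must be registered,
/// the time at which the candidate should be pruned.
pub fn on_block_finalized(
    state: State,
    number: BlockNumber,
    hash: Hash,
    now: Timestamp,
) -> (State, Option<Timestamp>) {
    match state {
        State::Unfinalized(since, mut blocks) => {
            if blocks.contains(&(number, hash)) {
                // Included in the finalized block: keep for 1 day + 1 hour.
                (State::Finalized(now), Some(now + KEEP_FINALIZED_FOR))
            } else {
                // Included only in competing blocks at this height: drop those entries.
                blocks.retain(|(n, _)| *n != number);
                if blocks.is_empty() {
                    // Downgrade rather than prune: a descendant may still include it.
                    (State::Unavailable(since), Some(since + KEEP_UNAVAILABLE_FOR))
                } else {
                    (State::Unfinalized(since, blocks), None)
                }
            }
        }
        // Candidates in other states are not reachable from "unfinalized" keys.
        other => (other, None),
    }
}
```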
On `QueryAvailableData` message:
@@ -139,7 +162,8 @@ On `QueryChunk` message:
On `QueryAllChunks` message:
- Query `("meta", candidate_hash)`. If `None`, send an empty response and return.
- For all `1` bits in the `chunks_stored`, query `("chunk", candidate_hash, index)`. Ignore but warn on errors, and return a vector of all loaded chunks.
- For all `1` bits in the `chunks_stored`, query `("chunk", candidate_hash, index)`. Ignore but warn on errors, and
return a vector of all loaded chunks.
On `QueryChunkAvailability` message:
@@ -149,14 +173,17 @@ On `QueryChunkAvailability` message:
On `StoreChunk` message:
- If there is a `CandidateMeta` under the candidate hash, set the bit of the erasure-chunk in the `chunks_stored` bitfield to `1`. If it was not `1` already, write the chunk under `("chunk", candidate_hash, chunk_index)`.
- If there is a `CandidateMeta` under the candidate hash, set the bit of the erasure-chunk in the `chunks_stored`
bitfield to `1`. If it was not `1` already, write the chunk under `("chunk", candidate_hash, chunk_index)`.
This is `O(n)` in the size of the chunk.
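A minimal sketch of the `StoreChunk` handling, using hash maps in place of the database. `Store`, the reduced `CandidateMeta`, and the `Vec<bool>` bitfield are hypothetical simplifications; rejecting an out-of-range index is an added assumption.

```rust
use std::collections::HashMap;

pub type CandidateHash = [u8; 32];

/// Hypothetical, reduced form of the per-candidate metadata.
pub struct CandidateMeta {
    /// One flag per erasure chunk; `true` means the chunk is on disk.
    pub chunks_stored: Vec<bool>,
}

#[derive(Default)]
pub struct Store {
    pub meta: HashMap<CandidateHash, CandidateMeta>,
    pub chunks: HashMap<(CandidateHash, u32), Vec<u8>>,
}

impl Store {
    /// Returns `true` if the chunk was written and `false` if it was ignored.
    pub fn store_chunk(&mut self, candidate: CandidateHash, index: u32, chunk: Vec<u8>) -> bool {
        // No meta entry for the candidate: the chunk is not stored.
        let Some(meta) = self.meta.get_mut(&candidate) else { return false };
        // Assumption: an index outside the bitfield is ignored as well.
        let Some(bit) = meta.chunks_stored.get_mut(index as usize) else { return false };
        if *bit {
            // Already stored: skip the redundant write.
            return false;
        }
        *bit = true;
        self.chunks.insert((candidate, index), chunk);
        true
    }
}
```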
On `StoreAvailableData` message:
- Compute the erasure root of the available data and compare it with `expected_erasure_root`. Return `StoreAvailableDataError::InvalidErasureRoot` on mismatch.
- If there is no `CandidateMeta` under the candidate hash, create it with `State::Unavailable(now)`. Load the `CandidateMeta` otherwise.
- Compute the erasure root of the available data and compare it with `expected_erasure_root`. Return
`StoreAvailableDataError::InvalidErasureRoot` on mismatch.
- If there is no `CandidateMeta` under the candidate hash, create it with `State::Unavailable(now)`. Load the
`CandidateMeta` otherwise.
- Store `data` under `("available", candidate_hash)` and set `data_available` to true.
- Store each chunk under `("chunk", candidate_hash, index)` and set every bit in `chunks_stored` to `1`.
@@ -172,12 +199,13 @@ Every 5 minutes, run a pruning routine:
- For each erasure chunk bit set, remove `("chunk", candidate_hash, bit_index)`.
- If `data_available`, remove `("available", candidate_hash)`
This is O(n * m) in the amount of candidates and average size of the data stored. This is probably the most expensive operation but does not need
to be run very often.
This is `O(n * m)` in the number of candidates and the average size of the data stored. This is probably the most
expensive operation but does not need to be run very often.
## Basic scenarios to test
Basically we need to test the correctness of data flow through state FSMs described earlier. These tests obviously assume that some mocking of time is happening.
Basically we need to test the correctness of data flow through state FSMs described earlier. These tests obviously
assume that some mocking of time is happening.
- Stored data that is never included is pruned after the necessary timeout
- A block (and/or a chunk) is added to the store.
@@ -2,7 +2,8 @@
This subsystem is responsible for handling candidate validation requests. It is a simple request/response server.
A variety of subsystems want to know if a parachain block candidate is valid. None of them care about the detailed mechanics of how a candidate gets validated, just the results. This subsystem handles those details.
A variety of subsystems want to know if a parachain block candidate is valid. None of them care about the detailed
mechanics of how a candidate gets validated, just the results. This subsystem handles those details.
## Protocol
@@ -12,35 +13,53 @@ Output: Validation result via the provided response side-channel.
## Functionality
This subsystem groups the requests it handles in two categories: *candidate validation* and *PVF pre-checking*.
This subsystem groups the requests it handles in two categories: *candidate validation* and *PVF pre-checking*.
The first category can be further subdivided in two request types: one which draws out validation data from the state, and another which accepts all validation data exhaustively. Validation returns three possible outcomes on the response channel: the candidate is valid, the candidate is invalid, or an internal error occurred.
The first category can be further subdivided in two request types: one which draws out validation data from the state,
and another which accepts all validation data exhaustively. Validation returns three possible outcomes on the response
channel: the candidate is valid, the candidate is invalid, or an internal error occurred.
Parachain candidates are validated against their validation function: A piece of Wasm code that describes the state-transition of the parachain. Validation function execution is not metered. This means that an execution which is an infinite loop or simply takes too long must be forcibly exited by some other means. For this reason, we recommend dispatching candidate validation to be done on subprocesses which can be killed if they time-out.
Parachain candidates are validated against their validation function: a piece of Wasm code that describes the
state-transition of the parachain. Validation function execution is not metered. This means that an execution which is
an infinite loop or simply takes too long must be forcibly exited by some other means. For this reason, we recommend
dispatching candidate validation to subprocesses which can be killed if they time out.
Upon receiving a validation request, the first thing the candidate validation subsystem should do is make sure it has all the necessary parameters to the validation function. These are:
Upon receiving a validation request, the first thing the candidate validation subsystem should do is make sure it has
all the necessary parameters to the validation function. These are:
* The Validation Function itself.
* The [`CandidateDescriptor`](../../types/candidate.md#candidatedescriptor).
* The [`ValidationData`](../../types/candidate.md#validationdata).
* The [`PoV`](../../types/availability.md#proofofvalidity).
The second category is for PVF pre-checking. This is primarily used by the [PVF pre-checker](pvf-prechecker.md) subsystem.
The second category is for PVF pre-checking. This is primarily used by the [PVF pre-checker](pvf-prechecker.md)
subsystem.
### Determining Parameters
For a [`CandidateValidationMessage`][CVM]`::ValidateFromExhaustive`, these parameters are exhaustively provided.
For a [`CandidateValidationMessage`][CVM]`::ValidateFromChainState`, some more work needs to be done. Due to the uncertainty of Availability Cores (implemented in the [`Scheduler`](../../runtime/scheduler.md) module of the runtime), a candidate at a particular relay-parent and for a particular para may have two different valid validation-data to be executed under depending on what is assumed to happen if the para is occupying a core at the onset of the new block. This is encoded as an `OccupiedCoreAssumption` in the runtime API.
For a [`CandidateValidationMessage`][CVM]`::ValidateFromChainState`, some more work needs to be done. Due to the
uncertainty of Availability Cores (implemented in the [`Scheduler`](../../runtime/scheduler.md) module of the runtime),
a candidate at a particular relay-parent and for a particular para may have two different valid validation-data to be
executed under depending on what is assumed to happen if the para is occupying a core at the onset of the new block.
This is encoded as an `OccupiedCoreAssumption` in the runtime API.
The way that we can determine which assumption the candidate is meant to be executed under is simply to do an exhaustive check of both possibilities based on the state of the relay-parent. First we fetch the validation data under the assumption that the block occupying becomes available. If the `validation_data_hash` of the `CandidateDescriptor` matches this validation data, we use that. Otherwise, if the `validation_data_hash` matches the validation data fetched under the `TimedOut` assumption, we use that. Otherwise, we return a `ValidationResult::Invalid` response and conclude.
The way that we can determine which assumption the candidate is meant to be executed under is simply to do an exhaustive
check of both possibilities based on the state of the relay-parent. First we fetch the validation data under the
assumption that the block occupying the core becomes available. If the `validation_data_hash` of the `CandidateDescriptor`
matches this validation data, we use that. Otherwise, if the `validation_data_hash` matches the validation data fetched
under the `TimedOut` assumption, we use that. Otherwise, we return a `ValidationResult::Invalid` response and conclude.
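The exhaustive check of both assumptions can be sketched as follows. This is an illustration under stated assumptions: the `Included` variant name, the `u64` hash, and the `fetch_hash` closure (standing in for the runtime API call plus hashing of the returned validation data) are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OccupiedCoreAssumption {
    /// The block occupying the core becomes available (variant name assumed).
    Included,
    /// The block occupying the core times out.
    TimedOut,
}

/// Returns the first assumption whose validation data hashes to `expected_hash`.
/// `fetch_hash` yields `None` when no validation data exists under an assumption.
pub fn find_assumption<F>(expected_hash: u64, mut fetch_hash: F) -> Option<OccupiedCoreAssumption>
where
    F: FnMut(OccupiedCoreAssumption) -> Option<u64>,
{
    // Order matters: the "becomes available" assumption is tried first.
    for assumption in [OccupiedCoreAssumption::Included, OccupiedCoreAssumption::TimedOut] {
        if fetch_hash(assumption) == Some(expected_hash) {
            return Some(assumption);
        }
    }
    // Neither assumption matches: the caller reports the candidate as invalid.
    None
}
```

A `None` result corresponds to the `ValidationResult::Invalid` response described above.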
Then, we can fetch the validation code from the runtime based on which type of candidate this is. This gives us all the parameters. The descriptor and PoV come from the request itself, and the other parameters have been derived from the state.
Then, we can fetch the validation code from the runtime based on which type of candidate this is. This gives us all the
parameters. The descriptor and PoV come from the request itself, and the other parameters have been derived from the
state.
> TODO: This would be a great place for caching to avoid making lots of runtime requests. That would need a job, though.
### Execution of the Parachain Wasm
Once we have all parameters, we can spin up a background task to perform the validation in a way that doesn't hold up the entire event loop. Before invoking the validation function itself, this should first do some basic checks:
Once we have all parameters, we can spin up a background task to perform the validation in a way that doesn't hold up
the entire event loop. Before invoking the validation function itself, this should first do some basic checks:
* The collator signature is valid
* The PoV provided matches the `pov_hash` field of the descriptor
@@ -48,6 +67,8 @@ For more details please see [PVF Host and Workers](pvf-host-and-workers.md).
### Checking Validation Outputs
If we can assume the presence of the relay-chain state (that is, during processing [`CandidateValidationMessage`][CVM]`::ValidateFromChainState`) we can run all the checks that the relay-chain would run at the inclusion time thus confirming that the candidate will be accepted.
If we can assume the presence of the relay-chain state (that is, while processing
[`CandidateValidationMessage`][CVM]`::ValidateFromChainState`) we can run all the checks that the relay-chain would run
at inclusion time, thus confirming that the candidate will be accepted.
[CVM]: ../../types/overseer-protocol.md#validationrequesttype
@@ -1,6 +1,7 @@
# Chain API
The Chain API subsystem is responsible for providing a single point of access to chain state data via a set of pre-determined queries.
The Chain API subsystem is responsible for providing a single point of access to chain state data via a set of
pre-determined queries.
## Protocol
@@ -10,7 +11,8 @@ Output: None
## Functionality
On receipt of `ChainApiMessage`, answer the request and provide the response to the side-channel embedded within the request.
On receipt of `ChainApiMessage`, answer the request and provide the response to the side-channel embedded within the
request.
Currently, the following requests are supported:
* Block hash to number
@@ -1,8 +1,12 @@
# Chain Selection Subsystem
This subsystem implements the necessary metadata for the implementation of the [chain selection](../../protocol-chain-selection.md) portion of the protocol.
This subsystem implements the necessary metadata for the implementation of the [chain
selection](../../protocol-chain-selection.md) portion of the protocol.
The subsystem wraps a database component which maintains a view of the unfinalized chain and records the properties of each block: whether the block is **viable**, whether it is **stagnant**, and whether it is **reverted**. It should also maintain an updated set of active leaves in accordance with this view, which should be cheap to query. Leaves are ordered descending first by weight and then by block number.
The subsystem wraps a database component which maintains a view of the unfinalized chain and records the properties of
each block: whether the block is **viable**, whether it is **stagnant**, and whether it is **reverted**. It should also
maintain an updated set of active leaves in accordance with this view, which should be cheap to query. Leaves are
ordered descending first by weight and then by block number.
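The leaf ordering can be expressed as a sort key. A minimal sketch, assuming a hypothetical `Leaf` record with a numeric weight, a block number, and a shortened hash:

```rust
use std::cmp::Reverse;

/// Hypothetical leaf record; the real entry carries more data.
#[derive(Debug, Clone, PartialEq)]
pub struct Leaf {
    pub weight: u64,
    pub number: u32,
    pub hash: u64,
}

/// Orders leaves descending by weight, breaking ties by descending block number.
pub fn sort_leaves(leaves: &mut [Leaf]) {
    leaves.sort_by_key(|leaf| (Reverse(leaf.weight), Reverse(leaf.number)));
}
```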
This subsystem needs to update its information on the unfinalized chain:
* On every leaf-activated signal
@@ -11,32 +15,47 @@ This subsystem needs to update its information on the unfinalized chain:
* On every `ChainSelectionMessage::RevertBlocks`
* Periodically, to detect stagnation.
Simple implementations of these updates do `O(n_unfinalized_blocks)` disk operations. If the amount of unfinalized blocks is relatively small, the updates should not take very much time. However, in cases where there are hundreds or thousands of unfinalized blocks the naive implementations of these update algorithms would have to be replaced with more sophisticated versions.
Simple implementations of these updates do `O(n_unfinalized_blocks)` disk operations. If the number of unfinalized
blocks is relatively small, the updates should not take very much time. However, in cases where there are hundreds or
thousands of unfinalized blocks the naive implementations of these update algorithms would have to be replaced with more
sophisticated versions.
### `OverseerSignal::ActiveLeavesUpdate`
## `OverseerSignal::ActiveLeavesUpdate`
Determine all new blocks implicitly referenced by any new active leaves and add them to the view. Update the set of viable leaves accordingly. The weights of imported blocks can be determined by the [`ChainApiMessage::BlockWeight`](../../types/overseer-protocol.md#chain-api-message).
Determine all new blocks implicitly referenced by any new active leaves and add them to the view. Update the set of
viable leaves accordingly. The weights of imported blocks can be determined by the
[`ChainApiMessage::BlockWeight`](../../types/overseer-protocol.md#chain-api-message).
### `OverseerSignal::BlockFinalized`
## `OverseerSignal::BlockFinalized`
Delete data for all orphaned chains and update all metadata descending from the new finalized block accordingly, along with the set of viable leaves. Note that finalizing a **reverted** or **stagnant** block means that the descendants of those blocks may lose that status because the definitions of those properties don't include the finalized chain. Update the set of viable leaves accordingly.
Delete data for all orphaned chains and update all metadata descending from the new finalized block accordingly, along
with the set of viable leaves. Note that finalizing a **reverted** or **stagnant** block means that the descendants of
those blocks may lose that status because the definitions of those properties don't include the finalized chain. Update
the set of viable leaves accordingly.
### `ChainSelectionMessage::Approved`
## `ChainSelectionMessage::Approved`
Update the approval status of the referenced block. If the block was stagnant and thus non-viable and is now viable, then the metadata of all of its descendants needs to be updated as well, as they may no longer be stagnant either. Update the set of viable leaves accordingly.
Update the approval status of the referenced block. If the block was stagnant and thus non-viable and is now viable,
then the metadata of all of its descendants needs to be updated as well, as they may no longer be stagnant either.
Update the set of viable leaves accordingly.
### `ChainSelectionMessage::Leaves`
## `ChainSelectionMessage::Leaves`
Gets all leaves of the chain, i.e. block hashes that are suitable to build upon and have no suitable children. Supplies the leaves in descending order by score.
Gets all leaves of the chain, i.e. block hashes that are suitable to build upon and have no suitable children. Supplies
the leaves in descending order by score.
### `ChainSelectionMessage::BestLeafContaining`
## `ChainSelectionMessage::BestLeafContaining`
If the required block is unknown or not viable, then return `None`. Iterate over all leaves in order of descending weight, returning the first leaf containing the required block in its chain, and `None` otherwise.
If the required block is unknown or not viable, then return `None`. Iterate over all leaves in order of descending
weight, returning the first leaf containing the required block in its chain, and `None` otherwise.
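
The `BestLeafContaining` lookup can be sketched as follows. This is a minimal illustration rather than the subsystem's actual code: `Hash`, `Weight`, `BlockEntry`, `chain_contains` and `best_leaf_containing` are hypothetical names, blocks are kept in an in-memory map instead of on disk, and the leaves are assumed to arrive already sorted by descending weight.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the real hash and weight types.
type Hash = u64;
type Weight = u64;

/// Illustrative block entry: a parent pointer plus a viability flag.
struct BlockEntry {
    parent: Option<Hash>,
    viable: bool,
}

/// Returns true if `ancestor` lies on the chain ending at `leaf`.
fn chain_contains(blocks: &HashMap<Hash, BlockEntry>, leaf: Hash, ancestor: Hash) -> bool {
    let mut current = Some(leaf);
    while let Some(hash) = current {
        if hash == ancestor {
            return true;
        }
        // Step to the parent; unknown blocks end the walk.
        current = blocks.get(&hash).and_then(|entry| entry.parent);
    }
    false
}

/// `leaves` is assumed to be sorted by descending weight already.
fn best_leaf_containing(
    blocks: &HashMap<Hash, BlockEntry>,
    leaves: &[(Hash, Weight)],
    required: Hash,
) -> Option<Hash> {
    // Unknown or non-viable required blocks yield `None` immediately.
    if !blocks.get(&required)?.viable {
        return None;
    }
    // The first leaf whose chain contains the required block wins.
    leaves
        .iter()
        .map(|(hash, _)| *hash)
        .find(|leaf| chain_contains(blocks, *leaf, required))
}
```

Walking parent pointers from each leaf costs one lookup per ancestor, which matches the `O(n_unfinalized_blocks)` behaviour of the simple implementations described earlier.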
### `ChainSelectionMessage::RevertBlocks`
This message indicates that a dispute has concluded against a parachain block candidate. The message passes along a vector containing the block number and block hash of each block where the disputed candidate was included. The passed blocks will be marked as reverted, and their descendants will be marked as non-viable.
## `ChainSelectionMessage::RevertBlocks`
This message indicates that a dispute has concluded against a parachain block candidate. The message passes along a
vector containing the block number and block hash of each block where the disputed candidate was included. The passed
blocks will be marked as reverted, and their descendants will be marked as non-viable.
### Periodically
## Periodically
Detect stagnant blocks and apply the stagnant definition to all descendants. Update the set of viable leaves accordingly.
Detect stagnant blocks and apply the stagnant definition to all descendants. Update the set of viable leaves
accordingly.
@@ -1,30 +1,43 @@
# Network Bridge
One of the main features of the overseer/subsystem duality is to avoid shared ownership of resources and to communicate via message-passing. However, implementing each networking subsystem as its own network protocol brings a fair share of challenges.
One of the main features of the overseer/subsystem duality is to avoid shared ownership of resources and to communicate
via message-passing. However, implementing each networking subsystem as its own network protocol brings a fair share of
challenges.
The most notable challenge is coordinating and eliminating race conditions of peer connection and disconnection events. If we have many network protocols that peers are supposed to be connected on, it is difficult to enforce that a peer is indeed connected on all of them or the order in which those protocols receive notifications that peers have connected. This becomes especially difficult when attempting to share peer state across protocols. All of the Parachain-Host's gossip protocols eliminate DoS with a data-dependency on current chain heads. However, it is inefficient and confusing to implement the logic for tracking our current chain heads as well as our peers' on each of those subsystems. Having one subsystem for tracking this shared state and distributing it to the others is an improvement in architecture and efficiency.
The most notable challenge is coordinating and eliminating race conditions of peer connection and disconnection events.
If we have many network protocols that peers are supposed to be connected on, it is difficult to enforce that a peer is
indeed connected on all of them or the order in which those protocols receive notifications that peers have connected.
This becomes especially difficult when attempting to share peer state across protocols. All of the Parachain-Host's
gossip protocols eliminate DoS with a data-dependency on current chain heads. However, it is inefficient and confusing
to implement the logic for tracking our current chain heads as well as our peers' on each of those subsystems. Having
one subsystem for tracking this shared state and distributing it to the others is an improvement in architecture and
efficiency.
One other piece of shared state to track is peer reputation. When peers are found to have provided value or cost, we adjust their reputation accordingly.
One other piece of shared state to track is peer reputation. When peers are found to have provided value or cost, we
adjust their reputation accordingly.
So in short, this Subsystem acts as a bridge between an actual network component and a subsystem's protocol. The implementation of the underlying network component is beyond the scope of this module. We make certain assumptions about the network component:
* The network allows registering of protocols and multiple versions of each protocol.
* The network handles version negotiation of protocols with peers and only connects the peer on the highest version of the protocol.
* Each protocol has its own peer-set, although there may be some overlap.
* The network provides peer-set management utilities for discovering the peer-IDs of validators and a means of dialing peers with given IDs.
So in short, this Subsystem acts as a bridge between an actual network component and a subsystem's protocol. The
implementation of the underlying network component is beyond the scope of this module. We make certain assumptions about
the network component:
- The network allows registering of protocols and multiple versions of each protocol.
- The network handles version negotiation of protocols with peers and only connects the peer on the highest version of
the protocol.
- Each protocol has its own peer-set, although there may be some overlap.
- The network provides peer-set management utilities for discovering the peer-IDs of validators and a means of dialing
peers with given IDs.
The network bridge makes use of the peer-set feature, but is not generic over peer-set. Instead, it exposes two peer-sets that event producers can attach to: `Validation` and `Collation`. More information can be found on the documentation of the [`NetworkBridgeMessage`][NBM].
The network bridge makes use of the peer-set feature, but is not generic over peer-set. Instead, it exposes two
peer-sets that event producers can attach to: `Validation` and `Collation`. More information can be found on the
documentation of the [`NetworkBridgeMessage`][NBM].
## Protocol
Input: [`NetworkBridgeMessage`][NBM]
Output:
- [`ApprovalDistributionMessage`][AppD]`::NetworkBridgeUpdate`
- [`BitfieldDistributionMessage`][BitD]`::NetworkBridgeUpdate`
- [`CollatorProtocolMessage`][CollP]`::NetworkBridgeUpdate`
- [`StatementDistributionMessage`][StmtD]`::NetworkBridgeUpdate`
Output:
- [`ApprovalDistributionMessage`][AppD]`::NetworkBridgeUpdate`
- [`BitfieldDistributionMessage`][BitD]`::NetworkBridgeUpdate`
- [`CollatorProtocolMessage`][CollP]`::NetworkBridgeUpdate`
- [`StatementDistributionMessage`][StmtD]`::NetworkBridgeUpdate`
## Functionality
@@ -37,7 +50,8 @@ enum WireMessage<M> {
}
```
and instantiates this type twice, once using the [`ValidationProtocolV1`][VP1] message type, and once with the [`CollationProtocolV1`][CP1] message type.
and instantiates this type twice, once using the [`ValidationProtocolV1`][VP1] message type, and once with the
[`CollationProtocolV1`][CP1] message type.
```rust
type ValidationV1Message = WireMessage<ValidationProtocolV1>;
@@ -46,17 +60,21 @@ type CollationV1Message = WireMessage<CollationProtocolV1>;
### Startup
On startup, we register two protocols with the underlying network utility. One for validation and one for collation. We register only version 1 of each of these protocols.
On startup, we register two protocols with the underlying network utility. One for validation and one for collation. We
register only version 1 of each of these protocols.
### Main Loop
The bulk of the work done by this subsystem is in responding to network events, signals from the overseer, and messages from other subsystems.
The bulk of the work done by this subsystem is in responding to network events, signals from the overseer, and messages
from other subsystems.
Each network event is associated with a particular peer-set.
### Overseer Signal: `ActiveLeavesUpdate`
The `activated` and `deactivated` lists determine the evolution of our local view over time. A `ProtocolMessage::ViewUpdate` is issued to each connected peer on each peer-set, and a `NetworkBridgeEvent::OurViewChange` is issued to each event handler for each protocol.
The `activated` and `deactivated` lists determine the evolution of our local view over time. A
`ProtocolMessage::ViewUpdate` is issued to each connected peer on each peer-set, and a
`NetworkBridgeEvent::OurViewChange` is issued to each event handler for each protocol.
We only send view updates if the node has indicated that it has finished major blockchain synchronization.
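
As a rough sketch of how the `activated` and `deactivated` lists could drive the local view, consider the following. The `View` struct, the `Hash` alias and `on_active_leaves_update` are simplified, hypothetical stand-ins, and the function only returns the view that would be announced instead of actually sending `ProtocolMessage::ViewUpdate` or `NetworkBridgeEvent::OurViewChange`.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a relay chain block hash.
type Hash = u64;

/// Simplified local view: the set of active heads plus the finalized number.
#[derive(Clone, Debug, Default, PartialEq)]
struct View {
    heads: BTreeSet<Hash>,
    finalized_number: u32,
}

/// Applies an `ActiveLeavesUpdate` to the local view. Returns the view to
/// announce, or `None` while major sync is still in progress.
fn on_active_leaves_update(
    view: &mut View,
    activated: &[Hash],
    deactivated: &[Hash],
    major_sync_finished: bool,
) -> Option<View> {
    // Drop heads that are no longer active, then add the new ones.
    for hash in deactivated {
        view.heads.remove(hash);
    }
    for hash in activated {
        view.heads.insert(*hash);
    }
    // View updates are only sent once major sync has finished.
    if major_sync_finished {
        Some(view.clone())
    } else {
        None
    }
}
```

The local view is still updated while syncing, so the first announcement after sync completes reflects the current set of heads.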
@@ -64,24 +82,31 @@ If we are connected to the same peer on both peer-sets, we will send the peer tw
### Overseer Signal: `BlockFinalized`
We update our view's `finalized_number` to the provided one and delay `ProtocolMessage::ViewUpdate` and `NetworkBridgeEvent::OurViewChange` till the next `ActiveLeavesUpdate`.
We update our view's `finalized_number` to the provided one and delay `ProtocolMessage::ViewUpdate` and
`NetworkBridgeEvent::OurViewChange` till the next `ActiveLeavesUpdate`.
### Network Event: `PeerConnected`
Issue a `NetworkBridgeEvent::PeerConnected` for each [Event Handler](#event-handlers) of the peer-set and negotiated protocol version of the peer. Also issue a `NetworkBridgeEvent::PeerViewChange` and send the peer our current view, but only if the node has indicated that it has finished major blockchain synchronization. Otherwise, we only send the peer an empty view.
Issue a `NetworkBridgeEvent::PeerConnected` for each [Event Handler](#event-handlers) of the peer-set and negotiated
protocol version of the peer. Also issue a `NetworkBridgeEvent::PeerViewChange` and send the peer our current view, but
only if the node has indicated that it has finished major blockchain synchronization. Otherwise, we only send the peer
an empty view.
### Network Event: `PeerDisconnected`
Issue a `NetworkBridgeEvent::PeerDisconnected` for each [Event Handler](#event-handlers) of the peer-set and negotiated protocol version of the peer.
Issue a `NetworkBridgeEvent::PeerDisconnected` for each [Event Handler](#event-handlers) of the peer-set and negotiated
protocol version of the peer.
### Network Event: `ProtocolMessage`
Map the message onto the corresponding [Event Handler](#event-handlers) based on the peer-set this message was received on and dispatch via overseer.
Map the message onto the corresponding [Event Handler](#event-handlers) based on the peer-set this message was received
on and dispatch via overseer.
### Network Event: `ViewUpdate`
- Check that the new view is valid and note it as the most recent view update of the peer on this peer-set.
- Map a `NetworkBridgeEvent::PeerViewChange` onto the corresponding [Event Handler](#event-handlers) based on the peer-set this message was received on and dispatch via overseer.
- Map a `NetworkBridgeEvent::PeerViewChange` onto the corresponding [Event Handler](#event-handlers) based on the
peer-set this message was received on and dispatch via overseer.
### `ReportPeer`
@@ -108,22 +133,23 @@ Map the message onto the corresponding [Event Handler](#event-handlers) based on
### `NewGossipTopology`
- Map all `AuthorityDiscoveryId`s to `PeerId`s and issue a corresponding `NetworkBridgeUpdate`
to all validation subsystems.
- Map all `AuthorityDiscoveryId`s to `PeerId`s and issue a corresponding `NetworkBridgeUpdate` to all validation
subsystems.
## Event Handlers
Network bridge event handlers are the intended recipients of particular network protocol messages. These are each a variant of a message to be sent via the overseer.
Network bridge event handlers are the intended recipients of particular network protocol messages. These are each a
variant of a message to be sent via the overseer.
### Validation V1
* `ApprovalDistributionV1Message -> ApprovalDistributionMessage::NetworkBridgeUpdate`
* `BitfieldDistributionV1Message -> BitfieldDistributionMessage::NetworkBridgeUpdate`
* `StatementDistributionV1Message -> StatementDistributionMessage::NetworkBridgeUpdate`
- `ApprovalDistributionV1Message -> ApprovalDistributionMessage::NetworkBridgeUpdate`
- `BitfieldDistributionV1Message -> BitfieldDistributionMessage::NetworkBridgeUpdate`
- `StatementDistributionV1Message -> StatementDistributionMessage::NetworkBridgeUpdate`
### Collation V1
* `CollatorProtocolV1Message -> CollatorProtocolMessage::NetworkBridgeUpdate`
- `CollatorProtocolV1Message -> CollatorProtocolMessage::NetworkBridgeUpdate`
[NBM]: ../../types/overseer-protocol.md#network-bridge-message
[AppD]: ../../types/overseer-protocol.md#approval-distribution-message
@@ -1,28 +1,43 @@
# Provisioner
Relay chain block authorship authority is governed by BABE and is beyond the scope of the Overseer and the rest of the subsystems. That said, ultimately the block author needs to select a set of backable parachain candidates and other consensus data, and assemble a block from them. This subsystem is responsible for providing the necessary data to all potential block authors.
Relay chain block authorship authority is governed by BABE and is beyond the scope of the Overseer and the rest of the
subsystems. That said, ultimately the block author needs to select a set of backable parachain candidates and other
consensus data, and assemble a block from them. This subsystem is responsible for providing the necessary data to all
potential block authors.
## Provisionable Data
There are several distinct types of provisionable data, but they share this property in common: all should eventually be included in a relay chain block.
There are several distinct types of provisionable data, but they share this property in common: all should eventually be
included in a relay chain block.
### Backed Candidates
The block author can choose 0 or 1 backed parachain candidates per parachain; the only constraint is that each backable candidate has the appropriate relay parent. However, the choice of a backed candidate must be the block author's. The provisioner subsystem is how those block authors make this choice in practice.
The block author can choose 0 or 1 backed parachain candidates per parachain; the only constraint is that each backable
candidate has the appropriate relay parent. However, the choice of a backed candidate must be the block author's. The
provisioner subsystem is how those block authors make this choice in practice.
### Signed Bitfields
[Signed bitfields](../../types/availability.md#signed-availability-bitfield) are attestations from a particular validator about which candidates it believes are available. Those will only be provided on fresh leaves.
[Signed bitfields](../../types/availability.md#signed-availability-bitfield) are attestations from a particular
validator about which candidates it believes are available. Those will only be provided on fresh leaves.
### Misbehavior Reports
Misbehavior reports are self-contained proofs of misbehavior by a validator or group of validators. For example, it is very easy to verify a double-voting misbehavior report: the report contains two votes signed by the same key, advocating different outcomes. Concretely, misbehavior reports become inherents which cause dots to be slashed.
Misbehavior reports are self-contained proofs of misbehavior by a validator or group of validators. For example, it is
very easy to verify a double-voting misbehavior report: the report contains two votes signed by the same key, advocating
different outcomes. Concretely, misbehavior reports become inherents which cause dots to be slashed.
Note that there is no mechanism in place which forces a block author to include a misbehavior report which it doesn't like, for example if it would be slashed by such a report. The chain's defense against this is to have a relatively long slash period, such that it's likely to encounter an honest author before the slash period expires.
Note that there is no mechanism in place which forces a block author to include a misbehavior report which it doesn't
like, for example if it would be slashed by such a report. The chain's defense against this is to have a relatively long
slash period, such that it's likely to encounter an honest author before the slash period expires.
### Dispute Inherent
The dispute inherent is similar to a misbehavior report in that it is an attestation of misbehavior on the part of a validator or group of validators. Unlike a misbehavior report, it is not self-contained: resolution requires coordinated action by several validators. The canonical example of a dispute inherent involves an approval checker discovering that a set of validators has improperly approved an invalid parachain block: resolving this requires the entire validator set to re-validate the block, so that the minority can be slashed.
The dispute inherent is similar to a misbehavior report in that it is an attestation of misbehavior on the part of a
validator or group of validators. Unlike a misbehavior report, it is not self-contained: resolution requires coordinated
action by several validators. The canonical example of a dispute inherent involves an approval checker discovering that
a set of validators has improperly approved an invalid parachain block: resolving this requires the entire validator set
to re-validate the block, so that the minority can be slashed.
Dispute resolution is complex and is explained in substantially more detail [here](../../runtime/disputes.md).
@@ -34,58 +49,85 @@ The subsystem should maintain a set of handles to Block Authorship Provisioning
- `ActiveLeavesUpdate`:
- For each `activated` head:
- spawn a Block Authorship Provisioning iteration with the given relay parent, storing a bidirectional channel with that iteration.
- spawn a Block Authorship Provisioning iteration with the given relay parent, storing a bidirectional channel with
that iteration.
- For each `deactivated` head:
- terminate the Block Authorship Provisioning iteration for the given relay parent, if any.
- `Conclude`: Forward `Conclude` to all iterations, waiting a small amount of time for them to join, and then hard-exiting.
- `Conclude`: Forward `Conclude` to all iterations, waiting a small amount of time for them to join, and then
hard-exiting.
### On `ProvisionerMessage`
Forward the message to the appropriate Block Authorship Provisioning iteration, or discard if no appropriate iteration is currently active.
Forward the message to the appropriate Block Authorship Provisioning iteration, or discard if no appropriate iteration
is currently active.
### Per Provisioning Iteration
Input: [`ProvisionerMessage`](../../types/overseer-protocol.md#provisioner-message). Backed candidates come from the [Candidate Backing subsystem](../backing/candidate-backing.md), signed bitfields come from the [Bitfield Distribution subsystem](../availability/bitfield-distribution.md), and disputes come from the [Disputes Subsystem](../disputes/dispute-coordinator.md). Misbehavior reports are currently sent from the [Candidate Backing subsystem](../backing/candidate-backing.md) and contain the following misbehaviors:
Input: [`ProvisionerMessage`](../../types/overseer-protocol.md#provisioner-message). Backed candidates come from the
[Candidate Backing subsystem](../backing/candidate-backing.md), signed bitfields come from the [Bitfield Distribution
subsystem](../availability/bitfield-distribution.md), and disputes come from the [Disputes
Subsystem](../disputes/dispute-coordinator.md). Misbehavior reports are currently sent from the [Candidate Backing
subsystem](../backing/candidate-backing.md) and contain the following misbehaviors:
1. `Misbehavior::ValidityDoubleVote`
2. `Misbehavior::MultipleCandidates`
3. `Misbehavior::UnauthorizedStatement`
4. `Misbehavior::DoubleSign`
But we choose not to punish these forms of misbehavior for the time being. Risks from misbehavior are sufficiently mitigated at the protocol level via reputation changes. Punitive actions here may become desirable enough to dedicate time to in the future.
But we choose not to punish these forms of misbehavior for the time being. Risks from misbehavior are sufficiently
mitigated at the protocol level via reputation changes. Punitive actions here may become desirable enough to dedicate
time to in the future.
At initialization, this subsystem has no outputs.
Block authors request the inherent data they should use for constructing the inherent in the block which contains parachain execution information.
Block authors request the inherent data they should use for constructing the inherent in the block which contains
parachain execution information.
## Block Production
When a validator is selected by BABE to author a block, it becomes a block producer. The provisioner is the subsystem best suited to choosing which specific backed candidates and availability bitfields should be assembled into the block. To engage this functionality, a `ProvisionerMessage::RequestInherentData` is sent; the response is a [`ParaInherentData`](../../types/runtime.md#parainherentdata). Each relay chain block backs at most one backable parachain block candidate per parachain. Additionally no further block candidate can be backed until the previous one either gets declared available or expired. If bitfields indicate that candidate A, predecessor of B, should be declared available, then B can be backed in the same relay block. Appropriate bitfields, as outlined in the section on [bitfield selection](#bitfield-selection), and any dispute statements should be attached as well.
When a validator is selected by BABE to author a block, it becomes a block producer. The provisioner is the subsystem
best suited to choosing which specific backed candidates and availability bitfields should be assembled into the block.
To engage this functionality, a `ProvisionerMessage::RequestInherentData` is sent; the response is a
[`ParaInherentData`](../../types/runtime.md#parainherentdata). Each relay chain block backs at most one backable
parachain block candidate per parachain. Additionally no further block candidate can be backed until the previous one
either gets declared available or expired. If bitfields indicate that candidate A, predecessor of B, should be declared
available, then B can be backed in the same relay block. Appropriate bitfields, as outlined in the section on [bitfield
selection](#bitfield-selection), and any dispute statements should be attached as well.
### Bitfield Selection
Our goal with respect to bitfields is simple: maximize availability. However, it's not quite as simple as always including all bitfields; there are constraints which still need to be met:
Our goal with respect to bitfields is simple: maximize availability. However, it's not quite as simple as always
including all bitfields; there are constraints which still need to be met:
- not more than one bitfield per validator
- each 1 bit must correspond to an occupied core
Beyond that, a semi-arbitrary selection policy is fine. In order to meet the goal of maximizing availability, a heuristic of picking the bitfield with the greatest number of 1 bits set in the event of conflict is useful.
Beyond that, a semi-arbitrary selection policy is fine. In order to meet the goal of maximizing availability, a
heuristic of picking the bitfield with the greatest number of 1 bits set in the event of conflict is useful.
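
The constraints and the tie-breaking heuristic above might look like this in code. It is a sketch under stated assumptions: `Bitfield` is a hypothetical, signature-free simplification of `SignedAvailabilityBitfield`, `count_ones` and `select_bitfields` are illustrative names, and requiring the bitfield length to equal the number of cores is an added assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical simplified bitfield; the real type also carries a signature.
#[derive(Clone, Debug, PartialEq)]
struct Bitfield {
    validator_index: u32,
    bits: Vec<bool>,
}

/// Number of 1 bits set in a bitfield.
fn count_ones(bitfield: &Bitfield) -> usize {
    bitfield.bits.iter().filter(|bit| **bit).count()
}

/// Picks at most one bitfield per validator, preferring the one with the
/// most 1 bits, and drops bitfields that set a bit for an unoccupied core.
fn select_bitfields(candidates: &[Bitfield], occupied: &[bool]) -> Vec<Bitfield> {
    let mut chosen: BTreeMap<u32, &Bitfield> = BTreeMap::new();
    for bitfield in candidates {
        // Assumption: a bitfield must have exactly one bit per core.
        let valid = bitfield.bits.len() == occupied.len()
            && bitfield
                .bits
                .iter()
                .zip(occupied)
                .all(|(bit, core_occupied)| !*bit || *core_occupied);
        if !valid {
            continue;
        }
        // On conflict, keep whichever bitfield has more 1 bits.
        let replace = match chosen.get(&bitfield.validator_index) {
            Some(existing) => count_ones(bitfield) > count_ones(existing),
            None => true,
        };
        if replace {
            chosen.insert(bitfield.validator_index, bitfield);
        }
    }
    // `BTreeMap` iteration yields the result ordered by validator index.
    chosen.into_values().cloned().collect()
}
```

Ties keep the first bitfield seen, which is one acceptable instance of the semi-arbitrary policy.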
### Dispute Statement Selection
This is the point at which the block author provides further votes to active disputes or initiates new disputes in the runtime state.
This is the point at which the block author provides further votes to active disputes or initiates new disputes in the
runtime state.
The block-authoring logic of the runtime has an extra step between handling the inherent-data and producing the actual inherent call, which we assume performs the work of filtering out disputes which are not relevant to the on-chain state. Backing votes are always kept in the dispute statement set. This ensures we punish the maximum number of misbehaving backers.
The block-authoring logic of the runtime has an extra step between handling the inherent-data and producing the actual
inherent call, which we assume performs the work of filtering out disputes which are not relevant to the on-chain state.
Backing votes are always kept in the dispute statement set. This ensures we punish the maximum number of misbehaving
backers.
To select disputes:
- Issue a `DisputeCoordinatorMessage::RecentDisputes` message and wait for the response. This is a set of all disputes in recent sessions which we are aware of.
- Issue a `DisputeCoordinatorMessage::RecentDisputes` message and wait for the response. This is a set of all disputes
in recent sessions which we are aware of.
### Determining Bitfield Availability
An occupied core has a `CoreAvailability` bitfield. We also have a list of `SignedAvailabilityBitfield`s. We need to determine from these whether or not a core at a particular index has become available.
An occupied core has a `CoreAvailability` bitfield. We also have a list of `SignedAvailabilityBitfield`s. We need to
determine from these whether or not a core at a particular index has become available.
The key insight required is that `CoreAvailability` is transverse to the `SignedAvailabilityBitfield`s: if we conceptualize the list of bitfields as many rows, each bit of which is its own column, then `CoreAvailability` for a given core index is the vertical slice of bits in the set at that index.
The key insight required is that `CoreAvailability` is transverse to the `SignedAvailabilityBitfield`s: if we
conceptualize the list of bitfields as many rows, each bit of which is its own column, then `CoreAvailability` for a
given core index is the vertical slice of bits in the set at that index.
To compute bitfield availability, then:
@@ -97,16 +139,22 @@ To compute bitfield availability, then:
### Candidate Selection: Prospective Parachains Mode
The state of the provisioner `PerRelayParent` tracks an important setting, `ProspectiveParachainsMode`. This setting determines which backable candidate selection method the provisioner uses.
The state of the provisioner `PerRelayParent` tracks an important setting, `ProspectiveParachainsMode`. This setting
determines which backable candidate selection method the provisioner uses.
`ProspectiveParachainsMode::Disabled` - The provisioner uses its own internal legacy candidate selection.
`ProspectiveParachainsMode::Enabled` - The provisioner requests that [prospective parachains](../backing/prospective-parachains.md) provide selected candidates.
`ProspectiveParachainsMode::Disabled` - The provisioner uses its own internal legacy candidate selection.
`ProspectiveParachainsMode::Enabled` - The provisioner requests that [prospective
parachains](../backing/prospective-parachains.md) provide selected candidates.
Candidates selected with `ProspectiveParachainsMode::Enabled` are able to benefit from the increased block production time asynchronous backing allows. For this reason all Polkadot protocol networks will eventually use prospective parachains candidate selection. Then legacy candidate selection will be removed as obsolete.
Candidates selected with `ProspectiveParachainsMode::Enabled` are able to benefit from the increased block production
time asynchronous backing allows. For this reason all Polkadot protocol networks will eventually use prospective
parachains candidate selection. Then legacy candidate selection will be removed as obsolete.
### Prospective Parachains Candidate Selection
The goal of candidate selection is to determine which cores are free, and then to the degree possible, pick a candidate appropriate to each free core. In prospective parachains candidate selection the provisioner handles the former process while [prospective parachains](../backing/prospective-parachains.md) handles the latter.
The goal of candidate selection is to determine which cores are free, and then to the degree possible, pick a candidate
appropriate to each free core. In prospective parachains candidate selection the provisioner handles the former process
while [prospective parachains](../backing/prospective-parachains.md) handles the latter.
To select backable candidates:
@@ -116,32 +164,50 @@ To select backable candidates:
- The core is unscheduled and doesn't need to be provisioned with a candidate.
- On `CoreState::Scheduled`
- The core is unoccupied and scheduled to accept a backed block for a particular `para_id`.
- The provisioner requests a backable candidate from [prospective parachains](../backing/prospective-parachains.md) with the desired relay parent, the core's scheduled `para_id`, and an empty required path.
- The provisioner requests a backable candidate from [prospective parachains](../backing/prospective-parachains.md)
with the desired relay parent, the core's scheduled `para_id`, and an empty required path.
- On `CoreState::Occupied`
- The availability core is occupied by a parachain block candidate pending availability. A further candidate need not be provided by the provisioner unless the core will be vacated this block. This is the case when either bitfields indicate the current core occupant has been made available or a timeout is reached.
- The availability core is occupied by a parachain block candidate pending availability. A further candidate need
not be provided by the provisioner unless the core will be vacated this block. This is the case when either
bitfields indicate the current core occupant has been made available or a timeout is reached.
- If `bitfields_indicate_availability`
- If `Some(scheduled_core) = occupied_core.next_up_on_available`, the core will be vacated and in need of a provisioned candidate. The provisioner requests a backable candidate from [prospective parachains](../backing/prospective-parachains.md) with the core's scheduled `para_id` and a required path with one entry. This entry corresponds to the parablock candidate previously occupying this core, which was made available and can be built upon even though it hasn't been seen as included in a relay chain block yet. See the Required Path section below for more detail.
- If `occupied_core.next_up_on_available` is `None`, then the core being vacated is unscheduled and doesn't need to be provisioned with a candidate.
- If `Some(scheduled_core) = occupied_core.next_up_on_available`, the core will be vacated and in need of a
provisioned candidate. The provisioner requests a backable candidate from [prospective
parachains](../backing/prospective-parachains.md) with the core's scheduled `para_id` and a required path with
one entry. This entry corresponds to the parablock candidate previously occupying this core, which was made
available and can be built upon even though it hasn't been seen as included in a relay chain block yet. See the
Required Path section below for more detail.
- If `occupied_core.next_up_on_available` is `None`, then the core being vacated is unscheduled and doesn't need
to be provisioned with a candidate.
- Else-if `occupied_core.time_out_at == block_number`
- If `Some(scheduled_core) = occupied_core.next_up_on_time_out`, the core will be vacated and in need of a provisioned candidate. A candidate is requested in exactly the same way as with `CoreState::Scheduled`.
- Else the core being vacated is unscheduled and doesn't need to be provisioned with a candidate.
The end result of this process is a vector of `CandidateHash`s, sorted in order of their core index.
- If `Some(scheduled_core) = occupied_core.next_up_on_time_out`, the core will be vacated and in need of a
provisioned candidate. A candidate is requested in exactly the same way as with `CoreState::Scheduled`.
- Else the core being vacated is unscheduled and doesn't need to be provisioned with a candidate.

The end result of this process is a vector of `CandidateHash`s, sorted in order of their core index.
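The per-core decision described above can be sketched roughly as follows. This is a minimal illustration, not the provisioner's actual code: the types are simplified stand-ins, and `BackableRequest`, `request_for_core`, and the `candidate_hash` field are hypothetical names introduced only for this example.

```rust
// Simplified, hypothetical stand-ins for the real types. Only the fields
// mentioned in the text are modelled.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ParaId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CandidateHash(u64);

struct ScheduledCore {
    para_id: ParaId,
}

struct OccupiedCore {
    next_up_on_available: Option<ScheduledCore>,
    next_up_on_time_out: Option<ScheduledCore>,
    time_out_at: u32,
    // Assumed field: hash of the candidate currently occupying the core.
    candidate_hash: CandidateHash,
}

// Only the two states discussed in this section are modelled.
enum CoreState {
    Scheduled(ScheduledCore),
    Occupied(OccupiedCore),
}

// Hypothetical description of what is asked of prospective parachains.
#[derive(Debug, PartialEq, Eq)]
struct BackableRequest {
    para_id: ParaId,
    required_path: Vec<CandidateHash>,
}

// Returns the request to make for a core, or `None` if the core does not
// need a candidate this block.
fn request_for_core(
    core: &CoreState,
    bitfields_indicate_availability: bool,
    block_number: u32,
) -> Option<BackableRequest> {
    match core {
        // Scheduled core: empty required path.
        CoreState::Scheduled(scheduled) => Some(BackableRequest {
            para_id: scheduled.para_id,
            required_path: Vec::new(),
        }),
        CoreState::Occupied(occupied) => {
            if bitfields_indicate_availability {
                // Build on the occupant that was just made available.
                occupied.next_up_on_available.as_ref().map(|next| BackableRequest {
                    para_id: next.para_id,
                    required_path: vec![occupied.candidate_hash],
                })
            } else if occupied.time_out_at == block_number {
                // Timed out: request exactly as for a scheduled core.
                occupied.next_up_on_time_out.as_ref().map(|next| BackableRequest {
                    para_id: next.para_id,
                    required_path: Vec::new(),
                })
            } else {
                // Core stays occupied; nothing to provision.
                None
            }
        }
    }
}
```

Collecting the `Some` results in core-index order would give the list of requests whose answers make up the final vector of `CandidateHash`s.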
#### Required Path
Required path is a parameter for `ProspectiveParachainsMessage::GetBackableCandidate`, which the provisioner sends in candidate selection.
Required path is a parameter for `ProspectiveParachainsMessage::GetBackableCandidate`, which the provisioner sends in
candidate selection.
An empty required path indicates that the requested candidate should be a direct child of the most recently included parablock for the given `para_id` as of the given relay parent.
An empty required path indicates that the requested candidate should be a direct child of the most recently included
parablock for the given `para_id` as of the given relay parent.
In contrast, a required path with one or more entries prompts [prospective parachains](../backing/prospective-parachains.md) to step forward through its fragment tree for the given `para_id` and relay parent until the desired parablock is reached. We then select a direct child of that parablock to pass to the provisioner.
In contrast, a required path with one or more entries prompts [prospective
parachains](../backing/prospective-parachains.md) to step forward through its fragment tree for the given `para_id` and
relay parent until the desired parablock is reached. We then select a direct child of that parablock to pass to the
provisioner.
The parablocks making up a required path do not need to have been previously seen as included in relay chain blocks. Thus the ability to provision backable candidates based on a required path effectively decouples backing from inclusion.
The parablocks making up a required path do not need to have been previously seen as included in relay chain blocks.
Thus the ability to provision backable candidates based on a required path effectively decouples backing from inclusion.
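To make the required-path lookup concrete, here is a toy sketch. It assumes a much simpler structure than the real fragment tree: `FragmentTree`, `root`, `children`, and `backable_child` are hypothetical names, and "select a direct child" is reduced to taking the first known child.

```rust
use std::collections::HashMap;

// Stand-in for the real candidate hash type.
type CandidateHash = u64;

// Hypothetical toy fragment tree for a single `para_id` and relay parent.
// `root` stands for the most recently included parablock.
struct FragmentTree {
    root: CandidateHash,
    children: HashMap<CandidateHash, Vec<CandidateHash>>,
}

impl FragmentTree {
    // Steps through `required_path` starting at `root`, then returns a
    // direct child of the last parablock reached (here: simply the first).
    fn backable_child(&self, required_path: &[CandidateHash]) -> Option<CandidateHash> {
        let mut current = self.root;
        for step in required_path {
            // Each path entry must be a known child of the current block.
            let kids = self.children.get(&current)?;
            if !kids.contains(step) {
                return None;
            }
            current = *step;
        }
        self.children.get(&current)?.first().copied()
    }
}
```

With an empty path the result is a child of the most recently included parablock. With a one-entry path it is a child of that entry, which need not have been included on the relay chain yet.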
### Legacy Candidate Selection
### Legacy Candidate Selection
Legacy candidate selection takes place in the provisioner. Thus the provisioner needs to keep an up-to-date record of all [backed_candidates](../../types/backing.md#backed-candidate) `PerRelayParent` to pick from.
Legacy candidate selection takes place in the provisioner. Thus the provisioner needs to keep an up-to-date record of
all [backed_candidates](../../types/backing.md#backed-candidate) `PerRelayParent` to pick from.
The goal of candidate selection is to determine which cores are free, and then to the degree possible, pick a candidate appropriate to each free core.
The goal of candidate selection is to determine which cores are free, and then to the degree possible, pick a candidate
appropriate to each free core.
To determine availability:
@@ -149,38 +215,54 @@ To determine availability:
- For each core state:
- On `CoreState::Scheduled`, then we can make an `OccupiedCoreAssumption::Free`.
- On `CoreState::Occupied`, then we may be able to make an assumption:
- If the bitfields indicate availability and there is a scheduled `next_up_on_available`, then we can make an `OccupiedCoreAssumption::Included`.
- If the bitfields do not indicate availability, and there is a scheduled `next_up_on_time_out`, and `occupied_core.time_out_at == block_number_under_production`, then we can make an `OccupiedCoreAssumption::TimedOut`.
- If the bitfields indicate availability and there is a scheduled `next_up_on_available`, then we can make an
`OccupiedCoreAssumption::Included`.
- If the bitfields do not indicate availability, and there is a scheduled `next_up_on_time_out`, and
`occupied_core.time_out_at == block_number_under_production`, then we can make an
`OccupiedCoreAssumption::TimedOut`.
- If we did not make an `OccupiedCoreAssumption`, then continue on to the next core.
- Now compute the core's `validation_data_hash`: get the `PersistedValidationData` from the runtime, given the known `ParaId` and `OccupiedCoreAssumption`;
- Now compute the core's `validation_data_hash`: get the `PersistedValidationData` from the runtime, given the known
`ParaId` and `OccupiedCoreAssumption`;
- Find an appropriate candidate for the core.
- There are two constraints: `backed_candidate.candidate.descriptor.para_id == scheduled_core.para_id && backed_candidate.candidate.descriptor.validation_data_hash == computed_validation_data_hash`.
- There are two constraints: `backed_candidate.candidate.descriptor.para_id == scheduled_core.para_id &&
backed_candidate.candidate.descriptor.validation_data_hash == computed_validation_data_hash`.
- In the event that more than one candidate meets the constraints, selection between the candidates is arbitrary.
However, not more than one candidate can be selected per core.
The end result of this process is a vector of `CandidateHash`s, sorted in order of their core index.
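The legacy steps above might look roughly like this in code. It is a sketch under simplifying assumptions: `ParaId` and hashes are plain integers, the runtime call that yields `PersistedValidationData` is left out, and `assumption_for` and `pick_candidate` are hypothetical helper names.

```rust
#[derive(Debug, PartialEq, Eq)]
enum OccupiedCoreAssumption {
    Included,
    TimedOut,
    Free,
}

// Simplified stand-in: the scheduled `ParaId`s are modelled as plain `u32`s.
struct OccupiedCore {
    next_up_on_available: Option<u32>,
    next_up_on_time_out: Option<u32>,
    time_out_at: u32,
}

enum CoreState {
    Scheduled(u32),
    Occupied(OccupiedCore),
}

// Returns the para to provision for together with the assumption made, or
// `None` if no assumption can be made and the core should be skipped.
fn assumption_for(
    core: &CoreState,
    bitfields_indicate_availability: bool,
    block_number_under_production: u32,
) -> Option<(u32, OccupiedCoreAssumption)> {
    match core {
        CoreState::Scheduled(para_id) => Some((*para_id, OccupiedCoreAssumption::Free)),
        CoreState::Occupied(occupied) => {
            if bitfields_indicate_availability {
                occupied
                    .next_up_on_available
                    .map(|para_id| (para_id, OccupiedCoreAssumption::Included))
            } else if occupied.time_out_at == block_number_under_production {
                occupied
                    .next_up_on_time_out
                    .map(|para_id| (para_id, OccupiedCoreAssumption::TimedOut))
            } else {
                None
            }
        }
    }
}

// Only the two descriptor fields used by the constraints are modelled.
struct CandidateDescriptor {
    para_id: u32,
    validation_data_hash: u64,
}

// Picks the index of one candidate meeting both constraints. Which match is
// chosen is arbitrary; this sketch takes the first.
fn pick_candidate(
    candidates: &[CandidateDescriptor],
    para_id: u32,
    computed_validation_data_hash: u64,
) -> Option<usize> {
    candidates.iter().position(|descriptor| {
        descriptor.para_id == para_id
            && descriptor.validation_data_hash == computed_validation_data_hash
    })
}
```

Because `pick_candidate` returns at most one index, the rule of not more than one candidate per core holds by construction.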
### Retrieving Full `BackedCandidate`s for Selected Hashes
Legacy candidate selection and prospective parachains candidate selection both leave us with a vector of `CandidateHash`s. These are passed to the backing subsystem with `CandidateBackingMessage::GetBackedCandidates`.
Legacy candidate selection and prospective parachains candidate selection both leave us with a vector of
`CandidateHash`s. These are passed to the backing subsystem with `CandidateBackingMessage::GetBackedCandidates`.
The response is a vector of `BackedCandidate`s, sorted in order of their core index and ready to be provisioned to block authoring. The candidate selection and retrieval process should select at most one candidate that upgrades the runtime validation code.
The response is a vector of `BackedCandidate`s, sorted in order of their core index and ready to be provisioned to block
authoring. The candidate selection and retrieval process should select at most one candidate that upgrades the
runtime validation code.
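One way to enforce the code-upgrade limit is a single pass over the sorted candidates, as sketched below. The `BackedCandidate` struct here is a hypothetical reduction to two fields, and `limit_code_upgrades` is an illustrative name, not a function from the provisioner.

```rust
// Hypothetical reduction of a backed candidate to the two properties that
// matter for this rule.
#[derive(Clone, Debug, PartialEq, Eq)]
struct BackedCandidate {
    core_index: u32,
    upgrades_validation_code: bool,
}

// Keeps candidates in their given (core index) order, but drops every
// code-upgrading candidate after the first one.
fn limit_code_upgrades(candidates: Vec<BackedCandidate>) -> Vec<BackedCandidate> {
    let mut seen_upgrade = false;
    candidates
        .into_iter()
        .filter(|candidate| {
            if candidate.upgrades_validation_code {
                if seen_upgrade {
                    return false;
                }
                seen_upgrade = true;
            }
            true
        })
        .collect()
}
```

Which upgrading candidate survives is a policy choice. This sketch keeps the one with the lowest core index simply because it comes first.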
## Glossary
- **Relay-parent:**
- A particular relay-chain block which serves as an anchor and reference point for processes and data which depend on relay-chain state.
- **Active Leaf:**
- A relay chain block which is the head of an active fork of the relay chain.
- **Relay-parent:**
- A particular relay-chain block which serves as an anchor and reference point for processes and data which depend on
relay-chain state.
- **Active Leaf:**
- A relay chain block which is the head of an active fork of the relay chain.
- Block authorship provisioning jobs are spawned per active leaf and concluded for any leaves which become inactive.
- **Candidate Selection:**
- **Candidate Selection:**
- The process by which the provisioner selects backable parachain block candidates to pass to block authoring.
- Two versions, prospective parachains candidate selection and legacy candidate selection. See their respective protocol sections for details.
- **Availability Core:**
- Often referred to simply as "cores", availability cores are an abstraction used for resource management. For the provisioner, availability cores are most relevant in that core states determine which `para_id`s to provision backable candidates for.
- For more on availability cores see [Scheduler Module: Availability Cores](../../runtime/scheduler.md#availability-cores)
- Two versions, prospective parachains candidate selection and legacy candidate selection. See their respective
protocol sections for details.
- **Availability Core:**
- Often referred to simply as "cores", availability cores are an abstraction used for resource management. For the
provisioner, availability cores are most relevant in that core states determine which `para_id`s to provision
backable candidates for.
- For more on availability cores see [Scheduler Module: Availability
Cores](../../runtime/scheduler.md#availability-cores)
- **Availability Bitfield:**
- Often referred to simply as a "bitfield", an availability bitfield represents the view of parablock candidate availability from a particular validator's perspective. Each bit in the bitfield corresponds to a single [availability core](../../runtime-api/availability-cores.md).
- Often referred to simply as a "bitfield", an availability bitfield represents the view of parablock candidate
availability from a particular validator's perspective. Each bit in the bitfield corresponds to a single
[availability core](../../runtime-api/availability-cores.md).
- For more on availability bitfields see [availability](../../types/availability.md)
- **Backable vs. Backed:**
- Note that we sometimes use "backed" to refer to candidates that are "backable", but not yet backed on chain.
- Backable means that a quorum of the candidate's assigned backing group have provided signed affirming statements.
- Backable means that a quorum of the candidate's assigned backing group have provided signed affirming statements.
@@ -1,45 +1,70 @@
# PVF Pre-checker
The PVF pre-checker is a subsystem that is responsible for watching the relay chain for new PVFs that require pre-checking. Head over to [overview] for the PVF pre-checking process overview.
The PVF pre-checker is a subsystem that is responsible for watching the relay chain for new PVFs that require
pre-checking. Head over to [overview] for the PVF pre-checking process overview.
## Protocol
There is no dedicated input mechanism for the PVF pre-checker. Instead, the PVF pre-checker watches the `ActiveLeavesUpdate` event stream for work.
There is no dedicated input mechanism for the PVF pre-checker. Instead, the PVF pre-checker watches the
`ActiveLeavesUpdate` event stream for work.
This subsystem does not produce any output messages either. The subsystem will, however, send messages to the [Runtime API] subsystem to query for the pending PVFs and to submit votes. In addition to that, it will also communicate with the [Candidate Validation] subsystem to request PVF pre-checks.
This subsystem does not produce any output messages either. The subsystem will, however, send messages to the [Runtime
API] subsystem to query for the pending PVFs and to submit votes. In addition to that, it will also communicate with
the [Candidate Validation] subsystem to request PVF pre-checks.
## Functionality
If the node is running in collator mode, this subsystem will be disabled. The PVF pre-checker subsystem keeps track of the PVFs that are relevant for the subsystem.
If the node is running in collator mode, this subsystem will be disabled. The PVF pre-checker subsystem keeps track of
the PVFs that are relevant for the subsystem.
To be relevant for the subsystem, a PVF must be returned by the [`pvfs_require_precheck` runtime API][PVF pre-checking runtime API] in any of the active leaves. If the PVF is not present in any of the active leaves, it ceases to be relevant.
To be relevant for the subsystem, a PVF must be returned by the [`pvfs_require_precheck` runtime API][PVF pre-checking
runtime API] in any of the active leaves. If the PVF is not present in any of the active leaves, it ceases to be
relevant.
When a PVF just becomes relevant, the subsystem will send a message to the [Candidate Validation] subsystem asking for the pre-check.
When a PVF just becomes relevant, the subsystem will send a message to the [Candidate Validation] subsystem asking for
the pre-check.
Upon receiving a message from the candidate-validation subsystem, the pre-checker will note down that the PVF has its judgement and will also sign and submit a [`PvfCheckStatement`][PvfCheckStatement] via the [`submit_pvf_check_statement` runtime API][PVF pre-checking runtime API]. If a judgement is received for a PVF that is no longer in view, it is ignored.
Upon receiving a message from the candidate-validation subsystem, the pre-checker will note down that the PVF has its
judgement and will also sign and submit a [`PvfCheckStatement`][PvfCheckStatement] via the [`submit_pvf_check_statement`
runtime API][PVF pre-checking runtime API]. If a judgement is received for a PVF that is no longer in view, it is
ignored.
Since a vote is only valid during [one session][overview], the subsystem will have to re-sign and submit the statements for the new session. The new session is assumed to have started if at least one of the leaves has a greater session index than was previously observed in any of the leaves.
Since a vote is only valid during [one session][overview], the subsystem will have to re-sign and submit the statements
for the new session. The new session is assumed to have started if at least one of the leaves has a greater session
index than was previously observed in any of the leaves.
The subsystem tracks all the statements that it submitted within a session. If for some reason a PVF became irrelevant and then becomes relevant again, the subsystem will not submit a new statement for that PVF within the same session.
The subsystem tracks all the statements that it submitted within a session. If for some reason a PVF became irrelevant
and then becomes relevant again, the subsystem will not submit a new statement for that PVF within the same session.
If the node is not in the active validator set, it will still perform all the checks. However, it will only submit the check statements when the node is in the active validator set.
If the node is not in the active validator set, it will still perform all the checks. However, it will only submit the
check statements when the node is in the active validator set.
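The session and statement bookkeeping described in this section could be modelled roughly as below. This is a hedged sketch, not the subsystem's real state: `PreChecker`, `note_session`, and `should_submit` are hypothetical names, PVFs are identified by a plain integer, and signing and runtime calls are left out.

```rust
use std::collections::HashSet;

// Stand-ins for the real hash and session index types.
type PvfHash = u64;
type SessionIndex = u32;

// Hypothetical bookkeeping for the pre-checker.
struct PreChecker {
    // Highest session index observed in any leaf so far.
    latest_session: Option<SessionIndex>,
    // PVFs for which a statement was already submitted in this session.
    voted: HashSet<PvfHash>,
    // Whether this node is in the active validator set.
    is_validator: bool,
}

impl PreChecker {
    // Notes the session index of a leaf. Returns `true` if a new session
    // started, in which case statements must be signed and submitted again.
    fn note_session(&mut self, session: SessionIndex) -> bool {
        match self.latest_session {
            Some(previous) if session <= previous => false,
            _ => {
                self.latest_session = Some(session);
                self.voted.clear();
                true
            }
        }
    }

    // Returns `true` if a statement for `pvf` should be submitted now:
    // only for active validators, and at most once per session.
    fn should_submit(&mut self, pvf: PvfHash) -> bool {
        if !self.is_validator {
            return false;
        }
        // `insert` returns `false` if the PVF was already recorded.
        self.voted.insert(pvf)
    }
}
```

Because `voted` is keyed by PVF and cleared only on a session change, a PVF that becomes irrelevant and then relevant again within the same session does not trigger a second statement.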
### Rejecting failed PVFs
It is possible that the candidate validation was not able to check the PVF, e.g. if it timed out. In that case, the PVF pre-checker will vote against it. This is considered safe, as there is no slashing for being on the wrong side of a pre-check vote.
It is possible that the candidate validation was not able to check the PVF, e.g. if it timed out. In that case, the PVF
pre-checker will vote against it. This is considered safe, as there is no slashing for being on the wrong side of a
pre-check vote.
Rejecting instead of abstaining is better in several ways:
1. Conclusion is reached faster - we have actual votes, instead of relying on a timeout.
1. Being strict in pre-checking makes it safer to be more lenient in preparation errors afterwards. Hence we have more leeway in avoiding raising dubious disputes, without making things less secure.
1. Being strict in pre-checking makes it safer to be more lenient in preparation errors afterwards. Hence we have more
leeway in avoiding raising dubious disputes, without making things less secure.
Also, if we only abstain, an attacker can specially craft a PVF wasm blob so that it will fail on e.g. 50% of the validators. In that case a supermajority will never be reached and the vote will repeat multiple times, most likely with the same result (since all votes are cleared on a session change). This is avoided by rejecting failed PVFs, and by only requiring 1/3 of validators to reject a PVF to reach a decision.
Also, if we only abstain, an attacker can specially craft a PVF wasm blob so that it will fail on e.g. 50% of the
validators. In that case a supermajority will never be reached and the vote will repeat multiple times, most likely with
the same result (since all votes are cleared on a session change). This is avoided by rejecting failed PVFs, and by only
requiring 1/3 of validators to reject a PVF to reach a decision.
### Note on Disputes
Having a pre-checking phase allows us to make certain assumptions later when preparing the PVF for execution. If a runtime passed pre-checking, then we know that the runtime should be valid, and therefore any issue during preparation for execution can be assumed to be a local problem on the current node.
Having a pre-checking phase allows us to make certain assumptions later when preparing the PVF for execution. If a
runtime passed pre-checking, then we know that the runtime should be valid, and therefore any issue during preparation
for execution can be assumed to be a local problem on the current node.
For this reason, even deterministic preparation errors should not trigger disputes. And since we do not dispute as a result of the pre-checking phase, as stated above, it should be impossible for preparation in general to result in disputes.
For this reason, even deterministic preparation errors should not trigger disputes. And since we do not dispute as a
result of the pre-checking phase, as stated above, it should be impossible for preparation in general to result in
disputes.
[overview]: ../../pvf-prechecking.md
[Runtime API]: runtime-api.md
@@ -1,6 +1,7 @@
# Runtime API
The Runtime API subsystem is responsible for providing a single point of access to runtime state data via a set of pre-determined queries. This prevents shared ownership of a blockchain client resource by providing
The Runtime API subsystem is responsible for providing a single point of access to runtime state data via a set of
pre-determined queries. This prevents shared ownership of a blockchain client resource by providing
## Protocol
@@ -10,8 +11,11 @@ Output: None
## Functionality
On receipt of `RuntimeApiMessage::Request(relay_parent, request)`, answer the request using the post-state of the `relay_parent` provided and provide the response to the side-channel embedded within the request.
On receipt of `RuntimeApiMessage::Request(relay_parent, request)`, answer the request using the post-state of the
`relay_parent` provided and provide the response to the side-channel embedded within the request.
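The request/response pattern with an embedded side-channel can be illustrated as follows. This sketch uses `std::sync::mpsc` as a stand-in for whatever channel type the subsystem actually uses. The `Request` enum with its single `SessionIndexForChild` variant, the `handle` function, and the `session_at` closure are simplified assumptions for illustration.

```rust
use std::sync::mpsc::{channel, Sender};

// Stand-in for the relay-chain block hash type.
type Hash = u64;

// Simplified request type: each variant carries the sender half of the
// side-channel on which the answer is returned.
enum Request {
    SessionIndexForChild(Sender<u32>),
}

enum RuntimeApiMessage {
    Request(Hash, Request),
}

// Answers a request against the post-state of `relay_parent`. The runtime
// itself is modelled by the `session_at` closure (an assumption).
fn handle(message: RuntimeApiMessage, session_at: impl Fn(Hash) -> u32) {
    match message {
        RuntimeApiMessage::Request(relay_parent, Request::SessionIndexForChild(response)) => {
            // The requester may have gone away; a failed send is ignored.
            let _ = response.send(session_at(relay_parent));
        }
    }
}
```

A caller creates a channel with `channel()`, embeds the sender in the request, and waits on the receiver for the answer. The subsystem never needs to know who asked.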
## Jobs
> TODO Don't limit requests based on parent hash, but limit caching. No caching should be done for any requests on `relay_parent`s that are not active based on `ActiveLeavesUpdate` messages. Maybe with some leeway for things that have just been stopped.
> TODO Don't limit requests based on parent hash, but limit caching. No caching should be done for any requests on
> `relay_parent`s that are not active based on `ActiveLeavesUpdate` messages. Maybe with some leeway for things that
> have just been stopped.