feat: initialize Kurdistan SDK - independent fork of Polkadot SDK
@@ -0,0 +1,10 @@
|
||||
# Approval Subsystems
|
||||
|
||||
The approval subsystems implement the node-side of the [Approval Protocol](../../protocol-approval.md).
|
||||
|
||||
We make a divide between the [assignment/voting logic](approval-voting.md) and the [distribution
|
||||
logic](approval-distribution.md) that distributes assignment certifications and approval votes. The logic in the
|
||||
assignment and voting also informs the GRANDPA voting rule on how to vote.
|
||||
|
||||
These subsystems are intended to flag issues and begin participating in live disputes. Dispute subsystems also track all
|
||||
observed votes (backing, approval, and dispute-specific) by all validators on all candidates.
|
||||
@@ -0,0 +1,348 @@
|
||||
# Approval Distribution
|
||||
|
||||
A subsystem for the distribution of assignments and approvals for approval checks on candidates over the network.
|
||||
|
||||
The [Approval Voting](approval-voting.md) subsystem is responsible for active participation in a protocol designed to
|
||||
select a sufficient number of validators to check each and every candidate which appears in the relay chain. Statements
|
||||
of participation in this checking process are divided into two kinds:
|
||||
* **Assignments** indicate that validators have been selected to do checking.
|
||||
* **Approvals** indicate that validators have checked and found the candidate satisfactory.
|
||||
|
||||
The [Approval Voting](approval-voting.md) subsystem handles all the issuing and tallying of this protocol, but this
|
||||
subsystem is responsible for the disbursal of statements among the validator-set.
|
||||
|
||||
The inclusion pipeline of candidates concludes after availability, and only after inclusion do candidates actually get
|
||||
pushed into the approval checking pipeline. As such, this protocol deals with the candidates _made available by_
|
||||
particular blocks, as opposed to the candidates which actually appear within those blocks, which are the candidates
|
||||
_backed by_ those blocks. Unless stated otherwise, whenever we reference a candidate partially by block hash, we are
|
||||
referring to the set of candidates _made available by_ those blocks.
|
||||
|
||||
We implement this protocol as a gossip protocol, and like other teyrchain-related gossip protocols our primary concerns
|
||||
are about ensuring fast message propagation while maintaining an upper bound on the number of messages any given node
|
||||
must store at any time.
|
||||
|
||||
Approval messages should always follow assignments, so we need to be able to discern two pieces of information based on
|
||||
our [View](../../types/network.md#universal-types):
|
||||
1. Is a particular assignment relevant under a given `View`?
|
||||
2. Is a particular approval relevant to any assignment in a set?
|
||||
|
||||
For our own local view, these two queries must not yield false negatives. When applied to our peers' views, it is
|
||||
acceptable for them to yield false negatives. The reason for that is that our peers' views may be beyond ours, and we
|
||||
are not capable of fully evaluating them. Once we have caught up, we can check again for false negatives to continue
|
||||
distributing.
|
||||
|
||||
For assignments, what we need to be checking is whether we are aware of the (block, candidate) pair that the assignment
|
||||
references. For approvals, we need to be aware of an assignment by the same validator which references the candidate
|
||||
being approved.
|
||||
|
||||
However, awareness on its own of a (block, candidate) pair would imply that even ancient candidates all the way back to
|
||||
the genesis are relevant. We are actually not interested in anything before finality.
|
||||
|
||||
We gossip assignments along a grid topology produced by the [Gossip Support Subsystem](../utility/gossip-support.md) and
|
||||
also to a few random peers. The first time we accept an assignment or approval, regardless of the source, which
|
||||
originates from a validator peer in a shared dimension of the grid, we propagate the message to validator peers in the
|
||||
unshared dimension as well as a few random peers.
|
||||
|
||||
But, in case these mechanisms don't work on their own, we need to trade bandwidth for protocol liveness by introducing
|
||||
aggression.
|
||||
|
||||
Aggression has 3 levels:
|
||||
* Aggression Level 0: The basic behaviors described above.
|
||||
* Aggression Level 1: The originator of a message sends to all peers. Other peers follow the rules above.
|
||||
* Aggression Level 2: All peers send all messages to all their row and column neighbors. This means that each validator
|
||||
will, on average, receive each message approximately 2*sqrt(n) times.
|
||||
|
||||
These aggression levels are chosen based on how long a block has taken to finalize: assignments and approvals related to
|
||||
the unfinalized block will be propagated with more aggression. In particular, it's only the earliest unfinalized blocks
|
||||
that aggression should be applied to, because descendants may be unfinalized only by virtue of being descendants.
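
To see where the `2*sqrt(n)` figure for Aggression Level 2 comes from, the sketch below lays `n` validators out row-major in a roughly square grid and lists the row and column neighbors of one validator. The layout and the helper name `grid_neighbors` are assumptions made for illustration; the real topology is produced by the Gossip Support Subsystem.

```rust
/// Hypothetical helper: row and column neighbors of `index` when `n`
/// validators are placed row-major in a grid of width ceil(sqrt(n)).
fn grid_neighbors(index: usize, n: usize) -> (Vec<usize>, Vec<usize>) {
    // Guard against a zero width so the divisions below cannot panic.
    let width = ((n as f64).sqrt().ceil() as usize).max(1);
    let (row, col) = (index / width, index % width);
    let row_neighbors: Vec<usize> = (0..n)
        .filter(|&i| i != index && i / width == row)
        .collect();
    let col_neighbors: Vec<usize> = (0..n)
        .filter(|&i| i != index && i % width == col)
        .collect();
    (row_neighbors, col_neighbors)
}
```

With 100 validators, each validator has 9 row neighbors and 9 column neighbors, so if every neighbor forwards every message the validator sees roughly `2*sqrt(100)` copies.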
|
||||
|
||||
## Protocol
|
||||
|
||||
Input:
|
||||
* `ApprovalDistributionMessage::NewBlocks`
|
||||
* `ApprovalDistributionMessage::DistributeAssignment`
|
||||
* `ApprovalDistributionMessage::DistributeApproval`
|
||||
* `ApprovalDistributionMessage::NetworkBridgeUpdate`
|
||||
* `OverseerSignal::BlockFinalized`
|
||||
|
||||
Output:
|
||||
* `ApprovalVotingMessage::ImportAssignment`
|
||||
* `ApprovalVotingMessage::ImportApproval`
|
||||
* `NetworkBridgeMessage::SendValidationMessage::ApprovalDistribution`
|
||||
|
||||
## Functionality
|
||||
|
||||
```rust
use std::collections::{BTreeMap, HashMap, HashSet};

type BlockScopedCandidate = (Hash, CandidateHash);

enum PendingMessage {
    Assignment(IndirectAssignmentCert, CoreIndex),
    Approval(IndirectSignedApprovalVote),
}

/// The `State` struct is responsible for tracking the overall state of the subsystem.
///
/// It tracks metadata about our view of the unfinalized chain, which assignments and approvals we have seen, and our peers' views.
struct State {
    // These two fields are used in conjunction to construct a view over the unfinalized chain.
    blocks_by_number: BTreeMap<BlockNumber, Vec<Hash>>,
    blocks: HashMap<Hash, BlockEntry>,

    /// Our view updates to our peers can race with `NewBlocks` updates. We store messages received
    /// against the directly mentioned blocks in our view in this map until `NewBlocks` is received.
    ///
    /// As long as the parent is already in the `blocks` map and `NewBlocks` messages aren't delayed
    /// by more than a block length, this strategy will work well for mitigating the race. This is
    /// also a race that occurs typically on local networks.
    pending_known: HashMap<Hash, Vec<(PeerId, PendingMessage)>>,

    // Peer view data is partially stored here, and partially inline within the `BlockEntry`s
    peer_views: HashMap<PeerId, View>,
}

enum MessageFingerprint {
    Assignment(Hash, u32, ValidatorIndex),
    Approval(Hash, u32, ValidatorIndex),
}

struct Knowledge {
    known_messages: HashSet<MessageFingerprint>,
}

struct PeerKnowledge {
    /// The knowledge we've sent to the peer.
    sent: Knowledge,
    /// The knowledge we've received from the peer.
    received: Knowledge,
}

/// Information about blocks in our current view as well as whether peers know of them.
struct BlockEntry {
    // Peers who we know are aware of this block and thus, the candidates within it. This maps to their knowledge of messages.
    known_by: HashMap<PeerId, PeerKnowledge>,
    // The number of the block.
    number: BlockNumber,
    // The parent hash of the block.
    parent_hash: Hash,
    // Our knowledge of messages.
    knowledge: Knowledge,
    // A votes entry for each candidate.
    candidates: IndexMap<CandidateHash, CandidateEntry>,
}

enum ApprovalState {
    Assigned(AssignmentCert),
    Approved(AssignmentCert, ApprovalSignature),
}

/// Information about candidates in the context of a particular block they are included in. In other words,
/// multiple `CandidateEntry`s may exist for the same candidate, if it is included by multiple blocks - this is likely the case
/// when there are forks.
struct CandidateEntry {
    approvals: HashMap<ValidatorIndex, ApprovalState>,
}
```
|
||||
|
||||
### Network updates
|
||||
|
||||
#### `NetworkBridgeEvent::PeerConnected`
|
||||
|
||||
Add a blank view to the `peer_views` state.
|
||||
|
||||
#### `NetworkBridgeEvent::PeerDisconnected`
|
||||
|
||||
Remove the view under the associated `PeerId` from `State::peer_views`.
|
||||
|
||||
Iterate over every `BlockEntry` and remove `PeerId` from it.
|
||||
|
||||
#### `NetworkBridgeEvent::OurViewChange`
|
||||
|
||||
Remove entries in `pending_known` for all hashes not present in the view. Ensure a vector is present in `pending_known`
|
||||
for each hash in the view that does not have an entry in `blocks`.
|
||||
|
||||
#### `NetworkBridgeEvent::PeerViewChange`
|
||||
|
||||
Invoke `unify_with_peer(peer, view)` to catch them up to messages we have.
|
||||
|
||||
We also need to use the `view.finalized_number` to remove the `PeerId` from any blocks that it won't be wanting
|
||||
information about anymore. Note that we have to be on guard for peers doing crazy stuff like jumping their
|
||||
`finalized_number` forward 10 trillion blocks to try and get us stuck in a loop for ages.
|
||||
|
||||
One of the safeguards we can implement is to reject view updates from peers where the new `finalized_number` is less
|
||||
than the previous.
|
||||
|
||||
We augment that by defining `constrain(x)` to output the x bounded by the first and last numbers in
|
||||
`state.blocks_by_number`.
|
||||
|
||||
From there, we can loop backwards from `constrain(view.finalized_number)` until `constrain(last_view.finalized_number)`
|
||||
is reached, removing the `PeerId` from all `BlockEntry`s referenced at that height. We can break the loop early if we
|
||||
ever exit the bound supplied by the first block in `state.blocks_by_number`.
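
The bounded removal loop above can be sketched as follows. This is a minimal illustration under simplifying assumptions: `PeerId` is a plain integer, `BlockEntry` only tracks which peers know the block, and `prune_peer` is a hypothetical helper name. The point it shows is that the work done is bounded by the blocks we track, not by whatever `finalized_number` the peer claims.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

type BlockNumber = u32;
type Hash = [u8; 32];
type PeerId = u64; // stand-in for the real network peer id

/// Simplified `BlockEntry`: only records which peers know the block.
#[derive(Default)]
struct BlockEntry {
    known_by: HashSet<PeerId>,
}

/// Clamp `x` to the first and last block numbers we track.
/// Returns `None` when no blocks are tracked.
fn constrain(
    blocks_by_number: &BTreeMap<BlockNumber, Vec<Hash>>,
    x: BlockNumber,
) -> Option<BlockNumber> {
    let first = *blocks_by_number.keys().next()?;
    let last = *blocks_by_number.keys().next_back()?;
    Some(x.clamp(first, last))
}

/// Remove `peer` from every tracked block it has finalized past.
fn prune_peer(
    blocks_by_number: &BTreeMap<BlockNumber, Vec<Hash>>,
    blocks: &mut HashMap<Hash, BlockEntry>,
    peer: PeerId,
    old_finalized: BlockNumber,
    new_finalized: BlockNumber,
) {
    let Some(&first) = blocks_by_number.keys().next() else { return };
    // Reject regressions, and skip peers whose finality is still below our oldest block.
    if new_finalized < old_finalized || new_finalized < first {
        return;
    }
    let (Some(lo), Some(hi)) = (
        constrain(blocks_by_number, old_finalized),
        constrain(blocks_by_number, new_finalized),
    ) else {
        return;
    };
    // Walk backwards over the heights we actually track.
    for (_, hashes) in blocks_by_number.range(lo..=hi).rev() {
        for hash in hashes {
            if let Some(entry) = blocks.get_mut(hash) {
                entry.known_by.remove(&peer);
            }
        }
    }
}
```

A peer that jumps its `finalized_number` to `u32::MAX` costs us at most one pass over the tracked block numbers.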
|
||||
|
||||
#### `NetworkBridgeEvent::PeerMessage`
|
||||
|
||||
If the block hash referenced by the message exists in `pending_known`, add it to the vector of pending messages and
|
||||
return.
|
||||
|
||||
If the message is of type `ApprovalDistributionV1Message::Assignment(assignment_cert, claimed_index)`, then call
|
||||
`import_and_circulate_assignment(MessageSource::Peer(sender), assignment_cert, claimed_index)`.
|
||||
|
||||
If the message is of type `ApprovalDistributionV1Message::Approval(approval_vote)`, then call
|
||||
`import_and_circulate_approval(MessageSource::Peer(sender), approval_vote)`.
|
||||
|
||||
### Subsystem Updates
|
||||
|
||||
#### `ApprovalDistributionMessage::NewBlocks`
|
||||
|
||||
Create `BlockEntry` and `CandidateEntries` for all blocks.
|
||||
|
||||
For all entries in `pending_known`:
|
||||
* If there is now an entry under `blocks` for the block hash, drain all messages and import with
|
||||
`import_and_circulate_assignment` and `import_and_circulate_approval`.
|
||||
|
||||
For all peers:
|
||||
* Compute `view_intersection` as the intersection of the peer's view blocks with the hashes of the new blocks.
|
||||
* Invoke `unify_with_peer(peer, view_intersection)`.
|
||||
|
||||
#### `ApprovalDistributionMessage::DistributeAssignment`
|
||||
|
||||
Call `import_and_circulate_assignment` with `MessageSource::Local`.
|
||||
|
||||
#### `ApprovalDistributionMessage::DistributeApproval`
|
||||
|
||||
Call `import_and_circulate_approval` with `MessageSource::Local`.
|
||||
|
||||
#### `OverseerSignal::BlockFinalized`
|
||||
|
||||
Prune all lists from `blocks_by_number` with number less than or equal to `finalized_number`. Prune all the
|
||||
`BlockEntry`s referenced by those lists.
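
A minimal sketch of this pruning step, assuming the two maps from `State` and a generic `V` standing in for `BlockEntry`. The helper name `prune_finalized` is hypothetical.

```rust
use std::collections::{BTreeMap, HashMap};

type BlockNumber = u32;
type Hash = [u8; 32];

/// Drop every block with number <= `finalized_number` from both maps.
/// `V` stands in for `BlockEntry`.
fn prune_finalized<V>(
    blocks_by_number: &mut BTreeMap<BlockNumber, Vec<Hash>>,
    blocks: &mut HashMap<Hash, V>,
    finalized_number: BlockNumber,
) {
    // `split_off(k)` returns the entries with key >= k and leaves the rest in place.
    let kept = match finalized_number.checked_add(1) {
        Some(first_kept) => blocks_by_number.split_off(&first_kept),
        None => BTreeMap::new(), // every representable block number is finalized
    };
    let pruned = std::mem::replace(blocks_by_number, kept);
    for hash in pruned.into_values().flatten() {
        blocks.remove(&hash);
    }
}
```

Because `blocks_by_number` is ordered, the split touches only the finalized prefix rather than scanning every entry.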
|
||||
|
||||
|
||||
### Utility
|
||||
|
||||
```rust
enum MessageSource {
    Peer(PeerId),
    Local,
}
```
|
||||
|
||||
#### `import_and_circulate_assignment(...)`
|
||||
|
||||
`import_and_circulate_assignment(source: MessageSource, assignment: IndirectAssignmentCert, claimed_candidate_index:
|
||||
CandidateIndex)`
|
||||
|
||||
Imports an assignment cert referenced by block hash and candidate index. As a postcondition, if the cert is valid, it
|
||||
will have distributed the cert to all peers who have the block in their view, with the exclusion of the peer referenced
|
||||
by the `MessageSource`.
|
||||
|
||||
We maintain a few invariants:
|
||||
* we only send an assignment to a peer after we add its fingerprint to our knowledge
|
||||
* we add a fingerprint of an assignment to our knowledge only if it's valid and hasn't been added before
|
||||
|
||||
The algorithm is the following:
|
||||
|
||||
* Load the `BlockEntry` using `assignment.block_hash`. If it does not exist, report the source if it is
  `MessageSource::Peer` and return.
* Compute a fingerprint for the `assignment` using `claimed_candidate_index`.
* If the source is `MessageSource::Peer(sender)`:
  * Check if `sender` appears under `known_by` and whether the fingerprint is in the knowledge of the peer. If the peer
    does not know the block, report for providing data out-of-view and proceed. If the peer does know the block and
    the `sent` knowledge contains the fingerprint, report for providing duplicate data and return, otherwise, insert
    into the `received` knowledge and return.
  * If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation
    boost, add the fingerprint to the peer's knowledge only if it knows about the block and return. Note that we must do
    this after checking for out-of-view and whether the peer knows about the block to avoid being spammed. If we did this
    check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
  * Check the assignment certificate is valid.
    * If the cert kind is `RelayVRFModulo`, then the certificate is valid as long as `sample <
      session_info.relay_vrf_samples` and the VRF is valid for the validator's key with the input
      `block_entry.relay_vrf_story ++ sample.encode()` as described in
      [the approvals protocol section](../../protocol-approval.md#assignment-criteria). We set
      `core_index = vrf.make_bytes().to_u32() % session_info.n_cores`. If the `BlockEntry` causes
      inclusion of a candidate at `core_index`, then this is a valid assignment for the candidate
      at `core_index` and has delay tranche 0. Otherwise, it can be ignored.
    * If the cert kind is `RelayVRFModuloCompact`, then the certificate is valid as long as the VRF
      is valid for the validator's key with the input `block_entry.relay_vrf_story ++ relay_vrf_samples.encode()`
      as described in [the approvals protocol section](../../protocol-approval.md#assignment-criteria).
      We enforce that all `core_bitfield` indices are included in the set of the core indices sampled from the
      VRF output. The assignment is considered a valid tranche 0 assignment for all claimed candidates if all
      `core_bitfield` indices match the core indices at which the claimed candidates were included.
    * If the cert kind is `RelayVRFDelay`, then we check if the VRF is valid for the validator's key with the
      input `block_entry.relay_vrf_story ++ cert.core_index.encode()` as described in [the approvals protocol
      section](../../protocol-approval.md#assignment-criteria). The cert can be ignored if the block did not
      cause inclusion of a candidate on that core index. Otherwise, this is a valid assignment for the included
      candidate. The delay tranche for the assignment is determined by reducing
      `(vrf.make_bytes().to_u64() % (session_info.n_delay_tranches + session_info.zeroth_delay_tranche_width)).saturating_sub(session_info.zeroth_delay_tranche_width)`.
    * We also check that the core index derived by the output is covered by the `VRFProof` by means of an auxiliary signature.
    * If the delay tranche is too far in the future, return `AssignmentCheckResult::TooFarInFuture`.
  * If the result is `AssignmentCheckResult::Accepted`:
    * Dispatch `ApprovalVotingMessage::ImportAssignment(assignment)` to approval-voting to import the assignment.
    * If the vote was accepted but not duplicate, give the peer a positive reputation boost.
    * Add the fingerprint to both our and the peer's knowledge in the `BlockEntry`. Note that we only do this after
      making sure we have the right fingerprint.
  * If the result is `AssignmentCheckResult::AcceptedDuplicate`, add the fingerprint to the peer's knowledge if it
    knows about the block and return.
  * If the result is `AssignmentCheckResult::TooFarInFuture`, mildly punish the peer and return.
  * If the result is `AssignmentCheckResult::Bad`, punish the peer and return.
* If the source is `MessageSource::Local`:
  * Check if the fingerprint appears under the `BlockEntry`'s knowledge. If not, add it.
* Load the candidate entry for the given candidate index. It should exist unless there is a logic error in the
  approval voting subsystem.
* Set the approval state for the validator index to `ApprovalState::Assigned` unless the approval state is set
  already. This should not happen as long as the approval voting subsystem instructs us to ignore duplicate
  assignments.
* Dispatch an `ApprovalDistributionV1Message::Assignment(assignment, candidate_index)` to all peers in the
  `BlockEntry`'s `known_by` set, excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add
  the fingerprint of the assignment to the knowledge of each peer.
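
The delay tranche reduction used for `RelayVRFDelay` certificates can be sketched in isolation. This is an illustration only: the raw `u64` argument stands in for `vrf.make_bytes().to_u64()`, the function name `delay_tranche` is hypothetical, and the two width parameters mirror the `session_info` fields named in the text.

```rust
/// Reduce a raw VRF output to a delay tranche.
/// Outputs that land in the first `zeroth_delay_tranche_width` slots all map to tranche 0.
fn delay_tranche(vrf_output: u64, n_delay_tranches: u32, zeroth_delay_tranche_width: u32) -> u32 {
    let width = zeroth_delay_tranche_width as u64;
    let modulus = n_delay_tranches as u64 + width;
    if modulus == 0 {
        return 0; // degenerate configuration: avoid a division by zero
    }
    // The result is always below `n_delay_tranches`, so the cast cannot truncate.
    (vrf_output % modulus).saturating_sub(width) as u32
}
```

Widening the zeroth tranche therefore raises the probability that a delay assignment falls into tranche 0 without changing the number of later tranches.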
|
||||
|
||||
|
||||
#### `import_and_circulate_approval(source: MessageSource, approval: IndirectSignedApprovalVote)`
|
||||
|
||||
Imports an approval signature referenced by block hash and candidate index:
|
||||
|
||||
* Load the `BlockEntry` using `approval.block_hash` and the candidate entry using `approval.candidate_index`. If
  either does not exist, report the source if it is `MessageSource::Peer` and return.
* Compute a fingerprint for the approval.
* Compute a fingerprint for the corresponding assignment. If the `BlockEntry`'s knowledge does not contain that
  fingerprint, then report the source if it is `MessageSource::Peer` and return. All references to a fingerprint after
  this refer to the approval's, not the assignment's.
* If the source is `MessageSource::Peer(sender)`:
  * Check if `sender` appears under `known_by` and whether the fingerprint is in the knowledge of the peer. If the peer
    does not know the block, report for providing data out-of-view and proceed. If the peer does know the block and
    the `sent` knowledge contains the fingerprint, report for providing duplicate data and return, otherwise, insert
    into the `received` knowledge and return.
  * If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation
    boost, add the fingerprint to the peer's knowledge only if it knows about the block and return. Note that we must do
    this after checking for out-of-view to avoid being spammed. If we did this check earlier, a peer could provide data
    out-of-view repeatedly and be rewarded for it.
  * Construct a `SignedApprovalVote` using the candidate hashes and check against the validator's approval key,
    based on the session info of the block. If invalid or no such validator, return `Err(InvalidVoteError)`.
  * If the result of checking the signature is `Ok(CheckedIndirectSignedApprovalVote)`:
    * Dispatch `ApprovalVotingMessage::ImportApproval(approval)`.
    * Give the peer a positive reputation boost and add the fingerprint to both our and the peer's knowledge.
  * If the result is `Err(InvalidVoteError)`:
    * Report the peer and return.
* Load the candidate entry for the given candidate index. It should exist unless there is a logic error in the
  approval voting subsystem.
* Set the approval state for the validator index to `ApprovalState::Approved`. It should already be in the `Assigned`
  state as our `BlockEntry` knowledge contains a fingerprint for the assignment.
* Dispatch an `ApprovalDistributionV1Message::Approval(approval)` to all peers in the `BlockEntry`'s `known_by` set,
  excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add the fingerprint of the
  approval to the knowledge of each peer. Note that this obeys the politeness conditions:
  * We guarantee elsewhere that all peers within `known_by` are aware of all assignments relative to the block.
  * We've checked that this specific approval has a corresponding assignment within the `BlockEntry`.
  * Thus, all peers are aware of the assignment or have a message to them in-flight which will make them so.
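
The per-peer knowledge check that both import functions perform for `MessageSource::Peer` can be sketched as below. It is a simplified illustration: `Fingerprint` is a plain integer standing in for `MessageFingerprint`, `Knowledge` is flattened into hash sets, and the `PeerCheck` enum and `check_peer_message` name are hypothetical. Reputation changes are left to the caller.

```rust
use std::collections::{HashMap, HashSet};

type PeerId = u64; // stand-in for the real network peer id
type Fingerprint = u64; // stand-in for `MessageFingerprint`

#[derive(Default)]
struct PeerKnowledge {
    /// Fingerprints we have sent to the peer.
    sent: HashSet<Fingerprint>,
    /// Fingerprints we have received from the peer.
    received: HashSet<Fingerprint>,
}

#[derive(Debug, PartialEq)]
enum PeerCheck {
    /// The peer is not in `known_by`: the data is out-of-view.
    OutOfView,
    /// We already sent this fingerprint to the peer.
    Duplicate,
    /// New from this peer; now recorded under `received`.
    Fresh,
}

fn check_peer_message(
    known_by: &mut HashMap<PeerId, PeerKnowledge>,
    sender: PeerId,
    fingerprint: Fingerprint,
) -> PeerCheck {
    let Some(knowledge) = known_by.get_mut(&sender) else {
        return PeerCheck::OutOfView;
    };
    if knowledge.sent.contains(&fingerprint) {
        return PeerCheck::Duplicate;
    }
    knowledge.received.insert(fingerprint);
    PeerCheck::Fresh
}
```

Keeping `sent` and `received` separate lets the caller distinguish a peer echoing our own message back from a peer legitimately relaying something we have not sent it.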
|
||||
|
||||
#### `unify_with_peer(peer: PeerId, view)`
|
||||
|
||||
1. Initialize a set `missing_knowledge = {}`.

For each block in the view:

2. Load the `BlockEntry` for the block. If the block is unknown, or the number is less than or equal to the view's
   finalized number, go to step 6.
3. Inspect the `known_by` set of the `BlockEntry`. If the peer already knows all assignments/approvals, go to step 6.
4. Add the peer to `known_by` and add the hash and missing knowledge of the block to `missing_knowledge`.
5. Return to step 2 with the ancestor of the block.

6. For each block in `missing_knowledge`, send all assignments and approvals for all candidates in those blocks to the
   peer.
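
The ancestor walk can be sketched as follows. This is a simplified illustration, not the subsystem's code: `BlockEntry` is reduced to three fields, `known_by` is a plain set, and "the peer already knows all assignments/approvals" is approximated by the peer already being present in `known_by`. The function returns the block hashes whose messages should then be sent to the peer.

```rust
use std::collections::{HashMap, HashSet};

type Hash = [u8; 32];
type BlockNumber = u32;
type PeerId = u64; // stand-in for the real network peer id

/// Simplified `BlockEntry` for illustration.
struct BlockEntry {
    number: BlockNumber,
    parent_hash: Hash,
    known_by: HashSet<PeerId>,
}

/// Walk back from each head in the peer's view and collect the blocks
/// the peer still needs messages for.
fn unify_with_peer(
    blocks: &mut HashMap<Hash, BlockEntry>,
    peer: PeerId,
    heads: &[Hash],
    finalized_number: BlockNumber,
) -> Vec<Hash> {
    let mut missing_knowledge = Vec::new();
    for head in heads {
        let mut current = *head;
        loop {
            // Stop at unknown blocks.
            let Some(entry) = blocks.get_mut(&current) else { break };
            // Stop at blocks the peer has already finalized.
            if entry.number <= finalized_number {
                break;
            }
            // `insert` returns false if the peer was already recorded (assumed caught up).
            if !entry.known_by.insert(peer) {
                break;
            }
            missing_knowledge.push(current);
            current = entry.parent_hash;
        }
    }
    missing_knowledge
}
```

Stopping as soon as the peer is already recorded keeps repeated view updates cheap, since shared ancestors are visited only once per peer.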
|
||||
@@ -0,0 +1,30 @@
|
||||
# Approval voting parallel
|
||||
|
||||
The approval-voting-parallel subsystem acts as an orchestrator for the tasks handled by the [Approval Voting](approval-voting.md)
|
||||
and [Approval Distribution](approval-distribution.md) subsystems. Initially, these two systems operated separately and interacted
|
||||
with each other and other subsystems through orchestra.
|
||||
|
||||
With approval-voting-parallel, we have a single subsystem that creates two types of workers:
|
||||
- Four approval-distribution workers that operate in parallel, each handling tasks based on the validator_index of the message
|
||||
originator.
|
||||
- One approval-voting worker that performs the tasks previously managed by the standalone approval-voting subsystem.
|
||||
|
||||
This subsystem does not maintain any state. Instead, it functions as an orchestrator that:
|
||||
- Spawns and initializes each worker.
|
||||
- Forwards each message and signal to the appropriate worker.
|
||||
- Aggregates results for messages that require input from more than one worker, such as GetApprovalSignatures.
|
||||
|
||||
## Forwarding logic
|
||||
|
||||
The messages received and forwarded by approval-voting-parallel fall into four categories:
- Signals, which need to be forwarded to all workers.
- Messages that only the `approval-voting` worker needs to handle: `ApprovalVotingParallelMessage::ApprovedAncestor`
  and `ApprovalVotingParallelMessage::GetApprovalSignaturesForCandidate`.
- Control messages that all `approval-distribution` workers need to receive: `ApprovalVotingParallelMessage::NewBlocks`,
  `ApprovalVotingParallelMessage::ApprovalCheckingLagUpdate` and all network bridge variants of
  `ApprovalVotingParallelMessage::NetworkBridgeUpdate` except
  `ApprovalVotingParallelMessage::NetworkBridgeUpdate(NetworkBridgeEvent::PeerMessage)`.
- Data messages `ApprovalVotingParallelMessage::NetworkBridgeUpdate(NetworkBridgeEvent::PeerMessage)`, which need to be sent
  to just a single `approval-distribution` worker based on the `ValidatorIndex`. The logic for assigning the work is:
|
||||
```
|
||||
assigned_worker_index = validator_index % number_of_workers;
|
||||
```
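
As a small Rust sketch of the same rule, assuming the validator index is available as a `u32` and that at least one worker exists (the function name is illustrative):

```rust
/// Pick the approval-distribution worker responsible for a message originator.
/// The mapping is stable: the same validator always lands on the same worker.
fn assigned_worker_index(validator_index: u32, number_of_workers: usize) -> usize {
    assert!(number_of_workers > 0, "at least one worker is required");
    validator_index as usize % number_of_workers
}
```

With the four workers described above, validators 0, 4, 8, ... are handled by worker 0, validators 1, 5, 9, ... by worker 1, and so on, so all messages originating from one validator are processed by a single worker.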
|
||||
@@ -0,0 +1,531 @@
|
||||
# Approval Voting
|
||||
|
||||
Reading the [section on the approval protocol](../../protocol-approval.md) will likely be necessary to understand the
|
||||
aims of this subsystem.
|
||||
|
||||
Approval votes are split into two parts: Assignments and Approvals. Validators first broadcast their assignment to
|
||||
indicate intent to check a candidate. Upon successfully checking, they don't immediately send the vote. Instead,
they queue the check for a short period of time, `MAX_APPROVAL_COALESCE_WAIT_TICKS`, to give the validator the
opportunity to vote for more than one candidate. Once `MAX_APPROVAL_COALESCE_WAIT_TICKS` have passed or at least
`MAX_APPROVAL_COALESCE_COUNT` checks are ready, they broadcast an approval vote for all candidates. If a validator
|
||||
doesn't broadcast their approval vote shortly after issuing an assignment, this is an indication that they are
|
||||
being prevented from recovering or validating the block data and that more validators should self-select to
|
||||
check the candidate. This is known as a "no-show".
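
The coalescing rule can be sketched as a single predicate. This is an illustration under assumptions: the two limits are passed in through a hypothetical `CoalesceConfig` because their concrete values are not stated here, and `should_send` is an invented helper name.

```rust
type Tick = u64;

/// Hypothetical holder for the two coalescing limits named in the text.
struct CoalesceConfig {
    /// Corresponds to `MAX_APPROVAL_COALESCE_WAIT_TICKS`.
    max_wait_ticks: Tick,
    /// Corresponds to `MAX_APPROVAL_COALESCE_COUNT`.
    max_count: usize,
}

/// Returns `true` when the queued approvals should be signed and broadcast now:
/// either enough checks are ready, or the oldest one has waited long enough.
fn should_send(cfg: &CoalesceConfig, oldest_queued_at: Tick, now: Tick, queued: usize) -> bool {
    let waited = now.saturating_sub(oldest_queued_at);
    queued > 0 && (queued >= cfg.max_count || waited >= cfg.max_wait_ticks)
}
```

The wait bound matters for liveness: a validator that holds a finished check for too long risks being counted as a no-show by its peers.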
|
||||
|
||||
The core of this subsystem is a Tick-based timer loop, where Ticks are 500ms. We also reason about time in terms of
|
||||
`DelayTranche`s, which measure the number of ticks elapsed since a block was produced. We track metadata for all
|
||||
un-finalized but included candidates. We compute our local assignments to check each candidate, as well as which
|
||||
`DelayTranche` those assignments may be minimally triggered at. As the same candidate may appear in more than one block,
|
||||
we must produce our potential assignments for each (Block, Candidate) pair. The timing loop is based on waiting for
|
||||
assignments to become no-shows or waiting to broadcast and begin our own assignment to check.
|
||||
|
||||
Another main component of this subsystem is the logic for determining when a (Block, Candidate) pair has been approved
|
||||
and when to broadcast and trigger our own assignment. Once a (Block, Candidate) pair has been approved, we mark a
|
||||
corresponding bit in the `BlockEntry` that indicates the candidate has been approved under the block. When we trigger
|
||||
our own assignment, we broadcast it via Approval Distribution, begin fetching the data from Availability Recovery, and
|
||||
then pass it through to the Candidate Validation. Once these steps are successful, we issue our approval vote. If any of
|
||||
these steps fail, we don't issue any vote and will "no-show" from the perspective of other validators. In addition, a
dispute is raised via the dispute-coordinator by sending `IssueLocalStatement`.
|
||||
|
||||
Where this all fits into Pezkuwi is via block finality. Our goal is to not finalize any block containing a candidate
|
||||
that is not approved. We provide a hook for a custom GRANDPA voting rule - GRANDPA makes requests of the form (target,
|
||||
minimum) consisting of a target block (i.e. longest chain) that it would like to finalize, and a minimum block which,
|
||||
due to the rules of GRANDPA, must be voted on. The minimum is typically the last finalized block, but may be beyond it,
|
||||
in the case of having a last-round-estimate beyond the last finalized. Thus, our goal is to inform GRANDPA of some block
|
||||
between target and minimum which we believe can be finalized safely. We do this by iterating backwards from the target
|
||||
to the minimum and finding the longest continuous chain from minimum where all candidates included by those blocks have
|
||||
been approved.
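
A minimal sketch of that backwards walk, under simplifying assumptions: the chain is given as a slice of `(block number, all candidates approved)` pairs ordered from the target down to the block just above the minimum, and the function name mirrors the `ApprovedAncestor` message only for readability.

```rust
/// Walk from the target down towards the minimum and return the highest block
/// such that it and every block below it (above the minimum) are approved.
/// Returns `None` when the block directly above the minimum is not approved.
fn approved_ancestor(chain_desc: &[(u32, bool)]) -> Option<u32> {
    let mut best = None;
    for &(number, approved) in chain_desc {
        if approved {
            // Keep the highest block of the current unbroken approved run.
            if best.is_none() {
                best = Some(number);
            }
        } else {
            // An unapproved block invalidates everything above it.
            best = None;
        }
    }
    best
}
```

For a chain where blocks 2 and 3 are approved, block 4 is not, and block 5 is, the walk yields block 3: block 5 cannot be finalized without also finalizing the unapproved block 4.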
|
||||
|
||||
## Protocol
|
||||
|
||||
Input:
|
||||
* `ApprovalVotingMessage::ImportAssignment`
|
||||
* `ApprovalVotingMessage::ImportApproval`
|
||||
* `ApprovalVotingMessage::ApprovedAncestor`
|
||||
|
||||
Output:
|
||||
* `ApprovalDistributionMessage::DistributeAssignment`
|
||||
* `ApprovalDistributionMessage::DistributeApproval`
|
||||
* `RuntimeApiMessage::Request`
|
||||
* `ChainApiMessage`
|
||||
* `AvailabilityRecoveryMessage::Recover`
|
||||
* `CandidateExecutionMessage::ValidateFromExhaustive`
|
||||
|
||||
## Functionality
|
||||
|
||||
The approval voting subsystem is responsible for casting votes and determining approval of candidates and as a result,
|
||||
blocks.
|
||||
|
||||
This subsystem wraps a database which is used to store metadata about unfinalized blocks and the candidates within them.
|
||||
Candidates may appear in multiple blocks, and assignment criteria are chosen differently based on the hash of the block
|
||||
they appear in.
|
||||
|
||||
## Database Schema
|
||||
|
||||
The database schema is designed with the following goals in mind:
|
||||
1. To provide an easy index from unfinalized blocks to candidates
|
||||
1. To provide a lookup from candidate hash to approval status
|
||||
1. To be easy to clear on start-up. What has happened while we were offline is unimportant.
|
||||
1. To be fast to clear entries outdated by finality
|
||||
|
||||
Structs:
|
||||
|
||||
```rust
|
||||
struct TrancheEntry {
|
||||
tranche: DelayTranche,
|
||||
// assigned validators who have not yet approved, and the instant we received
|
||||
// their assignment.
|
||||
assignments: Vec<(ValidatorIndex, Tick)>,
|
||||
}
|
||||
|
||||
pub struct OurAssignment {
|
||||
/// Our assignment certificate.
|
||||
cert: AssignmentCertV2,
|
||||
/// The tranche for which the assignment refers to.
|
||||
tranche: DelayTranche,
|
||||
/// Our validator index for the session in which the candidates were included.
|
||||
validator_index: ValidatorIndex,
|
||||
/// Whether the assignment has been triggered already.
|
||||
triggered: bool,
|
||||
}
|
||||
|
||||
pub struct ApprovalEntry {
|
||||
tranches: Vec<TrancheEntry>, // sorted ascending by tranche number.
|
||||
backing_group: GroupIndex,
|
||||
our_assignment: Option<OurAssignment>,
|
||||
our_approval_sig: Option<ValidatorSignature>,
|
||||
assigned_validators: Bitfield, // `n_validators` bits.
|
||||
approved: bool,
|
||||
}
|
||||
|
||||
|
||||
struct CandidateEntry {
|
||||
candidate: CandidateReceipt,
|
||||
session: SessionIndex,
|
||||
// Assignments are based on blocks, so we need to track assignments separately
|
||||
// based on the block we are looking at.
|
||||
block_assignments: HashMap<Hash, ApprovalEntry>,
|
||||
approvals: Bitfield, // n_validators bits
|
||||
}
|
||||
|
||||
struct BlockEntry {
|
||||
block_hash: Hash,
|
||||
session: SessionIndex,
|
||||
slot: Slot,
|
||||
// random bytes derived from the VRF submitted within the block by the block
|
||||
// author as a credential and used as input to approval assignment criteria.
|
||||
relay_vrf_story: [u8; 32],
|
||||
// The candidates included as-of this block and the index of the core they are
|
||||
// leaving. Sorted ascending by core index.
|
||||
candidates: Vec<(CoreIndex, Hash)>,
|
||||
// A bitfield where the i'th bit corresponds to the i'th candidate in `candidates`.
|
||||
// The i'th bit is `true` iff the candidate has been approved in the context of
|
||||
// this block. The block can be considered approved has all bits set to 1
|
||||
approved_bitfield: Bitfield,
|
||||
children: Vec<Hash>,
|
||||
// A list of candidates we have checked, but didn't not sign and
|
||||
// advertise the vote yet.
|
||||
candidates_pending_signature: BTreeMap<CandidateIndex, CandidateSigningContext>,
|
||||
// Assignments we already distributed. A 1 bit marks a candidate index for which
|
||||
// we already have sent out an assignment. We need this to avoid distributing
|
||||
// multiple core assignments more than once.
|
||||
distributed_assignments: Bitfield,
|
||||
}
|
||||
|
||||
// slot_duration * 2 + DelayTranche gives the number of delay tranches since the
|
||||
// unix epoch.
|
||||
type Tick = u64;
|
||||
|
||||
struct StoredBlockRange(BlockNumber, BlockNumber);
|
||||
```
|
||||
|
||||
In the schema, we map
|
||||
|
||||
```
|
||||
"StoredBlocks" => StoredBlockRange
|
||||
BlockNumber => Vec<BlockHash>
|
||||
BlockHash => BlockEntry
|
||||
CandidateHash => CandidateEntry
|
||||
```
|
||||
|
||||
## Logic
|
||||
|
||||
```rust
|
||||
const APPROVAL_SESSIONS: SessionIndex = 6;
|
||||
|
||||
// The minimum amount of ticks that an assignment must have been known for.
|
||||
const APPROVAL_DELAY: Tick = 2;
|
||||
```
|
||||
|
||||
In-memory state:
|
||||
|
||||
```rust
|
||||
struct ApprovalVoteRequest {
|
||||
validator_index: ValidatorIndex,
|
||||
block_hash: Hash,
|
||||
candidate_index: CandidateIndex,
|
||||
}
|
||||
|
||||
// Requests that background work (approval voting tasks) may need to make of the main subsystem
|
||||
// task.
|
||||
enum BackgroundRequest {
|
||||
ApprovalVote(ApprovalVoteRequest),
|
||||
// .. others, unspecified as per implementation.
|
||||
}
|
||||
|
||||
// This is the general state of the subsystem. The actual implementation may split this
|
||||
// into further pieces.
|
||||
struct State {
|
||||
earliest_session: SessionIndex,
|
||||
session_info: Vec<SessionInfo>,
|
||||
babe_epoch: Option<BabeEpoch>, // information about a cached BABE epoch.
|
||||
keystore: Keystore,
|
||||
|
||||
// A scheduler which keeps at most one wakeup per (block hash, candidate hash) pair and
|
||||
// maps such pairs to `Tick`s.
|
||||
wakeups: Wakeups,
|
||||
|
||||
// These are connected to each other.
|
||||
background_tx: mpsc::Sender<BackgroundRequest>,
|
||||
background_rx: mpsc::Receiver<BackgroundRequest>,
|
||||
}
|
||||
```
|
||||
|
||||
This guide section makes no explicit references to writes to or reads from disk. Instead, it handles them implicitly,
|
||||
with the understanding that updates to block, candidate, and approval entries are persisted to disk.
|
||||
|
||||
[`SessionInfo`](../../runtime/session_info.md)
|
||||
|
||||
On start-up, we clear everything currently stored by the database. This is done by loading the `StoredBlockRange`,
|
||||
iterating through each block number, iterating through each block hash, and iterating through each candidate referenced
|
||||
by each block. Although this is `O(o*n*p)`, we don't expect to have more than a few unfinalized blocks at any time and
|
||||
in extreme cases, a few thousand. The clearing operation should be relatively fast as a result.
|
||||
|
||||
Main loop:
|
||||
* Each iteration, select over all of
|
||||
* The next `Tick` in `wakeups`: trigger `wakeup_process` for each `(Hash, Hash)` pair scheduled under the `Tick` and
|
||||
then remove all entries under the `Tick`.
|
||||
* The next message from the overseer: handle the message as described in the [Incoming Messages
|
||||
section](#incoming-messages)
|
||||
* The next approval vote request from `background_rx`
|
||||
* If this is an `ApprovalVoteRequest`, [Issue an approval vote](#issue-approval-vote).
|
||||
|
||||
### Incoming Messages
|
||||
|
||||
#### `OverseerSignal::BlockFinalized`
|
||||
|
||||
On receiving an `OverseerSignal::BlockFinalized(h)`, we fetch the block number `b` of that block from the `ChainApi`
|
||||
subsystem. We update our `StoredBlockRange` to begin at `b+1`. Additionally, we remove all block entries and candidates
|
||||
referenced by them up to and including `b`. Lastly, we prune abandoned forks transitively: when we remove a
|
||||
`BlockEntry` with number `b` that is not equal to `h`, we recursively delete all the `BlockEntry`s referenced as
|
||||
children. We remove the `block_assignments` entry for the block hash and if `block_assignments` is now empty, remove the
|
||||
`CandidateEntry`. We also update each of the `BlockNumber -> Vec<Hash>` keys in the database to reflect the blocks at
|
||||
that height, clearing if empty.
|
||||
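The pruning steps above can be sketched in memory as follows. This is a minimal illustration, not the subsystem's code: the `Store` type, its `HashMap`-backed fields, the `remove_block` and `prune_finalized` helpers, and the integer stand-ins for `Hash` and `BlockNumber` are all assumptions made for the example, and the real implementation persists these maps to disk.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};

// Stand-in types; the real `Hash` and `BlockNumber` come from the node primitives.
type Hash = u64;
type BlockNumber = u32;

struct BlockEntry {
    children: Vec<Hash>,
    // Hashes of the candidates included as-of this block.
    candidates: Vec<Hash>,
}

#[derive(Default)]
struct Store {
    // Inclusive `StoredBlockRange`, or `None` when nothing is stored.
    stored_range: Option<(BlockNumber, BlockNumber)>,
    blocks_at_height: BTreeMap<BlockNumber, Vec<Hash>>,
    block_entries: HashMap<Hash, BlockEntry>,
    // Candidate hash -> block hashes that have a `block_assignments` entry.
    candidate_blocks: HashMap<Hash, HashSet<Hash>>,
}

impl Store {
    // Remove a block entry and drop candidates that no block references any more.
    fn remove_block(&mut self, hash: Hash) -> Option<BlockEntry> {
        let entry = self.block_entries.remove(&hash)?;
        for candidate in &entry.candidates {
            if let Some(blocks) = self.candidate_blocks.get_mut(candidate) {
                blocks.remove(&hash);
                if blocks.is_empty() {
                    self.candidate_blocks.remove(candidate);
                }
            }
        }
        Some(entry)
    }

    fn prune_finalized(&mut self, finalized: Hash, number: BlockNumber) {
        let Some((start, end)) = self.stored_range else { return };
        // `StoredBlockRange` now begins at `b + 1`.
        self.stored_range =
            if number < end { Some((start.max(number + 1), end)) } else { None };

        // Remove every block entry up to and including height `b`.
        let heights: Vec<BlockNumber> =
            self.blocks_at_height.range(..=number).map(|(n, _)| *n).collect();
        let mut abandoned = Vec::new();
        for n in heights {
            for hash in self.blocks_at_height.remove(&n).unwrap_or_default() {
                if let Some(entry) = self.remove_block(hash) {
                    // Children of non-finalized blocks at height `b` start abandoned forks.
                    if n == number && hash != finalized {
                        abandoned.extend(entry.children);
                    }
                }
            }
        }
        // Transitively delete the abandoned forks.
        while let Some(hash) = abandoned.pop() {
            if let Some(entry) = self.remove_block(hash) {
                abandoned.extend(entry.children);
            }
        }
        // Update the `BlockNumber -> Vec<Hash>` keys, clearing empty ones.
        let entries = &self.block_entries;
        self.blocks_at_height.retain(|_, hashes| {
            hashes.retain(|h| entries.contains_key(h));
            !hashes.is_empty()
        });
    }
}
```

Only the siblings of `h` at height `b` need to seed the recursive deletion, because every stored block above `b` descends from some block at height `b`.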
|
||||
|
||||
#### `OverseerSignal::ActiveLeavesUpdate`
|
||||
|
||||
On receiving an `OverseerSignal::ActiveLeavesUpdate(update)`:
|
||||
* We determine the set of new blocks that were not in our previous view. This is done by querying the ancestry of all
|
||||
new items in the view and contrasting against the stored `BlockNumber`s. Typically, there will be only one new
|
||||
block. We fetch the headers and information on these blocks from the `ChainApi` subsystem. Stale leaves in the
|
||||
update can be ignored.
|
||||
* We update the `StoredBlockRange` and the `BlockNumber` maps.
|
||||
* We use the `RuntimeApiSubsystem` to determine information about these blocks. It is generally safe to assume that
|
||||
runtime state is available for recent, unfinalized blocks. In the case that it isn't, it means that we are catching
|
||||
up to the head of the chain and needn't worry about assignments to those blocks anyway, as the security assumption
|
||||
of the protocol tolerates nodes being temporarily offline or out-of-date.
|
||||
* We fetch the set of candidates included by each block by dispatching a `RuntimeApiRequest::CandidateEvents` and
|
||||
checking the `CandidateIncluded` events.
|
||||
* We fetch the session of the block by dispatching a `session_index_for_child` request with the parent-hash of the
|
||||
block.
|
||||
* If the `session index - APPROVAL_SESSIONS > state.earliest_session`, then bump `state.earliest_session` to that
|
||||
amount and prune earlier sessions.
|
||||
* If the session isn't in our `state.session_info`, load the session info for it and for all sessions since the
|
||||
earliest-session, including the earliest-session, if that is missing. And it can be, just after pruning, if we've
|
||||
done a big jump forward, as is the case when we've just finished chain synchronization.
|
||||
* If any of the runtime API calls fail, we just warn and skip the block.
|
||||
* We use the `RuntimeApiSubsystem` to determine the set of candidates included in these blocks and use BABE logic to
|
||||
determine the slot number and VRF of the blocks.
|
||||
* We also note how late we appear to have received the block. We create a `BlockEntry` for each block and a
|
||||
`CandidateEntry` for each candidate obtained from `CandidateIncluded` events after making a
|
||||
`RuntimeApiRequest::CandidateEvents` request.
|
||||
* For each candidate, if the amount of needed approvals is more than the validators remaining after the backing group
|
||||
of the candidate is subtracted, then the candidate is insta-approved as approval would be impossible otherwise. If
|
||||
all candidates in the block are insta-approved, or there are no candidates in the block, then the block is
|
||||
insta-approved. If the block is insta-approved, a [`ChainSelectionMessage::Approved`][CSM] should be sent for the
|
||||
block.
|
||||
* Ensure that the `CandidateEntry` contains a `block_assignments` entry for the block, with the correct backing group
|
||||
set.
|
||||
* If we are a validator in this session, compute and assign `our_assignment` for the `block_assignments`
|
||||
* Only if not a member of the backing group.
|
||||
* Run `RelayVRFModulo` and `RelayVRFDelay` according to the [approvals protocol
|
||||
section](../../protocol-approval.md#assignment-criteria). Ensure that the assigned core derived from the output is
|
||||
covered by the auxiliary signature aggregated in the `VRFProof`.
|
||||
* [Handle Wakeup](#handling-wakeup) for each new candidate in each new block - this will automatically broadcast a
|
||||
0-tranche assignment, kick off approval work, and schedule the next delay.
|
||||
* Dispatch an `ApprovalDistributionMessage::NewBlocks` with the meta information filled out for each new block.
|
||||
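The insta-approval rule from the list above can be illustrated with plain counts. This is only a sketch: the helper names `candidate_insta_approved` and `block_insta_approved` are hypothetical, and the real subsystem derives these numbers from `SessionInfo` and the backing group of each candidate.

```rust
// A candidate is insta-approved when the validators outside its backing group
// are too few to ever reach `needed_approvals`.
fn candidate_insta_approved(
    n_validators: usize,
    backing_group_size: usize,
    needed_approvals: usize,
) -> bool {
    n_validators.saturating_sub(backing_group_size) < needed_approvals
}

// A block is insta-approved when all of its candidates are, which also covers
// the case of a block without candidates (`all` is true on an empty iterator).
fn block_insta_approved(
    n_validators: usize,
    needed_approvals: usize,
    backing_group_sizes: &[usize],
) -> bool {
    backing_group_sizes
        .iter()
        .all(|size| candidate_insta_approved(n_validators, *size, needed_approvals))
}
```

For instance, with 10 validators, a backing group of 5 and 6 needed approvals, only 5 checkers remain, so the candidate would be insta-approved under this sketch.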
|
||||
#### `ApprovalVotingMessage::ImportAssignment`
|
||||
|
||||
On receiving an `ApprovalVotingMessage::ImportAssignment` message, we assume the assignment cert itself has already been
|
||||
checked to be valid; we then proceed to import the assignment inside the block entry. The cert itself contains
|
||||
information necessary to determine the candidate that is being assigned-to. In detail:
|
||||
* Load the `BlockEntry` for the relay-parent referenced by the message. If there is none, return
|
||||
`AssignmentCheckResult::Bad`.
|
||||
* Fetch the `SessionInfo` for the session of the block
|
||||
* Determine the assignment key of the validator based on that.
|
||||
* Determine the claimed core index by looking up the candidate with given index in `block_entry.candidates`. Return
|
||||
`AssignmentCheckResult::Bad` if missing.
|
||||
* Import the assignment.
|
||||
* Load the candidate in question and access the `approval_entry` for the block hash the cert references.
|
||||
* Ignore if we already observe the validator as having been assigned.
|
||||
* Ensure the validator index is not part of the backing group for the candidate.
|
||||
* Ensure the validator index is not present in the approval entry already.
|
||||
* Create a tranche entry for the delay tranche in the approval entry and note the assignment within it.
|
||||
* Note the candidate index within the approval entry.
|
||||
* [Schedule a wakeup](#schedule-wakeup) for this block, candidate pair.
|
||||
* Return the appropriate `AssignmentCheckResult` on the response channel.
|
||||
|
||||
#### `ApprovalVotingMessage::ImportApproval`
|
||||
|
||||
On receiving an `ImportApproval(indirect_approval_vote, response_channel)` message:
|
||||
* Fetch the `BlockEntry` from the indirect approval vote's `block_hash`. If none, return `ApprovalCheckResult::Bad`.
|
||||
* Fetch all `CandidateEntry` from the indirect approval vote's `candidate_indices`. If the block did not trigger
|
||||
inclusion of enough candidates, return `ApprovalCheckResult::Bad`.
|
||||
* Send `ApprovalCheckResult::Accepted`
|
||||
* [Import the checked approval vote](#import-checked-approval) for all candidates
|
||||
|
||||
#### `ApprovalVotingMessage::ApprovedAncestor`
|
||||
|
||||
On receiving an `ApprovedAncestor(Hash, BlockNumber, response_channel)`:
|
||||
* Iterate over the ancestry of the hash all the way back to block number given, starting from the provided block hash.
|
||||
Load the `CandidateHash`es from each block entry.
|
||||
* Keep track of an `all_approved_max: Option<(Hash, BlockNumber, Vec<(Hash, Vec<CandidateHash>)>)>`.
|
||||
* For each block hash encountered, load the `BlockEntry` associated. If any are not found, return `None` on the
|
||||
response channel and conclude.
|
||||
* If the block entry's `approved_bitfield` has all bits set to 1 and `all_approved_max == None`, set `all_approved_max
|
||||
= Some((current_hash, current_number, Vec::new()))`.
|
||||
* If the block entry's `approved_bitfield` has any 0 bits, set `all_approved_max = None`.
|
||||
* If `all_approved_max` is `Some`, push the current block hash and candidate hashes onto the list of blocks and
|
||||
candidates in `all_approved_max`.
|
||||
* After iterating all ancestry, return `all_approved_max`.
|
||||
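The ancestry walk above can be sketched as follows. This is an illustration under assumptions: the `parent` and `number` fields on `BlockEntry`, the `Vec<bool>` bitfield, the integer stand-in types, and the treatment of the given block number as an exclusive lower bound are all choices made for the example; the guide's `BlockEntry` carries no parent pointer, so a real implementation would obtain the ancestry elsewhere.

```rust
use std::collections::HashMap;

// Stand-in types for the example.
type Hash = u64;
type CandidateHash = u64;
type BlockNumber = u32;

struct BlockEntry {
    parent: Hash,        // assumption: ancestry is followed through a parent pointer
    number: BlockNumber, // assumption: the entry knows its own height
    candidates: Vec<CandidateHash>,
    approved_bitfield: Vec<bool>,
}

type AllApprovedMax = (Hash, BlockNumber, Vec<(Hash, Vec<CandidateHash>)>);

fn approved_ancestor(
    entries: &HashMap<Hash, BlockEntry>,
    head: Hash,
    lower_bound: BlockNumber,
) -> Option<AllApprovedMax> {
    let mut all_approved_max: Option<AllApprovedMax> = None;
    let mut current = head;
    // Number of blocks strictly above `lower_bound` on the path from `head`.
    let mut remaining = entries.get(&head)?.number.saturating_sub(lower_bound);
    while remaining > 0 {
        // A missing entry aborts the whole query with `None`.
        let entry = entries.get(&current)?;
        if entry.approved_bitfield.iter().all(|bit| *bit) {
            if all_approved_max.is_none() {
                all_approved_max = Some((current, entry.number, Vec::new()));
            }
        } else {
            // An unapproved block invalidates everything collected above it.
            all_approved_max = None;
        }
        if let Some((_, _, blocks)) = all_approved_max.as_mut() {
            blocks.push((current, entry.candidates.clone()));
        }
        current = entry.parent;
        remaining -= 1;
    }
    all_approved_max
}
```

The result names the highest block whose entire ancestry down to the lower bound is approved, together with the blocks and candidates collected on the way down.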
|
||||
### Updates and Auxiliary Logic
|
||||
|
||||
#### Import Checked Approval
|
||||
* Import an approval vote which we can assume to have passed signature checks and correspond to an imported
|
||||
assignment.
|
||||
* Requires `(BlockEntry, CandidateEntry, ValidatorIndex)`
|
||||
* Set the corresponding bit of the `approvals` bitfield in the `CandidateEntry` to `1`. If already `1`, return.
|
||||
* Checks the approval state of a candidate under a specific block, and updates the block and candidate entries
|
||||
accordingly.
|
||||
* Checks the `ApprovalEntry` for the block.
|
||||
* [Determine the tranches to inspect](#determine-required-tranches) of the candidate.
|
||||
* If [the candidate is approved under the block](#check-approval), set the corresponding bit in the
|
||||
`block_entry.approved_bitfield`.
|
||||
* If the block is now fully approved and was not before, send a [`ChainSelectionMessage::Approved`][CSM].
|
||||
* Otherwise, [schedule a wakeup of the candidate](#schedule-wakeup)
|
||||
* If the approval vote originates locally, set the `our_approval_sig` in the approval entry.
|
||||
|
||||
#### Handling Wakeup
|
||||
* Handle a previously-scheduled wakeup of a candidate under a specific block.
|
||||
* Requires `(relay_block, candidate_hash)`
|
||||
* Load the `BlockEntry` and `CandidateEntry` from disk. If either is not present, this may have lost a race with
|
||||
finality and can be ignored. Also load the `ApprovalEntry` for the block and candidate.
|
||||
* [determine the `RequiredTranches` of the candidate](#determine-required-tranches).
|
||||
* Determine if we should trigger our assignment.
|
||||
* If we've already triggered or `OurAssignment` is `None`, we do not trigger.
|
||||
* If we have `RequiredTranches::All`, then we trigger if the candidate is [not approved](#check-approval). We have
|
||||
no next wakeup as we assume that other validators are doing the same and we will be implicitly woken up by
|
||||
handling new votes.
|
||||
* If we have `RequiredTranches::Pending { considered, next_no_show, maximum_broadcast, clock_drift }`,
|
||||
then we trigger if our assignment's tranche is less than or equal to `maximum_broadcast` and the current tick,
|
||||
with `clock_drift` applied, is at least the tick of our tranche.
|
||||
* If we have `RequiredTranches::Exact { .. }` then we do not trigger, because this value indicates that no new
|
||||
assignments are needed at the moment.
|
||||
* If we should trigger our assignment
|
||||
* Import the assignment to the `ApprovalEntry`
|
||||
* Broadcast on network with an `ApprovalDistributionMessage::DistributeAssignment`.
|
||||
* [Launch approval work](#launch-approval-work) for the candidate.
|
||||
* [Schedule a new wakeup](#schedule-wakeup) of the candidate.
|
||||
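The trigger decision in the list above can be sketched as a pure function. The types below are trimmed-down, hypothetical versions of `RequiredTranches` and `OurAssignment` that keep only the fields this decision reads, and the sketch assumes one delay tranche per tick so that `clock_drift` can be subtracted directly from the current tranche.

```rust
type DelayTranche = u32;
type Tick = u64;

// Reduced to the fields needed for the trigger decision.
enum RequiredTranches {
    All,
    Pending { maximum_broadcast: DelayTranche, clock_drift: Tick },
    Exact,
}

struct OurAssignment {
    tranche: DelayTranche,
    triggered: bool,
}

fn should_trigger(
    our: Option<&OurAssignment>,
    required: &RequiredTranches,
    approved: bool,
    tranche_now: DelayTranche,
) -> bool {
    // Never trigger without an assignment, or twice.
    let our = match our {
        Some(assignment) if !assignment.triggered => assignment,
        _ => return false,
    };
    match required {
        // Everyone is required: trigger unless the candidate is already approved.
        RequiredTranches::All => !approved,
        RequiredTranches::Pending { maximum_broadcast, clock_drift } => {
            // Treat the local clock as `clock_drift` ticks earlier
            // (assumption: one tranche per tick).
            let drifted_now = tranche_now.saturating_sub(*clock_drift as DelayTranche);
            our.tranche <= *maximum_broadcast && our.tranche <= drifted_now
        }
        // Enough assignments exist already; no new ones are needed.
        RequiredTranches::Exact => false,
    }
}
```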
|
||||
#### Schedule Wakeup
|
||||
|
||||
* Requires `(approval_entry, candidate_entry)` which effectively denotes a `(Block Hash, Candidate Hash)` pair - the
|
||||
candidate, along with the block it appears in.
|
||||
* Also requires `RequiredTranches`
|
||||
* If the `approval_entry` is approved, this doesn't need to be woken up again.
|
||||
* If `RequiredTranches::All` - no wakeup. We assume other incoming votes will trigger wakeup and potentially
|
||||
re-schedule.
|
||||
* If `RequiredTranches::Pending { considered, next_no_show, maximum_broadcast, clock_drift }` - schedule at
|
||||
the lesser of the next no-show tick, or the tick, offset positively by `clock_drift` of the next non-empty tranche
|
||||
we are aware of after `considered`, including any tranche containing our own unbroadcast assignment. This can lead
|
||||
to no wakeup in the case that we have already broadcast our assignment and there are no pending no-shows; that is,
|
||||
we have approval votes for every assignment we've received that is not already a no-show. In this case, we will be
|
||||
re-triggered by other validators broadcasting their assignments.
|
||||
* If `RequiredTranches::Exact { next_no_show, last_assignment_tick, .. }` - set a wakeup for the earlier of the next
|
||||
no-show tick or the latest assignment tick + `APPROVAL_DELAY`.
|
||||
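A possible shape for the wakeup computation described above is sketched below. The function `next_wakeup`, the helper `min_prefer_some`, the slice of known tranche numbers and the `our_untriggered` parameter are hypothetical names introduced for the example; the sketch also assumes one delay tranche per tick, so that a tranche number plus `block_tick` yields a tick.

```rust
type DelayTranche = u32;
type Tick = u64;

const APPROVAL_DELAY: Tick = 2;

#[derive(Clone, Copy)]
enum RequiredTranches {
    All,
    Pending { considered: DelayTranche, next_no_show: Option<Tick>, clock_drift: Tick },
    Exact { next_no_show: Option<Tick>, last_assignment_tick: Option<Tick> },
}

// The smaller of two optional values, preferring whichever one is present.
fn min_prefer_some<T: Ord>(a: Option<T>, b: Option<T>) -> Option<T> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x.min(y)),
        (x, None) => x,
        (None, y) => y,
    }
}

// Returns the tick to wake up at, or `None` if no wakeup is needed.
// `tranches` holds the non-empty tranche numbers we know of, sorted ascending.
fn next_wakeup(
    approved: bool,
    required: &RequiredTranches,
    block_tick: Tick,
    tranches: &[DelayTranche],
    our_untriggered: Option<DelayTranche>,
) -> Option<Tick> {
    if approved {
        return None;
    }
    match *required {
        // Incoming votes will wake us up implicitly.
        RequiredTranches::All => None,
        RequiredTranches::Pending { considered, next_no_show, clock_drift } => {
            let next_announced = tranches.iter().copied().find(|t| *t > considered);
            // Assumption: one tranche per tick, offset positively by the drift.
            let next_tranche_tick = min_prefer_some(next_announced, our_untriggered)
                .map(|t| t as Tick + block_tick + clock_drift);
            min_prefer_some(next_tranche_tick, next_no_show)
        }
        RequiredTranches::Exact { next_no_show, last_assignment_tick } => min_prefer_some(
            last_assignment_tick.map(|t| t + APPROVAL_DELAY),
            next_no_show,
        ),
    }
}
```

Returning `Option<Tick>` makes the "no wakeup" outcomes of the list explicit: the caller simply skips scheduling on `None`.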
|
||||
#### Launch Approval Work
|
||||
|
||||
* Requires `(SessionIndex, SessionInfo, CandidateReceipt, ValidatorIndex, backing_group, block_hash, candidate_index)`
|
||||
* Extract the public key of the `ValidatorIndex` from the `SessionInfo` for the session.
|
||||
* Issue an `AvailabilityRecoveryMessage::RecoverAvailableData(candidate, session_index, Some(backing_group),
|
||||
Some(core_index), response_sender)`
|
||||
* Load the historical validation code of the teyrchain by dispatching a
|
||||
`RuntimeApiRequest::ValidationCodeByHash(descriptor.validation_code_hash)` against the state of `block_hash`.
|
||||
* Spawn a background task with a clone of `background_tx`
|
||||
* Wait for the available data
|
||||
* Issue a `CandidateValidationMessage::ValidateFromExhaustive` message with `APPROVAL_EXECUTION_TIMEOUT` as the
|
||||
timeout parameter.
|
||||
* Wait for the result of validation
|
||||
* Check that the result of validation, if valid, matches the commitments in the receipt.
|
||||
* If valid, issue a message on `background_tx` detailing the request.
|
||||
* If any of the data, the candidate, or the commitments are invalid, issue on `background_tx` a
|
||||
[`DisputeCoordinatorMessage::IssueLocalStatement`](../../types/overseer-protocol.md#dispute-coordinator-message)
|
||||
with `valid = false` to initiate a dispute.
|
||||
|
||||
#### Issue Approval Vote
|
||||
* Fetch the block entry and candidate entry. Ignore if `None` - we've probably just lost a race with finality.
|
||||
* [Import the checked approval vote](#import-checked-approval). It is "checked" as we've just issued the signature.
|
||||
* IF `MAX_APPROVAL_COALESCE_COUNT` candidates are in the waiting queue
|
||||
* Construct a `SignedApprovalVote` with the validator index for the session and all candidate hashes in the waiting queue.
|
||||
* Construct an `IndirectSignedApprovalVote` using the information about the vote.
|
||||
* Dispatch `ApprovalDistributionMessage::DistributeApproval`.
|
||||
* ELSE
|
||||
* Queue the candidate in the `BlockEntry::candidates_pending_signature`
|
||||
* Arm a per-`BlockEntry` timer with the latest tick at which we can send the vote.
|
||||
|
||||
### Delayed vote distribution
|
||||
* [Issue Approval Vote](#issue-approval-vote) arms a per-block timer once if there are no requirements to send the
|
||||
vote immediately.
|
||||
* When the timer wakes up it will either:
|
||||
* IF there is a candidate in the queue past its sending tick:
|
||||
* Construct a `SignedApprovalVote` with the validator index for the session and all candidate hashes in the waiting queue.
|
||||
* Construct an `IndirectSignedApprovalVote` using the information about the vote.
|
||||
* Dispatch `ApprovalDistributionMessage::DistributeApproval`.
|
||||
* ELSE
|
||||
* Re-arm the timer with the latest tick by which we have to send the vote.
|
||||
|
||||
### Determining Approval of Candidate
|
||||
|
||||
#### Determine Required Tranches
|
||||
|
||||
This logic is for inspecting an approval entry that tracks the assignments received, along with information on which
|
||||
assignments have corresponding approval votes. Inspection also involves the current time and expected requirements and
|
||||
is used to help the higher-level code determine the following:
|
||||
* Whether to broadcast the local assignment
|
||||
* Whether to check that the candidate entry has been completely approved.
|
||||
* If the candidate is waiting on approval, when to schedule the next wakeup of the `(candidate, block)` pair at a
|
||||
point where the state machine could be advanced.
|
||||
|
||||
These routines are pure functions which only depend on the environmental state. The expectation is that this
|
||||
determination is re-run every time we attempt to update an approval entry: either when we trigger a wakeup to advance
|
||||
the state machine based on a no-show or our own broadcast, or when we receive further assignments or approvals from the
|
||||
network.
|
||||
|
||||
Thus it may be that at some point in time, we consider tranches 0..X to be required, but as we
|
||||
receive more information, we might require fewer tranches. Or votes that we perceived to be missing and require
|
||||
replacement are filled in and change our view.
|
||||
|
||||
Requires `(approval_entry, approvals_received, tranche_now, block_tick, no_show_duration, needed_approvals)`
|
||||
|
||||
```rust
|
||||
enum RequiredTranches {
|
||||
// All validators appear to be required, based on tranches already taken and remaining no-shows.
|
||||
All,
|
||||
// More tranches required - We're awaiting more assignments.
|
||||
Pending {
|
||||
/// The highest considered delay tranche when counting assignments.
|
||||
considered: DelayTranche,
|
||||
/// The tick at which the next no-show, of the assignments counted, would occur.
|
||||
next_no_show: Option<Tick>,
|
||||
/// The highest tranche to consider when looking to broadcast own assignment.
|
||||
/// This should be considered along with the clock drift to avoid broadcasting
|
||||
/// assignments that are before the local time.
|
||||
maximum_broadcast: DelayTranche,
|
||||
/// The clock drift, in ticks, to apply to the local clock when determining whether
|
||||
/// to broadcast an assignment or when to schedule a wakeup. The local clock should be treated
|
||||
/// as though it is `clock_drift` ticks earlier.
|
||||
clock_drift: Tick,
|
||||
},
|
||||
// An exact number of required tranches and a number of no-shows. This indicates that the amount of `needed_approvals`
|
||||
// are assigned and additionally all no-shows are covered.
|
||||
Exact {
|
||||
/// The tranche to inspect up to.
|
||||
needed: DelayTranche,
|
||||
/// The amount of missing votes that should be tolerated.
|
||||
tolerated_missing: usize,
|
||||
/// When the next no-show would be, if any. This is used to schedule the next wakeup in the
|
||||
/// event that there are some assignments that don't have corresponding approval votes. If this
|
||||
/// is `None`, all assignments have approvals.
|
||||
next_no_show: Option<Tick>,
|
||||
/// The last tick at which a needed assignment was received.
|
||||
last_assignment_tick: Option<Tick>,
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Clock-drift and Tranche-taking**
|
||||
|
||||
Our vote-counting procedure depends heavily on how we interpret time based on the presence of no-shows - assignments
|
||||
which have no corresponding approval after some time.
|
||||
|
||||
This is because of how we handle no-shows: we keep track of the depth of no-shows we are covering.
|
||||
|
||||
As an example: there may be initial no-shows in tranche 0. It'll take `no_show_duration` ticks before those are
|
||||
considered no-shows. Then, we don't want to immediately take `no_show_duration` more tranches. Instead, we want to take
|
||||
one tranche for each uncovered no-show. However, as we take those tranches, there may be further no-shows. Since these
|
||||
depth-1 no-shows should have only been triggered after the depth-0 no-shows were already known to be no-shows, we need
|
||||
to discount the local clock by `no_show_duration` to see whether these should be considered no-shows or not. There may
|
||||
be malicious parties who broadcast their assignment earlier than they were meant to, who shouldn't be counted as instant
|
||||
no-shows. We continue onwards to cover all depth-1 no-shows which may lead to depth-2 no-shows and so on.
|
||||
|
||||
Likewise, when considering how many tranches to take, the no-show depth should be used to apply a depth-discount or
|
||||
clock drift to the `tranche_now`.
|
||||
|
||||
**Procedure**
|
||||
|
||||
* Start with `depth = 0`.
|
||||
* Set a clock drift of `depth * no_show_duration`
|
||||
* Take tranches up to `tranche_now - clock_drift` until all needed assignments are met.
|
||||
* Keep track of the `next_no_show` according to the clock drift, as we go.
|
||||
* Keep track of the `last_assignment_tick` as we go.
|
||||
* If running out of tranches before then, return `Pending { considered, next_no_show, maximum_broadcast, clock_drift
|
||||
}`
|
||||
* If there are no no-shows, return `Exact { needed, tolerated_missing, next_no_show, last_assignment_tick }`
|
||||
* `maximum_broadcast` is either `DelayTranche::max_value()` at tranche 0 or otherwise the last considered tranche +
|
||||
the number of uncovered no-shows at this point.
|
||||
* If there are no-shows, return to the beginning, incrementing `depth` and attempting to cover the number of no-shows.
|
||||
Each no-show must be covered by a non-empty tranche, which are tranches that have at least one assignment. Each
|
||||
non-empty tranche covers exactly one no-show.
|
||||
* If at any point, it seems that all validators are required, do an early return with `RequiredTranches::All` which
|
||||
indicates that everyone should broadcast.
|
||||
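The tranche-taking step of the procedure, for one fixed `depth`, might look like the sketch below. It deliberately covers only that step: no-show detection, `next_no_show`, `maximum_broadcast` and the loop over increasing depths are left out. The `Taken` enum and `take_tranches` function are hypothetical names, and the sketch assumes one delay tranche per tick so that the drift in ticks can be subtracted from `tranche_now`.

```rust
type DelayTranche = u32;

#[derive(Debug, PartialEq)]
enum Taken {
    // Ran out of tranches before the needed assignments were met.
    Pending { considered: DelayTranche },
    // The needed assignments were met at tranche `needed`.
    Met { needed: DelayTranche },
}

// `tranches` holds `(tranche number, number of assignments)` pairs sorted
// ascending by tranche number, containing non-empty tranches only.
fn take_tranches(
    tranches: &[(DelayTranche, usize)],
    tranche_now: DelayTranche,
    depth: u32,
    no_show_duration: u32,
    needed_assignments: usize,
) -> Taken {
    // Clock drift of `depth * no_show_duration`
    // (assumption: one tranche per tick).
    let clock_drift = depth * no_show_duration;
    let drifted_now = tranche_now.saturating_sub(clock_drift);

    let mut assignments = 0;
    // Take tranches up to `tranche_now - clock_drift`.
    for &(tranche, count) in tranches.iter().take_while(|(t, _)| *t <= drifted_now) {
        assignments += count;
        if assignments >= needed_assignments {
            return Taken::Met { needed: tranche };
        }
    }
    Taken::Pending { considered: drifted_now }
}
```

Increasing `depth` shrinks the window of tranches that may be taken, which is how the depth-discount keeps early broadcasters from being counted before their time.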
|
||||
#### Check Approval
|
||||
* Check whether a candidate is approved under a particular block.
|
||||
* Requires `(block_entry, candidate_entry, approval_entry, n_tranches)`
|
||||
* If we have `3 * n_approvals > n_validators`, return true. This is because any set with f+1 validators must have at
|
||||
least one honest validator, who has approved the candidate.
|
||||
* If `n_tranches` is `RequiredTranches::Pending`, return false
|
||||
* If `n_tranches` is `RequiredTranches::All`, return false.
|
||||
* If `n_tranches` is `RequiredTranches::Exact { needed, tolerated_missing, last_assignment_tick, .. }`, then we
|
||||
return whether all assigned validators up to `needed` less `tolerated_missing` have approved and
|
||||
`last_assignment_tick + APPROVAL_DELAY <= tick_now`.
|
||||
* e.g. if we had 5 tranches and 1 tolerated missing, we would accept only if all but 1 of assigned validators in
|
||||
tranches 0..=5 have approved. In that example, we also accept all validators in tranches 0..=5 having approved,
|
||||
but that would indicate that the `RequiredTranches` value was incorrectly constructed, so it is not realistic.
|
||||
`tolerated_missing` actually represents covered no-shows. If there are more missing approvals than there are
|
||||
tolerated missing, that indicates that there are some assignments which are not yet no-shows, but may become
|
||||
no-shows, and we should wait for the validators to either approve or become no-shows.
|
||||
* e.g. If the above passes and the `last_assignment_tick` was 5 and the current tick was 6, then we'd return
|
||||
false.
|
||||
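The approval check can be sketched as below. The reduced `RequiredTranches` and `TrancheEntry` types, the `Vec`/slice representation of bitfields and the `check_approval` signature are assumptions for the example. The delay condition is written so that it agrees with the worked example above: with `APPROVAL_DELAY = 2`, a last assignment at tick 5 is not yet old enough at tick 6.

```rust
type Tick = u64;
type DelayTranche = u32;

const APPROVAL_DELAY: Tick = 2;

#[derive(Clone, Copy)]
enum RequiredTranches {
    All,
    Pending,
    Exact { needed: DelayTranche, tolerated_missing: usize, last_assignment_tick: Option<Tick> },
}

struct TrancheEntry {
    tranche: DelayTranche,
    // Indices of the validators assigned in this tranche.
    validators: Vec<usize>,
}

// `approvals[i]` is true iff validator `i` has approved the candidate;
// `tranches` is sorted ascending by tranche number.
fn check_approval(
    approvals: &[bool],
    tranches: &[TrancheEntry],
    required: &RequiredTranches,
    tick_now: Tick,
) -> bool {
    let n_validators = approvals.len();
    let n_approvals = approvals.iter().filter(|a| **a).count();
    // More than a third of validators approved: at least one honest approval.
    if 3 * n_approvals > n_validators {
        return true;
    }
    match *required {
        RequiredTranches::All | RequiredTranches::Pending => false,
        RequiredTranches::Exact { needed, tolerated_missing, last_assignment_tick } => {
            let assigned: Vec<usize> = tranches
                .iter()
                .take_while(|t| t.tranche <= needed)
                .flat_map(|t| t.validators.iter().copied())
                .collect();
            let approved = assigned.iter().filter(|v| approvals[**v]).count();
            let enough = approved + tolerated_missing >= assigned.len();
            // The last needed assignment must have been known for `APPROVAL_DELAY` ticks.
            let delay_elapsed =
                last_assignment_tick.map_or(true, |t| t + APPROVAL_DELAY <= tick_now);
            enough && delay_elapsed
        }
    }
}
```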
|
||||
### Time
|
||||
|
||||
#### Current Tranche
|
||||
* Given the slot number of a block and the current time, this determines the current tranche.
|
||||
* Convert `time.saturating_sub(slot_number.to_time())` to a delay tranche value
|
||||
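As a rough illustration of the conversion, the sketch below works in milliseconds. The tick length of 500 ms is an assumption (suggested by the `slot_duration * 2` remark in the `Tick` comment, i.e. two ticks per second), as are the one-tranche-per-tick mapping, the `current_tranche` name and the millisecond parameters.

```rust
// Assumed tick length: two ticks per second.
const TICK_DURATION_MILLIS: u64 = 500;

// Returns the delay tranche we are currently in for a block authored in
// `slot_number`, assuming one tranche per tick.
fn current_tranche(now_millis: u64, slot_number: u64, slot_duration_millis: u64) -> u32 {
    // Time at which the slot started.
    let slot_start = slot_number.saturating_mul(slot_duration_millis);
    // Saturate so that a clock slightly behind the slot start yields tranche 0.
    let elapsed = now_millis.saturating_sub(slot_start);
    (elapsed / TICK_DURATION_MILLIS) as u32
}
```

With a 6000 ms slot, slot 10 starts at 60000 ms, so a local time of 61250 ms falls into tranche 2 under these assumptions.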
|
||||
[CSM]: ../../types/overseer-protocol.md#chainselectionmessage
|
||||