impl approval distribution (#2160)

* initial impl approval distribution

* initial tests and fixes

* batching seems difficult: different peers have different needs

* bridge: fix test after merge

* some guide updates

* only send assignments to peers who know about the block

* fix a test, add approvals test

* simplify

* do not send assignment to peers for finalized blocks

* guide: protocol input and output

* one more test

* more comments, logs, initial metrics

* fix a typo

* one more thing: early return when reimporting a thing locally
Andronik Ordian
2021-01-26 00:14:32 +01:00
committed by GitHub
parent fa6e4b4488
commit 3f1e1a6ff7
18 changed files with 2119 additions and 80 deletions
@@ -25,6 +25,18 @@ However, awareness on its own of a (block, candidate) pair would imply that even
## Protocol
Input:
- `ApprovalDistributionMessage::NewBlocks`
- `ApprovalDistributionMessage::DistributeAssignment`
- `ApprovalDistributionMessage::DistributeApproval`
- `ApprovalDistributionMessage::NetworkBridgeUpdateV1`
- `OverseerSignal::BlockFinalized`
Output:
- `ApprovalVotingMessage::CheckAndImportAssignment`
- `ApprovalVotingMessage::CheckAndImportApproval`
- `NetworkBridgeMessage::SendValidationMessage::ApprovalDistribution`
## Functionality
```rust
@@ -34,10 +46,9 @@ type BlockScopedCandidate = (Hash, CandidateHash);
///
/// It tracks metadata about our view of the unfinalized chain, which assignments and approvals we have seen, and our peers' views.
struct State {
// These three fields are used in conjunction to construct a view over the unfinalized chain.
// These two fields are used in conjunction to construct a view over the unfinalized chain.
blocks_by_number: BTreeMap<BlockNumber, Vec<Hash>>,
blocks: HashMap<Hash, BlockEntry>,
finalized_number: BlockNumber,
// Peer view data is partially stored here, and partially inline within the `BlockEntry`s
peer_views: HashMap<PeerId, View>,
@@ -103,10 +114,6 @@ We augment that by defining `constrain(x)` to output the x bounded by the first
From there, we can loop backwards from `constrain(view.finalized_number)` until `constrain(last_view.finalized_number)` is reached, removing the `PeerId` from all `BlockEntry`s referenced at that height. We can break the loop early if we ever exit the bound supplied by the first block in `state.blocks_by_number`.
#### `NetworkBridgeEvent::OurViewChange`
Prune all lists from `blocks_by_number` with number less than or equal to `view.finalized_number`. Prune all the `BlockEntry`s referenced by those lists.
#### `NetworkBridgeEvent::PeerMessage`
If the message is of type `ApprovalDistributionV1Message::Assignment(assignment_cert, claimed_index)`, then call `import_and_circulate_assignment(MessageSource::Peer(sender), assignment_cert, claimed_index)`
@@ -125,11 +132,16 @@ For all peers:
#### `ApprovalDistributionMessage::DistributeAssignment`
Load the corresponding `BlockEntry`. Distribute to all peers in `known_by`. Add to the corresponding `CandidateEntry`.
Call `import_and_circulate_assignment` with `MessageSource::Local`.
#### `ApprovalDistributionMessage::DistributeApproval`
Load the corresponding `BlockEntry`. Distribute to all peers in `known_by`. Add to the corresponding `CandidateEntry`.
Call `import_and_circulate_approval` with `MessageSource::Local`.
#### `OverseerSignal::BlockFinalized`
Prune all lists from `blocks_by_number` with number less than or equal to `finalized_number`. Prune all the `BlockEntry`s referenced by those lists.
### Utility
@@ -140,19 +152,29 @@ enum MessageSource {
}
```
#### `import_and_circulate_assignment(source: MessageSource, assignment: IndirectAssignmentCert, claimed_candidate_index: u32)`
#### `import_and_circulate_assignment(source: MessageSource, assignment: IndirectAssignmentCert, claimed_candidate_index: CandidateIndex)`
Imports an assignment cert referenced by block hash and candidate index. As a postcondition, if the cert is valid, it will have distributed the cert to all peers who have the block in their view, with the exclusion of the peer referenced by the `MessageSource`.
We maintain a few invariants:
* we only send an assignment to a peer after we add its fingerprint to our knowledge
* we add a fingerprint of an assignment to our knowledge only if it's valid and hasn't been added before
The algorithm is the following:
* Load the BlockEntry using `assignment.block_hash`. If it does not exist, report the source if it is `MessageSource::Peer` and return.
* Compute a fingerprint for the `assignment` using `claimed_candidate_index`.
* If the source is `MessageSource::Peer(sender)`:
* check if `sender` appears under `known_by` and whether the fingerprint is in the `known_messages` of the peer. If the peer does not know the block, report for providing data out-of-view and proceed. If the peer does know the block and the knowledge contains the fingerprint, report for providing duplicate data and return.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost and return. Note that we must do this after checking for out-of-view to avoid being spammed. If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost,
add the fingerprint to the peer's knowledge only if it knows about the block, and return.
Note that we must do this after checking for out-of-view data and whether the peer knows about the block, to avoid being spammed.
If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* Dispatch `ApprovalVotingMessage::CheckAndImportAssignment(assignment)` and wait for the response.
* If the result is `AssignmentCheckResult::Accepted` or `AssignmentCheckResult::AcceptedDuplicate`
* If the result is `AssignmentCheckResult::Accepted`
* If the vote was accepted but not duplicate, give the peer a positive reputation boost
* add the fingerprint to both our and the peer's knowledge in the `BlockEntry`. Note that we only do this after making sure we have the right fingerprint.
* If the result is `AssignmentCheckResult::AcceptedDuplicate`, add the fingerprint to the peer's knowledge if it knows about the block and return.
* If the result is `AssignmentCheckResult::TooFarInFuture`, mildly punish the peer and return.
* If the result is `AssignmentCheckResult::Bad`, punish the peer and return.
* If the source is `MessageSource::Local(CandidateIndex)`
@@ -164,14 +186,16 @@ Imports an assignment cert referenced by block hash and candidate index. As a po
#### `import_and_circulate_approval(source: MessageSource, approval: IndirectSignedApprovalVote)`
Imports an approval signature referenced by block hash and candidate index.
Imports an approval signature referenced by block hash and candidate index:
* Load the BlockEntry using `approval.block_hash` and the candidate entry using `approval.candidate_index`. If either does not exist, report the source if it is `MessageSource::Peer` and return.
* Compute a fingerprint for the approval.
* Compute a fingerprint for the corresponding assignment. If the `BlockEntry`'s knowledge does not contain that fingerprint, then report the source if it is `MessageSource::Peer` and return. All references to a fingerprint after this refer to the approval's, not the assignment's.
* If the source is `MessageSource::Peer(sender)`:
* check if `sender` appears under `known_by` and whether the fingerprint is in the `known_messages` of the peer. If the peer does not know the block, report for providing data out-of-view and proceed. If the peer does know the block and the knowledge contains the fingerprint, report for providing duplicate data and return.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost and return. Note that we must do this after checking for out-of-view to avoid being spammed. If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost,
add the fingerprint to the peer's knowledge only if it knows about the block, and return.
Note that we must do this after checking for out-of-view data to avoid being spammed. If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it.
* Dispatch `ApprovalVotingMessage::CheckAndImportApproval(approval)` and wait for the response.
* If the result is `VoteCheckResult::Accepted(())`:
* Give the peer a positive reputation boost and add the fingerprint to both our and the peer's knowledge.
@@ -187,10 +211,12 @@ Imports an approval signature referenced by block hash and candidate index.
#### `unify_with_peer(peer: PeerId, view)`:
1. Initialize a set `fresh_blocks = {}`
For each block in the view:
1. Initialize a set `fresh_blocks = {}`
2. Load the `BlockEntry` for the block. If the block is unknown, or the number is less than the view's finalized number, go to step 6.
2. Load the `BlockEntry` for the block. If the block is unknown, or the number is less than or equal to the view's finalized number, go to step 6.
3. Inspect the `known_by` set of the `BlockEntry`. If the peer is already present, go to step 6.
4. Add the peer to `known_by` with a cloned version of `block_entry.knowledge` and add the hash of the block to `fresh_blocks`.
5. Return to step 2 with the ancestor of the block.
6. For each block in `fresh_blocks`, send all assignments and approvals for all candidates in those blocks to the peer.
6. For each block in `fresh_blocks`, send all assignments and approvals for all candidates in those blocks to the peer.
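
The numbered steps of `unify_with_peer` can be sketched roughly as follows. This is a minimal illustration, not the subsystem's actual code: the type aliases, the `Fingerprint` type, the `heads` field on `View`, and the `parent_hash` field on `BlockEntry` are all simplifying assumptions made for the example, and instead of sending messages the function only returns the block hashes whose assignments and approvals would be sent in step 6.

```rust
use std::collections::{HashMap, HashSet};

// Simplified stand-ins for the real types (assumptions for this sketch).
type Hash = u64;
type BlockNumber = u32;
type PeerId = u32;
type Fingerprint = u64;

#[derive(Clone, Default)]
struct Knowledge {
    known_messages: HashSet<Fingerprint>,
}

struct BlockEntry {
    number: BlockNumber,
    // Hypothetical field used to walk to the ancestor in step 5.
    parent_hash: Hash,
    knowledge: Knowledge,
    known_by: HashMap<PeerId, Knowledge>,
}

struct View {
    heads: Vec<Hash>,
    finalized_number: BlockNumber,
}

/// Walks back from every head in `view` and records `peer` in `known_by`.
/// Returns the hashes of the blocks that are new to the peer.
fn unify_with_peer(
    blocks: &mut HashMap<Hash, BlockEntry>,
    peer: PeerId,
    view: &View,
) -> Vec<Hash> {
    let mut to_send = Vec::new();
    for head in &view.heads {
        // Step 1: start with an empty set of fresh blocks for this head.
        let mut fresh_blocks = Vec::new();
        let mut current = *head;
        loop {
            // Step 2: stop on unknown blocks or blocks at or below the
            // view's finalized number.
            let entry = match blocks.get_mut(&current) {
                Some(e) if e.number > view.finalized_number => e,
                _ => break,
            };
            // Step 3: stop if the peer is already known to have this block.
            if entry.known_by.contains_key(&peer) {
                break;
            }
            // Step 4: the peer starts out with a clone of our knowledge.
            entry.known_by.insert(peer, entry.knowledge.clone());
            fresh_blocks.push(current);
            // Step 5: continue with the ancestor.
            current = entry.parent_hash;
        }
        // Step 6 would send all messages for these blocks to the peer.
        to_send.extend(fresh_blocks);
    }
    to_send
}
```

Calling the function a second time with the same peer and view yields nothing new, because step 3 stops the walk as soon as the peer is found in `known_by`.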
@@ -113,7 +113,7 @@ In-memory state:
struct ApprovalVoteRequest {
validator_index: ValidatorIndex,
block_hash: Hash,
candidate_index: u32,
candidate_index: CandidateIndex,
}
struct State {
@@ -121,7 +121,7 @@ struct State {
session_info: Vec<SessionInfo>,
keystore: KeyStorePtr,
wakeups: BTreeMap<Tick, Vec<(Hash, Hash)>>, // Tick -> [(Relay Block, Candidate Hash)]
// These are connected to each other.
approval_vote_tx: mpsc::Sender<ApprovalVoteRequest>,
approval_vote_rx: mpsc::Receiver<ApprovalVoteRequest>,
@@ -148,7 +148,7 @@ On receiving an `OverseerSignal::BlockFinalized(h)`, we fetch the block number `
#### `OverseerSignal::ActiveLeavesUpdate`
On receiving an `OverseerSignal::ActiveLeavesUpdate(update)`:
* We determine the set of new blocks that were not in our previous view. This is done by querying the ancestry of all new items in the view and contrasting against the stored `BlockNumber`s. Typically, there will be only one new block. We fetch the headers and information on these blocks from the ChainApi subsystem.
* We determine the set of new blocks that were not in our previous view. This is done by querying the ancestry of all new items in the view and contrasting against the stored `BlockNumber`s. Typically, there will be only one new block. We fetch the headers and information on these blocks from the ChainApi subsystem.
* We update the `StoredBlockRange` and the `BlockNumber` maps.
* We use the RuntimeApiSubsystem to determine information about these blocks. It is generally safe to assume that runtime state is available for recent, unfinalized blocks. In the case that it isn't, it means that we are catching up to the head of the chain and needn't worry about assignments to those blocks anyway, as the security assumption of the protocol tolerates nodes being temporarily offline or out-of-date.
* We fetch the set of candidates included by each block by dispatching a `RuntimeApiRequest::CandidateEvents` and checking the `CandidateIncluded` events.
@@ -156,7 +156,7 @@ On receiving an `OverseerSignal::ActiveLeavesUpdate(update)`:
* If the `session index - APPROVAL_SESSIONS > state.earliest_session`, then bump `state.earliest_session` to that amount and prune earlier sessions.
* If the session isn't in our `state.session_info`, load the session info for it and for all sessions since the earliest-session, including the earliest-session, if that is missing. It can be missing just after pruning if we've done a big jump forward, as is the case when we've just finished chain synchronization.
* If any of the runtime API calls fail, we just warn and skip the block.
* We use the RuntimeApiSubsystem to determine the set of candidates included in these blocks and use BABE logic to determine the slot number and VRF of the blocks.
* We use the RuntimeApiSubsystem to determine the set of candidates included in these blocks and use BABE logic to determine the slot number and VRF of the blocks.
* We also note how late we appear to have received the block. We create a `BlockEntry` for each block and a `CandidateEntry` for each candidate obtained from `CandidateIncluded` events after making a `RuntimeApiRequest::CandidateEvents` request.
* Ensure that the `CandidateEntry` contains a `block_assignments` entry for the block, with the correct backing group set.
* If we are a validator in this session, compute and assign `our_assignment` for the `block_assignments`
@@ -222,7 +222,7 @@ On receiving an `ApprovedAncestor(Hash, BlockNumber, response_channel)`:
```rust
enum RequiredTranches {
// All validators appear to be required, based on tranches already taken and remaining no-shows.
// All validators appear to be required, based on tranches already taken and remaining no-shows.
All,
// More tranches required - We're awaiting more assignments. The given `DelayTranche` indicates the
// upper bound of tranches that should broadcast based on the last no-show.
@@ -30,7 +30,7 @@ struct AssignmentCert {
> TODO: RelayEquivocation cert. Probably can only be broadcast to chains that have handled an equivocation report.
## IndirectAssignmentCert
## IndirectAssignmentCert
An assignment cert which refers to the candidate under which the assignment is relevant by block hash.
@@ -74,7 +74,7 @@ struct IndirectSignedApprovalVote {
// A block hash where the candidate appears.
block_hash: Hash,
// The index of the candidate in the list of candidates fully included as-of the block.
candidate_index: u32,
candidate_index: CandidateIndex,
validator: ValidatorIndex,
signature: ValidatorSignature,
}
@@ -98,4 +98,4 @@ struct CheckedAssignmentCert {
```rust
type DelayTranche = u32;
```
```
@@ -95,6 +95,8 @@ struct BlockApprovalMeta {
hash: Hash,
/// The number of the block.
number: BlockNumber,
/// The hash of the parent block.
parent_hash: Hash,
/// The candidates included by the block. Note that these are not the same as the candidates that appear within the
/// block body.
candidates: Vec<CandidateHash>,
@@ -108,9 +110,7 @@ enum ApprovalDistributionMessage {
NewBlocks(Vec<BlockApprovalMeta>),
/// Distribute an assignment cert from the local validator. The cert is assumed
/// to be valid, relevant, and for the given relay-parent and validator index.
///
/// The `u32` param is the candidate index in the fully-included list.
DistributeAssignment(IndirectAssignmentCert, u32),
DistributeAssignment(IndirectAssignmentCert, CandidateIndex),
/// Distribute an approval vote for the local validator. The approval vote is assumed to be
/// valid, relevant, and the corresponding approval already issued. If not, the subsystem is free to drop
/// the message.