# PoV Distribution

We need to distribute the PoV after we have seconded it. Other nodes that receive our `Seconded` statement and want to validate the candidate themselves will request this PoV from us.
This subsystem is responsible for distributing PoV blocks. For now, unified with Statement Distribution subsystem.
## Protocol

`PeerSet`: `Validation`

Input: `PoVDistributionMessage`

Output:

- `NetworkBridge::SendMessage([PeerId], message)`
- `NetworkBridge::ReportPeer(PeerId, cost_or_benefit)`
## Functionality

This network protocol is responsible for distributing PoVs by gossip. Since PoVs are heavy in practice, gossip is far from the most efficient way to distribute them. In the future, this should be replaced by a better network protocol that finds validators who have validated the block and connects to them directly.
This protocol is described in terms of "us" and our peers, with the understanding that this is the procedure that any honest node will run. It has the following goals:
- We never have to buffer an unbounded amount of data
- PoVs will flow transitively across a network of honest nodes, stemming from the validators that originally seconded candidates requiring those PoVs.
As we are gossiping, we need to track which PoVs our peers are waiting for, to avoid sending them data that they are not expecting. It is not reasonable to expect our peers to buffer unexpected PoVs, just as we will not buffer unexpected PoVs ourselves. Notifying our peers about what we are awaiting is therefore key; however, the notification system itself must also be bounded.
For this, in order to avoid reaching into the internals of the Statement Distribution Subsystem, we can rely on an expected property of candidate backing: that each validator can second up to 2 candidates per chain head. This will typically be only one, because they are only supposed to issue one, but they can equivocate if they are willing to be slashed. So we can set a cap on the number of PoVs each peer is allowed to notify us that they are waiting for at a given relay-parent. This cap will be twice the number of validators at that relay-parent. In practice, this is a very lax upper bound that can be reduced much further if desired.
The view update mechanism of the Network Bridge ensures that peers are only allowed to consider a certain set of relay-parents as live. So this bounding mechanism caps the amount of data we need to store per peer at any time at `sum({ 2 * n_validators_at_head(head) * sizeof(hash) for head in view_heads })`. Additionally, peers should only be allowed to notify us of PoV hashes they are waiting for in the context of relay-parents in our own local view, which means that `n_validators_at_head` is implied to be 0 for relay-parents not in our own local view.
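As a rough illustration, the per-peer bound above can be computed like so. This is a minimal sketch: `per_peer_bound`, `HASH_SIZE`, and the use of `u64` stand-in head hashes are assumptions for the example, not names from the implementation.

```rust
use std::collections::HashMap;

// sizeof(hash) for a 256-bit hash (an assumption for this sketch).
const HASH_SIZE: usize = 32;

/// Sum `2 * n_validators_at_head(head) * sizeof(hash)` over all live heads.
fn per_peer_bound(n_validators_at_head: &HashMap<u64, usize>) -> usize {
    n_validators_at_head
        .values()
        .map(|n_validators| 2 * n_validators * HASH_SIZE)
        .sum()
}
```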
View updates from peers and our own view updates are received from the network bridge. These will lag somewhat behind the `ActiveLeavesUpdate` messages received from the overseer, which will influence the actual data we store. The `OurViewUpdate`s from the `NetworkBridgeEvent` must be considered canonical in terms of our peers' perception of us.
Lastly, the system needs to be bootstrapped with our own perception of which PoVs we are cognizant of but awaiting data for. This is done by receipt of the `PoVDistributionMessage::FetchPoV` variant. Proper operation of this subsystem depends on the descriptors passed faithfully representing candidates which have been seconded by other validators.
## Formal Description
This protocol can be implemented as a state machine with the following state:

```rust
struct State {
    relay_parent_state: Map<Hash, BlockBasedState>,
    peer_state: Map<PeerId, PeerState>,
    our_view: View,
}

struct BlockBasedState {
    known: Map<Hash, PoV>, // should be a shared PoV in practice. These things are heavy.
    fetching: Map<Hash, [ResponseChannel<PoV>]>,
    n_validators: usize,
}

struct PeerState {
    awaited: Map<Hash, Set<Hash>>,
}
```
We also use the `PoVDistributionV1Message` as our `NetworkMessage`, which is sent and received by the Network Bridge.
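For concreteness, the wire message could be sketched as the following enum. The variant shapes follow the logic described below; the `Hash` and `PoV` aliases are simplified stand-ins for the real types, not the actual definitions.

```rust
// Simplified stand-ins for the real hash and PoV types.
type Hash = [u8; 32];
type PoV = Vec<u8>;

enum NetworkMessage {
    /// Notify peers which PoV hashes we are awaiting at a given relay-parent.
    Awaiting(Hash, Vec<Hash>),
    /// Send a full PoV, identified by its hash, at a given relay-parent.
    SendPoV(Hash, Hash, PoV),
}
```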
Here is the logic of the state machine:
### Overseer Signals

- On `ActiveLeavesUpdate(relay_parent)`:
  - For each relay-parent in the `activated` list:
    - Get the number of validators at that relay-parent by querying the Runtime API for the validators and then counting them.
    - Create a blank entry in `relay_parent_state` under `relay_parent` with the correct `n_validators` set.
  - For each relay-parent in the `deactivated` list:
    - Remove the entry for `relay_parent` from `relay_parent_state`.
- On `Conclude`: conclude.
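The activation and deactivation steps above can be sketched as follows. This is a minimal sketch with simplified types; the `validators_at` closure stands in for the Runtime API query, and the pared-down `BlockBasedState` is an assumption for the example.

```rust
use std::collections::HashMap;

// Simplified stand-in for a real block hash.
type Hash = u64;

struct BlockBasedState {
    n_validators: usize,
    // the `known` and `fetching` maps are omitted for brevity
}

fn handle_active_leaves_update(
    relay_parent_state: &mut HashMap<Hash, BlockBasedState>,
    activated: &[Hash],
    deactivated: &[Hash],
    validators_at: impl Fn(Hash) -> usize,
) {
    // Create a blank entry, with `n_validators` set, for each activated head.
    for &head in activated {
        let n_validators = validators_at(head);
        relay_parent_state.insert(head, BlockBasedState { n_validators });
    }
    // Drop all state for each deactivated head.
    for head in deactivated {
        relay_parent_state.remove(head);
    }
}
```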
### PoV Distribution Messages

- On `FetchPoV(relay_parent, descriptor, response_channel)`:
  - If there is no entry in `relay_parent_state` under `relay_parent`, ignore.
  - If there is a PoV under `descriptor.pov_hash` in the `known` map, send that PoV on the channel and return.
  - Otherwise, place the `response_channel` in the `fetching` map under `descriptor.pov_hash`.
  - If the `pov_hash` had no previous entry in `fetching` and there are `2 * n_validators` or fewer entries in the `fetching` set, send `NetworkMessage::Awaiting(relay_parent, vec![pov_hash])` to all peers.
- On `DistributePoV(relay_parent, descriptor, PoV)`:
  - If there is no entry in `relay_parent_state` under `relay_parent`, ignore.
  - Complete and remove any channels under `descriptor.pov_hash` in the `fetching` map.
  - Send `NetworkMessage::SendPoV(relay_parent, descriptor.pov_hash, PoV)` to all peers who have `descriptor.pov_hash` in the set under `relay_parent` in their `awaited` map, and remove the entry from their `awaited` map.
  - Note the PoV under `descriptor.pov_hash` in `known`.
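The `FetchPoV` branch could look roughly like this. The `FetchOutcome` enum and the simplified channel and hash types are assumptions made to keep the sketch self-contained and testable; the real subsystem sends the `Awaiting` message itself rather than returning an outcome.

```rust
use std::collections::HashMap;

type Hash = u64;
type PoV = Vec<u8>;
// Stand-in for the actual one-shot response channel.
type ResponseChannel = std::sync::mpsc::Sender<PoV>;

struct BlockBasedState {
    known: HashMap<Hash, PoV>,
    fetching: HashMap<Hash, Vec<ResponseChannel>>,
    n_validators: usize,
}

enum FetchOutcome {
    Ignored,                 // unknown relay-parent
    SentKnown,               // PoV already known, answered immediately
    AwaitingAnnounced(Hash), // caller should gossip `Awaiting` for this hash
    Queued,                  // channel stored; no announcement needed
}

fn handle_fetch_pov(
    relay_parent_state: &mut HashMap<Hash, BlockBasedState>,
    relay_parent: Hash,
    pov_hash: Hash,
    response_channel: ResponseChannel,
) -> FetchOutcome {
    let state = match relay_parent_state.get_mut(&relay_parent) {
        Some(state) => state,
        None => return FetchOutcome::Ignored,
    };
    if let Some(pov) = state.known.get(&pov_hash) {
        let _ = response_channel.send(pov.clone());
        return FetchOutcome::SentKnown;
    }
    let newly_awaited = !state.fetching.contains_key(&pov_hash);
    state.fetching.entry(pov_hash).or_default().push(response_channel);
    // Announce only if the hash is new and we are within the `2 * n_validators` cap.
    if newly_awaited && state.fetching.len() <= 2 * state.n_validators {
        FetchOutcome::AwaitingAnnounced(pov_hash)
    } else {
        FetchOutcome::Queued
    }
}
```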
### Network Bridge Updates

- On `PeerConnected(peer_id, observed_role)`:
  - Make a fresh entry in the `peer_state` map for the `peer_id`.
- On `PeerDisconnected(peer_id)`:
  - Remove the entry for `peer_id` from the `peer_state` map.
- On `PeerMessage(peer_id, bytes)`:
  - If the bytes do not decode to a `NetworkMessage`, or the `peer_id` has no entry in the `peer_state` map, report and ignore.
  - If this is `NetworkMessage::Awaiting(relay_parent, pov_hashes)`:
    - If there is no entry under `peer_state.awaited` for the `relay_parent`, report and ignore.
    - If `relay_parent` is not contained within `our_view`, report and ignore.
    - Otherwise, if the peer's `awaited` map combined with the `pov_hashes` would have more than `2 * relay_parent_state[relay_parent].n_validators` entries, report and ignore. Note that we are leaning on the property of the network bridge that it sets our view based on `activated` heads in `ActiveLeavesUpdate` signals.
    - For each new `pov_hash` in `pov_hashes`, if there is a `pov` under `pov_hash` in the `known` map, send the peer a `NetworkMessage::SendPoV(relay_parent, pov_hash, pov)`.
    - Otherwise, add the `pov_hash` to the `awaited` map.
  - If this is `NetworkMessage::SendPoV(relay_parent, pov_hash, pov)`:
    - If there is no entry under `relay_parent` in `relay_parent_state`, or no entry under `pov_hash` in our `fetching` map for that `relay_parent`, report and ignore.
    - If the blake2-256 hash of the `pov` does not equal `pov_hash`, report and ignore.
    - Complete and remove any listeners in the `fetching` map under `pov_hash`. However, leave an empty set of listeners in the `fetching` map to denote that this was something we once awaited. This will allow us to recognize peers who have sent us something we were expecting, but just a little late.
    - Add the PoV to the `known` map.
    - Remove the `pov_hash` from the `peer.awaited` map, if present.
    - Send `NetworkMessage::SendPoV(relay_parent, pov_hash, pov)` to all peers who have `pov_hash` in the set under `relay_parent` in their `awaited` map, and remove the entry from their `awaited` map.
- On `PeerViewChange(peer_id, view)`:
  - If the peer is unknown, ignore.
  - Ensure there is an entry under `relay_parent` for each `relay_parent` in `view` within the `peer.awaited` map, creating blank `awaited` lists as necessary.
  - Remove all entries under `peer.awaited` that are not within `view`.
  - For all hashes in `view` that were not within the old view, send the peer all the keys in our `fetching` map under the block-based state for that hash - i.e. notify the peer of everything we are awaiting at that hash.
- On `OurViewChange(view)`:
  - Update `our_view` to `view`.
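The bounded `Awaiting` check above can be sketched as follows. This is a minimal sketch of one peer's state; the `Result` shape, the error strings, and returning the newly-awaited hashes for the caller to answer or record are illustrative choices, not the real API.

```rust
use std::collections::{HashMap, HashSet};

type Hash = u64;

/// Merge a peer's `Awaiting(relay_parent, pov_hashes)` notification into its
/// `awaited` map, rejecting it if the `2 * n_validators` cap would be exceeded.
/// Returns the hashes we had not yet seen from this peer.
fn handle_awaiting(
    awaited: &mut HashMap<Hash, HashSet<Hash>>, // this peer's relay-parent -> awaited hashes
    relay_parent: Hash,
    pov_hashes: &[Hash],
    n_validators: usize,
) -> Result<Vec<Hash>, &'static str> {
    // No entry for the relay-parent: report and ignore.
    let set = awaited.get_mut(&relay_parent).ok_or("unknown relay-parent: report")?;
    // Reject if the merged set would exceed the `2 * n_validators` cap.
    let merged: HashSet<Hash> = set.iter().chain(pov_hashes.iter()).copied().collect();
    if merged.len() > 2 * n_validators {
        return Err("over cap: report");
    }
    // Record and return only the hashes that are new for this peer.
    Ok(pov_hashes.iter().copied().filter(|h| set.insert(*h)).collect())
}
```

Checking the cap against the merged set (rather than the raw count) means repeated hashes from a peer are tolerated without double-counting, while any attempt to grow past the bound is reported.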