implement candidate selection subsystem (#1645)

* choose the straightforward candidate selection algorithm for now

* add draft implementation of candidate selection

* fix typo in summary

* more properly report misbehaving collators

* describe how CandidateSelection subsystem becomes aware of candidates

* revise candidate selection / collator protocol interaction pattern

* implement rest of candidate selection per the guide

* review: resolve nits

* start writing test suite, harness

* implement first test

* add second test

* implement third test

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
This commit is contained in:
Peter Goodspeed-Niklaus
2020-09-08 11:48:48 +02:00
committed by GitHub
parent 2f800d1489
commit d1b1c17285
8 changed files with 739 additions and 17 deletions
@@ -10,15 +10,17 @@ Validation of candidates is a heavy task, and furthermore, the [`PoV`][PoV] itse
> TODO: note the incremental validation function Ximin proposes at https://github.com/paritytech/polkadot/issues/1348
As this network protocol serves as a bridge between collators and validators, it communicates primarily with one subsystem on behalf of each. As a collator, this will receive messages from the [`CollationGeneration`][CG] subsystem. As a validator, this will communicate with the [`CandidateBacking`][CB] subsystem.
As this network protocol serves as a bridge between collators and validators, it communicates primarily with one subsystem on behalf of each. As a collator, this will receive messages from the [`CollationGeneration`][CG] subsystem. As a validator, this will communicate with the [`CandidateBacking`][CB] and [`CandidateSelection`][CS] subsystems.
## Protocol
Input: [`CollatorProtocolMessage`][CPM]
Output:
- [`RuntimeApiMessage`][RAM]
- [`NetworkBridgeMessage`][NBM]
- [`RuntimeApiMessage`][RAM]
- [`NetworkBridgeMessage`][NBM]
- [`CandidateSelectionMessage`][CSM]
## Functionality
@@ -102,18 +104,30 @@ When peers connect to us, they can `Declare` that they represent a collator with
The protocol tracks advertisements received and the source of the advertisement. The advertisement source is the `PeerId` of the peer who sent the message. We accept one advertisement per collator per source per relay-parent.
As a validator, we will handle requests from other subsystems to fetch a collation on a specific `ParaId` and relay-parent. These requests are made with the [`CollatorProtocolMessage`][CPM]`::FetchCollation`. To do so, we need to first check if we have already gathered a collation on that `ParaId` and relay-parent. If not, we need to select one of the advertisements and issue a request for it. If we've already issued a request, we shouldn't issue another one until the first has returned.
When acting on an advertisement, we issue a `WireMessage::RequestCollation`. If the request times out, we need to note the collator as being unreliable and reduce its priority relative to other collators. And then make another request - repeat until we get a response or the chain has moved on.
As a validator, once the collation has been fetched some other subsystem will inspect and do deeper validation of the collation. The subsystem will report to this subsystem with a [`CollatorProtocolMessage`][CPM]`::ReportCollator` or `NoteGoodCollation` message. In that case, if we are connected directly to the collator, we apply a cost to the `PeerId` associated with the collator and potentially disconnect or blacklist it.
[PoV]: ../../types/availability.md#proofofvalidity
[CPM]: ../../types/overseer-protocol.md#collatorprotocolmessage
[CG]: collation-generation.md
### Interaction with [Candidate Selection][CS]
As collators advertise the availability, we notify the Candidate Selection subsystem with a [`CandidateSelection`][CSM]`::Collation` message. Note that this message is lightweight: it only contains the relay parent, para id, and collator id.
At that point, the Candidate Selection algorithm is free to use an arbitrary algorithm to determine which if any of these messages to follow up on. It is expected to use the [`CollatorProtocolMessage`][CPM]`::FetchCollation` message to follow up.
The intent behind this design is to minimize the total number of (large) collations which must be transmitted.
[CB]: ../backing/candidate-backing.md
[CBM]: ../../types/overseer-protocol.md#candidate-backing-mesage
[CG]: collation-generation.md
[CPM]: ../../types/overseer-protocol.md#collator-protocol-message
[CS]: ../backing/candidate-selection.md
[CSM]: ../../types/overseer-protocol.md#candidate-selection-message
[NB]: ../utility/network-bridge.md
[CBM]: ../../types/overseer-protocol.md#candidatebackingmesage
[RAM]: ../../types/overseer-protocol.md#runtimeapimessage
[NBM]: ../../types/overseer-protocol.md#networkbridgemessage
[NBM]: ../../types/overseer-protocol.md#network-bridge-message
[PoV]: ../../types/availability.md#proofofvalidity
[RAM]: ../../types/overseer-protocol.md#runtime-api-message
[SCH]: ../../runtime/scheduler.md