Markdown linter (#1309)

* Add markdown linting

- add linter default rules
- adapt rules to current code
- fix the code for linting to pass
- add CI check

fix #1243

* Fix markdown for Substrate
* Fix tooling install
* Fix workflow
* Add documentation
* Remove trailing spaces
* Update .github/.markdownlint.yaml

Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* Fix mangled markdown/lists
* Fix captalization issues on known words
This commit is contained in:
Chevdor
2023-09-04 11:02:32 +02:00
committed by GitHub
parent 830fde2a60
commit a30092ab42
271 changed files with 6289 additions and 4450 deletions
@@ -1,3 +1,7 @@
# Availability Subsystems
The availability subsystems are responsible for ensuring that Proofs of Validity of backed candidates are widely available within the validator set, without requiring every node to retain a full copy. They accomplish this by broadly distributing erasure-coded chunks of the PoV, keeping track of which validator has which chunk by means of signed bitfields. They are also responsible for reassembling a complete PoV when required, e.g. when an approval checker needs to validate a parachain block.
The availability subsystems are responsible for ensuring that Proofs of Validity of backed candidates are widely
available within the validator set, without requiring every node to retain a full copy. They accomplish this by broadly
distributing erasure-coded chunks of the PoV, keeping track of which validator has which chunk by means of signed
bitfields. They are also responsible for reassembling a complete PoV when required, e.g. when an approval checker needs
to validate a parachain block.
@@ -1,31 +1,26 @@
# Availability Distribution
This subsystem is responsible for distribution availability data to peers.
Availability data are chunks, `PoV`s and `AvailableData` (which is `PoV` +
`PersistedValidationData`). It does so via request response protocols.
This subsystem is responsible for distribution availability data to peers. Availability data are chunks, `PoV`s and
`AvailableData` (which is `PoV` + `PersistedValidationData`). It does so via request response protocols.
In particular this subsystem is responsible for:
- Respond to network requests requesting availability data by querying the
[Availability Store](../utility/availability-store.md).
- Request chunks from backing validators to put them in the local `Availability
Store` whenever we find an occupied core on any fresh leaf,
this is to ensure availability by at least 2/3+ of all validators, this
happens after a candidate is backed.
- Fetch `PoV` from validators, when requested via `FetchPoV` message from
backing (`pov_requester` module).
- Respond to network requests requesting availability data by querying the [Availability
Store](../utility/availability-store.md).
- Request chunks from backing validators to put them in the local `Availability Store` whenever we find an occupied core
on any fresh leaf, this is to ensure availability by at least 2/3+ of all validators, this happens after a candidate
is backed.
- Fetch `PoV` from validators, when requested via `FetchPoV` message from backing (`pov_requester` module).
The backing subsystem is responsible of making available data available in the
local `Availability Store` upon validation. This subsystem will serve any
network requests by querying that store.
The backing subsystem is responsible of making available data available in the local `Availability Store` upon
validation. This subsystem will serve any network requests by querying that store.
## Protocol
This subsystem does not handle any peer set messages, but the `pov_requester`
does connect to validators of the same backing group on the validation peer
set, to ensure fast propagation of statements between those validators and for
ensuring already established connections for requesting `PoV`s. Other than that
this subsystem drives request/response protocols.
This subsystem does not handle any peer set messages, but the `pov_requester` does connect to validators of the same
backing group on the validation peer set, to ensure fast propagation of statements between those validators and for
ensuring already established connections for requesting `PoV`s. Other than that this subsystem drives request/response
protocols.
Input:
@@ -48,51 +43,42 @@ Output:
### PoV Requester
The PoV requester in the `pov_requester` module takes care of staying connected
to validators of the current backing group of this very validator on the `Validation`
peer set and it will handle `FetchPoV` requests by issuing network requests to
those validators. It will check the hash of the received `PoV`, but will not do any
further validation. That needs to be done by the original `FetchPoV` sender
(backing subsystem).
The PoV requester in the `pov_requester` module takes care of staying connected to validators of the current backing
group of this very validator on the `Validation` peer set and it will handle `FetchPoV` requests by issuing network
requests to those validators. It will check the hash of the received `PoV`, but will not do any further validation. That
needs to be done by the original `FetchPoV` sender (backing subsystem).
### Chunk Requester
After a candidate is backed, the availability of the PoV block must be confirmed
by 2/3+ of all validators. The chunk requester is responsible of making that
availability a reality.
After a candidate is backed, the availability of the PoV block must be confirmed by 2/3+ of all validators. The chunk
requester is responsible of making that availability a reality.
It does that by querying checking occupied cores for all active leaves. For each
occupied core it will spawn a task fetching the erasure chunk which has the
`ValidatorIndex` of the node. For this an `ChunkFetchingRequest` is issued, via
substrate's generic request/response protocol.
It does that by querying checking occupied cores for all active leaves. For each occupied core it will spawn a task
fetching the erasure chunk which has the `ValidatorIndex` of the node. For this an `ChunkFetchingRequest` is issued, via
Substrate's generic request/response protocol.
The spawned task will start trying to fetch the chunk from validators in
responsible group of the occupied core, in a random order. For ensuring that we
use already open TCP connections wherever possible, the requester maintains a
cache and preserves that random order for the entire session.
The spawned task will start trying to fetch the chunk from validators in responsible group of the occupied core, in a
random order. For ensuring that we use already open TCP connections wherever possible, the requester maintains a cache
and preserves that random order for the entire session.
Note however that, because not all validators in a group have to be actual
backers, not all of them are required to have the needed chunk. This in turn
could lead to low throughput, as we have to wait for fetches to fail,
before reaching a validator finally having our chunk. We do rank back validators
not delivering our chunk, but as backers could vary from block to block on a
perfectly legitimate basis, this is still not ideal. See issues [2509](https://github.com/paritytech/polkadot/issues/2509) and [2512](https://github.com/paritytech/polkadot/issues/2512)
for more information.
Note however that, because not all validators in a group have to be actual backers, not all of them are required to have
the needed chunk. This in turn could lead to low throughput, as we have to wait for fetches to fail, before reaching a
validator finally having our chunk. We do rank back validators not delivering our chunk, but as backers could vary from
block to block on a perfectly legitimate basis, this is still not ideal. See issues
[2509](https://github.com/paritytech/polkadot/issues/2509) and
[2512](https://github.com/paritytech/polkadot/issues/2512) for more information.
The current implementation also only fetches chunks for occupied cores in blocks
in active leaves. This means though, if active leaves skips a block or we are
particularly slow in fetching our chunk, we might not fetch our chunk if
availability reached 2/3 fast enough (slot becomes free). This is not desirable
as we would like as many validators as possible to have their chunk. See this
[issue](https://github.com/paritytech/polkadot/issues/2513) for more details.
The current implementation also only fetches chunks for occupied cores in blocks in active leaves. This means though, if
active leaves skips a block or we are particularly slow in fetching our chunk, we might not fetch our chunk if
availability reached 2/3 fast enough (slot becomes free). This is not desirable as we would like as many validators as
possible to have their chunk. See this [issue](https://github.com/paritytech/polkadot/issues/2513) for more details.
### Serving
On the other side the subsystem will listen for incoming `ChunkFetchingRequest`s
and `PoVFetchingRequest`s from the network bridge and will respond to queries,
by looking the requested chunks and `PoV`s up in the availability store, this
happens in the `responder` module.
On the other side the subsystem will listen for incoming `ChunkFetchingRequest`s and `PoVFetchingRequest`s from the
network bridge and will respond to queries, by looking the requested chunks and `PoV`s up in the availability store,
this happens in the `responder` module.
We rely on the backing subsystem to make available data available locally in the
`Availability Store` after it has validated it.
We rely on the backing subsystem to make available data available locally in the `Availability Store` after it has
validated it.
@@ -1,8 +1,13 @@
# Availability Recovery
This subsystem is the inverse of the [Availability Distribution](availability-distribution.md) subsystem: validators will serve the availability chunks kept in the availability store to nodes who connect to them. And the subsystem will also implement the other side: the logic for nodes to connect to validators, request availability pieces, and reconstruct the `AvailableData`.
This subsystem is the inverse of the [Availability Distribution](availability-distribution.md) subsystem: validators
will serve the availability chunks kept in the availability store to nodes who connect to them. And the subsystem will
also implement the other side: the logic for nodes to connect to validators, request availability pieces, and
reconstruct the `AvailableData`.
This version of the availability recovery subsystem is based off of direct connections to validators. In order to recover any given `AvailableData`, we must recover at least `f + 1` pieces from validators of the session. Thus, we will connect to and query randomly chosen validators until we have received `f + 1` pieces.
This version of the availability recovery subsystem is based off of direct connections to validators. In order to
recover any given `AvailableData`, we must recover at least `f + 1` pieces from validators of the session. Thus, we will
connect to and query randomly chosen validators until we have received `f + 1` pieces.
## Protocol
@@ -10,18 +15,20 @@ This version of the availability recovery subsystem is based off of direct conne
Input:
- `NetworkBridgeUpdate(update)`
- `AvailabilityRecoveryMessage::RecoverAvailableData(candidate, session, backing_group, response)`
* `NetworkBridgeUpdate(update)`
* `AvailabilityRecoveryMessage::RecoverAvailableData(candidate, session, backing_group, response)`
Output:
- `NetworkBridge::SendValidationMessage`
- `NetworkBridge::ReportPeer`
- `AvailabilityStore::QueryChunk`
* `NetworkBridge::SendValidationMessage`
* `NetworkBridge::ReportPeer`
* `AvailabilityStore::QueryChunk`
## Functionality
We hold a state which tracks the currently ongoing recovery tasks, as well as which request IDs correspond to which task. A recovery task is a structure encapsulating all recovery tasks with the network necessary to recover the available data in respect to one candidate.
We hold a state which tracks the currently ongoing recovery tasks, as well as which request IDs correspond to which
task. A recovery task is a structure encapsulating all recovery tasks with the network necessary to recover the
available data in respect to one candidate.
```rust
struct State {
@@ -87,17 +94,22 @@ On `Conclude`, shut down the subsystem.
1. Check the `availability_lru` for the candidate and return the data if so.
1. Check if there is already an recovery handle for the request. If so, add the response handle to it.
1. Otherwise, load the session info for the given session under the state of `live_block_hash`, and initiate a recovery task with *`launch_recovery_task`*. Add a recovery handle to the state and add the response channel to it.
1. Otherwise, load the session info for the given session under the state of `live_block_hash`, and initiate a recovery
task with *`launch_recovery_task`*. Add a recovery handle to the state and add the response channel to it.
1. If the session info is not available, return `RecoveryError::Unavailable` on the response channel.
### Recovery logic
#### `launch_recovery_task(session_index, session_info, candidate_receipt, candidate_hash, Option<backing_group_index>)`
1. Compute the threshold from the session info. It should be `f + 1`, where `n = 3f + k`, where `k in {1, 2, 3}`, and `n` is the number of validators.
1. Set the various fields of `RecoveryParams` based on the validator lists in `session_info` and information about the candidate.
1. If the `backing_group_index` is `Some`, start in the `RequestFromBackers` phase with a shuffling of the backing group validator indices and a `None` requesting value.
1. Otherwise, start in the `RequestChunksFromValidators` source with `received_chunks`,`requesting_chunks`, and `next_shuffling` all empty.
1. Compute the threshold from the session info. It should be `f + 1`, where `n = 3f + k`, where `k in {1, 2, 3}`, and
`n` is the number of validators.
1. Set the various fields of `RecoveryParams` based on the validator lists in `session_info` and information about the
candidate.
1. If the `backing_group_index` is `Some`, start in the `RequestFromBackers` phase with a shuffling of the backing group
validator indices and a `None` requesting value.
1. Otherwise, start in the `RequestChunksFromValidators` source with `received_chunks`,`requesting_chunks`, and
`next_shuffling` all empty.
1. Set the `to_subsystems` sender to be equal to a clone of the `SubsystemContext`'s sender.
1. Initialize `received_chunks` to an empty set, as well as `requesting_chunks`.
@@ -115,19 +127,24 @@ const N_PARALLEL: usize = 50;
* Loop:
* If the `requesting_pov` is `Some`, poll for updates on it. If it concludes, set `requesting_pov` to `None`.
* If the `requesting_pov` is `None`, take the next backer off the `shuffled_backers`.
* If the backer is `Some`, issue a `NetworkBridgeMessage::Requests` with a network request for the `AvailableData` and wait for the response.
* If the backer is `Some`, issue a `NetworkBridgeMessage::Requests` with a network request for the
`AvailableData` and wait for the response.
* If it concludes with a `None` result, return to beginning.
* If it concludes with available data, attempt a re-encoding.
* If it has the correct erasure-root, break and issue a `Ok(available_data)`.
* If it has an incorrect erasure-root, return to beginning.
* Send the result to each member of `awaiting`.
* If the backer is `None`, set the source to `RequestChunksFromValidators` with a random shuffling of validators and empty `received_chunks`, and `requesting_chunks` and break the loop.
* If the backer is `None`, set the source to `RequestChunksFromValidators` with a random shuffling of validators
and empty `received_chunks`, and `requesting_chunks` and break the loop.
* If the task contains `RequestChunksFromValidators`:
* Request `AvailabilityStoreMessage::QueryAllChunks`. For each chunk that exists, add it to `received_chunks` and remote the validator from `shuffling`.
* Request `AvailabilityStoreMessage::QueryAllChunks`. For each chunk that exists, add it to `received_chunks` and
remote the validator from `shuffling`.
* Loop:
* If `received_chunks + requesting_chunks + shuffling` lengths are less than the threshold, break and return `Err(Unavailable)`.
* Poll for new updates from `requesting_chunks`. Check merkle proofs of any received chunks. If the request simply fails due to network issues, insert into the front of `shuffling` to be retried.
* If `received_chunks + requesting_chunks + shuffling` lengths are less than the threshold, break and return
`Err(Unavailable)`.
* Poll for new updates from `requesting_chunks`. Check merkle proofs of any received chunks. If the request simply
fails due to network issues, insert into the front of `shuffling` to be retried.
* If `received_chunks` has more than `threshold` entries, attempt to recover the data.
* If that fails, return `Err(RecoveryError::Invalid)`
* If correct:
@@ -135,5 +152,6 @@ const N_PARALLEL: usize = 50;
* break and issue `Ok(available_data)`
* Send the result to each member of `awaiting`.
* While there are fewer than `N_PARALLEL` entries in `requesting_chunks`,
* Pop the next item from `shuffling`. If it's empty and `requesting_chunks` is empty, return `Err(RecoveryError::Unavailable)`.
* Pop the next item from `shuffling`. If it's empty and `requesting_chunks` is empty, return
`Err(RecoveryError::Unavailable)`.
* Issue a `NetworkBridgeMessage::Requests` and wait for the response in `requesting_chunks`.
@@ -1,34 +1,40 @@
# Bitfield Distribution
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based on a 2/3+ quorum.
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a
single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based
on a 2/3+ quorum.
## Protocol
`PeerSet`: `Validation`
Input:
[`BitfieldDistributionMessage`](../../types/overseer-protocol.md#bitfield-distribution-message) which are gossiped to all peers, no matter if validator or not.
Input: [`BitfieldDistributionMessage`](../../types/overseer-protocol.md#bitfield-distribution-message) which are
gossiped to all peers, no matter if validator or not.
Output:
- `NetworkBridge::SendValidationMessage([PeerId], message)` gossip a verified incoming bitfield on to interested subsystems within this validator node.
- `NetworkBridge::ReportPeer(PeerId, cost_or_benefit)` improve or penalize the reputation of peers based on the messages that are received relative to the current view.
- `ProvisionerMessage::ProvisionableData(ProvisionableData::Bitfield(relay_parent, SignedAvailabilityBitfield))` pass
on the bitfield to the other submodules via the overseer.
- `NetworkBridge::SendValidationMessage([PeerId], message)` gossip a verified incoming bitfield on to interested
subsystems within this validator node.
- `NetworkBridge::ReportPeer(PeerId, cost_or_benefit)` improve or penalize the reputation of peers based on the messages
that are received relative to the current view.
- `ProvisionerMessage::ProvisionableData(ProvisionableData::Bitfield(relay_parent, SignedAvailabilityBitfield))` pass on
the bitfield to the other submodules via the overseer.
## Functionality
This is implemented as a gossip system.
It is necessary to track peer connection, view change, and disconnection events, in order to maintain an index of which peers are interested in which relay parent bitfields.
It is necessary to track peer connection, view change, and disconnection events, in order to maintain an index of which
peers are interested in which relay parent bitfields.
Before gossiping incoming bitfields, they must be checked to be signed by one of the validators
of the validator set relevant to the current relay parent.
Only accept bitfields relevant to our current view and only distribute bitfields to other peers when relevant to their most recent view.
Accept and distribute only one bitfield per validator.
Before gossiping incoming bitfields, they must be checked to be signed by one of the validators of the validator set
relevant to the current relay parent. Only accept bitfields relevant to our current view and only distribute bitfields
to other peers when relevant to their most recent view. Accept and distribute only one bitfield per validator.
When receiving a bitfield either from the network or from a `DistributeBitfield` message, forward it along to the block authorship (provisioning) subsystem for potential inclusion in a block.
When receiving a bitfield either from the network or from a `DistributeBitfield` message, forward it along to the block
authorship (provisioning) subsystem for potential inclusion in a block.
Peers connecting after a set of valid bitfield gossip messages was received, those messages must be cached and sent upon connection of new peers or re-connecting peers.
Peers connecting after a set of valid bitfield gossip messages was received, those messages must be cached and sent upon
connection of new peers or re-connecting peers.
@@ -1,12 +1,15 @@
# Bitfield Signing
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based on a 2/3+ quorum.
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a
single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based
on a 2/3+ quorum.
## Protocol
Input:
There is no dedicated input mechanism for bitfield signing. Instead, Bitfield Signing produces a bitfield representing the current state of availability on `StartWork`.
There is no dedicated input mechanism for bitfield signing. Instead, Bitfield Signing produces a bitfield representing
the current state of availability on `StartWork`.
Output:
@@ -15,15 +18,20 @@ Output:
## Functionality
Upon receipt of an `ActiveLeavesUpdate`, launch bitfield signing job for each `activated` head referring to a fresh leaf. Stop the job for each `deactivated` head.
Upon receipt of an `ActiveLeavesUpdate`, launch bitfield signing job for each `activated` head referring to a fresh
leaf. Stop the job for each `deactivated` head.
## Bitfield Signing Job
Localized to a specific relay-parent `r`
If not running as a validator, do nothing.
Localized to a specific relay-parent `r` If not running as a validator, do nothing.
- For each fresh leaf, begin by waiting a fixed period of time so availability distribution has the chance to make candidates available.
- Determine our validator index `i`, the set of backed candidates pending availability in `r`, and which bit of the bitfield each corresponds to.
- Start with an empty bitfield. For each bit in the bitfield, if there is a candidate pending availability, query the [Availability Store](../utility/availability-store.md) for whether we have the availability chunk for our validator index. The `OccupiedCore` struct contains the candidate hash so the full candidate does not need to be fetched from runtime.
- For each fresh leaf, begin by waiting a fixed period of time so availability distribution has the chance to make
candidates available.
- Determine our validator index `i`, the set of backed candidates pending availability in `r`, and which bit of the
bitfield each corresponds to.
- Start with an empty bitfield. For each bit in the bitfield, if there is a candidate pending availability, query the
[Availability Store](../utility/availability-store.md) for whether we have the availability chunk for our validator
index. The `OccupiedCore` struct contains the candidate hash so the full candidate does not need to be fetched from
runtime.
- For all chunks we have, set the corresponding bit in the bitfield.
- Sign the bitfield and dispatch a `BitfieldDistribution::DistributeBitfield` message.