mirror of
https://github.com/pezkuwichain/pezkuwi-subxt.git
synced 2026-07-03 22:47:25 +00:00
Request based PoV distribution (#2640)
* Indentation fix.
* Prepare request-response for PoV fetching.
* Drop old PoV distribution.
* WIP: Fetch PoV directly from backing.
* Backing compiles.
* Runtime access and connection management for PoV distribution.
* Get rid of seemingly dead code.
* Implement PoV fetching. Backing does not yet use it.
* Don't send `ConnectToValidators` for empty list.
* Even better - no need to check over and over again.
* PoV fetching implemented.

+ Typechecks
+ Should work

Missing:

- Guide
- Tests
- Do fallback fetching in case fetching from seconding validator fails.

* Check PoV hash upon reception.
* Implement retry of PoV fetching in backing.
* Avoid pointless validation spawning.
* Add jaeger span to pov requesting.
* Add back tracing.
* Review remarks.
* Whitespace.
* Whitespace again.
* Cleanup + fix tests.
* Log to log target in overseer.
* Fix more tests.
* Don't fail if group cannot be found.
* Simple test for PoV fetcher.
* Handle missing group membership better.
* Add test for retry functionality.
* Fix flaky test.
* Spaces again.
* Guide updates.
* Spaces.
+52
-18
@@ -1,49 +1,79 @@
 # Availability Distribution

-Distribute availability erasure-coded chunks to validators.
+This subsystem is responsible for distributing availability data to peers.
+Availability data are chunks, `PoV`s and `AvailableData` (which is `PoV` +
+`PersistedValidationData`). It does so via request/response protocols.

-After a candidate is backed, the availability of the PoV block must be confirmed
-by 2/3+ of all validators. Backing nodes will serve chunks for a PoV block from
-their [Availability Store](../utility/availability-store.md), all other
-validators request their chunks from backing nodes and store those received chunks in
-their local availability store.
+In particular this subsystem is responsible for:
+
+- Responding to network requests for availability data by querying the
+  [Availability Store](../utility/availability-store.md).
+- Requesting chunks from backing validators and putting them in the local
+  `Availability Store` whenever we find an occupied core on the chain. This
+  happens after a candidate is backed and ensures availability by 2/3+ of
+  all validators.
+- Fetching `PoV`s from validators when requested via a `FetchPoV` message from
+  backing (`pov_requester` module).
+
+The backing subsystem is responsible for making available data available in the
+local `Availability Store` upon validation. This subsystem will serve any
+network requests by querying that store.

 ## Protocol

-This subsystem has no associated peer set right now, but instead relies on
-a request/response protocol, defined by `Protocol::ChunkFetching`.
+This subsystem does not handle any peer set messages, but the `pov_requester`
+does connect to validators of the same backing group on the validation peer
+set, both to ensure fast propagation of statements between those validators and
+to have connections already established when requesting `PoV`s. Other than
+that, this subsystem drives request/response protocols.

 Input:

 - OverseerSignal::ActiveLeaves(`[ActiveLeavesUpdate]`)
 - AvailabilityDistributionMessage{msg: ChunkFetchingRequest}
+- AvailabilityDistributionMessage{msg: PoVFetchingRequest}
+- AvailabilityDistributionMessage{msg: FetchPoV}

 Output:

 - NetworkBridgeMessage::SendRequests(`[Requests]`, IfDisconnected::TryConnect)
 - AvailabilityStore::QueryChunk(candidate_hash, index, response_channel)
 - AvailabilityStore::StoreChunk(candidate_hash, chunk)
+- AvailabilityStore::QueryAvailableData(candidate_hash, response_channel)
 - RuntimeApiRequest::SessionIndexForChild
 - RuntimeApiRequest::SessionInfo
 - RuntimeApiRequest::AvailabilityCores

 ## Functionality

-### Requesting
+### PoV Requester

-This subsystems monitors currently occupied cores for all active leaves. For
-each occupied core it will spawn a task fetching the erasure chunk which has the
-`ValidatorIndex` of the node. For this an `ChunkFetchingRequest` is
-issued, via substrate's generic request/response protocol.
+The PoV requester in the `pov_requester` module takes care of staying connected
+to validators of the current backing group of this very validator on the `Validation`
+peer set and it will handle `FetchPoV` requests by issuing network requests to
+those validators. It will check the hash of the received `PoV`, but will not do any
+further validation. That needs to be done by the original `FetchPoV` sender
+(backing subsystem).
+
+### Chunk Requester
+
+After a candidate is backed, the availability of the PoV block must be confirmed
+by 2/3+ of all validators. The chunk requester is responsible for making that
+availability a reality.
+
+It does that by checking occupied cores for all active leaves. For each
+occupied core it will spawn a task fetching the erasure chunk which has the
+`ValidatorIndex` of the node. For this a `ChunkFetchingRequest` is issued, via
+substrate's generic request/response protocol.

 The spawned task will start trying to fetch the chunk from validators in
 responsible group of the occupied core, in a random order. For ensuring that we
-use already open TCP connections wherever possible, the subsystem maintains a
+use already open TCP connections wherever possible, the requester maintains a
 cache and preserves that random order for the entire session.

 Note however that, because not all validators in a group have to be actual
 backers, not all of them are required to have the needed chunk. This in turn
-could lead to low throughput, as we have to wait for a fetches to fail,
+could lead to low throughput, as we have to wait for fetches to fail,
 before reaching a validator finally having our chunk. We do rank back validators
 not delivering our chunk, but as backers could vary from block to block on a
 perfectly legitimate basis, this is still not ideal. See issues [2509](https://github.com/paritytech/polkadot/issues/2509) and [2512](https://github.com/paritytech/polkadot/issues/2512)
@@ -59,6 +89,10 @@ as we would like as many validators as possible to have their chunk. See this

 ### Serving

-On the other side the subsystem will listen for incoming
-`ChunkFetchingRequest`s from the network bridge and will respond to
-queries, by looking the requested chunk up in the availability store.
+On the other side the subsystem will listen for incoming `ChunkFetchingRequest`s
+and `PoVFetchingRequest`s from the network bridge and will respond to queries,
+by looking the requested chunks and `PoV`s up in the availability store. This
+happens in the `responder` module.
+
+We rely on the backing subsystem to make available data available locally in the
+`Availability Store` after it has validated it.
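
The guide text in this diff says the PoV requester checks the hash of a received `PoV` and leaves any further validation to the `FetchPoV` sender. A minimal sketch of that check follows. It is not the code from this commit: the `PoV` struct, `pov_hash`, `check_received` and `FetchError` are hypothetical names made up for illustration, and the standard library's `DefaultHasher` merely stands in for the cryptographic hash a real node would use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the real `PoV` type.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct PoV {
    block_data: Vec<u8>,
}

/// Illustrative hash only (assumption): a real node would use a
/// cryptographic hash, not the standard library's `DefaultHasher`.
fn pov_hash(pov: &PoV) -> u64 {
    let mut hasher = DefaultHasher::new();
    pov.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical error type for a rejected response.
#[derive(Debug, PartialEq, Eq)]
enum FetchError {
    /// The peer answered, but the payload does not hash to the requested value.
    HashMismatch,
}

/// Accept a received PoV only if it hashes to the value that was requested.
/// No further validation happens here; that stays with the original requester.
fn check_received(expected: u64, received: PoV) -> Result<PoV, FetchError> {
    if pov_hash(&received) == expected {
        Ok(received)
    } else {
        Err(FetchError::HashMismatch)
    }
}

fn main() {
    let pov = PoV { block_data: vec![1, 2, 3] };
    let expected = pov_hash(&pov);

    // A matching PoV is handed back to the requester unchanged.
    assert_eq!(check_received(expected, pov.clone()), Ok(pov));

    // A PoV with different contents is rejected.
    let other = PoV { block_data: vec![9, 9, 9] };
    assert_eq!(check_received(expected, other), Err(FetchError::HashMismatch));
}
```

Keeping the check this narrow mirrors the split described in the guide: the requester only guards against receiving the wrong data, while the backing subsystem remains responsible for validating it.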