mirror of
https://github.com/pezkuwichain/pezkuwi-subxt.git
synced 2026-05-31 01:41:03 +00:00
Retry availability until the receiver of the request is dropped (#2763)
* guide updates * keep interactions alive until receivers drop * retry indefinitely * cancel approval tasks on finality * use swap_remove instead of remove
This commit is contained in:
committed by
GitHub
parent
6514e00144
commit
08d5b268a0
+21
-28
@@ -25,31 +25,27 @@ We hold a state which tracks the current recovery interactions we have live, as
|
||||
|
||||
```rust
|
||||
struct State {
|
||||
/// Each interaction is implemented as its own async task, and these handles are for communicating with them.
|
||||
interactions: Map<CandidateHash, InteractionHandle>,
|
||||
/// Each interaction is implemented as its own remote async task, and these handles are remote
|
||||
/// for it.
|
||||
interactions: FuturesUnordered<InteractionHandle>,
|
||||
/// A multiplexer over receivers from live interactions.
|
||||
interaction_receivers: FuturesUnordered<ResponseReceiver<Concluded>>,
|
||||
/// A recent block hash for which state should be available.
|
||||
live_block_hash: Hash,
|
||||
|
||||
/// interaction communication. This is cloned and given to interactions that are spun up.
|
||||
from_interaction_tx: Sender<FromInteraction>,
|
||||
/// receiver for messages from interactions.
|
||||
from_interaction_rx: Receiver<FromInteraction>,
|
||||
|
||||
// An LRU cache of recently recovered data.
|
||||
availability_lru: LruCache<CandidateHash, Result<AvailableData, RecoveryError>>,
|
||||
}
|
||||
|
||||
/// This is a future, which concludes either when a response is received from the interaction,
|
||||
/// or all the `awaiting` channels have closed.
|
||||
struct InteractionHandle {
|
||||
candidate_hash: CandidateHash,
|
||||
interaction_response: RemoteHandle<Concluded>,
|
||||
awaiting: Vec<ResponseChannel<Result<AvailableData, RecoveryError>>>,
|
||||
}
|
||||
|
||||
struct Unavailable;
|
||||
enum FromInteraction {
|
||||
// An interaction concluded.
|
||||
Concluded(CandidateHash, Result<AvailableData, RecoveryError>),
|
||||
// Send a request on the network.
|
||||
NetworkRequest(Requests),
|
||||
}
|
||||
struct Concluded(CandidateHash, Result<AvailableData, RecoveryError>);
|
||||
|
||||
struct InteractionParams {
|
||||
validator_authority_keys: Vec<AuthorityId>,
|
||||
@@ -71,12 +67,12 @@ enum InteractionPhase {
|
||||
// request the chunk from them.
|
||||
shuffling: Vec<ValidatorIndex>,
|
||||
received_chunks: Map<ValidatorIndex, ErasureChunk>,
|
||||
requesting_chunks: FuturesUnordered<Receiver<DataResponse<ErasureChunk>>>,
|
||||
requesting_chunks: FuturesUnordered<Receiver<ErasureChunkRequestResponse>>,
|
||||
}
|
||||
}
|
||||
|
||||
struct Interaction {
|
||||
to_state: Sender<FromInteraction>,
|
||||
to_subsystems: SubsystemSender,
|
||||
params: InteractionParams,
|
||||
phase: InteractionPhase,
|
||||
}
|
||||
@@ -104,10 +100,6 @@ On `Conclude`, shut down the subsystem.
|
||||
1. Load the entry from the `interactions` map. It should always exist, if not for logic errors. Send the result to each member of `awaiting`.
|
||||
1. Add the entry to the availability_lru.
|
||||
|
||||
#### `FromInteraction::NetworkRequest(requests)`
|
||||
|
||||
1. Forward with `NetworkBridgeMessage::SendRequests`.
|
||||
|
||||
### Interaction logic
|
||||
|
||||
#### `launch_interaction(session_index, session_info, candidate_receipt, candidate_hash, Option<backing_group_index>)`
|
||||
@@ -115,13 +107,13 @@ On `Conclude`, shut down the subsystem.
|
||||
1. Compute the threshold from the session info. It should be `f + 1`, where `n = 3f + k`, where `k in {1, 2, 3}`, and `n` is the number of validators.
|
||||
1. Set the various fields of `InteractionParams` based on the validator lists in `session_info` and information about the candidate.
|
||||
1. If the `backing_group_index` is `Some`, start in the `RequestFromBackers` phase with a shuffling of the backing group validator indices and a `None` requesting value.
|
||||
1. Otherwise, start in the `RequestChunks` phase with `received_chunks` and `requesting_chunks` both empty.
|
||||
1. Set the `to_state` sender to be equal to a clone of `state.from_interaction_tx`.
|
||||
1. Otherwise, start in the `RequestChunks` phase with `received_chunks`,`requesting_chunks`, and `next_shuffling` all empty.
|
||||
1. Set the `to_subsystems` sender to be equal to a clone of the `SubsystemContext`'s sender.
|
||||
1. Initialize `received_chunks` to an empty set, as well as `requesting_chunks`.
|
||||
|
||||
Launch the interaction as a background task running `interaction_loop(interaction)`.
|
||||
|
||||
#### `interaction_loop(interaction)`
|
||||
#### `interaction_loop(interaction) -> Result<AvailableData, RecoeryError>`
|
||||
|
||||
```rust
|
||||
// How many parallel requests to have going at once.
|
||||
@@ -135,13 +127,14 @@ Loop:
|
||||
* If the backer is `Some`, issue a `FromInteraction::NetworkRequest` with a network request for the `AvailableData` and wait for the response.
|
||||
* If it concludes with a `None` result, return to beginning.
|
||||
* If it concludes with available data, attempt a re-encoding.
|
||||
* If it has the correct erasure-root, break and issue a `Concluded(Ok(available_data))`.
|
||||
* If it has the correct erasure-root, break and issue a `Ok(available_data)`.
|
||||
* If it has an incorrect erasure-root, issue a `FromInteraction::ReportPeer` message and return to beginning.
|
||||
* If the backer is `None`, set the phase to `InteractionPhase::RequestChunks` with a random shuffling of validators and empty `received_chunks` and `requesting_chunks`.
|
||||
* If the backer is `None`, set the phase to `InteractionPhase::RequestChunks` with a random shuffling of validators and empty `next_shuffling`, `received_chunks`, and `requesting_chunks`.
|
||||
|
||||
* If the phase is `InteractionPhase::RequestChunks`:
|
||||
* Poll for new updates from `requesting_chunks`. Check merkle proofs of any received chunks, and any failures should lead to issuance of a `FromInteraction::ReportPeer` message.
|
||||
* If `received_chunks` has more than `threshold` entries, attempt to recover the data. If that fails, or a re-encoding produces an incorrect erasure-root, break and issue a `Concluded(RecoveryError::Invalid)`. If correct, break and issue `Concluded(Ok(available_data))`.
|
||||
* If `received_chunks + requesting_chunks + shuffling` lengths are less than the threshold, break and return `Err(Unavailable)`.
|
||||
* Poll for new updates from `requesting_chunks`. Check merkle proofs of any received chunks. If the request simply fails due to network issues, push onto the back of `shuffling` to be retried.
|
||||
* If `received_chunks` has more than `threshold` entries, attempt to recover the data. If that fails, or a re-encoding produces an incorrect erasure-root, break and issue a `Err(RecoveryError::Invalid)`. If correct, break and issue `Ok(available_data)`.
|
||||
* While there are fewer than `N_PARALLEL` entries in `requesting_chunks`,
|
||||
* Pop the next item from `shuffling`. If it's empty and `requesting_chunks` is empty, break and set the phase to `Concluded(None)`.
|
||||
* Pop the next item from `shuffling`. If it's empty and `requesting_chunks` is empty, return `Err(RecoveryError::Unavailable)`.
|
||||
* Issue a `FromInteraction::NetworkRequest` and wait for the response in `requesting_chunks`.
|
||||
|
||||
Reference in New Issue
Block a user