Fix cycle dispute-coordinator <-> dispute-distribution (#6489)

* First iteration of message sender.

* dyn Fn variant (no cloning)

* Full implementation + Clone, without allocs on `Send`

* Further clarifications/cleanup.

* MessageSender -> NestingSender

* Doc update/clarification.

* dispute-coordinator: Send disputes on startup.

+ Some fixes, cleanup.

* Fix whitespace.

* Dispute distribution fixes, cleanup.

* Cargo.lock

* Fix spaces.

* More format fixes.

What is cargo fmt doing actually?

* More fmt fixes.

* Fix nesting sender.

* Fixes.

* Whitespace

* Enable logging.

* Guide update.

* Fmt fixes, typos.

* Remove unused function.

* Simplifications, doc fixes.

* Update roadmap/implementers-guide/src/node/disputes/dispute-coordinator.md

Co-authored-by: Marcin S. <marcin@bytedude.com>

* Fmt + doc example fix.

Co-authored-by: eskimor <eskimor@no-such-url.com>
Co-authored-by: Marcin S. <marcin@bytedude.com>
This commit is contained in:
eskimor
2023-01-10 12:04:05 +01:00
committed by GitHub
parent 44fd95661c
commit cc650fe53d
16 changed files with 778 additions and 642 deletions
@@ -29,6 +29,7 @@ use std::{num::NonZeroUsize, time::Duration};
use futures::{channel::mpsc, FutureExt, StreamExt, TryFutureExt};
use polkadot_node_network_protocol::authority_discovery::AuthorityDiscovery;
use polkadot_node_subsystem_util::nesting_sender::NestingSender;
use sp_keystore::SyncCryptoStorePtr;
use polkadot_node_network_protocol::request_response::{incoming::IncomingRequestReceiver, v1};
@@ -51,33 +52,33 @@ use polkadot_node_subsystem_util::{runtime, runtime::RuntimeInfo};
/// to this subsystem, unknown dispute. This is to make sure, we get our vote out, even on
/// restarts.
///
/// The actual work of sending and keeping track of transmission attempts to each validator for a
/// particular dispute are done by [`SendTask`]. The purpose of the `DisputeSender` is to keep
/// track of all ongoing disputes and start and clean up `SendTask`s accordingly.
/// The actual work of sending and keeping track of transmission attempts to each validator for a
/// particular dispute are done by [`SendTask`]. The purpose of the `DisputeSender` is to keep
/// track of all ongoing disputes and start and clean up `SendTask`s accordingly.
mod sender;
use self::sender::{DisputeSender, TaskFinish};
use self::sender::{DisputeSender, DisputeSenderMessage};
/// ## The receiver [`DisputesReceiver`]
/// ## The receiver [`DisputesReceiver`]
///
/// The receiving side is implemented as `DisputesReceiver` and is run as a separate long running task within
/// this subsystem ([`DisputesReceiver::run`]).
/// The receiving side is implemented as `DisputesReceiver` and is run as a separate long running task within
/// this subsystem ([`DisputesReceiver::run`]).
///
/// Conceptually all the receiver has to do, is waiting for incoming requests which are passed in
/// via a dedicated channel and forwarding them to the dispute coordinator via
/// `DisputeCoordinatorMessage::ImportStatements`. Being the interface to the network and untrusted
/// nodes, the reality is not that simple of course. Before importing statements the receiver will
/// batch up imports as well as possible for efficient imports while maintaining timely dispute
/// resolution and handling of spamming validators:
/// Conceptually all the receiver has to do, is waiting for incoming requests which are passed in
/// via a dedicated channel and forwarding them to the dispute coordinator via
/// `DisputeCoordinatorMessage::ImportStatements`. Being the interface to the network and untrusted
/// nodes, the reality is not that simple of course. Before importing statements the receiver will
/// batch up imports as well as possible for efficient imports while maintaining timely dispute
/// resolution and handling of spamming validators:
///
/// - Drop all messages from non validator nodes, for this it requires the [`AuthorityDiscovery`]
/// service.
/// - Drop messages from a node, if it sends at a too high rate.
/// - Filter out duplicate messages (over some period of time).
/// - Drop any obviously invalid votes (invalid signatures for example).
/// - Ban peers whose votes were deemed invalid.
/// - Drop all messages from non validator nodes, for this it requires the [`AuthorityDiscovery`]
/// service.
/// - Drop messages from a node, if it sends at a too high rate.
/// - Filter out duplicate messages (over some period of time).
/// - Drop any obviously invalid votes (invalid signatures for example).
/// - Ban peers whose votes were deemed invalid.
///
/// In general dispute-distribution works on limiting the work the dispute-coordinator will have to
/// do, while at the same time making it aware of new disputes as fast as possible.
/// In general dispute-distribution works on limiting the work the dispute-coordinator will have to
/// do, while at the same time making it aware of new disputes as fast as possible.
///
/// For successfully imported votes, we will confirm the receipt of the message back to the sender.
/// This way a received confirmation guarantees, that the vote has been stored to disk by the
@@ -87,7 +88,7 @@ use self::receiver::DisputesReceiver;
/// Error and [`Result`] type for this subsystem.
mod error;
use error::{log_error, FatalError, FatalResult, Result};
use error::{log_error, Error, FatalError, FatalResult, Result};
#[cfg(test)]
mod tests;
@@ -118,10 +119,10 @@ pub struct DisputeDistributionSubsystem<AD> {
runtime: RuntimeInfo,
/// Sender for our dispute requests.
disputes_sender: DisputeSender,
disputes_sender: DisputeSender<DisputeSenderMessage>,
/// Receive messages from `SendTask`.
sender_rx: mpsc::Receiver<TaskFinish>,
/// Receive messages from `DisputeSender` background tasks.
sender_rx: mpsc::Receiver<DisputeSenderMessage>,
/// Receiver for incoming requests.
req_receiver: Option<IncomingRequestReceiver<v1::DisputeRequest>>,
@@ -167,7 +168,7 @@ where
session_cache_lru_size: NonZeroUsize::new(DISPUTE_WINDOW.get() as usize)
.expect("Dispute window can not be 0; qed"),
});
let (tx, sender_rx) = mpsc::channel(1);
let (tx, sender_rx) = NestingSender::new_root(1);
let disputes_sender = DisputeSender::new(tx, metrics.clone());
Self {
runtime,
@@ -216,9 +217,16 @@ where
log_error(result, "on FromOrchestra")?;
},
MuxedMessage::Sender(result) => {
self.disputes_sender
.on_task_message(result.ok_or(FatalError::SenderExhausted)?)
.await;
let result = self
.disputes_sender
.on_message(
&mut ctx,
&mut self.runtime,
result.ok_or(FatalError::SenderExhausted)?,
)
.await
.map_err(Error::Sender);
log_error(result, "on_message")?;
},
}
}
@@ -260,14 +268,14 @@ enum MuxedMessage {
/// Messages from other subsystems.
Subsystem(FatalResult<FromOrchestra<DisputeDistributionMessage>>),
/// Messages from spawned sender background tasks.
Sender(Option<TaskFinish>),
Sender(Option<DisputeSenderMessage>),
}
#[overseer::contextbounds(DisputeDistribution, prefix = self::overseer)]
impl MuxedMessage {
async fn receive<Context>(
ctx: &mut Context,
from_sender: &mut mpsc::Receiver<TaskFinish>,
from_sender: &mut mpsc::Receiver<DisputeSenderMessage>,
) -> Self {
// We are only fusing here to make `select` happy, in reality we will quit if the stream
// ends.