Request based availability distribution (#2423)

* WIP

* availability distribution, still very wip.

Work on the requesting side of things.

* Some docs on what I intend to do.

* Checkpoint of session cache implementation

as I will likely replace it with something smarter.

* More work, mostly on cache

and getting things to type check.

* Only derive MallocSizeOf and Debug for std.

* availability-distribution: Cache feature complete.

* Sketch out logic in `FetchTask` for actual fetching.

- Compile fixes.
- Cleanup.

* Format cleanup.

* More format fixes.

* Almost feature complete `fetch_task`.

Missing:

- Check for cancel
- Actual querying of peer ids.

* Finish FetchTask so far.

* Directly use AuthorityDiscoveryId in protocol and cache.

* Resolve `AuthorityDiscoveryId` on sending requests.

* Rework fetch_task

- also make it impossible to check the wrong chunk index.
- Export needed function in validator_discovery.

* From<u32> implementation for `ValidatorIndex`.

* Fixes and more integration work.

* Make session cache proper lru cache.

* Use proper lru cache.

* Requester finished.

* ProtocolState -> Requester

Also make sure to not fetch our own chunk.

* Cleanup + fixes.

* Remove unused functions

- FetchTask::is_finished
- SessionCache::fetch_session_info

* availability-distribution responding side.

* Cleanup + Fixes.

* More fixes.

* More fixes.

adder-collator is running!

* Some docs.

* Docs.

* Fix reporting of bad guys.

* Fix tests

* Make all tests compile.

* Fix test.

* Cleanup + get rid of some warnings.

* state -> requester

* Mostly doc fixes.

* Fix test suite.

* Get rid of now redundant message types.

* WIP

* Rob's review remarks.

* Fix test suite.

* core.relay_parent -> leaf for session request.

* Style fix.

* Decrease request timeout.

* Cleanup obsolete errors.

* Metrics + don't fail on non fatal errors.

* requester.rs -> requester/mod.rs

* Panic on invalid BadValidator report.

* Fix indentation.

* Use typed default timeout constant.

* Make channel size 0, as each sender gets one slot anyways.

* Fix incorrect metrics initialization.

* Fix build after merge.

* More fixes.

* Hopefully valid metrics names.

* Better metrics names.

* Some tests that already work.

* Slightly better docs.

* Some more tests.

* Fix network bridge test.
This commit is contained in:
Robert Klotzner
2021-02-26 18:58:07 +01:00
committed by GitHub
parent 241b1f12a7
commit 48409e5548
45 changed files with 2037 additions and 1523 deletions
+45 -5
View File
@@ -17,6 +17,7 @@
use std::pin::Pin;
use std::sync::Arc;
use async_trait::async_trait;
use futures::future::BoxFuture;
use futures::prelude::*;
use futures::stream::BoxStream;
@@ -24,7 +25,7 @@ use futures::stream::BoxStream;
use parity_scale_codec::Encode;
use sc_network::Event as NetworkEvent;
use sc_network::{NetworkService, IfDisconnected};
use sc_network::{IfDisconnected, NetworkService, OutboundFailure, RequestFailure};
use polkadot_node_network_protocol::{
peer_set::PeerSet,
@@ -34,6 +35,8 @@ use polkadot_node_network_protocol::{
use polkadot_primitives::v1::{Block, Hash};
use polkadot_subsystem::{SubsystemError, SubsystemResult};
use crate::validator_discovery::{peer_id_from_multiaddr, AuthorityDiscovery};
use super::LOG_TARGET;
/// Send a message to the network.
@@ -92,6 +95,7 @@ pub enum NetworkAction {
}
/// An abstraction over networking for the purposes of this subsystem.
#[async_trait]
pub trait Network: Send + 'static {
/// Get a stream of all events occurring on the network. This may include events unrelated
/// to the Polkadot protocol - the user of this function should filter only for events related
@@ -105,7 +109,11 @@ pub trait Network: Send + 'static {
) -> Pin<Box<dyn Sink<NetworkAction, Error = SubsystemError> + Send + 'a>>;
/// Send a request to a remote peer.
fn start_request(&self, req: Requests);
async fn start_request<AD: AuthorityDiscovery>(
&self,
authority_discovery: &mut AD,
req: Requests,
);
/// Report a given peer as either beneficial (+) or costly (-) according to the given scalar.
fn report_peer(
@@ -137,6 +145,7 @@ pub trait Network: Send + 'static {
}
}
#[async_trait]
impl Network for Arc<NetworkService<Block, Hash>> {
fn event_stream(&mut self) -> BoxStream<'static, NetworkEvent> {
NetworkService::event_stream(self, "polkadot-network-bridge").boxed()
@@ -189,7 +198,11 @@ impl Network for Arc<NetworkService<Block, Hash>> {
Box::pin(ActionSink(&**self))
}
fn start_request(&self, req: Requests) {
async fn start_request<AD: AuthorityDiscovery>(
&self,
authority_discovery: &mut AD,
req: Requests,
) {
let (
protocol,
OutgoingRequest {
@@ -199,8 +212,35 @@ impl Network for Arc<NetworkService<Block, Hash>> {
},
) = req.encode_request();
NetworkService::start_request(&*self,
peer,
let peer_id = authority_discovery
.get_addresses_by_authority_id(peer)
.await
.and_then(|addrs| {
addrs
.into_iter()
.find_map(|addr| peer_id_from_multiaddr(&addr))
});
let peer_id = match peer_id {
None => {
tracing::debug!(target: LOG_TARGET, "Discovering authority failed");
match pending_response
.send(Err(RequestFailure::Network(OutboundFailure::DialFailure)))
{
Err(_) => tracing::debug!(
target: LOG_TARGET,
"Sending failed request response failed."
),
Ok(_) => {}
}
return;
}
Some(peer_id) => peer_id,
};
NetworkService::start_request(
&*self,
peer_id,
protocol.into_protocol_name(),
payload,
pending_response,