Request based availability distribution (#2423)

* WIP
* availability distribution, still very wip. Work on the requesting side of things.
* Some docs on what I intend to do.
* Checkpoint of session cache implementation as I will likely replace it with something smarter.
* More work, mostly on cache and getting things to type check.
* Only derive MallocSizeOf and Debug for std.
* availability-distribution: Cache feature complete.
* Sketch out logic in `FetchTask` for actual fetching.
  - Compile fixes.
  - Cleanup.
* Format cleanup.
* More format fixes.
* Almost feature complete `fetch_task`. Missing:
  - Check for cancel
  - Actual querying of peer ids.
* Finish FetchTask so far.
* Directly use AuthorityDiscoveryId in protocol and cache.
* Resolve `AuthorityDiscoveryId` on sending requests.
* Rework fetch_task
  - also make it impossible to check the wrong chunk index.
  - Export needed function in validator_discovery.
* From<u32> implementation for `ValidatorIndex`.
* Fixes and more integration work.
* Make session cache proper lru cache.
* Use proper lru cache.
* Requester finished.
* ProtocolState -> Requester. Also make sure to not fetch our own chunk.
* Cleanup + fixes.
* Remove unused functions
  - FetchTask::is_finished
  - SessionCache::fetch_session_info
* availability-distribution responding side.
* Cleanup + Fixes.
* More fixes.
* More fixes. adder-collator is running!
* Some docs.
* Docs.
* Fix reporting of bad guys.
* Fix tests
* Make all tests compile.
* Fix test.
* Cleanup + get rid of some warnings.
* state -> requester
* Mostly doc fixes.
* Fix test suite.
* Get rid of now redundant message types.
* WIP
* Rob's review remarks.
* Fix test suite.
* core.relay_parent -> leaf for session request.
* Style fix.
* Decrease request timeout.
* Cleanup obsolete errors.
* Metrics + don't fail on non fatal errors.
* requester.rs -> requester/mod.rs
* Panic on invalid BadValidator report.
* Fix indentation.
* Use typed default timeout constant.
* Make channel size 0, as each sender gets one slot anyways.
* Fix incorrect metrics initialization.
* Fix build after merge.
* More fixes.
* Hopefully valid metrics names.
* Better metrics names.
* Some tests that already work.
* Slightly better docs.
* Some more tests.
* Fix network bridge test.
2026-07-16 21:15:43 +00:00 · 2021-02-26 18:58:07 +01:00
parent 241b1f12a7
commit 48409e5548
45 changed files with 2037 additions and 1523 deletions
@@ -0,0 +1,97 @@
+// Copyright 2021 Parity Technologies (UK) Ltd.
+// This file is part of Polkadot.
+
+// Polkadot is free software: you can redistribute it and/or modify
+// it under the terms of the GNU General Public License as published by
+// the Free Software Foundation, either version 3 of the License, or
+// (at your option) any later version.
+
+// Polkadot is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License
+// along with Polkadot.  If not, see <http://www.gnu.org/licenses/>.
+
+//! Answer requests for availability chunks.
+
+use futures::channel::oneshot;
+
+use polkadot_node_network_protocol::request_response::{request::IncomingRequest, v1};
+use polkadot_primitives::v1::{CandidateHash, ErasureChunk, ValidatorIndex};
+use polkadot_subsystem::{
+	messages::{AllMessages, AvailabilityStoreMessage},
+	SubsystemContext,
+};
+
+use crate::error::{Error, Result};
+use crate::{LOG_TARGET, metrics::{Metrics, SUCCEEDED, FAILED, NOT_FOUND}};
+
+/// Variant of `answer_request` that records Prometheus metrics and logs errors.
+///
+/// Any errors of `answer_request` will simply be logged.
+pub async fn answer_request_log<Context>(
+	ctx: &mut Context,
+	req: IncomingRequest<v1::AvailabilityFetchingRequest>,
+	metrics: &Metrics,
+)
+where
+	Context: SubsystemContext,
+{
+	let res = answer_request(ctx, req).await;
+	match res {
+		Ok(result) =>
+			metrics.on_served(if result {SUCCEEDED} else {NOT_FOUND}),
+		Err(err) => {
+			tracing::warn!(
+				target: LOG_TARGET,
+				err = ?err,
+				"Serving chunk failed with error"
+			);
+			metrics.on_served(FAILED);
+		}
+	}
+}
+
+/// Answer an incoming chunk request by querying the av store.
+///
+/// Returns `Ok(true)` if the chunk was found and served, `Ok(false)` if `NoSuchChunk` was sent instead.
+pub async fn answer_request<Context>(
+	ctx: &mut Context,
+	req: IncomingRequest<v1::AvailabilityFetchingRequest>,
+) -> Result<bool>
+where
+	Context: SubsystemContext,
+{
+	let chunk = query_chunk(ctx, req.payload.candidate_hash, req.payload.index).await?;
+
+	let result = chunk.is_some();
+
+	let response = match chunk {
+		None => v1::AvailabilityFetchingResponse::NoSuchChunk,
+		Some(chunk) => v1::AvailabilityFetchingResponse::Chunk(chunk.into()),
+	};
+
+	req.send_response(response).map_err(|_| Error::SendResponse)?;
+	Ok(result)
+}
+
+/// Query chunk from the availability store.
+#[tracing::instrument(level = "trace", skip(ctx), fields(subsystem = LOG_TARGET))]
+async fn query_chunk<Context>(
+	ctx: &mut Context,
+	candidate_hash: CandidateHash,
+	validator_index: ValidatorIndex,
+) -> Result<Option<ErasureChunk>>
+where
+	Context: SubsystemContext,
+{
+	let (tx, rx) = oneshot::channel();
+	ctx.send_message(AllMessages::AvailabilityStore(
+		AvailabilityStoreMessage::QueryChunk(candidate_hash, validator_index, tx),
+	))
+	.await;
+
+	rx.await.map_err(Error::QueryChunkResponseChannel)
+}
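
The responder flow in this file (look up the chunk, answer with either the chunk or `NoSuchChunk`, then record the outcome) can be sketched without any of the Polkadot crates. The following is a minimal standalone sketch, not the subsystem code: `Store`, `Request`, `Response`, `ServeError`, `Counters` and the label strings are all hypothetical stand-ins, and a `std::sync::mpsc` channel plays the role of the response sender.

```rust
use std::collections::HashMap;
use std::sync::mpsc;

// All names below are hypothetical stand-ins, not the Polkadot types.

#[derive(Debug, Clone, PartialEq)]
enum Response {
    Chunk(Vec<u8>),
    NoSuchChunk,
}

#[derive(Debug, PartialEq)]
enum ServeError {
    /// The requester dropped its receiving end before we could answer.
    SendResponse,
}

struct Request {
    candidate: u64,
    index: u32,
    /// Reply channel, standing in for the response sender of a real request.
    reply: mpsc::Sender<Response>,
}

/// Hypothetical in-memory store keyed by (candidate, chunk index).
type Store = HashMap<(u64, u32), Vec<u8>>;

/// Hypothetical outcome counters; the label strings are assumptions.
#[derive(Default)]
struct Counters {
    served: HashMap<&'static str, u64>,
}

impl Counters {
    fn on_served(&mut self, label: &'static str) {
        *self.served.entry(label).or_insert(0) += 1;
    }
}

/// Answer one request: `Ok(true)` if a chunk was served, `Ok(false)` if not found.
fn answer_request(store: &Store, req: Request) -> Result<bool, ServeError> {
    let chunk = store.get(&(req.candidate, req.index)).cloned();
    let found = chunk.is_some();
    let response = match chunk {
        Some(data) => Response::Chunk(data),
        None => Response::NoSuchChunk,
    };
    // A missing chunk is still a successfully answered request; only a
    // failed send is an error.
    req.reply.send(response).map_err(|_| ServeError::SendResponse)?;
    Ok(found)
}

/// Wrapper that never fails: errors are logged and counted instead.
fn answer_request_log(store: &Store, req: Request, counters: &mut Counters) {
    match answer_request(store, req) {
        Ok(true) => counters.on_served("succeeded"),
        Ok(false) => counters.on_served("not-found"),
        Err(err) => {
            eprintln!("Serving chunk failed with error: {:?}", err);
            counters.on_served("failed");
        }
    }
}

fn main() {
    let mut store = Store::new();
    store.insert((1, 0), vec![0xAA, 0xBB]);
    let mut counters = Counters::default();

    let (tx, rx) = mpsc::channel();
    answer_request_log(&store, Request { candidate: 1, index: 0, reply: tx }, &mut counters);
    println!("response: {:?}, counters: {:?}", rx.recv(), counters.served);
}
```

The split mirrors the design in the diff: the inner function reports "not found" as a normal `Ok(false)` result so that the wrapper can distinguish it from a genuine failure when updating metrics.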