Req/res optimization for statement distribution (#2803)

* Wip
* Increase proposer timeout.
* WIP.
* Better timeout values now that we are going to be connected to all nodes. (#2778)
* Better timeout values.
* Fix typo.
* Fix validator bandwidth.
* Fix compilation.
* Better and more consistent sizes. Most importantly code size is now 5 Meg, which is the limit we currently want to support in statement distribution.
* Introduce statement fetching request.
* WIP
* Statement cache retrieval logic.
* Review remarks by @rphmeier
* Fixes.
* Better requester logic.
* WIP: Handle requester messages.
* Missing dep.
* Fix request launching logic.
* Finish fetching logic.
* Sending logic.
* Redo code size calculations. Now that max code size is compressed size.
* Update Cargo.lock (new dep)
* Get request receiver to statement distribution.
* Expose new functionality for responding to requests.
* Cleanup.
* Responder logic.
* Fixes + Cleanup.
* Cargo.lock
* Whitespace.
* Add lost copyright.
* Launch responder task.
* Typo.
* info -> warn
* Typo.
* Fix.
* Fix.
* Update comment.
* Doc fix.
* Better large statement heuristics.
* Fix tests.
* Fix network bridge tests.
* Add test for size estimate.
* Very simple tests that checks we get LargeStatement.
* Basic check, that fetching of large candidates is performed.
* More tests.
* Basic metrics for responder.
* More metrics.
* Use Encode::encoded_size().
* Some useful spans.
* Get rid of redundant metrics.
* Don't add peer on duplicate.
* Properly check hash instead of relying on signatures alone.
* Preserve ordering + better flood protection.
* Get rid of redundant clone.
* Don't shutdown responder on failed query. And add test for this.
* Smaller fixes.
* Quotes.
* Better queue size calculation.
* A bit saner response sizes.
* Fixes.
2026-04-27 03:27:58 +00:00 · 2021-04-09 23:30:12 +02:00
parent 69bd6d8ef2
commit 305375e1e4
19 changed files with 1711 additions and 190 deletions
@@ -32,11 +32,12 @@
 //!
 //!  Versioned (v1 module): The actual requests and responses as sent over the network.

-use std::borrow::Cow;
+use std::{borrow::Cow, u64};
 use std::time::Duration;

 use futures::channel::mpsc;
 use polkadot_node_primitives::MAX_POV_SIZE;
+use polkadot_primitives::v1::MAX_CODE_SIZE;
 use strum::EnumIter;

 pub use sc_network::config as network;
@@ -64,8 +65,15 @@ pub enum Protocol {
 	PoVFetching,
 	/// Protocol for fetching available data.
 	AvailableDataFetching,
+	/// Fetching of statements that are too large for gossip.
+	StatementFetching,
 }

+
+/// Minimum bandwidth we expect for validators - 500Mbit/s is the recommendation, so approximately
+/// 50 MiB per second:
+const MIN_BANDWIDTH_BYTES: u64 = 50 * 1024 * 1024;
+
 /// Default request timeout in seconds.
 ///
 /// When decreasing this value, take into account that the very first request might need to open a
@@ -78,14 +86,22 @@ const DEFAULT_REQUEST_TIMEOUT: Duration = Duration::from_secs(3);
 /// peer set as well).
 const DEFAULT_REQUEST_TIMEOUT_CONNECTED: Duration = Duration::from_secs(1);

-/// Minimum bandwidth we expect for validators - 500Mbit/s is the recommendation, so approximately
-/// 50Meg bytes per second:
-const MIN_BANDWIDTH_BYTES: u64  = 50 * 1024 * 1024;
 /// Timeout for PoV like data, 2 times what it should take, assuming we can fully utilize the
 /// bandwidth. This amounts to two seconds right now.
 const POV_REQUEST_TIMEOUT_CONNECTED: Duration =
 	Duration::from_millis(2 * 1000 * (MAX_POV_SIZE as u64)  / MIN_BANDWIDTH_BYTES);

+/// We want statement requests to time out fast, so we don't waste time on slow nodes. Responders
+/// will try their best to either serve within that timeout or return an error immediately. (We
+/// need to fit statement distribution within a block time of 6 seconds.)
+const STATEMENTS_TIMEOUT: Duration = Duration::from_secs(1);
+
+/// We don't want a slow peer to slow down all the others. At the same time, we want to get the
+/// data out quickly and in full to at least some peers (as this will reduce load on us, because
+/// they can then start serving the data). So this value is a tradeoff; 3 seems sensible: we would
+/// need 3 slow nodes connected to delay transfer for others by `STATEMENTS_TIMEOUT`.
+pub const MAX_PARALLEL_STATEMENT_REQUESTS: u32 = 3;
+
 impl Protocol {
 	/// Get a configuration for a given Request response protocol.
 	///
@@ -105,16 +121,16 @@ impl Protocol {
 		let cfg = match self {
 			Protocol::ChunkFetching => RequestResponseConfig {
 				name: p_name,
-				max_request_size: 10_000,
-				max_response_size: 10_000_000,
+				max_request_size: 1_000,
+				max_response_size: MAX_POV_SIZE as u64 / 10,
 				// We are connected to all validators:
 				request_timeout: DEFAULT_REQUEST_TIMEOUT_CONNECTED,
 				inbound_queue: Some(tx),
 			},
 			Protocol::CollationFetching => RequestResponseConfig {
 				name: p_name,
-				max_request_size: 10_000,
-				max_response_size: MAX_POV_SIZE as u64,
+				max_request_size: 1_000,
+				max_response_size: MAX_POV_SIZE as u64 + 1000,
 				// Taken from initial implementation in collator protocol:
 				request_timeout: POV_REQUEST_TIMEOUT_CONNECTED,
 				inbound_queue: Some(tx),
@@ -130,10 +146,28 @@ impl Protocol {
 				name: p_name,
 				max_request_size: 1_000,
 				// Available data size is dominated by the PoV size.
-				max_response_size: MAX_POV_SIZE as u64,
+				max_response_size: MAX_POV_SIZE as u64 + 1000,
 				request_timeout: POV_REQUEST_TIMEOUT_CONNECTED,
 				inbound_queue: Some(tx),
 			},
+			Protocol::StatementFetching => RequestResponseConfig {
+				name: p_name,
+				max_request_size: 1_000,
+				// Response size is dominated by the code size.
+				// + 1000 to account for protocol overhead (should be way less).
+				max_response_size: MAX_CODE_SIZE as u64 + 1000,
+				// We need statement fetching to be fast and will try our best at the responding
+				// side to answer requests within that timeout, assuming a bandwidth of 500Mbit/s
+				// - which is the recommended minimum bandwidth for nodes on Kusama as of April
+				// 2021.
+				// Responders will reject requests if it is unlikely they can serve them within
+				// the timeout, so the requester can immediately try another node instead of
+				// waiting for a timeout on an overloaded node. Fetches from slow nodes will likely
+				// fail, but this is desired, so we can quickly move on to a faster one - we should
+				// also decrease the slow node's reputation.
+				request_timeout: STATEMENTS_TIMEOUT,
+				inbound_queue: Some(tx),
+			},
 		};
 		(rx, cfg)
 	}
@@ -154,6 +188,26 @@ impl Protocol {
 			// Validators are constantly self-selecting to request available data which may lead
 			// to constant load and occasional burstiness.
 			Protocol::AvailableDataFetching => 100,
+			// Our queue size approximation is how many blocks of the size of
+			// a runtime we can transfer within a statements timeout, minus the requests we handle
+			// in parallel.
+			Protocol::StatementFetching => {
+				// We assume we can utilize up to 70% of the available bandwidth for statements.
+				// This is just a guess/estimate, with the following considerations: If we are
+				// faster than that, queue size will stay low anyway, and even if it does not,
+				// requesters will get an immediate error. But if we are slower, requesters will
+				// run into a timeout, wasting precious time.
+				let available_bandwidth = 7 * MIN_BANDWIDTH_BYTES / 10;
+				let size = u64::saturating_sub(
+					STATEMENTS_TIMEOUT.as_millis() as u64 * available_bandwidth
+						/ (1000 * MAX_CODE_SIZE as u64),
+					MAX_PARALLEL_STATEMENT_REQUESTS as u64
+				);
+				debug_assert!(
+					size > 0,
+					"We should have a channel size greater than zero, otherwise we won't accept any requests."
+				);
+				size as usize
+			}
 		}
 	}

@@ -169,6 +223,7 @@ impl Protocol {
 			Protocol::CollationFetching => "/polkadot/req_collation/1",
 			Protocol::PoVFetching => "/polkadot/req_pov/1",
 			Protocol::AvailableDataFetching => "/polkadot/req_available_data/1",
+			Protocol::StatementFetching => "/polkadot/req_statement/1",
 		}
 	}
 }
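
The sizing arithmetic in this diff can be checked in isolation. The following is a minimal sketch, not the code from the commit: the helper names `pov_like_timeout` and `statement_queue_size` are hypothetical, and the values of `MAX_POV_SIZE` and `MAX_CODE_SIZE` are not shown in the diff, so both are passed in as parameters. The 5 MiB code size used in the note below is an assumption based on the "code size is now 5 Meg" remark in the commit message.

```rust
use std::time::Duration;

/// Minimum validator bandwidth in bytes per second, as defined in the diff.
const MIN_BANDWIDTH_BYTES: u64 = 50 * 1024 * 1024;

/// Hypothetical helper: timeout for PoV-like data, twice the ideal transfer
/// time at `MIN_BANDWIDTH_BYTES` (same formula as `POV_REQUEST_TIMEOUT_CONNECTED`).
fn pov_like_timeout(max_size_bytes: u64) -> Duration {
    Duration::from_millis(2 * 1000 * max_size_bytes / MIN_BANDWIDTH_BYTES)
}

/// Hypothetical helper mirroring the `StatementFetching` queue size estimate:
/// how many maximum-size responses fit into one timeout at 70% of the minimum
/// bandwidth, minus the requests that are already served in parallel.
fn statement_queue_size(timeout: Duration, max_code_size: u64, parallel_requests: u64) -> usize {
    // Assume only 70% of the bandwidth is usable for statements.
    let available_bandwidth = 7 * MIN_BANDWIDTH_BYTES / 10;
    let transferable =
        timeout.as_millis() as u64 * available_bandwidth / (1000 * max_code_size);
    // Saturate at zero rather than underflowing if parallelism exceeds capacity.
    transferable.saturating_sub(parallel_requests) as usize
}
```

Under these assumptions, `statement_queue_size(Duration::from_secs(1), 5 * 1024 * 1024, 3)` evaluates to 4: seven 5 MiB responses fit into one second at 35 MiB/s, and three of them are covered by the parallel requests. Likewise, `pov_like_timeout(50 * 1024 * 1024)` gives 2000 ms, which matches the "two seconds" mentioned in the comment only if `MAX_POV_SIZE` is 50 MiB; a 5 MiB limit would give 200 ms.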