Introduce subsystem benchmarking tool (#2528)

This tool makes it easy to run parachain consensus stress/performance
testing on your development machine or in CI.

## Motivation
The parachain consensus node implementation spans many modules, which
we call subsystems. Each subsystem is responsible for a small part of
the logic of the parachain consensus pipeline, but in general most load
and performance issues are localized in just a few core subsystems
like `availability-recovery`, `approval-voting` or
`dispute-coordinator`. In the absence of such a tool, we would run large
test nets to load/stress test these parts of the system. Setting up such
a large test and making sense of the amount of data it produces is very
expensive, hard to orchestrate and a huge development time sink.

## PR contents
- CLI tool
- Data Availability Read test
- Reusable mockups and components needed so far
- Documentation on how to get started

### Data Availability Read test

An overseer is built using a real `availability-recovery` subsystem
instance while dependent subsystems like `av-store`, `network-bridge`
and `runtime-api` are mocked. The network bridge emulates all the
network peers and their responses to requests.
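
As a rough illustration of the peer emulation described above, the sketch below models a single emulated peer that either fails a request with a configurable probability or answers after a latency drawn between a minimum and a maximum. The names `EmulatedPeer`, `XorShift64` and `respond` are hypothetical and not part of this PR; the tool itself samples with the `rand` crate, while the sketch uses a small xorshift generator only so that it needs no external crates.

```rust
use std::time::Duration;

/// Tiny xorshift64 generator (an assumption of this sketch, used instead of
/// the `rand` crate to stay dependency-free). The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical emulated peer: latency bounds plus an error probability.
struct EmulatedPeer {
    /// Lower latency bound; assumed to be <= `max_latency`.
    min_latency: Duration,
    /// Upper latency bound (inclusive).
    max_latency: Duration,
    /// Error probability in percent, expected to be in `0..=100`.
    error: u64,
}

impl EmulatedPeer {
    /// Returns `Some(latency)` if the peer answers the request,
    /// or `None` if the request is treated as failed.
    fn respond(&self, rng: &mut XorShift64) -> Option<Duration> {
        // A draw in 0..=99 below `error` counts as a failure.
        if rng.next_u64() % 100 < self.error {
            return None;
        }
        let min = self.min_latency.as_millis() as u64;
        let max = self.max_latency.as_millis() as u64;
        // Pick a latency in the inclusive range [min, max].
        Some(Duration::from_millis(min + rng.next_u64() % (max - min + 1)))
    }
}
```

With `error` set to 0 every request gets an answer within the latency bounds, and with `error` set to 100 every request fails, which mirrors how the "ideal", "healthy" and "degraded" presets differ only in these parameters.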

The test runs for a number of blocks. For each block it will generate
and send a `RecoverAvailableData` request for each of an arbitrary
number of candidates. We wait for the subsystem to respond to all
requests before moving to the next block.
At the same time we collect the usual subsystem metrics and task CPU
metrics and show some nice progress reports while running.
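
The per-block flow described above can be sketched with plain threads and channels. This is a simplified stand-in, not the code from this PR: the real test drives an overseer with async futures, whereas here a worker thread plays the role of the subsystem and `RecoveryRequest` and `run_blocks` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a recovery request: a candidate id plus the
/// channel on which the response is delivered.
struct RecoveryRequest {
    candidate: usize,
    reply: mpsc::Sender<usize>,
}

/// Runs `num_blocks` rounds of `n_cores` requests each. Returns the time
/// spent on every block and the total number of responses received.
fn run_blocks(num_blocks: usize, n_cores: usize) -> (Vec<Duration>, usize) {
    let (req_tx, req_rx) = mpsc::channel::<RecoveryRequest>();

    // Mock "subsystem": answers every request it receives.
    let worker = thread::spawn(move || {
        for req in req_rx {
            // The requester may have gone away; ignore send errors.
            let _ = req.reply.send(req.candidate);
        }
    });

    let mut block_times = Vec::with_capacity(num_blocks);
    let mut completed = 0;
    for block in 0..num_blocks {
        let start = Instant::now();
        let (reply_tx, reply_rx) = mpsc::channel();

        // Send one request per core for this block.
        for core in 0..n_cores {
            let request =
                RecoveryRequest { candidate: block * n_cores + core, reply: reply_tx.clone() };
            req_tx.send(request).expect("worker is alive");
        }
        // Drop our own sender so the iterator below ends once the worker
        // has answered (and dropped) every request of this block.
        drop(reply_tx);

        // Wait for all responses before moving on to the next block.
        completed += reply_rx.iter().count();
        block_times.push(start.elapsed());
    }

    // Closing the request channel lets the worker thread exit.
    drop(req_tx);
    worker.join().expect("worker does not panic");
    (block_times, completed)
}
```

Calling `run_blocks(3, 20)` mirrors the run shown in the log output of this PR: three blocks with 20 pending recoveries each, 60 responses in total.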

### Here is what the CLI looks like:

```
[2023-11-28T13:06:27Z INFO  subsystem_bench::core::display] n_validators = 1000, n_cores = 20, pov_size = 5120 - 5120, error = 3, latency = Some(PeerLatency { min_latency: 1ms, max_latency: 100ms })
[2023-11-28T13:06:27Z INFO  subsystem-bench::availability] Generating template candidate index=0 pov_size=5242880
[2023-11-28T13:06:27Z INFO  subsystem-bench::availability] Created test environment.
[2023-11-28T13:06:27Z INFO  subsystem-bench::availability] Pre-generating 60 candidates.
[2023-11-28T13:06:30Z INFO  subsystem-bench::core] Initializing network emulation for 1000 peers.
[2023-11-28T13:06:30Z INFO  subsystem-bench::availability] Current block 1/3
[2023-11-28T13:06:30Z INFO  substrate_prometheus_endpoint] 〽️ Prometheus exporter started at 127.0.0.1:9999
[2023-11-28T13:06:30Z INFO  subsystem_bench::availability] 20 recoveries pending
[2023-11-28T13:06:37Z INFO  subsystem_bench::availability] Block time 6262ms
[2023-11-28T13:06:37Z INFO  subsystem-bench::availability] Sleeping till end of block (0ms)
[2023-11-28T13:06:37Z INFO  subsystem-bench::availability] Current block 2/3
[2023-11-28T13:06:37Z INFO  subsystem_bench::availability] 20 recoveries pending
[2023-11-28T13:06:43Z INFO  subsystem_bench::availability] Block time 6369ms
[2023-11-28T13:06:43Z INFO  subsystem-bench::availability] Sleeping till end of block (0ms)
[2023-11-28T13:06:43Z INFO  subsystem-bench::availability] Current block 3/3
[2023-11-28T13:06:43Z INFO  subsystem_bench::availability] 20 recoveries pending
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] Block time 6194ms
[2023-11-28T13:06:49Z INFO  subsystem-bench::availability] Sleeping till end of block (0ms)
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] All blocks processed in 18829ms
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] Throughput: 102400 KiB/block
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] Block time: 6276 ms
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] 
    
    Total received from network: 415 MiB
    Total sent to network: 724 KiB
    Total subsystem CPU usage 24.00s
    CPU usage per block 8.00s
    Total test environment CPU usage 0.15s
    CPU usage per block 0.05s
```
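
The two summary figures in the output above can be reproduced by hand. The sketch below assumes each recovered candidate contributes exactly its PoV size (5120 KiB = 5242880 bytes); the benchmark itself sums the encoded size of the recovered data, which is slightly larger, so treat this as an approximation. `Summary` and `summarize` are illustrative names only.

```rust
/// Summary figures as printed at the end of a run (illustrative type).
#[derive(Debug, PartialEq)]
struct Summary {
    throughput_kib_per_block: u128,
    avg_block_time_ms: u128,
}

/// Computes throughput and average block time with integer division.
fn summarize(total_bytes: u128, total_ms: u128, num_blocks: u128) -> Summary {
    Summary {
        // Convert bytes to KiB first, then average over the blocks.
        throughput_kib_per_block: total_bytes / 1024 / num_blocks,
        avg_block_time_ms: total_ms / num_blocks,
    }
}
```

For the run above, `summarize(20 * 3 * 5_242_880, 18_829, 3)` gives 102400 KiB/block and 6276 ms, matching the `Throughput` and `Block time` lines in the log.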

### Prometheus/Grafana stack in action
<img width="1246" alt="Screenshot 2023-11-28 at 15 11 10"
src="https://github.com/paritytech/polkadot-sdk/assets/54316454/eaa47422-4a5e-4a3a-aaef-14ca644c1574">
<img width="1246" alt="Screenshot 2023-11-28 at 15 12 01"
src="https://github.com/paritytech/polkadot-sdk/assets/54316454/237329d6-1710-4c27-8f67-5fb11d7f66ea">
<img width="1246" alt="Screenshot 2023-11-28 at 15 12 38"
src="https://github.com/paritytech/polkadot-sdk/assets/54316454/a07119e8-c9f1-4810-a1b3-f1b7b01cf357">

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
This commit is contained in:
Andrei Sandu
2023-12-14 12:57:17 +02:00
committed by GitHub
parent 07550e2d71
commit 8a6e9ef189
31 changed files with 5829 additions and 38 deletions
@@ -0,0 +1,37 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
use serde::{Deserialize, Serialize};
#[derive(clap::ValueEnum, Clone, Copy, Debug, PartialEq)]
#[value(rename_all = "kebab-case")]
#[non_exhaustive]
pub enum NetworkEmulation {
Ideal,
Healthy,
Degraded,
}
#[derive(Debug, Clone, Serialize, Deserialize, clap::Parser)]
#[clap(rename_all = "kebab-case")]
#[allow(missing_docs)]
pub struct DataAvailabilityReadOptions {
#[clap(short, long, default_value_t = false)]
/// Turbo boost DA Read by fetching the full availability data from backers first. Saves CPU as
/// we don't need to re-construct from chunks. Typically this is only faster if nodes have
/// enough bandwidth.
pub fetch_from_backers: bool,
}
@@ -0,0 +1,339 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
use itertools::Itertools;
use std::{collections::HashMap, iter::Cycle, ops::Sub, sync::Arc, time::Instant};
use crate::TestEnvironment;
use polkadot_node_subsystem::{Overseer, OverseerConnector, SpawnGlue};
use polkadot_node_subsystem_test_helpers::derive_erasure_chunks_with_proofs_and_root;
use polkadot_overseer::Handle as OverseerHandle;
use sc_network::request_responses::ProtocolConfig;
use colored::Colorize;
use futures::{channel::oneshot, stream::FuturesUnordered, StreamExt};
use polkadot_node_metrics::metrics::Metrics;
use polkadot_availability_recovery::AvailabilityRecoverySubsystem;
use crate::GENESIS_HASH;
use parity_scale_codec::Encode;
use polkadot_node_network_protocol::request_response::{IncomingRequest, ReqProtocolNames};
use polkadot_node_primitives::{BlockData, PoV};
use polkadot_node_subsystem::messages::{AllMessages, AvailabilityRecoveryMessage};
use crate::core::{
environment::TestEnvironmentDependencies,
mock::{
av_store,
network_bridge::{self, MockNetworkBridgeTx, NetworkAvailabilityState},
runtime_api, MockAvailabilityStore, MockRuntimeApi,
},
};
use super::core::{configuration::TestConfiguration, mock::dummy_builder, network::*};
const LOG_TARGET: &str = "subsystem-bench::availability";
use polkadot_node_primitives::{AvailableData, ErasureChunk};
use super::{cli::TestObjective, core::mock::AlwaysSupportsParachains};
use polkadot_node_subsystem_test_helpers::mock::new_block_import_info;
use polkadot_primitives::{
CandidateHash, CandidateReceipt, GroupIndex, Hash, HeadData, PersistedValidationData,
};
use polkadot_primitives_test_helpers::{dummy_candidate_receipt, dummy_hash};
use sc_service::SpawnTaskHandle;
mod cli;
pub use cli::{DataAvailabilityReadOptions, NetworkEmulation};
fn build_overseer(
spawn_task_handle: SpawnTaskHandle,
runtime_api: MockRuntimeApi,
av_store: MockAvailabilityStore,
network_bridge: MockNetworkBridgeTx,
availability_recovery: AvailabilityRecoverySubsystem,
) -> (Overseer<SpawnGlue<SpawnTaskHandle>, AlwaysSupportsParachains>, OverseerHandle) {
let overseer_connector = OverseerConnector::with_event_capacity(64000);
let dummy = dummy_builder!(spawn_task_handle);
let builder = dummy
.replace_runtime_api(|_| runtime_api)
.replace_availability_store(|_| av_store)
.replace_network_bridge_tx(|_| network_bridge)
.replace_availability_recovery(|_| availability_recovery);
let (overseer, raw_handle) =
builder.build_with_connector(overseer_connector).expect("Should not fail");
(overseer, OverseerHandle::new(raw_handle))
}
/// Takes a test configuration and uses it to create the `TestEnvironment`.
pub fn prepare_test(
config: TestConfiguration,
state: &mut TestState,
) -> (TestEnvironment, ProtocolConfig) {
prepare_test_inner(config, state, TestEnvironmentDependencies::default())
}
fn prepare_test_inner(
config: TestConfiguration,
state: &mut TestState,
dependencies: TestEnvironmentDependencies,
) -> (TestEnvironment, ProtocolConfig) {
// Generate test authorities.
let test_authorities = config.generate_authorities();
let runtime_api = runtime_api::MockRuntimeApi::new(config.clone(), test_authorities.clone());
let av_store =
av_store::MockAvailabilityStore::new(state.chunks.clone(), state.candidate_hashes.clone());
let availability_state = NetworkAvailabilityState {
candidate_hashes: state.candidate_hashes.clone(),
available_data: state.available_data.clone(),
chunks: state.chunks.clone(),
};
let network = NetworkEmulator::new(&config, &dependencies, &test_authorities);
let network_bridge_tx = network_bridge::MockNetworkBridgeTx::new(
config.clone(),
availability_state,
network.clone(),
);
let use_fast_path = match &state.config().objective {
TestObjective::DataAvailabilityRead(options) => options.fetch_from_backers,
_ => panic!("Unexpected objective"),
};
let (collation_req_receiver, req_cfg) =
IncomingRequest::get_config_receiver(&ReqProtocolNames::new(GENESIS_HASH, None));
let subsystem = if use_fast_path {
AvailabilityRecoverySubsystem::with_fast_path(
collation_req_receiver,
Metrics::try_register(&dependencies.registry).unwrap(),
)
} else {
AvailabilityRecoverySubsystem::with_chunks_only(
collation_req_receiver,
Metrics::try_register(&dependencies.registry).unwrap(),
)
};
let (overseer, overseer_handle) = build_overseer(
dependencies.task_manager.spawn_handle(),
runtime_api,
av_store,
network_bridge_tx,
subsystem,
);
(TestEnvironment::new(dependencies, config, network, overseer, overseer_handle), req_cfg)
}
#[derive(Clone)]
pub struct TestState {
// Full test configuration
config: TestConfiguration,
// A cycle iterator on all PoV sizes used in the test.
pov_sizes: Cycle<std::vec::IntoIter<usize>>,
// Generated candidate receipts to be used in the test
candidates: Cycle<std::vec::IntoIter<CandidateReceipt>>,
// Map from pov size to candidate index
pov_size_to_candidate: HashMap<usize, usize>,
// Map from generated candidate hashes to candidate index in `available_data`
// and `chunks`.
candidate_hashes: HashMap<CandidateHash, usize>,
// Per candidate index receipts.
candidate_receipt_templates: Vec<CandidateReceipt>,
// Per candidate index `AvailableData`
available_data: Vec<AvailableData>,
// Per candidate index chunks
chunks: Vec<Vec<ErasureChunk>>,
}
impl TestState {
fn config(&self) -> &TestConfiguration {
&self.config
}
pub fn next_candidate(&mut self) -> Option<CandidateReceipt> {
let candidate = self.candidates.next();
let candidate_hash = candidate.as_ref().unwrap().hash();
gum::trace!(target: LOG_TARGET, "Next candidate selected {:?}", candidate_hash);
candidate
}
/// Generate candidates to be used in the test.
fn generate_candidates(&mut self) {
let count = self.config.n_cores * self.config.num_blocks;
gum::info!(target: LOG_TARGET,"{}", format!("Pre-generating {} candidates.", count).bright_blue());
// Generate all candidates
self.candidates = (0..count)
.map(|index| {
let pov_size = self.pov_sizes.next().expect("This is a cycle; qed");
let candidate_index = *self
.pov_size_to_candidate
.get(&pov_size)
.expect("pov_size always exists; qed");
let mut candidate_receipt =
self.candidate_receipt_templates[candidate_index].clone();
// Make it unique.
candidate_receipt.descriptor.relay_parent = Hash::from_low_u64_be(index as u64);
// Store the new candidate in the state
self.candidate_hashes.insert(candidate_receipt.hash(), candidate_index);
gum::debug!(target: LOG_TARGET, candidate_hash = ?candidate_receipt.hash(), "new candidate");
candidate_receipt
})
.collect::<Vec<_>>()
.into_iter()
.cycle();
}
pub fn new(config: &TestConfiguration) -> Self {
let config = config.clone();
let mut chunks = Vec::new();
let mut available_data = Vec::new();
let mut candidate_receipt_templates = Vec::new();
let mut pov_size_to_candidate = HashMap::new();
// The same persisted validation data is used for all candidates.
let persisted_validation_data = PersistedValidationData {
parent_head: HeadData(vec![7, 8, 9]),
relay_parent_number: Default::default(),
max_pov_size: 1024,
relay_parent_storage_root: Default::default(),
};
// For each unique pov we create a candidate receipt.
for (index, pov_size) in config.pov_sizes().iter().cloned().unique().enumerate() {
gum::info!(target: LOG_TARGET, index, pov_size, "{}", "Generating template candidate".bright_blue());
let mut candidate_receipt = dummy_candidate_receipt(dummy_hash());
let pov = PoV { block_data: BlockData(vec![index as u8; pov_size]) };
let new_available_data = AvailableData {
validation_data: persisted_validation_data.clone(),
pov: Arc::new(pov),
};
let (new_chunks, erasure_root) = derive_erasure_chunks_with_proofs_and_root(
config.n_validators,
&new_available_data,
|_, _| {},
);
candidate_receipt.descriptor.erasure_root = erasure_root;
chunks.push(new_chunks);
available_data.push(new_available_data);
pov_size_to_candidate.insert(pov_size, index);
candidate_receipt_templates.push(candidate_receipt);
}
let pov_sizes = config.pov_sizes().to_owned();
let pov_sizes = pov_sizes.into_iter().cycle();
gum::info!(target: LOG_TARGET, "{}","Created test environment.".bright_blue());
let mut _self = Self {
config,
available_data,
candidate_receipt_templates,
chunks,
pov_size_to_candidate,
pov_sizes,
candidate_hashes: HashMap::new(),
candidates: Vec::new().into_iter().cycle(),
};
_self.generate_candidates();
_self
}
}
pub async fn benchmark_availability_read(env: &mut TestEnvironment, mut state: TestState) {
let config = env.config().clone();
env.import_block(new_block_import_info(Hash::repeat_byte(1), 1)).await;
let start_marker = Instant::now();
let mut batch = FuturesUnordered::new();
let mut availability_bytes = 0u128;
env.metrics().set_n_validators(config.n_validators);
env.metrics().set_n_cores(config.n_cores);
for block_num in 0..env.config().num_blocks {
gum::info!(target: LOG_TARGET, "Current block {}/{}", block_num + 1, env.config().num_blocks);
env.metrics().set_current_block(block_num);
let block_start_ts = Instant::now();
for candidate_num in 0..config.n_cores as u64 {
let candidate =
state.next_candidate().expect("We always send up to n_cores*num_blocks; qed");
let (tx, rx) = oneshot::channel();
batch.push(rx);
let message = AllMessages::AvailabilityRecovery(
AvailabilityRecoveryMessage::RecoverAvailableData(
candidate.clone(),
1,
Some(GroupIndex(
candidate_num as u32 % (std::cmp::max(5, config.n_cores) / 5) as u32,
)),
tx,
),
);
env.send_message(message).await;
}
gum::info!("{}", format!("{} recoveries pending", batch.len()).bright_black());
while let Some(completed) = batch.next().await {
let available_data = completed.unwrap().unwrap();
env.metrics().on_pov_size(available_data.encoded_size());
availability_bytes += available_data.encoded_size() as u128;
}
let block_time = Instant::now().sub(block_start_ts).as_millis() as u64;
env.metrics().set_block_time(block_time);
gum::info!("All work for block completed in {}", format!("{:?}ms", block_time).cyan());
}
let duration: u128 = start_marker.elapsed().as_millis();
let availability_bytes = availability_bytes / 1024;
gum::info!("All blocks processed in {}", format!("{:?}ms", duration).cyan());
gum::info!(
"Throughput: {}",
format!("{} KiB/block", availability_bytes / env.config().num_blocks as u128).bright_red()
);
gum::info!(
"Block time: {}",
format!("{} ms", start_marker.elapsed().as_millis() / env.config().num_blocks as u128)
.red()
);
gum::info!("{}", &env);
env.stop().await;
}
@@ -0,0 +1,60 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
use super::availability::DataAvailabilityReadOptions;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize, clap::Parser)]
#[clap(rename_all = "kebab-case")]
#[allow(missing_docs)]
pub struct TestSequenceOptions {
#[clap(short, long, ignore_case = true)]
pub path: String,
}
/// Defines the supported benchmark targets.
#[derive(Debug, Clone, clap::Parser, Serialize, Deserialize)]
#[command(rename_all = "kebab-case")]
pub enum TestObjective {
/// Benchmark availability recovery strategies.
DataAvailabilityRead(DataAvailabilityReadOptions),
/// Run a test sequence specified in a file
TestSequence(TestSequenceOptions),
}
#[derive(Debug, clap::Parser)]
#[clap(rename_all = "kebab-case")]
#[allow(missing_docs)]
pub struct StandardTestOptions {
#[clap(long, ignore_case = true, default_value_t = 100)]
/// Number of cores to fetch availability for.
pub n_cores: usize,
#[clap(long, ignore_case = true, default_value_t = 500)]
/// Number of validators to fetch chunks from.
pub n_validators: usize,
#[clap(long, ignore_case = true, default_value_t = 5120)]
/// The minimum pov size in KiB
pub min_pov_size: usize,
#[clap(long, ignore_case = true, default_value_t = 5120)]
/// The maximum pov size in KiB
pub max_pov_size: usize,
#[clap(short, long, ignore_case = true, default_value_t = 1)]
/// The number of blocks the test is going to run.
pub num_blocks: usize,
}
@@ -0,0 +1,262 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
//
//! Test configuration definition and helpers.
use super::*;
use keyring::Keyring;
use std::{path::Path, time::Duration};
pub use crate::cli::TestObjective;
use polkadot_primitives::{AuthorityDiscoveryId, ValidatorId};
use rand::{distributions::Uniform, prelude::Distribution, thread_rng};
use serde::{Deserialize, Serialize};
pub fn random_pov_size(min_pov_size: usize, max_pov_size: usize) -> usize {
random_uniform_sample(min_pov_size, max_pov_size)
}
fn random_uniform_sample<T: Into<usize> + From<usize>>(min_value: T, max_value: T) -> T {
Uniform::from(min_value.into()..=max_value.into())
.sample(&mut thread_rng())
.into()
}
/// Peer response latency configuration.
#[derive(Clone, Debug, Default, Serialize, Deserialize)]
pub struct PeerLatency {
/// Min latency for `NetworkAction` completion.
pub min_latency: Duration,
/// Max latency for `NetworkAction` completion.
pub max_latency: Duration,
}
// Default PoV size in KiB.
fn default_pov_size() -> usize {
5120
}
// Default bandwidth in bytes
fn default_bandwidth() -> usize {
52428800
}
// Default connectivity percentage
fn default_connectivity() -> usize {
100
}
/// The test input parameters
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct TestConfiguration {
/// The test objective
pub objective: TestObjective,
/// Number of validators
pub n_validators: usize,
/// Number of cores
pub n_cores: usize,
/// The min PoV size
#[serde(default = "default_pov_size")]
pub min_pov_size: usize,
/// The max PoV size
#[serde(default = "default_pov_size")]
pub max_pov_size: usize,
/// Randomly sampled pov_sizes
#[serde(skip)]
pov_sizes: Vec<usize>,
/// The amount of bandwidth remote validators have.
#[serde(default = "default_bandwidth")]
pub peer_bandwidth: usize,
/// The amount of bandwidth our node has.
#[serde(default = "default_bandwidth")]
pub bandwidth: usize,
/// Optional peer emulation latency
#[serde(default)]
pub latency: Option<PeerLatency>,
/// Error probability, applies to sending messages to the emulated network peers
#[serde(default)]
pub error: usize,
/// Connectivity ratio, the percentage of peers in the topology that we are
/// connected to.
#[serde(default = "default_connectivity")]
pub connectivity: usize,
/// Number of blocks to run the test for
pub num_blocks: usize,
}
fn generate_pov_sizes(count: usize, min_kib: usize, max_kib: usize) -> Vec<usize> {
(0..count).map(|_| random_pov_size(min_kib * 1024, max_kib * 1024)).collect()
}
#[derive(Serialize, Deserialize)]
pub struct TestSequence {
#[serde(rename(serialize = "TestConfiguration", deserialize = "TestConfiguration"))]
test_configurations: Vec<TestConfiguration>,
}
impl TestSequence {
pub fn into_vec(self) -> Vec<TestConfiguration> {
self.test_configurations
.into_iter()
.map(|mut config| {
config.pov_sizes =
generate_pov_sizes(config.n_cores, config.min_pov_size, config.max_pov_size);
config
})
.collect()
}
}
impl TestSequence {
pub fn new_from_file(path: &Path) -> std::io::Result<TestSequence> {
let string = String::from_utf8(std::fs::read(path)?).expect("File is valid UTF8");
Ok(serde_yaml::from_str(&string).expect("File is valid test sequence YAML"))
}
}
/// Helper struct for authority related state.
#[derive(Clone)]
pub struct TestAuthorities {
pub keyrings: Vec<Keyring>,
pub validator_public: Vec<ValidatorId>,
pub validator_authority_id: Vec<AuthorityDiscoveryId>,
}
impl TestConfiguration {
#[allow(unused)]
pub fn write_to_disk(&self) {
// Serialize a slice of configurations
let yaml = serde_yaml::to_string(&TestSequence { test_configurations: vec![self.clone()] })
.unwrap();
std::fs::write("last_test.yaml", yaml).unwrap();
}
pub fn pov_sizes(&self) -> &[usize] {
&self.pov_sizes
}
/// Generates the authority keys we need for the network emulation.
pub fn generate_authorities(&self) -> TestAuthorities {
let keyrings = (0..self.n_validators)
.map(|peer_index| Keyring::new(format!("Node{}", peer_index)))
.collect::<Vec<_>>();
// Generate `ValidatorId` and `AuthorityDiscoveryId` for each peer
let validator_public: Vec<ValidatorId> = keyrings
.iter()
.map(|keyring: &Keyring| keyring.clone().public().into())
.collect::<Vec<_>>();
let validator_authority_id: Vec<AuthorityDiscoveryId> = keyrings
.iter()
.map(|keyring| keyring.clone().public().into())
.collect::<Vec<_>>();
TestAuthorities { keyrings, validator_public, validator_authority_id }
}
/// An unconstrained standard configuration matching Polkadot/Kusama
pub fn ideal_network(
objective: TestObjective,
num_blocks: usize,
n_validators: usize,
n_cores: usize,
min_pov_size: usize,
max_pov_size: usize,
) -> TestConfiguration {
Self {
objective,
n_cores,
n_validators,
pov_sizes: generate_pov_sizes(n_cores, min_pov_size, max_pov_size),
bandwidth: 50 * 1024 * 1024,
peer_bandwidth: 50 * 1024 * 1024,
// No latency
latency: None,
error: 0,
num_blocks,
min_pov_size,
max_pov_size,
connectivity: 100,
}
}
pub fn healthy_network(
objective: TestObjective,
num_blocks: usize,
n_validators: usize,
n_cores: usize,
min_pov_size: usize,
max_pov_size: usize,
) -> TestConfiguration {
Self {
objective,
n_cores,
n_validators,
pov_sizes: generate_pov_sizes(n_cores, min_pov_size, max_pov_size),
bandwidth: 50 * 1024 * 1024,
peer_bandwidth: 50 * 1024 * 1024,
latency: Some(PeerLatency {
min_latency: Duration::from_millis(1),
max_latency: Duration::from_millis(100),
}),
error: 3,
num_blocks,
min_pov_size,
max_pov_size,
connectivity: 95,
}
}
pub fn degraded_network(
objective: TestObjective,
num_blocks: usize,
n_validators: usize,
n_cores: usize,
min_pov_size: usize,
max_pov_size: usize,
) -> TestConfiguration {
Self {
objective,
n_cores,
n_validators,
pov_sizes: generate_pov_sizes(n_cores, min_pov_size, max_pov_size),
bandwidth: 50 * 1024 * 1024,
peer_bandwidth: 50 * 1024 * 1024,
latency: Some(PeerLatency {
min_latency: Duration::from_millis(10),
max_latency: Duration::from_millis(500),
}),
error: 33,
num_blocks,
min_pov_size,
max_pov_size,
connectivity: 67,
}
}
}
/// Produce a randomized duration between `min` and `max`.
pub fn random_latency(maybe_peer_latency: Option<&PeerLatency>) -> Option<Duration> {
maybe_peer_latency.map(|peer_latency| {
Uniform::from(peer_latency.min_latency..=peer_latency.max_latency).sample(&mut thread_rng())
})
}
/// Generate a random error based on `probability`.
/// `probability` should be a number between 0 and 100.
pub fn random_error(probability: usize) -> bool {
Uniform::from(0..=99).sample(&mut thread_rng()) < probability
}
@@ -0,0 +1,191 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
//
//! Display implementations and helper methods for parsing prometheus metrics
//! to a format that can be displayed in the CLI.
//!
//! Currently histogram buckets are skipped.
use super::{configuration::TestConfiguration, LOG_TARGET};
use colored::Colorize;
use prometheus::{
proto::{MetricFamily, MetricType},
Registry,
};
use std::fmt::Display;
#[derive(Default)]
pub struct MetricCollection(Vec<TestMetric>);
impl From<Vec<TestMetric>> for MetricCollection {
fn from(metrics: Vec<TestMetric>) -> Self {
MetricCollection(metrics)
}
}
impl MetricCollection {
pub fn all(&self) -> &Vec<TestMetric> {
&self.0
}
/// Sums up all metrics with the given name in the collection
pub fn sum_by(&self, name: &str) -> f64 {
self.all()
.iter()
.filter(|metric| metric.name == name)
.map(|metric| metric.value)
.sum()
}
pub fn subset_with_label_value(&self, label_name: &str, label_value: &str) -> MetricCollection {
self.0
.iter()
.filter_map(|metric| {
if let Some(index) = metric.label_names.iter().position(|label| label == label_name)
{
if Some(&String::from(label_value)) == metric.label_values.get(index) {
Some(metric.clone())
} else {
None
}
} else {
None
}
})
.collect::<Vec<_>>()
.into()
}
}
impl Display for MetricCollection {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
writeln!(f)?;
let metrics = self.all();
for metric in metrics {
writeln!(f, "{}", metric)?;
}
Ok(())
}
}
#[derive(Debug, Clone)]
pub struct TestMetric {
name: String,
label_names: Vec<String>,
label_values: Vec<String>,
value: f64,
}
impl Display for TestMetric {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(
f,
"({} = {}) [{:?}, {:?}]",
self.name.cyan(),
format!("{}", self.value).white(),
self.label_names,
self.label_values
)
}
}
// Returns `false` if metric should be skipped.
fn check_metric_family(mf: &MetricFamily) -> bool {
if mf.get_metric().is_empty() {
gum::error!(target: LOG_TARGET, "MetricFamily has no metrics: {:?}", mf);
return false
}
if mf.get_name().is_empty() {
gum::error!(target: LOG_TARGET, "MetricFamily has no name: {:?}", mf);
return false
}
true
}
pub fn parse_metrics(registry: &Registry) -> MetricCollection {
let metric_families = registry.gather();
let mut test_metrics = Vec::new();
for mf in metric_families {
if !check_metric_family(&mf) {
continue
}
let name: String = mf.get_name().into();
let metric_type = mf.get_field_type();
for m in mf.get_metric() {
let (label_names, label_values): (Vec<String>, Vec<String>) = m
.get_label()
.iter()
.map(|pair| (String::from(pair.get_name()), String::from(pair.get_value())))
.unzip();
match metric_type {
MetricType::COUNTER => {
test_metrics.push(TestMetric {
name: name.clone(),
label_names,
label_values,
value: m.get_counter().get_value(),
});
},
MetricType::GAUGE => {
test_metrics.push(TestMetric {
name: name.clone(),
label_names,
label_values,
value: m.get_gauge().get_value(),
});
},
MetricType::HISTOGRAM => {
let h = m.get_histogram();
let h_name = name.clone() + "_sum";
test_metrics.push(TestMetric {
name: h_name,
label_names: label_names.clone(),
label_values: label_values.clone(),
value: h.get_sample_sum(),
});
let h_name = name.clone() + "_count";
test_metrics.push(TestMetric {
name: h_name,
label_names,
label_values,
// The `_count` metric reports the number of samples, not their sum.
value: h.get_sample_count() as f64,
});
},
MetricType::SUMMARY => {
unimplemented!();
},
MetricType::UNTYPED => {
unimplemented!();
},
}
}
}
test_metrics.into()
}
pub fn display_configuration(test_config: &TestConfiguration) {
gum::info!(
"{}, {}, {}, {}, {}",
format!("n_validators = {}", test_config.n_validators).blue(),
format!("n_cores = {}", test_config.n_cores).blue(),
format!("pov_size = {} - {}", test_config.min_pov_size, test_config.max_pov_size)
.bright_black(),
format!("error = {}", test_config.error).bright_black(),
format!("latency = {:?}", test_config.latency).bright_black(),
);
}
@@ -0,0 +1,333 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
//! Test environment implementation
use crate::{
core::{mock::AlwaysSupportsParachains, network::NetworkEmulator},
TestConfiguration,
};
use colored::Colorize;
use core::time::Duration;
use futures::FutureExt;
use polkadot_overseer::{BlockInfo, Handle as OverseerHandle};
use polkadot_node_subsystem::{messages::AllMessages, Overseer, SpawnGlue, TimeoutExt};
use polkadot_node_subsystem_types::Hash;
use polkadot_node_subsystem_util::metrics::prometheus::{
self, Gauge, Histogram, PrometheusError, Registry, U64,
};
// Use the benchmark's own log target as the message origin instead of the peer store's.
use super::LOG_TARGET;
use sc_service::{SpawnTaskHandle, TaskManager};
use std::{
fmt::Display,
net::{Ipv4Addr, SocketAddr},
};
use tokio::runtime::Handle;
const MIB: f64 = 1024.0 * 1024.0;
/// Test environment/configuration metrics
#[derive(Clone)]
pub struct TestEnvironmentMetrics {
/// Total number of validators in the test.
n_validators: Gauge<U64>,
/// Number of cores we fetch availability for in each block.
n_cores: Gauge<U64>,
/// PoV size
pov_size: Histogram,
/// Current block
current_block: Gauge<U64>,
/// Time it takes the target subsystem(s) to complete all the requests in a block.
block_time: Gauge<U64>,
}
impl TestEnvironmentMetrics {
pub fn new(registry: &Registry) -> Result<Self, PrometheusError> {
let mut buckets = prometheus::exponential_buckets(16384.0, 2.0, 9)
.expect("arguments are always valid; qed");
buckets.extend(vec![5.0 * MIB, 6.0 * MIB, 7.0 * MIB, 8.0 * MIB, 9.0 * MIB, 10.0 * MIB]);
Ok(Self {
n_validators: prometheus::register(
Gauge::new(
"subsystem_benchmark_n_validators",
"Total number of validators in the test",
)?,
registry,
)?,
n_cores: prometheus::register(
Gauge::new(
"subsystem_benchmark_n_cores",
"Number of cores we fetch availability for each block",
)?,
registry,
)?,
current_block: prometheus::register(
Gauge::new("subsystem_benchmark_current_block", "The current test block")?,
registry,
)?,
block_time: prometheus::register(
Gauge::new("subsystem_benchmark_block_time", "The time it takes for the target subsystem(s) to complete all the requests in a block")?,
registry,
)?,
pov_size: prometheus::register(
Histogram::with_opts(
prometheus::HistogramOpts::new(
"subsystem_benchmark_pov_size",
"The compressed size of the proof of validity of a candidate",
)
.buckets(buckets),
)?,
registry,
)?,
})
}
pub fn set_n_validators(&self, n_validators: usize) {
self.n_validators.set(n_validators as u64);
}
pub fn set_n_cores(&self, n_cores: usize) {
self.n_cores.set(n_cores as u64);
}
pub fn set_current_block(&self, current_block: usize) {
self.current_block.set(current_block as u64);
}
pub fn set_block_time(&self, block_time_ms: u64) {
self.block_time.set(block_time_ms);
}
pub fn on_pov_size(&self, pov_size: usize) {
self.pov_size.observe(pov_size as f64);
}
}
fn new_runtime() -> tokio::runtime::Runtime {
tokio::runtime::Builder::new_multi_thread()
.thread_name("subsystem-bench")
.enable_all()
.thread_stack_size(3 * 1024 * 1024)
.build()
.unwrap()
}
/// Wrapper for dependencies
pub struct TestEnvironmentDependencies {
pub registry: Registry,
pub task_manager: TaskManager,
pub runtime: tokio::runtime::Runtime,
}
impl Default for TestEnvironmentDependencies {
fn default() -> Self {
let runtime = new_runtime();
let registry = Registry::new();
let task_manager: TaskManager =
TaskManager::new(runtime.handle().clone(), Some(&registry)).unwrap();
Self { runtime, registry, task_manager }
}
}
// A dummy genesis hash
pub const GENESIS_HASH: Hash = Hash::repeat_byte(0xff);
// We use this to bail out of sending messages to the subsystem if it is so overloaded that
// the time of flight breaches 5s.
// This should eventually be a test parameter.
const MAX_TIME_OF_FLIGHT: Duration = Duration::from_millis(5000);
/// The test environment is the high level wrapper of all things required to test
/// a certain subsystem.
///
/// ## Mockups
/// The overseer is passed in during construction and it can host an arbitrary number of
/// real subsystem instances and the corresponding mocked instances such that the real
/// subsystems can get their messages answered.
///
/// As the subsystem's performance depends on network connectivity, the test environment
/// emulates validator nodes on the network, see `NetworkEmulator`. The network emulation
/// is configurable in terms of peer bandwidth, latency and connection error rate using
/// uniform distribution sampling.
///
///
/// ## Usage
/// `TestEnvironment` is used in tests to send `Overseer` messages or signals to the subsystem
/// under test.
///
/// ## Collecting test metrics
///
/// ### Prometheus
/// A prometheus endpoint is exposed while the test is running. A local Prometheus instance
/// can scrape it every 1s and a Grafana dashboard is the preferred way of visualizing
/// the performance characteristics of the subsystem.
///
/// ### CLI
/// A subset of the Prometheus metrics are printed at the end of the test.
pub struct TestEnvironment {
/// Test dependencies
dependencies: TestEnvironmentDependencies,
/// A runtime handle
runtime_handle: tokio::runtime::Handle,
/// A handle to the lovely overseer
overseer_handle: OverseerHandle,
/// The test configuration.
config: TestConfiguration,
/// A handle to the network emulator.
network: NetworkEmulator,
/// Configuration/env metrics
metrics: TestEnvironmentMetrics,
}
impl TestEnvironment {
/// Create a new test environment
pub fn new(
dependencies: TestEnvironmentDependencies,
config: TestConfiguration,
network: NetworkEmulator,
overseer: Overseer<SpawnGlue<SpawnTaskHandle>, AlwaysSupportsParachains>,
overseer_handle: OverseerHandle,
) -> Self {
let metrics = TestEnvironmentMetrics::new(&dependencies.registry)
.expect("Metrics need to be registered");
let spawn_handle = dependencies.task_manager.spawn_handle();
spawn_handle.spawn_blocking("overseer", "overseer", overseer.run().boxed());
let registry_clone = dependencies.registry.clone();
dependencies.task_manager.spawn_handle().spawn_blocking(
"prometheus",
"test-environment",
async move {
prometheus_endpoint::init_prometheus(
SocketAddr::new(std::net::IpAddr::V4(Ipv4Addr::LOCALHOST), 9999),
registry_clone,
)
.await
.unwrap();
},
);
TestEnvironment {
runtime_handle: dependencies.runtime.handle().clone(),
dependencies,
overseer_handle,
config,
network,
metrics,
}
}
pub fn config(&self) -> &TestConfiguration {
&self.config
}
pub fn network(&self) -> &NetworkEmulator {
&self.network
}
pub fn registry(&self) -> &Registry {
&self.dependencies.registry
}
pub fn metrics(&self) -> &TestEnvironmentMetrics {
&self.metrics
}
pub fn runtime(&self) -> Handle {
self.runtime_handle.clone()
}
/// Send a message to the subsystem under test.
pub async fn send_message(&mut self, msg: AllMessages) {
self.overseer_handle
.send_msg(msg, LOG_TARGET)
.timeout(MAX_TIME_OF_FLIGHT)
.await
.unwrap_or_else(|| {
panic!("{}ms maximum time of flight breached", MAX_TIME_OF_FLIGHT.as_millis())
});
}
/// Send an `ActiveLeavesUpdate` signal to all subsystems under test.
pub async fn import_block(&mut self, block: BlockInfo) {
self.overseer_handle
.block_imported(block)
.timeout(MAX_TIME_OF_FLIGHT)
.await
.unwrap_or_else(|| {
panic!("{}ms maximum time of flight breached", MAX_TIME_OF_FLIGHT.as_millis())
});
}
/// Stop overseer and subsystems.
pub async fn stop(&mut self) {
self.overseer_handle.stop().await;
}
}
impl Display for TestEnvironment {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
let stats = self.network().stats();
writeln!(f, "\n")?;
writeln!(
f,
"Total received from network: {}",
format!(
"{} MiB",
stats
.iter()
.map(|stats| stats.tx_bytes_total as u128)
.sum::<u128>() / (1024 * 1024)
)
.cyan()
)?;
writeln!(
f,
"Total sent to network: {}",
format!("{} KiB", stats[0].tx_bytes_total / (1024)).cyan()
)?;
let test_metrics = super::display::parse_metrics(self.registry());
let subsystem_cpu_metrics =
test_metrics.subset_with_label_value("task_group", "availability-recovery");
let total_cpu = subsystem_cpu_metrics.sum_by("substrate_tasks_polling_duration_sum");
writeln!(f, "Total subsystem CPU usage {}", format!("{:.2}s", total_cpu).bright_purple())?;
writeln!(
f,
"CPU usage per block {}",
format!("{:.2}s", total_cpu / self.config().num_blocks as f64).bright_purple()
)?;
let test_env_cpu_metrics =
test_metrics.subset_with_label_value("task_group", "test-environment");
let total_cpu = test_env_cpu_metrics.sum_by("substrate_tasks_polling_duration_sum");
writeln!(
f,
"Total test environment CPU usage {}",
format!("{:.2}s", total_cpu).bright_purple()
)?;
writeln!(
f,
"CPU usage per block {}",
format!("{:.2}s", total_cpu / self.config().num_blocks as f64).bright_purple()
)
}
}
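Review note: the PoV size histogram in `TestEnvironmentMetrics::new` combines nine exponential buckets starting at 16 KiB with six linear 1 MiB steps from 5 MiB to 10 MiB. The sketch below recomputes those bounds using only the standard library, which makes the resulting layout easier to check. `exponential_buckets` here is a local re-implementation that assumes the usual `start * factor^i` semantics, and `pov_size_buckets` is a hypothetical helper name.

```rust
const MIB: f64 = 1024.0 * 1024.0;

/// Std-only re-implementation of exponential bucket generation:
/// `count` upper bounds starting at `start`, each `factor` times the previous one.
pub fn exponential_buckets(start: f64, factor: f64, count: usize) -> Vec<f64> {
    let mut bounds = Vec::with_capacity(count);
    let mut next = start;
    for _ in 0..count {
        bounds.push(next);
        next *= factor;
    }
    bounds
}

/// Bucket bounds built from the same arguments as the PoV size histogram:
/// 16 KiB doubling up to 4 MiB, followed by 5 MiB through 10 MiB in 1 MiB steps.
pub fn pov_size_buckets() -> Vec<f64> {
    let mut buckets = exponential_buckets(16384.0, 2.0, 9);
    buckets.extend((5..=10).map(|mib| mib as f64 * MIB));
    buckets
}
```

The exponential part ends at 4 MiB, so the linear steps continue it without a gap or overlap.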
@@ -0,0 +1,40 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
pub use sp_core::sr25519;
use sp_core::{
sr25519::{Pair, Public},
Pair as PairT,
};
/// Set of test accounts.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Keyring {
name: String,
}
impl Keyring {
pub fn new(name: String) -> Keyring {
Self { name }
}
pub fn pair(self) -> Pair {
Pair::from_string(&format!("//{}", self.name), None).expect("input is always good; qed")
}
pub fn public(self) -> Public {
self.pair().public()
}
}
@@ -0,0 +1,137 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
//!
//! A generic av store subsystem mockup suitable to be used in benchmarks.
use parity_scale_codec::Encode;
use polkadot_primitives::CandidateHash;
use std::collections::HashMap;
use futures::{channel::oneshot, FutureExt};
use polkadot_node_primitives::ErasureChunk;
use polkadot_node_subsystem::{
messages::AvailabilityStoreMessage, overseer, SpawnedSubsystem, SubsystemError,
};
use polkadot_node_subsystem_types::OverseerSignal;
pub struct AvailabilityStoreState {
candidate_hashes: HashMap<CandidateHash, usize>,
chunks: Vec<Vec<ErasureChunk>>,
}
const LOG_TARGET: &str = "subsystem-bench::av-store-mock";
/// A mock of the availability store subsystem. It serves erasure chunks for candidates
/// that were generated ahead of time and passed in at construction.
pub struct MockAvailabilityStore {
state: AvailabilityStoreState,
}
impl MockAvailabilityStore {
pub fn new(
chunks: Vec<Vec<ErasureChunk>>,
candidate_hashes: HashMap<CandidateHash, usize>,
) -> MockAvailabilityStore {
Self { state: AvailabilityStoreState { chunks, candidate_hashes } }
}
async fn respond_to_query_all_request(
&self,
candidate_hash: CandidateHash,
send_chunk: impl Fn(usize) -> bool,
tx: oneshot::Sender<Vec<ErasureChunk>>,
) {
let candidate_index = self
.state
.candidate_hashes
.get(&candidate_hash)
.expect("candidate was generated previously; qed");
gum::debug!(target: LOG_TARGET, ?candidate_hash, candidate_index, "Candidate mapped to index");
let v = self
.state
.chunks
.get(*candidate_index)
.unwrap()
.iter()
.filter(|c| send_chunk(c.index.0 as usize))
.cloned()
.collect();
let _ = tx.send(v);
}
}
#[overseer::subsystem(AvailabilityStore, error=SubsystemError, prefix=self::overseer)]
impl<Context> MockAvailabilityStore {
fn start(self, ctx: Context) -> SpawnedSubsystem {
let future = self.run(ctx).map(|_| Ok(())).boxed();
SpawnedSubsystem { name: "test-environment", future }
}
}
#[overseer::contextbounds(AvailabilityStore, prefix = self::overseer)]
impl MockAvailabilityStore {
async fn run<Context>(self, mut ctx: Context) {
gum::debug!(target: LOG_TARGET, "Subsystem running");
loop {
let msg = ctx.recv().await.expect("Overseer never fails us");
match msg {
orchestra::FromOrchestra::Signal(signal) =>
if signal == OverseerSignal::Conclude {
return
},
orchestra::FromOrchestra::Communication { msg } => match msg {
AvailabilityStoreMessage::QueryAvailableData(candidate_hash, tx) => {
gum::debug!(target: LOG_TARGET, candidate_hash = ?candidate_hash, "Responding to QueryAvailableData");
// We never have the full available data.
let _ = tx.send(None);
},
AvailabilityStoreMessage::QueryAllChunks(candidate_hash, tx) => {
// We always have our own chunk.
gum::debug!(target: LOG_TARGET, candidate_hash = ?candidate_hash, "Responding to QueryAllChunks");
self.respond_to_query_all_request(candidate_hash, |index| index == 0, tx)
.await;
},
AvailabilityStoreMessage::QueryChunkSize(candidate_hash, tx) => {
gum::debug!(target: LOG_TARGET, candidate_hash = ?candidate_hash, "Responding to QueryChunkSize");
let candidate_index = self
.state
.candidate_hashes
.get(&candidate_hash)
.expect("candidate was generated previously; qed");
gum::debug!(target: LOG_TARGET, ?candidate_hash, candidate_index, "Candidate mapped to index");
let chunk_size =
self.state.chunks.get(*candidate_index).unwrap()[0].encoded_size();
let _ = tx.send(Some(chunk_size));
},
_ => {
unimplemented!("Unexpected av-store message")
},
},
}
}
}
}
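Review note: the availability store mock resolves a candidate hash to an index and then filters that candidate's chunks with a predicate, which is how `QueryAllChunks` returns only chunk 0 ("our own chunk"). The following std-only sketch shows the same two-step lookup. `CandidateHash`, `Chunk` and `ChunkStore` are simplified, hypothetical stand-ins for the real types, and the sketch returns `None` for an unknown candidate where the mock panics.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real candidate hash type.
pub type CandidateHash = [u8; 32];

/// Hypothetical stand-in for an erasure chunk.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub index: u32,
    pub data: Vec<u8>,
}

pub struct ChunkStore {
    /// Maps a candidate hash to its position in `chunks`.
    candidate_hashes: HashMap<CandidateHash, usize>,
    /// Per-candidate chunks, one per validator.
    chunks: Vec<Vec<Chunk>>,
}

impl ChunkStore {
    pub fn new(candidate_hashes: HashMap<CandidateHash, usize>, chunks: Vec<Vec<Chunk>>) -> Self {
        Self { candidate_hashes, chunks }
    }

    /// Return the chunks of `candidate` accepted by `send_chunk`, or `None`
    /// if the candidate is unknown.
    pub fn query_chunks(
        &self,
        candidate: &CandidateHash,
        send_chunk: impl Fn(usize) -> bool,
    ) -> Option<Vec<Chunk>> {
        let index = *self.candidate_hashes.get(candidate)?;
        let chunks = self.chunks.get(index)?;
        Some(chunks.iter().filter(|c| send_chunk(c.index as usize)).cloned().collect())
    }
}
```

Passing `|index| index == 0` as the predicate reproduces the "own chunk only" behaviour, while `|_| true` would return every chunk.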
@@ -0,0 +1,98 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
//! Dummy subsystem mocks.
use paste::paste;
use futures::FutureExt;
use polkadot_node_subsystem::{overseer, SpawnedSubsystem, SubsystemError};
use std::time::Duration;
use tokio::time::sleep;
const LOG_TARGET: &str = "subsystem-bench::mockery";
macro_rules! mock {
// Generates a `Mock<SubsystemName>` that logs and discards every message it receives.
($subsystem_name:ident) => {
paste! {
pub struct [<Mock $subsystem_name >] {}
#[overseer::subsystem($subsystem_name, error=SubsystemError, prefix=self::overseer)]
impl<Context> [<Mock $subsystem_name >] {
fn start(self, ctx: Context) -> SpawnedSubsystem {
let future = self.run(ctx).map(|_| Ok(())).boxed();
// The name will appear in substrate CPU task metrics as `task_group`.
SpawnedSubsystem { name: "test-environment", future }
}
}
#[overseer::contextbounds($subsystem_name, prefix = self::overseer)]
impl [<Mock $subsystem_name >] {
async fn run<Context>(self, mut ctx: Context) {
let mut count_total_msg = 0;
loop {
futures::select!{
msg = ctx.recv().fuse() => {
match msg.unwrap() {
orchestra::FromOrchestra::Signal(signal) => {
match signal {
polkadot_node_subsystem_types::OverseerSignal::Conclude => {return},
_ => {}
}
},
orchestra::FromOrchestra::Communication { msg } => {
gum::debug!(target: LOG_TARGET, msg = ?msg, "mocked subsystem received message");
}
}
count_total_msg +=1;
}
_ = sleep(Duration::from_secs(6)).fuse() => {
if count_total_msg > 0 {
gum::trace!(target: LOG_TARGET, "Subsystem {} processed {} messages since last time", stringify!($subsystem_name), count_total_msg);
}
count_total_msg = 0;
}
}
}
}
}
}
};
}
mock!(AvailabilityStore);
mock!(StatementDistribution);
mock!(BitfieldSigning);
mock!(BitfieldDistribution);
mock!(Provisioner);
mock!(NetworkBridgeRx);
mock!(CollationGeneration);
mock!(CollatorProtocol);
mock!(GossipSupport);
mock!(DisputeDistribution);
mock!(DisputeCoordinator);
mock!(ProspectiveParachains);
mock!(PvfChecker);
mock!(CandidateBacking);
mock!(AvailabilityDistribution);
mock!(CandidateValidation);
mock!(AvailabilityRecovery);
mock!(NetworkBridgeTx);
mock!(ChainApi);
mock!(ChainSelection);
mock!(ApprovalVoting);
mock!(ApprovalDistribution);
mock!(RuntimeApi);
@@ -0,0 +1,77 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
use polkadot_node_subsystem::HeadSupportsParachains;
use polkadot_node_subsystem_types::Hash;
pub mod av_store;
pub mod dummy;
pub mod network_bridge;
pub mod runtime_api;
pub use av_store::*;
pub use network_bridge::*;
pub use runtime_api::*;
pub struct AlwaysSupportsParachains {}
#[async_trait::async_trait]
impl HeadSupportsParachains for AlwaysSupportsParachains {
async fn head_supports_parachains(&self, _head: &Hash) -> bool {
true
}
}
// An orchestra with dummy subsystems
macro_rules! dummy_builder {
($spawn_task_handle: ident) => {{
use super::core::mock::dummy::*;
// Initialize a mock overseer.
// All subsystems are mocks.
Overseer::builder()
.approval_voting(MockApprovalVoting {})
.approval_distribution(MockApprovalDistribution {})
.availability_recovery(MockAvailabilityRecovery {})
.candidate_validation(MockCandidateValidation {})
.chain_api(MockChainApi {})
.chain_selection(MockChainSelection {})
.dispute_coordinator(MockDisputeCoordinator {})
.runtime_api(MockRuntimeApi {})
.network_bridge_tx(MockNetworkBridgeTx {})
.availability_distribution(MockAvailabilityDistribution {})
.availability_store(MockAvailabilityStore {})
.pvf_checker(MockPvfChecker {})
.candidate_backing(MockCandidateBacking {})
.statement_distribution(MockStatementDistribution {})
.bitfield_signing(MockBitfieldSigning {})
.bitfield_distribution(MockBitfieldDistribution {})
.provisioner(MockProvisioner {})
.network_bridge_rx(MockNetworkBridgeRx {})
.collation_generation(MockCollationGeneration {})
.collator_protocol(MockCollatorProtocol {})
.gossip_support(MockGossipSupport {})
.dispute_distribution(MockDisputeDistribution {})
.prospective_parachains(MockProspectiveParachains {})
.activation_external_listeners(Default::default())
.span_per_active_leaf(Default::default())
.active_leaves(Default::default())
.metrics(Default::default())
.supports_parachains(AlwaysSupportsParachains {})
.spawner(SpawnGlue($spawn_task_handle))
}};
}
pub(crate) use dummy_builder;
@@ -0,0 +1,323 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
//!
//! A generic network bridge tx subsystem mockup suitable to be used in benchmarks.
use futures::Future;
use parity_scale_codec::Encode;
use polkadot_node_subsystem_types::OverseerSignal;
use std::{collections::HashMap, pin::Pin};
use futures::FutureExt;
use polkadot_node_primitives::{AvailableData, ErasureChunk};
use polkadot_primitives::CandidateHash;
use sc_network::{OutboundFailure, RequestFailure};
use polkadot_node_subsystem::{
messages::NetworkBridgeTxMessage, overseer, SpawnedSubsystem, SubsystemError,
};
use polkadot_node_network_protocol::request_response::{
self as req_res, v1::ChunkResponse, Requests,
};
use polkadot_primitives::AuthorityDiscoveryId;
use crate::core::{
configuration::{random_error, random_latency, TestConfiguration},
network::{NetworkAction, NetworkEmulator, RateLimit},
};
/// The availability store state of all emulated peers.
/// The network bridge tx mock will respond to requests as if the request is being serviced
/// by a remote peer on the network
pub struct NetworkAvailabilityState {
pub candidate_hashes: HashMap<CandidateHash, usize>,
pub available_data: Vec<AvailableData>,
pub chunks: Vec<Vec<ErasureChunk>>,
}
const LOG_TARGET: &str = "subsystem-bench::network-bridge-tx-mock";
/// A mock of the network bridge tx subsystem.
pub struct MockNetworkBridgeTx {
/// The test configuration
config: TestConfiguration,
/// The network availability state
availabilty: NetworkAvailabilityState,
/// A network emulator instance
network: NetworkEmulator,
}
impl MockNetworkBridgeTx {
pub fn new(
config: TestConfiguration,
availabilty: NetworkAvailabilityState,
network: NetworkEmulator,
) -> MockNetworkBridgeTx {
Self { config, availabilty, network }
}
fn not_connected_response(
&self,
authority_discovery_id: &AuthorityDiscoveryId,
future: Pin<Box<dyn Future<Output = ()> + Send>>,
) -> NetworkAction {
// The network action will send the error after a random delay expires.
NetworkAction::new(
authority_discovery_id.clone(),
future,
0,
// Generate a random latency based on configuration.
random_latency(self.config.latency.as_ref()),
)
}
/// Returns a `NetworkAction` corresponding to the peer sending the response. Whether or
/// not the peer is connected, the response or error is sent with a randomized latency as
/// defined in the configuration.
fn respond_to_send_request(
&mut self,
request: Requests,
ingress_tx: &mut tokio::sync::mpsc::UnboundedSender<NetworkAction>,
) -> NetworkAction {
let ingress_tx = ingress_tx.clone();
match request {
Requests::ChunkFetchingV1(outgoing_request) => {
let authority_discovery_id = match outgoing_request.peer {
req_res::Recipient::Authority(authority_discovery_id) => authority_discovery_id,
_ => unimplemented!("Peer recipient not supported yet"),
};
// Account our sent request bytes.
self.network.peer_stats(0).inc_sent(outgoing_request.payload.encoded_size());
// If peer is disconnected return an error
if !self.network.is_peer_connected(&authority_discovery_id) {
// We always send `NotConnected` error and we ignore `IfDisconnected` value in
// the caller.
let future = async move {
let _ = outgoing_request
.pending_response
.send(Err(RequestFailure::NotConnected));
}
.boxed();
return self.not_connected_response(&authority_discovery_id, future)
}
// Account for remote received request bytes.
self.network
.peer_stats_by_id(&authority_discovery_id)
.inc_received(outgoing_request.payload.encoded_size());
let validator_index: usize = outgoing_request.payload.index.0 as usize;
let candidate_hash = outgoing_request.payload.candidate_hash;
let candidate_index = self
.availabilty
.candidate_hashes
.get(&candidate_hash)
.expect("candidate was generated previously; qed");
gum::debug!(target: LOG_TARGET, ?candidate_hash, candidate_index, "Candidate mapped to index");
let chunk: ChunkResponse = self.availabilty.chunks.get(*candidate_index).unwrap()
[validator_index]
.clone()
.into();
let mut size = chunk.encoded_size();
let response = if random_error(self.config.error) {
// An error does not count towards the bandwidth used.
size = 0;
Err(RequestFailure::Network(OutboundFailure::ConnectionClosed))
} else {
Ok(req_res::v1::ChunkFetchingResponse::from(Some(chunk)).encode())
};
let authority_discovery_id_clone = authority_discovery_id.clone();
let future = async move {
let _ = outgoing_request.pending_response.send(response);
}
.boxed();
let future_wrapper = async move {
// Forward the response to the ingress channel of our node.
// On receive side we apply our node receiving rate limit.
let action =
NetworkAction::new(authority_discovery_id_clone, future, size, None);
ingress_tx.send(action).unwrap();
}
.boxed();
NetworkAction::new(
authority_discovery_id,
future_wrapper,
size,
// Generate a random latency based on configuration.
random_latency(self.config.latency.as_ref()),
)
},
Requests::AvailableDataFetchingV1(outgoing_request) => {
let candidate_hash = outgoing_request.payload.candidate_hash;
let candidate_index = self
.availabilty
.candidate_hashes
.get(&candidate_hash)
.expect("candidate was generated previously; qed");
gum::debug!(target: LOG_TARGET, ?candidate_hash, candidate_index, "Candidate mapped to index");
let authority_discovery_id = match outgoing_request.peer {
req_res::Recipient::Authority(authority_discovery_id) => authority_discovery_id,
_ => unimplemented!("Peer recipient not supported yet"),
};
// Account our sent request bytes.
self.network.peer_stats(0).inc_sent(outgoing_request.payload.encoded_size());
// If peer is disconnected return an error
if !self.network.is_peer_connected(&authority_discovery_id) {
let future = async move {
let _ = outgoing_request
.pending_response
.send(Err(RequestFailure::NotConnected));
}
.boxed();
return self.not_connected_response(&authority_discovery_id, future)
}
// Account for remote received request bytes.
self.network
.peer_stats_by_id(&authority_discovery_id)
.inc_received(outgoing_request.payload.encoded_size());
let available_data =
self.availabilty.available_data.get(*candidate_index).unwrap().clone();
let mut size = available_data.encoded_size();
let response = if random_error(self.config.error) {
// An error does not count towards the bandwidth used, same as for chunk requests.
size = 0;
Err(RequestFailure::Network(OutboundFailure::ConnectionClosed))
} else {
Ok(req_res::v1::AvailableDataFetchingResponse::from(Some(available_data))
.encode())
};
let future = async move {
let _ = outgoing_request.pending_response.send(response);
}
.boxed();
let authority_discovery_id_clone = authority_discovery_id.clone();
let future_wrapper = async move {
// Forward the response to the ingress channel of our node.
// On receive side we apply our node receiving rate limit.
let action =
NetworkAction::new(authority_discovery_id_clone, future, size, None);
ingress_tx.send(action).unwrap();
}
.boxed();
NetworkAction::new(
authority_discovery_id,
future_wrapper,
size,
// Generate a random latency based on configuration.
random_latency(self.config.latency.as_ref()),
)
},
_ => panic!("received an unexpected request"),
}
}
}
#[overseer::subsystem(NetworkBridgeTx, error=SubsystemError, prefix=self::overseer)]
impl<Context> MockNetworkBridgeTx {
fn start(self, ctx: Context) -> SpawnedSubsystem {
let future = self.run(ctx).map(|_| Ok(())).boxed();
SpawnedSubsystem { name: "test-environment", future }
}
}
#[overseer::contextbounds(NetworkBridgeTx, prefix = self::overseer)]
impl MockNetworkBridgeTx {
async fn run<Context>(mut self, mut ctx: Context) {
let (mut ingress_tx, mut ingress_rx) =
tokio::sync::mpsc::unbounded_channel::<NetworkAction>();
// Initialize our node bandwidth limits.
let mut rx_limiter = RateLimit::new(10, self.config.bandwidth);
let our_network = self.network.clone();
// This task handles the messages our node receives from the simulated network.
ctx.spawn_blocking(
"network-receive",
async move {
while let Some(action) = ingress_rx.recv().await {
let size = action.size();
// Account for our node receiving the data.
our_network.inc_received(size);
rx_limiter.reap(size).await;
action.run().await;
}
}
.boxed(),
)
.expect("We never fail to spawn tasks");
// Main subsystem loop.
loop {
let msg = ctx.recv().await.expect("Overseer never fails us");
match msg {
orchestra::FromOrchestra::Signal(signal) =>
if signal == OverseerSignal::Conclude {
return
},
orchestra::FromOrchestra::Communication { msg } => match msg {
NetworkBridgeTxMessage::SendRequests(requests, _if_disconnected) => {
for request in requests {
gum::debug!(target: LOG_TARGET, request = ?request, "Processing request");
self.network.inc_sent(request_size(&request));
let action = self.respond_to_send_request(request, &mut ingress_tx);
// Will account for our node sending the request over the emulated
// network.
self.network.submit_peer_action(action.peer(), action);
}
},
_ => {
unimplemented!("Unexpected network bridge message")
},
},
}
}
}
}
// A helper to determine the request payload size.
fn request_size(request: &Requests) -> usize {
match request {
Requests::ChunkFetchingV1(outgoing_request) => outgoing_request.payload.encoded_size(),
Requests::AvailableDataFetchingV1(outgoing_request) =>
outgoing_request.payload.encoded_size(),
_ => unimplemented!("received an unexpected request"),
}
}
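Review note: `respond_to_send_request` builds a two-stage action. The outer action runs on the emulated peer and only forwards an inner action to our node's ingress channel. The inner action, run by the `network-receive` task after rate limiting, is what actually delivers the response. The sketch below models that hand-off synchronously with `std::sync::mpsc`. `Action`, `peer_response` and `drain_ingress` are hypothetical names; latency, rate limiting and async execution are left out, and errors are treated as zero-sized as in the chunk-fetching branch.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Hypothetical, synchronous stand-in for `NetworkAction`: a payload size plus
/// a deferred piece of work.
pub struct Action {
    size: usize,
    run: Box<dyn FnOnce() + Send>,
}

impl Action {
    pub fn new(size: usize, run: impl FnOnce() + Send + 'static) -> Self {
        Self { size, run: Box::new(run) }
    }

    pub fn size(&self) -> usize {
        self.size
    }

    pub fn run(self) {
        (self.run)()
    }
}

pub type Response = Result<Vec<u8>, &'static str>;

/// Build the peer-side action. Running it does not deliver the response
/// directly; it forwards a second action to our node's ingress channel.
pub fn peer_response(response: Response, deliver: Sender<Response>, ingress: Sender<Action>) -> Action {
    // Errors carry no payload here, so they do not count as bandwidth.
    let size = response.as_ref().map(|bytes| bytes.len()).unwrap_or(0);
    let delivery = Action::new(size, move || {
        let _ = deliver.send(response);
    });
    Action::new(size, move || {
        let _ = ingress.send(delivery);
    })
}

/// Drain the ingress channel, accounting received bytes before each delivery.
pub fn drain_ingress(ingress: Receiver<Action>) -> usize {
    let mut received = 0;
    for action in ingress.try_iter() {
        received += action.size();
        action.run();
    }
    received
}
```

Splitting the work this way lets the sending peer and the receiving node each apply their own accounting to the same payload.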
@@ -0,0 +1,110 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
//!
//! A generic runtime api subsystem mockup suitable to be used in benchmarks.
use polkadot_primitives::{GroupIndex, IndexedVec, SessionInfo, ValidatorIndex};
use polkadot_node_subsystem::{
messages::{RuntimeApiMessage, RuntimeApiRequest},
overseer, SpawnedSubsystem, SubsystemError,
};
use polkadot_node_subsystem_types::OverseerSignal;
use crate::core::configuration::{TestAuthorities, TestConfiguration};
use futures::FutureExt;
const LOG_TARGET: &str = "subsystem-bench::runtime-api-mock";
pub struct RuntimeApiState {
authorities: TestAuthorities,
}
pub struct MockRuntimeApi {
state: RuntimeApiState,
config: TestConfiguration,
}
impl MockRuntimeApi {
pub fn new(config: TestConfiguration, authorities: TestAuthorities) -> MockRuntimeApi {
Self { state: RuntimeApiState { authorities }, config }
}
fn session_info(&self) -> SessionInfo {
let all_validators = (0..self.config.n_validators)
.map(|i| ValidatorIndex(i as _))
.collect::<Vec<_>>();
let validator_groups = all_validators.chunks(5).map(Vec::from).collect::<Vec<_>>();
SessionInfo {
validators: self.state.authorities.validator_public.clone().into(),
discovery_keys: self.state.authorities.validator_authority_id.clone(),
validator_groups: IndexedVec::<GroupIndex, Vec<ValidatorIndex>>::from(validator_groups),
assignment_keys: vec![],
n_cores: self.config.n_cores as u32,
zeroth_delay_tranche_width: 0,
relay_vrf_modulo_samples: 0,
n_delay_tranches: 0,
no_show_slots: 0,
needed_approvals: 0,
active_validator_indices: vec![],
dispute_period: 6,
random_seed: [0u8; 32],
}
}
}
#[overseer::subsystem(RuntimeApi, error=SubsystemError, prefix=self::overseer)]
impl<Context> MockRuntimeApi {
fn start(self, ctx: Context) -> SpawnedSubsystem {
let future = self.run(ctx).map(|_| Ok(())).boxed();
SpawnedSubsystem { name: "test-environment", future }
}
}
#[overseer::contextbounds(RuntimeApi, prefix = self::overseer)]
impl MockRuntimeApi {
async fn run<Context>(self, mut ctx: Context) {
loop {
let msg = ctx.recv().await.expect("Overseer never fails us");
match msg {
orchestra::FromOrchestra::Signal(signal) =>
if signal == OverseerSignal::Conclude {
return
},
orchestra::FromOrchestra::Communication { msg } => {
gum::debug!(target: LOG_TARGET, msg=?msg, "recv message");
match msg {
RuntimeApiMessage::Request(
_request,
RuntimeApiRequest::SessionInfo(_session_index, sender),
) => {
let _ = sender.send(Ok(Some(self.session_info())));
},
// Long term TODO: implement more as needed.
_ => {
unimplemented!("Unexpected runtime-api message")
},
}
},
}
}
}
}
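Review note: `MockRuntimeApi::session_info` assigns validators to backing groups by slicing the index range into consecutive chunks of five. The sketch below isolates that step with plain `u32` indices in place of `ValidatorIndex`. `validator_groups` is a hypothetical helper, and `group_size` is made a parameter here although the mock hard-codes 5.

```rust
/// Split validator indices `0..n_validators` into consecutive groups of at
/// most `group_size` members; only the last group may be smaller.
///
/// Panics if `group_size` is zero, because `slice::chunks` does.
pub fn validator_groups(n_validators: usize, group_size: usize) -> Vec<Vec<u32>> {
    let all_validators: Vec<u32> = (0..n_validators).map(|i| i as u32).collect();
    all_validators.chunks(group_size).map(Vec::from).collect()
}
```

For example, 12 validators with a group size of 5 give three groups of sizes 5, 5 and 2.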
@@ -0,0 +1,24 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
const LOG_TARGET: &str = "subsystem-bench::core";
pub mod configuration;
pub mod display;
pub mod environment;
pub mod keyring;
pub mod mock;
pub mod network;
@@ -0,0 +1,485 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
use super::{
configuration::{TestAuthorities, TestConfiguration},
environment::TestEnvironmentDependencies,
*,
};
use colored::Colorize;
use polkadot_primitives::AuthorityDiscoveryId;
use prometheus_endpoint::U64;
use rand::{seq::SliceRandom, thread_rng};
use sc_service::SpawnTaskHandle;
use std::{
collections::HashMap,
sync::{
atomic::{AtomicU64, Ordering},
Arc,
},
time::{Duration, Instant},
};
use tokio::sync::mpsc::UnboundedSender;
// An emulated node's egress traffic rate limiter.
#[derive(Debug)]
pub struct RateLimit {
// How often we refill credits in buckets
tick_rate: usize,
// Total ticks
total_ticks: usize,
// Max refill per tick
max_refill: usize,
// Available credit. We allow bursts larger than one tick's share (`cps / tick_rate`) of the
// budget and account for them with negative credit.
credits: isize,
// When last refilled.
last_refill: Instant,
}
impl RateLimit {
// Create a new `RateLimit` from a `cps` (credits per second) budget and
// `tick_rate`.
pub fn new(tick_rate: usize, cps: usize) -> Self {
// Compute how much refill for each tick
let max_refill = cps / tick_rate;
RateLimit {
tick_rate,
total_ticks: 0,
max_refill,
// A fresh start
credits: max_refill as isize,
last_refill: Instant::now(),
}
}
pub async fn refill(&mut self) {
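// If this is called too early, we need to sleep until the next tick.
let now = Instant::now();
// Saturates to zero if the next tick is already due, instead of relying on
// the behaviour of `Instant - Instant` for a negative difference.
let next_tick_delta = (self.last_refill +
Duration::from_millis(1000 / self.tick_rate as u64))
.saturating_duration_since(now);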
// Sleep until next tick.
if !next_tick_delta.is_zero() {
gum::trace!(target: LOG_TARGET, "need to sleep {}ms", next_tick_delta.as_millis());
tokio::time::sleep(next_tick_delta).await;
}
self.total_ticks += 1;
self.credits += self.max_refill as isize;
self.last_refill = Instant::now();
}
// Reap credits from the bucket.
// Waits for refills if the credit budget goes negative during the call.
pub async fn reap(&mut self, amount: usize) {
self.credits -= amount as isize;
if self.credits >= 0 {
return
}
while self.credits < 0 {
gum::trace!(target: LOG_TARGET, "Before refill: {:?}", &self);
self.refill().await;
gum::trace!(target: LOG_TARGET, "After refill: {:?}", &self);
}
}
}
#[cfg(test)]
mod tests {
use std::time::Instant;
use super::RateLimit;
#[tokio::test]
async fn test_expected_rate() {
let tick_rate = 200;
let budget = 1_000_000;
// The rate must not exceed `budget` credits per second.
let mut rate_limiter = RateLimit::new(tick_rate, budget);
let mut total_sent = 0usize;
let start = Instant::now();
let mut reap_amount = 0;
while rate_limiter.total_ticks < tick_rate {
reap_amount += 1;
reap_amount %= 100;
rate_limiter.reap(reap_amount).await;
total_sent += reap_amount;
}
let end = Instant::now();
println!("duration: {}", (end - start).as_millis());
// Allow up to `budget/max_refill` error tolerance
let lower_bound = budget as u128 * ((end - start).as_millis() / 1000u128);
let upper_bound = budget as u128 *
((end - start).as_millis() / 1000u128 + rate_limiter.max_refill as u128);
assert!(total_sent as u128 >= lower_bound);
assert!(total_sent as u128 <= upper_bound);
}
}
// A network peer emulator. It spawns a task that accepts `NetworkAction`s and
// executes them with a configurable delay and bandwidth constraints. Typically
// these actions wrap a future that performs a channel send to the subsystem(s) under test.
#[derive(Clone)]
struct PeerEmulator {
// The queue of requests waiting to be served by the emulator
actions_tx: UnboundedSender<NetworkAction>,
}
impl PeerEmulator {
pub fn new(
bandwidth: usize,
spawn_task_handle: SpawnTaskHandle,
stats: Arc<PeerEmulatorStats>,
) -> Self {
let (actions_tx, mut actions_rx) = tokio::sync::mpsc::unbounded_channel();
spawn_task_handle
.clone()
.spawn("peer-emulator", "test-environment", async move {
// Rate limit peer send.
let mut rate_limiter = RateLimit::new(10, bandwidth);
loop {
let stats_clone = stats.clone();
let maybe_action: Option<NetworkAction> = actions_rx.recv().await;
if let Some(action) = maybe_action {
let size = action.size();
rate_limiter.reap(size).await;
if let Some(latency) = action.latency {
spawn_task_handle.spawn(
"peer-emulator-latency",
"test-environment",
async move {
tokio::time::sleep(latency).await;
action.run().await;
stats_clone.inc_sent(size);
},
)
} else {
action.run().await;
stats_clone.inc_sent(size);
}
} else {
break
}
}
});
Self { actions_tx }
}
// Queue a send request from the emulated peer.
pub fn send(&mut self, action: NetworkAction) {
self.actions_tx.send(action).expect("peer emulator task lives");
}
}
pub type ActionFuture = std::pin::Pin<Box<dyn futures::Future<Output = ()> + std::marker::Send>>;
/// A network action to be completed by the emulator task.
pub struct NetworkAction {
// The function that performs the action
run: ActionFuture,
// The payload size that we simulate sending/receiving from a peer
size: usize,
// Peer which should run the action.
peer: AuthorityDiscoveryId,
// How long to wait before polling `run`.
latency: Option<Duration>,
}
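// `NetworkAction` is automatically `Send`: every field, including the boxed
// `ActionFuture`, is `Send`, so no `unsafe impl` is needed.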
/// Bookkeeping of sent and received bytes.
pub struct PeerEmulatorStats {
rx_bytes_total: AtomicU64,
tx_bytes_total: AtomicU64,
metrics: Metrics,
peer_index: usize,
}
impl PeerEmulatorStats {
pub(crate) fn new(peer_index: usize, metrics: Metrics) -> Self {
Self {
metrics,
rx_bytes_total: AtomicU64::from(0),
tx_bytes_total: AtomicU64::from(0),
peer_index,
}
}
pub fn inc_sent(&self, bytes: usize) {
self.tx_bytes_total.fetch_add(bytes as u64, Ordering::Relaxed);
self.metrics.on_peer_sent(self.peer_index, bytes);
}
pub fn inc_received(&self, bytes: usize) {
self.rx_bytes_total.fetch_add(bytes as u64, Ordering::Relaxed);
self.metrics.on_peer_received(self.peer_index, bytes);
}
pub fn sent(&self) -> u64 {
self.tx_bytes_total.load(Ordering::Relaxed)
}
pub fn received(&self) -> u64 {
self.rx_bytes_total.load(Ordering::Relaxed)
}
}
#[derive(Debug, Default)]
pub struct PeerStats {
pub rx_bytes_total: u64,
pub tx_bytes_total: u64,
}
impl NetworkAction {
pub fn new(
peer: AuthorityDiscoveryId,
run: ActionFuture,
size: usize,
latency: Option<Duration>,
) -> Self {
Self { run, size, peer, latency }
}
pub fn size(&self) -> usize {
self.size
}
pub async fn run(self) {
self.run.await;
}
pub fn peer(&self) -> AuthorityDiscoveryId {
self.peer.clone()
}
}
/// The state of a peer on the emulated network.
#[derive(Clone)]
enum Peer {
Connected(PeerEmulator),
Disconnected(PeerEmulator),
}
impl Peer {
pub fn disconnect(&mut self) {
let new_self = match self {
Peer::Connected(peer) => Peer::Disconnected(peer.clone()),
_ => return,
};
*self = new_self;
}
pub fn is_connected(&self) -> bool {
matches!(self, Peer::Connected(_))
}
pub fn emulator(&mut self) -> &mut PeerEmulator {
match self {
Peer::Connected(ref mut emulator) => emulator,
Peer::Disconnected(ref mut emulator) => emulator,
}
}
}
/// Mocks the network bridge and an arbitrary number of connected peer nodes.
/// Implements network latency, bandwidth and connection errors.
#[derive(Clone)]
pub struct NetworkEmulator {
// Per peer network emulation.
peers: Vec<Peer>,
/// Per peer stats.
stats: Vec<Arc<PeerEmulatorStats>>,
/// Each emulated peer is a validator.
validator_authority_ids: HashMap<AuthorityDiscoveryId, usize>,
}
impl NetworkEmulator {
pub fn new(
config: &TestConfiguration,
dependencies: &TestEnvironmentDependencies,
authorities: &TestAuthorities,
) -> Self {
let n_peers = config.n_validators;
gum::info!(target: LOG_TARGET, "{}", format!("Initializing emulation for a {} peer network.", n_peers).bright_blue());
gum::info!(target: LOG_TARGET, "{}", format!("connectivity {}%, error {}%", config.connectivity, config.error).bright_black());
let metrics =
Metrics::new(&dependencies.registry).expect("Metrics always register successfully");
let mut validator_authority_id_mapping = HashMap::new();
// Create a `PeerEmulator` for each peer.
let (stats, mut peers): (_, Vec<_>) = (0..n_peers)
.zip(authorities.validator_authority_id.clone())
.map(|(peer_index, authority_id)| {
validator_authority_id_mapping.insert(authority_id, peer_index);
let stats = Arc::new(PeerEmulatorStats::new(peer_index, metrics.clone()));
(
stats.clone(),
Peer::Connected(PeerEmulator::new(
config.peer_bandwidth,
dependencies.task_manager.spawn_handle(),
stats,
)),
)
})
.unzip();
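// Multiply before dividing so that whole-number results are not lost to
// floating point rounding when truncated to `usize` below.
let connected_count = config.n_validators as f64 * config.connectivity as f64 / 100.0;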
let (_connected, to_disconnect) =
peers.partial_shuffle(&mut thread_rng(), connected_count as usize);
for peer in to_disconnect {
peer.disconnect();
}
gum::info!(target: LOG_TARGET, "{}", format!("Network created, connected validator count {}", connected_count).bright_black());
Self { peers, stats, validator_authority_ids: validator_authority_id_mapping }
}
pub fn is_peer_connected(&self, peer: &AuthorityDiscoveryId) -> bool {
self.peer(peer).is_connected()
}
pub fn submit_peer_action(&mut self, peer: AuthorityDiscoveryId, action: NetworkAction) {
let index = self
.validator_authority_ids
.get(&peer)
.expect("all test authorities are valid; qed");
let peer = self.peers.get_mut(*index).expect("We just retrieved the index above; qed");
// Only actions of size 0 are allowed on disconnected peers.
// Typically these are delayed error response sends.
if action.size() > 0 && !peer.is_connected() {
gum::warn!(target: LOG_TARGET, peer_index = index, "Attempted to send data from a disconnected peer, operation ignored");
return
}
peer.emulator().send(action);
}
// Returns the sent/received stats for `peer_index`.
pub fn peer_stats(&self, peer_index: usize) -> Arc<PeerEmulatorStats> {
self.stats[peer_index].clone()
}
// Helper to get peer index by `AuthorityDiscoveryId`
fn peer_index(&self, peer: &AuthorityDiscoveryId) -> usize {
*self
.validator_authority_ids
.get(peer)
.expect("all test authorities are valid; qed")
}
// Return the Peer entry for a given `AuthorityDiscoveryId`.
fn peer(&self, peer: &AuthorityDiscoveryId) -> &Peer {
&self.peers[self.peer_index(peer)]
}
// Returns the sent/received stats for `peer`.
pub fn peer_stats_by_id(&mut self, peer: &AuthorityDiscoveryId) -> Arc<PeerEmulatorStats> {
let peer_index = self.peer_index(peer);
self.stats[peer_index].clone()
}
// Returns the sent/received stats for all peers.
pub fn stats(&self) -> Vec<PeerStats> {
self.stats
.iter()
.map(|stats| PeerStats {
rx_bytes_total: stats.received(),
tx_bytes_total: stats.sent(),
})
.collect::<Vec<_>>()
}
// Increment bytes sent by our node (the node that contains the subsystem under test)
pub fn inc_sent(&self, bytes: usize) {
// Our node is always peer 0.
self.peer_stats(0).inc_sent(bytes);
}
// Increment bytes received by our node (the node that contains the subsystem under test)
pub fn inc_received(&self, bytes: usize) {
// Our node is always peer 0.
self.peer_stats(0).inc_received(bytes);
}
}
use polkadot_node_subsystem_util::metrics::prometheus::{
self, CounterVec, Opts, PrometheusError, Registry,
};
/// Emulated network metrics.
#[derive(Clone)]
pub(crate) struct Metrics {
/// Number of bytes sent per peer.
peer_total_sent: CounterVec<U64>,
/// Number of bytes received per peer.
peer_total_received: CounterVec<U64>,
}
impl Metrics {
pub fn new(registry: &Registry) -> Result<Self, PrometheusError> {
Ok(Self {
peer_total_sent: prometheus::register(
CounterVec::new(
Opts::new(
"subsystem_benchmark_network_peer_total_bytes_sent",
"Total number of bytes a peer has sent.",
),
&["peer"],
)?,
registry,
)?,
peer_total_received: prometheus::register(
CounterVec::new(
Opts::new(
"subsystem_benchmark_network_peer_total_bytes_received",
"Total number of bytes a peer has received.",
),
&["peer"],
)?,
registry,
)?,
})
}
/// Increment total sent for a peer.
pub fn on_peer_sent(&self, peer_index: usize, bytes: usize) {
self.peer_total_sent
.with_label_values(&[format!("node{}", peer_index).as_str()])
.inc_by(bytes as u64);
}
/// Increment total received for a peer.
pub fn on_peer_received(&self, peer_index: usize, bytes: usize) {
self.peer_total_received
.with_label_values(&[format!("node{}", peer_index).as_str()])
.inc_by(bytes as u64);
}
}
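Review note: `RateLimit` above behaves like a token bucket that lets a single send overdraw the bucket and then pays the debt off over the following ticks. The credit accounting can be sketched without any clock or async runtime, as below. This is a simplified model rather than the PR's implementation: `CreditBucket` is a hypothetical name, and `reap` returns the number of ticks a caller would have had to wait instead of actually sleeping.

```rust
/// A clock-free model of the credit accounting in `RateLimit`.
/// Hypothetical type for illustration only.
struct CreditBucket {
    /// Credits added on every tick (`cps / tick_rate`).
    max_refill: usize,
    /// Available credit; negative while a burst is being paid off.
    credits: isize,
    /// Number of refills performed so far.
    total_ticks: usize,
}

impl CreditBucket {
    fn new(tick_rate: usize, cps: usize) -> Self {
        let max_refill = cps / tick_rate;
        // A zero refill would never pay off a debt, so `reap` would loop forever.
        assert!(max_refill > 0, "cps must be at least tick_rate");
        Self { max_refill, credits: max_refill as isize, total_ticks: 0 }
    }

    /// Take `amount` credits and return how many ticks had to elapse
    /// before the balance was non-negative again.
    fn reap(&mut self, amount: usize) -> usize {
        self.credits -= amount as isize;
        let mut waited_ticks = 0;
        while self.credits < 0 {
            // In the real limiter this is where the task sleeps until the next tick.
            self.credits += self.max_refill as isize;
            self.total_ticks += 1;
            waited_ticks += 1;
        }
        waited_ticks
    }
}
```

For a bucket created with `CreditBucket::new(10, 1000)` (100 credits per tick), reaping 50 credits returns immediately, while a following burst of 250 credits overdraws the bucket by 200 and costs two ticks.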
@@ -0,0 +1,186 @@
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.
// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.
//! A tool for running subsystem benchmark tests designed for development and
//! CI regression testing.
use clap::Parser;
use color_eyre::eyre;
use colored::Colorize;
use std::{path::Path, time::Duration};
pub(crate) mod availability;
pub(crate) mod cli;
pub(crate) mod core;
use availability::{prepare_test, NetworkEmulation, TestState};
use cli::TestObjective;
use core::{
configuration::TestConfiguration,
environment::{TestEnvironment, GENESIS_HASH},
};
use clap_num::number_range;
use crate::core::display::display_configuration;
fn le_100(s: &str) -> Result<usize, String> {
number_range(s, 0, 100)
}
// Returns `u64` to match the type of the latency fields that use this parser.
fn le_5000(s: &str) -> Result<u64, String> {
number_range(s, 0, 5000)
}
#[derive(Debug, Parser)]
#[allow(missing_docs)]
struct BenchCli {
#[arg(long, value_enum, ignore_case = true, default_value_t = NetworkEmulation::Ideal)]
/// The type of network to be emulated
pub network: NetworkEmulation,
#[clap(flatten)]
pub standard_configuration: cli::StandardTestOptions,
#[clap(short, long)]
/// The bandwidth of simulated remote peers in KiB
pub peer_bandwidth: Option<usize>,
#[clap(short, long)]
/// The bandwidth of our simulated node in KiB
pub bandwidth: Option<usize>,
#[clap(long, value_parser=le_100)]
/// Simulated connection error ratio [0-100].
pub peer_error: Option<usize>,
#[clap(long, value_parser=le_5000)]
/// Minimum remote peer latency in milliseconds [0-5000].
pub peer_min_latency: Option<u64>,
#[clap(long, value_parser=le_5000)]
/// Maximum remote peer latency in milliseconds [0-5000].
pub peer_max_latency: Option<u64>,
#[command(subcommand)]
pub objective: cli::TestObjective,
}
impl BenchCli {
fn launch(self) -> eyre::Result<()> {
let configuration = self.standard_configuration;
let mut test_config = match self.objective {
TestObjective::TestSequence(options) => {
let test_sequence =
core::configuration::TestSequence::new_from_file(Path::new(&options.path))
.expect("File exists")
.into_vec();
let num_steps = test_sequence.len();
gum::info!(
"{}",
format!("Sequence contains {} step(s)", num_steps).bright_purple()
);
for (index, test_config) in test_sequence.into_iter().enumerate() {
gum::info!("{}", format!("Step {}/{}", index + 1, num_steps).bright_purple(),);
display_configuration(&test_config);
let mut state = TestState::new(&test_config);
let (mut env, _protocol_config) = prepare_test(test_config, &mut state);
env.runtime()
.block_on(availability::benchmark_availability_read(&mut env, state));
}
return Ok(())
},
TestObjective::DataAvailabilityRead(ref _options) => match self.network {
NetworkEmulation::Healthy => TestConfiguration::healthy_network(
self.objective,
configuration.num_blocks,
configuration.n_validators,
configuration.n_cores,
configuration.min_pov_size,
configuration.max_pov_size,
),
NetworkEmulation::Degraded => TestConfiguration::degraded_network(
self.objective,
configuration.num_blocks,
configuration.n_validators,
configuration.n_cores,
configuration.min_pov_size,
configuration.max_pov_size,
),
NetworkEmulation::Ideal => TestConfiguration::ideal_network(
self.objective,
configuration.num_blocks,
configuration.n_validators,
configuration.n_cores,
configuration.min_pov_size,
configuration.max_pov_size,
),
},
};
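let mut latency_config = test_config.latency.clone().unwrap_or_default();
if let Some(latency) = self.peer_min_latency {
latency_config.min_latency = Duration::from_millis(latency);
}
if let Some(latency) = self.peer_max_latency {
latency_config.max_latency = Duration::from_millis(latency);
}
// Write the overrides back into the test configuration, otherwise they are
// silently dropped. Leave the profile untouched when no flag was given.
if self.peer_min_latency.is_some() || self.peer_max_latency.is_some() {
test_config.latency = Some(latency_config);
}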
if let Some(error) = self.peer_error {
test_config.error = error;
}
if let Some(bandwidth) = self.peer_bandwidth {
// CLI expects bw in KiB
test_config.peer_bandwidth = bandwidth * 1024;
}
if let Some(bandwidth) = self.bandwidth {
// CLI expects bw in KiB
test_config.bandwidth = bandwidth * 1024;
}
display_configuration(&test_config);
let mut state = TestState::new(&test_config);
let (mut env, _protocol_config) = prepare_test(test_config, &mut state);
// test_config.write_to_disk();
env.runtime()
.block_on(availability::benchmark_availability_read(&mut env, state));
Ok(())
}
}
fn main() -> eyre::Result<()> {
color_eyre::install()?;
env_logger::builder()
.filter(Some("hyper"), log::LevelFilter::Info)
// Avoid `Terminating due to subsystem exit subsystem` warnings
.filter(Some("polkadot_overseer"), log::LevelFilter::Error)
.filter(None, log::LevelFilter::Info)
// .filter(None, log::LevelFilter::Trace)
.try_init()
.unwrap();
let cli: BenchCli = BenchCli::parse();
cli.launch()?;
Ok(())
}
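Review note: the latency flags in `BenchCli` are optional overrides layered on top of the selected network profile. The merge can be sketched in isolation as follows. This is a minimal sketch under assumptions: the `Latency` struct stands in for the PR's latency configuration type, and `apply_latency_overrides` is a hypothetical helper that does not exist in this PR.

```rust
use std::time::Duration;

/// Stand-in for the latency configuration type (hypothetical, for illustration).
#[derive(Clone, Debug, Default, PartialEq)]
struct Latency {
    min_latency: Duration,
    max_latency: Duration,
}

/// Apply optional CLI overrides (in milliseconds) on top of a profile's latency.
/// Returns the profile value unchanged when no override is given.
fn apply_latency_overrides(
    profile: Option<Latency>,
    min_ms: Option<u64>,
    max_ms: Option<u64>,
) -> Option<Latency> {
    if min_ms.is_none() && max_ms.is_none() {
        return profile
    }
    // Start from the profile, or from zero latency if the profile has none.
    let mut latency = profile.unwrap_or_default();
    if let Some(ms) = min_ms {
        latency.min_latency = Duration::from_millis(ms);
    }
    if let Some(ms) = max_ms {
        latency.max_latency = Duration::from_millis(ms);
    }
    Some(latency)
}
```

Returning `profile` untouched when no flag is set keeps a profile without latency at `None`, rather than turning it into a zero-latency configuration.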