mirror of
https://github.com/pezkuwichain/pezkuwi-subxt.git
synced 2026-05-31 09:51:02 +00:00
a84dd0dba5
Initial implementation for the plan discussed here: https://github.com/paritytech/polkadot-sdk/issues/701
Built on top of https://github.com/paritytech/polkadot-sdk/pull/1178
v0: https://github.com/paritytech/polkadot/pull/7554

## Overall idea

When approval-voting checks a candidate and is ready to advertise the approval, defer it in a per-relay-chain-block queue until we either have MAX_APPROVAL_COALESCE_COUNT candidates to sign or a candidate has stayed MAX_APPROVALS_COALESCE_TICKS in the queue. In both cases we sign the candidates we have available.

This should allow us to reduce the number of approval messages we have to create, send, and verify. The parameters are configurable, so we should find values that balance:

- Security of the network: Delaying the broadcast of an approval shouldn't put finality at risk. To make sure that never happens, we won't delay sending a vote if we are past 2/3 of the no-show time.
- Scalability of the network: MAX_APPROVAL_COALESCE_COUNT = 1 and MAX_APPROVALS_COALESCE_TICKS = 0 is what we have now, and we know from the measurements we did on versi that it bottlenecks approval-distribution/approval-voting when we significantly increase the number of validators and parachains.
- Block storage: In case of disputes we have to import these votes on chain, and that increases the necessary storage by MAX_APPROVAL_COALESCE_COUNT * CandidateHash per vote. Given that disputes are not the normal mode of operation for the network and we will limit MAX_APPROVAL_COALESCE_COUNT to single digits, this should be good enough. Alternatively, we could try to create a better way to store this on-chain through indirection, if that's needed.

## Other fixes:

- Fixed the fact that we were sending random assignments to non-validators. That was wrong because they won't do anything with them and won't gossip them either, since they do not have a grid topology set, so we would waste the random assignments.
- Added metrics to be able to debug potential no-shows and mis-processing of approvals/assignments.

## TODO:

- [x] Get feedback that this is moving in the right direction. @ordian @sandreim @eskimor @burdges, let me know what you think.
- [x] More and more testing.
- [x] Test in versi.
- [x] Make MAX_APPROVAL_COALESCE_COUNT & MAX_APPROVAL_COALESCE_WAIT_MILLIS a parachain host configuration.
- [x] Make sure the backwards compatibility works correctly
- [x] Make sure this direction is compatible with other streams of work: https://github.com/paritytech/polkadot-sdk/issues/635 & https://github.com/paritytech/polkadot-sdk/issues/742
- [x] Final versi burn-in before merging

---------

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
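The coalescing rule described under "Overall idea" can be sketched in a few lines. This is a minimal illustration, not the approval-voting implementation: the class `CoalescingQueue`, its fields, and the way the no-show deadline is measured (ticks elapsed since a hypothetical `assigned_at` tick) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CoalescingQueue:
    """Hypothetical per-relay-chain-block queue of approvals waiting to be signed."""
    max_count: int       # stands in for MAX_APPROVAL_COALESCE_COUNT
    max_ticks: int       # stands in for MAX_APPROVALS_COALESCE_TICKS
    no_show_ticks: int   # assumed length of the no-show period, in ticks
    pending: list = field(default_factory=list)  # (candidate, enqueued_at, assigned_at)

    def push(self, candidate, now, assigned_at):
        """Queue an approval and return the batch to sign now (possibly empty)."""
        self.pending.append((candidate, now, assigned_at))
        return self.flush_if_due(now)

    def flush_if_due(self, now):
        """Return all queued candidates if any flush condition holds, else []."""
        if not self.pending:
            return []
        full = len(self.pending) >= self.max_count
        waited = any(now - enq >= self.max_ticks for _, enq, _ in self.pending)
        # Never delay a vote that is past 2/3 of the no-show time.
        urgent = any(3 * (now - asg) >= 2 * self.no_show_ticks
                     for _, _, asg in self.pending)
        if full or waited or urgent:
            batch = [cand for cand, _, _ in self.pending]
            self.pending.clear()
            return batch
        return []
```

With `max_count=1` and `max_ticks=0` every `push` returns a one-element batch immediately, which matches the "what we have now" configuration mentioned above.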
85 lines
6.3 KiB
Plaintext
Description: PVF preparation & execution time
Network: ./0001-parachains-pvf.toml
Creds: config

# Check authority status.
alice: reports node_roles is 4
bob: reports node_roles is 4
charlie: reports node_roles is 4
dave: reports node_roles is 4
eve: reports node_roles is 4
ferdie: reports node_roles is 4
one: reports node_roles is 4
two: reports node_roles is 4

# Ensure parachains are registered.
alice: parachain 2000 is registered within 60 seconds
bob: parachain 2001 is registered within 60 seconds
charlie: parachain 2002 is registered within 60 seconds
dave: parachain 2003 is registered within 60 seconds
ferdie: parachain 2004 is registered within 60 seconds
eve: parachain 2005 is registered within 60 seconds
one: parachain 2006 is registered within 60 seconds
two: parachain 2007 is registered within 60 seconds

# Ensure parachains made progress.
alice: parachain 2000 block height is at least 10 within 300 seconds
alice: parachain 2001 block height is at least 10 within 300 seconds
alice: parachain 2002 block height is at least 10 within 300 seconds
alice: parachain 2003 block height is at least 10 within 300 seconds
alice: parachain 2004 block height is at least 10 within 300 seconds
alice: parachain 2005 block height is at least 10 within 300 seconds
alice: parachain 2006 block height is at least 10 within 300 seconds
alice: parachain 2007 block height is at least 10 within 300 seconds

alice: reports substrate_block_height{status="finalized"} is at least 30 within 400 seconds

# Check preparation time is under 10s.
# Check that every node has samples in buckets <= 10.
alice: reports histogram polkadot_pvf_preparation_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2", "3", "10"] within 10 seconds
bob: reports histogram polkadot_pvf_preparation_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2", "3", "10"] within 10 seconds
charlie: reports histogram polkadot_pvf_preparation_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2", "3", "10"] within 10 seconds
dave: reports histogram polkadot_pvf_preparation_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2", "3", "10"] within 10 seconds
ferdie: reports histogram polkadot_pvf_preparation_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2", "3", "10"] within 10 seconds
eve: reports histogram polkadot_pvf_preparation_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2", "3", "10"] within 10 seconds
one: reports histogram polkadot_pvf_preparation_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2", "3", "10"] within 10 seconds
two: reports histogram polkadot_pvf_preparation_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2", "3", "10"] within 10 seconds

# Check that buckets >= 20 have no samples.
alice: reports histogram polkadot_pvf_preparation_time has 0 samples in buckets ["20", "30", "60", "120", "+Inf"] within 10 seconds
bob: reports histogram polkadot_pvf_preparation_time has 0 samples in buckets ["20", "30", "60", "120", "+Inf"] within 10 seconds
charlie: reports histogram polkadot_pvf_preparation_time has 0 samples in buckets ["20", "30", "60", "120", "+Inf"] within 10 seconds
dave: reports histogram polkadot_pvf_preparation_time has 0 samples in buckets ["20", "30", "60", "120", "+Inf"] within 10 seconds
ferdie: reports histogram polkadot_pvf_preparation_time has 0 samples in buckets ["20", "30", "60", "120", "+Inf"] within 10 seconds
eve: reports histogram polkadot_pvf_preparation_time has 0 samples in buckets ["20", "30", "60", "120", "+Inf"] within 10 seconds
one: reports histogram polkadot_pvf_preparation_time has 0 samples in buckets ["20", "30", "60", "120", "+Inf"] within 10 seconds
two: reports histogram polkadot_pvf_preparation_time has 0 samples in buckets ["20", "30", "60", "120", "+Inf"] within 10 seconds

# Check execution time.
# There are two different timeout conditions: DEFAULT_BACKING_EXECUTION_TIMEOUT(2s) and
# DEFAULT_APPROVAL_EXECUTION_TIMEOUT(12s). Currently these are not differentiated by metrics
# because the metrics are defined in `polkadot-node-core-pvf` which is a level below
# the relevant subsystems.
# That being said, we make the simplifying assumption of testing only the
# 2s timeout.
# We do this check by ensuring all executions fall into bucket le="2" or lower.
# First, check that we have at least 1 sample, though we should have many more.
alice: reports histogram polkadot_pvf_execution_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2"] within 10 seconds
bob: reports histogram polkadot_pvf_execution_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2"] within 10 seconds
charlie: reports histogram polkadot_pvf_execution_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2"] within 10 seconds
dave: reports histogram polkadot_pvf_execution_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2"] within 10 seconds
ferdie: reports histogram polkadot_pvf_execution_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2"] within 10 seconds
eve: reports histogram polkadot_pvf_execution_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2"] within 10 seconds
one: reports histogram polkadot_pvf_execution_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2"] within 10 seconds
two: reports histogram polkadot_pvf_execution_time has at least 1 samples in buckets ["0.1", "0.5", "1", "2"] within 10 seconds

# Check that we have no samples > 2s.
alice: reports histogram polkadot_pvf_execution_time has 0 samples in buckets ["3", "4", "5", "6", "+Inf"] within 10 seconds
bob: reports histogram polkadot_pvf_execution_time has 0 samples in buckets ["3", "4", "5", "6", "+Inf"] within 10 seconds
charlie: reports histogram polkadot_pvf_execution_time has 0 samples in buckets ["3", "4", "5", "6", "+Inf"] within 10 seconds
dave: reports histogram polkadot_pvf_execution_time has 0 samples in buckets ["3", "4", "5", "6", "+Inf"] within 10 seconds
ferdie: reports histogram polkadot_pvf_execution_time has 0 samples in buckets ["3", "4", "5", "6", "+Inf"] within 10 seconds
eve: reports histogram polkadot_pvf_execution_time has 0 samples in buckets ["3", "4", "5", "6", "+Inf"] within 10 seconds
one: reports histogram polkadot_pvf_execution_time has 0 samples in buckets ["3", "4", "5", "6", "+Inf"] within 10 seconds
two: reports histogram polkadot_pvf_execution_time has 0 samples in buckets ["3", "4", "5", "6", "+Inf"] within 10 seconds
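
The execution-time assertions in this test split the histogram into a "fast" bucket set (`le` up to 2) that must contain samples and a "slow" set that must be empty. The sketch below illustrates that split, assuming the test runner converts Prometheus-style cumulative `le` counts into per-bucket counts before summing; that evaluation detail, the helper names `per_bucket_counts` and `samples_in`, and the sample counts are assumptions for illustration, not taken from the test framework.

```python
def per_bucket_counts(cumulative):
    """Turn ascending (le_label, cumulative_count) pairs into per-bucket counts."""
    counts = {}
    previous = 0
    for le, total in cumulative:
        counts[le] = total - previous
        previous = total
    return counts


def samples_in(cumulative, buckets):
    """Sum the per-bucket sample counts over the selected bucket labels."""
    counts = per_bucket_counts(cumulative)
    return sum(counts[label] for label in buckets)


# Made-up cumulative counts for polkadot_pvf_execution_time on one node;
# the bucket labels are the ones used by the assertions in this file.
execution_time = [
    ("0.1", 40), ("0.5", 55), ("1", 58), ("2", 60),
    ("3", 60), ("4", 60), ("5", 60), ("6", 60), ("+Inf", 60),
]

fast = samples_in(execution_time, ["0.1", "0.5", "1", "2"])
slow = samples_in(execution_time, ["3", "4", "5", "6", "+Inf"])
```

For this made-up data every execution finished within 2 seconds, so both the "at least 1 samples" check on the fast buckets and the "0 samples" check on the slow buckets would hold.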