Approve multiple candidates with a single signature (#1191)

Initial implementation for the plan discussed here: https://github.com/paritytech/polkadot-sdk/issues/701
Built on top of https://github.com/paritytech/polkadot-sdk/pull/1178
v0: https://github.com/paritytech/polkadot/pull/7554

## Overall idea

When approval-voting checks a candidate and is ready to advertise the
approval, defer it in a per-relay-chain-block queue until we either have
MAX_APPROVAL_COALESCE_COUNT candidates to sign or a candidate has stayed
MAX_APPROVALS_COALESCE_TICKS in the queue. In both cases we sign whatever
candidates we have available.
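
The deferral rule above can be sketched as a small queue. This is a minimal illustration, not the approval-voting implementation: `CoalesceQueue`, its fields, and the simplified `CandidateHash` and `Tick` aliases are hypothetical names introduced only for this example.

```rust
// Hypothetical stand-ins for illustration; the real types live in the node.
type CandidateHash = u64;
type Tick = u64;

/// Per-relay-chain-block queue of approvals waiting to be signed together.
struct CoalesceQueue {
    /// Plays the role of MAX_APPROVAL_COALESCE_COUNT.
    max_count: usize,
    /// Plays the role of MAX_APPROVALS_COALESCE_TICKS.
    max_wait_ticks: Tick,
    /// (candidate, tick at which it was queued), oldest first.
    pending: Vec<(CandidateHash, Tick)>,
}

impl CoalesceQueue {
    fn new(max_count: usize, max_wait_ticks: Tick) -> Self {
        Self { max_count, max_wait_ticks, pending: Vec::new() }
    }

    /// Queue a checked candidate and return a batch to sign if a trigger fired.
    fn push(&mut self, candidate: CandidateHash, now: Tick) -> Option<Vec<CandidateHash>> {
        self.pending.push((candidate, now));
        self.poll(now)
    }

    /// Return everything queued if the queue is full or its oldest entry
    /// has waited at least `max_wait_ticks`; otherwise keep waiting.
    fn poll(&mut self, now: Tick) -> Option<Vec<CandidateHash>> {
        let (_, oldest) = *self.pending.first()?;
        let full = self.pending.len() >= self.max_count;
        let expired = now.saturating_sub(oldest) >= self.max_wait_ticks;
        if full || expired {
            Some(self.pending.drain(..).map(|(c, _)| c).collect())
        } else {
            None
        }
    }
}
```

With `max_count = 1` and `max_wait_ticks = 0`, every `push` returns a one-element batch immediately, which corresponds to signing each approval on its own.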

This should allow us to reduce the number of approval messages we have
to create/send/verify. The parameters are configurable, so we should
find some values that balance:

- Security of the network: Delaying the broadcast of an approval
shouldn't put finality at risk. To make sure that never happens,
we won't delay sending a vote if we are past 2/3 of the no-show time.
- Scalability of the network: MAX_APPROVAL_COALESCE_COUNT = 1 &
MAX_APPROVALS_COALESCE_TICKS = 0 is what we have now, and we know from
the measurements we did on versi that it bottlenecks
approval-distribution/approval-voting when we significantly increase the
number of validators and parachains.
- Block storage: In case of disputes we have to import these votes on
chain, and that increases the necessary storage by
MAX_APPROVAL_COALESCE_COUNT * CandidateHash per vote. Given that
disputes are not the normal way the network functions and we will
limit MAX_APPROVAL_COALESCE_COUNT to single digits, this
should be good enough. Alternatively, we could try to create a better
way to store this on-chain through indirection, if that's needed.
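
As a rough illustration of the security and storage points, the two checks can be written as plain arithmetic. This is a sketch under assumptions: `can_defer` and `hashes_bytes_per_vote` are hypothetical helpers, the reference point from which elapsed time is measured is not specified here, and treating exactly 2/3 as still deferrable is a choice made for the example. The 32-byte size reflects a candidate hash being a 32-byte hash.

```rust
/// Size in bytes of one candidate hash (a 32-byte hash).
const CANDIDATE_HASH_BYTES: usize = 32;

/// Hypothetical helper: may an approval still be held back for coalescing?
/// `elapsed_ticks` is the time already spent against the no-show timeout.
fn can_defer(elapsed_ticks: u64, no_show_ticks: u64) -> bool {
    // elapsed / no_show <= 2/3, rewritten without division:
    // 3 * elapsed <= 2 * no_show.
    elapsed_ticks.saturating_mul(3) <= no_show_ticks.saturating_mul(2)
}

/// Hypothetical helper: bytes of candidate hashes carried by one coalesced
/// vote, i.e. MAX_APPROVAL_COALESCE_COUNT * size_of(CandidateHash).
fn hashes_bytes_per_vote(coalesce_count: usize) -> usize {
    coalesce_count * CANDIDATE_HASH_BYTES
}
```

For example, with a coalesce count of 6 a single vote imported for a dispute would carry 192 bytes of candidate hashes, compared with 32 bytes for an uncoalesced vote.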

## Other fixes:
- Stopped sending random assignments to non-validators. Those peers
do nothing with them and do not gossip them either, because they have
no grid topology set, so the random assignments were wasted.
- Added metrics to be able to debug potential no-shows and
mis-processing of approvals/assignments.

## TODO:
- [x] Get feedback that this is moving in the right direction. @ordian
@sandreim @eskimor @burdges, let me know what you think.
- [x] More and more testing.
- [x] Test in versi.
- [x] Make MAX_APPROVAL_COALESCE_COUNT &
MAX_APPROVAL_COALESCE_WAIT_MILLIS a parachain host configuration.
- [x] Make sure the backwards compatibility works correctly
- [x] Make sure this direction is compatible with other streams of work:
https://github.com/paritytech/polkadot-sdk/issues/635 &
https://github.com/paritytech/polkadot-sdk/issues/742
- [x] Final versi burn-in before merging

---------

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
This commit is contained in:
Alexandru Gheorghe
2023-12-13 08:43:15 +02:00
committed by GitHub
parent d18a682bf7
commit a84dd0dba5
82 changed files with 5883 additions and 1483 deletions
+38 -17
@@ -73,7 +73,11 @@ impl PeerSet {
         // Networking layer relies on `get_main_name()` being the main name of the protocol
         // for peersets and connection management.
         let protocol = peerset_protocol_names.get_main_name(self);
-        let fallback_names = PeerSetProtocolNames::get_fallback_names(self);
+        let fallback_names = PeerSetProtocolNames::get_fallback_names(
+            self,
+            &peerset_protocol_names.genesis_hash,
+            peerset_protocol_names.fork_id.as_deref(),
+        );
         let max_notification_size = self.get_max_notification_size(is_authority);
         match self {
@@ -127,15 +131,8 @@ impl PeerSet {
     /// Networking layer relies on `get_main_version()` being the version
     /// of the main protocol name reported by [`PeerSetProtocolNames::get_main_name()`].
     pub fn get_main_version(self) -> ProtocolVersion {
-        #[cfg(not(feature = "network-protocol-staging"))]
-        match self {
-            PeerSet::Validation => ValidationVersion::V2.into(),
-            PeerSet::Collation => CollationVersion::V2.into(),
-        }
-        #[cfg(feature = "network-protocol-staging")]
         match self {
-            PeerSet::Validation => ValidationVersion::VStaging.into(),
+            PeerSet::Validation => ValidationVersion::V3.into(),
             PeerSet::Collation => CollationVersion::V2.into(),
         }
     }
@@ -163,7 +160,7 @@ impl PeerSet {
                     Some("validation/1")
                 } else if version == ValidationVersion::V2.into() {
                     Some("validation/2")
-                } else if version == ValidationVersion::VStaging.into() {
+                } else if version == ValidationVersion::V3.into() {
                     Some("validation/3")
                 } else {
                     None
@@ -236,9 +233,10 @@ pub enum ValidationVersion {
     V1 = 1,
     /// The second version.
     V2 = 2,
-    /// The staging version to gather changes
-    /// that before the release become v3.
-    VStaging = 3,
+    /// The third version where changes to ApprovalDistributionMessage had been made.
+    /// The changes are translatable to V2 format untill assignments v2 and approvals
+    /// coalescing is enabled through a runtime upgrade.
+    V3 = 3,
 }
 /// Supported collation protocol versions. Only versions defined here must be used in the codebase.
@@ -299,6 +297,8 @@ impl From<CollationVersion> for ProtocolVersion {
 pub struct PeerSetProtocolNames {
     protocols: HashMap<ProtocolName, (PeerSet, ProtocolVersion)>,
     names: HashMap<(PeerSet, ProtocolVersion), ProtocolName>,
+    genesis_hash: Hash,
+    fork_id: Option<String>,
 }
 impl PeerSetProtocolNames {
@@ -333,7 +333,7 @@ impl PeerSetProtocolNames {
             }
             Self::register_legacy_protocol(&mut protocols, protocol);
         }
-        Self { protocols, names }
+        Self { protocols, names, genesis_hash, fork_id: fork_id.map(|fork_id| fork_id.into()) }
     }
     /// Helper function to register main protocol.
@@ -437,9 +437,30 @@ impl PeerSetProtocolNames {
     }
     /// Get the protocol fallback names. Currently only holds the legacy name
-    /// for `LEGACY_PROTOCOL_VERSION` = 1.
-    fn get_fallback_names(protocol: PeerSet) -> Vec<ProtocolName> {
-        std::iter::once(Self::get_legacy_name(protocol)).collect()
+    /// for `LEGACY_PROTOCOL_VERSION` = 1 and v2 for validation.
+    fn get_fallback_names(
+        protocol: PeerSet,
+        genesis_hash: &Hash,
+        fork_id: Option<&str>,
+    ) -> Vec<ProtocolName> {
+        let mut fallbacks = vec![Self::get_legacy_name(protocol)];
+        match protocol {
+            PeerSet::Validation => {
+                // Fallbacks are tried one by one, till one matches so push v2 at the top, so
+                // that it is used ahead of the legacy one(v1).
+                fallbacks.insert(
+                    0,
+                    Self::generate_name(
+                        genesis_hash,
+                        fork_id,
+                        protocol,
+                        ValidationVersion::V2.into(),
+                    ),
+                )
+            },
+            PeerSet::Collation => {},
+        };
+        fallbacks
     }
 }
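
The fallback ordering introduced by the last hunk can be tried out in isolation with the sketch below. It is a simplified stand-in, not the code from this commit: plain `String`s replace `ProtocolName`, the genesis hash is passed as a hex string, and the name layouts produced by `generate_name` and `legacy_name` are assumptions made for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum PeerSet {
    Validation,
    Collation,
}

fn short_name(protocol: PeerSet) -> &'static str {
    match protocol {
        PeerSet::Validation => "validation",
        PeerSet::Collation => "collation",
    }
}

// Assumed name layout, for illustration only.
fn generate_name(genesis_hex: &str, fork_id: Option<&str>, protocol: PeerSet, version: u32) -> String {
    match fork_id {
        Some(fork) => format!("/{}/{}/{}/{}", genesis_hex, fork, short_name(protocol), version),
        None => format!("/{}/{}/{}", genesis_hex, short_name(protocol), version),
    }
}

// Assumed legacy (v1) name layout, for illustration only.
fn legacy_name(protocol: PeerSet) -> String {
    format!("/polkadot/{}/1", short_name(protocol))
}

/// Fallbacks are tried in order, so for validation the v2 name is placed
/// ahead of the legacy v1 name; collation keeps only the legacy name.
fn fallback_names(protocol: PeerSet, genesis_hex: &str, fork_id: Option<&str>) -> Vec<String> {
    let mut fallbacks = vec![legacy_name(protocol)];
    if protocol == PeerSet::Validation {
        fallbacks.insert(0, generate_name(genesis_hex, fork_id, protocol, 2));
    }
    fallbacks
}
```

Under these assumptions, a validation peer set with main version 3 would advertise its v2 name first and the legacy name second, so a peer that only speaks v2 negotiates v2 rather than dropping to v1.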