Approve multiple candidates with a single signature (#1191)

Initial implementation for the plan discussed here: https://github.com/paritytech/polkadot-sdk/issues/701
Built on top of https://github.com/paritytech/polkadot-sdk/pull/1178
v0: https://github.com/paritytech/polkadot/pull/7554

## Overall idea

When approval-voting checks a candidate and is ready to advertise the
approval, defer it in a per-relay-chain-block queue until we either have
MAX_APPROVAL_COALESCE_COUNT candidates to sign or a candidate has stayed
MAX_APPROVALS_COALESCE_TICKS in the queue. In both cases, we sign the
candidates we have available.

This should allow us to reduce the number of approval messages we have
to create/send/verify. The parameters are configurable, so we should
find some values that balance:
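
The deferral rule above can be sketched as a small queue. This is a minimal illustration, not the code from this PR: `CoalesceQueue`, `PendingApproval`, `push` and `take_ready` are hypothetical names, `Tick` is assumed to be a plain `u64`, and `max_count` / `max_wait_ticks` stand in for MAX_APPROVAL_COALESCE_COUNT and MAX_APPROVALS_COALESCE_TICKS.

```rust
use std::collections::VecDeque;

// Hypothetical stand-ins for illustration only.
type Tick = u64;
type CandidateHash = [u8; 32];

struct PendingApproval {
    candidate: CandidateHash,
    queued_at: Tick,
}

/// Approvals for one relay chain block that are waiting to be signed together.
struct CoalesceQueue {
    max_count: usize,     // role of MAX_APPROVAL_COALESCE_COUNT
    max_wait_ticks: Tick, // role of MAX_APPROVALS_COALESCE_TICKS
    pending: VecDeque<PendingApproval>,
}

impl CoalesceQueue {
    fn new(max_count: usize, max_wait_ticks: Tick) -> Self {
        Self { max_count, max_wait_ticks, pending: VecDeque::new() }
    }

    /// Defer an approval instead of signing it right away.
    fn push(&mut self, candidate: CandidateHash, now: Tick) {
        self.pending.push_back(PendingApproval { candidate, queued_at: now });
    }

    /// Returns the candidates to cover with one signature if either
    /// threshold is reached, and `None` if we should keep waiting.
    fn take_ready(&mut self, now: Tick) -> Option<Vec<CandidateHash>> {
        let oldest_queued_at = self.pending.front()?.queued_at;
        let limit = self.max_count.max(1);
        let full = self.pending.len() >= limit;
        let waited = now.saturating_sub(oldest_queued_at) >= self.max_wait_ticks;
        if full || waited {
            let n = self.pending.len().min(limit);
            Some(self.pending.drain(..n).map(|p| p.candidate).collect())
        } else {
            None
        }
    }
}
```

With `max_count = 1` and `max_wait_ticks = 0`, every pushed approval is returned by the next `take_ready` call, which corresponds to signing each candidate on its own.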

- Security of the network: Delaying the broadcast of an approval
shouldn't put finality at risk. To make sure that never happens, we
won't delay sending a vote if we are past 2/3 of the no-show time.
- Scalability of the network: MAX_APPROVAL_COALESCE_COUNT = 1 &
MAX_APPROVALS_COALESCE_TICKS = 0 is what we have now, and we know from
the measurements we did on versi that it bottlenecks
approval-distribution/approval-voting when we significantly increase the
number of validators and parachains.
- Block storage: In case of disputes we have to import these votes on
chain, which increases the necessary storage by
MAX_APPROVAL_COALESCE_COUNT * CandidateHash per vote. Given that
disputes are not the normal way of the network functioning and we will
limit MAX_APPROVAL_COALESCE_COUNT to single digits, this should be good
enough. Alternatively, we could try to create a better way to store
this on-chain through indirection, if that's needed.
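
Two of the trade-offs above reduce to simple arithmetic, sketched here under stated assumptions. The function names are hypothetical, `Tick` is assumed to be a `u64`, a `CandidateHash` is taken to be 32 bytes, and "past 2/3" is read as strictly greater than two thirds of the no-show time.

```rust
// Hypothetical helpers for illustration; not taken from the PR.
type Tick = u64;

/// Assumed size of one `CandidateHash` in bytes.
const CANDIDATE_HASH_BYTES: usize = 32;

/// True once more than 2/3 of the no-show time has elapsed, at which
/// point the vote should be sent without further delay.
fn past_delay_cutoff(elapsed: Tick, no_show_ticks: Tick) -> bool {
    // elapsed / no_show_ticks > 2/3  <=>  3 * elapsed > 2 * no_show_ticks
    elapsed.saturating_mul(3) > no_show_ticks.saturating_mul(2)
}

/// Upper bound on candidate-hash bytes carried by a single coalesced vote.
fn candidate_hash_bytes_per_vote(coalesce_count: usize) -> usize {
    coalesce_count * CANDIDATE_HASH_BYTES
}
```

For example, with a coalesce count of 6, a vote imported during a dispute would carry at most 6 * 32 = 192 bytes of candidate hashes.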

## Other fixes:
- Stopped sending random assignments to non-validators. Those peers
won't do anything with them and won't gossip them either, because they
do not have a grid topology set, so the random assignments were wasted.
- Added metrics to help debug potential no-shows and mis-processing of
approvals/assignments.
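
The new metrics count processed messages per outcome under a single `status` label. The idea can be sketched with a standard-library stand-in. `StatusCounter` is a hypothetical type, not the Prometheus `CounterVec` API used in the diff, and only the label values are taken from the change itself.

```rust
use std::collections::HashMap;

/// Hypothetical std-only stand-in for a counter keyed by one `status` label.
#[derive(Default)]
struct StatusCounter {
    counts: HashMap<&'static str, u64>,
}

impl StatusCounter {
    /// Record one processed message with the given outcome.
    fn inc(&mut self, status: &'static str) {
        *self.counts.entry(status).or_insert(0) += 1;
    }

    /// Number of messages seen so far with the given outcome.
    fn get(&self, status: &str) -> u64 {
        self.counts.get(status).copied().unwrap_or(0)
    }
}
```

Comparing the counts for values such as `duplicate`, `outdated` or `unknownassignment` against `goodknown` is one way to see where approvals or assignments are being dropped.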

## TODO:
- [x] Get feedback that this is moving in the right direction. @ordian
@sandreim @eskimor @burdges, let me know what you think.
- [x] More and more testing.
- [x] Test in versi.
- [x] Make MAX_APPROVAL_COALESCE_COUNT &
MAX_APPROVAL_COALESCE_WAIT_MILLIS a parachain host configuration.
- [x] Make sure backwards compatibility works correctly.
- [x] Make sure this direction is compatible with other streams of work:
https://github.com/paritytech/polkadot-sdk/issues/635 &
https://github.com/paritytech/polkadot-sdk/issues/742
- [x] Final versi burn-in before merging.

---------

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Alexandru Gheorghe
2023-12-13 08:43:15 +02:00
committed by GitHub
parent d18a682bf7
commit a84dd0dba5
82 changed files with 5883 additions and 1483 deletions
File diff suppressed because it is too large
@@ -31,6 +31,8 @@ struct MetricsInner {
    time_unify_with_peer: prometheus::Histogram,
    time_import_pending_now_known: prometheus::Histogram,
    time_awaiting_approval_voting: prometheus::Histogram,
    assignments_received_result: prometheus::CounterVec<prometheus::U64>,
    approvals_received_result: prometheus::CounterVec<prometheus::U64>,
}

trait AsLabel {
@@ -78,6 +80,132 @@ impl Metrics {
            .map(|metrics| metrics.time_import_pending_now_known.start_timer())
    }

    pub fn on_approval_already_known(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["known"]).inc()
        }
    }

    pub fn on_approval_entry_not_found(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["noapprovalentry"]).inc()
        }
    }

    pub fn on_approval_recent_outdated(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["outdated"]).inc()
        }
    }

    pub fn on_approval_invalid_block(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["invalidblock"]).inc()
        }
    }

    pub fn on_approval_unknown_assignment(&self) {
        if let Some(metrics) = &self.0 {
            metrics
                .approvals_received_result
                .with_label_values(&["unknownassignment"])
                .inc()
        }
    }

    pub fn on_approval_duplicate(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["duplicate"]).inc()
        }
    }

    pub fn on_approval_out_of_view(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["outofview"]).inc()
        }
    }

    pub fn on_approval_good_known(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["goodknown"]).inc()
        }
    }

    pub fn on_approval_bad(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["bad"]).inc()
        }
    }

    pub fn on_approval_unexpected(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["unexpected"]).inc()
        }
    }

    pub fn on_approval_bug(&self) {
        if let Some(metrics) = &self.0 {
            metrics.approvals_received_result.with_label_values(&["bug"]).inc()
        }
    }

    pub fn on_assignment_already_known(&self) {
        if let Some(metrics) = &self.0 {
            metrics.assignments_received_result.with_label_values(&["known"]).inc()
        }
    }

    pub fn on_assignment_recent_outdated(&self) {
        if let Some(metrics) = &self.0 {
            metrics.assignments_received_result.with_label_values(&["outdated"]).inc()
        }
    }

    pub fn on_assignment_invalid_block(&self) {
        if let Some(metrics) = &self.0 {
            metrics.assignments_received_result.with_label_values(&["invalidblock"]).inc()
        }
    }

    pub fn on_assignment_duplicate(&self) {
        if let Some(metrics) = &self.0 {
            metrics.assignments_received_result.with_label_values(&["duplicate"]).inc()
        }
    }

    pub fn on_assignment_out_of_view(&self) {
        if let Some(metrics) = &self.0 {
            metrics.assignments_received_result.with_label_values(&["outofview"]).inc()
        }
    }

    pub fn on_assignment_good_known(&self) {
        if let Some(metrics) = &self.0 {
            metrics.assignments_received_result.with_label_values(&["goodknown"]).inc()
        }
    }

    pub fn on_assignment_bad(&self) {
        if let Some(metrics) = &self.0 {
            metrics.assignments_received_result.with_label_values(&["bad"]).inc()
        }
    }

    pub fn on_assignment_duplicatevoting(&self) {
        if let Some(metrics) = &self.0 {
            metrics
                .assignments_received_result
                .with_label_values(&["duplicatevoting"])
                .inc()
        }
    }

    pub fn on_assignment_far(&self) {
        if let Some(metrics) = &self.0 {
            metrics.assignments_received_result.with_label_values(&["far"]).inc()
        }
    }

    pub(crate) fn time_awaiting_approval_voting(
        &self,
    ) -> Option<prometheus::prometheus::HistogramTimer> {
@@ -167,6 +295,26 @@ impl MetricsTrait for Metrics {
                    ).buckets(vec![0.0001, 0.0004, 0.0016, 0.0064, 0.0256, 0.1024, 0.4096, 1.6384, 3.2768, 4.9152, 6.5536,]))?,
                registry,
            )?,
            assignments_received_result: prometheus::register(
                prometheus::CounterVec::new(
                    prometheus::Opts::new(
                        "polkadot_parachain_assignments_received_result",
                        "Result of a processed assignment",
                    ),
                    &["status"]
                )?,
                registry,
            )?,
            approvals_received_result: prometheus::register(
                prometheus::CounterVec::new(
                    prometheus::Opts::new(
                        "polkadot_parachain_approvals_received_result",
                        "Result of a processed approval",
                    ),
                    &["status"]
                )?,
                registry,
            )?,
        };
        Ok(Metrics(Some(metrics)))
    }