Batch vote import in dispute-distribution (#5894)

* Start work on batching in dispute-distribution.

* Guide work.

* More guide changes. Still very much WIP.

* Finish guide changes.

* Clarification

* Adjust argument about slashing.

* WIP: Add constants to receiver.

* Maintain order of disputes.

* dispute-distribuion sender Rate limit.

* Cleanup

* WIP: dispute-distribution receiver.

- [ ] Rate limiting
- [ ] Batching

* WIP: Batching.

* fmt

* Update `PeerQueues` to maintain more invariants.

* WIP: Batching.

* Small cleanup

* Batching logic.

* Some integration work.

* Finish.

Missing: Tests

* Typo.

* Docs.

* Report missing metric.

* Doc pass.

* Tests for waiting_queue.

* Speed up some crypto by 10x.

* Fix redundant import.

* Add some tracing.

* Better sender rate limit

* Some tests.

* Tests

* Add logging to rate limiter

* Update roadmap/implementers-guide/src/node/disputes/dispute-distribution.md

Co-authored-by: Tsvetomir Dimitrov <tsvetomir@parity.io>

* Update roadmap/implementers-guide/src/node/disputes/dispute-distribution.md

Co-authored-by: Tsvetomir Dimitrov <tsvetomir@parity.io>

* Update node/network/dispute-distribution/src/receiver/mod.rs

Co-authored-by: Tsvetomir Dimitrov <tsvetomir@parity.io>

* Review feedback.

* Also log peer in log messages.

* Fix indentation.

* waker -> timer

* Guide improvement.

* Remove obsolete comment.

* waker -> timer

* Fix spell complaints.

* Fix Cargo.lock

Co-authored-by: Tsvetomir Dimitrov <tsvetomir@parity.io>
This commit is contained in:
Robert Klotzner
2022-10-04 18:02:05 +02:00
committed by GitHub
parent a64cc4a860
commit 938bc96a2c
17 changed files with 1772 additions and 458 deletions
@@ -15,6 +15,13 @@ This design should result in a protocol that is:
## Protocol
Distributing disputes needs to be a reliable protocol. We would like to make as
sure as possible that our vote got properly delivered to all concerned
validators. For this to work, this subsystem won't be gossip based, but instead
will use a request/response protocol for application level confirmations. The
request will be the payload (the actual votes/statements), the response will
be the confirmation. See [below][#wire-format].
### Input
[`DisputeDistributionMessage`][DisputeDistributionMessage]
@@ -107,16 +114,7 @@ struct VotesResponse {
}
```
## Functionality
Distributing disputes needs to be a reliable protocol. We would like to make as
sure as possible that our vote got properly delivered to all concerned
validators. For this to work, this subsystem won't be gossip based, but instead
will use a request/response protocol for application level confirmations. The
request will be the payload (the actual votes/statements), the response will
be the confirmation. See [above][#wire-format].
### Starting a Dispute
## Starting a Dispute
A dispute is initiated once a node sends the first `DisputeRequest` wire message,
which must contain an "invalid" vote and a "valid" vote.
@@ -132,7 +130,7 @@ conflicting votes available, hence we have a valid dispute. Nodes will still
need to check whether the disputing votes are somewhat current and not some
stale ones.
### Participating in a Dispute
## Participating in a Dispute
Upon receiving a `DisputeRequest` message, a dispute distribution will trigger the
import of the received votes via the dispute coordinator
@@ -144,13 +142,13 @@ except that if the local node deemed the candidate valid, the `SendDispute`
message will contain a valid vote signed by our node and will contain the
initially received `Invalid` vote.
Note, that we rely on the coordinator to check availability for spam protection
(see below).
Note, that we rely on `dispute-coordinator` to check validity of a dispute for spam
protection (see below).
### Sending of messages
## Sending of messages
Starting and participating in a dispute are pretty similar from the perspective
of dispute distribution. Once we receive a `SendDispute` message we try to make
of dispute distribution. Once we receive a `SendDispute` message, we try to make
sure to get the data out. We keep track of all the parachain validators that
should see the message, which are all the parachain validators of the session
where the dispute happened as they will want to participate in the dispute. In
@@ -159,114 +157,185 @@ session (which might be the same or not and may change during the dispute).
Those authorities will not participate in the dispute, but need to see the
statements so they can include them in blocks.
We keep track of connected parachain validators and authorities and will issue
warnings in the logs if connected nodes are less than two thirds of the
corresponding sets. We also only consider a message transmitted, once we
received a confirmation message. If not, we will keep retrying getting that
message out as long as the dispute is deemed alive. To determine whether a
dispute is still alive we will issue a
### Reliability
We only consider a message transmitted, once we received a confirmation message.
If not, we will keep retrying getting that message out as long as the dispute is
deemed alive. To determine whether a dispute is still alive we will ask the
`dispute-coordinator` for a list of all still active disputes via a
`DisputeCoordinatorMessage::ActiveDisputes` message before each retry run. Once
a dispute is no longer live, we will clean up the state accordingly.
### Reception & Spam Considerations
### Order
Because we are not forwarding foreign statements, spam is less of an issue in
comparison to gossip based systems. Rate limiting should be implemented at the
substrate level, see
[#7750](https://github.com/paritytech/substrate/issues/7750). Still we should
make sure that it is not possible via spamming to prevent a dispute concluding
or worse from getting noticed.
We assume `SendDispute` messages are coming in an order of importance, hence
`dispute-distribution` will make sure to send out network messages in the same
order, even on retry.
Considered attack vectors:
### Rate Limit
1. Invalid disputes (candidate does not exist) could make us
run out of resources. E.g. if we recorded every statement, we could run out
of disk space eventually.
2. An attacker can just flood us with notifications on any notification
protocol, assuming flood protection is not effective enough, our unbounded
buffers can fill up and we will run out of memory eventually.
3. An attacker could participate in a valid dispute, but send its votes multiple
times.
4. Attackers could spam us at a high rate with invalid disputes. Our incoming
queue of requests could get monopolized by those malicious requests and we
won't be able to import any valid disputes and we could run out of resources,
if we tried to process them all in parallel.
For spam protection (see below), we employ an artificial rate limiting on sending
out messages in order to not hit the rate limit at the receiving side, which
would result in our messages getting dropped and our reputation getting reduced.
For tackling 1, we make sure to not occupy resources before we don't know a
candidate is available. So we will not record statements to disk until we
recovered availability for the candidate or know by some other means that the
dispute is legit.
## Reception
For 2, we will pick up on any dispute on restart, so assuming that any realistic
memory filling attack will take some time, we should be able to participate in a
dispute under such attacks.
As we shall see the receiving side is mostly about handling spam and ensuring
the dispute-coordinator learns about disputes as fast as possible.
Importing/discarding redundant votes should be pretty quick, so measures with
regards to 4 should suffice to prevent 3, from doing any real harm.
Goals for the receiving side:
For 4, full monopolization of the incoming queue should not be possible assuming
substrate handles incoming requests in a somewhat fair way. Still we want some
defense mechanisms, at the very least we need to make sure to not exhaust
resources.
1. Get new disputes to the dispute-coordinator as fast as possible, so
prioritization can happen properly.
2. Batch votes per disputes as much as possible for good import performance.
3. Prevent malicious nodes exhausting node resources by sending lots of messages.
4. Prevent malicious nodes from sending so many messages/(fake) disputes,
preventing us from concluding good ones.
5. Limit ability of malicious nodes of delaying the vote import due to batching
logic.
The dispute coordinator will notify us on import about unavailable candidates or
otherwise invalid imports and we can disconnect from such peers/decrease their
reputation drastically. This alone should get us quite far with regards to queue
monopolization, as availability recovery is expected to fail relatively quickly
for unavailable data.
Goal 1 and 2 seem to be conflicting, but an easy compromise is possible: When
learning about a new dispute, we will import the vote immediately, making the
dispute coordinator aware and also getting immediate feedback on the validity.
Then if valid we can batch further incoming votes, with less time constraints as
the dispute-coordinator already knows about the dispute.
Still if those spam messages come at a very high rate, we might still run out of
resources if we immediately call `DisputeCoordinatorMessage::ImportStatements`
on each one of them. Secondly with our assumption of 1/3 dishonest validators,
getting rid of all of them will take some time, depending on reputation timeouts
some of them might even be able to reconnect eventually.
Goal 3 and 4 are obviously very related and both can easily be solved via rate
limiting as we shall see below. Rate limits should already be implemented at the
substrate level, but [are not](https://github.com/paritytech/substrate/issues/7750)
at the time of writing. But even if they were, the enforced substrate limits would
likely not be configurable and thus would still be to high for our needs as we can
rely on the following observations:
To mitigate those issues we will process dispute messages with a maximum
parallelism `N`. We initiate import processes for up to `N` candidates in
parallel. Once we reached `N` parallel requests we will start back pressuring on
the incoming requests. This saves us from resource exhaustion.
1. Each honest validator will only send one message (apart from duplicates on
timeout) per candidate/dispute.
2. An honest validator needs to fully recover availability and validate the
candidate for casting a vote.
To reduce impact of malicious nodes further, we can keep track from which nodes the
currently importing statements came from and will drop requests from nodes that
already have imports in flight.
With these two observations, we can conclude that honest validators will usually
not send messages at a high rate. We can therefore enforce conservative rate
limits and thus minimize harm spamming malicious nodes can have.
Honest nodes are not expected to send dispute statements at a high rate, but
even if they did:
Before we dive into how rate limiting solves all spam issues elegantly, let's
discuss that honest behaviour further:
- we will import at least the first one and if it is valid it will trigger a
dispute, preventing finality.
- Chances are good that the first sent candidate from a peer is indeed the
oldest one (if they differ in age at all).
- for the dropped request any honest node will retry sending.
- there will be other nodes notifying us about that dispute as well.
- honest votes have a speed advantage on average. Apart from the very first
dispute statement for a candidate, which might cause the availability recovery
process, imports of honest votes will be super fast, while for spam imports
they will always take some time as we have to wait for availability to fail.
What about session changes? Here we might have to inform a new validator set of
lots of already existing disputes at once.
So this general rate limit, that we drop requests from same peers if they come
faster than we can import the statements should not cause any problems for
honest nodes and is in their favor.
With observation 1) and a rate limit that is per peer, we are still good:
Size of `N`: The larger `N` the better we can handle distributed flood attacks
(see previous paragraph), but we also get potentially more availability recovery
processes happening at the same time, which slows down the individual processes.
And we rather want to have one finish quickly than lots slowly at the same time.
On the other hand, valid disputes are expected to be rare, so if we ever exhaust
`N` it is very likely that this is caused by spam and spam recoveries don't cost
too much bandwidth due to empty responses.
Let's assume a rate limit of one message per 200ms per sender. This means 5
messages from each validator per second. 5 messages means 5 disputes!
Conclusively, we will be able to conclude 5 disputes per second - no matter what
malicious actors are doing. This is assuming dispute messages are sent ordered,
but even if not perfectly ordered: On average it will be 5 disputes per second.
Considering that an attacker would need to attack many nodes in parallel to have
any effect, an `N` of 10 seems to be a good compromise. For honest requests, most
of those imports will likely concern the same candidate, and for dishonest ones
we get to disconnect from up to ten colluding adversaries at a time.
This is good enough! All those disputes are valid ones and will result in
slashing and disabling of validators. Let's assume all of them conclude `valid`,
and we disable validators only after 100 raised concluding valid disputes, we
would still start disabling misbehaving validators in only 20 seconds.
For the size of the channel for incoming requests: Due to dropping of repeated
requests from same nodes we can make the channel relatively large without fear
of lots of spam requests sitting there wasting our time, even after we already
blocked a peer. For valid disputes, incoming requests can become bursty. On the
other hand we will also be very quick in processing them. A channel size of 100
requests seems plenty and should be able to handle bursts adequately.
One could also think that in addition participation is expected to take longer,
which means on average we can import/conclude disputes faster than they are
generated - regardless of dispute spam. Unfortunately this is not necessarily
true: There might be parachains with very light load where recovery and
validation can be accomplished very quickly - maybe faster than we can import
those disputes.
This is probably an argument for not imposing a too low rate limit, although the
issue is more general: Even without any rate limit, if an attacker generates
disputes at a very high rate, nodes will be having trouble keeping participation
up, hence the problem should be mitigated at a [more fundamental
layer](https://github.com/paritytech/polkadot/issues/5898).
For nodes that have been offline for a while, the same argument as for session
changes holds, but matters even less: We assume 2/3 of nodes to be online, so
even if the worst case 1/3 offline happens and they could not import votes fast
enough (as argued above, they in fact can) it would not matter for consensus.
### Rate Limiting
As suggested previously, rate limiting allows to mitigate all threats that come
from malicious actors trying to overwhelm the system in order to get away without
a slash, when it comes to dispute-distribution. In this section we will explain
how in greater detail.
The idea is to open a queue with limited size for each peer. We will process
incoming messages as fast as we can by doing the following:
1. Check that the sending peer is actually a valid authority - otherwise drop
message and decrease reputation/disconnect.
2. Put message on the peer's queue, if queue is full - drop it.
Every `RATE_LIMIT` seconds (or rather milliseconds), we pause processing
incoming requests to go a full circle and process one message from each queue.
Processing means `Batching` as explained in the next section.
### Batching
To achieve goal 2 we will batch incoming votes/messages together before passing
them on as a single batch to the `dispute-coordinator`. To adhere to goal 1 as
well, we will do the following:
1. For an incoming message, we check whether we have an existing batch for that
candidate, if not we import directly to the dispute-coordinator, as we have
to assume this is concerning a new dispute.
2. We open a batch and start collecting incoming messages for that candidate,
instead of immediately forwarding.
4. We keep collecting votes in the batch until we receive less than
`MIN_KEEP_BATCH_ALIVE_VOTES` unique votes in the last `BATCH_COLLECTING_INTERVAL`. This is
important to accommodate for goal 5 and also 3.
5. We send the whole batch to the dispute-coordinator.
This together with rate limiting explained above ensures we will be able to
process valid disputes: We can limit the number of simultaneous existing batches
to some high value, but can be rather certain that this limit will never be
reached - hence we won't drop valid disputes:
Let's assume `MIN_KEEP_BATCH_ALIVE_VOTES` is 10, `BATCH_COLLECTING_INTERVAL`
is `500ms` and above `RATE_LIMIT` is `100ms`. 1/3 of validators are malicious,
so for 1000 this means around 330 malicious actors worst case.
All those actors can send a message every `100ms`, that is 10 per second. This
means at the begining of an attack they can open up around 3300 batches. Each
containing two votes. So memory usage is still negligible. In reality it is even
less, as we also demand 10 new votes to trickle in per batch in order to keep it
alive, every `500ms`. Hence for the first second, each batch requires 20 votes
each. Each message is 2 votes, so this means 10 messages per batch. Hence to
keep those batches alive 10 attackers are needed for each batch. This reduces
the number of opened batches by a factor of 10: So we only have 330 batches in 1
second - each containing 20 votes.
The next second: In order to further grow memory usage, attackers have to
maintain 10 messages per batch and second. Number of batches equals the number
of attackers, each has 10 messages per second, all are needed to maintain the
batches in memory. Therefore we have a hard cap of around 330 (number of
malicious nodes) open batches. Each can be filled with number of malicious
actor's votes. So 330 batches with each 330 votes: Let's assume approximately 100
bytes per signature/vote. This results in a worst case memory usage of 330 * 330
* 100 ~= 10 MiB.
For 10_000 validators, we are already in the Gigabyte range, which means that
with a validator set that large we might want to be more strict with the rate limit or
require a larger rate of incoming votes per batch to keep them alive.
For a thousand validators a limit on batches of around 1000 should never be
reached in practice. Hence due to rate limiting we have a very good chance to
not ever having to drop a potential valid dispute due to some resource limit.
Further safe guards are possible: The dispute-coordinator actually
confirms/denies imports. So once we receive a denial by the dispute-coordinator
for the initial imported votes, we can opt into flushing the batch immediately
and importing the votes. This swaps memory usage for more CPU usage, but if that
import is deemed invalid again we can immediately decrease the reputation of the
sending peers, so this should be a net win. For the time being we punt on this
for simplicity.
Instead of filling batches to maximize memory usage, attackers could also try to
overwhelm the dispute coordinator by only sending votes for new candidates all
the time. This attack vector is mitigated also by above rate limit and
decreasing the peer's reputation on denial of the invalid imports by the
coordinator.
### Node Startup