WIP: CI: add spellcheck (#3421)

* CI: add spellcheck

* revert me

* CI: explicit command for spellchecker

* spellcheck: edit misspells

* CI: run spellcheck on diff

* spellcheck: edits

* spellcheck: edit misspells

* spellcheck: add rules

* spellcheck: mv configs

* spellcheck: more edits

* spellcheck: chore

* spellcheck: one more thing

* spellcheck: and another one

* spellcheck: seems like it doesn't get to an end

* spellcheck: new words after rebase

* spellcheck: new words appearing out of nowhere

* chore

* review edits

* more review edits

* more edits

* wonky behavior

* wonky behavior 2

* wonky behavior 3

* change git behavior

* spellcheck: another bunch of new edits

* spellcheck: new words are coming out of nowhere

* CI: finding the master

* CI: fetching master implicitly

* CI: undebug

* new errors

* a bunch of new edits

* and some more

* Update node/core/approval-voting/src/approval_db/v1/mod.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update xcm/xcm-executor/src/assets.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Apply suggestions from code review

Co-authored-by: Andronik Ordian <write@reusable.software>

* Suggestions from the code review

* CI: scan only changed files

Co-authored-by: Andronik Ordian <write@reusable.software>
Denis Pisarev
2021-07-14 19:22:58 +02:00
committed by GitHub
parent f6305d29be
commit fc253e6e4d
239 changed files with 927 additions and 761 deletions
@@ -21,33 +21,34 @@ There may be multiple competing blocks all ending the availability phase for a p
```dot process
digraph {
  label = "Block data FSM\n\n\n";
  labelloc = "t";
  rankdir="LR";
  st [label = "Stored"; shape = circle]
  inc [label = "Included"; shape = circle]
  fin [label = "Finalized"; shape = circle]
  prn [label = "Pruned"; shape = circle]
  st -> inc [label = "Block\nincluded"]
  st -> prn [label = "Stored block\ntimed out"]
  inc -> fin [label = "Block\nfinalized"]
  inc -> st [label = "Competing blocks\nfinalized"]
  fin -> prn [label = "Block keep time\n(1 day + 1 hour) elapsed"]
}
```
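The diagram above can be mirrored as a small state machine. A hedged sketch, for illustration only (the enum and method names are not the subsystem's actual API):

```rust
// Illustrative mirror of the block-data FSM in the diagram above.
// These names are hypothetical, not the subsystem's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BlockDataState {
    Stored,
    Included,
    Finalized,
    Pruned,
}

#[derive(Debug, Clone, Copy)]
enum Event {
    BlockIncluded,
    StoredBlockTimedOut,
    BlockFinalized,
    CompetingBlocksFinalized,
    KeepTimeElapsed,
}

impl BlockDataState {
    // Apply one transition from the diagram; any other pairing is invalid.
    fn step(self, ev: Event) -> Option<BlockDataState> {
        use BlockDataState::*;
        match (self, ev) {
            (Stored, Event::BlockIncluded) => Some(Included),
            (Stored, Event::StoredBlockTimedOut) => Some(Pruned),
            (Included, Event::BlockFinalized) => Some(Finalized),
            (Included, Event::CompetingBlocksFinalized) => Some(Stored),
            (Finalized, Event::KeepTimeElapsed) => Some(Pruned),
            _ => None,
        }
    }
}
```

Note the `Included -> Stored` back-edge: data for a block whose competitors were finalized drops back to merely stored, and may later time out.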
## Database Schema
We use an underlying Key-Value database where we assume we have the following operations available:
- `write(key, value)`
- `read(key) -> Option<value>`
- `iter_with_prefix(prefix) -> Iterator<(key, value)>` - gives all keys and values in lexicographical order where the key starts with `prefix`.
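A minimal in-memory stand-in for these three operations (a `BTreeMap` keeps keys in lexicographic order, which is exactly what `iter_with_prefix` relies on; this mock is illustrative, not the real database backend):

```rust
use std::collections::BTreeMap;

// Toy key-value store exposing the three operations assumed above.
#[derive(Default)]
struct KvStore {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvStore {
    fn write(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    fn read(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    // All (key, value) pairs whose key starts with `prefix`,
    // yielded in lexicographic order.
    fn iter_with_prefix<'a>(
        &'a self,
        prefix: &'a [u8],
    ) -> impl Iterator<Item = (&'a Vec<u8>, &'a Vec<u8>)> + 'a {
        self.map
            .range(prefix.to_vec()..)
            .take_while(move |(k, _)| k.starts_with(prefix))
    }
}
```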
We use this database to encode the following schema:
```rust
("available", CandidateHash) -> Option<AvailableData>
("chunk", CandidateHash, u32) -> Option<ErasureChunk>
("meta", CandidateHash) -> Option<CandidateMeta>
@@ -56,7 +57,7 @@ We use this database to encode the following schema:
("prune_by_time", Timestamp, CandidateHash) -> Option<()>
```
Timestamps are the wall-clock seconds since Unix epoch. Timestamps and block numbers are both encoded as big-endian so lexicographic order is ascending.
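The reason big-endian matters: byte-wise lexicographic comparison of fixed-width big-endian integers agrees with numeric order, so `iter_with_prefix` naturally yields entries in ascending timestamp or block-number order. A quick illustration:

```rust
// Fixed-width big-endian encodings compare lexicographically in numeric
// order; little-endian encodings do not.
fn be(ts: u64) -> [u8; 8] {
    ts.to_be_bytes()
}

fn le(ts: u64) -> [u8; 8] {
    ts.to_le_bytes()
}
```

For example, `be(255) < be(256)` holds byte-wise, while the little-endian encodings compare the other way around.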
The meta information that we track per-candidate is defined as the `CandidateMeta` struct
@@ -88,84 +89,87 @@ Additionally, there is exactly one `prune_by_time` entry which holds the candida
Input: [`AvailabilityStoreMessage`][ASM]
Output:
- [`RuntimeApiMessage`][RAM]
## Functionality
For each head in the `activated` list:
- Load all ancestors of the head back to the finalized block so we don't miss anything if import notifications are missed. If a `StoreChunk` message is received for a candidate which has no entry, then we will prematurely lose the data.
- Note any new candidates backed in the head. Update the `CandidateMeta` for each. If the `CandidateMeta` does not exist, create it as `Unavailable` with the current timestamp. Register a `"prune_by_time"` entry based on the current timestamp + 1 hour.
- Note any new candidate included in the head. Update the `CandidateMeta` for each, performing a transition from `Unavailable` to `Unfinalized` if necessary. That includes removing the `"prune_by_time"` entry. Add the head hash and number to the state, if unfinalized. Add an `"unfinalized"` entry for the block and candidate.
  - The `CandidateEvent` runtime API can be used for this purpose.
On `OverseerSignal::BlockFinalized(finalized)` events:
- for each key in `iter_with_prefix("unfinalized")`
  - Stop if the key is beyond `("unfinalized", finalized)`
  - For each block number `f` that we encounter, load the finalized hash for that block.
    - The state of each `CandidateMeta` we encounter here must be `Unfinalized`, since we loaded the candidate from an `"unfinalized"` key.
    - For each candidate that we encounter under `f` and the finalized block hash,
      - Update the `CandidateMeta` to have `State::Finalized`. Remove all `"unfinalized"` entries from the old `Unfinalized` state.
      - Register a `"prune_by_time"` entry for the candidate based on the current time + 1 day + 1 hour.
    - For each candidate that we encounter under `f` which is not under the finalized block hash,
      - Remove all entries under `f` in the `Unfinalized` state.
      - If the `CandidateMeta` has state `Unfinalized` with an empty list of blocks, downgrade to `Unavailable` and re-schedule pruning under the timestamp + 1 hour. We do not prune here as the candidate still may be included in a descendant of the finalized chain.
  - Remove all `"unfinalized"` keys under `f`.
- Update `last_finalized` = finalized.
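The per-candidate decision inside this loop can be sketched as follows. This is a simplification under stated assumptions: the `State` variants follow the `CandidateMeta` states named in this document, but the function, its signature, and the tuple layout are illustrative, not the subsystem's real code:

```rust
// Simplified, illustrative candidate state; mirrors the states named above.
#[derive(Debug, PartialEq)]
enum State {
    Unavailable(u64),                  // timestamp when first observed
    Unfinalized(Vec<(u32, [u8; 32])>), // (block number, block hash) pairs
    Finalized,
}

const HOUR: u64 = 3600;
const DAY: u64 = 24 * HOUR;

// Given a candidate in the Unfinalized state at finalized height `f`, decide
// its new state and the new pruning deadline (if any), per the rules above.
fn on_finalized(
    state: State,
    f: u32,
    finalized_hash: [u8; 32],
    now: u64,
) -> (State, Option<u64>) {
    match state {
        State::Unfinalized(blocks) => {
            if blocks.iter().any(|&(n, h)| n == f && h == finalized_hash) {
                // In the finalized chain: keep for 1 day + 1 hour.
                (State::Finalized, Some(now + DAY + HOUR))
            } else {
                // Drop entries at height f; if nothing remains, downgrade.
                let rest: Vec<_> =
                    blocks.into_iter().filter(|&(n, _)| n != f).collect();
                if rest.is_empty() {
                    (State::Unavailable(now), Some(now + HOUR))
                } else {
                    (State::Unfinalized(rest), None)
                }
            }
        }
        other => (other, None),
    }
}
```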
This is roughly `O(n * m)` where `n` is the number of blocks finalized since the last update, and `m` is the number of parachains.
On `QueryAvailableData` message:
- Query `("available", candidate_hash)`
This is `O(n)` in the size of the data, which may be large.
On `QueryDataAvailability` message:
- Query whether `("meta", candidate_hash)` exists and `data_available == true`.
This is `O(n)` in the size of the metadata which is small.
On `QueryChunk` message:
- Query `("chunk", candidate_hash, index)`
This is `O(n)` in the size of the data, which may be large.
On `QueryAllChunks` message:
- Query `("meta", candidate_hash)`. If `None`, send an empty response and return.
- For all `1` bits in the `chunks_stored`, query `("chunk", candidate_hash, index)`. Ignore but warn on errors, and return a vector of all loaded chunks.

On `QueryChunkAvailability` message:
- Query whether `("meta", candidate_hash)` exists and the bit at `index` is set.
This is `O(n)` in the size of the metadata which is small.
On `StoreChunk` message:
- If there is a `CandidateMeta` under the candidate hash, set the bit of the erasure-chunk in the `chunks_stored` bitfield to `1`. If it was not `1` already, write the chunk under `("chunk", candidate_hash, chunk_index)`.
This is `O(n)` in the size of the chunk.
On `StoreAvailableData` message:
- If there is no `CandidateMeta` under the candidate hash, create it with `State::Unavailable(now)`. Load the `CandidateMeta` otherwise.
- Store `data` under `("available", candidate_hash)` and set `data_available` to true.
- Store each chunk under `("chunk", candidate_hash, index)` and set every bit in `chunks_stored` to `1`.
This is `O(n)` in the size of the data as the aggregate size of the chunks is proportional to the data.
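The bitfield bookkeeping shared by `StoreChunk` and `StoreAvailableData` can be sketched like this (a hedged illustration; the type and method names are hypothetical, not the real `CandidateMeta` fields' API):

```rust
// Illustrative chunks_stored bitfield: bit i set means chunk i is on disk.
struct ChunksStored {
    bits: Vec<bool>,
}

impl ChunksStored {
    fn with_len(n: usize) -> Self {
        ChunksStored { bits: vec![false; n] }
    }

    // Set bit `index`; returns true only if it was newly set, i.e. the
    // chunk actually needs to be written (avoids duplicate writes).
    fn note_chunk(&mut self, index: usize) -> bool {
        if self.bits[index] {
            false
        } else {
            self.bits[index] = true;
            true
        }
    }

    // StoreAvailableData sets every bit, since all chunks get written.
    fn note_all(&mut self) {
        self.bits.iter_mut().for_each(|b| *b = true);
    }

    fn all_stored(&self) -> bool {
        self.bits.iter().all(|&b| b)
    }
}
```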
Every 5 minutes, run a pruning routine:
- for each key in `iter_with_prefix("prune_by_time")`:
  - If the key is beyond `("prune_by_time", now)`, return.
  - Remove the key.
  - Extract `candidate_hash` from the key.
  - Load and remove the `("meta", candidate_hash)`
  - For each erasure chunk bit set, remove `("chunk", candidate_hash, bit_index)`.
  - If `data_available`, remove `("available", candidate_hash)`
This is `O(n * m)` in the number of candidates and the average size of the data stored. This is probably the most expensive operation, but it does not need to be run very often.
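The scan over prefix-ordered keys can be sketched as follows. This is a hedged simplification: the flattened key layout and 4-byte hashes are illustrative stand-ins for the schema above, and entries at exactly `now` are treated as expired:

```rust
use std::collections::BTreeMap;

// ("prune_by_time", timestamp, candidate_hash) flattened into bytes:
// prefix ++ big-endian timestamp ++ hash. Simplified and illustrative.
fn prune_key(ts: u64, hash: [u8; 4]) -> Vec<u8> {
    let mut k = b"prune_by_time/".to_vec();
    k.extend_from_slice(&ts.to_be_bytes());
    k.extend_from_slice(&hash);
    k
}

// Remove every prune_by_time entry with timestamp <= now;
// return the candidate hashes whose data should now be deleted.
fn prune(db: &mut BTreeMap<Vec<u8>, ()>, now: u64) -> Vec<[u8; 4]> {
    let prefix = b"prune_by_time/";
    // First key strictly beyond `now`: stop scanning there.
    let stop = prune_key(now + 1, [0; 4]);
    let expired: Vec<Vec<u8>> = db
        .range(prefix.to_vec()..stop)
        .map(|(k, _)| k.clone())
        .collect();
    let mut hashes = Vec::new();
    for k in expired {
        db.remove(&k);
        let mut h = [0u8; 4];
        h.copy_from_slice(&k[k.len() - 4..]);
        hashes.push(h);
    }
    hashes
}
```

Because timestamps are big-endian, the range scan visits entries in ascending time order and can stop at the first key past `now` without touching the rest of the keyspace.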
@@ -193,7 +197,7 @@ Basically we need to test the correctness of data flow through state FSMs descri
- Wait until the data should have been pruned.
- The data is no longer available.
- Fork-awareness of the relay chain is taken into account
- Block `B1` is added to the store.
- Block `B2` is added to the store.
- Notify the subsystem that both `B1` and `B2` were included in different leaves of the relay chain.