Commit Graph

74 Commits

Author SHA1 Message Date
Andrei Sandu 8a6e9ef189 Introduce subsystem benchmarking tool (#2528)
This tool makes it easy to run parachain consensus stress/performance
testing on your development machine or in CI.

## Motivation
The parachain consensus node implementation spans across many modules
which we call subsystems. Each subsystem is responsible for a small part
of the logic of the parachain consensus pipeline, but in general most
load and performance issues are localized in just a few core subsystems
like `availability-recovery`, `approval-voting` or
`dispute-coordinator`. In the absence of such a tool, we would run large
test nets to load/stress test these parts of the system. Setting up such a
large test and making sense of the amount of data it produces is very
expensive, hard to orchestrate, and a huge development time sink.

## PR contents
- CLI tool 
- Data Availability Read test
- reusable mockups and components needed so far
- Documentation on how to get started

### Data Availability Read test

An overseer is built using a real `availability-recovery` subsystem
instance, while dependent subsystems like `av-store`, `network-bridge`
and `runtime-api` are mocked. The network bridge emulates all the
network peers and their responses to requests.

The test is going to be run for a number of blocks. For each block it
will generate and send `RecoverAvailableData` requests for an arbitrary
number of candidates. We wait for the subsystem to respond to all
requests before moving to the next block.
At the same time we collect the usual subsystem metrics and task CPU
metrics and show progress reports while running.
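
The per-block flow described above can be sketched roughly as follows. This is only an illustration built on `std` threads and channels: `RecoveryRequest`, `spawn_mock_subsystem` and `run_benchmark` are hypothetical names, and the mock "subsystem" answers instantly instead of running real recovery.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a `RecoverAvailableData` request.
struct RecoveryRequest {
    candidate_index: usize,
    /// Channel on which the mock subsystem reports completion.
    response: mpsc::Sender<usize>,
}

/// Mock "subsystem" (assumption): answers every request on a worker thread.
fn spawn_mock_subsystem() -> mpsc::Sender<RecoveryRequest> {
    let (tx, rx) = mpsc::channel::<RecoveryRequest>();
    thread::spawn(move || {
        for request in rx {
            // Pretend to recover the data, then report back.
            let _ = request.response.send(request.candidate_index);
        }
    });
    tx
}

/// Runs `n_blocks` blocks with `n_candidates` requests each and
/// returns the time spent on every block.
fn run_benchmark(n_blocks: usize, n_candidates: usize) -> Vec<Duration> {
    let subsystem = spawn_mock_subsystem();
    let mut block_times = Vec::with_capacity(n_blocks);

    for _block in 0..n_blocks {
        let started = Instant::now();
        let (response_tx, response_rx) = mpsc::channel();

        // Send one recovery request per candidate.
        for candidate_index in 0..n_candidates {
            subsystem
                .send(RecoveryRequest { candidate_index, response: response_tx.clone() })
                .expect("mock subsystem is running");
        }
        // Drop our own sender so the receiver ends once every request is answered.
        drop(response_tx);

        // Wait for all responses before moving to the next block.
        let answered = response_rx.iter().count();
        assert_eq!(answered, n_candidates);

        block_times.push(started.elapsed());
    }
    block_times
}
```

Calling `run_benchmark(3, 20)` mirrors the shape of a run with 3 blocks and 20 candidates per block; the returned durations say nothing about real recovery cost, since the mock does no work.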

### Here is what the CLI looks like:

```
[2023-11-28T13:06:27Z INFO  subsystem_bench::core::display] n_validators = 1000, n_cores = 20, pov_size = 5120 - 5120, error = 3, latency = Some(PeerLatency { min_latency: 1ms, max_latency: 100ms })
[2023-11-28T13:06:27Z INFO  subsystem-bench::availability] Generating template candidate index=0 pov_size=5242880
[2023-11-28T13:06:27Z INFO  subsystem-bench::availability] Created test environment.
[2023-11-28T13:06:27Z INFO  subsystem-bench::availability] Pre-generating 60 candidates.
[2023-11-28T13:06:30Z INFO  subsystem-bench::core] Initializing network emulation for 1000 peers.
[2023-11-28T13:06:30Z INFO  subsystem-bench::availability] Current block 1/3
[2023-11-28T13:06:30Z INFO  substrate_prometheus_endpoint] 〽️ Prometheus exporter started at 127.0.0.1:9999
[2023-11-28T13:06:30Z INFO  subsystem_bench::availability] 20 recoveries pending
[2023-11-28T13:06:37Z INFO  subsystem_bench::availability] Block time 6262ms
[2023-11-28T13:06:37Z INFO  subsystem-bench::availability] Sleeping till end of block (0ms)
[2023-11-28T13:06:37Z INFO  subsystem-bench::availability] Current block 2/3
[2023-11-28T13:06:37Z INFO  subsystem_bench::availability] 20 recoveries pending
[2023-11-28T13:06:43Z INFO  subsystem_bench::availability] Block time 6369ms
[2023-11-28T13:06:43Z INFO  subsystem-bench::availability] Sleeping till end of block (0ms)
[2023-11-28T13:06:43Z INFO  subsystem-bench::availability] Current block 3/3
[2023-11-28T13:06:43Z INFO  subsystem_bench::availability] 20 recoveries pending
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] Block time 6194ms
[2023-11-28T13:06:49Z INFO  subsystem-bench::availability] Sleeping till end of block (0ms)
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] All blocks processed in 18829ms
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] Throughput: 102400 KiB/block
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] Block time: 6276 ms
[2023-11-28T13:06:49Z INFO  subsystem_bench::availability] 
    
    Total received from network: 415 MiB
    Total sent to network: 724 KiB
    Total subsystem CPU usage 24.00s
    CPU usage per block 8.00s
    Total test environment CPU usage 0.15s
    CPU usage per block 0.05s
```
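
As a sanity check on the summary figures in this output, the arithmetic can be reproduced with a small sketch. It assumes one PoV is recovered per core per block (the log shows 20 pending recoveries with `n_cores = 20`) and that `pov_size` is given in KiB; the function names are made up for illustration.

```rust
/// Data recovered per block in KiB, assuming one PoV per core.
fn throughput_kib_per_block(n_cores: u64, pov_size_kib: u64) -> u64 {
    n_cores * pov_size_kib
}

/// Total number of candidates generated up front for the whole run.
fn total_candidates(n_cores: u64, n_blocks: u64) -> u64 {
    n_cores * n_blocks
}

/// Average CPU seconds spent per block.
fn cpu_per_block(total_cpu_secs: f64, n_blocks: u32) -> f64 {
    total_cpu_secs / f64::from(n_blocks)
}
```

With the values from the log, 20 cores at 5120 KiB give 102400 KiB per block, 20 cores over 3 blocks give the 60 pre-generated candidates, and 24.00s of subsystem CPU over 3 blocks gives 8.00s per block.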

### Prometheus/Grafana stack in action
<img width="1246" alt="Screenshot 2023-11-28 at 15 11 10"
src="https://github.com/paritytech/polkadot-sdk/assets/54316454/eaa47422-4a5e-4a3a-aaef-14ca644c1574">
<img width="1246" alt="Screenshot 2023-11-28 at 15 12 01"
src="https://github.com/paritytech/polkadot-sdk/assets/54316454/237329d6-1710-4c27-8f67-5fb11d7f66ea">
<img width="1246" alt="Screenshot 2023-11-28 at 15 12 38"
src="https://github.com/paritytech/polkadot-sdk/assets/54316454/a07119e8-c9f1-4810-a1b3-f1b7b01cf357">

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
2023-12-14 12:57:17 +02:00
Alin Dima 689b9d91c7 cumulus-pov-recovery: check pov_hash instead of reencoding data (#2287)
Collators were previously reencoding the available data and checking the
erasure root.
Replace that with just checking the PoV hash, which consumes much less
CPU and takes less time.

We also don't need to check the `PersistedValidationData` hash, as
collators don't use it.

Reason:
https://github.com/paritytech/polkadot-sdk/issues/575#issuecomment-1806572230

After systematic chunks recovery is merged, collators will no longer do
any Reed-Solomon encoding/decoding, which has proven to be a major CPU
consumer.
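
A minimal sketch of the idea, checking a hash of the PoV instead of re-encoding the data, might look like this. Everything here is illustrative: `RecoveredData`, `hash_of` and `is_valid_by_pov_hash` are hypothetical names, and `DefaultHasher` is only a dependency-free stand-in for whatever cryptographic hash the real code uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in hash (assumption): NOT cryptographic, used only to keep
/// the sketch free of external crates.
fn hash_of(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical recovered data: only the PoV bytes matter for this check.
struct RecoveredData {
    pov: Vec<u8>,
}

/// Accepts the recovered data if the PoV hashes to the expected value.
/// No erasure re-encoding happens here, which is the point of the change.
fn is_valid_by_pov_hash(data: &RecoveredData, expected_pov_hash: u64) -> bool {
    hash_of(&data.pov) == expected_pov_hash
}
```

The check costs a single hash over the PoV, whereas the previous approach had to re-encode the whole available data before comparing erasure roots.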

Signed-off-by: alindima <alin@parity.io>
2023-11-14 10:37:41 +02:00
Alexandru Gheorghe 3069b0af39 make polkadot die graciously (#2056)
While investigating some db migrations that make the node startup fail,
I noticed that the node wasn't exiting and that the log files were
growing exponentially until my whole system froze, which makes it
really hard to find out why it was failing in the first place.

E.g:
```
 ls -lh /tmp/zombie-01a04c2a2c0265d85f6440cf01c0f44a_-51319-uyggzuD4wEpV/bob.log
 32,6G oct 27 11:16 /tmp/zombie-01a04c2a2c0265d85f6440cf01c0f44a_-51319-uyggzuD4wEpV/bob.log
```

This was happening because the following errors were being printed
continuously without the subsystem main loop exiting:

From dispute-coordinator:
```
WARN tokio-runtime-worker parachain::dispute-coordinator: error=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
```

From availability recovery:
```
Erasure task channel closed. Node shutting down ?
```
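
The failure mode can be illustrated with a small sketch of a main loop that stops once its channel is closed, instead of logging the error and polling again. This uses `std::sync::mpsc` rather than the actual subsystem channels, and `Signal` and `run_main_loop` are hypothetical names.

```rust
use std::sync::mpsc::{Receiver, RecvError};

/// Hypothetical signal type for the sketch.
enum Signal {
    Work(u32),
    Conclude,
}

/// Processes signals until told to conclude or until the channel closes.
/// Returns the number of processed work items, or an error if the
/// sender went away without sending `Conclude`.
fn run_main_loop(signals: Receiver<Signal>) -> Result<u32, RecvError> {
    let mut processed = 0;
    loop {
        match signals.recv() {
            Ok(Signal::Work(_)) => processed += 1,
            Ok(Signal::Conclude) => return Ok(processed),
            // The channel is closed: logging and retrying here would spin
            // forever, so propagate the error and let the loop end.
            Err(err) => return Err(err),
        }
    }
}
```

A closed channel never recovers, so treating it as a terminal condition is what keeps the log from growing without bound.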

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
2023-10-27 13:50:30 +02:00
Alin Dima 6f00edbc55 Refactor availability-recovery strategies (#1457)
Refactors availability-recovery strategies to allow for easily adding
new hotpaths and failover mechanisms.

The new interface allows for chaining multiple `RecoveryStrategy`s
together, to cleanly express the relationship between them and share
state and code where necessary/possible.

This was done in order to aid in implementing new hotpaths like
[systematic chunks
recovery](https://github.com/paritytech/polkadot-sdk/issues/598) and
[fetching from approval
checkers](https://github.com/paritytech/polkadot-sdk/issues/575).

Thanks to this design, intermediate state can be shared between the
strategies. For example, if the systematic chunks recovery retrieved
fewer chunks than needed, it passes them over to the next
`FetchChunks` strategy, which will only need to recover the remaining
number of chunks.
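
To make the chaining idea concrete, here is a heavily simplified, synchronous sketch. It is not the interface introduced by this PR: the `State` layout, the trait methods, `SystematicChunks` and `recover` are assumptions made for illustration, and real strategies are asynchronous and talk to the network.

```rust
/// State shared between strategies (hypothetical layout).
#[derive(Default)]
struct State {
    /// Indices of the chunks received so far.
    received_chunks: Vec<usize>,
}

/// Hypothetical, synchronous simplification of a recovery strategy.
trait RecoveryStrategy {
    fn name(&self) -> &'static str;
    /// Returns `true` once at least `needed` chunks are available.
    fn run(&mut self, state: &mut State, needed: usize) -> bool;
}

/// Pretends that only the first `available` systematic chunks can be fetched.
struct SystematicChunks {
    available: usize,
}

impl RecoveryStrategy for SystematicChunks {
    fn name(&self) -> &'static str {
        "systematic chunks"
    }
    fn run(&mut self, state: &mut State, needed: usize) -> bool {
        state.received_chunks.extend(0..self.available.min(needed));
        state.received_chunks.len() >= needed
    }
}

/// Fetches whatever is still missing after earlier strategies ran.
struct FetchChunks;

impl RecoveryStrategy for FetchChunks {
    fn name(&self) -> &'static str {
        "fetch chunks"
    }
    fn run(&mut self, state: &mut State, needed: usize) -> bool {
        let have = state.received_chunks.len();
        // Only the remaining chunks are requested; earlier ones are reused.
        state.received_chunks.extend(have..needed);
        state.received_chunks.len() >= needed
    }
}

/// Runs the strategies in order until one succeeds, sharing `State`.
fn recover(
    strategies: Vec<Box<dyn RecoveryStrategy>>,
    needed: usize,
) -> Option<(&'static str, State)> {
    let mut state = State::default();
    for mut strategy in strategies {
        if strategy.run(&mut state, needed) {
            return Some((strategy.name(), state));
        }
    }
    None
}
```

In this toy setup, chaining `SystematicChunks { available: 3 }` with `FetchChunks` and asking for 5 chunks means the second strategy only has to add chunks 3 and 4.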

Draft example of how a systematic chunk recovery strategy would look:
https://github.com/paritytech/polkadot-sdk/commit/667d870bdf1470525d66c13929d5eac7249dd995
(notice how easy it was to add and reuse code)

Note that this PR doesn't itself add any new strategy; it should fully
preserve backwards compatibility in terms of functionality. Follow-up PRs
to add new strategies will come.
2023-09-20 15:56:43 +03:00
ordian c168a77e26 deps: replace lru with schnellru (#1217)
* deps: replace lru with schnellru

* bring the peace to the galaxy
2023-08-28 19:04:11 +02:00
Oliver Tale-Yazdi 342d720573 Use same fmt and clippy configs as in Substrate (#7611)
* Use same rustfmt.toml as Substrate

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* format format file

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* Format with new config

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* Add Substrate Clippy config

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* Print Clippy version in CI

Otherwise it's difficult to reproduce locally.

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* Make fmt happy

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* Update node/core/pvf/src/error.rs

Co-authored-by: Tsvetomir Dimitrov <tsvetomir@parity.io>

* Update node/core/pvf/src/error.rs

Co-authored-by: Tsvetomir Dimitrov <tsvetomir@parity.io>

---------

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Co-authored-by: Tsvetomir Dimitrov <tsvetomir@parity.io>
2023-08-14 14:29:29 +00:00
Andrei Sandu a0814490d2 availability-recovery: move cpu burners in blocking tasks (#7417)
* Move expensive computations to blocking thread

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* fix test

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* add internal error and fix dependent subsystems

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* fmt

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* fix test fix

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* minor refactor and TODOs

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Impl Feedback for Review

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* review feedback

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* More docs

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* add some example timings in comments

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
2023-07-04 09:50:49 +00:00
Andrei Sandu 02d3fd025d availability recovery: measure re-encoding time (#7409)
* Measure re-encoding time

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* fix build

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
2023-06-21 17:02:57 +03:00
Andrei Sandu 2ca3750f0f Prefer fetching small PoVs from backing group (#7173)
* impl QueryChunkSize

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* QueryChunkSize message

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* enable fetching from backing group for small pov

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* review feedback

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Refactor `bypass_availability_store`

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* review feedback

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

---------

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
2023-05-05 09:56:54 +00:00
Sebastian Kunert 0ae0393042 Add option to skip av-store requests in availability-recovery-subsystem (#7131)
* Allow to skip availability-store

* Update node/network/availability-recovery/src/lib.rs

Co-authored-by: Michal Kucharczyk <1728078+michalkucharczyk@users.noreply.github.com>

---------

Co-authored-by: Michal Kucharczyk <1728078+michalkucharczyk@users.noreply.github.com>
2023-04-28 10:13:04 +00:00
s0me0ne-unkn0wn 64660ee8d2 Remove years from copyright notes (#7034)
* Happy New Year!

* Remove year entirely

Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>

* Remove years from copyright notice in the entire repo

---------

Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
2023-04-08 20:38:35 +00:00
s0me0ne-unkn0wn 1cb1d03c08 Re-export current primitives in crate root (#6487)
* Re-export current primitives in crate root

* Add missing exports

* restart CI
2023-01-11 11:28:12 +00:00
alexgparity 9ea14e66c8 Clippyfy (#6341)
* Add clippy config and remove .cargo from gitignore

* first fixes

* Clippyfied

* Add clippy CI job

* comment out rusty-cachier

* minor

* fix ci

* remove DAG from check-dependent-project

* add DAG to clippy

Co-authored-by: alvicsam <alvicsam@gmail.com>
2022-11-30 08:34:06 +00:00
Marcin S d53513ff66 Fixes "for loop over an Option" warnings (#6291)
Was seeing these warnings when running `cargo check --all`:

```
warning: for loop over an `Option`. This is more readably written as an `if let` statement
    --> node/core/approval-voting/src/lib.rs:1147:21
     |
1147 |             for activated in update.activated {
     |                              ^^^^^^^^^^^^^^^^
     |
     = note: `#[warn(for_loops_over_fallibles)]` on by default
help: to check pattern in a loop use `while let`
     |
1147 |             while let Some(activated) = update.activated {
     |             ~~~~~~~~~~~~~~~         ~~~
help: consider using `if let` to clear intent
     |
1147 |             if let Some(activated) = update.activated {
     |             ~~~~~~~~~~~~         ~~~
```

My guess is that `activated` used to be a SmallVec or similar, as is
`deactivated`. It was changed to an `Option`, the `for` still compiled (it's
technically correct, just weird), and the compiler didn't catch it until now.
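
A small, self-contained illustration of the fix the compiler suggests. The `Update` struct and `handle_update` are hypothetical stand-ins; only the field names `activated` and `deactivated` come from the message above.

```rust
/// Hypothetical stand-in for the update type from the warning.
struct Update {
    activated: Option<u32>,
    deactivated: Vec<u32>,
}

/// Returns the leaves that were started and stopped by this update.
fn handle_update(update: Update) -> (Vec<u32>, Vec<u32>) {
    let mut started = Vec::new();
    // `if let` states the intent: there is at most one activated leaf.
    if let Some(activated) = update.activated {
        started.push(activated);
    }

    // `deactivated` is a real collection, so a `for` loop is still appropriate.
    let mut stopped = Vec::new();
    for leaf in update.deactivated {
        stopped.push(leaf);
    }
    (started, stopped)
}
```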
2022-11-15 09:58:26 -05:00
Boluwatife Bakre 8eb1f4617f Use a more typesafe approach for managing indexed data (#6150)
* Fix for issue #2403

* Nightly fmt

* Quick documentation fixes

* Default Implementation

* iter() function integrated

* Implemented iter functionalities

* Fmt

* small change

* updates node-network

* updates in dispute-coordinator

* Updates

* benchmarking fix

* minor fix

* test fixes in runtime api

* Update primitives/src/v2/mod.rs

Co-authored-by: Andronik <write@reusable.software>

* Update primitives/src/v2/mod.rs

Co-authored-by: Andronik <write@reusable.software>

* Update primitives/src/v2/mod.rs

Co-authored-by: Andronik <write@reusable.software>

* Update primitives/src/v2/mod.rs

Co-authored-by: Andronik <write@reusable.software>

* Update primitives/src/v2/mod.rs

Co-authored-by: Andronik <write@reusable.software>

* Removal of [index], shorting of FromIterator, Renaming of GroupValidators to ValidatorGroups

* Removal of ops import

* documentation fixes for spell check

* implementation of generic type

* Refactoring

* Test and documentation fixes

* minor test fix

* minor test fix

* minor test fix

* Update node/network/statement-distribution/src/lib.rs

Co-authored-by: Andronik <write@reusable.software>

* Update primitives/src/v2/mod.rs

Co-authored-by: Andronik <write@reusable.software>

* Update primitives/src/v2/mod.rs

Co-authored-by: Andronik <write@reusable.software>

* removed IterMut

* Update node/core/dispute-coordinator/src/import.rs

Co-authored-by: Andronik <write@reusable.software>

* Update node/core/dispute-coordinator/src/initialized.rs

Co-authored-by: Andronik <write@reusable.software>

* Update primitives/src/v2/mod.rs

Co-authored-by: Andronik <write@reusable.software>

* fmt

* IterMut

* documentation update

Co-authored-by: Andronik <write@reusable.software>

* minor adjustments and new TypeIndex trait

* spelling fix

* TypeIndex fix

Co-authored-by: Andronik <write@reusable.software>
2022-10-22 08:39:11 +00:00
Andronik befaec4cee availability-recovery: use IfDisconnected::TryConnect for chunks (#6081)
* availability-recovery: use `IfDisconnected::TryConnect` for chunks

* fix tests
2022-10-18 13:15:49 +00:00
dependabot[bot] a64cc4a860 Bump lru from 0.7.8 to 0.8.0 (#6060)
* Bump lru from 0.7.8 to 0.8.0

Bumps [lru](https://github.com/jeromefroe/lru-rs) from 0.7.8 to 0.8.0.
- [Release notes](https://github.com/jeromefroe/lru-rs/releases)
- [Changelog](https://github.com/jeromefroe/lru-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jeromefroe/lru-rs/compare/0.7.8...0.8.0)

---
updated-dependencies:
- dependency-name: lru
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Change `LruCache` parameter to `NonZeroUsize`

* Change type of `session_cache_lru_size` to `NonZeroUsize`

* Add expects instead of unwrap

Co-authored-by: Bastian Köcher <info@kchr.de>

* Use match to get rid of expects
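
One way the "match instead of expect" step could look is sketched below. It assumes the size is a compile-time constant; `CACHE_SIZE` and `BoundedCache` are hypothetical names, and the toy cache evicts the oldest entry rather than implementing real LRU behavior.

```rust
use std::num::NonZeroUsize;

/// Hypothetical cache size; a zero value fails at compile time
/// instead of panicking at run time.
const CACHE_SIZE: NonZeroUsize = match NonZeroUsize::new(16) {
    Some(size) => size,
    None => panic!("cache size must be non-zero"),
};

/// Minimal stand-in for a cache whose constructor takes a `NonZeroUsize`.
struct BoundedCache {
    capacity: NonZeroUsize,
    entries: Vec<u64>,
}

impl BoundedCache {
    fn new(capacity: NonZeroUsize) -> Self {
        Self { capacity, entries: Vec::with_capacity(capacity.get()) }
    }

    /// Inserts a value, evicting the oldest entry when full (not true LRU).
    fn insert(&mut self, value: u64) {
        if self.entries.len() == self.capacity.get() {
            self.entries.remove(0);
        }
        self.entries.push(value);
    }
}
```

Because the capacity type cannot be zero, the constructor needs no run-time check, and the `match` in a `const` moves the only possible failure to compile time.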

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sebastian Kunert <skunert49@gmail.com>
Co-authored-by: Bastian Köcher <info@kchr.de>
2022-10-04 11:28:21 +00:00
Robert Klotzner 548b4c6c71 Validate chunks from disk in availability-recovery (#6078)
* Don't use corrupted chunks from disk.

Otherwise we would be going to dispute the candidate and get slashed.

* Add tests
2022-09-29 14:16:12 +02:00
Bernhard Schuster 3240cb5e4d split NetworkBridge into two subsystems (#5616)
* foo

* rolling session window

* fixup

* remove use statement

* fmt

* split NetworkBridge into two subsystems

Pending cleanup

* split

* chore: reexport OrchestraError as OverseerError

* chore: silence warnings

* fixup tests

* chore: add default timeout of 30s to subsystem test helper ctx handle

* single item channel

* fixins

* fmt

* cleanup

* remove dead code

* remove sync bounds again

* wire up shared state

* deal with some FIXMEs

* use distinct tags

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

* use tag

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

* address naming

tx and rx are common in networking and also have an implicit meaning regarding networking
compared to incoming and outgoing which are already used with subsystems themselves

* remove unused sync oracle

* remove unneeded state

* fix tests

* chore: fmt

* do not try to register twice

* leak Metrics type

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
Co-authored-by: Andronik <write@reusable.software>
2022-07-12 16:22:36 +00:00
Bernhard Schuster 450ca2baca overseer becomes orchestra (#5542)
* rename overseer-gen to orchestra

Also drop `gum` and use `tracing`.

* make orchestra compile as standalone

* introduce Spawner trait to split from sp_core

Finalizes the independence of orchestra from polkadot-overseer

* slip of the pen

* other fixins

* remove unused import

* Update node/overseer/orchestra/proc-macro/src/impl_builder.rs

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>

* Update node/overseer/orchestra/proc-macro/src/impl_builder.rs

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>

* orchestra everywhere

* leaky data

* Bump scale-info from 2.1.1 to 2.1.2 (#5552)

Bumps [scale-info](https://github.com/paritytech/scale-info) from 2.1.1 to 2.1.2.
- [Release notes](https://github.com/paritytech/scale-info/releases)
- [Changelog](https://github.com/paritytech/scale-info/blob/master/CHANGELOG.md)
- [Commits](https://github.com/paritytech/scale-info/compare/v2.1.1...v2.1.2)

---
updated-dependencies:
- dependency-name: scale-info
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add missing markdown code block delimiter (#5555)

* bitfield-signing: remove util::jobs usage  (#5523)

* Switch to pooling copy-on-write instantiation strategy for WASM (companion for Substrate#11232) (#5337)

* Switch to pooling copy-on-write instantiation strategy for WASM

* Fix compilation of `polkadot-test-service`

* Update comments

* Move `max_memory_size` to `Semantics`

* Rename `WasmInstantiationStrategy` to `WasmtimeInstantiationStrategy`

* Update a safety comment

* update lockfile for {"substrate"}

Co-authored-by: parity-processbot <>

* Fix build

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Malte Kliemann <mail@maltekliemann.com>
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
Co-authored-by: Koute <koute@users.noreply.github.com>
2022-05-19 13:42:02 +01:00
Bernhard Schuster 511891dcce refactor+feat: allow subsystems to send only declared messages, generate graphviz (#5314)
Closes #3774
Closes #3826
2022-05-12 17:39:05 +02:00
Bernhard Schuster d437a33e0b polkadot-node-subsystem package rename mish mash cleanup (#5502)
* unify to polkadot-node-subsystem{,-test-helpers}

* chore: fmt
2022-05-11 15:32:38 +00:00
Robert Klotzner 8dbc4d8a6e Reduce log verbosity (#5440)
* Reduce log verbosity

* Update node/network/availability-recovery/src/lib.rs

Co-authored-by: Andronik <write@reusable.software>

Co-authored-by: Andronik <write@reusable.software>
2022-05-03 14:04:07 +02:00
Vsevolod Stakhov 3f0e73d320 Add some additional logging to availability recovery subsystem (#5345) 2022-04-21 18:15:02 +01:00
asynchronous rob fc4b04db20 Prepare for network protocol version upgrades (#5084)
* explicitly tag network requests with version

* fmt

* make PeerSet more aware of versioning

* some generalization of the network bridge to support upgrades

* walk back some renaming

* walk back some version stuff

* extract version from fallback

* remove V1 from NetworkBridgeUpdate

* add accidentally-removed timer

* implement focusing for versioned messages

* fmt

* fix up network bridge & tests

* remove inaccurate version check in bridge

* remove some TODO [now]s

* fix fallout in statement distribution

* fmt

* fallout in gossip-support

* fix fallout in collator-protocol

* fix fallout in bitfield-distribution

* fix fallout in approval-distribution

* fmt

* use never!

* fmt
2022-04-21 16:34:59 +00:00
Robert Klotzner 4ca9691dcf Better metrics for availability-recovery (#5249)
* Better metrics.

- Fix time metric
- Add counters

* Typo

* Better docs.
2022-04-04 14:31:08 +00:00
Qinxuan Chen 74078d8eb9 Companion for substrate#11136 (#5218)
* Companion for substrate#11136

Signed-off-by: koushiro <koushiro.cqx@gmail.com>

* revert changes in bridge

Signed-off-by: koushiro <koushiro.cqx@gmail.com>
2022-04-04 11:13:34 +02:00
Bernhard Schuster d309a24e50 observability: add two more timers (#5124)
* add two more timers

* Update node/network/availability-recovery/src/metrics.rs

* Try to improve comments spelling

* Cargo fmt iteration

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>
2022-03-15 15:23:05 +00:00
Bernhard Schuster d631f1dea8 observability: tracing gum, automatically cross ref traceID (#5079)
* add some gum

* bump expander

* gum

* fix all remaining issues

* last fixup

* Update node/gum/proc-macro/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* change

* network

* fixins

* chore

* allow optional fmt str + args, prep for expr as kv field

* tracing -> gum rename fallout

* restrict further

* allow multiple levels of field accesses

* another round of docs and a slip of the pen

* update ADR

* fixup lock file

* use target: instead of target=

* minors

* fix

* chore

* Update node/gum/README.md

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
2022-03-15 11:05:16 +00:00
Robert Habermeier 49f7e5cce4 Finish migration to v2 primitives (#5037)
* remove v0 primitives from polkadot-primitives

* first pass: remove v0

* fix fallout in erasure-coding

* remove v1 primitives, consolidate to v2

* the great import update

* update runtime_api_impl_v1 to v2 as well

* guide: add `Version` request for runtime API

* add version query to runtime API

* reintroduce OldV1SessionInfo in a limited way
2022-03-09 14:01:13 -06:00
Bernhard Schuster d946582707 fatality based errors (#4448)
* seed commit for fatality based errors

* fatality

* first draft of fatality

* cleanup

* different approach

* simplify

* first working version for enums, with documentation

* add split

* fix simple split test case

* extend README.md

* update fatality impl

* make tests pass

* apply fatality to first subsystem

* fatality fixes

* use fatality in a subsystem

* fix subsystem

* fixup proc macro

* fix/test: log::*! do not execute when log handler is missing

* fix spelling

* rename Runtime2 to something sane

* allow nested split with `forward` annotations

* add free license

* enable and fixup all tests

* use external fatality

Makes this more reviewable.

* bump fatality dep

Avoid duplicate expander compilations.

* migrate availability distribution

* more fatality usage

* chore: bump fatality to 0.0.6

* fixup remaining subsystems

* chore: fmt

* make cargo spellcheck happy

* remove single instance of `#[fatal(false)]`

* last quality sweep

* fixup
2022-02-25 17:25:26 +00:00
Robert Klotzner f2bdd99532 Add some docs to prevent a time loop. (#4702)
* Add some docs to prevent a time loop.

* Review remarks.
2022-01-13 08:15:13 +00:00
Andronik Ordian b342ae11d3 session-info: add new fields + migration (#4545)
* session_info: v2 + migration

* use primitives::v2

* use polkadot_primitives::v2

* impl primitives::v2

* fix approval-voting tests

* fix other tests

* hook storage migration up

* backwards compat (1)

* backwards compat (2)

* fmt

* fix tests

* FMT

* do not reexport v1 in v2

* fmt

* set storage version to 1

Co-authored-by: Javier Viola <javier@parity.io>
2021-12-27 08:01:30 +00:00
Robert Klotzner 34339c6805 Don't cache unavailable results. (#4509) 2021-12-10 23:52:20 +01:00
Andronik Ordian fa1080a03a req/resp: use IfDisconnected::ImmediateError (#4253)
* req/resp: use IfDisconnected::ImmediateError

* remove outdated comments

* fmt
2021-11-12 17:01:52 +00:00
sandreim b0f89bbfbc Per subsystem CPU usage tracking (#4239)
* SubsystemContext: add subsystem name str

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Overseer builder proc macro changes

* initialize SubsystemContext name field.
* Add subsystem name in TaskKind::launch_task()

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Update ToOverseer enum

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Assign subsystem names to orphan tasks

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* cargo fmt

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Rebase changes for new spawn() group param

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Add subsystem constant in JobTrait

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Add subsystem string

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix tests

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix spawn() calls

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* cargo fmt

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix tests

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* fix

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix more tests

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Address PR review feedback #1

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Address PR review round 2

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fixes
- remove JobTrait::Subsystem
- fix tests

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* update Cargo.lock

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-11-11 18:53:37 +00:00
Bernhard Schuster edac78d03c availability recovery type name clarifications (#4203)
* minor changes

* fmt

* rename to expressive types

* chore: fixup

* chore: remove `Data` prefixes

* address review comments

* guide items

* sourcer -> source, add `FromValidators` suffix
2021-11-08 13:43:23 +00:00
Robert Klotzner a14b667723 Fix flaky availability-recovery test (#3812)
* Increase timeout in tests.

Fixes #3798

* Fix timeout.
2021-09-08 12:56:45 +00:00
Lldenaurois 2bd84151ed Add tests and modify as_vec implementation (#3715)
* Add tests and modify as_vec implementation

* Address feedback

* fix typo in test
2021-09-06 13:24:04 +02:00
Robert Klotzner ffcde1e5e7 Fixes/improvements for disputes (#3753)
* More debugging output.

* Fix chain selection in case of disputes.

* Fix flaky test.
2021-09-01 14:25:56 -05:00
Robert Klotzner e56efb82d9 Further improved availability recovery (#3711)
* WiP.

* Things compile.

* cargo fmt

* Passing tests + fix warnings.

* Metrics for availability recovery.

* Basic test.

* Fix typos and actually check for overflow.

* cargo fmt

* Register metrics.

* More tests.

* Fix warning.

* cargo +nightly fmt

* Fix metrics

* Get rid of unsafe.

* tabify

* spellcheck

Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <info@kchr.de>
2021-08-27 18:59:23 +02:00
Lldenaurois 9b45483cb1 backing-availability-audit: Move ErasureChunk Proof to BoundedVec (#3626)
* backing-availability-audit: Move ErasureChunk Proof to BoundedVec

* WIP

* Touch up

* Fix spelling mistake

* Address Feedback
2021-08-24 12:50:33 -04:00
Robert Klotzner 489a8e6da1 Fill up requests slots via launch_parallel_requests (#3681)
in case waiting for the next response takes too long.
2021-08-24 15:05:25 +02:00
Robert Klotzner d6abe70c06 Better logs. (#3650) 2021-08-19 20:07:59 +02:00
Robert Klotzner 55154a8d37 Remove request multiplexer (#3624)
* WIP: Get rid of request multiplexer.

* WIP

* Receiver for handling of incoming requests.

* Get rid of useless `Fault` abstraction.

The things the type system lets us do are not worth abstracting in
their own type. Instead, error handling is going to be merely a pattern.

* Make most things compile again.

* Port availability distribution away from request multiplexer.

* Formatting.

* Port dispute distribution over.

* Fixup statement distribution.

* Handle request directly in collator protocol.

+ Only allow fatal errors at top level.

* Use direct request channel for availability recovery.

* Finally get rid of request multiplexer

Fixes #2842 and paves the way for more back pressure possibilities.

* Fix overseer and statement distribution tests.

* Fix collator protocol and network bridge tests.

* Fix tests in availability recovery.

* Fix availability distribution tests.

* Fix dispute distribution tests.

* Add missing dependency

* Typos.

* Review remarks.

* More remarks.
2021-08-12 13:11:36 +02:00
Sergei Shulepov 68c03f66f3 Mass replace ,); pattern (#3580)
This is an artifact left by rustfmt, which, being conservative, does not
dare to remove the comma.
2021-08-05 19:53:17 +02:00
Shawn Tabrizi ff5d56fb76 cargo +nightly fmt (#3540)
* cargo +nightly fmt

* add cargo-fmt check to ci

* update ci

* fmt

* fmt

* skip macro

* ignore bridges
2021-08-02 10:47:33 +00:00
Bernhard Schuster 3c9104daff refactor overseer into proc-macro based pattern (#2962) 2021-07-08 21:09:26 +02:00
Robert Klotzner f293fb1025 Fix busy loops. (#3392) 2021-07-01 08:44:13 +02:00
Andronik Ordian ffc6f7c731 make ctx.spawn blocking (#3337)
* make spawn sync

* improve error type
2021-06-21 20:43:40 -05:00