Commit Graph

543 Commits

Author SHA1 Message Date
Bastian Köcher 8cb049a75a Fixes bug that collator wasn't sending Declare message (#1895)
* Fixes bug that collator wasn't sending `Declare` message

* Apply suggestions from code review

I'm an idiot
2020-11-01 13:40:53 +01:00
Bastian Köcher f82de7b993 Adds test parachain adder collator (#1864)
* start working on building the real overseer

Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.

* fill in AllSubsystems internal constructors

* replace fn make_metrics with Metrics::attempt_to_register

* update to account for #1740

* remove Metrics::register, rename Metrics::attempt_to_register

* add 'static bounds to real_overseer type params

* pass authority_discovery and network_service to real_overseer

It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.

* select a proper database configuration for the availability store db

* use subdirectory for av-store database path

* apply Basti's patch which avoids needing to parameterize everything on Block

* simplify path extraction

* get all tests to compile

* Fix Prometheus double-registry error

for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:

```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
	err
}),
```

That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!

* authorities must have authority discovery, but not necessarily overseer handlers

* fix broken SpawnedSubsystem impls

detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.

This is not the final fix needed to get things working properly,
but it's a good start.

* use prometheus properly

Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.

We were using them wrong, though. This pattern was repeated in a
variety of places in the code:

```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```

The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.

This commit fixes that.

* get av-store subsystem to actually run properly and not die on first signal

* typo fix: incomming -> incoming

* don't disable authority discovery in test nodes

* Fix rococo-v1 missing session keys

* Update node/core/av-store/Cargo.toml

* try dummying out av-store on non-full-nodes

* overseer and subsystems are required only for full nodes

* Reduce the amount of warnings on browser target

* Fix two more warnings

* InclusionInherent should actually have an Inherent module on rococo

* Ancestry: don't return genesis' parent hash

* Update Cargo.lock

* fix broken test

* update test script: specify chainspec as script argument

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* node/service/src/lib: Return error via ? operator

* post-merge blues

* add is_collator flag

* prevent occasional av-store test panic

* simplify fix; expand application

* run authority_discovery in Role::Discover when collating

* distinguish between proposer closed channel errors

* add IsCollator enum, remove is_collator CLI flag

* improve formatting

* remove nop loop

* Fix some stuff

* Adds test parachain adder collator

* Add sudo to Rococo, change session length to 30 seconds and some renaming

* Update to the latest changes on master

* Some fixes

* Fix compilation

* Update parachain/test-parachains/adder/collator/src/lib.rs

Co-authored-by: Sergei Shulepov <sergei@parity.io>

* Review comments

* Downgrade transaction version

* Fixes

* MOARE

* Register notification protocols

* utils: remove unused error

* av-store: more resilient to some errors

* address review nits

* address more review nits

Co-authored-by: Peter Goodspeed-Niklaus <peter.r.goodspeedniklaus@gmail.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
Co-authored-by: Sergey Shulepov <s.pepyakin@gmail.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
2020-10-31 17:01:46 +00:00
Bastian Köcher a4b92cd3b6 Make sure validator discovery works with a delayed peer to validator mapping (#1886)
* Make sure validator discovery works with a delayed peer to validator mapping

Currently the implementation checks on connect of a peer if this peer is
a validator by asking the authority discovery. It can now happen that
the authority discovery is not yet aware that a given peer is an
authority. This can for example happen on start up of the node.

This pr changes the behavior, to make it possible to later associate a
peer to a validator id. Instead of just storing the connected
validators, we now store all connected peers with a vector of associated
validator ids. When we get a request to connect to a given given set of
validators, we start by checking the connected peers. If we didn't find
a validator id in the connected peers, we ask the authority discovery
for the peerid of a given authority id. When the returned peerid is part
of our connected peers set, we cache and return the authority id.

* Update node/network/bridge/Cargo.toml

Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>

* Update node/network/bridge/src/validator_discovery.rs

Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>

* Update `Cargo.lock`

Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
2020-10-29 17:43:34 +01:00
Fedor Sakharov 935fcd1666 Change SpawnedSubsystem type to log subsystem errors (#1878)
* Change SpawnedSubsystem type to log subsystem errors

* Remove clone
2020-10-28 20:57:06 +01:00
Sergei Shulepov 9903bca259 Downward Message Processing implementation (#1859)
* DMP: data structures and plumbing

* DMP: Implement DMP logic in the router module

DMP: Integrate DMP parts into the inclusion module

* DMP: Introduce the max size limit for the size of a downward message

* DMP: Runtime API for accessing inbound messages

* OCD

Small clean ups

* DMP: fix the naming of the error

* DMP: add caution about a non-existent recipient
2020-10-28 10:41:42 +00:00
Peter Goodspeed-Niklaus 1a25c41277 start working on building the real overseer (#1795)
* start working on building the real overseer

Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.

* fill in AllSubsystems internal constructors

* replace fn make_metrics with Metrics::attempt_to_register

* update to account for #1740

* remove Metrics::register, rename Metrics::attempt_to_register

* add 'static bounds to real_overseer type params

* pass authority_discovery and network_service to real_overseer

It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.

* select a proper database configuration for the availability store db

* use subdirectory for av-store database path

* apply Basti's patch which avoids needing to parameterize everything on Block

* simplify path extraction

* get all tests to compile

* Fix Prometheus double-registry error

for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:

```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
	err
}),
```

That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!

* authorities must have authority discovery, but not necessarily overseer handlers

* fix broken SpawnedSubsystem impls

detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.

This is not the final fix needed to get things working properly,
but it's a good start.

* use prometheus properly

Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.

We were using them wrong, though. This pattern was repeated in a
variety of places in the code:

```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```

The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.

This commit fixes that.

* get av-store subsystem to actually run properly and not die on first signal

* typo fix: incomming -> incoming

* don't disable authority discovery in test nodes

* Fix rococo-v1 missing session keys

* Update node/core/av-store/Cargo.toml

* try dummying out av-store on non-full-nodes

* overseer and subsystems are required only for full nodes

* Reduce the amount of warnings on browser target

* Fix two more warnings

* InclusionInherent should actually have an Inherent module on rococo

* Ancestry: don't return genesis' parent hash

* Update Cargo.lock

* fix broken test

* update test script: specify chainspec as script argument

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* node/service/src/lib: Return error via ? operator

* post-merge blues

* add is_collator flag

* prevent occasional av-store test panic

* simplify fix; expand application

* run authority_discovery in Role::Discover when collating

* distinguish between proposer closed channel errors

* add IsCollator enum, remove is_collator CLI flag

* improve formatting

* remove nop loop

* Fix some stuff

Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
2020-10-28 10:26:50 +00:00
Bernhard Schuster f345123748 introduce errors with info (#1834) 2020-10-27 08:10:03 +01:00
Fedor Sakharov ec49d81307 Availability store pruning (#1820)
* Initial commit

* Move tests to separate module

* Move timestamps to the newtype

* Change idx name

* Use Duration for consts and update chunk records

* Ordering::Equal

* Update node/core/av-store/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* put_ methods do the array sorting

* Fix get_block_number

* Change StoreChunk message type and relay parent method

* Add chunk tests

* Fix block number computation for StoreChunk

* Duration instead of u64 everywhere

* Add a clarifying comment

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-10-24 06:03:57 +00:00
Andronik Ordian e62b300f47 Companion PR for #7247 (incremental priority group updates) (#1800)
* validator discovery: use incremental updates for priority_group

* validator discovery: fix compilation

* validator discovery: remove Sync bound on Net

* "Update Substrate"

Co-authored-by: parity-processbot <>
2020-10-09 14:34:17 +00:00
Rakan Alhneiti bd75a4ce18 Update to work with async keystore – Companion PR for #7000 (#1740)
* Fix keystore types

* Use SyncCryptoStorePtr

* Borrow keystore

* Fix unused imports

* Fix polkadot service

* Fix bitfield-distribution tests

* Fix indentation

* Fix backing tests

* Fix tests

* Fix provisioner tests

* Removed SyncCryptoStorePtr

* Fix services

* Address PR feedback

* Address PR feedback - 2

* Update CryptoStorePtr imports to be from sp_keystore

* Typo

* Fix CryptoStore import

* Document the reason behind using filesystem keystore

* Remove VALIDATORS

* Fix duplicate dependency

* Mark sp-keystore as optional

* Fix availability distribution

* Fix call to sign_with

* Fix keystore usage

* Remove tokio and fix parachains Cargo config

* Typos

* Fix keystore dereferencing

* Fix CryptoStore import

* Fix provisioner

* Fix node backing

* Update services

* Cleanup dependencies

* Use sync_keystore

* Fix node service

* Fix node service - 2

* Fix node service - 3

* Rename CryptoStorePtr to SyncCryptoStorePtr

* "Update Substrate"

* Apply suggestions from code review

* Update node/core/backing/Cargo.toml

* Update primitives/src/v0.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Fix wasm build

* Update Cargo.lock

Co-authored-by: parity-processbot <>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2020-10-09 10:54:03 +00:00
Fedor Sakharov 7f4646505a Advertise to already connected validators (#1790)
* Advertise to already connected validators

* Merge the loops and check the view

* Extend a test to capture new logic

* Fix a comment

* Update node/network/collator-protocol/src/collator_side.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update comment

Co-authored-by: Andronik Ordian <write@reusable.software>
2020-10-07 10:42:08 +00:00
Andronik Ordian abf76d27dc collator: fix a typo (#1788)
* collator: fix a typo

* collator: fix more typos

* collator: fix even more typos
2020-10-06 13:00:18 +00:00
Andronik Ordian ca89e3edbe NetworkBridge: validator (authorities) discovery api (#1699)
* stupid, but it compiles

* redo

* cleanup

* add ValidatorDiscovery to msgs

* sketch network bridge code

* ConnectToAuthorities instead of validators

* more stuff

* cleanup

* more stuff

* complete ConnectToAuthoritiesState

* Update node/network/bridge/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Collator protocol subsystem (#1659)

* WIP

* The initial implementation of the collator side.

* Improve comments

* Multiple collation requests

* Add more tests and comments to validator side

* Add comments, remove dead code

* Apply suggestions from code review

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Fix build after suggested changes

* Also connect to the next validator group

* Remove a Future impl and move TimeoutExt to util

* Minor nits

* Fix build

* Change FetchCollations back to FetchCollation

* Try this

* Final fixes

* Fix build

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* handle multiple in-flight connection requests

* handle cancelled requests

* Update node/core/runtime-api/src/lib.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* redo it again

* more stuff

* redo it again

* update comments

* workaround Future is not Send

* fix trailing spaces

* clarify comments

* bridge: fix compilation in tests

* update more comments

* small fixes

* port collator protocol to new validator discovery api

* collator tests compile

* collator tests pass

* do not revoke a request when the stream receiver is closed

* make revoking opt-in

* fix is_fulfilled

* handle request revokation in collator

* tests

* wait for validator connections asyncronously

* fix compilation

* relabel my todos

* apply Fedor's patch

* resolve reconnection TODO

* resolve revoking TODO

* resolve channel capacity TODO

* resolve peer cloning TODO

* resolve peer disconnected TODO

* resolve PeerSet TODO

* wip tests

* more tests

* resolve Arc TODO

* rename pending to non_revoked

* one more test

* extract utility function into util crate

* fix compilation in tests

* Apply suggestions from code review

Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>

* revert pin_project removal

* fix while let loop

* Revert "revert pin_project removal"

This reverts commit ae7f529d8de982ef66c3007dd1ff74c6ddce80d2.

* fix compilation

* Update node/subsystem/src/messages.rs

* docs on pub items

* guide updates

* remove a TODO

* small guide update

* fix a typo

* link to the issue

* validator discovery: on_request docs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
2020-10-06 13:34:57 +02:00
Bastian Köcher a4662104db Make collation an optional return (#1787)
This pr changes the collator interface function to return an optional
collation instead of a collation. This is required as the parachain
itself can fail to generate a valid collation for various reason. Now if
the collation fails it will return `None`.

Besides that the pr adds some `RuntimeDebug` derive for `ValidationData`
and removes some whitespaces.
2020-10-06 11:57:10 +02:00
Andronik Ordian 579614d127 implement remaining subsystem metrics (#1770)
* overseer metrics: messages relayed

* provisioner metrics: cosmetic changes

* candidate selection metrics: cosmetic changes

* availability bitfields metrics

* availability distribution metrics

* PoV distribution metrics

* statement-distribution: small simplification

* statement-distribution: extract log target into a const

* statement-distribution: metrics

* address review nits
2020-10-01 12:08:03 +02:00
Andronik Ordian de05bec4d6 move Metrics to utils (#1765) 2020-09-29 11:42:20 +00:00
Fedor Sakharov 9756b2d676 Register listeners in statement distribution (#1759)
* Register listeners in statement distribution

* Review fixes
2020-09-28 13:55:59 +00:00
Fedor Sakharov 32cc9e7a44 Collator protocol followup (#1741)
* Metrics

* Dont punish late collations

* Fix metrics

* Update node/network/collator-protocol/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Change on_request arg to Result

Co-authored-by: Andronik Ordian <write@reusable.software>
2020-09-28 12:40:40 +00:00
Fedor Sakharov 98660cbd94 Collator protocol subsystem (#1659)
* WIP

* The initial implementation of the collator side.

* Improve comments

* Multiple collation requests

* Add more tests and comments to validator side

* Add comments, remove dead code

* Apply suggestions from code review

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Fix build after suggested changes

* Also connect to the next validator group

* Remove a Future impl and move TimeoutExt to util

* Minor nits

* Fix build

* Change FetchCollations back to FetchCollation

* Try this

* Final fixes

* Fix build

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-09-10 16:54:59 +03:00
Bernhard Schuster 4282a269bf chore: nits that were left in bitfield dist (#1665) 2020-08-31 17:34:57 +02:00
Max Inden 87f6afd862 node/network/bridge: Define protocol names as str (#1655)
* node/network/bridge: Define protocol names as str

* "Update Substrate"

Co-authored-by: parity-processbot <>
2020-08-28 15:51:50 +00:00
Fedor Sakharov cc19f13468 use own timeout in tests instead of smol-timeout (#1618) 2020-08-20 20:27:09 +03:00
Robert Habermeier 262574fc49 Implement validation data refactor (#1585)
* update primitives

* correct parent_head field

* make hrmp field pub

* refactor validation data: runtime

* refactor validation data: messages

* add arguments to full_validation_data runtime API

* port runtime API

* mostly port over candidate validation

* remove some parameters from ValidationParams

* guide: update candidate validation

* update candidate outputs

* update ValidationOutputs in primitives

* port over candidate validation

* add a new test for no-transient behavior

* update util runtime API wrappers

* candidate backing

* fix missing imports

* change some fields of validation data around

* runtime API impl

* update candidate validation

* fix backing tests

* grumbles from review

* fix av-store tests

* fix some more crates

* fix provisioner tests

* fix availability distribution tests

* port collation-generation to new validation data

* fix overseer tests

* Update roadmap/implementers-guide/src/node/utility/candidate-validation.md

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-08-18 14:41:40 +02:00
Andronik Ordian e7ead40255 initial prometheus metrics (#1536)
* service-new: cosmetic changes

* overseer: draft of prometheus metrics

* metrics: update active_leaves metrics

* metrics: extract into functions

* metrics: resolve XXX

* metrics: it's ugly, but it works

* Bump Substrate

* metrics: move a bunch of code around

* Bumb substrate again

* metrics: fix a warning

* fix a warning in runtime

* metrics: statements signed

* metrics: statements impl RegisterMetrics

* metrics: refactor Metrics trait

* metrics: add Metrics assoc type to JobTrait

* metrics: move Metrics trait to util

* metrics: fix overseer

* metrics: fix backing

* metrics: fix candidate validation

* metrics: derive Default

* metrics: docs

* metrics: add stubs for other subsystems

* metrics: add more stubs and fix compilation

* metrics: fix doctest

* metrics: move to subsystem

* metrics: fix candidate validation

* metrics: bitfield signing

* metrics: av store

* metrics: chain API

* metrics: runtime API

* metrics: stub for avad

* metrics: candidates seconded

* metrics: ok I gave up

* metrics: provisioner

* metrics: remove a clone by requiring Metrics: Sync

* metrics: YAGNI

* metrics: remove another TODO

* metrics: for later

* metrics: add parachain_ prefix

* metrics: s/signed_statement/signed_statements

* utils: add a comment for job metrics

* metrics: address review comments

* metrics: oops

* metrics: make sure to save files before commit 😅

* use _total suffix for requests metrics

Co-authored-by: Max Inden <mail@max-inden.de>

* metrics: add tests for overseer

* update Cargo.lock

* overseer: add a test for CollationGeneration

* collation-generation: impl metrics

* collation-generation: use kebab-case for name

* collation-generation: add a constructor

Co-authored-by: Gav Wood <gavin@parity.io>
Co-authored-by: Ashley Ruglys <ashley.ruglys@gmail.com>
Co-authored-by: Max Inden <mail@max-inden.de>
2020-08-18 11:18:54 +02:00
Robert Habermeier a6b1d91d6e Network bridge refactoring impl (#1537)
* update networking types

* port over overseer-protocol message types

* Add the collation protocol to network bridge

* message sending

* stub for ConnectToValidators

* add some helper traits and methods to protocol types

* add collator protocol message

* leaves-updating

* peer connection and disconnection

* add utilities for dispatching multiple events

* implement message handling

* add an observedrole enum with equality and no sentry nodes

* derive partial-eq on network bridge event

* add PartialEq impls for network message types

* add Into implementation for observedrole

* port over existing network bridge tests

* add some more tests

* port bitfield distribution

* port over bitfield distribution tests

* add codec indices

* port PoV distribution

* port over PoV distribution tests

* port over statement distribution

* port over statement distribution tests

* update overseer and service-new

* address review comments

* port availability distribution

* port over availability distribution tests
2020-08-12 11:16:28 +00:00
Bernhard Schuster 4bdfd02f93 impl availability distribution
Closes #1237
2020-08-10 15:02:30 +02:00
Peter Goodspeed-Niklaus 9fda6cb416 break out subsystem-util and subsystem-test-helpers into individual crates (#1553)
* break out subsystem-util and subsystem-test-helpers into individual crates

* cause all packages to check successfully
2020-08-07 16:51:36 +02:00
Bastian Köcher 8f3317d056 Update scale codec to latest version to fix bug in future rustc version (#1491)
* Update scale codec to latest version to fix bug in future rustc version

Companion of: https://github.com/paritytech/substrate/pull/6746

* 'Update substrate'

Co-authored-by: parity-processbot <>
2020-07-28 20:43:38 +00:00
Ashley 7c7b02ece0 Companion PR for Various small improvements to service construction.. (#1472)
* Initial commit

Forked at: 1ed17cd467
Parent branch: origin/master

* Refactor

* Refactor

* Remove macro

* WIP

Forked at: 1ed17cd467
Parent branch: origin/master

* CLEANUP

Forked at: 1ed17cd467
Parent branch: origin/master

* small fix

* fix for browser

* Switch branch

* Rewrite service builds

* Update branch

* Fix sp-core branch

* Switch branch back and update

Co-authored-by: Cecile Tonglet <cecile.tonglet@cecton.com>
2020-07-28 20:18:11 +02:00
Robert Habermeier c8cdfbfd17 Add new Runtime API messages and make runtime API request fallible (#1485)
* polkadot-subsystem: update runtime API message types

* update all networking subsystems to use fallible runtime APIs

* fix bitfield-signing and make it use new runtime APIs

* port candidate-backing to handle runtime API errors and new types

* remove old runtime API messages

* remove unused imports

* fix grumbles

* fix backing tests
2020-07-28 18:02:39 +00:00
Fedor Sakharov 32a20a178c Availability store subsystem (#1404)
* Initial commit

* WIP

* Make atomic transactions

* Remove pruning code

* Fix build and add a Nop to bridge

* Fixes from review

* Move config struct around for clarity

* Rename constructor and warn on missing docs

* Fix a test and rename a message

* Fix some more reviews

* Obviously failed to rebase cleanly
2020-07-27 14:13:02 +03:00
Peter Goodspeed-Niklaus 106bd929ce add ActiveLeavesUpdate, remove StartWork, StopWork (#1458)
* add ActiveLeavesUpdate, remove StartWork, StopWork

* replace StartWork, StopWork in subsystem crate tests

* mechanically update OverseerSignal in other modules

* convert overseer to take advantage of new multi-hash update abilities

Note: this does not yet convert the tests; some of the tests now freeze:

test tests::overseer_start_stop_works ... test tests::overseer_start_stop_works has been running for over 60 seconds
test tests::overseer_finalize_works ... test tests::overseer_finalize_works has been running for over 60 seconds

* fix broken overseer tests

* manually impl PartialEq for ActiveLeavesUpdate, rm trait Equivalent

This cleans up the code a bit and makes it easier in the future to
do the right thing when comparing ALUs.

* use target in all network bridge logging

* reduce spamming of  and
2020-07-27 10:39:52 +02:00
Bastian Köcher fa598f176b Companion for #6726 (#1469)
* Companion for #6726

* Spaces

* 'Update substrate'

Co-authored-by: parity-processbot <>
2020-07-26 13:16:09 +00:00
Bernhard Schuster a1c704d446 implement bitfield distribution subsystem (#1368)
* feat bitfield distribution

* feat bitfield distribution part 2

* pair programming with rustc & cargo

* lets go

* move bitfield-distribution to the node/network folder

* shape shifting

* lunchtime

* ignore the two fn recursion for now

* step by step

* triplesteps

* bandaid commit

* unordered futures magic

* chore

* reword markdown

* clarify

* lacks abortable processing impl details

* slimify

* fix: warnings and avoid ctx.clone() improve comments

* review comments

* fix details

* make sure outgoing messages are tracked

* fix name

* fix subsystem

* partial test impl

* relax context bounds

* test

* X

* X

* initial test

* fix relay_message not tracked when origin is self

* fix/guide: grammar

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>

* work around missing Eq+PartialEq

* fix: add missing message to provisioner

* unify per_job to job_data

* fix/review: part one

* fix/review: more grumbles

* fix/review: track incoming messages per peer

* fix/review: extract fn, avoid nested matches

* fix/review: more tests, simplify test

* fix/review: extend tests to cover more cases

* chore/rename: Tracker -> ProtocolState

* chore check and comment rewording

* feat test: invalid peer message

* remove ignored test cases and unused macros

* fix master merge fallout + warnings

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
2020-07-23 15:46:22 +03:00
Peter Goodspeed-Niklaus 5cfcc8446c Add test suite and minor refinements to the utility subsystem (#1403)
* get conclude signal working properly; don't allocate a vector

* wip: add test suite / example / explanation for using utility subsystem

Unfortunately, the test fails right now for reasons which seem
very odd. Just have to keep poking at it.

* explicitly import everything

* fix subsystem-util test

The root problem here was two-fold:

- there was a circular dependency from subsystem -> test-helpers/subsystem ->
  subsystem
- cfg(test) doesn't propagate between crates

The solution: move the subsystem test helpers into a sub-module
within subsystem. Publicly export them from the previous location
so no other code breaks.

Doing this has an additional benefit: it ensures that no production
code can ever accidentally use the subsystem helpers, as they are compile-
gated on cfg(test).

* fully commit to moving test helpers into a subsystem module

* add some more tests

* get rid of log tests in favor of real error forwarding

It's not obvious whether we'll ever really want to chase down
these errors outside a testing context, but having the capability
won't hurt.

* fix issue which caused test to hang on osx

* only require that job errors are PartialEq when testing

also fix polkadot-node-core-backing tests

* get rid of any notion of partialeq

* rethink testing

Combine tests of starting and stopping job: leaving a test executor
with a job running was pretty clearly the cause of the sometimes-hang.

Also, add a timeout so tests _can't_ hang anymore; they just fail
after a while.

* rename fwd_errors -> forward_errors

* warn on error propagation failure

* fix unused import leftover from merge

* derive eq for subsystemerror
2020-07-20 20:35:14 -04:00
Robert Habermeier dddde219a2 Implement Runtime APIs (#1411)
* create a README on Runtime APIs

* add ParaId type

* write up runtime APIs

* more preamble

* rename

* rejig runtime APIs

* add occupied_since to `BlockNumber`

* skeleton crate for runtime API subsystem

* improve group_for_core

* improve docs on availability cores runtime API

* guide: freed -> free

* add primitives for runtime APIs

* create a v1 ParachainHost API trait

* guide: make validation code return `Option`al.

* skeleton runtime API helpers

* make parachain-host runtime-generic

* skeleton for most runtime API implementation functions

* guide: add runtime API helper methods

* implement new helpers of the inclusion module

* guide: remove retries check, as it is unneeded

* implement helpers for scheduler module for Runtime APIs

* clean up `validator_groups` implementation

* implement next_rotation_at and last_rotation_at

* guide: more helpers on GroupRotationInfo

* almost finish implementing runtime APIs

* add explicit block parameter to runtime API fns

* guide: generalize number parameter

* guide: add group_responsible to occupied-core

* update primitives due to guide changes

* finishing touches on runtime API implementation; squash warnings

* break out runtime API impl to separate file

* add tests for next_up logic

* test group rotation info

* point to filed TODO

* remove unused TODO [now]

* indentation

* guide: para -> para_id

* rename para field to para_id for core meta

* remove reference to outdated AvailabilityCores type

* add an event in `inclusion` for candidates being included or timing out

* guide: candidate events

* guide: adjust language

* Candidate events type from guide and adjust inclusion event

* implement `candidate_events` runtime API

* fix runtime test compilation

* max -> min

* fix typos

* guide: add `RuntimeAPIRequest::CandidateEvents`
2020-07-18 16:01:51 -04:00
Fedor Sakharov 5624bd8bf4 Use SpawnNamed instead of Spawn in Overseer (#1430)
* Use SpawnNamed instead of Spawn in Overseer

* reexport SpawnNamed and fix doc tests

* Fix deps
2020-07-17 20:04:02 +03:00
Robert Habermeier 3b13cd9a85 Refactor primitives (#1383)
* create a v1 primitives module

* Improve guide on availability types

* punctuate

* new parachains runtime uses new primitives

* tests of new runtime now use new primitives

* add ErasureChunk to guide

* export erasure chunk from v1 primitives

* subsystem crate uses v1 primitives

* node-primitives uses new v1 primitives

* port overseer to new primitives

* new-proposer uses v1 primitives (no ParachainHost anymore)

* fix no-std compilation for primitives

* service-new uses v1 primitives

* network-bridge uses new primitives

* statement distribution uses v1 primitives

* PoV distribution uses v1 primitives; add PoV::hash fn

* move parachain to v0

* remove inclusion_inherent module and place into v1

* remove everything from primitives crate root

* remove some unused old types from v0 primitives

* point everything else at primitives::v0

* squanch some warns up

* add RuntimeDebug import to no-std as well

* port over statement-table and validation

* fix final errors in validation and node-primitives

* add dummy Ord impl to committed candidate receipt

* guide: update CandidateValidationMessage

* add primitive for validationoutputs

* expand CandidateValidationMessage further

* bikeshed

* add some impls to omitted-validation-data and available-data

* expand CandidateValidationMessage

* make erasure-coding generic over v1/v0

* update usages of erasure-coding

* implement commitments.hash()

* use Arc<Pov> for CandidateValidation

* improve new erasure-coding method names

* fix up candidate backing

* update docs a bit

* fix most tests and add short-circuiting to make_pov_available

* fix remainder of candidate backing tests

* squanching warns

* squanch it up

* some fallout

* overseer fallout

* free from polkadot-test-service hell
2020-07-09 21:23:03 -04:00
Robert Habermeier 151d73af5b Implement PoV Distribution Subsystem (#1344)
* introduce candidatedescriptor type

* add PoVDistribution message type

* loosen bound on PoV Distribution to account for equivocations

* re-export some types from the messages module

* begin PoV Distribution subsystem

* remove redundant index from PoV distribution

* define state machine for pov distribution

* handle overseer signals

* set up control flow

* remove `ValidatorStatement` section

* implement PoV fetching

* implement distribution logic

* add missing `

* implement some network bridge event handlers

* stub for message processing, handle our view change

* control flow for handling messages

* handle `awaiting` message

* handle any incoming PoVs and redistribute

* actually provide a subsystem implementation

* remove set-builder notation

* begin testing PoV distribution

* test that we send awaiting messages only to peers with same view

* ensure we distribute awaited PoVs to peers on view changes

* test that peers can complete fetch and are rewarded

* test some reporting logic

* ensure peer is reported for flooding

* test punishing peers diverging from awaited protocol

* test that we eagerly complete peers' awaited PoVs based on what we receive

* test that we prune the awaited set after receiving

* expand pov-distribution in guide to match a change I made

* remove unneeded import
2020-07-08 18:15:39 -04:00
Robert Habermeier ac8e1ca206 Implement the Statement Distribution Subsystem (#1326)
* set up data types and control flow for statement distribution

* add some set-like methods to View

* implement sending to peers

* start fixing equivocation handling

* Add a section to the statement distribution subsystem on equivocations and flood protection

* fix typo and amend wording

* implement flood protection

* have peer knowledge tracker follow when peer first learns about a candidate

* send dependents after circulating

* add another TODO

* trigger send in one more place

* refactors from review

* send new statements to candidate backing

* instantiate active head data with runtime API values

* track our view changes and peer view changes

* apply a benefit to peers who send us statements we want

* remove unneeded TODO

* add some comments and improve Hash implementation

* start tests and fix `note_statement`

* test active_head seconding logic

* test that the per-peer tracking logic works

* test per-peer knowledge tracker

* test that peer view updates lead to messages being sent

* test statement circulation

* address review comments

* have view set methods return references
2020-07-06 11:16:17 -04:00
Robert Habermeier 2a3e607d14 Subsystem::start takes self by-value (#1325)
* Subsystem::start takes self by-value

* fix doc-test compilation
2020-06-30 15:16:37 -04:00
Robert Habermeier d16e7485d4 Implement Network Bridge (#1280)
* network bridge skeleton

* move some primitives around and add debug impls

* protocol registration glue & abstract network interface

* add send_msgs to subsystemctx

* select logic

* transform different events into actions and handle

* implement remaining network bridge state machine

* start test skeleton

* make network methods asynchronous

* extract subsystem out to subsystem crate

* port over overseer to subsystem context trait

* fix minimal example

* fix overseer doc test

* update network-bridge crate

* write a subsystem test-helpers crate

* write a network test helper for network-bridge

* set up (broken) view test

* Revamp network to be more async-friendly and not require Sync

* fix spacing

* fix test compilation

* insert side-channel for actions

* Add some more message types to AllMessages

* introduce a test harness

* add some tests

* ensure service compiles and passes tests

* fix typo

* fix service-new compilation

* Subsystem test helpers send messages synchronously

* remove smelly action inspector

* remove superfluous let binding

* fix warnings

* Update node/network/bridge/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* fix compilation

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-06-30 11:21:40 -04:00
Robert Habermeier 9d5eae6ea3 establish new node folder for overseer, messages, and subsystems (#1200)
* establish new `node` folder for overseer, messages, and subsystems

* extract message types from overseer crate

* remove doc links
2020-06-05 10:48:35 -04:00