Commit Graph

12 Commits

Author SHA1 Message Date
Bastian Köcher 8cb049a75a Fixes bug that collator wasn't sending Declare message (#1895)
* Fixes bug that collator wasn't sending `Declare` message

* Apply suggestions from code review

I'm an idiot
2020-11-01 13:40:53 +01:00
Bastian Köcher f82de7b993 Adds test parachain adder collator (#1864)
* start working on building the real overseer

Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.

* fill in AllSubsystems internal constructors

* replace fn make_metrics with Metrics::attempt_to_register

* update to account for #1740

* remove Metrics::register, rename Metrics::attempt_to_register

* add 'static bounds to real_overseer type params

* pass authority_discovery and network_service to real_overseer

It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.

* select a proper database configuration for the availability store db

* use subdirectory for av-store database path

* apply Basti's patch which avoids needing to parameterize everything on Block

* simplify path extraction

* get all tests to compile

* Fix Prometheus double-registry error

for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:

```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
	err
}),
```

That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!

* authorities must have authority discovery, but not necessarily overseer handlers

* fix broken SpawnedSubsystem impls

detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.

This is not the final fix needed to get things working properly,
but it's a good start.

* use prometheus properly

Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.

We were using them wrong, though. This pattern was repeated in a
variety of places in the code:

```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```

The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.

This commit fixes that.

* get av-store subsystem to actually run properly and not die on first signal

* typo fix: incomming -> incoming

* don't disable authority discovery in test nodes

* Fix rococo-v1 missing session keys

* Update node/core/av-store/Cargo.toml

* try dummying out av-store on non-full-nodes

* overseer and subsystems are required only for full nodes

* Reduce the amount of warnings on browser target

* Fix two more warnings

* InclusionInherent should actually have an Inherent module on rococo

* Ancestry: don't return genesis' parent hash

* Update Cargo.lock

* fix broken test

* update test script: specify chainspec as script argument

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* node/service/src/lib: Return error via ? operator

* post-merge blues

* add is_collator flag

* prevent occasional av-store test panic

* simplify fix; expand application

* run authority_discovery in Role::Discover when collating

* distinguish between proposer closed channel errors

* add IsCollator enum, remove is_collator CLI flag

* improve formatting

* remove nop loop

* Fix some stuff

* Adds test parachain adder collator

* Add sudo to Rococo, change session length to 30 seconds and some renaming

* Update to the latest changes on master

* Some fixes

* Fix compilation

* Update parachain/test-parachains/adder/collator/src/lib.rs

Co-authored-by: Sergei Shulepov <sergei@parity.io>

* Review comments

* Downgrade transaction version

* Fixes

* MOARE

* Register notification protocols

* utils: remove unused error

* av-store: more resilient to some errors

* address review nits

* address more review nits

Co-authored-by: Peter Goodspeed-Niklaus <peter.r.goodspeedniklaus@gmail.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
Co-authored-by: Sergey Shulepov <s.pepyakin@gmail.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
2020-10-31 17:01:46 +00:00
Fedor Sakharov 935fcd1666 Change SpawnedSubsystem type to log subsystem errors (#1878)
* Change SpawnedSubsystem type to log subsystem errors

* Remove clone
2020-10-28 20:57:06 +01:00
Peter Goodspeed-Niklaus 1a25c41277 start working on building the real overseer (#1795)
* start working on building the real overseer

Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.

* fill in AllSubsystems internal constructors

* replace fn make_metrics with Metrics::attempt_to_register

* update to account for #1740

* remove Metrics::register, rename Metrics::attempt_to_register

* add 'static bounds to real_overseer type params

* pass authority_discovery and network_service to real_overseer

It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.

* select a proper database configuration for the availability store db

* use subdirectory for av-store database path

* apply Basti's patch which avoids needing to parameterize everything on Block

* simplify path extraction

* get all tests to compile

* Fix Prometheus double-registry error

for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:

```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
	err
}),
```

That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!

* authorities must have authority discovery, but not necessarily overseer handlers

* fix broken SpawnedSubsystem impls

detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.

This is not the final fix needed to get things working properly,
but it's a good start.

* use prometheus properly

Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.

We were using them wrong, though. This pattern was repeated in a
variety of places in the code:

```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```

The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.

This commit fixes that.

* get av-store subsystem to actually run properly and not die on first signal

* typo fix: incomming -> incoming

* don't disable authority discovery in test nodes

* Fix rococo-v1 missing session keys

* Update node/core/av-store/Cargo.toml

* try dummying out av-store on non-full-nodes

* overseer and subsystems are required only for full nodes

* Reduce the amount of warnings on browser target

* Fix two more warnings

* InclusionInherent should actually have an Inherent module on rococo

* Ancestry: don't return genesis' parent hash

* Update Cargo.lock

* fix broken test

* update test script: specify chainspec as script argument

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* node/service/src/lib: Return error via ? operator

* post-merge blues

* add is_collator flag

* prevent occasional av-store test panic

* simplify fix; expand application

* run authority_discovery in Role::Discover when collating

* distinguish between proposer closed channel errors

* add IsCollator enum, remove is_collator CLI flag

* improve formatting

* remove nop loop

* Fix some stuff

Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
2020-10-28 10:26:50 +00:00
Bernhard Schuster f345123748 introduce errors with info (#1834) 2020-10-27 08:10:03 +01:00
Fedor Sakharov 7f4646505a Advertise to already connected validators (#1790)
* Advertise to already connected validators

* Merge the loops and check the view

* Extend a test to capture new logic

* Fix a comment

* Update node/network/collator-protocol/src/collator_side.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update comment

Co-authored-by: Andronik Ordian <write@reusable.software>
2020-10-07 10:42:08 +00:00
Andronik Ordian abf76d27dc collator: fix a typo (#1788)
* collator: fix a typo

* collator: fix more typos

* collator: fix even more typos
2020-10-06 13:00:18 +00:00
Andronik Ordian ca89e3edbe NetworkBridge: validator (authorities) discovery api (#1699)
* stupid, but it compiles

* redo

* cleanup

* add ValidatorDiscovery to msgs

* sketch network bridge code

* ConnectToAuthorities instead of validators

* more stuff

* cleanup

* more stuff

* complete ConnectToAuthoritiesState

* Update node/network/bridge/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Collator protocol subsystem (#1659)

* WIP

* The initial implementation of the collator side.

* Improve comments

* Multiple collation requests

* Add more tests and comments to validator side

* Add comments, remove dead code

* Apply suggestions from code review

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Fix build after suggested changes

* Also connect to the next validator group

* Remove a Future impl and move TimeoutExt to util

* Minor nits

* Fix build

* Change FetchCollations back to FetchCollation

* Try this

* Final fixes

* Fix build

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* handle multiple in-flight connection requests

* handle cancelled requests

* Update node/core/runtime-api/src/lib.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* redo it again

* more stuff

* redo it again

* update comments

* workaround Future is not Send

* fix trailing spaces

* clarify comments

* bridge: fix compilation in tests

* update more comments

* small fixes

* port collator protocol to new validator discovery api

* collator tests compile

* collator tests pass

* do not revoke a request when the stream receiver is closed

* make revoking opt-in

* fix is_fulfilled

* handle request revokation in collator

* tests

* wait for validator connections asyncronously

* fix compilation

* relabel my todos

* apply Fedor's patch

* resolve reconnection TODO

* resolve revoking TODO

* resolve channel capacity TODO

* resolve peer cloning TODO

* resolve peer disconnected TODO

* resolve PeerSet TODO

* wip tests

* more tests

* resolve Arc TODO

* rename pending to non_revoked

* one more test

* extract utility function into util crate

* fix compilation in tests

* Apply suggestions from code review

Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>

* revert pin_project removal

* fix while let loop

* Revert "revert pin_project removal"

This reverts commit ae7f529d8de982ef66c3007dd1ff74c6ddce80d2.

* fix compilation

* Update node/subsystem/src/messages.rs

* docs on pub items

* guide updates

* remove a TODO

* small guide update

* fix a typo

* link to the issue

* validator discovery: on_request docs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
2020-10-06 13:34:57 +02:00
Bastian Köcher a4662104db Make collation an optional return (#1787)
This pr changes the collator interface function to return an optional
collation instead of a collation. This is required as the parachain
itself can fail to generate a valid collation for various reason. Now if
the collation fails it will return `None`.

Besides that the pr adds some `RuntimeDebug` derive for `ValidationData`
and removes some whitespaces.
2020-10-06 11:57:10 +02:00
Andronik Ordian de05bec4d6 move Metrics to utils (#1765) 2020-09-29 11:42:20 +00:00
Fedor Sakharov 32cc9e7a44 Collator protocol followup (#1741)
* Metrics

* Dont punish late collations

* Fix metrics

* Update node/network/collator-protocol/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Change on_request arg to Result

Co-authored-by: Andronik Ordian <write@reusable.software>
2020-09-28 12:40:40 +00:00
Fedor Sakharov 98660cbd94 Collator protocol subsystem (#1659)
* WIP

* The initial implementation of the collator side.

* Improve comments

* Multiple collation requests

* Add more tests and comments to validator side

* Add comments, remove dead code

* Apply suggestions from code review

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Fix build after suggested changes

* Also connect to the next validator group

* Remove a Future impl and move TimeoutExt to util

* Minor nits

* Fix build

* Change FetchCollations back to FetchCollation

* Try this

* Final fixes

* Fix build

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-09-10 16:54:59 +03:00