Commit Graph

16 Commits

Author SHA1 Message Date
Bernhard Schuster a5fe710cc6 initial jaeger integration (#2047)
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2020-12-11 17:38:55 +01:00
Robert Habermeier 414acdfc54 small improvements for parachains consensus (#2040)
* introduce a waiting period before selecting candidates and bitfields

* add network_bridge=debug tracing for rep

* change to 2.5s timeout in proposer

* pass timeout to proposer

* move timeout back to provisioner

* grumbles

* Update node/core/provisioner/src/lib.rs

* Fix nitpicks

* Fix bug

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Bastian Köcher <git@kchr.de>
2020-11-30 22:52:30 +01:00
André Silva 700b40679c proposer: guard all provisioner data work with timeout (#2026) 2020-11-27 19:39:22 +01:00
dependabot[bot] 748cbfd820 Bump tracing from 0.1.21 to 0.1.22 (#2001)
Bumps [tracing](https://github.com/tokio-rs/tracing) from 0.1.21 to 0.1.22.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.21...tracing-0.1.22)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2020-11-24 09:44:27 +01:00
Andronik Ordian 69b103b1d5 overseer: send_msg should not return an error (#1995)
* send_message should not return an error

* Apply suggestions from code review

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* s/send_logging_error/send_and_log_error

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-11-23 12:42:14 +01:00
Peter Goodspeed-Niklaus e49989971d Add tracing support to node (#1940)
* drop in tracing to replace log

* add structured logging to trace messages

* add structured logging to debug messages

* add structured logging to info messages

* add structured logging to warn messages

* add structured logging to error messages

* normalize spacing and Display vs Debug

* add instrumentation to the various 'fn run'

* use explicit tracing module throughout

* fix availability distribution test

* don't double-print errors

* remove further redundancy from logs

* fix test errors

* fix more test errors

* remove unused kv_log_macro

* fix unused variable

* add tracing spans to collation generation

* add tracing spans to av-store

* add tracing spans to backing

* add tracing spans to bitfield-signing

* add tracing spans to candidate-selection

* add tracing spans to candidate-validation

* add tracing spans to chain-api

* add tracing spans to provisioner

* add tracing spans to runtime-api

* add tracing spans to availability-distribution

* add tracing spans to bitfield-distribution

* add tracing spans to network-bridge

* add tracing spans to collator-protocol

* add tracing spans to pov-distribution

* add tracing spans to statement-distribution

* add tracing spans to overseer

* cleanup
2020-11-20 12:02:04 +01:00
Andronik Ordian 0a8a607a58 update most of the dependencies (#1946)
* update tiny-keccak to 0.2

* update deps except bitvec and shared_memory

* fix some warning after futures upgrade

* remove useless package rename caused by bug in cargo-upgrade

* revert parity-util-mem *

* remove unused import

* cargo update

* remove all renames on parity-scale-codec

* remove the leftovers

* remove unused dep
2020-11-17 11:16:31 +01:00
Bastian Köcher 04da99de58 Moare fixes for parachains (#1911)
* Moare fixes for parachains

- Sending data to a job should always contain a relay parent. Done this
for the provisioner
- Fixed the `select_availability_bitfields` function. It was assuming we
have one core per validator, while we only have one core per parachain.
- Drive by async "rewrite" in proposer

* Make tests compile

* Update primitives/src/v1.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-11-03 18:16:47 +01:00
Peter Goodspeed-Niklaus 1a25c41277 start working on building the real overseer (#1795)
* start working on building the real overseer

Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.

* fill in AllSubsystems internal constructors

* replace fn make_metrics with Metrics::attempt_to_register

* update to account for #1740

* remove Metrics::register, rename Metrics::attempt_to_register

* add 'static bounds to real_overseer type params

* pass authority_discovery and network_service to real_overseer

It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.

* select a proper database configuration for the availability store db

* use subdirectory for av-store database path

* apply Basti's patch which avoids needing to parameterize everything on Block

* simplify path extraction

* get all tests to compile

* Fix Prometheus double-registry error

for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:

```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
	err
}),
```

That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!

* authorities must have authority discovery, but not necessarily overseer handlers

* fix broken SpawnedSubsystem impls

detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.

This is not the final fix needed to get things working properly,
but it's a good start.

* use prometheus properly

Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.

We were using them wrong, though. This pattern was repeated in a
variety of places in the code:

```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```

The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.

This commit fixes that.

* get av-store subsystem to actually run properly and not die on first signal

* typo fix: incomming -> incoming

* don't disable authority discovery in test nodes

* Fix rococo-v1 missing session keys

* Update node/core/av-store/Cargo.toml

* try dummying out av-store on non-full-nodes

* overseer and subsystems are required only for full nodes

* Reduce the amount of warnings on browser target

* Fix two more warnings

* InclusionInherent should actually have an Inherent module on rococo

* Ancestry: don't return genesis' parent hash

* Update Cargo.lock

* fix broken test

* update test script: specify chainspec as script argument

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* node/service/src/lib: Return error via ? operator

* post-merge blues

* add is_collator flag

* prevent occasional av-store test panic

* simplify fix; expand application

* run authority_discovery in Role::Discover when collating

* distinguish between proposer closed channel errors

* add IsCollator enum, remove is_collator CLI flag

* improve formatting

* remove nop loop

* Fix some stuff

Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
2020-10-28 10:26:50 +00:00
Bastian Köcher 1a3780aa79 Fix proposer factory prometheus registration (#1862)
The proposer wasn't registered in prometheus and thus, we could not see
its metrics.
2020-10-27 21:55:12 +01:00
Bernhard Schuster f345123748 introduce errors with info (#1834) 2020-10-27 08:10:03 +01:00
Cecile Tonglet fd138d4adb Companion PR for substrate #7328 (#1825)
* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* WIP

* "Update Substrate"

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: parity-processbot <>
2020-10-21 15:32:06 +00:00
Andronik Ordian 8eecdd61b8 proposer: wait for a hash to be in the active-leaves set (#1616)
* overseer: add ExternalRequest to Event

* proposer: wait for the hash to be activated

* update comments

* overseer: handle unbounded growth of listeners map

* overseer: fix compilation

* overseer: clean up dead listeners

* overseer: cosmetic changes

* overseer: cosmetic changes t.2

* overseer: add debug_assertions

* overseer: fix formatting
2020-08-20 13:43:36 +00:00
Bastian Köcher 8f3317d056 Update scale codec to latest version to fix bug in future rustc version (#1491)
* Update scale codec to latest version to fix bug in future rustc version

Companion of: https://github.com/paritytech/substrate/pull/6746

* 'Update substrate'

Co-authored-by: parity-processbot <>
2020-07-28 20:43:38 +00:00
Robert Habermeier 3b13cd9a85 Refactor primitives (#1383)
* create a v1 primitives module

* Improve guide on availability types

* punctuate

* new parachains runtime uses new primitives

* tests of new runtime now use new primitives

* add ErasureChunk to guide

* export erasure chunk from v1 primitives

* subsystem crate uses v1 primitives

* node-primitives uses new v1 primitives

* port overseer to new primitives

* new-proposer uses v1 primitives (no ParachainHost anymore)

* fix no-std compilation for primitives

* service-new uses v1 primitives

* network-bridge uses new primitives

* statement distribution uses v1 primitives

* PoV distribution uses v1 primitives; add PoV::hash fn

* move parachain to v0

* remove inclusion_inherent module and place into v1

* remove everything from primitives crate root

* remove some unused old types from v0 primitives

* point everything else at primitives::v0

* squanch some warns up

* add RuntimeDebug import to no-std as well

* port over statement-table and validation

* fix final errors in validation and node-primitives

* add dummy Ord impl to committed candidate receipt

* guide: update CandidateValidationMessage

* add primitive for validationoutputs

* expand CandidateValidationMessage further

* bikeshed

* add some impls to omitted-validation-data and available-data

* expand CandidateValidationMessage

* make erasure-coding generic over v1/v0

* update usages of erasure-coding

* implement commitments.hash()

* use Arc<Pov> for CandidateValidation

* improve new erasure-coding method names

* fix up candidate backing

* update docs a bit

* fix most tests and add short-circuiting to make_pov_available

* fix remainder of candidate backing tests

* squanching warns

* squanch it up

* some fallout

* overseer fallout

* free from polkadot-test-service hell
2020-07-09 21:23:03 -04:00
Peter Goodspeed-Niklaus f2104562d8 implement custom proposer (#1320)
* network bridge skeleton

* move some primitives around and add debug impls

* protocol registration glue & abstract network interface

* add send_msgs to subsystemctx

* select logic

* transform different events into actions and handle

* implement remaining network bridge state machine

* start test skeleton

* make network methods asynchronous

* extract subsystem out to subsystem crate

* port over overseer to subsystem context trait

* fix minimal example

* fix overseer doc test

* update network-bridge crate

* write a subsystem test-helpers crate

* write a network test helper for network-bridge

* set up (broken) view test

* Revamp network to be more async-friendly and not require Sync

* fix spacing

* fix test compilation

* insert side-channel for actions

* Add some more message types to AllMessages

* introduce a test harness

* impl ProvideInherent for InclusionInherent

* reduce import churn; correct expect message

* move inclusion inherent identifier into primitives

It's not clear precisely why this is desired, but it's a pattern
I've seen in several places, so I'm going this to be on the
safe side. Worst case, we can revert this commit pretty easily.

* bump kusama spec_version to placate CI

* copy sc_basic_authorship::{ProposerFactory, Proposer}

We have from the problem description:

> This Proposer will require an OverseerHandle to make requests via.

That's next on the plate.

* use polkadot custom proposer instead of basic-authorship one

* add some tests

* ensure service compiles and passes tests

* fix typo

* fix service-new compilation

* Subsystem test helpers send messages synchronously

* remove smelly action inspector

* remove superfluous let binding

* fix warnings

* add license header

* empty commit; maybe github will notice the one with changes

* Update node/network/bridge/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* add sanity check to only include valid inherents

* stub: encapsulate block production mechanics instead of copying them

The goal is to end up with something like what's in
validation::block_production::*, which encapsulates
basic block production mechanics. This is a better idea than
just straight-up copying those mechanics.

* partial implementation of propose fn

Doesn't actually compile yet; need to bring in some other
commits to ensure ProvisionerMessage is a thing, and also
figure out how to get the block hash given the current
context.

* fix compilation

* clear a few more compile errors

* finish fn propose

* broken: add timeout to proposal

* add timeout to proposal

* guide: provisioner is responsible for selecting parachain candidates

* implement ProvisionerMessage::RequestInherentData & update fn propose

* impl CreateProposer::init; clean up

* impl std::error::Error for Error

* document error-handling rationale

* cause polkadot-service-new to compile correctly

* Move potentially-blocking call from fn init -> fn propose

This means that we can wrap the delayed call into the same
timeout check used elsewhere.

* document struct Proposer

* extract provisioner data fetch

This satisfies two requirements:

- only applies the timeout to actually fetching the provisioner data,
  not to constructing the block after
- simplifies the problem of injecting default data if we could not
  get the real provisioner data in time.

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
Co-authored-by: Gavin Wood <gavin@parity.io>
2020-07-05 19:22:52 +02:00