Commit Graph

57 Commits

Author SHA1 Message Date
Bernhard Schuster b7a05fd40b [runtime] follow up relay chain cleanups (#4657)
* fix miscalculation of remaining weight

* rename a var

* move out enforcing filtering by dropping inherents

* prepare for dispute statement validity check being split off

* refactor

* refactor, only check disputes we actually want to include

* more refactor and documentation

* refactor and minimize inherent checks

* chore: warnings

* fix a few tests

* fix dedup regression

* fix

* more asserts in tests

* remove some asserts

* chore: fmt

* skip signatures checks, some more

* undo unwatend changes

* Update runtime/parachains/src/paras_inherent/mod.rs

Co-authored-by: sandreim <54316454+sandreim@users.noreply.github.com>

* cleanups, checking CheckedDisputeStatments makes no sense

* integrity, if called create_inherent_inner, it shall do the checks, and not rely on enter_inner

* review comments

* use from impl rather than into

* remove outdated comment

* adjust tests accordingly

* assure no weight is lost

* address review comments

* remove unused import

* split error into two and document

* use assurance, O(n)

* Revert "adjust tests accordingly"

This reverts commit 3cc9a3c449f82db38cea22c48f4a21876603374b.

* fix comment

* fix sorting

* comment

Co-authored-by: sandreim <54316454+sandreim@users.noreply.github.com>
2022-01-20 11:00:29 +00:00
Bernhard Schuster 2457c26a08 enable disputes for known chains, except for polkadot (#4464)
* enable disputes, for all known chains but polkadot

* chore: fmt

* don't propagate disputes either

* review

* remove disputes feature

* remove superfluous line

* Update node/service/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* fixup

* allow being a dummy

* rialto

* add an enum, to make things work better

* overseer

* fix test

* comments

* move condition out

* excess arg

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-12-17 17:50:48 +00:00
Bernhard Schuster fad55b95fa dispute statements node side limiting (#4541) 2021-12-17 15:50:26 +01:00
Bernhard Schuster a558ee0b57 naming consistency (#4539) 2021-12-16 18:16:41 +01:00
Robert Klotzner bd5721fbf5 Revert loop prevention (#4472)
* Provisioner: Only include and sign bitfields on fresh leaves.
2021-12-13 12:20:49 +01:00
Pierre Krieger 77d70ac0d3 Companion PR for removing Prometheus metrics prefix (#3623)
* Companion PR for removing Prometheus metrics prefix

* Was missing some metrics

* Fix missing renames

* Fix test

* Fixes

* Update test

* Update Substrate

* Second time

* remove prefix from intergration test for zombienet

* update zombienet image

* Update Substrate

Co-authored-by: Bastian Köcher <info@kchr.de>
Co-authored-by: Javier Viola <pepoviola@gmail.com>
2021-12-10 15:24:54 +01:00
Bernhard Schuster 0f1a9fb1eb remove Default from CandidateDescriptor (#4484)
* remove Default from CandidateHash

* Apply suggestions from code review

Co-authored-by: Andronik Ordian <write@reusable.software>

* chore: fmt

* remove backed candidate default

* Partial migration away from CandidateReceipt::default

* Remove more CandidateReceipt defaults

* fmt

* Mostly remove CommittedCandidateReceipt default usage

* Remove CommittedCandidateReceipt

* Remove more Defaults from polakdot primitives v1 + fmt

* Remove more Default from polkadot primites v1

* WIP trying to get overseer example + tests to compile

* feat: add primitives test helpers

* reduce deps of helper

* update primitive helpers

* make candidate validation compile

* fixup cargo lock

* make av-store compile

* fixup disputes coordinator tests

* test: fixup backing

* test: fixup approval voting

* fixup bitfield signing

* test: fixup runtime-api

* test: fixup availability dist

* foxi[ pverseer test]

* remove some Defaults, remove bounds from `dummy`

All `fn dummy` in primitives need to be removed anyways.
This aids in the transition.

* it's a test helper, so always use std

* test: fixup parachains runtime tests

Excluding benches.

* fix keyring

* fix paras runtime properly, no more default

* Remove fn dummy() usage from approval voting

* Move TestCandidateBuilder out of av store to test helpers

* Make candidate validation tests pass

* Make most dispute coirdinator tests pass

* Make provisioner tests work

* Make availability recovery tests work with test helpers

* Update polkadot-collator-protocol tests

* Update statement distribution tests

* Update polkadot overseer examples and tests

* Derive default for validation code so we don't break unrelated things

* Make para runtime test pass (no bench)

* Some more work

* chore: cargo fmt

* cargo fix

* avoid some Default::default

* fixup dispute coordinator test

* remove unused crate deps

* remove Default::default wherever possible, replace by dummy_* for the most part

* chore: cargo fmt

* Remove some warnings

* Remove CommittedCandidateReceipt dummy

* Remove CandidateReceipt dummy

* Remove CandidateDescriptor dummy

* Remove commented out code

* Fix para runtime tests

* chore: nightly

* Some updates to the builder

* Dynamically adjust mock head data size

* Make dispute cooridinator tests work

* Fix test candidate_backing_reorders_votes work

* +nightly-2021-10-29 fmt

* Spelling and remove a default use in builder

* Various clean up

* More small updates

* fmt

* More small updates

* Doc comments for test helpers

* cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=kusama-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras_inherent --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/kusama/src/weights/runtime_parachains_paras_inherent.rs

* cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=polkadot-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras_inherent --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/polkadot/src/weights/runtime_parachains_paras_inherent.rs

* Update lib.rs

* review comments

* fix warnings

* fix test by using correct candidate receipt relay parent

Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: emostov <32168567+emostov@users.noreply.github.com>
Co-authored-by: Parity Bot <admin@parity.io>
Co-authored-by: Gavin Wood <gavin@parity.io>
2021-12-10 12:12:07 +00:00
Lldenaurois 2243121401 Revert "remove provisioner checks (#4254)" (#4375)
* Revert "remove provisioner checks (#4254)"

This reverts commit d5d916a915.

* Remove TODO in implementer's guide
2021-11-30 11:01:55 +00:00
sandreim e4e22f405d Fix Provisioner dispute metrics naming (#4374)
* update dispute metric names

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* update

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
2021-11-26 11:31:22 +01:00
sandreim 66aed0d0a4 fix provisioner metric docs (#4359)
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
2021-11-24 15:12:08 +01:00
sandreim e08b0fb506 Add Provisioner dispute metrics (#4352)
* Metrics for InherentDataProvider

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Integrate metrics

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* more changes

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* fmt

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* fix

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* avoid naming confusion

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Move to Provisioner.

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Add metric documentation

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
2021-11-24 10:35:46 +00:00
Bernhard Schuster d5d916a915 remove provisioner checks (#4254)
* chore/provisioner: move metrics to a separate module

* avoid the duplicate names

* reduce all checks

* fixup tests

* Update node/core/provisioner/src/lib.rs

Co-authored-by: Zeke Mostov <z.mostov@gmail.com>

* chore: fmt

* chore: spellcheck

* doc

* remove the enum anti-pattern

* guide update - remove all the responsibilities

* add another trivial check

* Update node/core/provisioner/src/metrics.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update roadmap/implementers-guide/src/node/utility/provisioner.md

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update node/core/provisioner/src/metrics.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

Co-authored-by: Zeke Mostov <z.mostov@gmail.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
2021-11-19 18:15:59 +00:00
sandreim b0f89bbfbc Per subsystem CPU usage tracking (#4239)
* SubsystemContext: add subsystem name str

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Overseer builder proc macro changes

* initilize SubsystemContext name field.
* Add subsystem name in TaskKind::launch_task()

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Update ToOverseer enum

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Assign subsystem names to orphan tasks

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* cargo fmt

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* SubsystemContext: add subsystem name str

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Overseer builder proc macro changes

* initilize SubsystemContext name field.
* Add subsystem name in TaskKind::launch_task()

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Update ToOverseer enum

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* Assign subsystem names to orphan tasks

Signed-off-by: Andrei Sandu <sandu.andrei@gmail.com>

* cargo fmt

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Rebase changes for new spawn() group param

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Add subsystem constat in JobTrait

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Add subsystem string

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix tests

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix spawn() calls

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* cargo fmt

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix tests

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* fix

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fix more tests

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Address PR review feedback #1

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Address PR review round 2

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Fixes
- remove JobTrait::Subsystem
- fix tests

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* update Cargo.lock

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-11-11 18:53:37 +00:00
Shawn Tabrizi ff5d56fb76 cargo +nightly fmt (#3540)
* cargo +nightly fmt

* add cargo-fmt check to ci

* update ci

* fmt

* fmt

* skip macro

* ignore bridges
2021-08-02 10:47:33 +00:00
Lldenaurois 2d66b8f256 Dispute Coordinator: Batch queries (#3459)
* disputes: Allow batch queries in dispute-coordinator

This commit moves to batch queries when responding to QueryCandidateVotes
messages. This simplifies the code in the provisioner and dispute-coordinator
by no longer requiring to make use of a FuturesOrdered when awaiting multiple
quries. Instead, the provisioner need only request the batch itself.

* node/approval-voting: Address Feedback to fail on query element missing.

* Address feedback

* Fix implementer's guide
2021-07-12 20:06:14 -05:00
Robert Habermeier e512222749 Wire up candidate backing, approval-voting to disputes (#3348)
* add a from_backing_statement to SignedDisputeStatement

* inform dispute coordinator of all backing statements

* add dispute coordinator message to backing tests

* send positive dispute statement with every approval

* issue disputes when encountering invalid candidates.

* try to fix flaky test for CI (passed locally)

* guide: keep track of concluded-positive disputes until pruned

* guide: block implications

* guide: new dispute inherent flow

* mostly implement recency changes for dispute coordinator

* add a clock to dispute coordinator

* adjust DB tests

* fix and add new dispute coordinator tests

* provisioner: select disputes

* import all validators' approvals

* address nit: refactor backing statement submission

* gracefully handle disconnected dispute coordinator

* remove `review` comment

* fix up old_tests

* fix approval-voting compilation

* fix backing compilation

* use known-leaves in WaitForActivation

* follow-up test fixing

* add back allow(dead_code)
2021-07-09 21:15:51 +00:00
Andronik Ordian 7b054b3c77 cleanup stream polls (#3397)
* metered-channel: remove dead code

* we don't need no fuse

* even more
2021-07-02 10:23:26 +02:00
Andronik Ordian cbeb7d0afd tabify tests (#3220)
* tabify tests

* move mod tests; up
2021-06-12 15:39:18 +02:00
Andronik Ordian 29b531f4ec remove tracing::intrument annotations (#3197)
* remove tracing::intrument annotations

* remove unused param and leftover

* more leftovers
2021-06-09 10:35:18 +00:00
Andronik Ordian 2ff5c9b995 tests: use future::join instead of future::select (#2813)
* tests/av-store: use future::join instead of future::select

* tests/backing: use future::join instead of future::select

* tests/provisioner: use future::join instead of future::select

* tests/av-dist: use future::join instead of future::select

* tests/av-recovery: use future::join instead of future::select

* tests/bridge: use future::join instead of future::select

* tests/collator-protocol: use future::join instead of future::select

* tests/stmt-dist: use future::join instead of future::select

* fix tests
2021-04-05 18:30:27 +02:00
Robert Habermeier 0794f69306 Add dispute types and change InclusionInherent to ParasInherent (#2791)
* dispute types

* add Debug to dispute primitives in std and InherentData

* use ParachainsInherentData on node-side

* change inclusion_inherent to paras_inherent

* RuntimeDebug

* add type parameter to PersistedValidationData users

* fix test client

* spaces

* fix collation-generation test

* fix provisioner tests

* remove references to inclusion inherent
2021-04-01 18:23:27 +02:00
Robert Habermeier 8ebbe19d10 Split NetworkBridge and break cycles with Unbounded (#2736)
* overseer: pass messages directly between subsystems

* test that message is held on to

* Update node/overseer/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* give every subsystem an unbounded sender too

* remove metered_channel::name

1. we don't provide good names
2. these names are never used anywhere

* unused mut

* remove unnecessary &mut

* subsystem unbounded_send

* remove unused MaybeTimer

We have channel size metrics that serve the same purpose better now and the implementation of message timing was pretty ugly.

* remove comment

* split up senders and receivers

* update metrics

* fix tests

* fix test subsystem context

* use SubsystemSender in jobs system now

* refactor of awful jobs code

* expose public `run` on JobSubsystem

* update candidate backing to new jobs & use unbounded

* bitfield signing

* candidate-selection

* provisioner

* approval voting: send unbounded for assignment/approvals

* async not needed

* begin bridge split

* split up network tasks into background worker

* port over network bridge

* Update node/network/bridge/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* rename ValidationWorkerNotifications

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-29 01:18:53 +02:00
Bernhard Schuster ea6294fa79 restructure polkadot-node-jaeger (#2642) 2021-03-19 16:51:16 +01:00
Robert Habermeier 94d50afd4e Backing and collator protocol traces including para-id (#2620)
* improve backing/provisioner spans

* span for collation requests

* add para_id to unbacked candidate spans

* differentiate validation-construction and find-assignment in selection

* better find-assignment spans

* organize unbacked-candidate spans directly under job root

* Update node/core/provisioner/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-14 16:51:14 +00:00
Andronik Ordian baa691deb1 prefix parachain log targets with parachain:: (#2600)
* prefix parachain log targets with parachain::

* even more consistent
2021-03-10 17:07:56 +01:00
Robert Klotzner 48409e5548 Request based availability distribution (#2423)
* WIP

* availability distribution, still very wip.

Work on the requesting side of things.

* Some docs on what I intend to do.

* Checkpoint of session cache implementation

as I will likely replace it with something smarter.

* More work, mostly on cache

and getting things to type check.

* Only derive MallocSizeOf and Debug for std.

* availability-distribution: Cache feature complete.

* Sketch out logic in `FetchTask` for actual fetching.

- Compile fixes.
- Cleanup.

* Format cleanup.

* More format fixes.

* Almost feature complete `fetch_task`.

Missing:

- Check for cancel
- Actual querying of peer ids.

* Finish FetchTask so far.

* Directly use AuthorityDiscoveryId in protocol and cache.

* Resolve `AuthorityDiscoveryId` on sending requests.

* Rework fetch_task

- also make it impossible to check the wrong chunk index.
- Export needed function in validator_discovery.

* From<u32> implementation for `ValidatorIndex`.

* Fixes and more integration work.

* Make session cache proper lru cache.

* Use proper lru cache.

* Requester finished.

* ProtocolState -> Requester

Also make sure to not fetch our own chunk.

* Cleanup + fixes.

* Remove unused functions

- FetchTask::is_finished
- SessionCache::fetch_session_info

* availability-distribution responding side.

* Cleanup + Fixes.

* More fixes.

* More fixes.

adder-collator is running!

* Some docs.

* Docs.

* Fix reporting of bad guys.

* Fix tests

* Make all tests compile.

* Fix test.

* Cleanup + get rid of some warnings.

* state -> requester

* Mostly doc fixes.

* Fix test suite.

* Get rid of now redundant message types.

* WIP

* Rob's review remarks.

* Fix test suite.

* core.relay_parent -> leaf for session request.

* Style fix.

* Decrease request timeout.

* Cleanup obsolete errors.

* Metrics + don't fail on non fatal errors.

* requester.rs -> requester/mod.rs

* Panic on invalid BadValidator report.

* Fix indentation.

* Use typed default timeout constant.

* Make channel size 0, as each sender gets one slot anyways.

* Fix incorrect metrics initialization.

* Fix build after merge.

* More fixes.

* Hopefully valid metrics names.

* Better metrics names.

* Some tests that already work.

* Slightly better docs.

* Some more tests.

* Fix network bridge test.
2021-02-26 11:58:07 -06:00
Robert Habermeier 3300b53306 Approval Checking Improvements Omnibus (#2480)
* add tracing to approval voting

* notify if session info is not working

* add dispute period to chain specs

* propagate genesis session to parachains runtime

* use `on_genesis_session`

* protect against zero cores in computation

* tweak voting rule to be based off of best and add logs

* genesis configuration should use VRF slots only

* swallow more keystore errors

* add some docs

* make validation-worker args non-optional and update clap

* better tracing for bitfield signing and provisioner

* pass amount of bits in bitfields to inclusion instead of recomputing

* debug -> warn for some logs

* better tracing for availability recovery

* a little av-store tracing

* bridge: forward availability recovery messages

* add missing try_from impl

* some more tracing

* improve approval distribution tracing

* guide: hold onto pending approval messages until NewBlocks

* Hold onto pending approval messages until NewBlocks

* guide: adjust comment

* process all actions for one wakeup at a time

* vec

* fix network bridge test

* replace randomness-collective-flip with Babe

* remove PairNotFound
2021-02-23 14:12:28 -06:00
Bastian Köcher 2584c121fb Substrate companion for #8163 (#2492)
* Substrate companion for #8163

https://github.com/paritytech/substrate/pull/8163

* "Update Substrate"

Co-authored-by: parity-processbot <>
2021-02-22 14:52:43 +00:00
Bernhard Schuster 49c6aa9a76 feat/jaeger: more spans, more stages (#2477)
* feat/jaeger: more spans, more stages

Stage numbers are still arbitrarily picked.

* feat/jaeger: additional spans

* chore/spellcheck: improve the dictionary

* fix/jaeger JaegerSpan -> jaeger::Span
2021-02-19 14:19:43 +00:00
Robert Habermeier 59e2a810bb remove unused RequestBlockAuthorshipData (#2455) 2021-02-17 11:54:51 -06:00
Robert Habermeier 9e525e296d Allow at most one candidate with a code upgrade in a block (#2456)
* guide: max one candidate with upgrade per block

* max one candidate with upgrade per block

* ignore on runtime level too
2021-02-17 10:42:06 -06:00
Guillaume Thiolliere 29f12f3f48 Upgrade codec to 2.0 and bitvec to 0.20 (companion) (#2343)
* upgrade codec and bitvec

* "Update Substrate"

Co-authored-by: parity-processbot <>
2021-01-29 14:35:45 +01:00
Bastian Köcher 5be092894e Add one Jaeger span per relay parent (#2196)
* Add one Jaeger span per relay parent

This adds one Jaeger span per relay parent, instead of always creating
new spans per relay parent. This should improve the UI view, because
subsystems are now grouped below one common span.

* Fix doc tests

* Replace `PerLeaveSpan` to `PerLeafSpan`

* More renaming

* Moare

* Update node/subsystem/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Skip the spans

* Increase `spec_version`

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-01-05 15:09:25 +01:00
Robert Habermeier 38276b08a4 Add candidate info to OccupiedCore (#2134)
* guide: add candidate information to OccupiedCore

* add descriptor and hash to occupied core type

* guide: add candidate hash to inclusion

* runtime: return candidate info in core state

* bitfield signing: stop querying runtime as much

* minimize going to runtime in availability distribution

* fix availability distribution tests

* guide: remove para ID from Occupied core

* get all crates compiling
2020-12-18 11:20:37 +03:00
Robert Habermeier fa7cced58d adjust span names (#2135)
* adjust span names

* fix compile
2020-12-17 16:11:37 -05:00
Bernhard Schuster a5fe710cc6 initial jaeger integration (#2047)
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2020-12-11 17:38:55 +01:00
Bastian Köcher d7047578e9 Fix tests on master (#2080)
Because of a bug in the test script, we didn't stopped CI when the main
tests are failed.
2020-12-07 14:47:39 +00:00
Peter Goodspeed-Niklaus e7e9605f87 do not store backed candidates in the provisioner (#1909)
* guide: non-semantic changes

* guide: update per the issue description

* GetBackedCandidates operates on multiple hashes now

* GetBackedCandidates still needs a relay parent

* implement changes specified in guide

* distinguish between various occasions for canceled oneshots

* add tracing info to getbackedcandidates

* REVERT ME: add tracing messages for GetBackedCandidates

Note that these messages are only sometimes actually passed on to the
candidate backing subsystem, with the consequence that it is
unexpectedly frequent that the provisioner fails to create its
provisionable data.

* REVERT ME: more tracing logging

* REVERT ME: log when CandidateBackingJob receives any message at all

* REVERT ME: log when send_msg sends a message to a job

* fix candidate-backing tests

* streamline GetBackedCandidates

This uses table.attested_candidate instead of table.get_candidate, because
it's not obvious how to get a BackedCandidate from just a
CommittedCandidateReceipt.

* REVERT ME: more logging tracing job lifespans

* promote warning about job premature demise

* don't terminate CandiateBackingJob::run_loop in event of failure to process message

* Revert "REVERT ME: more logging tracing job lifespans"

This reverts commit 7365f2fb3dec988d95cfcd317eba75587fe7fd16.

* Revert "REVERT ME: log when send_msg sends a message to a job"

This reverts commit 58e46aad038e6517d6d56390c8be65b046a21884.

* Revert "REVERT ME: log when CandidateBackingJob receives any message at all"

This reverts commit 0d6f38413c7c66b5e9e81dabc587906fa9f82656.

* Revert "REVERT ME: more tracing logging"

This reverts commit 675fd2628e84d1596965280e7314155ef21b28e6.

* Revert "REVERT ME: add tracing messages for GetBackedCandidates"

This reverts commit e09e156493430b33b6c8ab4b5cedb3f2f91afd51.

* formatting

* add logging message to CandidateBackingJob::run_loop start

* REVERT ME: add tracing to candidate-backing job creation

* run candidatebacking loop even if no assignment

* use unique error variants for each canceled oneshot

* Revert "REVERT ME: add tracing to candidate-backing job creation"

This reverts commit 8ce5f4f0bd7186dade134b118751480f72ea1fd6.

* try_runtime_api more to reduce silent exits

* add sanity check that returned backed candidates preserve ordering

* remove redundant err attribute
2020-12-04 11:24:59 +01:00
Robert Habermeier 414acdfc54 small improvements for parachains consensus (#2040)
* introduce a waiting period before selecting candidates and bitfields

* add network_bridge=debug tracing for rep

* change to 2.5s timeout in proposer

* pass timeout to proposer

* move timeout back to provisioner

* grumbles

* Update node/core/provisioner/src/lib.rs

* Fix nitpicks

* Fix bug

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Bastian Köcher <git@kchr.de>
2020-11-30 22:52:30 +01:00
Bastian Köcher 536dceb4f6 Simplify subsystem jobs (#2037)
* Simplify subsystem jobs

This pr simplifies the subsystem jobs interface. Instead of requiring an
extra message that is used to signal that a job should be ended, a job
now ends when the receiver returns `None`. Besides that it changes the
interface to enforce that messages to a job provide a relay parent.

* Drop ToJobTrait

* Remove FromJob

We always convert this message to FromJobCommand anyway.
2020-11-30 17:01:26 +01:00
Robert Habermeier d6307a4978 allow jobs to spawn sub-tasks (#2030)
* allow jobs to spawn sub-tasks

* fix fallout in subsytems
2020-11-28 15:12:43 -05:00
Peter Goodspeed-Niklaus 0a5bc82529 Add Prometheus timers to the subsystems (#1923)
* reexport prometheus-super for ease of use of other subsystems

* add some prometheus timers for collation generation subsystem

* add timing metrics to av-store

* add metrics to candidate backing

* add timing metric to bitfield signing

* add timing metrics to candidate selection

* add timing metrics to candidate-validation

* add timing metrics to chain-api

* add timing metrics to provisioner

* add timing metrics to runtime-api

* add timing metrics to availability-distribution

* add timing metrics to bitfield-distribution

* add timing metrics to collator protocol: collator side

* add timing metrics to collator protocol: validator side

* fix candidate validation test failures

* add timing metrics to pov distribution

* add timing metrics to statement-distribution

* use substrate_prometheus_endpoint prometheus reexport instead of prometheus_super

* don't include JOB_DELAY in bitfield-signing metrics

* give adder-collator ability to easily export its genesis-state and validation code

* wip: adder-collator pushbutton script

* don't attempt to register the adder-collator automatically

Instead, get these values with

```sh
target/release/adder-collator export-genesis-state
target/release/adder-collator export-genesis-wasm
```

And then register the parachain on https://polkadot.js.org/apps/?rpc=ws%3A%2F%2F127.0.0.1%3A9944#/explorer

To collect prometheus data, after running the script, create `prometheus.yml` per the instructions
at https://www.notion.so/paritytechnologies/Setting-up-Prometheus-locally-835cb3a9df7541a781c381006252b5ff
and then run:

```sh
docker run -v `pwd`/prometheus.yml:/etc/prometheus/prometheus.yml:z --network host prom/prometheus
```

Demonstrates that data makes it across to prometheus, though it is likely to be useful in the future
to tweak the buckets.

* Update parachain/test-parachains/adder/collator/src/cli.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* use the grandpa-pause parameter

* skip metrics in tracing instrumentation

* remove unnecessary grandpa_pause cli param

Co-authored-by: Andronik Ordian <write@reusable.software>
2020-11-20 15:04:51 +01:00
Peter Goodspeed-Niklaus e49989971d Add tracing support to node (#1940)
* drop in tracing to replace log

* add structured logging to trace messages

* add structured logging to debug messages

* add structured logging to info messages

* add structured logging to warn messages

* add structured logging to error messages

* normalize spacing and Display vs Debug

* add instrumentation to the various 'fn run'

* use explicit tracing module throughout

* fix availability distribution test

* don't double-print errors

* remove further redundancy from logs

* fix test errors

* fix more test errors

* remove unused kv_log_macro

* fix unused variable

* add tracing spans to collation generation

* add tracing spans to av-store

* add tracing spans to backing

* add tracing spans to bitfield-signing

* add tracing spans to candidate-selection

* add tracing spans to candidate-validation

* add tracing spans to chain-api

* add tracing spans to provisioner

* add tracing spans to runtime-api

* add tracing spans to availability-distribution

* add tracing spans to bitfield-distribution

* add tracing spans to network-bridge

* add tracing spans to collator-protocol

* add tracing spans to pov-distribution

* add tracing spans to statement-distribution

* add tracing spans to overseer

* cleanup
2020-11-20 12:02:04 +01:00
Andronik Ordian 2cde7732da more resilient subsystems (#1908)
* backing: extract log target

* bitfield-signing: extract log target

* utils: fix a typo

* provisioner: extract log target

* candidate selection: remove unused error variant

* bitfield-distribution: change the return type of run

* pov-distribution: extract log target

* collator-protocol: simplify runtime request

* collation-generation: do not exit early on error

* collation-generation: do not exit on double init

* collator-protocol: do not exit on errors and rename LOG_TARGET

* collator-protocol: a workaround for ununused imports warning

* Update node/network/bitfield-distribution/src/lib.rs

* collation-generation: elevate warn! to error!

* collator-protocol: fix imports

* post merge fix

* fix compilation
2020-11-05 14:22:41 +01:00
Robert Habermeier 7c84e38809 fix bitfield selection (#1913) 2020-11-04 12:12:19 -05:00
Bastian Köcher 04da99de58 Moare fixes for parachains (#1911)
* Moare fixes for parachains

- Sending data to a job should always contain a relay parent. Done this
for the provisioner
- Fixed the `select_availability_bitfields` function. It was assuming we
have one core per validator, while we only have one core per parachain.
- Drive by async "rewrite" in proposer

* Make tests compile

* Update primitives/src/v1.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-11-03 18:16:47 +01:00
Robert Habermeier 1f48725c54 Backing fixes (#1897)
* Commit my changes

* some backing fixes

* indentation

* fix backing tests

* tweak includability rules

* comment

* Update node/core/backing/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/core/backing/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/core/backing/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/core/backing/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2020-11-02 18:37:50 +00:00
Peter Goodspeed-Niklaus 1a25c41277 start working on building the real overseer (#1795)
* start working on building the real overseer

Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.

* fill in AllSubsystems internal constructors

* replace fn make_metrics with Metrics::attempt_to_register

* update to account for #1740

* remove Metrics::register, rename Metrics::attempt_to_register

* add 'static bounds to real_overseer type params

* pass authority_discovery and network_service to real_overseer

It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.

* select a proper database configuration for the availability store db

* use subdirectory for av-store database path

* apply Basti's patch which avoids needing to parameterize everything on Block

* simplify path extraction

* get all tests to compile

* Fix Prometheus double-registry error

for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:

```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
	eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
	err
}),
```

That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!

* authorities must have authority discovery, but not necessarily overseer handlers

* fix broken SpawnedSubsystem impls

detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.

This is not the final fix needed to get things working properly,
but it's a good start.

* use prometheus properly

Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.

We were using them wrong, though. This pattern was repeated in a
variety of places in the code:

```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```

The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.

This commit fixes that.

* get av-store subsystem to actually run properly and not die on first signal

* typo fix: incomming -> incoming

* don't disable authority discovery in test nodes

* Fix rococo-v1 missing session keys

* Update node/core/av-store/Cargo.toml

* try dummying out av-store on non-full-nodes

* overseer and subsystems are required only for full nodes

* Reduce the amount of warnings on browser target

* Fix two more warnings

* InclusionInherent should actually have an Inherent module on rococo

* Ancestry: don't return genesis' parent hash

* Update Cargo.lock

* fix broken test

* update test script: specify chainspec as script argument

* Apply suggestions from code review

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/service/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* node/service/src/lib: Return error via ? operator

* post-merge blues

* add is_collator flag

* prevent occasional av-store test panic

* simplify fix; expand application

* run authority_discovery in Role::Discover when collating

* distinguish between proposer closed channel errors

* add IsCollator enum, remove is_collator CLI flag

* improve formatting

* remove nop loop

* Fix some stuff

Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
2020-10-28 10:26:50 +00:00
Bernhard Schuster f345123748 introduce errors with info (#1834) 2020-10-27 08:10:03 +01:00
Rakan Alhneiti bd75a4ce18 Update to work with async keystore – Companion PR for #7000 (#1740)
* Fix keystore types

* Use SyncCryptoStorePtr

* Borrow keystore

* Fix unused imports

* Fix polkadot service

* Fix bitfield-distribution tests

* Fix indentation

* Fix backing tests

* Fix tests

* Fix provisioner tests

* Removed SyncCryptoStorePtr

* Fix services

* Address PR feedback

* Address PR feedback - 2

* Update CryptoStorePtr imports to be from sp_keystore

* Typo

* Fix CryptoStore import

* Document the reason behind using filesystem keystore

* Remove VALIDATORS

* Fix duplicate dependency

* Mark sp-keystore as optional

* Fix availability distribution

* Fix call to sign_with

* Fix keystore usage

* Remove tokio and fix parachains Cargo config

* Typos

* Fix keystore dereferencing

* Fix CryptoStore import

* Fix provisioner

* Fix node backing

* Update services

* Cleanup dependencies

* Use sync_keystore

* Fix node service

* Fix node service - 2

* Fix node service - 3

* Rename CryptoStorePtr to SyncCryptoStorePtr

* "Update Substrate"

* Apply suggestions from code review

* Update node/core/backing/Cargo.toml

* Update primitives/src/v0.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Fix wasm build

* Update Cargo.lock

Co-authored-by: parity-processbot <>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2020-10-09 10:54:03 +00:00