Commit Graph

70 Commits

Author SHA1 Message Date
asynchronous rob a7780e0797 refactor grid topology to expose more info to subsystems (#6140)
* refactor grid topology to expose more info to subsystems

* fix grid_topology test

* fix overseer test

* Update node/network/protocol/src/grid_topology.rs

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>

* Update node/network/protocol/src/grid_topology.rs

Co-authored-by: Andronik <write@reusable.software>

* Update node/network/protocol/src/grid_topology.rs

Co-authored-by: Andronik <write@reusable.software>

* fix bug in populating topology

* fmt

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>
Co-authored-by: Andronik <write@reusable.software>
2022-10-12 23:30:12 +00:00
Dmitry Markin 13ea167bd7 Change validation & collation protocol names to include genesis hash & fork id (#5876) 2022-08-30 19:50:22 +03:00
Bernhard Schuster 3240cb5e4d split NetworkBridge into two subsystems (#5616)
* foo

* rolling session window

* fixup

* remove use statemetn

* fmt

* split NetworkBridge into two subsystems

Pending cleanup

* split

* chore: reexport OrchestraError as OverseerError

* chore: silence warnings

* fixup tests

* chore: add default timenout of 30s to subsystem test helper ctx handle

* single item channel

* fixins

* fmt

* cleanup

* remove dead code

* remove sync bounds again

* wire up shared state

* deal with some FIXMEs

* use distinct tags

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

* use tag

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

* address naming

tx and rx are common in networking and also have an implicit meaning regarding networking
compared to incoming and outgoing which are already used with subsystems themselvesq

* remove unused sync oracle

* remove unneeded state

* fix tests

* chore: fmt

* do not try to register twice

* leak Metrics type

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
Co-authored-by: Andronik <write@reusable.software>
2022-07-12 16:22:36 +00:00
Bernhard Schuster 450ca2baca overseer becomes orchestra (#5542)
* rename overseer-gen to orchestra

Also drop `gum` and use `tracing`.

* make orchestra compile as standalone

* introduce Spawner trait to split from sp_core

Finalizes the independence of orchestra from polkadot-overseer

* slip of the pen

* other fixins

* remove unused import

* Update node/overseer/orchestra/proc-macro/src/impl_builder.rs

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>

* Update node/overseer/orchestra/proc-macro/src/impl_builder.rs

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>

* orchestra everywhere

* leaky data

* Bump scale-info from 2.1.1 to 2.1.2 (#5552)

Bumps [scale-info](https://github.com/paritytech/scale-info) from 2.1.1 to 2.1.2.
- [Release notes](https://github.com/paritytech/scale-info/releases)
- [Changelog](https://github.com/paritytech/scale-info/blob/master/CHANGELOG.md)
- [Commits](https://github.com/paritytech/scale-info/compare/v2.1.1...v2.1.2)

---
updated-dependencies:
- dependency-name: scale-info
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add missing markdown code block delimiter (#5555)

* bitfield-signing: remove util::jobs usage  (#5523)

* Switch to pooling copy-on-write instantiation strategy for WASM (companion for Substrate#11232) (#5337)

* Switch to pooling copy-on-write instantiation strategy for WASM

* Fix compilation of `polkadot-test-service`

* Update comments

* Move `max_memory_size` to `Semantics`

* Rename `WasmInstantiationStrategy` to `WasmtimeInstantiationStrategy`

* Update a safety comment

* update lockfile for {"substrate"}

Co-authored-by: parity-processbot <>

* Fix build

Co-authored-by: Vsevolod Stakhov <vsevolod.stakhov@parity.io>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Malte Kliemann <mail@maltekliemann.com>
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
Co-authored-by: Koute <koute@users.noreply.github.com>
2022-05-19 13:42:02 +01:00
Vsevolod Stakhov 195b901cc9 Implement grid topology routing for the statement distribution subsystem (#5476)
* Move NewGossipTopology -> SessionGridTopology outside as this implementation is shared

* Add method to return peers difference between topologies

* Implement basic grid topology usage for the bitfield distribution

* Fix tests

* Oops, fix tests

* Add some tests for random routing

* Add a unit test for topology distribution

* Store the current and the previous topology to match sessions boundaries

* Update tests

* Update node/network/bitfield-distribution/src/lib.rs

Co-authored-by: Andronik <write@reusable.software>

* Update node/network/protocol/src/grid_topology.rs

Co-authored-by: Andronik <write@reusable.software>

* Update node/network/bitfield-distribution/src/lib.rs

Co-authored-by: Andronik <write@reusable.software>

* Add some debug

* Fix tests as HashSet order is undefined

* Move session bounded topology to the common code part

* Fix tests

* Allow to select routing by peer index

* Implement grid topology in the statement distribution subsystem

* Fix tests compilation

* Fix test

* Refactor API slightly

* Address review comments

* Reduce runtime error logging severity

* Update node/network/protocol/src/grid_topology.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Update node/network/bitfield-distribution/src/tests.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Fmt run

* Use named struct

* Fix logging stuff

* One more accidental fmt damage

* Increase active queue size and add metrics

Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>

* Revert "Increase active queue size and add metrics"

This reverts commit c4f48e8bded6dfeb9c62814ba2f8d815c34b04cf.

* Use validator index to choose the routing strategy

Noted by: @rphmeier

* Fix test after distribution logic fix

Co-authored-by: Andronik <write@reusable.software>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Andrei Sandu <andrei-mihail@parity.io>
2022-05-18 12:27:06 +00:00
Bernhard Schuster 511891dcce refactor+feat: allow subsystems to send only declared messages, generate graphviz (#5314)
Closes #3774
Closes #3826
2022-05-12 17:39:05 +02:00
Bernhard Schuster d437a33e0b polkadot-node-subsystem package rename mish mash cleanup (#5502)
* unify to polkadot-node-subsystem{,-test-helpers}

* chore: fmt
2022-05-11 15:32:38 +00:00
Vsevolod Stakhov 673a32d968 Use grid topology for bitfileds distribution messages (#5389)
* Move NewGossipTopology -> SessionGridTopology outside as this implementation is shared

* Add method to return peers difference between topologies

* Implement basic grid topology usage for the bitfield distribution

* Fix tests

* Oops, fix tests

* Add some tests for random routing

* Add a unit test for topology distribution

* Store the current and the previous topology to match sessions boundaries

* Update tests

* Update node/network/bitfield-distribution/src/lib.rs

Co-authored-by: Andronik <write@reusable.software>

* Update node/network/protocol/src/grid_topology.rs

Co-authored-by: Andronik <write@reusable.software>

* Update node/network/bitfield-distribution/src/lib.rs

Co-authored-by: Andronik <write@reusable.software>

* Add some debug

* Fix tests as HashSet order is undefined

Co-authored-by: Andronik <write@reusable.software>
2022-05-06 12:24:11 +00:00
asynchronous rob fc4b04db20 Prepare for network protocol version upgrades (#5084)
* explicitly tag network requests with version

* fmt

* make PeerSet more aware of versioning

* some generalization of the network bridge to support upgrades

* walk back some renaming

* walk back some version stuff

* extract version from fallback

* remove V1 from NetworkBridgeUpdate

* add accidentally-removed timer

* implement focusing for versioned messages

* fmt

* fix up network bridge & tests

* remove inaccurate version check in bridge

* remove some TODO [now]s

* fix fallout in statement distribution

* fmt

* fallout in gossip-support

* fix fallout in collator-protocol

* fix fallout in bitfield-distribution

* fix fallout in approval-distribution

* fmt

* use never!

* fmt
2022-04-21 16:34:59 +00:00
asynchronous rob 79ecc53801 Reduce network bandwidth, improve parablock times: optimize approval-distribution (#5164)
* gossip-support: be explicit about dimensions

* some guide updates

* update network-bridge to distinguish x and y dimensions

* get everything to compile

* beginnings

* some TODOs

* polkadot runtime: use relevant_authorities

* make gossip topologies per-session

* better formatting

* gossip support: use current session validators

* expand in comment

* adjust tests and fix index bug

* add past/present/future connection test and clean up code

* fmt

* network bridge: updated types

* update protocols to new gossip topology message

* guide updates

* add session to BlockApprovalMeta

* add session to block info

* refactor knowledge and remove most unify logic

* start replacing gossip_peers with new SessionTopologies

* add routing information to message state

* add some utilities to SessionTopology

* implement new gossip topology logic

* re-implement unify_with_peer

* distribute assignments according to topology

* finish grid topology implementation

* refactor network bridge slightly

* issue connection requests on all past/present/future

* fmt

* address grumbles

* tighten invariants in unify_with_peer

* implement random propagation

* refactor: extract required routing adjustment logic

* some block-age logic

* aggressively propagate messages when finality is slow

* overhaul aggression system to have 3 levels

* add aggression metrics

* remove aggression L3

* reduce random circulation

* remove PeerData

* get approval tests compiling

* use btree_map in known_by to make deterministic

* Revert "use btree_map in known_by to make deterministic"

This reverts commit 330d65343a7bb6fe4dd0f24bd8dbc15c0cbdbd9d.

* test XY grid propagation

* remove stray println

* test unshared dimension propagation

* add random gossip check

* test unify_with_peer better

* test sending after getting gossip topology

* test L1 aggression on originator

* test L1 aggression for non-originators

* test non-originator aggression L2

* fnt

* ~spellcheck

* fix statement-distribution tests

* fix flaky test

* fix metrics typo

* re-send periodically

* test resending

* typo

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* add more metrics about apd messages

* add back unify_with_peer logs

* make Resend an enum

* be more explicit when resending

* fmt

* fix error

* add a TODO for refactoring

* remove debug metrics

* add some guide stuff

* fmt

* update runtime API in test-runtim

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
2022-04-19 20:26:55 +02:00
Vsevolod Stakhov af94fc9534 Try to fix out of view statements (#5177)
This issue happens when some peer sends a good but already known Seconded statement and the statement-distribution code does not update the statements_received field in the peer_knowledge structure. Subsequently, a Valid statement causes out-of-view message that is incorrectly emitted and causes reputation lose.

This PR also introduces a concept of passing the specific pseudo-random generator to subsystems to make it easier to write deterministic tests. This functionality is not really necessary for the specific issue and unit test but it can be useful for other tests and subsystems.
2022-03-24 20:18:43 +00:00
Bernhard Schuster 6e3b5f1888 bitfield dist logging cleanup and enhancements (#5147)
* split metrics from bitfield signing

* cleanup all logging

* add a unit test for subset generation

* chore: add one more test to assert need is properly represented

* u8 as usize

* chore: overseer fixin

* fix test

* Update node/network/bitfield-distribution/src/metrics.rs

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

* Update node/network/bitfield-distribution/src/metrics.rs

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

* fallout from suggested rename

* consistency

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
2022-03-18 13:44:38 +00:00
Bernhard Schuster d631f1dea8 observability: tracing gum, automatically cross ref traceID (#5079)
* add some gum

* bump expander

* gum

* fix all remaining issues

* last fixup

* Update node/gum/proc-macro/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* change

* netowrk

* fixins

* chore

* allow optional fmt str + args, prep for expr as kv field

* tracing -> gum rename fallout

* restrict further

* allow multiple levels of field accesses

* another round of docs and a slip of the pen

* update ADR

* fixup lock fiel

* use target: instead of target=

* minors

* fix

* chore

* Update node/gum/README.md

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
2022-03-15 11:05:16 +00:00
Robert Habermeier 49f7e5cce4 Finish migration to v2 primitives (#5037)
* remove v0 primitives from polkadot-primitives

* first pass: remove v0

* fix fallout in erasure-coding

* remove v1 primitives, consolidate to v2

* the great import update

* update runtime_api_impl_v1 to v2 as well

* guide: add `Version` request for runtime API

* add version query to runtime API

* reintroduce OldV1SessionInfo in a limited way
2022-03-09 14:01:13 -06:00
wigy e8cb6cdaac Companion to "Updating scale to v3" (#4958)
* Updating dependencies

* Adapting code to scale v3

* Upgrade bitvec to 1.0.0

* Fix bitvec arithmetics

* Update Cargo.lock

* Update sp-io

* Fixing the build

* Yanked scale-info 2.0.0

Co-authored-by: Bastian Köcher <info@kchr.de>
2022-02-25 13:07:06 +01:00
Pierre Krieger 77d70ac0d3 Companion PR for removing Prometheus metrics prefix (#3623)
* Companion PR for removing Prometheus metrics prefix

* Was missing some metrics

* Fix missing renames

* Fix test

* Fixes

* Update test

* Update Substrate

* Second time

* remove prefix from intergration test for zombienet

* update zombienet image

* Update Substrate

Co-authored-by: Bastian Köcher <info@kchr.de>
Co-authored-by: Javier Viola <pepoviola@gmail.com>
2021-12-10 15:24:54 +01:00
Sergei Shulepov 68c03f66f3 Mass replace ,); pattern (#3580)
This is an artifact left by rustfmt which is not dare to remove the
comma being conservative.
2021-08-05 19:53:17 +02:00
Shawn Tabrizi ff5d56fb76 cargo +nightly fmt (#3540)
* cargo +nightly fmt

* add cargo-fmt check to ci

* update ci

* fmt

* fmt

* skip macro

* ignore bridges
2021-08-02 10:47:33 +00:00
Bernhard Schuster 3c9104daff refactor overseer into proc-macro based pattern (#2962) 2021-07-08 21:09:26 +02:00
Andronik Ordian ad9c02886d improved gossip topology (#3270)
* gossip-support: gossip topology

* some fixes

* handle view update for newly added gossip peers

* fix neighbors calculation

* fix test

* resolve TODOs

* typo

* guide updates

* spaces in the guide

* sneaky spaces

* hash randomness

* address some review nits

* use unbounded in bridge for subsystem msg
2021-06-18 14:30:35 -05:00
Andronik Ordian 325cc888b1 cleanup more tests and spaces (#3288)
* cleanup more tests and spaces

* oops
2021-06-17 17:28:10 +00:00
Andronik Ordian 29b531f4ec remove tracing::intrument annotations (#3197)
* remove tracing::intrument annotations

* remove unused param and leftover

* more leftovers
2021-06-09 10:35:18 +00:00
Bernhard Schuster e8652e73db cargo spellcheck (#3067) 2021-05-22 00:15:47 +00:00
Robert Klotzner 0dbdfef95e More secure Signed implementation (#2963)
* Remove signature verification in backing.

`SignedFullStatement` now signals that the signature has already been
checked.

* Remove unused check_payload function.

* Introduced unchecked signed variants.

* Fix inclusion to use unchecked variant.

* More unchecked variants.

* Use unchecked variants in protocols.

* Start fixing statement-distribution.

* Fixup statement distribution.

* Fix inclusion.

* Fix warning.

* Fix backing properly.

* Fix bitfield distribution.

* Make crypto store optional for `RuntimeInfo`.

* Factor out utility functions.

* get_group_rotation_info

* WIP: Collator cleanup + check signatures.

* Convenience signature checking functions.

* Check signature on collator-side.

* Fix warnings.

* Fix collator side tests.

* Get rid of warnings.

* Better Signed/UncheckedSigned implementation.

Also get rid of Encode/Decode for Signed! *party*

* Get rid of dead code.

* Move Signed in its own module.

* into_checked -> try_into_checked

* Fix merge.
2021-05-03 21:41:14 +02:00
Robert Klotzner dacde443f7 Infrastructure improvements (#2897)
* Factor out runtime module into utils.

* Add maybe_authority information to `PeerConnected` event.

We already gather this information in authority discovery, so we might
as well share it with others.

This opens up an easy path to trigger validators differently from normal
nodes, e.g. for prioritization. This change has become more important
now, that we just connect to all validators and therefore just have a
long peer list without any information about those nodes.

* Test fix.
2021-04-16 21:42:20 +02:00
Andronik Ordian e62939ad7a distribution: handle sqrt peer view updates (#2871)
* distribution: handle sqrt peer view updates

* someone please put rustc into my brain

* guide updates
2021-04-10 00:45:25 +02:00
Andronik Ordian 8961628e7c bitfield-dist: check signatures once (#2868) 2021-04-09 17:31:34 +00:00
Andronik Ordian 4df29e71ab bitfield-dist: fix state update on gossip (#2817)
* bitfield-dist: fix state update on gossip

* fixes

* doc fixes

* oops

* 2 lines of code change
2021-04-04 22:25:40 +00:00
Andronik Ordian 9ac35d9f2b gossip: choose a random subset on send instead of limiting connections (#2776)
* gossip: choose random subset on send

* naming bikeshed
2021-03-30 20:59:53 +02:00
Robert Habermeier 064df81ee4 Add block number to activated leaves and associated fixes (#2718)
* add number to `ActivatedLeavesUpdate`

* update subsystem util and overseer

* use new ActivatedLeaf everywhere

* sort view

* sorted and limited view in network bridge

* use live block hash only if it's newer

* grumples
2021-03-26 13:06:40 +01:00
Arkadiy Paronyan 5929d1ef15 Additional logging for polkadot network protocols (#2684)
* Additional logging for polkadot network protocols

* Additional log

* Update node/network/bitfield-distribution/src/lib.rs

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>

* Update node/network/availability-distribution/src/responder.rs

* Added additional chunk info

* Added additional peer info

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
2021-03-24 11:55:50 +00:00
Bernhard Schuster ea6294fa79 restructure polkadot-node-jaeger (#2642) 2021-03-19 16:51:16 +01:00
Andronik Ordian baa691deb1 prefix parachain log targets with parachain:: (#2600)
* prefix parachain log targets with parachain::

* even more consistent
2021-03-10 17:07:56 +01:00
Robert Klotzner 48409e5548 Request based availability distribution (#2423)
* WIP

* availability distribution, still very wip.

Work on the requesting side of things.

* Some docs on what I intend to do.

* Checkpoint of session cache implementation

as I will likely replace it with something smarter.

* More work, mostly on cache

and getting things to type check.

* Only derive MallocSizeOf and Debug for std.

* availability-distribution: Cache feature complete.

* Sketch out logic in `FetchTask` for actual fetching.

- Compile fixes.
- Cleanup.

* Format cleanup.

* More format fixes.

* Almost feature complete `fetch_task`.

Missing:

- Check for cancel
- Actual querying of peer ids.

* Finish FetchTask so far.

* Directly use AuthorityDiscoveryId in protocol and cache.

* Resolve `AuthorityDiscoveryId` on sending requests.

* Rework fetch_task

- also make it impossible to check the wrong chunk index.
- Export needed function in validator_discovery.

* From<u32> implementation for `ValidatorIndex`.

* Fixes and more integration work.

* Make session cache proper lru cache.

* Use proper lru cache.

* Requester finished.

* ProtocolState -> Requester

Also make sure to not fetch our own chunk.

* Cleanup + fixes.

* Remove unused functions

- FetchTask::is_finished
- SessionCache::fetch_session_info

* availability-distribution responding side.

* Cleanup + Fixes.

* More fixes.

* More fixes.

adder-collator is running!

* Some docs.

* Docs.

* Fix reporting of bad guys.

* Fix tests

* Make all tests compile.

* Fix test.

* Cleanup + get rid of some warnings.

* state -> requester

* Mostly doc fixes.

* Fix test suite.

* Get rid of now redundant message types.

* WIP

* Rob's review remarks.

* Fix test suite.

* core.relay_parent -> leaf for session request.

* Style fix.

* Decrease request timeout.

* Cleanup obsolete errors.

* Metrics + don't fail on non fatal errors.

* requester.rs -> requester/mod.rs

* Panic on invalid BadValidator report.

* Fix indentation.

* Use typed default timeout constant.

* Make channel size 0, as each sender gets one slot anyways.

* Fix incorrect metrics initialization.

* Fix build after merge.

* More fixes.

* Hopefully valid metrics names.

* Better metrics names.

* Some tests that already work.

* Slightly better docs.

* Some more tests.

* Fix network bridge test.
2021-02-26 11:58:07 -06:00
Bernhard Schuster e3f776abed feat/view: assure heads in a view are sorted (#2493)
* feat/view: assure heads in a view are sorted

Allows O(n) comparisons, adds an alternate equiv relation
which takes O(n^2) for integrity verification.

Ref #2133

* revert: remove custom PartialEq impl, there are no duplicates

* fix: do not sort the live_heads, that alters the local view

* refactor/view: heads should not be public

* chore/spellcheck: add unfinalized

* fix/view: add missing len() and is_empty() fns

* quirk

* vec is not view

* Update node/network/approval-distribution/src/tests.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update node/network/bridge/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update node/network/protocol/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* fixup comment

* fix botched test

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-02-23 15:39:57 +00:00
Bastian Köcher 2584c121fb Substrate companion for #8163 (#2492)
* Substrate companion for #8163

https://github.com/paritytech/substrate/pull/8163

* "Update Substrate"

Co-authored-by: parity-processbot <>
2021-02-22 14:52:43 +00:00
Bernhard Schuster 49c6aa9a76 feat/jaeger: more spans, more stages (#2477)
* feat/jaeger: more spans, more stages

Stage numbers are still arbitrarily picked.

* feat/jaeger: additional spans

* chore/spellcheck: improve the dictionary

* fix/jaeger JaegerSpan -> jaeger::Span
2021-02-19 14:19:43 +00:00
Bernhard Schuster 85489ceb36 [jaeger] unify all used tags, introduce builder pattern, additional… (#2473)
* feat/jaeger: unify all used tags, introduce builder pattern, additional candidate annotations

* chores

* fixes, incomplete fn rename

* another fix

* more fixes

* silly doctests
2021-02-18 15:07:17 +01:00
Bernhard Schuster 1e2161258b refactor/reputation: unify the values used (#2462)
* refactor/reputation: unify the values used

* chore/rep: rename Annoy* to Cost*, make duplicate message Cost*Repeated

* fix/reputation: lost and found, convert at the boundary to substrate

* refactor/rep: move conversion to base reputation one level down, left conversions

* fix/rep: order of magnitude adjustments

Thanks pierre!

* remove spaces

* chore/rep: give rationale for order of magnitude

* refactor/rep: move UnifiedReputationChange to separate file

* fix/rep: order of magnitudes correction
2021-02-17 17:18:13 +01:00
Robert Klotzner 0cb1ccd122 Generic request/response infrastructure for Polkadot (#2352)
* Move NetworkBridgeEvent to subsystem::messages.

It is not protocol related at all, it is in fact only part of the
subsystem communication as it gets wrapped into messages of each
subsystem.

* Request/response infrastructure is taking shape.

WIP: Does not compile.

* Multiplexer variant not supported by Rusts type system.

* request_response::request type checks.

* Cleanup.

* Minor fixes for request_response.

* Implement request sending + move multiplexer.

Request multiplexer is moved to bridge, as there the implementation is
more straight forward as we can specialize on `AllMessages` for the
multiplexing target.

Sending of requests is mostly complete, apart from a few `From`
instances. Receiving is also almost done, initializtion needs to be
fixed and the multiplexer needs to be invoked.

* Remove obsolete multiplexer.

* Initialize bridge with multiplexer.

* Finish generic request sending/receiving.

Subsystems are now able to receive and send requests and responses via
the overseer.

* Doc update.

* Fixes.

* Link issue for not yet implemented code.

* Fixes suggested by @ordian - thanks!

- start encoding at 0
- don't crash on zero protocols
- don't panic on not yet implemented request handling

* Update node/network/protocol/src/request_response/v1.rs

Use index 0 instead of 1.

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update node/network/protocol/src/request_response.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Fix existing tests.

* Better avoidance of division by zoro errors.

* Doc fixes.

* send_request -> start_request.

* Fix missing renamings.

* Update substrate.

* Pass TryConnect instead of true.

* Actually import `IfDisconnected`.

* Fix wrong import.

* Update node/network/bridge/src/lib.rs

typo

Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>

* Update node/network/bridge/src/multiplexer.rs

Remove redundant import.

Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>

* Stop doing tracing from within `From` instance.

Thanks for the catch @tomaka!

* Get rid of redundant import.

* Formatting cleanup.

* Fix tests.

* Add link to issue.

* Clarify comments some more.

* Fix tests.

* Formatting fix.

* tabs

* Fix link

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Use map_err.

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Improvements inspired by suggestions by @drahnr.

- Channel size is now determined by function.
- Explicitely scope NetworkService::start_request.

Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
2021-02-03 20:21:09 +00:00
Bastian Köcher 5be092894e Add one Jaeger span per relay parent (#2196)
* Add one Jaeger span per relay parent

This adds one Jaeger span per relay parent, instead of always creating
new spans per relay parent. This should improve the UI view, because
subsystems are now grouped below one common span.

* Fix doc tests

* Replace `PerLeaveSpan` to `PerLeafSpan`

* More renaming

* Moare

* Update node/subsystem/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Skip the spans

* Increase `spec_version`

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-01-05 15:09:25 +01:00
Robert Habermeier 41102a6ff9 Improve Network Spans (#2169)
* utility functions for erasure-coding threshold

* add candidate-hash tag to candidate jaeger spans

* debug implementation for jaeger span

* add a span to each live candidate in availability dist.

* availability span covers only our piece

* fix tests

* keep span alive slightly longer

* remove spammy bitfield-gossip-received log

* Revert "remove spammy bitfield-gossip-received log"

This reverts commit 831a2db506d66f64ea516af3caf891e8643f5c43.

* add claimed validator to bitfield-gossip span

* add peer-id to handle-incoming span

* add peer-id to availability distribution span

* Update node/network/availability-distribution/src/lib.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Update erasure-coding/src/lib.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Update node/subsystem/src/jaeger.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
2021-01-04 17:29:53 +00:00
Robert Habermeier fa7cced58d adjust span names (#2135)
* adjust span names

* fix compile
2020-12-17 16:11:37 -05:00
Andronik Ordian 3f5156e866 refactor View to include finalized_number (#2128)
* refactor View to include finalized_number

* guide: update the NetworkBridge on BlockFinalized

* av-store: fix the tests

* actually fix tests

* grumbles

* ignore macro doctest

* use Hash::repeat_bytes more consistently

* broadcast empty leaves updates as well

* fix issuing view updates on empty leaves updates
2020-12-17 12:50:58 -05:00
Bernhard Schuster a5fe710cc6 initial jaeger integration (#2047)
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2020-12-11 17:38:55 +01:00
Bastian Köcher 884b432035 Do not send messages twice in bitfield distribution (#2005)
* Do not send messages twice in bitfield distribution

This removes a bug which resulted in sending bitfield messages multiple
times by not checking if we already relayed them. Besides that it also
adds an optimization to not relay a message to a peer that send us
this message.

* Review comments

* Break some lines
2020-11-24 16:40:36 -05:00
Andronik Ordian 69b103b1d5 overseer: send_msg should not return an error (#1995)
* send_message should not return an error

* Apply suggestions from code review

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* s/send_logging_error/send_and_log_error

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-11-23 12:42:14 +01:00
Peter Goodspeed-Niklaus 0a5bc82529 Add Prometheus timers to the subsystems (#1923)
* reexport prometheus-super for ease of use of other subsystems

* add some prometheus timers for collation generation subsystem

* add timing metrics to av-store

* add metrics to candidate backing

* add timing metric to bitfield signing

* add timing metrics to candidate selection

* add timing metrics to candidate-validation

* add timing metrics to chain-api

* add timing metrics to provisioner

* add timing metrics to runtime-api

* add timing metrics to availability-distribution

* add timing metrics to bitfield-distribution

* add timing metrics to collator protocol: collator side

* add timing metrics to collator protocol: validator side

* fix candidate validation test failures

* add timing metrics to pov distribution

* add timing metrics to statement-distribution

* use substrate_prometheus_endpoint prometheus reexport instead of prometheus_super

* don't include JOB_DELAY in bitfield-signing metrics

* give adder-collator ability to easily export its genesis-state and validation code

* wip: adder-collator pushbutton script

* don't attempt to register the adder-collator automatically

Instead, get these values with

```sh
target/release/adder-collator export-genesis-state
target/release/adder-collator export-genesis-wasm
```

And then register the parachain on https://polkadot.js.org/apps/?rpc=ws%3A%2F%2F127.0.0.1%3A9944#/explorer

To collect prometheus data, after running the script, create `prometheus.yml` per the instructions
at https://www.notion.so/paritytechnologies/Setting-up-Prometheus-locally-835cb3a9df7541a781c381006252b5ff
and then run:

```sh
docker run -v `pwd`/prometheus.yml:/etc/prometheus/prometheus.yml:z --network host prom/prometheus
```

Demonstrates that data makes it across to prometheus, though it is likely to be useful in the future
to tweak the buckets.

* Update parachain/test-parachains/adder/collator/src/cli.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* use the grandpa-pause parameter

* skip metrics in tracing instrumentation

* remove unnecessary grandpa_pause cli param

Co-authored-by: Andronik Ordian <write@reusable.software>
2020-11-20 15:04:51 +01:00
Peter Goodspeed-Niklaus e49989971d Add tracing support to node (#1940)
* drop in tracing to replace log

* add structured logging to trace messages

* add structured logging to debug messages

* add structured logging to info messages

* add structured logging to warn messages

* add structured logging to error messages

* normalize spacing and Display vs Debug

* add instrumentation to the various 'fn run'

* use explicit tracing module throughout

* fix availability distribution test

* don't double-print errors

* remove further redundancy from logs

* fix test errors

* fix more test errors

* remove unused kv_log_macro

* fix unused variable

* add tracing spans to collation generation

* add tracing spans to av-store

* add tracing spans to backing

* add tracing spans to bitfield-signing

* add tracing spans to candidate-selection

* add tracing spans to candidate-validation

* add tracing spans to chain-api

* add tracing spans to provisioner

* add tracing spans to runtime-api

* add tracing spans to availability-distribution

* add tracing spans to bitfield-distribution

* add tracing spans to network-bridge

* add tracing spans to collator-protocol

* add tracing spans to pov-distribution

* add tracing spans to statement-distribution

* add tracing spans to overseer

* cleanup
2020-11-20 12:02:04 +01:00
Andronik Ordian 0a8a607a58 update most of the dependencies (#1946)
* update tiny-keccak to 0.2

* update deps except bitvec and shared_memory

* fix some warning after futures upgrade

* remove useless package rename caused by bug in cargo-upgrade

* revert parity-util-mem *

* remove unused import

* cargo update

* remove all renames on parity-scale-codec

* remove the leftovers

* remove unused dep
2020-11-17 11:16:31 +01:00