* use snake_case for log targets
* remove unused continue
* validator_discovery: when disconnecting, use all addresses
* validator_discovery: simplify request revocation
* fix a typo
* drop in tracing to replace log
* add structured logging to trace messages
* add structured logging to debug messages
* add structured logging to info messages
* add structured logging to warn messages
* add structured logging to error messages
* normalize spacing and Display vs Debug
* add instrumentation to the various 'fn run'
* use explicit tracing module throughout
* fix availability distribution test
* don't double-print errors
* remove further redundancy from logs
* fix test errors
* fix more test errors
* remove unused kv_log_macro
* fix unused variable
* add tracing spans to collation generation
* add tracing spans to av-store
* add tracing spans to backing
* add tracing spans to bitfield-signing
* add tracing spans to candidate-selection
* add tracing spans to candidate-validation
* add tracing spans to chain-api
* add tracing spans to provisioner
* add tracing spans to runtime-api
* add tracing spans to availability-distribution
* add tracing spans to bitfield-distribution
* add tracing spans to network-bridge
* add tracing spans to collator-protocol
* add tracing spans to pov-distribution
* add tracing spans to statement-distribution
* add tracing spans to overseer
* cleanup
* Make sure validator discovery works with a delayed peer to validator mapping
Currently, when a peer connects, the implementation checks whether it is
a validator by asking the authority discovery. However, the authority
discovery may not yet know that a given peer is an authority, for
example at node startup.
This PR changes the behavior so that a peer can be associated with a
validator id later. Instead of storing only the connected validators, we
now store all connected peers, each with a vector of associated
validator ids. When we get a request to connect to a given set of
validators, we start by checking the connected peers. If we don't find
a validator id among the connected peers, we ask the authority discovery
for the peer id of that authority id. If the returned peer id is part
of our connected peers set, we cache and return the authority id.
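The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `PeerId`, `AuthorityId`, `AuthorityDiscovery`, `TableDiscovery`, `ConnectedPeers` and `find_connected` are hypothetical stand-ins chosen for the example.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real peer and authority identifiers.
type PeerId = u32;
type AuthorityId = &'static str;

/// Hypothetical stand-in for the authority discovery service.
trait AuthorityDiscovery {
    fn peer_id_of(&self, authority: &AuthorityId) -> Option<PeerId>;
}

/// Toy discovery backed by a table that may be filled in late.
#[derive(Default)]
struct TableDiscovery {
    known: HashMap<AuthorityId, PeerId>,
}

impl AuthorityDiscovery for TableDiscovery {
    fn peer_id_of(&self, authority: &AuthorityId) -> Option<PeerId> {
        self.known.get(authority).copied()
    }
}

#[derive(Default)]
struct ConnectedPeers {
    // Every connected peer, with the validator ids associated so far.
    peers: HashMap<PeerId, Vec<AuthorityId>>,
}

impl ConnectedPeers {
    /// Record a new peer; its validator ids may only become known later.
    fn on_peer_connected(&mut self, peer: PeerId) {
        self.peers.entry(peer).or_default();
    }

    /// Return the requested authorities that are currently connected.
    fn find_connected<D: AuthorityDiscovery>(
        &mut self,
        wanted: &[AuthorityId],
        discovery: &D,
    ) -> Vec<(AuthorityId, PeerId)> {
        let mut found = Vec::new();
        for id in wanted {
            // 1. Check the associations cached on the connected peers.
            let cached = self
                .peers
                .iter()
                .find(|(_, ids)| ids.contains(id))
                .map(|(peer, _)| *peer);
            if let Some(peer) = cached {
                found.push((*id, peer));
                continue;
            }
            // 2. Otherwise ask the discovery, and cache the association
            //    only if the returned peer is actually connected.
            if let Some(peer) = discovery.peer_id_of(id) {
                if let Some(ids) = self.peers.get_mut(&peer) {
                    ids.push(*id);
                    found.push((*id, peer));
                }
            }
        }
        found
    }
}
```

In this sketch a peer that connects before the discovery knows about it is simply stored with an empty vector, and a later request picks up the association once the discovery can resolve it.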
* Update node/network/bridge/Cargo.toml
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
* Update node/network/bridge/src/validator_discovery.rs
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
* Update `Cargo.lock`
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
* stupid, but it compiles
* redo
* cleanup
* add ValidatorDiscovery to msgs
* sketch network bridge code
* ConnectToAuthorities instead of validators
* more stuff
* cleanup
* more stuff
* complete ConnectToAuthoritiesState
* Update node/network/bridge/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Collator protocol subsystem (#1659)
* WIP
* The initial implementation of the collator side.
* Improve comments
* Multiple collation requests
* Add more tests and comments to validator side
* Add comments, remove dead code
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Fix build after suggested changes
* Also connect to the next validator group
* Remove a Future impl and move TimeoutExt to util
* Minor nits
* Fix build
* Change FetchCollations back to FetchCollation
* Try this
* Final fixes
* Fix build
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* handle multiple in-flight connection requests
* handle cancelled requests
* Update node/core/runtime-api/src/lib.rs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* redo it again
* more stuff
* redo it again
* update comments
* workaround Future is not Send
* fix trailing spaces
* clarify comments
* bridge: fix compilation in tests
* update more comments
* small fixes
* port collator protocol to new validator discovery api
* collator tests compile
* collator tests pass
* do not revoke a request when the stream receiver is closed
* make revoking opt-in
* fix is_fulfilled
* handle request revocation in collator
* tests
* wait for validator connections asynchronously
* fix compilation
* relabel my todos
* apply Fedor's patch
* resolve reconnection TODO
* resolve revoking TODO
* resolve channel capacity TODO
* resolve peer cloning TODO
* resolve peer disconnected TODO
* resolve PeerSet TODO
* wip tests
* more tests
* resolve Arc TODO
* rename pending to non_revoked
* one more test
* extract utility function into util crate
* fix compilation in tests
* Apply suggestions from code review
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
* revert pin_project removal
* fix while let loop
* Revert "revert pin_project removal"
This reverts commit ae7f529d8de982ef66c3007dd1ff74c6ddce80d2.
* fix compilation
* Update node/subsystem/src/messages.rs
* docs on pub items
* guide updates
* remove a TODO
* small guide update
* fix a typo
* link to the issue
* validator discovery: on_request docs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* service-new: cosmetic changes
* overseer: draft of prometheus metrics
* metrics: update active_leaves metrics
* metrics: extract into functions
* metrics: resolve XXX
* metrics: it's ugly, but it works
* Bump Substrate
* metrics: move a bunch of code around
* Bump Substrate again
* metrics: fix a warning
* fix a warning in runtime
* metrics: statements signed
* metrics: statements impl RegisterMetrics
* metrics: refactor Metrics trait
* metrics: add Metrics assoc type to JobTrait
* metrics: move Metrics trait to util
* metrics: fix overseer
* metrics: fix backing
* metrics: fix candidate validation
* metrics: derive Default
* metrics: docs
* metrics: add stubs for other subsystems
* metrics: add more stubs and fix compilation
* metrics: fix doctest
* metrics: move to subsystem
* metrics: fix candidate validation
* metrics: bitfield signing
* metrics: av store
* metrics: chain API
* metrics: runtime API
* metrics: stub for avad
* metrics: candidates seconded
* metrics: ok I gave up
* metrics: provisioner
* metrics: remove a clone by requiring Metrics: Sync
* metrics: YAGNI
* metrics: remove another TODO
* metrics: for later
* metrics: add parachain_ prefix
* metrics: s/signed_statement/signed_statements
* utils: add a comment for job metrics
* metrics: address review comments
* metrics: oops
* metrics: make sure to save files before commit 😅
* use _total suffix for requests metrics
Co-authored-by: Max Inden <mail@max-inden.de>
* metrics: add tests for overseer
* update Cargo.lock
* overseer: add a test for CollationGeneration
* collation-generation: impl metrics
* collation-generation: use kebab-case for name
* collation-generation: add a constructor
Co-authored-by: Gav Wood <gavin@parity.io>
Co-authored-by: Ashley Ruglys <ashley.ruglys@gmail.com>
Co-authored-by: Max Inden <mail@max-inden.de>
* update networking types
* port over overseer-protocol message types
* Add the collation protocol to network bridge
* message sending
* stub for ConnectToValidators
* add some helper traits and methods to protocol types
* add collator protocol message
* leaves-updating
* peer connection and disconnection
* add utilities for dispatching multiple events
* implement message handling
* add an observedrole enum with equality and no sentry nodes
* derive partial-eq on network bridge event
* add PartialEq impls for network message types
* add Into implementation for observedrole
* port over existing network bridge tests
* add some more tests
* port bitfield distribution
* port over bitfield distribution tests
* add codec indices
* port PoV distribution
* port over PoV distribution tests
* port over statement distribution
* port over statement distribution tests
* update overseer and service-new
* address review comments
* port availability distribution
* port over availability distribution tests
* Initial commit
* WIP
* Make atomic transactions
* Remove pruning code
* Fix build and add a Nop to bridge
* Fixes from review
* Move config struct around for clarity
* Rename constructor and warn on missing docs
* Fix a test and rename a message
* Fix some more reviews
* Obviously failed to rebase cleanly
* add ActiveLeavesUpdate, remove StartWork, StopWork
* replace StartWork, StopWork in subsystem crate tests
* mechanically update OverseerSignal in other modules
* convert overseer to take advantage of new multi-hash update abilities
Note: this does not yet convert the tests; some of the tests now freeze:
test tests::overseer_start_stop_works ... test tests::overseer_start_stop_works has been running for over 60 seconds
test tests::overseer_finalize_works ... test tests::overseer_finalize_works has been running for over 60 seconds
* fix broken overseer tests
* manually impl PartialEq for ActiveLeavesUpdate, rm trait Equivalent
This cleans up the code a bit and makes it easier in the future to
do the right thing when comparing ALUs.
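One way a manual impl could look is sketched below. It assumes that the order of hashes inside an update should not affect equality; that assumption, the `u64` hash stand-in, and the field names `activated` and `deactivated` are illustrative and not taken from this message.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for a block hash.
type Hash = u64;

#[derive(Debug, Default)]
struct ActiveLeavesUpdate {
    activated: Vec<Hash>,
    deactivated: Vec<Hash>,
}

impl PartialEq for ActiveLeavesUpdate {
    // Assumption: two updates are equal when they name the same hashes,
    // regardless of the order in which the hashes are listed.
    fn eq(&self, other: &Self) -> bool {
        fn as_set(hashes: &[Hash]) -> HashSet<Hash> {
            hashes.iter().copied().collect()
        }
        as_set(&self.activated) == as_set(&other.activated)
            && as_set(&self.deactivated) == as_set(&other.deactivated)
    }
}
```

Keeping the comparison inside `PartialEq` means tests can use `assert_eq!` directly instead of going through a separate equivalence trait.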
* use target in all network bridge logging
* reduce spamming of and
* get conclude signal working properly; don't allocate a vector
* wip: add test suite / example / explanation for using utility subsystem
Unfortunately, the test fails right now for reasons which seem
very odd. Just have to keep poking at it.
* explicitly import everything
* fix subsystem-util test
The root problem here was two-fold:
- there was a circular dependency from subsystem -> test-helpers/subsystem ->
subsystem
- cfg(test) doesn't propagate between crates
The solution: move the subsystem test helpers into a sub-module
within subsystem. Publicly export them from the previous location
so no other code breaks.
Doing this has an additional benefit: it ensures that no production
code can ever accidentally use the subsystem helpers, as they are compile-
gated on cfg(test).
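The layout described above might look roughly like this inside a single crate. It is a sketch only: `double`, `test_helpers` and `sample_input` are hypothetical names, not items from the subsystem crate.

```rust
// A stand-in for some production item in the crate.
pub fn double(x: u32) -> u32 {
    x * 2
}

// The helpers live in a sub-module of the same crate and are gated on
// cfg(test), so they are only compiled when this crate itself is tested.
#[cfg(test)]
pub mod test_helpers {
    /// Hypothetical helper shared by the tests in this crate.
    pub fn sample_input() -> u32 {
        21
    }
}

#[cfg(test)]
mod tests {
    use super::{double, test_helpers};

    #[test]
    fn doubles_sample_input() {
        assert_eq!(double(test_helpers::sample_input()), 42);
    }
}
```

Because a dependency is built without `cfg(test)` when another crate runs its tests, the gated module is invisible from outside, which is what keeps production code from using it.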
* fully commit to moving test helpers into a subsystem module
* add some more tests
* get rid of log tests in favor of real error forwarding
It's not obvious whether we'll ever really want to chase down
these errors outside a testing context, but having the capability
won't hurt.
* fix issue which caused test to hang on osx
* only require that job errors are PartialEq when testing
also fix polkadot-node-core-backing tests
* get rid of any notion of partialeq
* rethink testing
Combine tests of starting and stopping job: leaving a test executor
with a job running was pretty clearly the cause of the sometimes-hang.
Also, add a timeout so tests _can't_ hang anymore; they just fail
after a while.
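The timeout idea can be illustrated with the standard library alone. This is a sketch under assumptions: `run_with_timeout` is a hypothetical helper built on threads and channels, and the actual tests may rely on a different, future-based mechanism.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `job` on a separate thread and returns `None` if it does not
/// finish within `limit`, so the caller can fail instead of hanging.
fn run_with_timeout<T, F>(limit: Duration, job: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the timeout fired first.
        let _ = tx.send(job());
    });
    rx.recv_timeout(limit).ok()
}
```

A test can then assert that the result is `Some(..)`. Note that in this sketch a job that overruns keeps running on its detached thread; only the waiting side gives up.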
* rename fwd_errors -> forward_errors
* warn on error propagation failure
* fix unused import leftover from merge
* derive eq for subsystemerror
* create a v1 primitives module
* Improve guide on availability types
* punctuate
* new parachains runtime uses new primitives
* tests of new runtime now use new primitives
* add ErasureChunk to guide
* export erasure chunk from v1 primitives
* subsystem crate uses v1 primitives
* node-primitives uses new v1 primitives
* port overseer to new primitives
* new-proposer uses v1 primitives (no ParachainHost anymore)
* fix no-std compilation for primitives
* service-new uses v1 primitives
* network-bridge uses new primitives
* statement distribution uses v1 primitives
* PoV distribution uses v1 primitives; add PoV::hash fn
* move parachain to v0
* remove inclusion_inherent module and place into v1
* remove everything from primitives crate root
* remove some unused old types from v0 primitives
* point everything else at primitives::v0
* squanch some warns up
* add RuntimeDebug import to no-std as well
* port over statement-table and validation
* fix final errors in validation and node-primitives
* add dummy Ord impl to committed candidate receipt
* guide: update CandidateValidationMessage
* add primitive for validationoutputs
* expand CandidateValidationMessage further
* bikeshed
* add some impls to omitted-validation-data and available-data
* expand CandidateValidationMessage
* make erasure-coding generic over v1/v0
* update usages of erasure-coding
* implement commitments.hash()
* use Arc<Pov> for CandidateValidation
* improve new erasure-coding method names
* fix up candidate backing
* update docs a bit
* fix most tests and add short-circuiting to make_pov_available
* fix remainder of candidate backing tests
* squanching warns
* squanch it up
* some fallout
* overseer fallout
* free from polkadot-test-service hell
* network bridge skeleton
* move some primitives around and add debug impls
* protocol registration glue & abstract network interface
* add send_msgs to subsystemctx
* select logic
* transform different events into actions and handle
* implement remaining network bridge state machine
* start test skeleton
* make network methods asynchronous
* extract subsystem out to subsystem crate
* port over overseer to subsystem context trait
* fix minimal example
* fix overseer doc test
* update network-bridge crate
* write a subsystem test-helpers crate
* write a network test helper for network-bridge
* set up (broken) view test
* Revamp network to be more async-friendly and not require Sync
* fix spacing
* fix test compilation
* insert side-channel for actions
* Add some more message types to AllMessages
* introduce a test harness
* add some tests
* ensure service compiles and passes tests
* fix typo
* fix service-new compilation
* Subsystem test helpers send messages synchronously
* remove smelly action inspector
* remove superfluous let binding
* fix warnings
* Update node/network/bridge/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* fix compilation
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>