* Some code cleanup in overseer
- Switches to select! in the overseer run loop to be more fair about
message processing between the different sources.
- Added a check to only send `ActiveLeaves` if the update actually
contains any data.
* Move the check
* Restore old behavior
* Simplify message sending and signal sending to subsystems
* Update node/subsystem/src/lib.rs
* drop in tracing to replace log
* add structured logging to trace messages
* add structured logging to debug messages
* add structured logging to info messages
* add structured logging to warn messages
* add structured logging to error messages
* normalize spacing and Display vs Debug
* add instrumentation to the various 'fn run'
* use explicit tracing module throughout
* fix availability distribution test
* don't double-print errors
* remove further redundancy from logs
* fix test errors
* fix more test errors
* remove unused kv_log_macro
* fix unused variable
* add tracing spans to collation generation
* add tracing spans to av-store
* add tracing spans to backing
* add tracing spans to bitfield-signing
* add tracing spans to candidate-selection
* add tracing spans to candidate-validation
* add tracing spans to chain-api
* add tracing spans to provisioner
* add tracing spans to runtime-api
* add tracing spans to availability-distribution
* add tracing spans to bitfield-distribution
* add tracing spans to network-bridge
* add tracing spans to collator-protocol
* add tracing spans to pov-distribution
* add tracing spans to statement-distribution
* add tracing spans to overseer
* cleanup
* Make `CandidateHash` a real type
This pr adds a new type `CandidateHash` that is used instead of the
opaque `Hash` type. This helps to ensure on the type system level that
we are passing the correct types.
This pr also fixes wrong usage of `relay_parent` as `candidate_hash`
when communicating with the av storage.
* Update core-primitives/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Wrap the lines
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Parachain improvements
- Set the parachains configuration in Rococo genesis
- Don't stop the overseer when a subsystem job is stopped
- Several small code changes
* Remove unused functionality
* Return error from the runtime instead of printing it
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Fix test
* Revert "Update primitives/src/v1.rs"
This reverts commit 11fce2785acd1de481ca57815b8e18400f09fd52.
* Revert "Update primitives/src/v1.rs"
This reverts commit d6439fed4f954360c89fb1e12b73954902c76a41.
* Revert "Return error from the runtime instead of printing it"
This reverts commit cb4b5c0830ac516a6d54b2c24197e9354f2b98cb.
* Revert "Fix test"
This reverts commit 0c5fa1b5566d4cd3c55a55d485e707165ce7a59e.
* Update runtime/parachains/src/runtime_api_impl/v1.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* start working on building the real overseer
Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.
* fill in AllSubsystems internal constructors
* replace fn make_metrics with Metrics::attempt_to_register
* update to account for #1740
* remove Metrics::register, rename Metrics::attempt_to_register
* add 'static bounds to real_overseer type params
* pass authority_discovery and network_service to real_overseer
It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.
* select a proper database configuration for the availability store db
* use subdirectory for av-store database path
* apply Basti's patch which avoids needing to parameterize everything on Block
* simplify path extraction
* get all tests to compile
* Fix Prometheus double-registry error
for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:
```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
err
}),
```
That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!
* authorities must have authority discovery, but not necessarily overseer handlers
* fix broken SpawnedSubsystem impls
detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.
This is not the final fix needed to get things working properly,
but it's a good start.
* use prometheus properly
Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.
We were using them wrong, though. This pattern was repeated in a
variety of places in the code:
```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```
The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.
This commit fixes that.
* get av-store subsystem to actually run properly and not die on first signal
* typo fix: incomming -> incoming
* don't disable authority discovery in test nodes
* Fix rococo-v1 missing session keys
* Update node/core/av-store/Cargo.toml
* try dummying out av-store on non-full-nodes
* overseer and subsystems are required only for full nodes
* Reduce the amount of warnings on browser target
* Fix two more warnings
* InclusionInherent should actually have an Inherent module on rococo
* Ancestry: don't return genesis' parent hash
* Update Cargo.lock
* fix broken test
* update test script: specify chainspec as script argument
* Apply suggestions from code review
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Update node/service/src/lib.rs
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* node/service/src/lib: Return error via ? operator
* post-merge blues
* add is_collator flag
* prevent occasional av-store test panic
* simplify fix; expand application
* run authority_discovery in Role::Discover when collating
* distinguish between proposer closed channel errors
* add IsCollator enum, remove is_collator CLI flag
* improve formatting
* remove nop loop
* Fix some stuff
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
* Make `AllSubsystems` usage easier in tests
This makes the usage of `AllSubsystems` easier in tests by introducing
new methods.
- `dummy` initializes `AllSubsystems` with all systems set to dummy
- `replace_*` to replace any subsystem
Besides that this pr adds a `ForwardSubsystem` that is also useful for
tests. This subsystem will forward all incoming messages to the given channel.
* Update node/overseer/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update node/subsystem/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update node/subsystem/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Move ForwardSubsystem and add a test
* Break some lines
Co-authored-by: Andronik Ordian <write@reusable.software>
This pr changes the collator interface function to return an optional
collation instead of a collation. This is required as the parachain
itself can fail to generate a valid collation for various reason. Now if
the collation fails it will return `None`.
Besides that the pr adds some `RuntimeDebug` derive for `ValidationData`
and removes some whitespaces.
* update primitives
* correct parent_head field
* make hrmp field pub
* refactor validation data: runtime
* refactor validation data: messages
* add arguments to full_validation_data runtime API
* port runtime API
* mostly port over candidate validation
* remove some parameters from ValidationParams
* guide: update candidate validation
* update candidate outputs
* update ValidationOutputs in primitives
* port over candidate validation
* add a new test for no-transient behavior
* update util runtime API wrappers
* candidate backing
* fix missing imports
* change some fields of validation data around
* runtime API impl
* update candidate validation
* fix backing tests
* grumbles from review
* fix av-store tests
* fix some more crates
* fix provisioner tests
* fix availability distribution tests
* port collation-generation to new validation data
* fix overseer tests
* Update roadmap/implementers-guide/src/node/utility/candidate-validation.md
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* service-new: cosmetic changes
* overseer: draft of prometheus metrics
* metrics: update active_leaves metrics
* metrics: extract into functions
* metrics: resolve XXX
* metrics: it's ugly, but it works
* Bump Substrate
* metrics: move a bunch of code around
* Bumb substrate again
* metrics: fix a warning
* fix a warning in runtime
* metrics: statements signed
* metrics: statements impl RegisterMetrics
* metrics: refactor Metrics trait
* metrics: add Metrics assoc type to JobTrait
* metrics: move Metrics trait to util
* metrics: fix overseer
* metrics: fix backing
* metrics: fix candidate validation
* metrics: derive Default
* metrics: docs
* metrics: add stubs for other subsystems
* metrics: add more stubs and fix compilation
* metrics: fix doctest
* metrics: move to subsystem
* metrics: fix candidate validation
* metrics: bitfield signing
* metrics: av store
* metrics: chain API
* metrics: runtime API
* metrics: stub for avad
* metrics: candidates seconded
* metrics: ok I gave up
* metrics: provisioner
* metrics: remove a clone by requiring Metrics: Sync
* metrics: YAGNI
* metrics: remove another TODO
* metrics: for later
* metrics: add parachain_ prefix
* metrics: s/signed_statement/signed_statements
* utils: add a comment for job metrics
* metrics: address review comments
* metrics: oops
* metrics: make sure to save files before commit 😅
* use _total suffix for requests metrics
Co-authored-by: Max Inden <mail@max-inden.de>
* metrics: add tests for overseer
* update Cargo.lock
* overseer: add a test for CollationGeneration
* collation-generation: impl metrics
* collation-generation: use kebab-case for name
* collation-generation: add a constructor
Co-authored-by: Gav Wood <gavin@parity.io>
Co-authored-by: Ashley Ruglys <ashley.ruglys@gmail.com>
Co-authored-by: Max Inden <mail@max-inden.de>
* update networking types
* port over overseer-protocol message types
* Add the collation protocol to network bridge
* message sending
* stub for ConnectToValidators
* add some helper traits and methods to protocol types
* add collator protocol message
* leaves-updating
* peer connection and disconnection
* add utilities for dispatching multiple events
* implement message handling
* add an observedrole enum with equality and no sentry nodes
* derive partial-eq on network bridge event
* add PartialEq impls for network message types
* add Into implementation for observedrole
* port over existing network bridge tests
* add some more tests
* port bitfield distribution
* port over bitfield distribution tests
* add codec indices
* port PoV distribution
* port over PoV distribution tests
* port over statement distribution
* port over statement distribution tests
* update overseer and service-new
* address review comments
* port availability distribution
* port over availability distribution tests
* propagate chain_api subsystem through various locations
* add ChainApi to AllMessages
Co-authored-by: Peter Goodspeed-Niklaus <peter.r.goodspeedniklaus@gmail.com>
* Initial commit
* WIP
* Make atomic transactions
* Remove pruning code
* Fix build and add a Nop to bridge
* Fixes from review
* Move config struct around for clarity
* Rename constructor and warn on missing docs
* Fix a test and rename a message
* Fix some more reviews
* Obviously failed to rebase cleanly
* add ActiveLeavesUpdate, remove StartWork, StopWork
* replace StartWork, StopWork in subsystem crate tests
* mechanically update OverseerSignal in other modules
* convert overseer to take advantage of new multi-hash update abilities
Note: this does not yet convert the tests; some of the tests now freeze:
test tests::overseer_start_stop_works ... test tests::overseer_start_stop_works has been running for over 60 seconds
test tests::overseer_finalize_works ... test tests::overseer_finalize_works has been running for over 60 seconds
* fix broken overseer tests
* manually impl PartialEq for ActiveLeavesUpdate, rm trait Equivalent
This cleans up the code a bit and makes it easier in the future to
do the right thing when comparing ALUs.
* use target in all network bridge logging
* reduce spamming of and
* update guide to reduce confusion and TODOs
* work from previous bitfield signing effort
There were large merge issues with the old bitfield signing PR, so
we're just copying all the work from that onto this and restarting.
Much of the existing work will be discarded because we now have better
tools available, but that's fine.
* start rewriting bitfield signing in terms of the util module
* implement construct_availability_bitvec
It's not an ideal implementation--we can make it much more concurrent--
but at least it compiles.
* implement the unimplemented portions of bitfield signing
* get core availability concurrently, not sequentially
* use sp-std instead of std for a parachain item
* resolve type inference failure caused by multiple From impls
* handle bitfield signing subsystem & Allmessages variant in overseer
* fix more multi-From inference issues
* more concisely handle overflow
Co-authored-by: Andronik Ordian <write@reusable.software>
* Revert "resolve type inference failure caused by multiple From impls"
This reverts commit 7fc77805de5e5074a1b01037f8d4e3919e03e0e1.
* Revert "fix more multi-From inference issues"
This reverts commit f14ffe589e20d664d8a900ed62f68b6fb844a514.
* impl From<i32> for ParaId
* handle another instance of AllSubsystems
* improve consistency when returning existing options
Co-authored-by: Andronik Ordian <write@reusable.software>
* create a v1 primitives module
* Improve guide on availability types
* punctuate
* new parachains runtime uses new primitives
* tests of new runtime now use new primitives
* add ErasureChunk to guide
* export erasure chunk from v1 primitives
* subsystem crate uses v1 primitives
* node-primitives uses new v1 primitives
* port overseer to new primitives
* new-proposer uses v1 primitives (no ParachainHost anymore)
* fix no-std compilation for primitives
* service-new uses v1 primitives
* network-bridge uses new primitives
* statement distribution uses v1 primitives
* PoV distribution uses v1 primitives; add PoV::hash fn
* move parachain to v0
* remove inclusion_inherent module and place into v1
* remove everything from primitives crate root
* remove some unused old types from v0 primitives
* point everything else at primitives::v0
* squanch some warns up
* add RuntimeDebug import to no-std as well
* port over statement-table and validation
* fix final errors in validation and node-primitives
* add dummy Ord impl to committed candidate receipt
* guide: update CandidateValidationMessage
* add primitive for validationoutputs
* expand CandidateValidationMessage further
* bikeshed
* add some impls to omitted-validation-data and available-data
* expand CandidateValidationMessage
* make erasure-coding generic over v1/v0
* update usages of erasure-coding
* implement commitments.hash()
* use Arc<Pov> for CandidateValidation
* improve new erasure-coding method names
* fix up candidate backing
* update docs a bit
* fix most tests and add short-circuiting to make_pov_available
* fix remainder of candidate backing tests
* squanching warns
* squanch it up
* some fallout
* overseer fallout
* free from polkadot-test-service hell
* overseer: introduce a utility typemap
* it's ugly but it compiles
* move DummySubsystem to subsystem crate
* fix tests fallout
* use a struct for all subsystems
* more tests fallout
* add missing pov_distribution subsystem
* remove unused imports and bounds
* fix minimal-example
* Updates guide for CandidateBacking
* Move assignment types to primitives
* Initial implementation.
* More functionality
* use assert_matches
* Changes to report misbehaviors
* Some fixes after a review
* Remove a blank line
* Update guide and some types
* Adds run_job function
* Some comments and refactorings
* Fix review
* Remove warnings
* Use summary in kicking off validation
* Parallelize requests
* Validation provides local and global validation params
* Test issued validity tracking
* Nits from review
* network bridge skeleton
* move some primitives around and add debug impls
* protocol registration glue & abstract network interface
* add send_msgs to subsystemctx
* select logic
* transform different events into actions and handle
* implement remaining network bridge state machine
* start test skeleton
* make network methods asynchronous
* extract subsystem out to subsystem crate
* port over overseer to subsystem context trait
* fix minimal example
* fix overseer doc test
* update network-bridge crate
* write a subsystem test-helpers crate
* write a network test helper for network-bridge
* set up (broken) view test
* Revamp network to be more async-friendly and not require Sync
* fix spacing
* fix test compilation
* insert side-channel for actions
* Add some more message types to AllMessages
* introduce a test harness
* add some tests
* ensure service compiles and passes tests
* fix typo
* fix service-new compilation
* Subsystem test helpers send messages synchronously
* remove smelly action inspector
* remove superfluous let binding
* fix warnings
* Update node/network/bridge/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* fix compilation
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* introduce polkadot-node-primitives
* guide: change statement distribution message types
* guide: remove variant from `CandidateSelectionMessage`
* add a few more message types
* add TODOs
* Almost all messages
* NewBackedCandidate notification
* Formatting
* Use AttestedCandidate as BackedCandidate
* Update node/primitives/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Fix the tests
* Bring in types from #1242
* Adds network bridge messages
* More message types from doc
* use fn pointer type
* Fixes from the review
* Add missing Runtime subsystem message
* rename to CandidateValidationMessage and fix tests
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* New service initial commit
* More separation of the new and old services
* Fix review comments
* Adds polkadot.json
* Fix browser build
* Remove unused import
* Update node/service/src/lib.rs
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
* Remove duplicate json files
Co-authored-by: Robert Habermeier <rphmeier@gmail.com>