* remove low information density error doc comments
* another round of error dancing
* fix compilation
* remove stale `None` argument
* adjust test, minor slip in command
* only add AvailabilityError for full node features
* another None where none shuld be
* Add an upper number of maximum parallel runtime api requests
Instead of spawning all runtime api requests in the background and using
all wasm instances. This pr adds a maximum number of parallel requests.
* Update node/core/runtime-api/src/lib.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Review feedback
* Increase instances
* Add warning
* Update node/core/runtime-api/src/lib.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* guide: non-semantic changes
* guide: update per the issue description
* GetBackedCandidates operates on multiple hashes now
* GetBackedCandidates still needs a relay parent
* implement changes specified in guide
* distinguish between various occasions for canceled oneshots
* add tracing info to getbackedcandidates
* REVERT ME: add tracing messages for GetBackedCandidates
Note that these messages are only sometimes actually passed on to the
candidate backing subsystem, with the consequence that it is
unexpectedly frequent that the provisioner fails to create its
provisionable data.
* REVERT ME: more tracing logging
* REVERT ME: log when CandidateBackingJob receives any message at all
* REVERT ME: log when send_msg sends a message to a job
* fix candidate-backing tests
* streamline GetBackedCandidates
This uses table.attested_candidate instead of table.get_candidate, because
it's not obvious how to get a BackedCandidate from just a
CommittedCandidateReceipt.
* REVERT ME: more logging tracing job lifespans
* promote warning about job premature demise
* don't terminate CandiateBackingJob::run_loop in event of failure to process message
* Revert "REVERT ME: more logging tracing job lifespans"
This reverts commit 7365f2fb3dec988d95cfcd317eba75587fe7fd16.
* Revert "REVERT ME: log when send_msg sends a message to a job"
This reverts commit 58e46aad038e6517d6d56390c8be65b046a21884.
* Revert "REVERT ME: log when CandidateBackingJob receives any message at all"
This reverts commit 0d6f38413c7c66b5e9e81dabc587906fa9f82656.
* Revert "REVERT ME: more tracing logging"
This reverts commit 675fd2628e84d1596965280e7314155ef21b28e6.
* Revert "REVERT ME: add tracing messages for GetBackedCandidates"
This reverts commit e09e156493430b33b6c8ab4b5cedb3f2f91afd51.
* formatting
* add logging message to CandidateBackingJob::run_loop start
* REVERT ME: add tracing to candidate-backing job creation
* run candidatebacking loop even if no assignment
* use unique error variants for each canceled oneshot
* Revert "REVERT ME: add tracing to candidate-backing job creation"
This reverts commit 8ce5f4f0bd7186dade134b118751480f72ea1fd6.
* try_runtime_api more to reduce silent exits
* add sanity check that returned backed candidates preserve ordering
* remove redundant err attribute
* refactor some functions to not rely on `self`
* factor out common elements of seconding and attesting
* Add Spawn to backing FromJob
* do candidate validation in background
* tests
* address grumbles
* Simplify subsystem jobs
This pr simplifies the subsystem jobs interface. Instead of requiring an
extra message that is used to signal that a job should be ended, a job
now ends when the receiver returns `None`. Besides that it changes the
interface to enforce that messages to a job provide a relay parent.
* Drop ToJobTrait
* Remove FromJob
We always convert this message to FromJobCommand anyway.
This pr changes how the runtime api subsystem processes runtime api
requests. Instead of answering all of them in the subsystem task and
thus, making all requests sequential, we now answer them in a background
task. This enables us to serve multiple requests at once.
* guide: fix formatting for SessionInfo module
* primitives: SessionInfo type
* punt on approval keys
* ah, revert the type alias
* session info runtime module skeleton
* update the guide
* runtime/configuration: sync with the guide
* runtime/configuration: setters for newly added fields
* runtime/configuration: set codec indexes
* runtime/configuration: update test
* primitives: fix SessionInfo definition
* runtime/session_info: initial impl
* runtime/session_info: use initializer for session handling (wip)
* runtime/session_info: mock authority discovery trait
* guide: update the initializer's order
* runtime/session_info: tests skeleton
* runtime/session_info: store n_delay_tranches in Configuration
* runtime/session_info: punt on approval keys
* runtime/session_info: add some basic tests
* Update primitives/src/v1.rs
* small fixes
* remove codec index annotation on structs
* fix off-by-one error
* validator_discovery: accept a session index
* runtime: replace validator_discovery api with session_info
* Update runtime/parachains/src/session_info.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* runtime/session_info: add a comment about missing entries
* runtime/session_info: define the keys
* util: expose connect_to_past_session_validators
* util: allow session_info requests for jobs
* runtime-api: add mock test for session_info
* collator-protocol: add session_index to test state
* util: fix error message for runtime error
* fix compilation
* fix tests after merge with master
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* use snake_case for log targets
* remove unused continue
* validator_discovery: when disconnecting, use all addresses
* validator_discovery: simplify request revokation
* fix a typo
* reexport prometheus-super for ease of use of other subsystems
* add some prometheus timers for collation generation subsystem
* add timing metrics to av-store
* add metrics to candidate backing
* add timing metric to bitfield signing
* add timing metrics to candidate selection
* add timing metrics to candidate-validation
* add timing metrics to chain-api
* add timing metrics to provisioner
* add timing metrics to runtime-api
* add timing metrics to availability-distribution
* add timing metrics to bitfield-distribution
* add timing metrics to collator protocol: collator side
* add timing metrics to collator protocol: validator side
* fix candidate validation test failures
* add timing metrics to pov distribution
* add timing metrics to statement-distribution
* use substrate_prometheus_endpoint prometheus reexport instead of prometheus_super
* don't include JOB_DELAY in bitfield-signing metrics
* give adder-collator ability to easily export its genesis-state and validation code
* wip: adder-collator pushbutton script
* don't attempt to register the adder-collator automatically
Instead, get these values with
```sh
target/release/adder-collator export-genesis-state
target/release/adder-collator export-genesis-wasm
```
And then register the parachain on https://polkadot.js.org/apps/?rpc=ws%3A%2F%2F127.0.0.1%3A9944#/explorer
To collect prometheus data, after running the script, create `prometheus.yml` per the instructions
at https://www.notion.so/paritytechnologies/Setting-up-Prometheus-locally-835cb3a9df7541a781c381006252b5ff
and then run:
```sh
docker run -v `pwd`/prometheus.yml:/etc/prometheus/prometheus.yml:z --network host prom/prometheus
```
Demonstrates that data makes it across to prometheus, though it is likely to be useful in the future
to tweak the buckets.
* Update parachain/test-parachains/adder/collator/src/cli.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* use the grandpa-pause parameter
* skip metrics in tracing instrumentation
* remove unnecessary grandpa_pause cli param
Co-authored-by: Andronik Ordian <write@reusable.software>
* drop in tracing to replace log
* add structured logging to trace messages
* add structured logging to debug messages
* add structured logging to info messages
* add structured logging to warn messages
* add structured logging to error messages
* normalize spacing and Display vs Debug
* add instrumentation to the various 'fn run'
* use explicit tracing module throughout
* fix availability distribution test
* don't double-print errors
* remove further redundancy from logs
* fix test errors
* fix more test errors
* remove unused kv_log_macro
* fix unused variable
* add tracing spans to collation generation
* add tracing spans to av-store
* add tracing spans to backing
* add tracing spans to bitfield-signing
* add tracing spans to candidate-selection
* add tracing spans to candidate-validation
* add tracing spans to chain-api
* add tracing spans to provisioner
* add tracing spans to runtime-api
* add tracing spans to availability-distribution
* add tracing spans to bitfield-distribution
* add tracing spans to network-bridge
* add tracing spans to collator-protocol
* add tracing spans to pov-distribution
* add tracing spans to statement-distribution
* add tracing spans to overseer
* cleanup
* Rename ExecutionMode to IsolationStrategy
Execution mode is too generic name and can imply a lot of different
aspects of execution. The notion of isolation better describes the
meant aspect.
And while I am at it, I also renamed mode -> strategy cause it seems a
bit more appropriate, although that is way more subjective.
* Fix compilation in wasm_executor tests.
* Add a comment to IsolationStrategy
* Update comments on IsolationStrategy
* Update node/core/candidate-validation/src/lib.rs
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Accomodate the point on interruption
* Update parachain/src/wasm_executor/mod.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Naming nits
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
* Adds integration test based on adder collator
This adds an integration test for parachains that uses the adder
collator. The test will start two relay chain nodes and one collator and
waits until 4 blocks are build and enacted by the parachain.
* Make sure the integration test is run in CI
* Fix wasm compilation
* Update parachain/test-parachains/adder/collator/src/lib.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Update cli/src/command.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* HRMP: Update the impl guide
* HRMP: Incorporate the channel notifications into the guide
* HRMP: Renaming in the impl guide
* HRMP: Constrain the maximum number of HRMP messages per candidate
This commit addresses the HRMP part of https://github.com/paritytech/polkadot/issues/1869
* XCM: Introduce HRMP related message types
* HRMP: Data structures and plumbing
* HRMP: Configuration
* HRMP: Data layout
* HRMP: Acceptance & Enactment
* HRMP: Test base logic
* Update adder collator
* HRMP: Runtime API for accessing inbound messages
Also, removing some redundant fully-qualified names.
* HRMP: Add diagnostic logging in acceptance criteria
* HRMP: Additional tests
* Self-review fixes
* save test refactorings for the next time
* Missed a return statement.
* a formatting blip
* Add missing logic for appending HRMP digests
* Remove the channel contents vectors which became empty
* Tighten HRMP channel digests invariants.
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Remove a note about sorting for channel id
* Add missing rustdocs to the configuration
* Clarify and update the invariant for HrmpChannelDigests
* Make the onboarding invariant less sloppy
Namely, introduce `Paras::is_valid_para` (in fact, it already is present
in the implementation) and hook up the invariant to that.
Note that this says "within a session" because I don't want to make it
super strict on the session boundary. The logic on the session boundary
should be extremely careful.
* Make `CandidateCheckContext` use T::BlockNumber for hrmp_watermark
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
We need to distribute the PoV after we have seconded it. Other nodes
that will receive our `Secondded` statement and want to validate the
candidate another time will request this PoV from us.
* Make `CandidateHash` a real type
This pr adds a new type `CandidateHash` that is used instead of the
opaque `Hash` type. This helps to ensure on the type system level that
we are passing the correct types.
This pr also fixes wrong usage of `relay_parent` as `candidate_hash`
when communicating with the av storage.
* Update core-primitives/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Wrap the lines
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* backing: extract log target
* bitfield-signing: extract log target
* utils: fix a typo
* provisioner: extract log target
* candidate selection: remove unused error variant
* bitfield-distribution: change the return type of run
* pov-distribution: extract log target
* collator-protocol: simplify runtime request
* collation-generation: do not exit early on error
* collation-generation: do not exit on double init
* collator-protocol: do not exit on errors and rename LOG_TARGET
* collator-protocol: a workaround for ununused imports warning
* Update node/network/bitfield-distribution/src/lib.rs
* collation-generation: elevate warn! to error!
* collator-protocol: fix imports
* post merge fix
* fix compilation
* Do not validate a candidate in candidate selection
The candidate selection subsystem should not validate a candidate, as
this is done by the backing subsystem on a `Second` request. Otherwise
we validate one candidate twice.
* Update candidate-selection.md
* Moare fixes for parachains
- Sending data to a job should always contain a relay parent. Done this
for the provisioner
- Fixed the `select_availability_bitfields` function. It was assuming we
have one core per validator, while we only have one core per parachain.
- Drive by async "rewrite" in proposer
* Make tests compile
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Parachain improvements
- Set the parachains configuration in Rococo genesis
- Don't stop the overseer when a subsystem job is stopped
- Several small code changes
* Remove unused functionality
* Return error from the runtime instead of printing it
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Fix test
* Revert "Update primitives/src/v1.rs"
This reverts commit 11fce2785acd1de481ca57815b8e18400f09fd52.
* Revert "Update primitives/src/v1.rs"
This reverts commit d6439fed4f954360c89fb1e12b73954902c76a41.
* Revert "Return error from the runtime instead of printing it"
This reverts commit cb4b5c0830ac516a6d54b2c24197e9354f2b98cb.
* Revert "Fix test"
This reverts commit 0c5fa1b5566d4cd3c55a55d485e707165ce7a59e.
* Update runtime/parachains/src/runtime_api_impl/v1.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* fix: ensure candidate validation gets code based on occupied core assumption
* guide: runtime API for historical validation code
* add historical runtime API
* integrate into runtime API subsystem
* remove blocked TODO
* fix service build: enable notifications protocol only under real overseer
* Update node/subsystem/src/messages.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* fix compilation
Co-authored-by: Robert Habermeier <robert@Roberts-MacBook-Pro.local>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* UMP: Update the impl guide
* UMP: Incorporate XCM related changes into the guide
* UMP: Data structures and configuration
* UMP: Initial plumbing
* UMP: Data layout
* UMP: Acceptance criteria & enactment
* UMP: Fix dispatcher bug and add the test for it
* UMP: Constrain the maximum size of an UMP message
This commit addresses the UMP part of https://github.com/paritytech/polkadot/issues/1869
* Fix failing test due to misconfiguration
* Make the type of RelayDispatchQueueSize be more apparent in the guide
* Revert renaming `max_upward_queue_capacity` to `max_upward_queue_count`
* convert spaces to tabs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Update runtime/parachains/src/router/ump.rs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* start working on building the real overseer
Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.
* fill in AllSubsystems internal constructors
* replace fn make_metrics with Metrics::attempt_to_register
* update to account for #1740
* remove Metrics::register, rename Metrics::attempt_to_register
* add 'static bounds to real_overseer type params
* pass authority_discovery and network_service to real_overseer
It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.
* select a proper database configuration for the availability store db
* use subdirectory for av-store database path
* apply Basti's patch which avoids needing to parameterize everything on Block
* simplify path extraction
* get all tests to compile
* Fix Prometheus double-registry error
for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:
```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
err
}),
```
That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!
* authorities must have authority discovery, but not necessarily overseer handlers
* fix broken SpawnedSubsystem impls
detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.
This is not the final fix needed to get things working properly,
but it's a good start.
* use prometheus properly
Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.
We were using them wrong, though. This pattern was repeated in a
variety of places in the code:
```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```
The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.
This commit fixes that.
* get av-store subsystem to actually run properly and not die on first signal
* typo fix: incomming -> incoming
* don't disable authority discovery in test nodes
* Fix rococo-v1 missing session keys
* Update node/core/av-store/Cargo.toml
* try dummying out av-store on non-full-nodes
* overseer and subsystems are required only for full nodes
* Reduce the amount of warnings on browser target
* Fix two more warnings
* InclusionInherent should actually have an Inherent module on rococo
* Ancestry: don't return genesis' parent hash
* Update Cargo.lock
* fix broken test
* update test script: specify chainspec as script argument
* Apply suggestions from code review
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Update node/service/src/lib.rs
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* node/service/src/lib: Return error via ? operator
* post-merge blues
* add is_collator flag
* prevent occasional av-store test panic
* simplify fix; expand application
* run authority_discovery in Role::Discover when collating
* distinguish between proposer closed channel errors
* add IsCollator enum, remove is_collator CLI flag
* improve formatting
* remove nop loop
* Fix some stuff
* Adds test parachain adder collator
* Add sudo to Rococo, change session length to 30 seconds and some renaming
* Update to the latest changes on master
* Some fixes
* Fix compilation
* Update parachain/test-parachains/adder/collator/src/lib.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Review comments
* Downgrade transaction version
* Fixes
* MOARE
* Register notification protocols
* utils: remove unused error
* av-store: more resilient to some errors
* address review nits
* address more review nits
Co-authored-by: Peter Goodspeed-Niklaus <peter.r.goodspeedniklaus@gmail.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
Co-authored-by: Sergey Shulepov <s.pepyakin@gmail.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* remove pending TODO after the DMP impl merge
* DMP: Update the impl guide
* DMP: Incorporate XCM related changes into the guide
This is the DMP related part of https://github.com/paritytech/polkadot/issues/1702
* DMP: data structures and plumbing
* DMP: Implement DMP logic in the router module
DMP: Integrate DMP parts into the inclusion module
* DMP: Introduce the max size limit for the size of a downward message
* DMP: Runtime API for accessing inbound messages
* OCD
Small clean ups
* DMP: fix the naming of the error
* DMP: add caution about a non-existent recipient