* Rework `ConnectionsRequests`
Instead of implementing the `Stream` trait, this struct now provides a
function `next()`. This enables us to encode into the type system that
it will always return a value or block indefinitely.
* Review feedback
* guide: fix formatting for SessionInfo module
* primitives: SessionInfo type
* punt on approval keys
* ah, revert the type alias
* session info runtime module skeleton
* update the guide
* runtime/configuration: sync with the guide
* runtime/configuration: setters for newly added fields
* runtime/configuration: set codec indexes
* runtime/configuration: update test
* primitives: fix SessionInfo definition
* runtime/session_info: initial impl
* runtime/session_info: use initializer for session handling (wip)
* runtime/session_info: mock authority discovery trait
* guide: update the initializer's order
* runtime/session_info: tests skeleton
* runtime/session_info: store n_delay_tranches in Configuration
* runtime/session_info: punt on approval keys
* runtime/session_info: add some basic tests
* Update primitives/src/v1.rs
* small fixes
* remove codec index annotation on structs
* fix off-by-one error
* validator_discovery: accept a session index
* runtime: replace validator_discovery api with session_info
* Update runtime/parachains/src/session_info.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* runtime/session_info: add a comment about missing entries
* runtime/session_info: define the keys
* util: expose connect_to_past_session_validators
* util: allow session_info requests for jobs
* runtime-api: add mock test for session_info
* collator-protocol: add session_index to test state
* util: fix error message for runtime error
* fix compilation
* fix tests after merge with master
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Initial commit
* Remove unnecessary struct
* Some review nits
* Update node/network/pov-distribution/src/lib.rs
* Update parachain/test-parachains/adder/collator/tests/integration.rs
* Review nits
* notify_all_we_are_awaiting
* Both ways of peers connections should work the same
* Add mod-level docs to error.rs
* Avoid multiple connection requests at same parent
* Dont bail on errors
* FusedStream for ConnectionRequests
* Fix build after merge
* Improve error handling
* Remove whitespace formatting
* reexport prometheus-super for ease of use of other subsystems
* add some prometheus timers for collation generation subsystem
* add timing metrics to av-store
* add metrics to candidate backing
* add timing metric to bitfield signing
* add timing metrics to candidate selection
* add timing metrics to candidate-validation
* add timing metrics to chain-api
* add timing metrics to provisioner
* add timing metrics to runtime-api
* add timing metrics to availability-distribution
* add timing metrics to bitfield-distribution
* add timing metrics to collator protocol: collator side
* add timing metrics to collator protocol: validator side
* fix candidate validation test failures
* add timing metrics to pov distribution
* add timing metrics to statement-distribution
* use substrate_prometheus_endpoint prometheus reexport instead of prometheus_super
* don't include JOB_DELAY in bitfield-signing metrics
* give adder-collator ability to easily export its genesis-state and validation code
* wip: adder-collator pushbutton script
* don't attempt to register the adder-collator automatically
Instead, get these values with
```sh
target/release/adder-collator export-genesis-state
target/release/adder-collator export-genesis-wasm
```
And then register the parachain on https://polkadot.js.org/apps/?rpc=ws%3A%2F%2F127.0.0.1%3A9944#/explorer
To collect prometheus data, after running the script, create `prometheus.yml` per the instructions
at https://www.notion.so/paritytechnologies/Setting-up-Prometheus-locally-835cb3a9df7541a781c381006252b5ff
and then run:
```sh
docker run -v `pwd`/prometheus.yml:/etc/prometheus/prometheus.yml:z --network host prom/prometheus
```
Demonstrates that data makes it across to prometheus, though it is likely to be useful in the future
to tweak the buckets.
* Update parachain/test-parachains/adder/collator/src/cli.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* use the grandpa-pause parameter
* skip metrics in tracing instrumentation
* remove unnecessary grandpa_pause cli param
Co-authored-by: Andronik Ordian <write@reusable.software>
* drop in tracing to replace log
* add structured logging to trace messages
* add structured logging to debug messages
* add structured logging to info messages
* add structured logging to warn messages
* add structured logging to error messages
* normalize spacing and Display vs Debug
* add instrumentation to the various 'fn run'
* use explicit tracing module throughout
* fix availability distribution test
* don't double-print errors
* remove further redundancy from logs
* fix test errors
* fix more test errors
* remove unused kv_log_macro
* fix unused variable
* add tracing spans to collation generation
* add tracing spans to av-store
* add tracing spans to backing
* add tracing spans to bitfield-signing
* add tracing spans to candidate-selection
* add tracing spans to candidate-validation
* add tracing spans to chain-api
* add tracing spans to provisioner
* add tracing spans to runtime-api
* add tracing spans to availability-distribution
* add tracing spans to bitfield-distribution
* add tracing spans to network-bridge
* add tracing spans to collator-protocol
* add tracing spans to pov-distribution
* add tracing spans to statement-distribution
* add tracing spans to overseer
* cleanup
We need to distribute the PoV after we have seconded it. Other nodes
that will receive our `Secondded` statement and want to validate the
candidate another time will request this PoV from us.
* backing: extract log target
* bitfield-signing: extract log target
* utils: fix a typo
* provisioner: extract log target
* candidate selection: remove unused error variant
* bitfield-distribution: change the return type of run
* pov-distribution: extract log target
* collator-protocol: simplify runtime request
* collation-generation: do not exit early on error
* collation-generation: do not exit on double init
* collator-protocol: do not exit on errors and rename LOG_TARGET
* collator-protocol: a workaround for ununused imports warning
* Update node/network/bitfield-distribution/src/lib.rs
* collation-generation: elevate warn! to error!
* collator-protocol: fix imports
* post merge fix
* fix compilation
* service-new: cosmetic changes
* overseer: draft of prometheus metrics
* metrics: update active_leaves metrics
* metrics: extract into functions
* metrics: resolve XXX
* metrics: it's ugly, but it works
* Bump Substrate
* metrics: move a bunch of code around
* Bumb substrate again
* metrics: fix a warning
* fix a warning in runtime
* metrics: statements signed
* metrics: statements impl RegisterMetrics
* metrics: refactor Metrics trait
* metrics: add Metrics assoc type to JobTrait
* metrics: move Metrics trait to util
* metrics: fix overseer
* metrics: fix backing
* metrics: fix candidate validation
* metrics: derive Default
* metrics: docs
* metrics: add stubs for other subsystems
* metrics: add more stubs and fix compilation
* metrics: fix doctest
* metrics: move to subsystem
* metrics: fix candidate validation
* metrics: bitfield signing
* metrics: av store
* metrics: chain API
* metrics: runtime API
* metrics: stub for avad
* metrics: candidates seconded
* metrics: ok I gave up
* metrics: provisioner
* metrics: remove a clone by requiring Metrics: Sync
* metrics: YAGNI
* metrics: remove another TODO
* metrics: for later
* metrics: add parachain_ prefix
* metrics: s/signed_statement/signed_statements
* utils: add a comment for job metrics
* metrics: address review comments
* metrics: oops
* metrics: make sure to save files before commit 😅
* use _total suffix for requests metrics
Co-authored-by: Max Inden <mail@max-inden.de>
* metrics: add tests for overseer
* update Cargo.lock
* overseer: add a test for CollationGeneration
* collation-generation: impl metrics
* collation-generation: use kebab-case for name
* collation-generation: add a constructor
Co-authored-by: Gav Wood <gavin@parity.io>
Co-authored-by: Ashley Ruglys <ashley.ruglys@gmail.com>
Co-authored-by: Max Inden <mail@max-inden.de>
* update networking types
* port over overseer-protocol message types
* Add the collation protocol to network bridge
* message sending
* stub for ConnectToValidators
* add some helper traits and methods to protocol types
* add collator protocol message
* leaves-updating
* peer connection and disconnection
* add utilities for dispatching multiple events
* implement message handling
* add an observedrole enum with equality and no sentry nodes
* derive partial-eq on network bridge event
* add PartialEq impls for network message types
* add Into implementation for observedrole
* port over existing network bridge tests
* add some more tests
* port bitfield distribution
* port over bitfield distribution tests
* add codec indices
* port PoV distribution
* port over PoV distribution tests
* port over statement distribution
* port over statement distribution tests
* update overseer and service-new
* address review comments
* port availability distribution
* port over availability distribution tests
* polkadot-subsystem: update runtime API message types
* update all networking subsystems to use fallible runtime APIs
* fix bitfield-signing and make it use new runtime APIs
* port candidate-backing to handle runtime API errors and new types
* remove old runtime API messages
* remove unused imports
* fix grumbles
* fix backing tests
* Initial commit
* WIP
* Make atomic transactions
* Remove pruning code
* Fix build and add a Nop to bridge
* Fixes from review
* Move config struct around for clarity
* Rename constructor and warn on missing docs
* Fix a test and rename a message
* Fix some more reviews
* Obviously failed to rebase cleanly
* add ActiveLeavesUpdate, remove StartWork, StopWork
* replace StartWork, StopWork in subsystem crate tests
* mechanically update OverseerSignal in other modules
* convert overseer to take advantage of new multi-hash update abilities
Note: this does not yet convert the tests; some of the tests now freeze:
test tests::overseer_start_stop_works ... test tests::overseer_start_stop_works has been running for over 60 seconds
test tests::overseer_finalize_works ... test tests::overseer_finalize_works has been running for over 60 seconds
* fix broken overseer tests
* manually impl PartialEq for ActiveLeavesUpdate, rm trait Equivalent
This cleans up the code a bit and makes it easier in the future to
do the right thing when comparing ALUs.
* use target in all network bridge logging
* reduce spamming of and
* get conclude signal working properly; don't allocate a vector
* wip: add test suite / example / explanation for using utility subsystem
Unfortunately, the test fails right now for reasons which seem
very odd. Just have to keep poking at it.
* explicitly import everything
* fix subsystem-util test
The root problem here was two-fold:
- there was a circular dependency from subsystem -> test-helpers/subsystem ->
subsystem
- cfg(test) doesn't propagate between crates
The solution: move the subsystem test helpers into a sub-module
within subsystem. Publicly export them from the previous location
so no other code breaks.
Doing this has an additional benefit: it ensures that no production
code can ever accidentally use the subsystem helpers, as they are compile-
gated on cfg(test).
* fully commit to moving test helpers into a subsystem module
* add some more tests
* get rid of log tests in favor of real error forwarding
It's not obvious whether we'll ever really want to chase down
these errors outside a testing context, but having the capability
won't hurt.
* fix issue which caused test to hang on osx
* only require that job errors are PartialEq when testing
also fix polkadot-node-core-backing tests
* get rid of any notion of partialeq
* rethink testing
Combine tests of starting and stopping job: leaving a test executor
with a job running was pretty clearly the cause of the sometimes-hang.
Also, add a timeout so tests _can't_ hang anymore; they just fail
after a while.
* rename fwd_errors -> forward_errors
* warn on error propagation failure
* fix unused import leftover from merge
* derive eq for subsystemerror
* create a README on Runtime APIs
* add ParaId type
* write up runtime APIs
* more preamble
* rename
* rejig runtime APIs
* add occupied_since to `BlockNumber`
* skeleton crate for runtime API subsystem
* improve group_for_core
* improve docs on availability cores runtime API
* guide: freed -> free
* add primitives for runtime APIs
* create a v1 ParachainHost API trait
* guide: make validation code return `Option`al.
* skeleton runtime API helpers
* make parachain-host runtime-generic
* skeleton for most runtime API implementation functions
* guide: add runtime API helper methods
* implement new helpers of the inclusion module
* guide: remove retries check, as it is unneeded
* implement helpers for scheduler module for Runtime APIs
* clean up `validator_groups` implementation
* implement next_rotation_at and last_rotation_at
* guide: more helpers on GroupRotationInfo
* almost finish implementing runtime APIs
* add explicit block parameter to runtime API fns
* guide: generalize number parameter
* guide: add group_responsible to occupied-core
* update primitives due to guide changes
* finishing touches on runtime API implementation; squash warnings
* break out runtime API impl to separate file
* add tests for next_up logic
* test group rotation info
* point to filed TODO
* remove unused TODO [now]
* indentation
* guide: para -> para_id
* rename para field to para_id for core meta
* remove reference to outdated AvailabilityCores type
* add an event in `inclusion` for candidates being included or timing out
* guide: candidate events
* guide: adjust language
* Candidate events type from guide and adjust inclusion event
* implement `candidate_events` runtime API
* fix runtime test compilation
* max -> min
* fix typos
* guide: add `RuntimeAPIRequest::CandidateEvents`
* create a v1 primitives module
* Improve guide on availability types
* punctuate
* new parachains runtime uses new primitives
* tests of new runtime now use new primitives
* add ErasureChunk to guide
* export erasure chunk from v1 primitives
* subsystem crate uses v1 primitives
* node-primitives uses new v1 primitives
* port overseer to new primitives
* new-proposer uses v1 primitives (no ParachainHost anymore)
* fix no-std compilation for primitives
* service-new uses v1 primitives
* network-bridge uses new primitives
* statement distribution uses v1 primitives
* PoV distribution uses v1 primitives; add PoV::hash fn
* move parachain to v0
* remove inclusion_inherent module and place into v1
* remove everything from primitives crate root
* remove some unused old types from v0 primitives
* point everything else at primitives::v0
* squanch some warns up
* add RuntimeDebug import to no-std as well
* port over statement-table and validation
* fix final errors in validation and node-primitives
* add dummy Ord impl to committed candidate receipt
* guide: update CandidateValidationMessage
* add primitive for validationoutputs
* expand CandidateValidationMessage further
* bikeshed
* add some impls to omitted-validation-data and available-data
* expand CandidateValidationMessage
* make erasure-coding generic over v1/v0
* update usages of erasure-coding
* implement commitments.hash()
* use Arc<Pov> for CandidateValidation
* improve new erasure-coding method names
* fix up candidate backing
* update docs a bit
* fix most tests and add short-circuiting to make_pov_available
* fix remainder of candidate backing tests
* squanching warns
* squanch it up
* some fallout
* overseer fallout
* free from polkadot-test-service hell
* introduce candidatedescriptor type
* add PoVDistribution message type
* loosen bound on PoV Distribution to account for equivocations
* re-export some types from the messages module
* begin PoV Distribution subsystem
* remove redundant index from PoV distribution
* define state machine for pov distribution
* handle overseer signals
* set up control flow
* remove `ValidatorStatement` section
* implement PoV fetching
* implement distribution logic
* add missing `
* implement some network bridge event handlers
* stub for message processing, handle our view change
* control flow for handling messages
* handle `awaiting` message
* handle any incoming PoVs and redistribute
* actually provide a subsystem implementation
* remove set-builder notation
* begin testing PoV distribution
* test that we send awaiting messages only to peers with same view
* ensure we distribute awaited PoVs to peers on view changes
* test that peers can complete fetch and are rewarded
* test some reporting logic
* ensure peer is reported for flooding
* test punishing peers diverging from awaited protocol
* test that we eagerly complete peers' awaited PoVs based on what we receive
* test that we prune the awaited set after receiving
* expand pov-distribution in guide to match a change I made
* remove unneeded import