* use proper descriptive generic type names
* cleanup
* Table stores a list of detected misbehavior per authority
* add Table::drain_misbehaviors_for
* WIP: unify misbehavior types; report multiple misbehaviors per validator
Code checks, but tests don't yet pass.
* update drain_misbehaviors: return authority id as well as specific misbehavior
* enable unchecked construction of Signed structs in tests
* remove test-features feature & unnecessary generic
* fix backing tests
This took a while to figure out, because where we'd previously been
passing around `SignedFullStatement`s, we now needed to construct
those on the fly within the test, to take advantage of the signature-
checking in the constructor. That, in turn, necessitated changing the
iterable type of `drain_misbehaviors` to return the validator index,
and passing that validator index along within the misbehavior report.
Once that was sorted, however, it became relatively straightforward:
just needed to add appropriate methods to deconstruct the misbehavior
reports, and then we could construct the signed statements directly.
* fix bad merge
* collation-generation: use persisted validation data
* node: remove FullValidationData API
* runtime: remove FullValidationData API
* backing tests: use persisted validation data
* FullCandidateReceipt: use persisted validation data
This is not a big change since this type is not used anywhere
* Remove ValidationData and TransientValidationData
Also update the guide
* Improve logging to make debugging parachains easier
This pr should make debugging parachains easier, by printing more
information about the validation process.
* 🤦
* moare
* Convert to debug
* Add one Jaeger span per relay parent
This adds one common Jaeger span per relay parent, instead of always
creating new, unrelated spans for the same relay parent. This should
improve the UI view, because subsystems are now grouped below one
common span.
* Fix doc tests
* Replace `PerLeaveSpan` with `PerLeafSpan`
* More renaming
* Moare
* Update node/subsystem/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Skip the spans
* Increase `spec_version`
Co-authored-by: Andronik Ordian <write@reusable.software>
* Cont.: Implement the state root obtaining during inclusion
During inclusion now we obtain the storage root by passing it through
the inclusion_inherent.
* Fix tests
* Bump rococo spec version
* Move the parent header to the end of the inclusion inherent
When the parent header is at the beginning, it shifts the other two
fields, so that a previous version won't be able to decode the
inherent. If we put the parent header at the end, the other two fields
stay at their positions, which makes it possible to decode with the
previous version.
That allows us to upgrade the rococo runtime without needing a
simultaneous upgrade of nodes and runtime, or a restart of the network.
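The compatibility argument above can be sketched with a toy encoding.
This is only an illustration: the struct names, field types and the
fixed-width little-endian layout are hypothetical stand-ins, not the
real inclusion inherent or its codec. It assumes that fields are
encoded in declaration order and that an old decoder reads only the
fields it knows about.

```rust
use std::convert::TryInto;

// Hypothetical stand-in for the inherent data as an old node sees it.
#[derive(Debug, PartialEq)]
struct OldInherent {
    bitfields: u32,
    candidates: u32,
}

// Hypothetical stand-in for the inherent data after the change.
#[derive(Debug, PartialEq)]
struct NewInherent {
    bitfields: u32,
    candidates: u32,
    // Appended at the end, so the earlier fields keep their offsets.
    parent_header: u64,
}

// Toy encoding: fields are written in declaration order, little-endian.
fn encode_new(value: &NewInherent) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&value.bitfields.to_le_bytes());
    out.extend_from_slice(&value.candidates.to_le_bytes());
    out.extend_from_slice(&value.parent_header.to_le_bytes());
    out
}

// An "old" decoder: reads the two fields it knows, ignores trailing bytes.
fn decode_old(bytes: &[u8]) -> Option<OldInherent> {
    let bitfields = u32::from_le_bytes(bytes.get(0..4)?.try_into().ok()?);
    let candidates = u32::from_le_bytes(bytes.get(4..8)?.try_into().ok()?);
    Some(OldInherent { bitfields, candidates })
}
```

Had `parent_header` been written first, `decode_old` would have read
header bytes as `bitfields` and `candidates`.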
* Squash a stray tab
* refactor View to include finalized_number
* guide: update the NetworkBridge on BlockFinalized
* av-store: fix the tests
* actually fix tests
* grumbles
* ignore macro doctest
* use Hash::repeat_bytes more consistently
* broadcast empty leaves updates as well
* fix issuing view updates on empty leaves updates
* guide: non-semantic changes
* guide: update per the issue description
* GetBackedCandidates operates on multiple hashes now
* GetBackedCandidates still needs a relay parent
* implement changes specified in guide
* distinguish between various occasions for canceled oneshots
* add tracing info to getbackedcandidates
* REVERT ME: add tracing messages for GetBackedCandidates
Note that these messages are only sometimes actually passed on to the
candidate backing subsystem, with the consequence that it is
unexpectedly frequent that the provisioner fails to create its
provisionable data.
* REVERT ME: more tracing logging
* REVERT ME: log when CandidateBackingJob receives any message at all
* REVERT ME: log when send_msg sends a message to a job
* fix candidate-backing tests
* streamline GetBackedCandidates
This uses table.attested_candidate instead of table.get_candidate, because
it's not obvious how to get a BackedCandidate from just a
CommittedCandidateReceipt.
* REVERT ME: more logging tracing job lifespans
* promote warning about job premature demise
* don't terminate CandidateBackingJob::run_loop in event of failure to process message
* Revert "REVERT ME: more logging tracing job lifespans"
This reverts commit 7365f2fb3dec988d95cfcd317eba75587fe7fd16.
* Revert "REVERT ME: log when send_msg sends a message to a job"
This reverts commit 58e46aad038e6517d6d56390c8be65b046a21884.
* Revert "REVERT ME: log when CandidateBackingJob receives any message at all"
This reverts commit 0d6f38413c7c66b5e9e81dabc587906fa9f82656.
* Revert "REVERT ME: more tracing logging"
This reverts commit 675fd2628e84d1596965280e7314155ef21b28e6.
* Revert "REVERT ME: add tracing messages for GetBackedCandidates"
This reverts commit e09e156493430b33b6c8ab4b5cedb3f2f91afd51.
* formatting
* add logging message to CandidateBackingJob::run_loop start
* REVERT ME: add tracing to candidate-backing job creation
* run candidatebacking loop even if no assignment
* use unique error variants for each canceled oneshot
* Revert "REVERT ME: add tracing to candidate-backing job creation"
This reverts commit 8ce5f4f0bd7186dade134b118751480f72ea1fd6.
* try_runtime_api more to reduce silent exits
* add sanity check that returned backed candidates preserve ordering
* remove redundant err attribute
* refactor some functions to not rely on `self`
* factor out common elements of seconding and attesting
* Add Spawn to backing FromJob
* do candidate validation in background
* tests
* address grumbles
* Simplify subsystem jobs
This pr simplifies the subsystem jobs interface. Instead of requiring an
extra message that is used to signal that a job should be ended, a job
now ends when the receiver returns `None`. Besides that it changes the
interface to enforce that messages to a job provide a relay parent.
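A minimal sketch of the new termination rule, assuming a plain
`std::sync::mpsc` channel in place of the async channel the subsystems
actually use. `ToJob`, `run_job` and `job_demo` are hypothetical names
for illustration only. The job stops when the incoming stream yields
`None`, which happens once every sender has been dropped.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

// Hypothetical message type; every message carries a relay parent.
#[derive(Debug)]
struct ToJob {
    relay_parent: [u8; 32],
    payload: u32,
}

// Runs until the incoming stream yields `None`, i.e. all senders are gone.
// No dedicated "stop" message is needed.
fn run_job(receiver: Receiver<ToJob>) -> Vec<u32> {
    let mut handled = Vec::new();
    let mut incoming = receiver.iter();
    while let Some(msg) = incoming.next() {
        // A real job would dispatch on the relay parent; here it is unused.
        let _ = msg.relay_parent;
        handled.push(msg.payload);
    }
    handled
}

// Spawns the job, sends three messages, then ends it by dropping the sender.
fn job_demo() -> Vec<u32> {
    let (sender, receiver) = channel();
    let job = thread::spawn(move || run_job(receiver));
    for payload in 1..=3u32 {
        sender
            .send(ToJob { relay_parent: [0u8; 32], payload })
            .expect("job is still running");
    }
    drop(sender);
    job.join().expect("job thread does not panic")
}
```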
* Drop ToJobTrait
* Remove FromJob
We always convert this message to FromJobCommand anyway.
* reexport prometheus-super for ease of use of other subsystems
* add some prometheus timers for collation generation subsystem
* add timing metrics to av-store
* add metrics to candidate backing
* add timing metric to bitfield signing
* add timing metrics to candidate selection
* add timing metrics to candidate-validation
* add timing metrics to chain-api
* add timing metrics to provisioner
* add timing metrics to runtime-api
* add timing metrics to availability-distribution
* add timing metrics to bitfield-distribution
* add timing metrics to collator protocol: collator side
* add timing metrics to collator protocol: validator side
* fix candidate validation test failures
* add timing metrics to pov distribution
* add timing metrics to statement-distribution
* use substrate_prometheus_endpoint prometheus reexport instead of prometheus_super
* don't include JOB_DELAY in bitfield-signing metrics
* give adder-collator ability to easily export its genesis-state and validation code
* wip: adder-collator pushbutton script
* don't attempt to register the adder-collator automatically
Instead, get these values with
```sh
target/release/adder-collator export-genesis-state
target/release/adder-collator export-genesis-wasm
```
And then register the parachain on https://polkadot.js.org/apps/?rpc=ws%3A%2F%2F127.0.0.1%3A9944#/explorer
To collect prometheus data, after running the script, create `prometheus.yml` per the instructions
at https://www.notion.so/paritytechnologies/Setting-up-Prometheus-locally-835cb3a9df7541a781c381006252b5ff
and then run:
```sh
docker run -v `pwd`/prometheus.yml:/etc/prometheus/prometheus.yml:z --network host prom/prometheus
```
Demonstrates that data makes it across to prometheus, though it is likely to be useful in the future
to tweak the buckets.
* Update parachain/test-parachains/adder/collator/src/cli.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* use the grandpa-pause parameter
* skip metrics in tracing instrumentation
* remove unnecessary grandpa_pause cli param
Co-authored-by: Andronik Ordian <write@reusable.software>
* drop in tracing to replace log
* add structured logging to trace messages
* add structured logging to debug messages
* add structured logging to info messages
* add structured logging to warn messages
* add structured logging to error messages
* normalize spacing and Display vs Debug
* add instrumentation to the various 'fn run'
* use explicit tracing module throughout
* fix availability distribution test
* don't double-print errors
* remove further redundancy from logs
* fix test errors
* fix more test errors
* remove unused kv_log_macro
* fix unused variable
* add tracing spans to collation generation
* add tracing spans to av-store
* add tracing spans to backing
* add tracing spans to bitfield-signing
* add tracing spans to candidate-selection
* add tracing spans to candidate-validation
* add tracing spans to chain-api
* add tracing spans to provisioner
* add tracing spans to runtime-api
* add tracing spans to availability-distribution
* add tracing spans to bitfield-distribution
* add tracing spans to network-bridge
* add tracing spans to collator-protocol
* add tracing spans to pov-distribution
* add tracing spans to statement-distribution
* add tracing spans to overseer
* cleanup
* HRMP: Update the impl guide
* HRMP: Incorporate the channel notifications into the guide
* HRMP: Renaming in the impl guide
* HRMP: Constrain the maximum number of HRMP messages per candidate
This commit addresses the HRMP part of https://github.com/paritytech/polkadot/issues/1869
* XCM: Introduce HRMP related message types
* HRMP: Data structures and plumbing
* HRMP: Configuration
* HRMP: Data layout
* HRMP: Acceptance & Enactment
* HRMP: Test base logic
* Update adder collator
* HRMP: Runtime API for accessing inbound messages
Also, removing some redundant fully-qualified names.
* HRMP: Add diagnostic logging in acceptance criteria
* HRMP: Additional tests
* Self-review fixes
* save test refactorings for the next time
* Missed a return statement.
* a formatting blip
* Add missing logic for appending HRMP digests
* Remove the channel contents vectors which became empty
* Tighten HRMP channel digests invariants.
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Remove a note about sorting for channel id
* Add missing rustdocs to the configuration
* Clarify and update the invariant for HrmpChannelDigests
* Make the onboarding invariant less sloppy
Namely, introduce `Paras::is_valid_para` (in fact, it already is present
in the implementation) and hook up the invariant to that.
Note that this says "within a session" because I don't want to make it
super strict on the session boundary. The logic on the session boundary
should be extremely careful.
* Make `CandidateCheckContext` use T::BlockNumber for hrmp_watermark
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
We need to distribute the PoV after we have seconded it. Other nodes
that will receive our `Seconded` statement and want to validate the
candidate another time will request this PoV from us.
* Make `CandidateHash` a real type
This pr adds a new type `CandidateHash` that is used instead of the
opaque `Hash` type. This helps to ensure on the type system level that
we are passing the correct types.
This pr also fixes wrong usage of `relay_parent` as `candidate_hash`
when communicating with the av storage.
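A minimal sketch of the newtype idea, not the actual definition in
core-primitives. `RawHash`, `AvStore` and `candidate_hash_demo` are
hypothetical names, and a byte array stands in for the opaque `Hash`.
The point is that a relay parent can no longer be passed where a
candidate hash is expected.

```rust
use std::collections::HashMap;

// Stand-in for the opaque 32-byte hash type.
type RawHash = [u8; 32];

// Newtype: a candidate hash is no longer interchangeable with any other hash.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct CandidateHash(RawHash);

// Hypothetical availability store that is keyed by candidate hash only.
struct AvStore {
    data: HashMap<CandidateHash, Vec<u8>>,
}

impl AvStore {
    fn new() -> Self {
        AvStore { data: HashMap::new() }
    }

    fn store(&mut self, candidate_hash: CandidateHash, data: Vec<u8>) {
        self.data.insert(candidate_hash, data);
    }

    fn query(&self, candidate_hash: CandidateHash) -> Option<&Vec<u8>> {
        self.data.get(&candidate_hash)
    }
}

fn candidate_hash_demo() -> Option<Vec<u8>> {
    let relay_parent: RawHash = [1u8; 32];
    let candidate_hash = CandidateHash([2u8; 32]);
    let mut store = AvStore::new();
    store.store(candidate_hash, vec![0xAB]);
    // `store.query(relay_parent)` would be rejected by the compiler,
    // because a `RawHash` is not a `CandidateHash`.
    let _ = relay_parent;
    store.query(candidate_hash).cloned()
}
```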
* Update core-primitives/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Wrap the lines
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* backing: extract log target
* bitfield-signing: extract log target
* utils: fix a typo
* provisioner: extract log target
* candidate selection: remove unused error variant
* bitfield-distribution: change the return type of run
* pov-distribution: extract log target
* collator-protocol: simplify runtime request
* collation-generation: do not exit early on error
* collation-generation: do not exit on double init
* collator-protocol: do not exit on errors and rename LOG_TARGET
* collator-protocol: a workaround for unused imports warning
* Update node/network/bitfield-distribution/src/lib.rs
* collation-generation: elevate warn! to error!
* collator-protocol: fix imports
* post merge fix
* fix compilation
* Moare fixes for parachains
- Sending data to a job should always contain a relay parent. Done this
for the provisioner
- Fixed the `select_availability_bitfields` function. It was assuming we
have one core per validator, while we only have one core per parachain.
- Drive by async "rewrite" in proposer
* Make tests compile
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Parachain improvements
- Set the parachains configuration in Rococo genesis
- Don't stop the overseer when a subsystem job is stopped
- Several small code changes
* Remove unused functionality
* Return error from the runtime instead of printing it
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Fix test
* Revert "Update primitives/src/v1.rs"
This reverts commit 11fce2785acd1de481ca57815b8e18400f09fd52.
* Revert "Update primitives/src/v1.rs"
This reverts commit d6439fed4f954360c89fb1e12b73954902c76a41.
* Revert "Return error from the runtime instead of printing it"
This reverts commit cb4b5c0830ac516a6d54b2c24197e9354f2b98cb.
* Revert "Fix test"
This reverts commit 0c5fa1b5566d4cd3c55a55d485e707165ce7a59e.
* Update runtime/parachains/src/runtime_api_impl/v1.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* DMP: data structures and plumbing
* DMP: Implement DMP logic in the router module
DMP: Integrate DMP parts into the inclusion module
* DMP: Introduce the max size limit for the size of a downward message
* DMP: Runtime API for accessing inbound messages
* OCD
Small clean ups
* DMP: fix the naming of the error
* DMP: add caution about a non-existent recipient
* annoying whitespaces
* update guide
Add `CheckValidationOutputs` runtime api and also change the
candidate-validation stuff
* promote ValidationOutputs to global primitives
i.e. move it from node specific primitives to global v1 primitives. This
will be needed when we share it later in the runtime inclusion module
* refactor acceptance checks in the inclusion module
factor out the common code to share it during the block inclusion and
for the forthcoming CheckValidationOutputs runtime api.
Also note that the acceptance criteria were updated to incorporate checks
that exist now in candidate-validation
* plumb the runtime api outside
* extract validation_data from ValidationOutputs
* use runtime-api to check validation outputs
apart from that refactor, update docs and tidy a bit
* Update the maximum code size
This is to fix a test that performs an upgrade.
* update primitives
* correct parent_head field
* make hrmp field pub
* refactor validation data: runtime
* refactor validation data: messages
* add arguments to full_validation_data runtime API
* port runtime API
* mostly port over candidate validation
* remove some parameters from ValidationParams
* guide: update candidate validation
* update candidate outputs
* update ValidationOutputs in primitives
* port over candidate validation
* add a new test for no-transient behavior
* update util runtime API wrappers
* candidate backing
* fix missing imports
* change some fields of validation data around
* runtime API impl
* update candidate validation
* fix backing tests
* grumbles from review
* fix av-store tests
* fix some more crates
* fix provisioner tests
* fix availability distribution tests
* port collation-generation to new validation data
* fix overseer tests
* Update roadmap/implementers-guide/src/node/utility/candidate-validation.md
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* service-new: cosmetic changes
* overseer: draft of prometheus metrics
* metrics: update active_leaves metrics
* metrics: extract into functions
* metrics: resolve XXX
* metrics: it's ugly, but it works
* Bump Substrate
* metrics: move a bunch of code around
* Bump Substrate again
* metrics: fix a warning
* fix a warning in runtime
* metrics: statements signed
* metrics: statements impl RegisterMetrics
* metrics: refactor Metrics trait
* metrics: add Metrics assoc type to JobTrait
* metrics: move Metrics trait to util
* metrics: fix overseer
* metrics: fix backing
* metrics: fix candidate validation
* metrics: derive Default
* metrics: docs
* metrics: add stubs for other subsystems
* metrics: add more stubs and fix compilation
* metrics: fix doctest
* metrics: move to subsystem
* metrics: fix candidate validation
* metrics: bitfield signing
* metrics: av store
* metrics: chain API
* metrics: runtime API
* metrics: stub for avad
* metrics: candidates seconded
* metrics: ok I gave up
* metrics: provisioner
* metrics: remove a clone by requiring Metrics: Sync
* metrics: YAGNI
* metrics: remove another TODO
* metrics: for later
* metrics: add parachain_ prefix
* metrics: s/signed_statement/signed_statements
* utils: add a comment for job metrics
* metrics: address review comments
* metrics: oops
* metrics: make sure to save files before commit 😅
* use _total suffix for requests metrics
Co-authored-by: Max Inden <mail@max-inden.de>
* metrics: add tests for overseer
* update Cargo.lock
* overseer: add a test for CollationGeneration
* collation-generation: impl metrics
* collation-generation: use kebab-case for name
* collation-generation: add a constructor
Co-authored-by: Gav Wood <gavin@parity.io>
Co-authored-by: Ashley Ruglys <ashley.ruglys@gmail.com>
Co-authored-by: Max Inden <mail@max-inden.de>
* sketch out provisioner basics
* handle provisionable data
* stub out select_inherent_data
* split runtime APIs into sub-chapters to improve linkability
* explain SignedAvailabilityBitfield semantics
* add internal link to further documentation
* some more work figuring out how the provisioner can do its thing
* fix broken link
* don't import enum variants where it's one layer deep
* make request_availability_cores a free fn in util
* document more precisely what should happen on block production
* finish first-draft implementation of provisioner
* start working on the full and proper backed candidate selection rule
* Pass number of block under construction via RequestInherentData
* Revert "Pass number of block under construction via RequestInherentData"
This reverts commit 850fe62cc0dfb04252580c21a985962000e693c8.
That initially looked like the better approach--it spent the time
budget for fetching the block number in the proposer, instead of
the provisioner, and that felt more appropriate--but it turns out
not to be obvious how to get the block number of the block under
construction from within the proposer. The Chain API may be less
ideal, but it should be easier to implement.
* wip: get the block under production from the Chain API
* add ChainApiMessage to AllMessages
* don't break the run loop if a provisionable data channel closes
* clone only those backed candidates which are coherent
* propagate chain_api subsystem through various locations
* add delegated_subsystem! macro to ease delegating subsystems
Unfortunately, it doesn't work right:
```
error[E0446]: private type `CandidateBackingJob` in public interface
   --> node/core/backing/src/lib.rs:775:1
    |
86  | struct CandidateBackingJob {
    | - `CandidateBackingJob` declared as private
...
775 | delegated_subsystem!(CandidateBackingJob as CandidateBackingSubsystem);
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't leak private type
```
I'm not sure precisely what's going wrong, here; I suspect the problem is
the use of `<$job as JobTrait>::RunArgs` and `::ToJob`; the failure would be
that it's not reifying the types to verify that the actual types are public,
but instead referring to them via `CandidateBackingJob`, which is in fact private;
that privacy is the point.
Going to see if I can generic my way out of this, but we may be headed for a
quick revert here.
* fix delegated_subsystem
The invocation is a bit more verbose than I'd prefer, but it's also
more explicit about what types need to be public. I'll take it as a win.
* add provisioning subsystem; reduce public interface of provisioner
* deny missing docs in provisioner
* refactor core selection per code review suggestion
This is twice as much code when measured by line, but IMO it is
in fact somewhat clearer to read, so overall a win.
Also adds an improved rule for selecting availability bitfields,
which (unlike the previous implementation) guarantees that the
appropriate postconditions hold there.
* fix bad merge double-declaration
* update guide with (hopefully) complete provisioner candidate selection procedure
* clarify candidate selection algorithm
* Revert "clarify candidate selection algorithm"
This reverts commit c68a02ac9cf42b3a4a28eb197d38633a40d0e3e6.
* clarify candidate selection algorithm
* update provisioner to implement candidate selection per the guide
* add test that no more than one bitfield is selected per validator
* add test that each selected bitfield corresponds to an occupied core
* add test that more set bits win conflicts
* add macro for specializing runtime requests; specialize all runtime requests
* add test harness for select_candidates tests
* add first real select_candidates test, fix test_harness
* add mock overseer and test that success is possible
* add test that the candidate selection algorithm picks the right ones
* make candidate selection test somewhat more stringent
* polkadot-subsystem: update runtime API message types
* update all networking subsystems to use fallible runtime APIs
* fix bitfield-signing and make it use new runtime APIs
* port candidate-backing to handle runtime API errors and new types
* remove old runtime API messages
* remove unused imports
* fix grumbles
* fix backing tests
* Initial commit
* WIP
* Make atomic transactions
* Remove pruning code
* Fix build and add a Nop to bridge
* Fixes from review
* Move config struct around for clarity
* Rename constructor and warn on missing docs
* Fix a test and rename a message
* Fix some more reviews
* Obviously failed to rebase cleanly
* add ActiveLeavesUpdate, remove StartWork, StopWork
* replace StartWork, StopWork in subsystem crate tests
* mechanically update OverseerSignal in other modules
* convert overseer to take advantage of new multi-hash update abilities
Note: this does not yet convert the tests; some of the tests now freeze:
test tests::overseer_start_stop_works ... test tests::overseer_start_stop_works has been running for over 60 seconds
test tests::overseer_finalize_works ... test tests::overseer_finalize_works has been running for over 60 seconds
* fix broken overseer tests
* manually impl PartialEq for ActiveLeavesUpdate, rm trait Equivalent
This cleans up the code a bit and makes it easier in the future to
do the right thing when comparing ALUs.
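A rough sketch of what a manual `PartialEq` could look like. The commit
message does not say how ALUs should compare, so this assumes the
intent is an order-insensitive comparison. The field names `activated`
and `deactivated`, the `RawHash` alias and the `same_elements` helper
are likewise assumptions made for illustration.

```rust
// Stand-in for the real hash type.
type RawHash = [u8; 32];

// Hypothetical shape of the update: leaves that became active or inactive.
#[derive(Clone, Debug, Default)]
struct ActiveLeavesUpdate {
    activated: Vec<RawHash>,
    deactivated: Vec<RawHash>,
}

// Order-insensitive comparison; assumes neither list contains duplicates.
fn same_elements(a: &[RawHash], b: &[RawHash]) -> bool {
    a.len() == b.len() && a.iter().all(|hash| b.contains(hash))
}

impl PartialEq for ActiveLeavesUpdate {
    // Two updates are equal when they name the same leaves, in any order.
    fn eq(&self, other: &Self) -> bool {
        same_elements(&self.activated, &other.activated)
            && same_elements(&self.deactivated, &other.deactivated)
    }
}
```

With this impl, tests can compare updates with `==` directly, without a
separate `Equivalent` trait.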
* use target in all network bridge logging
* reduce spamming of and