Close#2992
Breaking changes:
- rpc server grafana metric `substrate_rpc_requests_started` is removed
(not possible to implement anymore)
- rpc server grafana metric `substrate_rpc_requests_finished` is removed
(not possible to implement anymore)
- rpc server ws ping/pong not ACK:ed within 30 seconds more than three
times then the connection will be closed
Added
- rpc server grafana metric `substrate_rpc_sessions_time` is added to
get the duration for each websocket session
Initial implementation for the plan discussed here: https://github.com/paritytech/polkadot-sdk/issues/701
Built on top of https://github.com/paritytech/polkadot-sdk/pull/1178
v0: https://github.com/paritytech/polkadot/pull/7554,
## Overall idea
When approval-voting checks a candidate and is ready to advertise the
approval, defer it in a per-relay chain block until we either have
MAX_APPROVAL_COALESCE_COUNT candidates to sign or a candidate has stayed
MAX_APPROVALS_COALESCE_TICKS in the queue, in both cases we sign what
candidates we have available.
This should allow us to reduce the number of approvals messages we have
to create/send/verify. The parameters are configurable, so we should
find some values that balance:
- Security of the network: Delaying broadcasting of an approval
shouldn't but the finality at risk and to make sure that never happens
we won't delay sending a vote if we are past 2/3 from the no-show time.
- Scalability of the network: MAX_APPROVAL_COALESCE_COUNT = 1 &
MAX_APPROVALS_COALESCE_TICKS =0, is what we have now and we know from
the measurements we did on versi, it bottlenecks
approval-distribution/approval-voting when increase significantly the
number of validators and parachains
- Block storage: In case of disputes we have to import this votes on
chain and that increase the necessary storage with
MAX_APPROVAL_COALESCE_COUNT * CandidateHash per vote. Given that
disputes are not the normal way of the network functioning and we will
limit MAX_APPROVAL_COALESCE_COUNT in the single digits numbers, this
should be good enough. Alternatively, we could try to create a better
way to store this on-chain through indirection, if that's needed.
## Other fixes:
- Fixed the fact that we were sending random assignments to
non-validators, that was wrong because those won't do anything with it
and they won't gossip it either because they do not have a grid topology
set, so we would waste the random assignments.
- Added metrics to be able to debug potential no-shows and
mis-processing of approvals/assignments.
## TODO:
- [x] Get feedback, that this is moving in the right direction. @ordian
@sandreim @eskimor @burdges, let me know what you think.
- [x] More and more testing.
- [x] Test in versi.
- [x] Make MAX_APPROVAL_COALESCE_COUNT &
MAX_APPROVAL_COALESCE_WAIT_MILLIS a parachain host configuration.
- [x] Make sure the backwards compatibility works correctly
- [x] Make sure this direction is compatible with other streams of work:
https://github.com/paritytech/polkadot-sdk/issues/635 &
https://github.com/paritytech/polkadot-sdk/issues/742
- [x] Final versi burn-in before merging
---------
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
This moves the macro related re-exports to `__private` to make it more
obvious for downstream users that they are using an internal api.
---------
Co-authored-by: command-bot <>
Adds a `NodeFeatures` bitfield value to the runtime `HostConfiguration`,
with the purpose of coordinating the enabling of node-side features,
such as: https://github.com/paritytech/polkadot-sdk/issues/628 and
https://github.com/paritytech/polkadot-sdk/issues/598.
These are features that require all validators enable them at the same
time, assuming all/most nodes have upgraded their node versions.
This PR doesn't add any feature yet. These are coming in future PRs.
Also adds a runtime API for querying the state of the client features
and an extrinsic for setting/unsetting a feature by its index in the bitfield.
Note: originally part of:
https://github.com/paritytech/polkadot-sdk/pull/1644, but posted as
standalone to be reused by other PRs until the initial PR is merged
This PR contains some fixes and cleanups for parachain nodes:
1. When using async backing, node no longer complains about being unable
to reach the prospective-parachain subsystem.
2. Parachain warp sync now informs users that the finalized para block
has been retrieved.
```
2023-11-08 13:24:42 [Parachain] 🎉 Received finalized parachain header #5747719 (0xa0aa…674b) from the relay chain.
```
3. When a user supplied an invalid `--relay-chain-rpc-url`, we were
crashing with a very verbose message. Removed the `expect` and improved
the error message.
```
2023-11-08 13:57:56 [Parachain] No valid RPC url found. Stopping RPC worker.
2023-11-08 13:57:56 [Parachain] Essential task `relay-chain-rpc-worker` failed. Shutting down service.
Error: Service(Application(WorkerCommunicationError("RPC worker channel closed. This can hint and connectivity issues with the supplied RPC endpoints. Message: oneshot canceled")))
```
- Async-backing related primitives are stable `primitives::v6`
- Async-backing API is now part of `api_version(7)`
- It's enabled on Rococo and Westend runtimes
---------
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* move min backing votes const to runtime
also cache it per-session in the backing subsystem
Signed-off-by: alindima <alin@parity.io>
* add runtime migration
* introduce api versioning for min_backing votes
also enable it for rococo/versi for testing
* also add min_backing_votes runtime calls to statement-distribution
this dependency has been recently introduced by async backing
* remove explicit version runtime API call
this is not needed, as the RuntimeAPISubsystem already takes care
of versioning and will return NotSupported if the version is not
right.
* address review comments
- parametrise backing votes runtime API with session index
- remove RuntimeInfo usage in backing subsystem, as runtime API
caches the min backing votes by session index anyway.
- move the logic for adjusting the configured needed backing votes with the size of the backing group
to a primitives helper.
- move the legacy min backing votes value to a primitives helper.
- mark JoinMultiple error as fatal, since the Canceled (non-multiple) counterpart is also fatal.
- make backing subsystem handle fatal errors for new leaves update.
- add HostConfiguration consistency check for zeroed backing votes threshold
- add cumulus accompanying change
* fix cumulus test compilation
* fix tests
* more small fixes
* fix merge
* bump runtime api version for westend and rollback version for rococo
---------
Signed-off-by: alindima <alin@parity.io>
Co-authored-by: Javier Viola <javier@parity.io>
* Update substrate & polkadot
* min changes to make async backing compile
* (async backing) parachain-system: track limitations for unincluded blocks (#2438)
* unincluded segment draft
* read para head from storage proof
* read_para_head -> read_included_para_head
* Provide pub interface
* add errors
* fix unincluded segment update
* BlockTracker -> Ancestor
* add a dmp limit
* Read para head depending on the storage switch
* doc comments
* storage items docs
* add a sanity check on block initialize
* Check watermark
* append to the segment on block finalize
* Move segment update into set_validation_data
* Resolve para head todo
* option watermark
* fix comment
* Drop dmq check
* fix weight
* doc-comments on inherent invariant
* Remove TODO
* add todo
* primitives tests
* pallet tests
* doc comments
* refactor unincluded segment length into a ConsensusHook (#2501)
* refactor unincluded segment length into a ConsensusHook
* add docs
* refactor bandwidth_out calculation
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
* test for limits from impl
* fmt
* make tests compile
* update comment
* uncomment test
* fix collator test by adding parent to state proof
* patch HRMP watermark rules for unincluded segment
* get consensus-common tests to pass, using unincluded segment
* fix unincluded segment tests
* get all tests passing
* fmt
* rustdoc CI
* aura-ext: limit the number of authored blocks per slot (#2551)
* aura_ext consensus hook
* reverse dependency
* include weight into hook
* fix tests
* remove stray println
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
* fix test warning
* fix doc link
---------
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
Co-authored-by: Chris Sosnin <chris125_@live.com>
* parachain-system: ignore go ahead signal once upgrade is processed (#2594)
* handle goahead signal for unincluded segment
* doc comment
* add test
* parachain-system: drop processed messages from inherent data (#2590)
* implement `drop_processed_messages`
* drop messages based on relay parent number
* adjust tests
* drop changes to mqc
* fix comment
* drop test
* drop more dead code
* clippy
* aura-ext: check slot in consensus hook and remove all `CheckInherents` logic (#2658)
* aura-ext: check slot in consensus hook
* convert relay chain slot
* Make relay chain slot duration generic
* use fixed velocity hook for pallets with aura
* purge timestamp inherent
* fix warning
* adjust runtime tests
* fix slots in tests
* Make `xcm-emulator` test pass for new consensus hook (#2722)
* add pallets on_initialize
* tests pass
* add AuraExt on_init
* ".git/.scripts/commands/fmt/fmt.sh"
---------
Co-authored-by: command-bot <>
---------
Co-authored-by: Ignacio Palacios <ignacio.palacios.santos@gmail.com>
* update polkadot git refs
* CollationGenerationConfig closure is now optional (#2772)
* CollationGenerationConfig closure is now optional
* fix test
* propagate network-protocol-staging feature (#2899)
* Feature Flagging Consensus Hook Type Parameter (#2911)
* First pass
* fmt
* Added as default feature in tomls
* Changed to direct dependency feature
* Dealing with clippy error
* Update pallets/parachain-system/src/lib.rs
Co-authored-by: asynchronous rob <rphmeier@gmail.com>
---------
Co-authored-by: asynchronous rob <rphmeier@gmail.com>
* fmt
* bump deps and remove warning
* parachain-system: update RelevantMessagingState according to the unincluded segment (#2948)
* mostly address 2471 with a bug introduced
* adjust relevant messaging state after computing total
* fmt
* max -> min
* fix test implementation of xcmp source
* add test
* fix test message sending logic
* fix + test
* add more to unincluded segment test
* fmt
---------
Co-authored-by: Chris Sosnin <chris125_@live.com>
* Integrate new Aura / Parachain Consensus Logic in Parachain-Template / Polkadot-Parachain (#2864)
* add a comment
* refactor client/service utilities
* deprecate start_collator
* update parachain-template
* update test-service in the same way
* update polkadot-parachain crate
* fmt
* wire up new SubmitCollation message
* some runtime utilities for implementing unincluded segment runtime APIs
* allow parachains to configure their level of sybil-resistance when starting the network
* make aura-ext compile
* update to specify sybil resistance levels
* fmt
* specify relay chain slot duration in milliseconds
* update Aura to explicitly produce Send futures
also, make relay_chain_slot_duration a Duration
* add authoring duration to basic collator and document params
* integrate new basic collator into parachain-template
* remove assert_send used for testing
* basic-aura: only author when parent included
* update polkadot-parachain-bin
* fmt
* some fixes
* fixes
* add a RelayNumberMonotonicallyIncreases
* add a utility function for initializing subsystems
* some logging for timestamp adjustment
* fmt
* some fixes for lookahead collator
* add a log
* update `find_potential_parents` to account for sessions
* bound the loop
* restore & deprecate old start_collator and start_full_node functions.
* remove unnecessary await calls
* fix warning
* clippy
* more clippy
* remove unneeded logic
* ci
* update comment
Co-authored-by: Marcin S. <marcin@bytedude.com>
* (async backing) restore `CheckInherents` for backwards-compatibility (#2977)
* bring back timestamp
* Restore CheckInherents
* revert to empty CheckInherents
* make CheckInherents optional
* attempt
* properly end system blocks
* add some more comments
* ignore failing system parachain tests
* update refs after main feature branch merge
* comment out the offending tests because CI runs ignored tests
* fix warnings
* fmt
* revert to polkadot master
* cargo update -p polkadot-primitives -p sp-io
---------
Co-authored-by: asynchronous rob <rphmeier@gmail.com>
Co-authored-by: Ignacio Palacios <ignacio.palacios.santos@gmail.com>
Co-authored-by: Bradley Olson <34992650+BradleyOlson64@users.noreply.github.com>
Co-authored-by: Marcin S. <marcin@bytedude.com>
Co-authored-by: eskimor <eskimor@users.noreply.github.com>
Co-authored-by: Andronik <write@reusable.software>
* Use primitives reexported from `polkadot_primitives` crate root
* restart CI
* Fixes after merge
* update lockfile for {"polkadot", "substrate"}
Co-authored-by: parity-processbot <>
* Allow specification of multiple urls for relay chain rpc nodes
* Add pooled RPC client basics
* Add list of clients to pooled client
* Improve
* Forward requests to dispatcher
* Switch clients on error
* Implement rotation logic
* Improve subscription handling
* Error handling cleanup
* Remove retry from rpc-client
* Improve naming
* Improve documentation
* Improve `ClientManager` abstraction
* Adjust zombienet test
* Add more comments
* fmt
* Apply reviewers comments
* Extract reconnection to extra method
* Add comment to reconnection method
* Clean up some dependencies
* Fix build
* fmt
* Provide alias for cli argument
* Apply review comments
* Rename P* to Relay*
* Improve zombienet test
* fmt
* Fix zombienet sleep
* Simplify zombienet test
* Reduce log clutter and fix starting position
* Do not distribute duplicated imported and finalized blocks
* fmt
* Apply code review suggestions
* Move building of relay chain interface to `cumulus-client-service`
* Refactoring to not push back into channel
* FMT
* Add minimal overseer gen with dummy subsystems
* Fix dependencies
* no-compile: only client transaction pool missing
* Remove unused imports
* Continue to hack towards PoC
* Continue
* Make mini node compile
* Compiling version with blockchainevents trait
* Continue
* Check in lockfile
* Block with tokio
* update patches
* Update polkadot patches
* Use polkadot-primitives v2
* Fix build problems
* First working version
* Adjust cargo.lock
* Add integration test
* Make integration test work
* Allow startinc collator without relay-chain args
* Make OverseerRuntimeClient async
* Create separate integration test
* Remove unused ChainSelection code
* Remove unused parameters on new-mini
* Connect collator node in test to relay chain nodes
* Make BlockChainRPCClient obsolete
* Clean up
* Clean up
* Reimplement blockchain-rpc-events
* Revert "Allow startinc collator without relay-chain args"
This reverts commit f22c70e16521f375fe125df5616d48ceea926b1a.
* Add `strict_record_validation` to AuthorityDiscovery
* Move network to cumulus
* Remove BlockchainRPCEvents
* Remove `BlockIdTo` and `BlockchainEvents`
* Make AuthorityDiscovery async
* Use hash in OverseerRuntime
* Adjust naming of runtime client trait
* Implement more rpc-client methods
* Improve error handling for `ApiError`
* Extract authority-discovery creationand cleanup
* RPC -> Rpc
* Extract bitswap
* Adjust to changes on master
* Implement `hash` method
* Introduce DummyChainSync, remove ProofProvider and BlockBackend
* Remove `HeaderMetadata` from blockchain-rpc-client
* Make ChainSync work
* Implement NetworkHeaderBackend
* Cleanup
* Adjustments after master merge
* Remove ImportQueue from network parameters
* Remove cargo patches
* Eliminate warnings
* Revert to HeaderBackend
* Add zombienet test
* Implement `status()` method
* Add more comments, improve readability
* Remove patches from Cargo.toml
* Remove integration test in favor of zombienet
* Remove unused dependencies, rename minimal node crate
* Adjust to latest master changes
* fmt
* Execute zombienet test on gitlab ci
* Reuse network metrics
* Chainsync metrics
* fmt
* Feed RPC node as boot node to the relay chain minimal node
* fmt
* Add bootnodes to zombienet collators
* Allow specification of relay chain args
* Apply review suggestions
* Remove unnecessary casts
* Enable PoV recovery for rpc full nodes
* Revert unwanted changes
* Make overseerHandle non-optional
* Add availability-store subsystem
* Add AuxStore and ChainApiSubsystem
* Add availability distribution subsystem
* Improve pov-recovery logging and add RPC nodes to tests
* fmt
* Make availability config const
* lock
* Enable debug logs for pov-recovery in zombienet
* Add log filters to test binary
* Allow wss
* Address review comments
* Apply reviewer comments
* Adjust to master changes
* Apply reviewer suggestions
* Bump polkadot
* Add builder method for minimal node
* Bump substrate and polkadot
* Clean up overseer building
* Add bootnode to two in pov_recovery test
* Fix missing quote in pov recovery zombienet test
* Improve zombienet pov test
* More debug logs for pov-recovery
* Remove reserved nodes like on original test
* Revert zombienet test to master
* Extract json-rpc-client and introduce worker
* Initial rpc worker
* Add error handling
* Use bounded channels for listeners
* Improve naming and clean up
* Use tracing channels
* Improve code readability
* Decrease channel size limit
* Remove unused dependency
* Fix docs
* RPC -> Rpc
* Start worker in initialization method
* Print error in case a distribution channel is full
* Fix docs
* Make `RpcStreamWorker` private
Co-authored-by: Davide Galassi <davxy@datawok.net>
* Use tokio channels and add TODO item
* Remove `Option` from `to_worker_channel`
Co-authored-by: Davide Galassi <davxy@datawok.net>
* Initial network interface preparations
* Implement get_storage_by_key
* Implement `validators` and `session_index_for_child`
* Implement persisted_validation_data and candidate_pending_availability
* Fix method name for persisted_validation_data and add encoded params
* Implement `retrieve_dmq_contents` and `retrieve_all_inbound_hrmp_channel_contents`
* Implement `prove_read`
* Introduce separate RPC client, expose JsonRpSee errors
* Simplify closure in call_remote_runtime_function
* Implement import stream, upgrade JsonRpSee
* Implement finality stream
* Remove unused method from interface
* Implement `is_major_syncing`
* Implement `wait_on_block`
* Fix tests
* Unify error handling `ApiError`
* Replace WaitError with RelayChainError
* Wrap BlockChainError in RelayChainError
* Unify error handling in relay chain intefaces
* Fix return type of proof method
* Improve error handling of new methods
* Improve error handling and move logging outside of interface
* Clean up
* Remove unwanted changes, clean up
* Remove unused import
* Add format for StatemachineError and remove nused From trait
* Use 'thiserror' crate to simplify error handling
* Expose error for overseer, further simplify error handling
* Reintroduce network interface
* Implement cli option
* Adjust call_state method to use hashes
* Disable PoV recovery when RPC is used
* Add integration test for network full node
* Use Hash instead of BlockId to ensure compatibility with RPC interface
* Fix cargo check warnings
* Implement retries
* Remove `expect` statements from code
* Update jsonrpsee to 0.8.0 and make collator keys optional
* Make cli arguments conflicting
* Remove unused `block_status` method
* Add clippy fixes
* Cargo fmt
* Validate relay chain rpc url
* Clean up dependencies and add one more integration test
* Clean up
* Clean up dependencies of relay-chain-network
* Use hash instead of blockid for rpc methods
* Fix tests
* Update client/cli/src/lib.rs
Co-authored-by: Koute <koute@users.noreply.github.com>
* Improve error message of cli validation
* Add rpc client constructor
* Do not use debug formatting for errors
* Improve logging for remote runtime methods
* Only retry on transport problems
* Use PHash by value, rename test
* Improve tracing, return error on relay-chain-interface build
* Fix naming, use generics instead of deserializing manually
* Rename RelayChainLocal and RelayChainNetwork
* lock
* Format
* Use impl trait for encodable runtime payload
* Only instantiate full node in tests when we need it
* Upgrade scale-codec to 3.0.0
* Improve expect log
Co-authored-by: Koute <koute@users.noreply.github.com>