[litep2p](https://github.com/altonen/litep2p) is a libp2p-compatible P2P
networking library. It supports all of the features of `rust-libp2p`
that are currently being utilized by Polkadot SDK.
Compared to `rust-libp2p`, `litep2p` has a quite different architecture
which is why the new `litep2p` network backend is only able to use a
little of the existing code in `sc-network`. The design has been mainly
influenced by how we'd wish to structure our networking-related code in
Polkadot SDK: independent higher-level protocols directly communicating
with the network over links that support bidirectional backpressure. A
good example would be `NotificationHandle`/`RequestResponseHandle`
abstractions which allow, e.g., `SyncingEngine` to directly communicate
with peers to announce/request blocks.
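
To illustrate what a link with bidirectional backpressure could look like, here is a minimal sketch built only on bounded standard-library channels. `ProtocolHandle`, `NetworkEnd` and `link` are hypothetical names made up for this example; they are not the `litep2p` or `sc-network` API, and the real handles are asynchronous.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Hypothetical protocol-side end of a link (not the real `NotificationHandle`).
pub struct ProtocolHandle {
    to_network: SyncSender<Vec<u8>>,
    from_network: Receiver<Vec<u8>>,
}

/// Hypothetical network-side end of the same link.
pub struct NetworkEnd {
    to_protocol: SyncSender<Vec<u8>>,
    from_protocol: Receiver<Vec<u8>>,
}

/// Create a link in which each direction buffers at most `capacity` messages.
pub fn link(capacity: usize) -> (ProtocolHandle, NetworkEnd) {
    let (to_network, from_protocol) = sync_channel(capacity);
    let (to_protocol, from_network) = sync_channel(capacity);
    (
        ProtocolHandle { to_network, from_network },
        NetworkEnd { to_protocol, from_protocol },
    )
}

/// Hand the message back to the caller when the queue is full or closed.
fn try_push(tx: &SyncSender<Vec<u8>>, msg: Vec<u8>) -> Result<(), Vec<u8>> {
    match tx.try_send(msg) {
        Ok(()) => Ok(()),
        Err(TrySendError::Full(m)) | Err(TrySendError::Disconnected(m)) => Err(m),
    }
}

impl ProtocolHandle {
    /// Backpressure towards the protocol: fails when the network is not keeping up.
    pub fn try_send(&self, msg: Vec<u8>) -> Result<(), Vec<u8>> {
        try_push(&self.to_network, msg)
    }

    pub fn try_recv(&self) -> Option<Vec<u8>> {
        self.from_network.try_recv().ok()
    }
}

impl NetworkEnd {
    /// Backpressure towards the network: fails when the protocol is not keeping up.
    pub fn try_send(&self, msg: Vec<u8>) -> Result<(), Vec<u8>> {
        try_push(&self.to_protocol, msg)
    }

    pub fn try_recv(&self) -> Option<Vec<u8>> {
        self.from_protocol.try_recv().ok()
    }
}
```

With `link(1)`, a second `try_send` in either direction returns the message to the sender until the other side has drained the first one, so a slow consumer slows its producer down instead of letting a queue grow without bound.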
I've tried running `polkadot --network-backend litep2p` with a few
different peer configurations and there is a noticeable reduction in
networking CPU usage. For high load (`--out-peers 200`), networking CPU
usage goes down from ~110% to ~30% (80 pp) and for normal load
(`--out-peers 40`), the usage goes down from ~55% to ~18% (37 pp).
These should not be taken as final numbers because:
a) there are still some low-hanging optimization fruits, such as
enabling [receive window
auto-tuning](https://github.com/libp2p/rust-yamux/pull/176), integrating
`Peerset` more closely with `litep2p` or improving memory usage of the
WebSocket transport
b) fixing bugs/instabilities that incorrectly cause `litep2p` to do less
work will increase the networking CPU usage
c) verification in a more diverse set of tests/conditions is needed
Nevertheless, these numbers should give an early estimate for CPU usage
of the new networking backend.
This PR consists of three separate changes:
* introduce a generic `PeerId` (wrapper around `Multihash`) so that we
don't have to use `NetworkService::PeerId` in every part of the code that
uses a `PeerId`
* introduce `NetworkBackend` trait, implement it for the libp2p network
stack and make Polkadot SDK generic over `NetworkBackend`
* implement `NetworkBackend` for litep2p
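
The overall shape of these three changes can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the trait methods `name` and `connected_peers`, the two backend structs and `describe` are hypothetical, and `PeerId` is modelled as opaque bytes rather than as a wrapper around `Multihash`. The actual `NetworkBackend` trait in `sc-network` has a different, larger surface.

```rust
/// Backend-agnostic peer identifier (opaque bytes in this sketch).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct PeerId(pub Vec<u8>);

/// Hypothetical, heavily reduced stand-in for the `NetworkBackend` trait.
pub trait NetworkBackend {
    fn name(&self) -> &'static str;
    fn connected_peers(&self) -> Vec<PeerId>;
}

/// Stand-in for the libp2p-based network stack.
pub struct Libp2pBackend {
    pub peers: Vec<PeerId>,
}

/// Stand-in for the litep2p-based network stack.
pub struct Litep2pBackend {
    pub peers: Vec<PeerId>,
}

impl NetworkBackend for Libp2pBackend {
    fn name(&self) -> &'static str {
        "libp2p"
    }

    fn connected_peers(&self) -> Vec<PeerId> {
        self.peers.clone()
    }
}

impl NetworkBackend for Litep2pBackend {
    fn name(&self) -> &'static str {
        "litep2p"
    }

    fn connected_peers(&self) -> Vec<PeerId> {
        self.peers.clone()
    }
}

/// Code that is generic over the backend only ever sees the shared `PeerId`.
pub fn describe<N: NetworkBackend>(backend: &N) -> String {
    format!("{}: {} peer(s)", backend.name(), backend.connected_peers().len())
}
```

Callers such as `describe` compile once against the trait and work with either backend, which is the property that lets the backend be selected at startup.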
The new library should be considered experimental which is why
`rust-libp2p` will remain as the default option for the time being. This
PR currently depends on the master branch of `litep2p` but I'll cut a
new release for the library once all review comments have been
addressed.
---------
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Co-authored-by: Dmitry Markin <dmitry@markin.tech>
Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com>
Co-authored-by: Alexandru Vasile <alexandru.vasile@parity.io>
Runtime release 1.2 bumps the ParachainHost APIs up to v10, so let's
move all the released APIs out of the vstaging folder. This PR does not
include any logic changes, only renaming of the modules and some
moving around.
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
The PR adds two things:
1. Runtime API exposing the whole claim queue
2. Consumes the API in `collation-generation` to fetch the next
scheduled `ParaEntry` for an occupied core.
Related to https://github.com/paritytech/polkadot-sdk/issues/1797
Initial implementation for the plan discussed here: https://github.com/paritytech/polkadot-sdk/issues/701
Built on top of https://github.com/paritytech/polkadot-sdk/pull/1178
v0: https://github.com/paritytech/polkadot/pull/7554
## Overall idea
When approval-voting checks a candidate and is ready to advertise the
approval, defer it in a per-relay chain block until we either have
MAX_APPROVAL_COALESCE_COUNT candidates to sign or a candidate has stayed
MAX_APPROVALS_COALESCE_TICKS in the queue. In both cases we sign
whatever candidates we have available.
This should allow us to reduce the number of approvals messages we have
to create/send/verify. The parameters are configurable, so we should
find some values that balance:
- Security of the network: Delaying broadcasting of an approval
shouldn't put finality at risk, and to make sure that never happens
we won't delay sending a vote if we are past 2/3 of the no-show time.
- Scalability of the network: MAX_APPROVAL_COALESCE_COUNT = 1 &
MAX_APPROVALS_COALESCE_TICKS = 0 is what we have now, and we know from
the measurements we did on versi that it bottlenecks
approval-distribution/approval-voting when we significantly increase the
number of validators and parachains.
- Block storage: In case of disputes we have to import these votes on
chain, and that increases the necessary storage by
MAX_APPROVAL_COALESCE_COUNT * CandidateHash per vote. Given that
disputes are not the normal way of the network functioning and we will
limit MAX_APPROVAL_COALESCE_COUNT to single digits, this
should be good enough. Alternatively, we could try to create a better
way to store this on-chain through indirection, if that's needed.
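
The deferral rule described above can be sketched as a small decision function. This is only an illustration: `CoalesceParams`, `PendingApproval`, `should_sign_now` and the tick bookkeeping are hypothetical, and it assumes the no-show time is measured in ticks from the tick at which the assignment was made.

```rust
/// Hypothetical parameters mirroring the two constants named above.
pub struct CoalesceParams {
    /// Stand-in for MAX_APPROVAL_COALESCE_COUNT.
    pub max_count: usize,
    /// Stand-in for MAX_APPROVALS_COALESCE_TICKS.
    pub max_ticks: u64,
    /// Ticks after assignment at which a validator counts as a no-show.
    pub no_show_ticks: u64,
}

/// A checked candidate whose approval has not been signed yet.
pub struct PendingApproval {
    /// Tick at which the approval became ready and was put in the queue.
    pub queued_at: u64,
    /// Tick at which the assignment was made (assumed no-show reference point).
    pub assigned_at: u64,
}

/// Decide whether everything currently queued should be signed at tick `now`.
pub fn should_sign_now(pending: &[PendingApproval], now: u64, p: &CoalesceParams) -> bool {
    if pending.is_empty() {
        return false;
    }
    // Enough candidates collected to fill one coalesced approval.
    if pending.len() >= p.max_count {
        return true;
    }
    pending.iter().any(|a| {
        let waited = now.saturating_sub(a.queued_at);
        let since_assignment = now.saturating_sub(a.assigned_at);
        // Either the candidate waited long enough, or delaying further would
        // take us to two thirds of the no-show time (integer form of 2/3).
        waited >= p.max_ticks || since_assignment * 3 >= p.no_show_ticks * 2
    })
}
```

With `max_count = 1` and `max_ticks = 0` this function returns `true` for any non-empty queue, which corresponds to the current behaviour of signing every approval individually.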
## Other fixes:
- Fixed the fact that we were sending random assignments to
non-validators. That was wrong because those won't do anything with them
and they won't gossip them either, because they do not have a grid
topology set, so we would waste the random assignments.
- Added metrics to be able to debug potential no-shows and
mis-processing of approvals/assignments.
## TODO:
- [x] Get feedback, that this is moving in the right direction. @ordian
@sandreim @eskimor @burdges, let me know what you think.
- [x] More and more testing.
- [x] Test in versi.
- [x] Make MAX_APPROVAL_COALESCE_COUNT &
MAX_APPROVAL_COALESCE_WAIT_MILLIS a parachain host configuration.
- [x] Make sure the backwards compatibility works correctly
- [x] Make sure this direction is compatible with other streams of work:
https://github.com/paritytech/polkadot-sdk/issues/635 &
https://github.com/paritytech/polkadot-sdk/issues/742
- [x] Final versi burn-in before merging
---------
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
This commit introduces a new concept called `NotificationService` which
allows Polkadot protocols to communicate with the underlying
notification protocol implementation directly, without routing events
through `NetworkWorker`. This implies that each protocol has its own
service which it uses to communicate with remote peers and that each
`NotificationService` is unique with respect to the underlying
notification protocol, meaning `NotificationService` for the transaction
protocol can only be used to send and receive transaction-related
notifications.
The `NotificationService` concept introduces two additional benefits:
* allow protocols to start using custom handshakes
* allow protocols to accept/reject inbound peers
Previously the validation of inbound connections was solely the
responsibility of `ProtocolController`. This caused issues with light
peers and `SyncingEngine` as `ProtocolController` would accept more
peers than `SyncingEngine` could accept which caused peers to have
differing views of their own states. `SyncingEngine` would reject excess
peers but these rejections were not properly communicated to those peers
causing them to assume that they were accepted.
With `NotificationService`, the local handshake is not sent to the remote
peer if the peer is rejected, which allows it to detect that it was rejected.
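
The accept/reject flow described above might look roughly like this. It is a hedged sketch: `ValidationResult`, `SlotLimitedProtocol` and `handshake_reply` are names invented for illustration and do not reproduce the real `NotificationService` interface.

```rust
/// Outcome of a protocol's decision about an inbound substream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValidationResult {
    Accept,
    Reject,
}

/// Hypothetical protocol with a fixed number of inbound slots.
pub struct SlotLimitedProtocol {
    pub max_peers: usize,
    pub peers: Vec<u64>,
}

impl SlotLimitedProtocol {
    /// The protocol itself decides, so its view and the network's view agree.
    pub fn validate_inbound(&mut self, peer: u64) -> ValidationResult {
        if self.peers.len() >= self.max_peers {
            return ValidationResult::Reject;
        }
        self.peers.push(peer);
        ValidationResult::Accept
    }
}

/// The local handshake is produced only for accepted peers. `None` means the
/// substream is closed without a handshake, which the remote can observe.
pub fn handshake_reply(result: ValidationResult, local_handshake: &[u8]) -> Option<Vec<u8>> {
    match result {
        ValidationResult::Accept => Some(local_handshake.to_vec()),
        ValidationResult::Reject => None,
    }
}
```

Because the rejected peer never receives a handshake, it cannot mistakenly assume that it occupies a slot.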
This commit also deprecates the use of `NetworkEventStream` for all
notification-related events and going forward only DHT events are
provided through `NetworkEventStream`. If protocols wish to follow each
other's events, they must introduce additional abstractions, as is done
for the GRANDPA and transaction protocols by following the syncing protocol
through `SyncEventStream`.
Fixes https://github.com/paritytech/polkadot-sdk/issues/512
Fixes https://github.com/paritytech/polkadot-sdk/issues/514
Fixes https://github.com/paritytech/polkadot-sdk/issues/515
Fixes https://github.com/paritytech/polkadot-sdk/issues/554
Fixes https://github.com/paritytech/polkadot-sdk/issues/556
---
These changes are transferred from
https://github.com/paritytech/substrate/pull/14197 but there are no
functional changes compared to that PR
---------
Co-authored-by: Dmitry Markin <dmitry@markin.tech>
Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com>
Adds a `NodeFeatures` bitfield value to the runtime `HostConfiguration`,
with the purpose of coordinating the enabling of node-side features,
such as: https://github.com/paritytech/polkadot-sdk/issues/628 and
https://github.com/paritytech/polkadot-sdk/issues/598.
These are features that require all validators to enable them at the same
time, assuming all/most nodes have upgraded their node versions.
This PR doesn't add any feature yet. These are coming in future PRs.
Also adds a runtime API for querying the state of the client features
and an extrinsic for setting/unsetting a feature by its index in the bitfield.
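
As a rough illustration of setting and querying a feature by its index in a bitfield, consider the sketch below. It uses a plain byte vector with least-significant-bit-first ordering, which is an assumption made for this example; the type and method names are hypothetical and do not reproduce the runtime's actual `NodeFeatures` type or extrinsic.

```rust
/// Hypothetical growable bitfield, one bit per node-side feature.
#[derive(Default, Debug, Clone, PartialEq, Eq)]
pub struct NodeFeatures {
    bytes: Vec<u8>,
}

impl NodeFeatures {
    /// Set or unset the feature at `index`, growing the bitfield if needed.
    pub fn set(&mut self, index: u8, value: bool) {
        let (byte, bit) = ((index / 8) as usize, index % 8);
        if self.bytes.len() <= byte {
            self.bytes.resize(byte + 1, 0);
        }
        if value {
            self.bytes[byte] |= 1u8 << bit;
        } else {
            self.bytes[byte] &= !(1u8 << bit);
        }
    }

    /// Features beyond the current length are treated as disabled.
    pub fn is_enabled(&self, index: u8) -> bool {
        let (byte, bit) = ((index / 8) as usize, index % 8);
        self.bytes.get(byte).map_or(false, |b| b & (1u8 << bit) != 0)
    }
}
```

Treating out-of-range indices as disabled means a node can ask about a feature the runtime configuration does not know about yet and get a safe default.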
Note: originally part of:
https://github.com/paritytech/polkadot-sdk/pull/1644, but posted as
standalone to be reused by other PRs until the initial PR is merged
Collators were previously reencoding the available data and checking the
erasure root.
Replace that with just checking the PoV hash, which consumes much less
CPU and takes less time.
We also don't need to check the `PersistedValidationData` hash, as
collators don't use it.
Reason:
https://github.com/paritytech/polkadot-sdk/issues/575#issuecomment-1806572230
After systematic chunks recovery is merged, collators will no longer do
any Reed-Solomon encoding/decoding, which has proven to be a major CPU
consumer.
Signed-off-by: alindima <alin@parity.io>
This PR contains some fixes and cleanups for parachain nodes:
1. When using async backing, the node no longer complains about being
unable to reach the prospective-parachain subsystem.
2. Parachain warp sync now informs users that the finalized para block
has been retrieved.
```
2023-11-08 13:24:42 [Parachain] 🎉 Received finalized parachain header #5747719 (0xa0aa…674b) from the relay chain.
```
3. When a user supplied an invalid `--relay-chain-rpc-url`, we were
crashing with a very verbose message. Removed the `expect` and improved
the error message.
```
2023-11-08 13:57:56 [Parachain] No valid RPC url found. Stopping RPC worker.
2023-11-08 13:57:56 [Parachain] Essential task `relay-chain-rpc-worker` failed. Shutting down service.
Error: Service(Application(WorkerCommunicationError("RPC worker channel closed. This can hint and connectivity issues with the supplied RPC endpoints. Message: oneshot canceled")))
```
When running with `--relay-chain-rpc-url` we received multiple reports
of high traffic that disappears when `--in-peers-light 0` is set. Indeed
it does not make much sense for light clients to connect to the minimal
node since it is not running the block announce protocol and the
request/response protocol for light clients.
This is intended to alleviate the traffic issues for now.
closes #1896
probably related https://github.com/paritytech/cumulus/issues/2563
- Async-backing related primitives are stable `primitives::v6`
- Async-backing API is now part of `api_version(7)`
- It's enabled on Rococo and Westend runtimes
---------
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* polkadot: propagate UnpinHandle to ActiveLeafUpdate
Also extract the leaf creation for tests
into a common function.
* dispute-coordinator: try pinned blocks for slashing
* apparently 1.72 is smarter than 1.70
* address nits
* rename fresh_leaf to new_leaf
* move min backing votes const to runtime
also cache it per-session in the backing subsystem
Signed-off-by: alindima <alin@parity.io>
* add runtime migration
* introduce api versioning for min_backing votes
also enable it for rococo/versi for testing
* also add min_backing_votes runtime calls to statement-distribution
this dependency has been recently introduced by async backing
* remove explicit version runtime API call
this is not needed, as the RuntimeAPISubsystem already takes care
of versioning and will return NotSupported if the version is not
right.
* address review comments
- parametrise backing votes runtime API with session index
- remove RuntimeInfo usage in backing subsystem, as runtime API
caches the min backing votes by session index anyway.
- move the logic for adjusting the configured needed backing votes with the size of the backing group
to a primitives helper.
- move the legacy min backing votes value to a primitives helper.
- mark JoinMultiple error as fatal, since the Canceled (non-multiple) counterpart is also fatal.
- make backing subsystem handle fatal errors for new leaves update.
- add HostConfiguration consistency check for zeroed backing votes threshold
- add cumulus accompanying change
* fix cumulus test compilation
* fix tests
* more small fixes
* fix merge
* bump runtime api version for westend and rollback version for rococo
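
The two primitives helpers mentioned in the review comments above could be sketched as follows. The function names, the fallback logic and the legacy value of 2 are assumptions made for illustration, not a copy of the code in the primitives crate.

```rust
/// Assumed legacy threshold, used when the runtime API is not available.
pub const LEGACY_MIN_BACKING_VOTES: u32 = 2;

/// Fall back to the legacy value when the runtime does not support the call
/// (modelled here as `None`).
pub fn min_backing_votes_or_legacy(api_result: Option<u32>) -> u32 {
    api_result.unwrap_or(LEGACY_MIN_BACKING_VOTES)
}

/// Adjust the configured threshold to the size of the backing group, so that
/// a small group is never asked for more votes than it has members.
pub fn effective_minimum_backing_votes(group_len: usize, configured_minimum: u32) -> usize {
    std::cmp::min(group_len, configured_minimum as usize)
}
```

For example, a backing group with a single validator and a configured minimum of 2 ends up needing 1 vote.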
---------
Signed-off-by: alindima <alin@parity.io>
Co-authored-by: Javier Viola <javier@parity.io>
* Update substrate & polkadot
* min changes to make async backing compile
* (async backing) parachain-system: track limitations for unincluded blocks (#2438)
* unincluded segment draft
* read para head from storage proof
* read_para_head -> read_included_para_head
* Provide pub interface
* add errors
* fix unincluded segment update
* BlockTracker -> Ancestor
* add a dmp limit
* Read para head depending on the storage switch
* doc comments
* storage items docs
* add a sanity check on block initialize
* Check watermark
* append to the segment on block finalize
* Move segment update into set_validation_data
* Resolve para head todo
* option watermark
* fix comment
* Drop dmq check
* fix weight
* doc-comments on inherent invariant
* Remove TODO
* add todo
* primitives tests
* pallet tests
* doc comments
* refactor unincluded segment length into a ConsensusHook (#2501)
* refactor unincluded segment length into a ConsensusHook
* add docs
* refactor bandwidth_out calculation
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
* test for limits from impl
* fmt
* make tests compile
* update comment
* uncomment test
* fix collator test by adding parent to state proof
* patch HRMP watermark rules for unincluded segment
* get consensus-common tests to pass, using unincluded segment
* fix unincluded segment tests
* get all tests passing
* fmt
* rustdoc CI
* aura-ext: limit the number of authored blocks per slot (#2551)
* aura_ext consensus hook
* reverse dependency
* include weight into hook
* fix tests
* remove stray println
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
* fix test warning
* fix doc link
---------
Co-authored-by: Chris Sosnin <48099298+slumber@users.noreply.github.com>
Co-authored-by: Chris Sosnin <chris125_@live.com>
* parachain-system: ignore go ahead signal once upgrade is processed (#2594)
* handle goahead signal for unincluded segment
* doc comment
* add test
* parachain-system: drop processed messages from inherent data (#2590)
* implement `drop_processed_messages`
* drop messages based on relay parent number
* adjust tests
* drop changes to mqc
* fix comment
* drop test
* drop more dead code
* clippy
* aura-ext: check slot in consensus hook and remove all `CheckInherents` logic (#2658)
* aura-ext: check slot in consensus hook
* convert relay chain slot
* Make relay chain slot duration generic
* use fixed velocity hook for pallets with aura
* purge timestamp inherent
* fix warning
* adjust runtime tests
* fix slots in tests
* Make `xcm-emulator` test pass for new consensus hook (#2722)
* add pallets on_initialize
* tests pass
* add AuraExt on_init
* ".git/.scripts/commands/fmt/fmt.sh"
---------
Co-authored-by: command-bot <>
---------
Co-authored-by: Ignacio Palacios <ignacio.palacios.santos@gmail.com>
* update polkadot git refs
* CollationGenerationConfig closure is now optional (#2772)
* CollationGenerationConfig closure is now optional
* fix test
* propagate network-protocol-staging feature (#2899)
* Feature Flagging Consensus Hook Type Parameter (#2911)
* First pass
* fmt
* Added as default feature in tomls
* Changed to direct dependency feature
* Dealing with clippy error
* Update pallets/parachain-system/src/lib.rs
Co-authored-by: asynchronous rob <rphmeier@gmail.com>
---------
Co-authored-by: asynchronous rob <rphmeier@gmail.com>
* fmt
* bump deps and remove warning
* parachain-system: update RelevantMessagingState according to the unincluded segment (#2948)
* mostly address 2471 with a bug introduced
* adjust relevant messaging state after computing total
* fmt
* max -> min
* fix test implementation of xcmp source
* add test
* fix test message sending logic
* fix + test
* add more to unincluded segment test
* fmt
---------
Co-authored-by: Chris Sosnin <chris125_@live.com>
* Integrate new Aura / Parachain Consensus Logic in Parachain-Template / Polkadot-Parachain (#2864)
* add a comment
* refactor client/service utilities
* deprecate start_collator
* update parachain-template
* update test-service in the same way
* update polkadot-parachain crate
* fmt
* wire up new SubmitCollation message
* some runtime utilities for implementing unincluded segment runtime APIs
* allow parachains to configure their level of sybil-resistance when starting the network
* make aura-ext compile
* update to specify sybil resistance levels
* fmt
* specify relay chain slot duration in milliseconds
* update Aura to explicitly produce Send futures
also, make relay_chain_slot_duration a Duration
* add authoring duration to basic collator and document params
* integrate new basic collator into parachain-template
* remove assert_send used for testing
* basic-aura: only author when parent included
* update polkadot-parachain-bin
* fmt
* some fixes
* fixes
* add a RelayNumberMonotonicallyIncreases
* add a utility function for initializing subsystems
* some logging for timestamp adjustment
* fmt
* some fixes for lookahead collator
* add a log
* update `find_potential_parents` to account for sessions
* bound the loop
* restore & deprecate old start_collator and start_full_node functions.
* remove unnecessary await calls
* fix warning
* clippy
* more clippy
* remove unneeded logic
* ci
* update comment
Co-authored-by: Marcin S. <marcin@bytedude.com>
* (async backing) restore `CheckInherents` for backwards-compatibility (#2977)
* bring back timestamp
* Restore CheckInherents
* revert to empty CheckInherents
* make CheckInherents optional
* attempt
* properly end system blocks
* add some more comments
* ignore failing system parachain tests
* update refs after main feature branch merge
* comment out the offending tests because CI runs ignored tests
* fix warnings
* fmt
* revert to polkadot master
* cargo update -p polkadot-primitives -p sp-io
---------
Co-authored-by: asynchronous rob <rphmeier@gmail.com>
Co-authored-by: Ignacio Palacios <ignacio.palacios.santos@gmail.com>
Co-authored-by: Bradley Olson <34992650+BradleyOlson64@users.noreply.github.com>
Co-authored-by: Marcin S. <marcin@bytedude.com>
Co-authored-by: eskimor <eskimor@users.noreply.github.com>
Co-authored-by: Andronik <write@reusable.software>
* Don't pass `leaves()` to `Overseer::builder()`
This is a companion for https://github.com/paritytech/polkadot/pull/6727
* update lockfile for {"substrate", "polkadot"}
---------
Co-authored-by: parity-processbot <>
* Use primitives reexported from `polkadot_primitives` crate root
* restart CI
* Fixes after merge
* update lockfile for {"polkadot", "substrate"}
Co-authored-by: parity-processbot <>
* BlockId removal: refactor: HeaderBackend::status
It changes the arguments of `HeaderBackend::status` method from: `BlockId<Block>` to: `Block::Hash`
This PR is part of BlockId::Number refactoring analysis (paritytech/substrate#11292)
* update lockfile for {"polkadot", "substrate"}
Co-authored-by: parity-processbot <>
* BlockId removal: refactor: HeaderBackend::header
It changes the arguments of:
- `HeaderBackend::header`,
- `Client::header`
methods from: `BlockId<Block>` to: `Block::Hash`
This PR is part of BlockId::Number refactoring analysis (paritytech/substrate#11292)
* update lockfile for {"polkadot", "substrate"}
Co-authored-by: parity-processbot <>
* Allow specification of multiple urls for relay chain rpc nodes
* Add pooled RPC client basics
* Add list of clients to pooled client
* Improve
* Forward requests to dispatcher
* Switch clients on error
* Implement rotation logic
* Improve subscription handling
* Error handling cleanup
* Remove retry from rpc-client
* Improve naming
* Improve documentation
* Improve `ClientManager` abstraction
* Adjust zombienet test
* Add more comments
* fmt
* Apply reviewers comments
* Extract reconnection to extra method
* Add comment to reconnection method
* Clean up some dependencies
* Fix build
* fmt
* Provide alias for cli argument
* Apply review comments
* Rename P* to Relay*
* Improve zombienet test
* fmt
* Fix zombienet sleep
* Simplify zombienet test
* Reduce log clutter and fix starting position
* Do not distribute duplicated imported and finalized blocks
* fmt
* Apply code review suggestions
* Move building of relay chain interface to `cumulus-client-service`
* Refactoring to not push back into channel
* FMT