[litep2p](https://github.com/altonen/litep2p) is a libp2p-compatible P2P
networking library. It supports all of the features of `rust-libp2p`
that are currently being utilized by Polkadot SDK.
Compared to `rust-libp2p`, `litep2p` has a quite different architecture
which is why the new `litep2p` network backend is only able to use a
little of the existing code in `sc-network`. The design has been mainly
influenced by how we'd wish to structure our networking-related code in
Polkadot SDK: independent higher-levels protocols directly communicating
with the network over links that support bidirectional backpressure. A
good example would be `NotificationHandle`/`RequestResponseHandle`
abstractions which allow, e.g., `SyncingEngine` to directly communicate
with peers to announce/request blocks.
I've tried running `polkadot --network-backend litep2p` with a few
different peer configurations and there is a noticeable reduction in
networking CPU usage. For high load (`--out-peers 200`), networking CPU
usage goes down from ~110% to ~30% (80 pp) and for normal load
(`--out-peers 40`), the usage goes down from ~55% to ~18% (37 pp).
These should not be taken as final numbers because:
a) there are still some low-hanging optimization fruits, such as
enabling [receive window
auto-tuning](https://github.com/libp2p/rust-yamux/pull/176), integrating
`Peerset` more closely with `litep2p` or improving memory usage of the
WebSocket transport
b) fixing bugs/instabilities that incorrectly cause `litep2p` to do less
work will increase the networking CPU usage
c) verification in a more diverse set of tests/conditions is needed
Nevertheless, these numbers should give an early estimate for CPU usage
of the new networking backend.
This PR consists of three separate changes:
* introduce a generic `PeerId` (wrapper around `Multihash`) so that we
don't have use `NetworkService::PeerId` in every part of the code that
uses a `PeerId`
* introduce `NetworkBackend` trait, implement it for the libp2p network
stack and make Polkadot SDK generic over `NetworkBackend`
* implement `NetworkBackend` for litep2p
The new library should be considered experimental which is why
`rust-libp2p` will remain as the default option for the time being. This
PR currently depends on the master branch of `litep2p` but I'll cut a
new release for the library once all review comments have been
addresses.
---------
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Co-authored-by: Dmitry Markin <dmitry@markin.tech>
Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com>
Co-authored-by: Alexandru Vasile <alexandru.vasile@parity.io>
Working towards migrating the `parity-bridges-common` repo inside
`polkadot-sdk`. This PR upgrades some dependencies in order to align
them with the versions used in `parity-bridges-common`
Related to
https://github.com/paritytech/parity-bridges-common/issues/2538
Runtime release 1.2 includes bumping of the ParachainHost APIs up to
v10, so let's move all the released APIs out of vstaging folder, this PR
does not include any logic changes only renaming of the modules and some
moving around.
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
**Update:** Pushed additional changes based on the review comments.
**This pull request fixes various spelling mistakes in this
repository.**
Most of the changes are contained in the first **3** commits:
- `Fix spelling mistakes in comments and docs`
- `Fix spelling mistakes in test names`
- `Fix spelling mistakes in error messages, panic messages, logs and
tracing`
Other source code spelling mistakes are separated into individual
commits for easier reviewing:
- `Fix the spelling of 'authority'`
- `Fix the spelling of 'REASONABLE_HEADERS_IN_JUSTIFICATION_ANCESTRY'`
- `Fix the spelling of 'prev_enqueud_messages'`
- `Fix the spelling of 'endpoint'`
- `Fix the spelling of 'children'`
- `Fix the spelling of 'PenpalSiblingSovereignAccount'`
- `Fix the spelling of 'PenpalSudoAccount'`
- `Fix the spelling of 'insufficient'`
- `Fix the spelling of 'PalletXcmExtrinsicsBenchmark'`
- `Fix the spelling of 'subtracted'`
- `Fix the spelling of 'CandidatePendingAvailability'`
- `Fix the spelling of 'exclusive'`
- `Fix the spelling of 'until'`
- `Fix the spelling of 'discriminator'`
- `Fix the spelling of 'nonexistent'`
- `Fix the spelling of 'subsystem'`
- `Fix the spelling of 'indices'`
- `Fix the spelling of 'committed'`
- `Fix the spelling of 'topology'`
- `Fix the spelling of 'response'`
- `Fix the spelling of 'beneficiary'`
- `Fix the spelling of 'formatted'`
- `Fix the spelling of 'UNKNOWN_PROOF_REQUEST'`
- `Fix the spelling of 'succeeded'`
- `Fix the spelling of 'reopened'`
- `Fix the spelling of 'proposer'`
- `Fix the spelling of 'InstantiationNonce'`
- `Fix the spelling of 'depositor'`
- `Fix the spelling of 'expiration'`
- `Fix the spelling of 'phantom'`
- `Fix the spelling of 'AggregatedKeyValue'`
- `Fix the spelling of 'randomness'`
- `Fix the spelling of 'defendant'`
- `Fix the spelling of 'AquaticMammal'`
- `Fix the spelling of 'transactions'`
- `Fix the spelling of 'PassingTracingSubscriber'`
- `Fix the spelling of 'TxSignaturePayload'`
- `Fix the spelling of 'versioning'`
- `Fix the spelling of 'descendant'`
- `Fix the spelling of 'overridden'`
- `Fix the spelling of 'network'`
Let me know if this structure is adequate.
**Note:** The usage of the words `Merkle`, `Merkelize`, `Merklization`,
`Merkelization`, `Merkleization`, is somewhat inconsistent but I left it
as it is.
~~**Note:** In some places the term `Receival` is used to refer to
message reception, IMO `Reception` is the correct word here, but I left
it as it is.~~
~~**Note:** In some places the term `Overlayed` is used instead of the
more acceptable version `Overlaid` but I also left it as it is.~~
~~**Note:** In some places the term `Applyable` is used instead of the
correct version `Applicable` but I also left it as it is.~~
**Note:** Some usage of British vs American english e.g. `judgement` vs
`judgment`, `initialise` vs `initialize`, `optimise` vs `optimize` etc.
are both present in different places, but I suppose that's
understandable given the number of contributors.
~~**Note:** There is a spelling mistake in `.github/CODEOWNERS` but it
triggers errors in CI when I make changes to it, so I left it as it
is.~~
Adds availability-write regression tests.
The results for the `availability-distribution` subsystem are volatile,
so I had to reduce the precision of the test.
Fixes https://github.com/paritytech/polkadot-sdk/issues/3528
```rust
latency:
mean_latency_ms = 30 // common sense
std_dev = 2.0 // common sense
n_validators = 300 // max number of validators, from chain config
n_cores = 60 // 300/5
max_validators_per_core = 5 // default
min_pov_size = 5120 // max
max_pov_size = 5120 // max
peer_bandwidth = 44040192 // from the Parity's kusama validators
bandwidth = 44040192 // from the Parity's kusama validators
connectivity = 90 // we need to be connected to 90-95% of peers
```
### What's been done
- `subsystem-bench` has been split into two parts: a cli benchmark
runner and a library.
- The cli runner is quite simple. It just allows us to run `.yaml` based
test sequences. Now it should only be used to run benchmarks during
development.
- The library is used in the cli runner and in regression tests. Some
code is changed to make the library independent of the runner.
- Added first regression tests for availability read and write that
replicate existing test sequences.
### How we run regression tests
- Regression tests are simply rust integration tests without the
harnesses.
- They should only be compiled under the `subsystem-benchmarks` feature
to prevent them from running with other tests.
- This doesn't work when running tests with `nextest` in CI, so
additional filters have been added to the `nextest` runs.
- Each benchmark run takes a different time in the beginning, so we
"warm up" the tests until their CPU usage differs by only 1%.
- After the warm-up, we run the benchmarks a few more times and compare
the average with the exception using a precision.
### What is still wrong?
- I haven't managed to set up approval voting tests. The spread of their
results is too large and can't be narrowed down in a reasonable amount
of time in the warm-up phase.
- The tests start an unconfigurable prometheus endpoint inside, which
causes errors because they use the same 9999 port. I disable it with a
flag, but I think it's better to extract the endpoint launching outside
the test, as we already do with `valgrind` and `pyroscope`. But we still
use `prometheus` inside the tests.
### Future work
* https://github.com/paritytech/polkadot-sdk/issues/3528
* https://github.com/paritytech/polkadot-sdk/issues/3529
* https://github.com/paritytech/polkadot-sdk/issues/3530
* https://github.com/paritytech/polkadot-sdk/issues/3531
---------
Co-authored-by: Alexander Samusev <41779041+alvicsam@users.noreply.github.com>
Lifting some more dependencies to the workspace. Just using the
most-often updated ones for now.
It can be reproduced locally.
```sh
# First you can check if there would be semver incompatible bumps (looks good in this case):
$ zepter transpose dependency lift-to-workspace --ignore-errors syn quote thiserror "regex:^serde.*"
# Then apply the changes:
$ zepter transpose dependency lift-to-workspace --version-resolver=highest syn quote thiserror "regex:^serde.*" --fix
# And format the changes:
$ taplo format --config .config/taplo.toml
```
---------
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Previously, it was only possible to retry the same request on a
different protocol name that had the exact same binary payloads.
Introduce a way of trying a different request on a different protocol if
the first one fails with Unsupported protocol.
This helps with adding new req-response versions in polkadot while
preserving compatibility with unupgraded nodes.
The way req-response protocols were bumped previously was that they were
bundled with some other notifications protocol upgrade, like for async
backing (but that is more complicated, especially if the feature does
not require any changes to a notifications protocol). Will be needed for
implementing https://github.com/polkadot-fellows/RFCs/pull/47
TODO:
- [x] add tests
- [x] add guidance docs in polkadot about req-response protocol
versioning
We currently use a bit of a hack in `.cargo/config` to make sure that
clippy isn't too annoying by specifying the list of lints.
There is now a stable way to define lints for a workspace. The only down
side is that every crate seems to have to opt into this so there's a
*few* files modified in this PR.
Dependencies:
- [x] PR that upgrades CI to use rust 1.74 is merged.
---------
Co-authored-by: joe petrowski <25483142+joepetrowski@users.noreply.github.com>
Co-authored-by: Branislav Kontur <bkontur@gmail.com>
Co-authored-by: Liam Aharon <liam.aharon@hotmail.com>
* polkadot: propagate UnpinHandle to ActiveLeafUpdate
Also extract the leaf creation for tests
into a common function.
* dispute-coordinator: try pinned blocks for slashin
* apparently 1.72 is smarter than 1.70
* address nits
* rename fresh_leaf to new_leaf
* Happy New Year!
* Remove year entierly
Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* Remove years from copyright notice in the entire repo
---------
Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* Pass the PerLeafSpan as mutable reference to handle_new_head function
* cargo +nightly fmt --all
* Add mock span for test
* cargo +nightly fmt --all
* add new-blocks-hashes to span
* ref span in match statement, set span to disabled if not passed
* remove second match clause, make handle_new_head_span mutable
* cargo +nightly fmt --all
* improve tag on error and warning
* add imported blocks and info span
* cargo +nightly fmt --all
* Improve error for imported_blocks_and_info trace
* format tags on get_header_span
* add lost-to-finality tag
* add missing bracket
* - Add bitfield child span
- Add block db insertion span
* - fix update-bitfield span tag
* - Fix type conversion to u64
- Add missing argument
* - Cargo fmt
* - Test add_follows_from
* - Revert as relationship between spans not working correctly
* - use drop to test if parent-child relationship can be re-established
* - remove bitfield span, check if parent-child relationship can be reestablished
* - Remove dangling bitfield span which is not used, to see if parent-child relationship can be re-established
* Another dangling bitfield span
* cargo fmt
* - add imported blocks and info span
- add candidate span per candidate
* add tags before moving block_header to push scope
* - Add db-insertion span
* cargo fmt
* fix types
* * Pass mutable reference to span in handle_new_head
* Change get-header-span tags in handle_new_head
* Create cache-session-info span in handle_new_head
* Create optional argument in determine_new_blocks
* Pass mutable reference to handle_new_head_span in determine_new_blocks in handle_new_head function
* Add candidate-hash, candidate-number, lost-to-finality tags to candidate_span in handle_new_head function
* Manually drop db_insertion_span and remove superfluous tags to it, only keeping approved-bitfields tag
* Add ApprovalVoting stage in jaeger
* * Pass mutable reference to jaeger::Span in stead of PerLeafSpan
* Add block-import span
* *Pass optional_span (optional argument) to determine_new_blocks util function
* * Add num-candidates int tag to block_import_span
* * Add head tag to cache_session_span
* * Create PerLeafSpan in handle_from_overseer (this is required to establish parent-child relationship between approval-voting span, and leaf-activated root span)
* * Add candidate-import-span as child of block-import-span
* Add candidate-hash and num-approval tags to candidate-import-span
* * Fix num-candidate tag to bitvec-len tag in candidate-import-span
* *Fix imported_blocKs_and_info span to create new-block-span as not dealing with candidates
* Consider the future::select! block
* Use HashMap<Hash, jaeger::PerLeafSpan>
* Remove Stage 9
* Add missing spans
* cargo +nightly fmt --all
* Remove optional span argument for determine_new_blocks
* * Remove no-longer needed default PerLeafSpan implementation
* Remove no-longer necessary mock span given re-factoring of handle_new_head() no longer neeing mutable span
* Split validation-result and request-data (availability and validation code) spans into two by dropping request_validation_data_spans
* Remove drop statements for cache_session_info_span
*
* Remove unnecessary span
* Remove another excessively spammy span
* Add missing spans from State in import tests
* Use functional approach to get spans
* - Add functional approach for the approval-voting span
- Add doc on block_numbers given labelling ambiguity
- Add span pruning logic
- Use .add_para_id on validation_result_span
* Replace for hash_set in hash_set_iter with map closure
* cargo +nightly fmt --all
* Change from unconsumed `map` to `.for_each`
* cargo +nightly fmt --all
* Refactor add_para_id to validation_result_span
* cargo +nightly fmt --all
* Remove duplicate tag
* Add missing tag to handle-approved-ancestor span
* Refactor span pruning to only invoke retain once
* Typo in span name
* - Replace unwrap_or with unwrap_or_else due to lazy evaluation of trace-identifier in polkadot_node_jaeger
- Remove some redundant spans
* Add approval-distribution spans
* - Add unwrap_or_else on note-approved-in-chain-selection
- Use child_with_trace_id to add traceID string tag on span (note this does not change the traceID, but just adds a tag)
* cargo +nightly fmt --all
* - Add traceID tags were necessary in approval-voting and availability-distribution
- Always use block-hash tag in stead of relay-parent tag in approval-distribution
* Remove schedule-wakeup span as it will duplicate spans on existing wakeups (which should be a no-op)
* Remove a couple of warnings related to mutability
* Fix failing tests in availability distribution
* Add traceID tag to launch-approval and validation-result
* Reshuffle the validation and validation result spans to where more appropriate and add block-hash tag
* - Add tranche and should-trigger tag to process-wakeup span
- Add candidate-hash and traceID to check-and-import-approval span
* cargo fmt
* - Adjustments after PR comments
* Move span pruning after other pruning logic
* Remove DerefMut - no longer needed
* Relabel request-chunk spans
* - Fix typo in span label
- Add docs for drops
* Add new approval-voting span pruning logic
* Undo removal of !
* cargo fmt
* rust 1.64 enables workspace properties
* add edition, repository and authors.
* of course, update the version in one place.
Co-authored-by: Andronik <write@reusable.software>
* westend: update transaction version
* polkadot: update transaction version
* kusama: update transaction version
* Bump spec_version to 9330
* bump versions to 0.9.33
* Change best effort queue behaviour in `dispute-coordinator`
Use the same type of queue (`BTreeMap<CandidateComparator,
ParticipationRequest>`) for best effort and priority in
`dispute-coordinator`.
Rework `CandidateComparator` to handle unavailable parent
block numbers.
Best effort queue will order disputes the same way as priority does - by
parent's block height. Disputes on candidates for which the parent's
block number can't be obtained will be treated with the lowest priority.
* Fix tests: Handle `ChainApiMessage::BlockNumber` in `handle_sync_queries`
* Some tests are deadlocking on sending messages via overseer so change `SingleItemSink`to `mpsc::Sender` with a buffer of 1
* Fix a race in test after adding a buffered queue for overseer messages
* Fix the rest of the tests
* Guide update - best-effort queue
* Guide update: clarification about spam votes
* Fix tests in `availability-distribution`
* Update comments
* Add `make_buffered_subsystem_context` in `subsystem-test-helpers`
* Code review feedback
* Code review feedback
* Code review feedback
* Don't add best effort candidate if it is already in priority queue
* Remove an old comment
* Fix insert in best_effort
* Bump crate versions
* Bump spec_version to 9280 for kusama
* Bump spec_version to 9280 for polkadot
* Bump spec_version to 9280 for rococo
* Bump spec_version to 9280 for westend
* update Cargo.lock
Co-authored-by: parity-processbot <>
* foo
* rolling session window
* fixup
* remove use statemetn
* fmt
* split NetworkBridge into two subsystems
Pending cleanup
* split
* chore: reexport OrchestraError as OverseerError
* chore: silence warnings
* fixup tests
* chore: add default timenout of 30s to subsystem test helper ctx handle
* single item channel
* fixins
* fmt
* cleanup
* remove dead code
* remove sync bounds again
* wire up shared state
* deal with some FIXMEs
* use distinct tags
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* use tag
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* address naming
tx and rx are common in networking and also have an implicit meaning regarding networking
compared to incoming and outgoing which are already used with subsystems themselvesq
* remove unused sync oracle
* remove unneeded state
* fix tests
* chore: fmt
* do not try to register twice
* leak Metrics type
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
Co-authored-by: Andronik <write@reusable.software>