Implements the idea from
https://github.com/paritytech/polkadot-sdk/pull/3899
- Removed latencies
- Number of runs reduced from 50 to 5, according to local runs it's
quite enough
- Network message is always sent in a spawned task, even if latency is
zero. Without it, CPU time sometimes spikes.
- Removed the `testnet` profile because we probably don't need that
debug additions.
After the local tests I can't say that it brings a significant
improvement in the stability of the results. However, I belive it is
worth trying and looking at the results over time.
This tiny PR extends the `on_validated_block_announce` log with the bad
PeerID.
Used to identify if the peerID is malicious by correlating with other
logs (ie peer-set).
While at it, have removed the `\n` from a multiline log, which did not
play well with
[sub-triage-logs](https://github.com/lexnv/sub-triage-logs/tree/master).
cc @paritytech/networking
---------
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Co-authored-by: Bastian Köcher <git@kchr.de>
[litep2p](https://github.com/altonen/litep2p) is a libp2p-compatible P2P
networking library. It supports all of the features of `rust-libp2p`
that are currently being utilized by Polkadot SDK.
Compared to `rust-libp2p`, `litep2p` has a quite different architecture
which is why the new `litep2p` network backend is only able to use a
little of the existing code in `sc-network`. The design has been mainly
influenced by how we'd wish to structure our networking-related code in
Polkadot SDK: independent higher-levels protocols directly communicating
with the network over links that support bidirectional backpressure. A
good example would be `NotificationHandle`/`RequestResponseHandle`
abstractions which allow, e.g., `SyncingEngine` to directly communicate
with peers to announce/request blocks.
I've tried running `polkadot --network-backend litep2p` with a few
different peer configurations and there is a noticeable reduction in
networking CPU usage. For high load (`--out-peers 200`), networking CPU
usage goes down from ~110% to ~30% (80 pp) and for normal load
(`--out-peers 40`), the usage goes down from ~55% to ~18% (37 pp).
These should not be taken as final numbers because:
a) there are still some low-hanging optimization fruits, such as
enabling [receive window
auto-tuning](https://github.com/libp2p/rust-yamux/pull/176), integrating
`Peerset` more closely with `litep2p` or improving memory usage of the
WebSocket transport
b) fixing bugs/instabilities that incorrectly cause `litep2p` to do less
work will increase the networking CPU usage
c) verification in a more diverse set of tests/conditions is needed
Nevertheless, these numbers should give an early estimate for CPU usage
of the new networking backend.
This PR consists of three separate changes:
* introduce a generic `PeerId` (wrapper around `Multihash`) so that we
don't have use `NetworkService::PeerId` in every part of the code that
uses a `PeerId`
* introduce `NetworkBackend` trait, implement it for the libp2p network
stack and make Polkadot SDK generic over `NetworkBackend`
* implement `NetworkBackend` for litep2p
The new library should be considered experimental which is why
`rust-libp2p` will remain as the default option for the time being. This
PR currently depends on the master branch of `litep2p` but I'll cut a
new release for the library once all review comments have been
addresses.
---------
Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io>
Co-authored-by: Dmitry Markin <dmitry@markin.tech>
Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com>
Co-authored-by: Alexandru Vasile <alexandru.vasile@parity.io>
With Coretime enabled we can no longer assume there is a static 1:1
mapping between core index and para id. This mapping should be obtained
from the scheduler/claimqueue on block by block basis.
This PR modifies `para_id()` (from `CoreState`) to return the scheduled
`ParaId` for occupied cores and removes its usages in the code.
Closes https://github.com/paritytech/polkadot-sdk/issues/3948
---------
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
Remove `fetch_next_scheduled_on_core` in favor of new wrapper and
methods for accessing it.
---------
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Working towards migrating the `parity-bridges-common` repo inside
`polkadot-sdk`. This PR upgrades some dependencies in order to align
them with the versions used in `parity-bridges-common`
Related to
https://github.com/paritytech/parity-bridges-common/issues/2538
This outputs:
```
2024-04-02 14:36:02.135 ERROR tokio-runtime-worker beefy: 🥩 for session starting at block 21990151
no BEEFY authority key found in store, you must generate valid session keys
(https://wiki.polkadot.network/docs/maintain-guides-how-to-validate-polkadot#generating-the-session-keys)
```
error log entry, once every session, for nodes running with
`Role::Authority` that have no public BEEFY key in their keystore
---------
Co-authored-by: Bastian Köcher <git@kchr.de>
Rejoice! Rejoice! The story is nearly over.
This PR removes stale migrations, auxiliary structures, and package
dependencies, thus making Rococo and Westend totally free from any
`im-online`-related stuff.
`im-online` still stays a part of the Substrate node and its runtime:
https://github.com/paritytech/polkadot-sdk/blob/0d9324847391e902bb42f84f0e76096b1f764efe/substrate/bin/node/runtime/src/lib.rs#L2276-L2277
I'm not sure if it makes sense to remove it from there considering that
we're not removing `im-online` from FRAME. Please share your opinion.
Runtime release 1.2 includes bumping of the ParachainHost APIs up to
v10, so let's move all the released APIs out of vstaging folder, this PR
does not include any logic changes only renaming of the modules and some
moving around.
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
The metric records the current protocol_version of the validator that
just connected with the peer_map.len(), which contains all peers that
connected, that has the effect the metric will be wrong since it won't
tell us how many peers we have connected per version because it will
always record the total number of peers
Fix this by counting by version inside peer_map, additionally because
that might be a bit heavier than len(), publish it only on-active
leaves.
---------
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
This works only for collators that implement the `collator_fn` allowing
`collation-generation` subsystem to pull collations triggered on new
heads.
Also enables
`request_v2::CollationFetchingResponse::CollationWithParentHeadData` for
test adder/undying collators.
TODO:
- [x] fix tests
- [x] new tests
- [x] PR doc
---------
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
Fixes#3826.
The docs on the `candidates` field of `BlockEntry` were incorrectly
stating that they are sorted by core index. The (incorrect) optimization
was introduced in #3747 based on this assumption. The actual ordering is
based on `CandidateIncluded` events ordering in the runtime. We revert
this optimization here.
- [x] verify the underlying issue
- [x] add a regression test
---------
Co-authored-by: Bastian Köcher <git@kchr.de>
The PR provides API for obtaining:
- the weight required to execute an XCM message,
- a list of acceptable `AssetId`s for message execution payment,
- the cost of the weight in the specified acceptable `AssetId`.
It is meant to address an issue where one has to guess how much fee to
pay for execution. Also, at the moment, a client has to guess which
assets are acceptable for fee execution payment.
See the related issue
https://github.com/paritytech/polkadot-sdk/issues/690.
With this API, a client is supposed to query the list of the supported
asset IDs (in the XCM version format the client understands), weigh the
XCM program the client wants to execute and convert the weight into one
of the acceptable assets. Note that the client is supposed to know what
program will be executed on what chains. However, having a small
companion JS library for the pallet-xcm and xtokens should be enough to
determine what XCM programs will be executed and where (since these
pallets compose a known small set of programs).
```Rust
pub trait XcmPaymentApi<Call>
where
Call: Codec,
{
/// Returns a list of acceptable payment assets.
///
/// # Arguments
///
/// * `xcm_version`: Version.
fn query_acceptable_payment_assets(xcm_version: Version) -> Result<Vec<VersionedAssetId>, Error>;
/// Returns a weight needed to execute a XCM.
///
/// # Arguments
///
/// * `message`: `VersionedXcm`.
fn query_xcm_weight(message: VersionedXcm<Call>) -> Result<Weight, Error>;
/// Converts a weight into a fee for the specified `AssetId`.
///
/// # Arguments
///
/// * `weight`: convertible `Weight`.
/// * `asset`: `VersionedAssetId`.
fn query_weight_to_asset_fee(weight: Weight, asset: VersionedAssetId) -> Result<u128, Error>;
/// Get delivery fees for sending a specific `message` to a `destination`.
/// These always come in a specific asset, defined by the chain.
///
/// # Arguments
/// * `message`: The message that'll be sent, necessary because most delivery fees are based on the
/// size of the message.
/// * `destination`: The destination to send the message to. Different destinations may use
/// different senders that charge different fees.
fn query_delivery_fees(destination: VersionedLocation, message: VersionedXcm<()>) -> Result<VersionedAssets, Error>;
}
```
An
[example](https://gist.github.com/PraetorP/4bc323ff85401abe253897ba990ec29d)
of a client side code.
---------
Co-authored-by: Francisco Aguirre <franciscoaguirreperez@gmail.com>
Co-authored-by: Adrian Catangiu <adrian@parity.io>
Co-authored-by: Daniel Shiposha <mrshiposha@gmail.com>
**Update:** Pushed additional changes based on the review comments.
**This pull request fixes various spelling mistakes in this
repository.**
Most of the changes are contained in the first **3** commits:
- `Fix spelling mistakes in comments and docs`
- `Fix spelling mistakes in test names`
- `Fix spelling mistakes in error messages, panic messages, logs and
tracing`
Other source code spelling mistakes are separated into individual
commits for easier reviewing:
- `Fix the spelling of 'authority'`
- `Fix the spelling of 'REASONABLE_HEADERS_IN_JUSTIFICATION_ANCESTRY'`
- `Fix the spelling of 'prev_enqueud_messages'`
- `Fix the spelling of 'endpoint'`
- `Fix the spelling of 'children'`
- `Fix the spelling of 'PenpalSiblingSovereignAccount'`
- `Fix the spelling of 'PenpalSudoAccount'`
- `Fix the spelling of 'insufficient'`
- `Fix the spelling of 'PalletXcmExtrinsicsBenchmark'`
- `Fix the spelling of 'subtracted'`
- `Fix the spelling of 'CandidatePendingAvailability'`
- `Fix the spelling of 'exclusive'`
- `Fix the spelling of 'until'`
- `Fix the spelling of 'discriminator'`
- `Fix the spelling of 'nonexistent'`
- `Fix the spelling of 'subsystem'`
- `Fix the spelling of 'indices'`
- `Fix the spelling of 'committed'`
- `Fix the spelling of 'topology'`
- `Fix the spelling of 'response'`
- `Fix the spelling of 'beneficiary'`
- `Fix the spelling of 'formatted'`
- `Fix the spelling of 'UNKNOWN_PROOF_REQUEST'`
- `Fix the spelling of 'succeeded'`
- `Fix the spelling of 'reopened'`
- `Fix the spelling of 'proposer'`
- `Fix the spelling of 'InstantiationNonce'`
- `Fix the spelling of 'depositor'`
- `Fix the spelling of 'expiration'`
- `Fix the spelling of 'phantom'`
- `Fix the spelling of 'AggregatedKeyValue'`
- `Fix the spelling of 'randomness'`
- `Fix the spelling of 'defendant'`
- `Fix the spelling of 'AquaticMammal'`
- `Fix the spelling of 'transactions'`
- `Fix the spelling of 'PassingTracingSubscriber'`
- `Fix the spelling of 'TxSignaturePayload'`
- `Fix the spelling of 'versioning'`
- `Fix the spelling of 'descendant'`
- `Fix the spelling of 'overridden'`
- `Fix the spelling of 'network'`
Let me know if this structure is adequate.
**Note:** The usage of the words `Merkle`, `Merkelize`, `Merklization`,
`Merkelization`, `Merkleization`, is somewhat inconsistent but I left it
as it is.
~~**Note:** In some places the term `Receival` is used to refer to
message reception, IMO `Reception` is the correct word here, but I left
it as it is.~~
~~**Note:** In some places the term `Overlayed` is used instead of the
more acceptable version `Overlaid` but I also left it as it is.~~
~~**Note:** In some places the term `Applyable` is used instead of the
correct version `Applicable` but I also left it as it is.~~
**Note:** Some usage of British vs American english e.g. `judgement` vs
`judgment`, `initialise` vs `initialize`, `optimise` vs `optimize` etc.
are both present in different places, but I suppose that's
understandable given the number of contributors.
~~**Note:** There is a spelling mistake in `.github/CODEOWNERS` but it
triggers errors in CI when I make changes to it, so I left it as it
is.~~
Adds availability-write regression tests.
The results for the `availability-distribution` subsystem are volatile,
so I had to reduce the precision of the test.
Related to
https://github.com/paritytech/parity-bridges-common/issues/2538
This PR doesn't contain any functional changes.
The PR moves specific bridged chain definitions from
`bridges/primitives` to `bridges/chains` folder in order to facilitate
the migration of the `parity-bridges-repo` into `polkadot-sdk` as
discussed in https://hackmd.io/LprWjZ0bQXKpFeveYHIRXw?view
Apart from this it also includes some cosmetic changes to some
`Cargo.toml` files as a result of running `diener workspacify`.
Small refactoring to reduce the algorithmic complexity of the initial
message distribution in approval voting after a sync from O(n_candidates
^ 2) to O(n_candidates).
The PR adds two things:
1. Runtime API exposing the whole claim queue
2. Consumes the API in `collation-generation` to fetch the next
scheduled `ParaEntry` for an occupied core.
Related to https://github.com/paritytech/polkadot-sdk/issues/1797
Introduces `CryptoBytes` type defined as:
```rust
pub struct CryptoBytes<const N: usize, Tag = ()>(pub [u8; N], PhantomData<fn() -> Tag>);
```
The type implements a bunch of methods and traits which are typically
expected from a byte array newtype
(NOTE: some of the methods and trait implementations IMO are a bit
redundant, but I decided to maintain them all to not change too much
stuff in this PR)
It also introduces two (generic) typical consumers of `CryptoBytes`:
`PublicBytes` and `SignatureBytes`.
```rust
pub struct PublicTag;
pub PublicBytes<const N: usize, CryptoTag> = CryptoBytes<N, (PublicTag, CryptoTag)>;
pub struct SignatureTag;
pub SignatureBytes<const N: usize, CryptoTag> = CryptoBytes<N, (SignatureTag, CryptoTag)>;
```
Both of them use a tag to differentiate the two types at a higher level.
Downstream specializations will further specialize using a dedicated
crypto tag. For example in ECDSA:
```rust
pub struct EcdsaTag;
pub type Public = PublicBytes<PUBLIC_KEY_SERIALIZED_SIZE, EcdsaTag>;
pub type Signature = PublicBytes<PUBLIC_KEY_SERIALIZED_SIZE, EcdsaTag>;
```
Overall we have a cleaner and most importantly **consistent** code for
all the types involved
All these details are opaque to the end user which can use `Public` and
`Signature` for the cryptos as before
On top of #3302.
We want the validators to upgrade first before we add changes to the
collation side to send the new variants, which is why this part is
extracted into a separate PR.
The detection of when to send the parent head is based on the core
assignments at the relay parent of the candidate. We probably want to
make it more flexible in the future, but for now, it will work for a
simple use case when a para always has multiple cores assigned to it.
---------
Signed-off-by: Matteo Muraca <mmuraca247@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Matteo Muraca <56828990+muraca@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Juan Ignacio Rios <54085674+JuaniRios@users.noreply.github.com>
Co-authored-by: Branislav Kontur <bkontur@gmail.com>
Co-authored-by: Bastian Köcher <git@kchr.de>
Fixes#3128.
This introduces a new variant for the collation response from the
collator that includes the parent head data. For now, collators won't
send this new variant. We'll need to change the collator side of the
collator protocol to detect all the cores assigned to a para and send
the parent head data in the case when it's more than 1 core.
- [x] validate approach
- [x] check head data hash
This is printed every 10 minutes, I see no reason why it shouldn't be in
all the logs, it would give us valuable information about what is going
on with node connectivity when validators come-back to us to report
issues.
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Sometimes we see nodes printing this warning:
```
cannot query the runtime API version: Api called for an unknown Block: State already discarded for
```
The log is harmless, but let's print the api we got this for, so that we
can track its call site and truly confirm it is harmless or fix it.
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
This PR adds a new PolkaVM-based executor to Substrate.
- The executor can now be used to actually run a PolkaVM-based runtime,
and successfully produces blocks.
- The executor is always compiled-in, but is disabled by default.
- The `SUBSTRATE_ENABLE_POLKAVM` environment variable must be set to `1`
to enable the executor, in which case the node will accept both WASM and
PolkaVM program blobs (otherwise it'll default to WASM-only). This is
deliberately undocumented and not explicitly exposed anywhere (e.g. in
the command line arguments, or in the API) to disincentivize anyone from
enabling it in production. If/when we'll move this into production usage
I'll remove the environment variable and do it "properly".
- I did not use our legacy runtime allocator for the PolkaVM executor,
so currently every allocation inside of the runtime will leak guest
memory until that particular instance is destroyed. The idea here is
that I will work on the https://github.com/polkadot-fellows/RFCs/pull/4
which will remove the need for the legacy allocator under WASM, and that
will also allow us to use a proper non-leaking allocator under PolkaVM.
- I also did some minor cleanups of the WASM executor and deleted some
dead code.
No prdocs included since this is not intended to be an end-user feature,
but an unofficial experiment, and shouldn't affect any current
production user. Once this is production-ready a full Polkadot
Fellowship RFC will be necessary anyway.
Fixes https://github.com/paritytech/polkadot-sdk/issues/3528
```rust
latency:
mean_latency_ms = 30 // common sense
std_dev = 2.0 // common sense
n_validators = 300 // max number of validators, from chain config
n_cores = 60 // 300/5
max_validators_per_core = 5 // default
min_pov_size = 5120 // max
max_pov_size = 5120 // max
peer_bandwidth = 44040192 // from the Parity's kusama validators
bandwidth = 44040192 // from the Parity's kusama validators
connectivity = 90 // we need to be connected to 90-95% of peers
```
Looking at rococo-asset-hub
https://github.com/paritytech/polkadot-sdk/issues/3519 there seems to be
a lot of instances where collator did not advertise their collations,
while there are multiple problems there, one of it is that we are
connecting and disconnecting to our assigned validators every block,
because on reconnect_timeout every 4s we call connect_to_validators and
that will produce 0 validators when all went well, so set_reseverd_peers
called from validator discovery will disconnect all our peers.
More details here:
https://github.com/paritytech/polkadot-sdk/issues/3519#issuecomment-1972667343
Now, this shouldn't be a problem, but it stacks with an existing bug in
our network stack where if disconnect from a peer the peer might not
notice it, so it won't detect the reconnect either and it won't send us
the necessary view updates, so we won't advertise the collation to it
more details here:
https://github.com/paritytech/polkadot-sdk/issues/3519#issuecomment-1972958276
To avoid hitting this condition that often, let's keep the peers in the
reserved set for the entire duration we are allocated to a backing
group. Backing group sizes(1 rococo, 3 kusama, 5 polkadot) are really
small, so this shouldn't lead to that many connections. Additionally,
the validators would disconnect us any way if we don't advertise
anything for 4 blocks.
## TODO
- [x] More testing.
- [x] Confirm on rococo that this is improving the situation. (It
doesn't but just because other things are going wrong there).
---------
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>