pezkuwi-subxt

mirror of https://github.com/pezkuwichain/pezkuwi-subxt.git synced 2026-04-26 02:57:57 +00:00

Author	SHA1	Message	Date
Aaro Altonen	80616f6d03	Integrate litep2p into Polkadot SDK (#2944 ) [litep2p](https://github.com/altonen/litep2p) is a libp2p-compatible P2P networking library. It supports all of the features of `rust-libp2p` that are currently being utilized by Polkadot SDK. Compared to `rust-libp2p`, `litep2p` has a quite different architecture which is why the new `litep2p` network backend is only able to use a little of the existing code in `sc-network`. The design has been mainly influenced by how we'd wish to structure our networking-related code in Polkadot SDK: independent higher-levels protocols directly communicating with the network over links that support bidirectional backpressure. A good example would be `NotificationHandle`/`RequestResponseHandle` abstractions which allow, e.g., `SyncingEngine` to directly communicate with peers to announce/request blocks. I've tried running `polkadot --network-backend litep2p` with a few different peer configurations and there is a noticeable reduction in networking CPU usage. For high load (`--out-peers 200`), networking CPU usage goes down from ~110% to ~30% (80 pp) and for normal load (`--out-peers 40`), the usage goes down from ~55% to ~18% (37 pp). These should not be taken as final numbers because: a) there are still some low-hanging optimization fruits, such as enabling [receive window auto-tuning](https://github.com/libp2p/rust-yamux/pull/176), integrating `Peerset` more closely with `litep2p` or improving memory usage of the WebSocket transport b) fixing bugs/instabilities that incorrectly cause `litep2p` to do less work will increase the networking CPU usage c) verification in a more diverse set of tests/conditions is needed Nevertheless, these numbers should give an early estimate for CPU usage of the new networking backend. This PR consists of three separate changes: * introduce a generic `PeerId` (wrapper around `Multihash`) so that we don't have use `NetworkService::PeerId` in every part of the code that uses a `PeerId` * introduce `NetworkBackend` trait, implement it for the libp2p network stack and make Polkadot SDK generic over `NetworkBackend` * implement `NetworkBackend` for litep2p The new library should be considered experimental which is why `rust-libp2p` will remain as the default option for the time being. This PR currently depends on the master branch of `litep2p` but I'll cut a new release for the library once all review comments have been addresses. --------- Signed-off-by: Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by: Dmitry Markin <dmitry@markin.tech> Co-authored-by: Alexandru Vasile <60601340+lexnv@users.noreply.github.com> Co-authored-by: Alexandru Vasile <alexandru.vasile@parity.io>	2024-04-08 16:44:13 +00:00
Alexandru Gheorghe	d6f68bb906	primitives: Move out of staging released APIs (#3925 ) Runtime release 1.2 includes bumping of the ParachainHost APIs up to v10, so let's move all the released APIs out of vstaging folder, this PR does not include any logic changes only renaming of the modules and some moving around. Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>	2024-04-01 13:03:26 +00:00
Andrei Eres	fd79b3b08a	[subsystem-benchmarks] Save results to json (#3829 ) Here we add the ability to save subsystem benchmark results in JSON format to display them as graphs To draw graphs, CI team will use [github-action-benchmark](https://github.com/benchmark-action/github-action-benchmark). Since we are using custom benchmarks, we need to prepare [a specific data type](https://github.com/benchmark-action/github-action-benchmark?tab=readme-ov-file#examples): ``` [ { "name": "CPU Load", "unit": "Percent", "value": 50 } ] ``` Then we'll get graphs like this: ![example](https://raw.githubusercontent.com/rhysd/ss/master/github-action-benchmark/main.png) [A live page with graphs](https://benchmark-action.github.io/github-action-benchmark/dev/bench/) --------- Co-authored-by: ordian <write@reusable.software>	2024-03-26 15:51:47 +00:00
Dcompoze	002d9260f9	Fix spelling mistakes across the whole repository (#3808 ) Update: Pushed additional changes based on the review comments. This pull request fixes various spelling mistakes in this repository. Most of the changes are contained in the first 3 commits: - `Fix spelling mistakes in comments and docs` - `Fix spelling mistakes in test names` - `Fix spelling mistakes in error messages, panic messages, logs and tracing` Other source code spelling mistakes are separated into individual commits for easier reviewing: - `Fix the spelling of 'authority'` - `Fix the spelling of 'REASONABLE_HEADERS_IN_JUSTIFICATION_ANCESTRY'` - `Fix the spelling of 'prev_enqueud_messages'` - `Fix the spelling of 'endpoint'` - `Fix the spelling of 'children'` - `Fix the spelling of 'PenpalSiblingSovereignAccount'` - `Fix the spelling of 'PenpalSudoAccount'` - `Fix the spelling of 'insufficient'` - `Fix the spelling of 'PalletXcmExtrinsicsBenchmark'` - `Fix the spelling of 'subtracted'` - `Fix the spelling of 'CandidatePendingAvailability'` - `Fix the spelling of 'exclusive'` - `Fix the spelling of 'until'` - `Fix the spelling of 'discriminator'` - `Fix the spelling of 'nonexistent'` - `Fix the spelling of 'subsystem'` - `Fix the spelling of 'indices'` - `Fix the spelling of 'committed'` - `Fix the spelling of 'topology'` - `Fix the spelling of 'response'` - `Fix the spelling of 'beneficiary'` - `Fix the spelling of 'formatted'` - `Fix the spelling of 'UNKNOWN_PROOF_REQUEST'` - `Fix the spelling of 'succeeded'` - `Fix the spelling of 'reopened'` - `Fix the spelling of 'proposer'` - `Fix the spelling of 'InstantiationNonce'` - `Fix the spelling of 'depositor'` - `Fix the spelling of 'expiration'` - `Fix the spelling of 'phantom'` - `Fix the spelling of 'AggregatedKeyValue'` - `Fix the spelling of 'randomness'` - `Fix the spelling of 'defendant'` - `Fix the spelling of 'AquaticMammal'` - `Fix the spelling of 'transactions'` - `Fix the spelling of 'PassingTracingSubscriber'` - `Fix the spelling of 'TxSignaturePayload'` - `Fix the spelling of 'versioning'` - `Fix the spelling of 'descendant'` - `Fix the spelling of 'overridden'` - `Fix the spelling of 'network'` Let me know if this structure is adequate. Note: The usage of the words `Merkle`, `Merkelize`, `Merklization`, `Merkelization`, `Merkleization`, is somewhat inconsistent but I left it as it is. ~~Note: In some places the term `Receival` is used to refer to message reception, IMO `Reception` is the correct word here, but I left it as it is.~~ ~~Note: In some places the term `Overlayed` is used instead of the more acceptable version `Overlaid` but I also left it as it is.~~ ~~Note: In some places the term `Applyable` is used instead of the correct version `Applicable` but I also left it as it is.~~ Note: Some usage of British vs American english e.g. `judgement` vs `judgment`, `initialise` vs `initialize`, `optimise` vs `optimize` etc. are both present in different places, but I suppose that's understandable given the number of contributors. ~~Note: There is a spelling mistake in `.github/CODEOWNERS` but it triggers errors in CI when I make changes to it, so I left it as it is.~~	2024-03-26 13:57:57 +00:00
Andrei Eres	cc1e6ac301	[subsystem-benchmarks] Fix availability-write regression tests (#3698 ) Adds availability-write regression tests. The results for the `availability-distribution` subsystem are volatile, so I had to reduce the precision of the test.	2024-03-25 15:57:46 +00:00
Andrei Eres	256546b7b9	Add subsystems regression tests to CI (#3527 ) Fixes https://github.com/paritytech/polkadot-sdk/issues/3530	2024-03-13 16:54:41 +00:00
Andrei Eres	05381afcb2	subsystem-bench: adjust test config to Kusama (#3583 ) Fixes https://github.com/paritytech/polkadot-sdk/issues/3528 ```rust latency: mean_latency_ms = 30 // common sense std_dev = 2.0 // common sense n_validators = 300 // max number of validators, from chain config n_cores = 60 // 300/5 max_validators_per_core = 5 // default min_pov_size = 5120 // max max_pov_size = 5120 // max peer_bandwidth = 44040192 // from the Parity's kusama validators bandwidth = 44040192 // from the Parity's kusama validators connectivity = 90 // we need to be connected to 90-95% of peers ```	2024-03-11 16:17:45 +00:00
cuinix	ea458d0b95	fix some typos (#3587 ) Signed-off-by: cuinix <915115094@qq.com> Co-authored-by: Bastian Köcher <git@kchr.de>	2024-03-08 21:28:04 +00:00
Andrei Eres	f0e589d72e	subsystem-bench: add regression tests for availability read and write (#3311 ) ### What's been done - `subsystem-bench` has been split into two parts: a cli benchmark runner and a library. - The cli runner is quite simple. It just allows us to run `.yaml` based test sequences. Now it should only be used to run benchmarks during development. - The library is used in the cli runner and in regression tests. Some code is changed to make the library independent of the runner. - Added first regression tests for availability read and write that replicate existing test sequences. ### How we run regression tests - Regression tests are simply rust integration tests without the harnesses. - They should only be compiled under the `subsystem-benchmarks` feature to prevent them from running with other tests. - This doesn't work when running tests with `nextest` in CI, so additional filters have been added to the `nextest` runs. - Each benchmark run takes a different time in the beginning, so we "warm up" the tests until their CPU usage differs by only 1%. - After the warm-up, we run the benchmarks a few more times and compare the average with the exception using a precision. ### What is still wrong? - I haven't managed to set up approval voting tests. The spread of their results is too large and can't be narrowed down in a reasonable amount of time in the warm-up phase. - The tests start an unconfigurable prometheus endpoint inside, which causes errors because they use the same 9999 port. I disable it with a flag, but I think it's better to extract the endpoint launching outside the test, as we already do with `valgrind` and `pyroscope`. But we still use `prometheus` inside the tests. ### Future work * https://github.com/paritytech/polkadot-sdk/issues/3528 * https://github.com/paritytech/polkadot-sdk/issues/3529 * https://github.com/paritytech/polkadot-sdk/issues/3530 * https://github.com/paritytech/polkadot-sdk/issues/3531 --------- Co-authored-by: Alexander Samusev <41779041+alvicsam@users.noreply.github.com>	2024-03-01 14:30:43 +00:00
Andrei Eres	cbd684673b	subsystem-bench: polish imports (#3262 )	2024-02-12 09:17:29 +00:00
Andrei Eres	07f8592974	subsystem-bench: run cli benchmarks only using config files (#3239 ) This PR removes the configuration of subsystem benchmarks via CLI arguments. After this, we only keep configurations only in yaml files. It removes unnecessary code duplication	2024-02-08 13:33:32 +00:00
Andrei Eres	9e6298e7f7	subsystem-bench: Prepare CI output (#3158 ) 1. Benchmark results are collected in a single struct. 2. The output of the results is prettified. 3. The result struct used to save the output as a yaml and store it in artifacts in a CI job. ``` $ cargo run -p polkadot-subsystem-bench --release -- test-sequence --path polkadot/node/subsystem-bench/examples/availability_read.yaml \| tee output.txt $ cat output.txt polkadot/node/subsystem-bench/examples/availability_read.yaml #1 Network usage, KiB total per block Received from peers 510796.000 170265.333 Sent to peers 221.000 73.667 CPU usage, s total per block availability-recovery 38.671 12.890 Test environment 0.255 0.085 polkadot/node/subsystem-bench/examples/availability_read.yaml #2 Network usage, KiB total per block Received from peers 413633.000 137877.667 Sent to peers 353.000 117.667 CPU usage, s total per block availability-recovery 52.630 17.543 Test environment 0.271 0.090 polkadot/node/subsystem-bench/examples/availability_read.yaml #3 Network usage, KiB total per block Received from peers 424379.000 141459.667 Sent to peers 703.000 234.333 CPU usage, s total per block availability-recovery 51.128 17.043 Test environment 0.502 0.167 ``` ``` $ cargo run -p polkadot-subsystem-bench --release -- --ci test-sequence --path polkadot/node/subsystem-bench/examples/availability_read.yaml \| tee output.txt $ cat output.txt - benchmark_name: 'polkadot/node/subsystem-bench/examples/availability_read.yaml #1' network: - resource: Received from peers total: 509011.0 per_block: 169670.33333333334 - resource: Sent to peers total: 220.0 per_block: 73.33333333333333 cpu: - resource: availability-recovery total: 31.845848445 per_block: 10.615282815 - resource: Test environment total: 0.23582828799999941 per_block: 0.07860942933333313 - benchmark_name: 'polkadot/node/subsystem-bench/examples/availability_read.yaml #2' network: - resource: Received from peers total: 411738.0 per_block: 137246.0 - resource: Sent to peers total: 351.0 per_block: 117.0 cpu: - resource: availability-recovery total: 18.93596025099999 per_block: 6.31198675033333 - resource: Test environment total: 0.2541994199999979 per_block: 0.0847331399999993 - benchmark_name: 'polkadot/node/subsystem-bench/examples/availability_read.yaml #3' network: - resource: Received from peers total: 424548.0 per_block: 141516.0 - resource: Sent to peers total: 703.0 per_block: 234.33333333333334 cpu: - resource: availability-recovery total: 16.54178526900001 per_block: 5.513928423000003 - resource: Test environment total: 0.43960946299999537 per_block: 0.14653648766666513 ``` --------- Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>	2024-02-06 16:57:38 +00:00
Alexandru Gheorghe	f9f886886b	Introduce approval-voting/distribution benchmark (#2621 ) ## Summary Built on top of the tooling and ideas introduced in https://github.com/paritytech/polkadot-sdk/pull/2528, this PR introduces a synthetic benchmark for measuring and assessing the performance characteristics of the approval-voting and approval-distribution subsystems. Currently this allows, us to simulate the behaviours of these systems based on the following dimensions: ``` TestConfiguration: # Test 1 - objective: !ApprovalsTest last_considered_tranche: 89 min_coalesce: 1 max_coalesce: 6 enable_assignments_v2: true send_till_tranche: 60 stop_when_approved: false coalesce_tranche_diff: 12 workdir_prefix: "/tmp" num_no_shows_per_candidate: 0 approval_distribution_expected_tof: 6.0 approval_distribution_cpu_ms: 3.0 approval_voting_cpu_ms: 4.30 n_validators: 500 n_cores: 100 n_included_candidates: 100 min_pov_size: 1120 max_pov_size: 5120 peer_bandwidth: 524288000000 bandwidth: 524288000000 latency: min_latency: secs: 0 nanos: 1000000 max_latency: secs: 0 nanos: 100000000 error: 0 num_blocks: 10 ``` ## The approach 1. We build a real overseer with the real implementations for approval-voting and approval-distribution subsystems. 2. For a given network size, for each validator we pre-computed all potential assignments and approvals it would send, because this a computation heavy operation this will be cached on a file on disk and be re-used if the generation parameters don't change. 3. The messages will be sent accordingly to the configured parameters and those are split into 3 main benchmarking scenarios. ## Benchmarking scenarios ### Best case scenario approvals_throughput_best_case.yaml It send to the approval-distribution only the minimum required tranche to gathered the needed_approvals, so that a candidate is approved. ### Behaviour in the presence of no-shows approvals_no_shows.yaml It sends the tranche needed to approve a candidate when we have a maximum of num_no_shows_per_candidate tranches with no-shows for each candidate. ### Maximum throughput approvals_throughput.yaml It sends all the tranches for each block and measures the used CPU and necessary network bandwidth. by the approval-voting and approval-distribution subsystem. ## How to run it ``` cargo run -p polkadot-subsystem-bench --release -- test-sequence --path polkadot/node/subsystem-bench/examples/approvals_throughput.yaml ``` ## Evaluating performance ### Use the real subsystems metrics If you follow the steps in https://github.com/paritytech/polkadot-sdk/tree/master/polkadot/node/subsystem-bench#install-grafana for installing locally prometheus and grafana, all real metrics for the `approval-distribution`, `approval-voting` and overseer are available. E.g: <img width="2149" alt="Screenshot 2023-12-05 at 11 07 46" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/cb8ae2dd-178b-4922-bfa4-dc37e572ed38"> <img width="2551" alt="Screenshot 2023-12-05 at 11 09 42" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/8b4542ba-88b9-46f9-9b70-cc345366081b"> <img width="2154" alt="Screenshot 2023-12-05 at 11 10 15" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/b8874d8d-632e-443a-9840-14ad8e90c54f"> <img width="2535" alt="Screenshot 2023-12-05 at 11 10 52" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/779a439f-fd18-4985-bb80-85d5afad78e2"> ### Profile with pyroscope 1. Setup pyroscope following the steps in https://github.com/paritytech/polkadot-sdk/tree/master/polkadot/node/subsystem-bench#install-pyroscope, then run any of the benchmark scenario with `--profile` as the arguments. 2. Open the pyroscope dashboard in grafana, e.g: <img width="2544" alt="Screenshot 2024-01-09 at 17 09 58" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/58f50c99-a910-4d20-951a-8b16639303d9"> ### Useful logs 1. Network bandwidth requirements: ``` Payload bytes received from peers: 503993 KiB total, 50399 KiB/block Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block ``` 2. Cpu usage by the approval-distribution/approval-voting subsystems. ``` approval-distribution CPU usage 84.061s approval-distribution CPU usage per block 8.406s approval-voting CPU usage 96.532s approval-voting CPU usage per block 9.653s ``` 3. Time passed until a given block is approved ``` Chain selection approved after 3500 ms hash=0x0101010101010101010101010101010101010101010101010101010101010101 Chain selection approved after 4500 ms hash=0x0202020202020202020202020202020202020202020202020202020202020202 ``` ### Using benchmark to quantify improvements from https://github.com/paritytech/polkadot-sdk/pull/1178 + https://github.com/paritytech/polkadot-sdk/pull/1191 Using a versi-node we compare the scenarios where all new optimisations are disabled with a scenarios where tranche0 assignments are sent in a single message and a conservative simulation where the coalescing of approvals gives us just 50% reduction in the number of messages we send. Overall, what we see is a speedup of around 30-40% in the time it takes to process the necessary messages and a 30-40% reduction in the necessary bandwidth. #### Best case scenario comparison(minimum required tranches sent). Unoptimised ``` Number of blocks: 10 Payload bytes received from peers: 53289 KiB total, 5328 KiB/block Payload bytes sent to peers: 52489 KiB total, 5248 KiB/block approval-distribution CPU usage 6.732s approval-distribution CPU usage per block 0.673s approval-voting CPU usage 9.523s approval-voting CPU usage per block 0.952s ``` vs Optimisation enabled ``` Number of blocks: 10 Payload bytes received from peers: 32141 KiB total, 3214 KiB/block Payload bytes sent to peers: 37314 KiB total, 3731 KiB/block approval-distribution CPU usage 4.658s approval-distribution CPU usage per block 0.466s approval-voting CPU usage 6.236s approval-voting CPU usage per block 0.624s ``` #### Worst case all tranches sent, very unlikely happens when sharding breaks. Unoptimised ``` Number of blocks: 10 Payload bytes received from peers: 746393 KiB total, 74639 KiB/block Payload bytes sent to peers: 729151 KiB total, 72915 KiB/block approval-distribution CPU usage 118.681s approval-distribution CPU usage per block 11.868s approval-voting CPU usage 124.118s approval-voting CPU usage per block 12.412s ``` vs optimised ``` Number of blocks: 10 Payload bytes received from peers: 503993 KiB total, 50399 KiB/block Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block approval-distribution CPU usage 84.061s approval-distribution CPU usage per block 8.406s approval-voting CPU usage 96.532s approval-voting CPU usage per block 9.653s ``` ## TODOs [x] Polish implementation. [x] Use what we have so far to evaluate https://github.com/paritytech/polkadot-sdk/pull/1191 before merging. [x] List of features and additional dimensions we want to use for benchmarking. [x] Run benchmark on hardware similar with versi and kusama nodes. [ ] Add benchmark to be run in CI for catching regression in performance. [ ] Rebase on latest changes for network emulation. --------- Signed-off-by: Andrei Sandu <andrei-mihail@parity.io> Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io> Co-authored-by: Andrei Sandu <andrei-mihail@parity.io> Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>	2024-02-05 06:46:22 +00:00
Alexander Samusev	5b7f24fca1	[ci] Update rust in ci image (1.75 and 2024-01-22) (#3016 ) cc https://github.com/paritytech/ci_cd/issues/926 --------- Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> Co-authored-by: command-bot <> Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> Co-authored-by: Bastian Köcher <git@kchr.de> Co-authored-by: Liam Aharon <liam.aharon@hotmail.com> Co-authored-by: Dónal Murray <donal.murray@parity.io> Co-authored-by: Vladimir Istyufeev <vladimir@parity.io>	2024-01-30 23:17:44 +00:00
Andrei Sandu	47e46d178b	Add subsystem benchmarks for `availability-distribution` and `biftield-distribution` (availability write) (#2970 ) Introduce a new test objective : `DataAvailabilityWrite`. The new benchmark measures the network and cpu usage of `availability-distribution`, `biftield-distribution` and `availability-store` subsystems from the perspective of a validator node during the process when candidates are made available. Additionally I refactored the networking emulation to support bandwidth acounting and limits of incoming and outgoing requests. Screenshot of succesful run <img width="1293" alt="Screenshot 2024-01-17 at 19 17 44" src="https://github.com/paritytech/polkadot-sdk/assets/54316454/fde11280-e25b-4dc3-9dc9-d4b9752f9b7a"> --------- Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>	2024-01-25 17:02:24 +00:00
Andrei Eres	ec7bfae00a	subsystem-bench: cache misses profiling (#2893 ) ## Why we need it To provide another level of understanding to why polkadot's subsystems may perform slower than expected. Cache misses occur when processing large amounts of data, such as during availability recovery. ## Why Cachegrind Cachegrind has many drawbacks: it is slow, it uses its own cache simulation, which is very basic. But unlike `perf`, which is a great tool, Cachegrind can run in a virtual machine. This means we can easily run it in remote installations and even use it in CI/CD to catch possible regressions. Why Cachegrind and not Callgrind, another part of Valgrind? It is simply empirically proven that profiling runs faster with Cachegrind. ## First results First results have been obtained while testing of the approach. Here is an example. ``` $ target/testnet/subsystem-bench --n-cores 10 --cache-misses data-availability-read $ cat cachegrind_report.txt I refs: 64,622,081,485 I1 misses: 3,018,168 LLi misses: 437,654 I1 miss rate: 0.00% LLi miss rate: 0.00% D refs: 12,161,833,115 (9,868,356,364 rd + 2,293,476,751 wr) D1 misses: 167,940,701 ( 71,060,073 rd + 96,880,628 wr) LLd misses: 33,550,018 ( 16,685,853 rd + 16,864,165 wr) D1 miss rate: 1.4% ( 0.7% + 4.2% ) LLd miss rate: 0.3% ( 0.2% + 0.7% ) LL refs: 170,958,869 ( 74,078,241 rd + 96,880,628 wr) LL misses: 33,987,672 ( 17,123,507 rd + 16,864,165 wr) LL miss rate: 0.0% ( 0.0% + 0.7% ) ``` The CLI output shows that 1.4% of the L1 data cache missed, which is not so bad, given that the last-level cache had that data most of the time missing only 0.3%. Instruction data of the L1 has 0.00% misses of the time. Looking at an output file with `cg_annotate` shows that most of the misses occur during reed-solomon, which is expected.	2024-01-16 17:14:29 +00:00
Alin Dima	f2a750ee86	add fallback request for req-response protocols (#2771 ) Previously, it was only possible to retry the same request on a different protocol name that had the exact same binary payloads. Introduce a way of trying a different request on a different protocol if the first one fails with Unsupported protocol. This helps with adding new req-response versions in polkadot while preserving compatibility with unupgraded nodes. The way req-response protocols were bumped previously was that they were bundled with some other notifications protocol upgrade, like for async backing (but that is more complicated, especially if the feature does not require any changes to a notifications protocol). Will be needed for implementing https://github.com/polkadot-fellows/RFCs/pull/47 TODO: - [x] add tests - [x] add guidance docs in polkadot about req-response protocol versioning	2024-01-10 15:19:50 +02:00
Tsvetomir Dimitrov	cea7024de1	Fix clippy warnings (#2861 ) Fix some issues reported by clippy	2024-01-05 20:31:56 +01:00
Andrei Eres	526c81b138	subsystem benchmarks: add cpu profiling (#2734 ) Ready-to-merge version of https://github.com/paritytech/polkadot-sdk/pull/2601 - Added optional CPU profiling - Updated instructions how to set up Prometheus, Pyroscope and Graphana - Added a flamegraph dashboard <img width="1470" alt="image" src="https://github.com/paritytech/polkadot-sdk/assets/27277055/c8f3b33d-3c01-4ec0-ac34-72d52325b6e6"> --------- Co-authored-by: ordian <write@reusable.software>	2023-12-19 12:14:22 +01:00
Andrei Sandu	8a6e9ef189	Introduce subsystem benchmarking tool (#2528 ) This tool makes it easy to run parachain consensus stress/performance testing on your development machine or in CI. ## Motivation The parachain consensus node implementation spans across many modules which we call subsystems. Each subsystem is responsible for a small part of logic of the parachain consensus pipeline, but in general the most load and performance issues are localized in just a few core subsystems like `availability-recovery`, `approval-voting` or `dispute-coordinator`. In the absence of such a tool, we would run large test nets to load/stress test these parts of the system. Setting up and making sense of the amount of data produced by such a large test is very expensive, hard to orchestrate and is a huge development time sink. ## PR contents - CLI tool - Data Availability Read test - reusable mockups and components needed so far - Documentation on how to get started ### Data Availability Read test An overseer is built with using a real `availability-recovery` susbsytem instance while dependent subsystems like `av-store`, `network-bridge` and `runtime-api` are mocked. The network bridge will emulate all the network peers and their answering to requests. The test is going to be run for a number of blocks. For each block it will generate send a “RecoverAvailableData” request for an arbitrary number of candidates. We wait for the subsystem to respond to all requests before moving to the next block. At the same time we collect the usual subsystem metrics and task CPU metrics and show some nice progress reports while running. ### Here is how the CLI looks like: ``` [2023-11-28T13:06:27Z INFO subsystem_bench::core::display] n_validators = 1000, n_cores = 20, pov_size = 5120 - 5120, error = 3, latency = Some(PeerLatency { min_latency: 1ms, max_latency: 100ms }) [2023-11-28T13:06:27Z INFO subsystem-bench::availability] Generating template candidate index=0 pov_size=5242880 [2023-11-28T13:06:27Z INFO subsystem-bench::availability] Created test environment. [2023-11-28T13:06:27Z INFO subsystem-bench::availability] Pre-generating 60 candidates. [2023-11-28T13:06:30Z INFO subsystem-bench::core] Initializing network emulation for 1000 peers. [2023-11-28T13:06:30Z INFO subsystem-bench::availability] Current block 1/3 [2023-11-28T13:06:30Z INFO substrate_prometheus_endpoint] 〽️ Prometheus exporter started at 127.0.0.1:9999 [2023-11-28T13:06:30Z INFO subsystem_bench::availability] 20 recoveries pending [2023-11-28T13:06:37Z INFO subsystem_bench::availability] Block time 6262ms [2023-11-28T13:06:37Z INFO subsystem-bench::availability] Sleeping till end of block (0ms) [2023-11-28T13:06:37Z INFO subsystem-bench::availability] Current block 2/3 [2023-11-28T13:06:37Z INFO subsystem_bench::availability] 20 recoveries pending [2023-11-28T13:06:43Z INFO subsystem_bench::availability] Block time 6369ms [2023-11-28T13:06:43Z INFO subsystem-bench::availability] Sleeping till end of block (0ms) [2023-11-28T13:06:43Z INFO subsystem-bench::availability] Current block 3/3 [2023-11-28T13:06:43Z INFO subsystem_bench::availability] 20 recoveries pending [2023-11-28T13:06:49Z INFO subsystem_bench::availability] Block time 6194ms [2023-11-28T13:06:49Z INFO subsystem-bench::availability] Sleeping till end of block (0ms) [2023-11-28T13:06:49Z INFO subsystem_bench::availability] All blocks processed in 18829ms [2023-11-28T13:06:49Z INFO subsystem_bench::availability] Throughput: 102400 KiB/block [2023-11-28T13:06:49Z INFO subsystem_bench::availability] Block time: 6276 ms [2023-11-28T13:06:49Z INFO subsystem_bench::availability] Total received from network: 415 MiB Total sent to network: 724 KiB Total subsystem CPU usage 24.00s CPU usage per block 8.00s Total test environment CPU usage 0.15s CPU usage per block 0.05s ``` ### Prometheus/Grafana stack in action <img width="1246" alt="Screenshot 2023-11-28 at 15 11 10" src="https://github.com/paritytech/polkadot-sdk/assets/54316454/eaa47422-4a5e-4a3a-aaef-14ca644c1574"> <img width="1246" alt="Screenshot 2023-11-28 at 15 12 01" src="https://github.com/paritytech/polkadot-sdk/assets/54316454/237329d6-1710-4c27-8f67-5fb11d7f66ea"> <img width="1246" alt="Screenshot 2023-11-28 at 15 12 38" src="https://github.com/paritytech/polkadot-sdk/assets/54316454/a07119e8-c9f1-4810-a1b3-f1b7b01cf357"> --------- Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>	2023-12-14 12:57:17 +02:00

20 Commits