Commit Graph

219 Commits

Author SHA1 Message Date
Robert Klotzner 24d1eb40cc Increase PoV timeout slightly. (#3144) 2021-05-31 22:12:18 -05:00
Robert Habermeier 963993d288 Reversion Safety tools for overseer and subsystems (#3104)
* guide: reversion safety

* guide: manage reversion safety in subsystems

* add leaf status to ActivatedLeaf

* add an LRU-cache to overseer for staleness detection

* update ActivatedLeaf usages in tests to contain status field

* add variant where missed accidentally

* add some helpers to LeafStatus

* address grumbles
2021-05-31 20:54:05 +02:00
dependabot[bot] 5316cbbc66 Bump tracing from 0.1.25 to 0.1.26 (#3120)
Bumps [tracing](https://github.com/tokio-rs/tracing) from 0.1.25 to 0.1.26.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.25...tracing-0.1.26)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-05-27 09:51:35 +02:00
Andronik Ordian 701be9aa03 validator_discovery: small tweak in retrying logic (#3102)
* validator_discovery: small tweak in retrying logic

* validator_discovery: use timeouts instead
2021-05-25 20:59:14 +02:00
Andronik Ordian 44d02faa62 network-bridge: downgrade log level of benefit rep change (#3068)
* network-bridge: downgrade log level of benefit rep change

* remove it as we log it at higher level
2021-05-21 19:16:20 -05:00
Bernhard Schuster e8652e73db cargo spellcheck (#3067) 2021-05-22 00:15:47 +00:00
Robert Klotzner 9b06a38bb6 State can be finished due to Share message. (#3070)
* State can be finished due to `Share` message.

Therefoe a task can still be running in that state. Removed panic and
changed state name to reflect possibility of `Share` message.

* bump spec versions in kusama, polkadot and westend again III

* properly bump for the upcoming release

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-05-21 20:04:43 +02:00
Pierre Krieger 17907c7e6c Add parachain_desired_peer_count metric (#3035) 2021-05-21 16:47:04 +02:00
Andronik Ordian 2e70f4ea08 validator-discovery: basic retrying logic (#3059)
* validator_discovery: less flexible, but simpler design

* fix test

* remove unused struct

* smol optimization

* validator_discovery: basic retrying logic

* add a test

* add more tests

* update the guide

* more test logic

* Require at least 2/3 connectivity.

* Fix test.

* Update node/network/gossip-support/src/lib.rs

Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>

* Update node/network/gossip-support/src/lib.rs

Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>

Co-authored-by: Robert Klotzner <robert.klotzner@gmx.at>
Co-authored-by: Robert Klotzner <eskimor@users.noreply.github.com>
Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>
2021-05-20 10:05:44 +00:00
Andronik Ordian 98c06f5b57 validator_discovery: less flexible, but simpler design (#3052)
* validator_discovery: less flexible, but simpler design

* fix test

* remove unused struct

* smol optimization
2021-05-19 18:54:13 +02:00
Robert Klotzner 44c03a3633 Actually connect to new validators at session boundary. (#3055)
* Actually connect to new validators at session boundary.

* Add tracing.
2021-05-19 11:29:55 +00:00
Pierre Krieger 78b87c47a8 Grab stream of networking events earlier (#3025) 2021-05-14 17:51:44 +02:00
Andronik Ordian 60fbca3c2a validator_discovery: simplification (#3009)
* validator_discovery: simplification

* compilation fixes

* compilation fixes II

* compilation fixes III

* compilation fixes IV
2021-05-13 11:31:15 +02:00
Robert Klotzner 1e5f193765 Fix flaky test (#3002)
* Reporting peer might come first

before trying to request data, depending on scheduler.

* Better test.
2021-05-10 15:16:08 +00:00
Robert Klotzner c0df140ec6 Flood protection for large statements. (#2984)
* Flood protection for large statements.

* Add test for flood protection.

* Doc improvements.
2021-05-06 20:41:05 +02:00
Pierre Krieger 64c8b913c3 Companion PR for #8682 (#2958)
* Companion PR for #8682

* Compilation fix

* Update beefy

* update Substrate

Co-authored-by: parity-processbot <>
2021-05-06 16:41:28 +02:00
Robert Klotzner 1508024a47 Some overdue cleanup (#2989)
* Cleanup obsolete code.

* Move session cache to requester.
2021-05-06 16:15:23 +02:00
Robert Klotzner 601a3781c1 Always connect in collator protocol. (#2980)
* Always connect in collator protocol.

* Fix unused import.

* We always issue connection requests now.

* Fix stupid boolean logic with one variable.

* Fix CI.
2021-05-06 12:54:54 +02:00
Robert Klotzner 795a526e6d Do peer connect later (as it happens in reality). (#2971)
Otherwise peer connect events occassionally happen before
`StatementFetchingReceiver` message.
2021-05-03 21:50:32 +02:00
Robert Klotzner 0dbdfef95e More secure Signed implementation (#2963)
* Remove signature verification in backing.

`SignedFullStatement` now signals that the signature has already been
checked.

* Remove unused check_payload function.

* Introduced unchecked signed variants.

* Fix inclusion to use unchecked variant.

* More unchecked variants.

* Use unchecked variants in protocols.

* Start fixing statement-distribution.

* Fixup statement distribution.

* Fix inclusion.

* Fix warning.

* Fix backing properly.

* Fix bitfield distribution.

* Make crypto store optional for `RuntimeInfo`.

* Factor out utility functions.

* get_group_rotation_info

* WIP: Collator cleanup + check signatures.

* Convenience signature checking functions.

* Check signature on collator-side.

* Fix warnings.

* Fix collator side tests.

* Get rid of warnings.

* Better Signed/UncheckedSigned implementation.

Also get rid of Encode/Decode for Signed! *party*

* Get rid of dead code.

* Move Signed in its own module.

* into_checked -> try_into_checked

* Fix merge.
2021-05-03 21:41:14 +02:00
Robert Klotzner c86a774b9d Send statements to own backing group first (#2927)
* Factor out runtime module into utils.

* First fatal error design.

* Better error handling infra.

* Error handling cleanup.

* Send to peers of our group first.

* Finish backing group prioritization.

* Little cleanup.

* More cleanup.

* Forgot to checkin error.rs.

* Notes.

* Runtime -> RuntimeInfo

* qed in debug assert.

* PolkaErr -> Fault.
2021-04-27 21:47:32 +02:00
François Garillot d4ddf8d7e8 Simplify some Option / Result / ? operator patterns (#2920)
* Simplify some Option / Result / ? operator patterns

When they identically match a combinator on those types.

Tool-aided by [comby-rust](https://github.com/huitseeker/comby-rust).

* adjust review comments

Co-authored-by: Shawn Tabrizi <shawntabrizi@gmail.com>
2021-04-25 20:51:01 +00:00
Robert Klotzner dacde443f7 Infrastructure improvements (#2897)
* Factor out runtime module into utils.

* Add maybe_authority information to `PeerConnected` event.

We already gather this information in authority discovery, so we might
as well share it with others.

This opens up an easy path to trigger validators differently from normal
nodes, e.g. for prioritization. This change has become more important
now, that we just connect to all validators and therefore just have a
long peer list without any information about those nodes.

* Test fix.
2021-04-16 21:42:20 +02:00
Robert Klotzner d31c8a0dac Only accept requests from peers that got notified by us. (#2889) 2021-04-14 23:32:37 +02:00
Robert Klotzner d1d33abdf8 More tests for new request based statement distribution (#2875)
* More test coverage.

* Preserve peer order.

* Better test coverage.

* Even more test coverage.

* Add doc comment to `IndexMap`.

* Fix flaky test.

* Review remarks.

* Review remarks.
2021-04-12 16:15:25 +02:00
Robert Klotzner dd3733261b Max notification size -> 100k. (#2735) 2021-04-12 10:26:11 +02:00
Andronik Ordian e62939ad7a distribution: handle sqrt peer view updates (#2871)
* distribution: handle sqrt peer view updates

* someone please put rustc into my brain

* guide updates
2021-04-10 00:45:25 +02:00
Robert Klotzner 305375e1e4 Req/res optimization for statement distribution (#2803)
* Wip

* Increase proposer timeout.

* WIP.

* Better timeout values now that we are going to be connected to all nodes. (#2778)

* Better timeout values.

* Fix typo.

* Fix validator bandwidth.

* Fix compilation.

* Better and more consistent sizes.

Most importantly code size is now 5 Meg, which is the limit we currently
want to support in statement distribution.

* Introduce statement fetching request.

* WIP

* Statement cache retrieval logic.

* Review remarks by @rphmeier

* Fixes.

* Better requester logic.

* WIP: Handle requester messages.

* Missing dep.

* Fix request launching logic.

* Finish fetching logic.

* Sending logic.

* Redo code size calculations.

Now that max code size is compressed size.

* Update Cargo.lock (new dep)

* Get request receiver to statement distribution.

* Expose new functionality for responding to requests.

* Cleanup.

* Responder logic.

* Fixes + Cleanup.

* Cargo.lock

* Whitespace.

* Add lost copyright.

* Launch responder task.

* Typo.

* info -> warn

* Typo.

* Fix.

* Fix.

* Update comment.

* Doc fix.

* Better large statement heuristics.

* Fix tests.

* Fix network bridge tests.

* Add test for size estimate.

* Very simple tests that checks we get LargeStatement.

* Basic check, that fetching of large candidates is performed.

* More tests.

* Basic metrics for responder.

* More metrics.

* Use Encode::encoded_size().

* Some useful spans.

* Get rid of redundant metrics.

* Don't add peer on duplicate.

* Properly check hash

instead of relying on signatures alone.

* Preserve ordering + better flood protection.

* Get rid of redundant clone.

* Don't shutdown responder on failed query.

And add test for this.

* Smaller fixes.

* Quotes.

* Better queue size calculation.

* A bit saner response sizes.

* Fixes.
2021-04-09 21:30:12 +00:00
Andronik Ordian 8961628e7c bitfield-dist: check signatures once (#2868) 2021-04-09 17:31:34 +00:00
Andronik Ordian 78a46da384 silence some parachain warnings (#2859)
* silence some parachain warnings

* that was actually a bug that Rob is fixing
2021-04-09 00:25:46 +02:00
Robert Habermeier 896ec8dbc3 Code, PoV compression and remove CompressedPoV struct (#2852)
* use compressed blob in candidate-validation

* add some tests for compressed code blobs

* remove CompressedPoV and apply compression in collation-generation

* decompress BlockData before executing

* don't produce oversized collations

* add test for PoV decompression failure

* fix tests and clean up

* fix test

* address review and fix CI

* take this )
2021-04-08 22:09:36 +02:00
Andronik Ordian 936f0d7c59 statement-distribution: do not verify signatures for duplicate statements (#2823)
* stmnt-dist: do not verify signature on duplicates

* renames

* more renames
2021-04-06 17:32:57 +02:00
Pierre Krieger fa0142ac8f Properly remove peers from sets and merge the two Network traits (#2821)
* Properly remove peers from sets

* Actually rename all, I guess

* Merge the two Network traits

* Rename function

* Update node/network/bridge/src/network.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Fix erroneous change

* Update node/network/bridge/src/network.rs

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-04-05 21:46:39 +02:00
Andronik Ordian 2ff5c9b995 tests: use future::join instead of future::select (#2813)
* tests/av-store: use future::join instead of future::select

* tests/backing: use future::join instead of future::select

* tests/provisioner: use future::join instead of future::select

* tests/av-dist: use future::join instead of future::select

* tests/av-recovery: use future::join instead of future::select

* tests/bridge: use future::join instead of future::select

* tests/collator-protocol: use future::join instead of future::select

* tests/stmt-dist: use future::join instead of future::select

* fix tests
2021-04-05 18:30:27 +02:00
Robert Habermeier ec5ad35e14 Network bridge metrics (#2818)
* add metrics (unused) to network bridge

* fix test compilation

* trigger metrics messages

* add some more metrics

* track sent and received notifications

* restore metrics import

* integrate into service

* Update node/network/bridge/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update node/network/bridge/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-04-05 01:07:05 +02:00
Andronik Ordian 4df29e71ab bitfield-dist: fix state update on gossip (#2817)
* bitfield-dist: fix state update on gossip

* fixes

* doc fixes

* oops

* 2 lines of code change
2021-04-04 22:25:40 +00:00
Robert Habermeier bfc8f4fcf3 Collators: Declare to all peers (#2816)
* fix tests

* add test for rejecting declares on collators

* fix bad test
2021-04-04 16:59:00 +00:00
Robert Habermeier 11b8e4c821 Collation protocol: stricter validators (#2810)
* guide: declare one para as a collator

* add ParaId to Declare messages and clean up

* fix build

* fix the testerinos

* begin adding keystore to collator-protocol

* remove request_x_ctx

* add core_for_group

* add bump_rotation

* add some more helpers to subsystem-util

* change signing_key API to take ref

* determine current and next para assignments

* disconnect collators who are not on current or next para

* add collator peer count metric

* notes for later

* some fixes

* add data & keystore to test state

* add a test utility for answering runtime API requests

* fix existing collator tests

* add new tests

* remove sc_keystore

* update cargo lock

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-04-03 21:48:58 +02:00
Andronik Ordian 94b0ccc8f1 approval-distribution: split peer knowledge into sent and received (#2809)
* approval-distribution: split peer knowledge into sent and received

* guide updates

* fixes

* revert doc changes
2021-04-03 04:29:15 +02:00
Andronik Ordian 98082c5326 gossip: move authorities request to runtime api subsystem (#2798) 2021-04-01 23:51:01 +02:00
Robert Habermeier 57b56770e0 Approval Voting improvements (#2781)
* extract database from av-store itself

* generalize approval-voting over database type

* modes (without handling) and pruning old wakeups

* rework approval importing

* add our_approval_sig to ApprovalEntry

* import assignment

* guide updates for check-full-approval changes

* some aux functions

* send messages when becoming active.

* guide: network bridge sends view updates only when done syncing

* network bridge: send view updates only when done syncing

* tests for new network-bridge behavior

* add a test for updating approval entry with sig

* fix some warnings

* test load-all-blocks

* instantiate new parachains DB

* fix network-bridge empty view updates

* tweak

* fix wasm build, i think

* Update node/core/approval-voting/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* add some versioning to parachains_db

* warnings

* fix merge changes

* remove versioning again

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-04-01 17:33:52 +00:00
Pierre Krieger 01badafba6 Companion PR for substrate#8510 (#2795)
* Companion PR for substrate#8510

* update Substrate

Co-authored-by: parity-processbot <>
2021-04-01 19:33:43 +02:00
Andronik Ordian 7a2e1ef6c1 gossip: do not try to connect if we are not validators (#2786)
* gossip: do not issue a connection request if we are not a validator

* guide updates

* use all relevant authorities when issuing a request

* use AuthorityDiscoveryApi instead

* update comments to the status quo
2021-04-01 18:11:43 +02:00
Robert Habermeier 5da762e728 Avoid querying the local validator in availability recovery (#2792)
* guide: don't request availability data from ourselves

* add QueryAllChunks message

* implement QueryAllChunks

* remove unused relay_parent from StoreChunk

* test QueryAllChunks

* fast paths make short roads

* test early exit behavior
2021-04-01 15:57:41 +02:00
Andronik Ordian caebf642dd statement-distribution: do not use OurViewChange (#2790)
* quickfix for statement-distribution

* some logs
2021-03-31 23:28:17 +02:00
Robert Klotzner eb6786ad05 Better timeout values now that we are going to be connected to all nodes. (#2778)
* Better timeout values.

* Fix typo.

* Fix validator bandwidth.

* Fix compilation.
2021-03-31 22:34:12 +02:00
Robert Habermeier e65cad69ec Fix future-polling loop in availability and add a better early-exit (#2779)
* onto the front

* fix early exit for waiting for requests

* add logging back
2021-03-31 17:35:17 +02:00
Andronik Ordian 9ac35d9f2b gossip: choose a random subset on send instead of limiting connections (#2776)
* gossip: choose random subset on send

* naming bikeshed
2021-03-30 20:59:53 +02:00
Andronik Ordian a3115401c3 network-bridge: elevate log level for connections (#2772) 2021-03-30 20:01:57 +02:00
Robert Habermeier 08d5b268a0 Retry availability until the receiver of the request is dropped (#2763)
* guide updates

* keep interactions alive until receivers drop

* retry indefinitely

* cancel approval tasks on finality

* use swap_remove instead of remove
2021-03-30 17:33:38 +02:00