Commit Graph

450 Commits

Author SHA1 Message Date
Andronik Ordian 9ac35d9f2b gossip: choose a random subset on send instead of limiting connections (#2776)
* gossip: choose random subset on send

* naming bikeshed
2021-03-30 20:59:53 +02:00
Andronik Ordian a3115401c3 network-bridge: elevate log level for connections (#2772) 2021-03-30 20:01:57 +02:00
Robert Habermeier 08d5b268a0 Retry availability until the receiver of the request is dropped (#2763)
* guide updates

* keep interactions alive until receivers drop

* retry indefinitely

* cancel approval tasks on finality

* use swap_remove instead of remove
2021-03-30 17:33:38 +02:00
Robert Klotzner 6514e00144 Add tags to pov-fetcher. (#2768)
* Add tags to pov-fetcher.

* Add stage as well.

* Get rid of redundant tags.
2021-03-30 15:07:07 +02:00
Andronik Ordian bdee5a3923 approval-distribution: add an assertion (#2761) 2021-03-30 14:18:34 +02:00
Bastian Köcher c047a2bde2 Companion for Substrate#8472 (#2759)
* Companion for Substrate#8472

https://github.com/paritytech/substrate/pull/8472

* update Substrate

Co-authored-by: parity-processbot <>
2021-03-30 12:15:50 +02:00
Robert Klotzner 0bc42785b4 availability-distribution: Retry failed fetches on next block. (#2762)
* availability-distribution: Retry on fail on next block.

Retry failed fetches on next block when still pending availability.

* Update node/network/availability-distribution/src/requester/fetch_task/mod.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Fix existing tests.

* Add test for trying all validators.

* Add test for testing retries.

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-30 00:28:43 +02:00
Robert Habermeier e906598e94 tracing for pending_known map (#2755)
* tracing for pending_known map

* fix condition in retain

* add block hash to pending tracing
2021-03-29 21:38:03 +02:00
Robert Habermeier 235c2e26b1 more logs in issue-approval (#2758) 2021-03-29 21:06:44 +02:00
Robert Habermeier 54074d2d76 send assignments even when we have an approval (#2757) 2021-03-29 16:34:14 +00:00
Robert Habermeier 7c21dbbdf4 use unbounded when notifying distribution of new blocks (#2752) 2021-03-29 14:30:20 +00:00
Robert Habermeier fc154d2ada Reduce signal channel sizes and more logging on approval-voting (#2751)
* reduce signal channel capacity

* more tracing for approval-voting
2021-03-29 15:46:12 +02:00
Robert Klotzner 0a9fe852df Move non runtime related stuff into node/primitives (#2743)
* Remove stuff out of the runtime that does not belong there.

There might be more, but it is a start.

* White space fixes.

* Fix tests.

* Leave whitespace in ui tests alone.

* Add back zstd for no reason.

* Fix browser wasm (hopefully)
2021-03-29 02:15:44 +02:00
Robert Habermeier 8ebbe19d10 Split NetworkBridge and break cycles with Unbounded (#2736)
* overseer: pass messages directly between subsystems

* test that message is held on to

* Update node/overseer/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* give every subsystem an unbounded sender too

* remove metered_channel::name

1. we don't provide good names
2. these names are never used anywhere

* unused mut

* remove unnecessary &mut

* subsystem unbounded_send

* remove unused MaybeTimer

We have channel size metrics that serve the same purpose better now and the implementation of message timing was pretty ugly.

* remove comment

* split up senders and receivers

* update metrics

* fix tests

* fix test subsystem context

* use SubsystemSender in jobs system now

* refactor of awful jobs code

* expose public `run` on JobSubsystem

* update candidate backing to new jobs & use unbounded

* bitfield signing

* candidate-selection

* provisioner

* approval voting: send unbounded for assignment/approvals

* async not needed

* begin bridge split

* split up network tasks into background worker

* port over network bridge

* Update node/network/bridge/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* rename ValidationWorkerNotifications

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-29 01:18:53 +02:00
Andronik Ordian 6f464a360f approval-distribution: limit the amount of assignments on unify (#2737)
* approval-distribution: limit the amount of packets on unify

* guide: fix a typo

* compilation fix

* grammar

* Update roadmap/implementers-guide/src/node/approval/approval-distribution.md

Co-authored-by: David <dvdplm@gmail.com>

* more grammar

* propagate only local assignments/approvals after a certain depth

* increase the threshold

* guides update

Co-authored-by: David <dvdplm@gmail.com>
2021-03-28 23:30:06 +02:00
Andronik Ordian 171fc69961 approval-voting: more spans and metrics (#2742)
* approval-voting: more spans and metrics

* s/db/approval db
2021-03-28 23:28:49 +02:00
Pierre Krieger e3dc9024ce Call NetworkService::add_known_address before sending a request (#2726)
* Call NetworkService::add_known_address before sending a request

* Better doc

* Update Substrate

* Update Substrate

* Restore the import 🤷‍♀️ I don't know why it compiles locally

* imports correctly

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
2021-03-28 16:01:49 +00:00
Robert Habermeier 5952e790fa Overseer: subsystems communicate directly (#2227)
* overseer: pass messages directly between subsystems

* test that message is held on to

* Update node/overseer/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* give every subsystem an unbounded sender too

* remove metered_channel::name

1. we don't provide good names
2. these names are never used anywhere

* unused mut

* remove unnecessary &mut

* subsystem unbounded_send

* remove unused MaybeTimer

We have channel size metrics that serve the same purpose better now and the implementation of message timing was pretty ugly.

* remove comment

* split up senders and receivers

* update metrics

* fix tests

* fix test subsystem context

* fix flaky test

* fix docs

* doc

* use select_biased to favor signals

* Update node/subsystem/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-28 15:55:10 +00:00
Robert Klotzner c6f07d8f31 Request based PoV distribution (#2640)
* Indentation fix.

* Prepare request-response for PoV fetching.

* Drop old PoV distribution.

* WIP: Fetch PoV directly from backing.

* Backing compiles.

* Runtime access and connection management for PoV distribution.

* Get rid of seemingly dead code.

* Implement PoV fetching.

Backing does not yet use it.

* Don't send `ConnectToValidators` for empty list.

* Even better - no need to check over and over again.

* PoV fetching implemented.

+ Typechecks
+ Should work

Missing:

- Guide
- Tests
- Do fallback fetching in case fetching from seconding validator fails.

* Check PoV hash upon reception.

* Implement retry of PoV fetching in backing.

* Avoid pointless validation spawning.

* Add jaeger span to pov requesting.

* Add back tracing.

* Review remarks.

* Whitespace.

* Whitespace again.

* Cleanup + fix tests.

* Log to log target in overseer.

* Fix more tests.

* Don't fail if group cannot be found.

* Simple test for PoV fetcher.

* Handle missing group membership better.

* Add test for retry functionality.

* Fix flaky test.

* Spaces again.

* Guide updates.

* Spaces.
2021-03-28 17:11:38 +02:00
Robert Habermeier ef816b089d Approval voting failsafe (#2675)
* add consensus log type

* origin and issue force_approve

* add origin in runtimes

* ref API

* scrape force_approve digest from header

* add parent_hash to BlockEntry

* add block_number to block entry and force_approve skeleton

* implement and plug in force-approve

* test force_approve

* test force_approve extraction

* westend runtime

* Update node/core/approval-voting/src/approval_db/v1/mod.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* rename

* Update runtime/parachains/src/initializer.rs

Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>
2021-03-28 00:57:04 +00:00
Andronik Ordian dce20644c8 approval-distribution: moar metrics (#2734) 2021-03-28 00:23:32 +01:00
Andronik Ordian 71f1985172 approval-distribution: moar logs (#2732) 2021-03-27 23:21:25 +01:00
Robert Habermeier c503fbc2a0 duplicate logging fix (#2729)
* duplicate logging fix

* remove duplicate peer IDs
2021-03-27 16:17:35 +01:00
Robert Habermeier 15a956321a use a gauge for approval lag (#2725) 2021-03-27 16:13:34 +01:00
Robert Klotzner 6ea6299bca Reduce network bridge logging verbosity (#2717)
* Those should really be trace.

- Very spammy
- And they in fact trace the execution
- Should not be enabled lightly - will slow network bridge down.

* Make report peers debug again.
2021-03-27 00:19:43 +01:00
Bernhard Schuster 5b363358b8 chore: avoid glob imports (#2722) 2021-03-26 15:45:37 +01:00
Robert Habermeier 73b9247c10 Separate metrics for messages sent & received (#2721)
* metered channel - sent & received

* Add for readouts

* metrics for both sent & received

* retract on send failure
2021-03-26 14:11:01 +01:00
Robert Habermeier 064df81ee4 Add block number to activated leaves and associated fixes (#2718)
* add number to `ActivatedLeavesUpdate`

* update subsystem util and overseer

* use new ActivatedLeaf everywhere

* sort view

* sorted and limited view in network bridge

* use live block hash only if it's newer

* grumples
2021-03-26 13:06:40 +01:00
Robert Habermeier 2388141b25 overseer: AllSubsystems magic and subsystem channel sizes metrics (#2711)
* overseer: AllSubsystems magic and report subsystem channel sizes to prometheus

* fix tests
2021-03-25 21:21:55 +01:00
Robert Habermeier 8a396c678f Port availability recovery to use req/res (#2694)
* add AvailableDataFetchingRequest

* rename AvailabilityFetchingRequest to ChunkFetchingRequest

* rename AvailabilityFetchingResponse to Chunk_

* add AvailableDataFetching request

* add available data fetching request to availability recovery message

* remove availability recovery message

* fix

* update network bridge

* port availability recovery to request/response

* use validators.len(), not shuffling

* fix availability recovery tests

* update guide

* Update node/network/availability-recovery/src/lib.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Update node/network/availability-recovery/src/lib.rs

Co-authored-by: Arkadiy Paronyan <arkady.paronyan@gmail.com>

* remove println

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Arkadiy Paronyan <arkady.paronyan@gmail.com>
2021-03-25 15:34:24 +01:00
Robert Habermeier 349879df6b f+1 validators always approve (#2699)
* f+1 always approves

* guide

* grumbles

* grumbles

* fix test

* fix tests

* Update roadmap/implementers-guide/src/node/approval/approval-voting.md

Co-authored-by: Sergei Shulepov <sergei@parity.io>

Co-authored-by: Sergei Shulepov <sergei@parity.io>
2021-03-25 13:47:00 +01:00
Martin Pugh 9938513e71 Bump version , weights and substrate in prep for v0.8.30 (#2690)
* bump version and substrate

* bump old forgotten versions

* update weights

* bump substrate

* Revert "bump substrate"

This reverts commit 8b5004b6fe9ce9ccdf143d3fe878802931ea4f2f.

Co-authored-by: André Silva <andrerfosilva@gmail.com>
2021-03-25 11:29:11 +01:00
André Silva bfbb078525 collator-protocol: add message authentication (#2635)
* collator: authenticate collator protocol messages

* fix tests compilation

* node: verify collator protocol signatures in tests

* collator: fix tests

* implementers-guide: update CollatorProtocol messages

* collator: add test for verification of collator protocol signatures

* node: remove fixmes

* node: remove signature from advertisecollation message

* node: add magic constant to Declare message signature payload
2021-03-24 22:13:32 +01:00
Arkadiy Paronyan de85c05102 Tweaked logging (#2695)
* Tweaked logging

* Debug for Statement
2021-03-24 18:06:44 +00:00
Robert Habermeier e49b3e5ca9 Improve approval tracing (#2697)
* improve tracing for approval voting

* assignment criteria tracing

* new syntax
2021-03-24 17:46:01 +00:00
Arkadiy Paronyan d78e2fbf86 Additional logging (#2693) 2021-03-24 16:24:54 +00:00
Robert Klotzner fa11c6d785 Unify maximum supported PoV size a bit. (#2691)
* Unify maximum supported PoV size a bit.

* Use MAX_POV_SIZE also in `HostConfiguration`.

* Fix types.
2021-03-24 15:48:36 +01:00
Shawn Tabrizi a000a3351b Remove Parachains Stuff from Westend (#2689)
* Remove Parachains Stuff from Westend

* remove unused weights

* clean up genesis
2021-03-24 12:58:30 +00:00
Robert Habermeier b8867d71bc Evict inactive peers from the collator protocol peer-set (#2680)
* malicious reputation cost is fatal

* make ReportBad a malicious cost

* futures control-flow for cleaning up inactive collator peers

* guide: network bridge updates

* add `PeerDisconnected` message

* guide: update

* reverse order

* remember to match

* implement disconnect peer in network bridge

* implement disconnect_inactive_peers

* test

* remove println

* don't hardcore policy

* add fuse outside of loop

* use default eviction policy
2021-03-24 13:32:28 +01:00
Robert Klotzner 0f8b6f2f6e Bigger is better. (#2687)
* Bigger is better.

Made all request response sizes 10 times bigger.

* The smaller the better.

* Update comment.

* Ah, bigger is still better.

Max PoV size for rococo is around 50Meg, compression ratio is about 3.4.
With 30 Meg we should be fine, even with crypto kitties in the PoV.
2021-03-24 13:29:59 +01:00
Bernhard Schuster b1b178d186 fix/jaeger: avoid array debug repr (#2685) 2021-03-24 13:21:42 +01:00
Arkadiy Paronyan 5929d1ef15 Additional logging for polkadot network protocols (#2684)
* Additional logging for polkadot network protocols

* Additional log

* Update node/network/bitfield-distribution/src/lib.rs

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>

* Update node/network/availability-distribution/src/responder.rs

* Added additional chunk info

* Added additional peer info

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
2021-03-24 11:55:50 +00:00
Bastian Köcher edb36153b1 Improve logging (#2669)
* Improve logging

* Review feedback

* Fix some warning and some further logging changes
2021-03-23 11:57:59 +01:00
Robert Habermeier a80b2bbf13 more approval voting instrumentation (#2663)
* more approval voting instrumentation

* fix `unapproved_candidates`

* Update node/core/approval-voting/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2021-03-23 11:10:54 +01:00
Pierre Krieger bbc3ad3cfc Add debug messaes to the bridge actions (#2668) 2021-03-23 10:52:30 +01:00
Bernhard Schuster ea6294fa79 restructure polkadot-node-jaeger (#2642) 2021-03-19 16:51:16 +01:00
Robert Klotzner 59640a38bc Don't accept incoming connections for collators (#2644)
* Don't accept incoming connections for collators

on the `Collation` peer set.

* Better docs.
2021-03-19 07:20:38 +00:00
Bastian Köcher 15ae5dd410 Improve the logging (#2645) 2021-03-18 23:28:43 +00:00
Arkadiy Paronyan 6437740acf Update for the new substrate client API (#2570)
* Update for the new substrate client API

* Code review suggestions

* Update substrate
2021-03-18 13:42:53 +01:00
Robert Klotzner 503e2b74f9 Request based collation fetching (#2621)
* Introduce collation fetching protocol

also move to mod.rs

* Allow `PeerId`s in requests to network bridge.

* Fix availability distribution tests.

* Move CompressedPoV to primitives.

* Request based collator protocol: validator side

- Missing: tests
- Collator side
- don't connect, if not connected

* Fixes.

* Basic request based collator side.

* Minor fix on collator side.

* Don't connect in requests in collation protocol.

Also some cleanup.

* Fix PoV distribution

* Bump substrate

* Add back metrics + whitespace fixes.

* Add back missing spans.

* More cleanup.

* Guide update.

* Fix tests

* Handle results in tests.

* Fix weird compilation issue.

* Add missing )

* Get rid of dead code.

* Get rid of redundant import.

* Fix runtime build.

* Cleanup.

* Fix wasm build.

* Format fixes.

Thanks @andronik !
2021-03-18 09:06:36 +01:00