Commit Graph

786 Commits

Author SHA1 Message Date
Andronik Ordian 98082c5326 gossip: move authorities request to runtime api subsystem (#2798) 2021-04-01 23:51:01 +02:00
Robert Habermeier 57b56770e0 Approval Voting improvements (#2781)
* extract database from av-store itself

* generalize approval-voting over database type

* modes (without handling) and pruning old wakeups

* rework approval importing

* add our_approval_sig to ApprovalEntry

* import assignment

* guide updates for check-full-approval changes

* some aux functions

* send messages when becoming active.

* guide: network bridge sends view updates only when done syncing

* network bridge: send view updates only when done syncing

* tests for new network-bridge behavior

* add a test for updating approval entry with sig

* fix some warnings

* test load-all-blocks

* instantiate new parachains DB

* fix network-bridge empty view updates

* tweak

* fix wasm build, i think

* Update node/core/approval-voting/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* add some versioning to parachains_db

* warnings

* fix merge changes

* remove versioning again

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-04-01 17:33:52 +00:00
Robert Habermeier 0794f69306 Add dispute types and change InclusionInherent to ParasInherent (#2791)
* dispute types

* add Debug to dispute primitives in std and InherentData

* use ParachainsInherentData on node-side

* change inclusion_inherent to paras_inherent

* RuntimeDebug

* add type parameter to PersistedValidationData users

* fix test client

* spaces

* fix collation-generation test

* fix provisioner tests

* remove references to inclusion inherent
2021-04-01 18:23:27 +02:00
Robert Habermeier 5da762e728 Avoid querying the local validator in availability recovery (#2792)
* guide: don't request availability data from ourselves

* add QueryAllChunks message

* implement QueryAllChunks

* remove unused relay_parent from StoreChunk

* test QueryAllChunks

* fast paths make short roads

* test early exit behavior
2021-04-01 15:57:41 +02:00
Robert Habermeier 08d5b268a0 Retry availability until the receiver of the request is dropped (#2763)
* guide updates

* keep interactions alive until receivers drop

* retry indefinitely

* cancel approval tasks on finality

* use swap_remove instead of remove
2021-03-30 17:33:38 +02:00
Robert Habermeier 235c2e26b1 more logs in issue-approval (#2758) 2021-03-29 21:06:44 +02:00
Robert Habermeier 7c21dbbdf4 use unbounded when notifying distribution of new blocks (#2752) 2021-03-29 14:30:20 +00:00
Robert Habermeier fc154d2ada Reduce signal channel sizes and more logging on approval-voting (#2751)
* reduce signal channel capacity

* more tracing for approval-voting
2021-03-29 15:46:12 +02:00
Robert Klotzner 0a9fe852df Move non runtime related stuff into node/primitives (#2743)
* Remove stuff out of the runtime that does not belong there.

There might be more, but it is a start.

* White space fixes.

* Fix tests.

* Leave whitespace in ui tests alone.

* Add back zstd for no reason.

* Fix browser wasm (hopefully)
2021-03-29 02:15:44 +02:00
Robert Habermeier 8ebbe19d10 Split NetworkBridge and break cycles with Unbounded (#2736)
* overseer: pass messages directly between subsystems

* test that message is held on to

* Update node/overseer/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* give every subsystem an unbounded sender too

* remove metered_channel::name

1. we don't provide good names
2. these names are never used anywhere

* unused mut

* remove unnecessary &mut

* subsystem unbounded_send

* remove unused MaybeTimer

We have channel size metrics that serve the same purpose better now and the implementation of message timing was pretty ugly.

* remove comment

* split up senders and receivers

* update metrics

* fix tests

* fix test subsystem context

* use SubsystemSender in jobs system now

* refactor of awful jobs code

* expose public `run` on JobSubsystem

* update candidate backing to new jobs & use unbounded

* bitfield signing

* candidate-selection

* provisioner

* approval voting: send unbounded for assignment/approvals

* async not needed

* begin bridge split

* split up network tasks into background worker

* port over network bridge

* Update node/network/bridge/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* rename ValidationWorkerNotifications

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-29 01:18:53 +02:00
Andronik Ordian 171fc69961 approval-voting: more spans and metrics (#2742)
* approval-voting: more spans and metrics

* s/db/approval db
2021-03-28 23:28:49 +02:00
Robert Klotzner c6f07d8f31 Request based PoV distribution (#2640)
* Indentation fix.

* Prepare request-response for PoV fetching.

* Drop old PoV distribution.

* WIP: Fetch PoV directly from backing.

* Backing compiles.

* Runtime access and connection management for PoV distribution.

* Get rid of seemingly dead code.

* Implement PoV fetching.

Backing does not yet use it.

* Don't send `ConnectToValidators` for empty list.

* Even better - no need to check over and over again.

* PoV fetching implemented.

+ Typechecks
+ Should work

Missing:

- Guide
- Tests
- Do fallback fetching in case fetching from seconding validator fails.

* Check PoV hash upon reception.

* Implement retry of PoV fetching in backing.

* Avoid pointless validation spawning.

* Add jaeger span to pov requesting.

* Add back tracing.

* Review remarks.

* Whitespace.

* Whitespace again.

* Cleanup + fix tests.

* Log to log target in overseer.

* Fix more tests.

* Don't fail if group cannot be found.

* Simple test for PoV fetcher.

* Handle missing group membership better.

* Add test for retry functionality.

* Fix flaky test.

* Spaces again.

* Guide updates.

* Spaces.
2021-03-28 17:11:38 +02:00
Robert Habermeier ef816b089d Approval voting failsafe (#2675)
* add consensus log type

* origin and issue force_approve

* add origin in runtimes

* ref API

* scrape force_approve digest from header

* add parent_hash to BlockEntry

* add block_number to block entry and force_approve skeleton

* implement and plug in force-approve

* test force_approve

* test force_approve extraction

* westend runtime

* Update node/core/approval-voting/src/approval_db/v1/mod.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* rename

* Update runtime/parachains/src/initializer.rs

Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>
2021-03-28 00:57:04 +00:00
Robert Habermeier 064df81ee4 Add block number to activated leaves and associated fixes (#2718)
* add number to `ActivatedLeavesUpdate`

* update subsystem util and overseer

* use new ActivatedLeaf everywhere

* sort view

* sorted and limited view in network bridge

* use live block hash only if it's newer

* grumples
2021-03-26 13:06:40 +01:00
Robert Habermeier 349879df6b f+1 validators always approve (#2699)
* f+1 always approves

* guide

* grumbles

* grumbles

* fix test

* fix tests

* Update roadmap/implementers-guide/src/node/approval/approval-voting.md

Co-authored-by: Sergei Shulepov <sergei@parity.io>

Co-authored-by: Sergei Shulepov <sergei@parity.io>
2021-03-25 13:47:00 +01:00
Robert Habermeier e49b3e5ca9 Improve approval tracing (#2697)
* improve tracing for approval voting

* assignment criteria tracing

* new syntax
2021-03-24 17:46:01 +00:00
Bastian Köcher edb36153b1 Improve logging (#2669)
* Improve logging

* Review feedback

* Fix some warning and some further logging changes
2021-03-23 11:57:59 +01:00
Robert Habermeier a80b2bbf13 more approval voting instrumentation (#2663)
* more approval voting instrumentation

* fix `unapproved_candidates`

* Update node/core/approval-voting/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2021-03-23 11:10:54 +01:00
Bernhard Schuster ea6294fa79 restructure polkadot-node-jaeger (#2642) 2021-03-19 16:51:16 +01:00
Bastian Köcher 15ae5dd410 Improve the logging (#2645) 2021-03-18 23:28:43 +00:00
Robert Habermeier cd2b745b28 yet another set of logging improvements (#2638) 2021-03-17 15:53:37 -04:00
Robert Habermeier bb462683b9 add tracing when no assignment in candidate selection (#2623) 2021-03-14 21:28:06 +01:00
Robert Habermeier 94d50afd4e Backing and collator protocol traces including para-id (#2620)
* improve backing/provisioner spans

* span for collation requests

* add para_id to unbacked candidate spans

* differentiate validation-construction and find-assignment in selection

* better find-assignment spans

* organize unbacked-candidate spans directly under job root

* Update node/core/provisioner/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-14 16:51:14 +00:00
Robert Habermeier bdb3256396 more diagnostic logs for approval-voting (#2618) 2021-03-12 17:24:35 -06:00
Robert Habermeier bd2f5b27dd some more metrics for approval voting (#2612)
* some more metrics for approval voting

* fix tests

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-11 17:58:53 +00:00
Robert Habermeier b105d9acc0 more tracing for av-store (#2604)
* more tracing for av-store

* Update node/core/av-store/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/core/av-store/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/core/av-store/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update tracing everywhere

* Fix build

* More fixes

* Push cargo.lock

* Update

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Bastian Köcher <info@kchr.de>
2021-03-11 13:12:34 +01:00
Cecile Tonglet 7dfb666ea5 Polkadot companion for #8143 (#2535)
Companion for https://github.com/paritytech/substrate/pull/8143
2021-03-11 12:11:04 +01:00
Robert Habermeier 40a584bebc Better error handling in approval-voting (#2603)
* make approval voting resilient to dropped requests

* some more

* skip whole chain if encountering spurious error
2021-03-10 14:53:12 -06:00
Robert Habermeier 9331e06eda remove statement::invalid (#2597) 2021-03-10 10:31:17 -06:00
Andronik Ordian baa691deb1 prefix parachain log targets with parachain:: (#2600)
* prefix parachain log targets with parachain::

* even more consistent
2021-03-10 17:07:56 +01:00
Ashley 956be35dd4 Companion PR for substrate PR 8072 - Add a config field to babe epochs (#2467)
* Add a config field to babe epochs

* Fix test

* Add BABE_GENESIS_EPOCH_CONFIG consts

* Use PrimaryAndSecondaryVRFSlots and remove newlines

* Make epoch_configs Some

* Fix tests

* Fix test service tests

* Add a BabeEpochConfigMigrations OnRuntimeUpgrade

* Apply suggestions

* Use PrimaryAndSecondaryPlainSlots in kusama

* Remove migration from test runtime and rococo

* Add HasPalletPrefix

* Rename to BabePalletPrefix and change BabeApi -> Babe

* "Update Substrate"

* Update substrate

* Resolve parantheses errors

Co-authored-by: parity-processbot <>
2021-03-10 09:39:08 +00:00
Andronik Ordian 287604cf7e approval-voting metrics (#2483)
* approval-voting metrics

* metric: approvals produced
2021-03-09 15:32:31 -06:00
Robert Habermeier 30e4a67f0c Add some magic to signed statements and approval votes (#2585)
* add a magic number to backing statements encoded

* fix fallout in statement table

* fix some fallout in backing

* add magic to approval votes

* remove last references to Candidate variant

* update size-hint
2021-03-09 17:17:30 +00:00
Robert Klotzner b6a78d2976 Mostly, let guide reflect #2579 (#2583)
* Statement distribution is now validator only.

* Avoid Arc creation where it is not necessarily needed.
2021-03-09 01:46:24 +01:00
Robert Klotzner 48409e5548 Request based availability distribution (#2423)
* WIP

* availability distribution, still very wip.

Work on the requesting side of things.

* Some docs on what I intend to do.

* Checkpoint of session cache implementation

as I will likely replace it with something smarter.

* More work, mostly on cache

and getting things to type check.

* Only derive MallocSizeOf and Debug for std.

* availability-distribution: Cache feature complete.

* Sketch out logic in `FetchTask` for actual fetching.

- Compile fixes.
- Cleanup.

* Format cleanup.

* More format fixes.

* Almost feature complete `fetch_task`.

Missing:

- Check for cancel
- Actual querying of peer ids.

* Finish FetchTask so far.

* Directly use AuthorityDiscoveryId in protocol and cache.

* Resolve `AuthorityDiscoveryId` on sending requests.

* Rework fetch_task

- also make it impossible to check the wrong chunk index.
- Export needed function in validator_discovery.

* From<u32> implementation for `ValidatorIndex`.

* Fixes and more integration work.

* Make session cache proper lru cache.

* Use proper lru cache.

* Requester finished.

* ProtocolState -> Requester

Also make sure to not fetch our own chunk.

* Cleanup + fixes.

* Remove unused functions

- FetchTask::is_finished
- SessionCache::fetch_session_info

* availability-distribution responding side.

* Cleanup + Fixes.

* More fixes.

* More fixes.

adder-collator is running!

* Some docs.

* Docs.

* Fix reporting of bad guys.

* Fix tests

* Make all tests compile.

* Fix test.

* Cleanup + get rid of some warnings.

* state -> requester

* Mostly doc fixes.

* Fix test suite.

* Get rid of now redundant message types.

* WIP

* Rob's review remarks.

* Fix test suite.

* core.relay_parent -> leaf for session request.

* Style fix.

* Decrease request timeout.

* Cleanup obsolete errors.

* Metrics + don't fail on non fatal errors.

* requester.rs -> requester/mod.rs

* Panic on invalid BadValidator report.

* Fix indentation.

* Use typed default timeout constant.

* Make channel size 0, as each sender gets one slot anyways.

* Fix incorrect metrics initialization.

* Fix build after merge.

* More fixes.

* Hopefully valid metrics names.

* Better metrics names.

* Some tests that already work.

* Slightly better docs.

* Some more tests.

* Fix network bridge test.
2021-02-26 11:58:07 -06:00
Robert Habermeier 49705026e0 some initial spans for approval voting (#2525)
* some initial spans for approval voting

* add stage earlier
2021-02-25 17:56:50 +00:00
Bastian Köcher 327a203dc7 Companion for Substrate #8185 (#2507)
* Companion for Substrate #8185

https://github.com/paritytech/substrate/pull/8185

* "Update Substrate"

Co-authored-by: parity-processbot <>
2021-02-24 22:31:54 +01:00
Robert Habermeier 3300b53306 Approval Checking Improvements Omnibus (#2480)
* add tracing to approval voting

* notify if session info is not working

* add dispute period to chain specs

* propagate genesis session to parachains runtime

* use `on_genesis_session`

* protect against zero cores in computation

* tweak voting rule to be based off of best and add logs

* genesis configuration should use VRF slots only

* swallow more keystore errors

* add some docs

* make validation-worker args non-optional and update clap

* better tracing for bitfield signing and provisioner

* pass amount of bits in bitfields to inclusion instead of recomputing

* debug -> warn for some logs

* better tracing for availability recovery

* a little av-store tracing

* bridge: forward availability recovery messages

* add missing try_from impl

* some more tracing

* improve approval distribution tracing

* guide: hold onto pending approval messages until NewBlocks

* Hold onto pending approval messages until NewBlocks

* guide: adjust comment

* process all actions for one wakeup at a time

* vec

* fix network bridge test

* replace randomness-collective-flip with Babe

* remove PairNotFound
2021-02-23 14:12:28 -06:00
Bastian Köcher 2584c121fb Substrate companion for #8163 (#2492)
* Substrate companion for #8163

https://github.com/paritytech/substrate/pull/8163

* "Update Substrate"

Co-authored-by: parity-processbot <>
2021-02-22 14:52:43 +00:00
Bernhard Schuster 49c6aa9a76 feat/jaeger: more spans, more stages (#2477)
* feat/jaeger: more spans, more stages

Stage numbers are still arbitrarily picked.

* feat/jaeger: additional spans

* chore/spellcheck: improve the dictionary

* fix/jaeger JaegerSpan -> jaeger::Span
2021-02-19 14:19:43 +00:00
Robert Habermeier 006602eff2 Replace AuxStore with custom RocksDB (#2471)
* Use KeyValueDB in approval-voting

* use KVDB instead of AuxStore

* add rocksdb to cargo toml

* add a Config struct

* create new DB in service

* fix dep for regular node

* make optional

* post merge fix

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-02-18 09:24:46 -06:00
Bernhard Schuster 85489ceb36 [jaeger] unify all used tags, introduce builder pattern, additional… (#2473)
* feat/jaeger: unify all used tags, introduce builder pattern, additional candidate annotations

* chores

* fixes, incomplete fn rename

* another fix

* more fixes

* silly doctests
2021-02-18 15:07:17 +01:00
Robert Habermeier b7aac51341 A fast-path for requesting AvailableData from backing validators (#2453)
* guide changes for a fast-path requesting from backing validators

* add backing group to availability recovery message

* add new phase to interaction

* typos

* add full data messages

* handle new network messages

* dispatch full data requests

* cleanup

* check chunk index

* test for invalid recovery

* tests

* Typos.

* fix some grumbles

* be more explicit about error handling and control flow

* fast-path param

* use with_chunks_only in Service

Co-authored-by: Robert Klotzner <robert.klotzner@gmx.at>
2021-02-17 13:51:50 -06:00
Robert Habermeier 59e2a810bb remove unused RequestBlockAuthorshipData (#2455) 2021-02-17 11:54:51 -06:00
Andronik Ordian 4004217059 make approval voting work on a small testnet (#2421)
* insta-approval for low-node testnets

* fix

* handle 0 needed_approvals and add some logs

* downgrade logs to debug, per block

* fix a warning

* more useful logs

* test

* finish test 🎉

* not so fast

* the test passes, but is it enough?
2021-02-17 11:35:26 -06:00
Andronik Ordian 03a8566cba fix earliest_stored_session (#2461) 2021-02-17 10:43:15 -06:00
Robert Habermeier 9e525e296d Allow at most one candidate with a code upgrade in a block (#2456)
* guide: max one candidate with upgrade per block

* max one candidate with upgrade per block

* ignore on runtime level too
2021-02-17 10:42:06 -06:00
Andronik Ordian b28e1b4e74 fix approval import tests and some bugs (#2452)
* tests: use future::join

* fix panic in cache_session_info_for_head

* fix test assertion

* fix infinite loop in determine_new_blocks

* fix ordering in determine_new_blocks

* fix expected ancestry in tests
2021-02-16 19:08:35 +01:00
Robert Habermeier 4f21cc7e5c Integrate Approval Voting into Overseer / Service / GRANDPA (#2412)
* integrate approval voting into overseer

* expose public API and make keystore arc

* integrate overseer in service

* guide: `ApprovedAncestor` returns block number

* return block number along with hash from ApprovedAncestor

* introduce a voting rule for reporting on approval checking

* integrate the delay voting rule

* Rococo configuration

* fix compilation and add slack

* fix web-wasm build

* tweak parameterization

* migrate voting rules to asycn

* remove hack comment
2021-02-15 22:37:13 -06:00
Sergei Shulepov 5c68e6f9cc Mitigation of SIGBUS (#2440)
* Update shared-memory to new version & refactor

This two are combined in a single commit because the new version of
shared-memory doesn't provide the used functionality anymore.

Therefore in order to update the version of this crate we implement the
functionality that we need by ourselves, providing a cleaner API along
the way.

* Significantly decrease the required memory for a workspace

For some reason it was allocating an entire GiB of memory. I suspect
this has something to do with the current memory size limit of a PVF
execution environment (the prior name suggests that). However, we don't
need so much memory anywhere near that amount.

In fact, we could reduce the allocated size even more, but that maybe
for the next time.

* Unlink shmem just after opening

That will make sure that we don't leak the shmem accidentally.

* Do not compile workspace mod for androind and wasm

* Address some review comments

* Fix the test runner

* Fix missed +1 for the attached flag

* Use .expect rather than .unwrap

* Add a rustdoc for the workspace module

* fixup! Use .expect rather than .unwrap

* Add some doc comments to pub members

* Warn on error removing shm_unlink

* Change the alignment implementation

* Fix the comment nit
2021-02-15 14:40:26 -06:00