Commit Graph

100 Commits

Author SHA1 Message Date
Andronik Ordian 98082c5326 gossip: move authorities request to runtime api subsystem (#2798) 2021-04-01 23:51:01 +02:00
Robert Habermeier 0794f69306 Add dispute types and change InclusionInherent to ParasInherent (#2791)
* dispute types

* add Debug to dispute primitives in std and InherentData

* use ParachainsInherentData on node-side

* change inclusion_inherent to paras_inherent

* RuntimeDebug

* add type parameter to PersistedValidationData users

* fix test client

* spaces

* fix collation-generation test

* fix provisioner tests

* remove references to inclusion inherent
2021-04-01 18:23:27 +02:00
Robert Habermeier 5da762e728 Avoid querying the local validator in availability recovery (#2792)
* guide: don't request availability data from ourselves

* add QueryAllChunks message

* implement QueryAllChunks

* remove unused relay_parent from StoreChunk

* test QueryAllChunks

* fast paths make short roads

* test early exit behavior
2021-04-01 15:57:41 +02:00
Robert Klotzner 0a9fe852df Move non runtime related stuff into node/primitives (#2743)
* Remove stuff out of the runtime that does not belong there.

There might be more, but it is a start.

* White space fixes.

* Fix tests.

* Leave whitespace in ui tests alone.

* Add back zstd for no reason.

* Fix browser wasm (hopefully)
2021-03-29 02:15:44 +02:00
Robert Habermeier 5952e790fa Overseer: subsystems communicate directly (#2227)
* overseer: pass messages directly between subsystems

* test that message is held on to

* Update node/overseer/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* give every subsystem an unbounded sender too

* remove metered_channel::name

1. we don't provide good names
2. these names are never used anywhere

* unused mut

* remove unnecessary &mut

* subsystem unbounded_send

* remove unused MaybeTimer

We have channel size metrics that serve the same purpose better now and the implementation of message timing was pretty ugly.

* remove comment

* split up senders and receivers

* update metrics

* fix tests

* fix test subsystem context

* fix flaky test

* fix docs

* doc

* use select_biased to favor signals

* Update node/subsystem/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
2021-03-28 15:55:10 +00:00
Robert Klotzner c6f07d8f31 Request based PoV distribution (#2640)
* Indentation fix.

* Prepare request-response for PoV fetching.

* Drop old PoV distribution.

* WIP: Fetch PoV directly from backing.

* Backing compiles.

* Runtime access and connection management for PoV distribution.

* Get rid of seemingly dead code.

* Implement PoV fetching.

Backing does not yet use it.

* Don't send `ConnectToValidators` for empty list.

* Even better - no need to check over and over again.

* PoV fetching implemented.

+ Typechecks
+ Should work

Missing:

- Guide
- Tests
- Do fallback fetching in case fetching from seconding validator fails.

* Check PoV hash upon reception.

* Implement retry of PoV fetching in backing.

* Avoid pointless validation spawning.

* Add jaeger span to pov requesting.

* Add back tracing.

* Review remarks.

* Whitespace.

* Whitespace again.

* Cleanup + fix tests.

* Log to log target in overseer.

* Fix more tests.

* Don't fail if group cannot be found.

* Simple test for PoV fetcher.

* Handle missing group membership better.

* Add test for retry functionality.

* Fix flaky test.

* Spaces again.

* Guide updates.

* Spaces.
2021-03-28 17:11:38 +02:00
Robert Habermeier 064df81ee4 Add block number to activated leaves and associated fixes (#2718)
* add number to `ActivatedLeavesUpdate`

* update subsystem util and overseer

* use new ActivatedLeaf everywhere

* sort view

* sorted and limited view in network bridge

* use live block hash only if it's newer

* grumples
2021-03-26 13:06:40 +01:00
Robert Habermeier 8a396c678f Port availability recovery to use req/res (#2694)
* add AvailableDataFetchingRequest

* rename AvailabilityFetchingRequest to ChunkFetchingRequest

* rename AvailabilityFetchingResponse to Chunk_

* add AvailableDataFetching request

* add available data fetching request to availability recovery message

* remove availability recovery message

* fix

* update network bridge

* port availability recovery to request/response

* use validators.len(), not shuffling

* fix availability recovery tests

* update guide

* Update node/network/availability-recovery/src/lib.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Update node/network/availability-recovery/src/lib.rs

Co-authored-by: Arkadiy Paronyan <arkady.paronyan@gmail.com>

* remove println

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Arkadiy Paronyan <arkady.paronyan@gmail.com>
2021-03-25 15:34:24 +01:00
Robert Habermeier b8867d71bc Evict inactive peers from the collator protocol peer-set (#2680)
* malicious reputation cost is fatal

* make ReportBad a malicious cost

* futures control-flow for cleaning up inactive collator peers

* guide: network bridge updates

* add `PeerDisconnected` message

* guide: update

* reverse order

* remember to match

* implement disconnect peer in network bridge

* implement disconnect_inactive_peers

* test

* remove println

* don't hardcore policy

* add fuse outside of loop

* use default eviction policy
2021-03-24 13:32:28 +01:00
Robert Klotzner 503e2b74f9 Request based collation fetching (#2621)
* Introduce collation fetching protocol

also move to mod.rs

* Allow `PeerId`s in requests to network bridge.

* Fix availability distribution tests.

* Move CompressedPoV to primitives.

* Request based collator protocol: validator side

- Missing: tests
- Collator side
- don't connect, if not connected

* Fixes.

* Basic request based collator side.

* Minor fix on collator side.

* Don't connect in requests in collation protocol.

Also some cleanup.

* Fix PoV distribution

* Bump substrate

* Add back metrics + whitespace fixes.

* Add back missing spans.

* More cleanup.

* Guide update.

* Fix tests

* Handle results in tests.

* Fix weird compilation issue.

* Add missing )

* Get rid of dead code.

* Get rid of redundant import.

* Fix runtime build.

* Cleanup.

* Fix wasm build.

* Format fixes.

Thanks @andronik !
2021-03-18 09:06:36 +01:00
Andronik Ordian 4c1de66d5d subsystem for issuing background connection requests (#2538)
* initial subsystem for issuing connection requests

* finish the initial impl

* integrate with the overseer

* rename to gossip-support

* fix renamings leftover

* remove run_inner

* fix compilation

* random subset of sqrt
2021-03-02 10:40:06 +00:00
Robert Klotzner 48409e5548 Request based availability distribution (#2423)
* WIP

* availability distribution, still very wip.

Work on the requesting side of things.

* Some docs on what I intend to do.

* Checkpoint of session cache implementation

as I will likely replace it with something smarter.

* More work, mostly on cache

and getting things to type check.

* Only derive MallocSizeOf and Debug for std.

* availability-distribution: Cache feature complete.

* Sketch out logic in `FetchTask` for actual fetching.

- Compile fixes.
- Cleanup.

* Format cleanup.

* More format fixes.

* Almost feature complete `fetch_task`.

Missing:

- Check for cancel
- Actual querying of peer ids.

* Finish FetchTask so far.

* Directly use AuthorityDiscoveryId in protocol and cache.

* Resolve `AuthorityDiscoveryId` on sending requests.

* Rework fetch_task

- also make it impossible to check the wrong chunk index.
- Export needed function in validator_discovery.

* From<u32> implementation for `ValidatorIndex`.

* Fixes and more integration work.

* Make session cache proper lru cache.

* Use proper lru cache.

* Requester finished.

* ProtocolState -> Requester

Also make sure to not fetch our own chunk.

* Cleanup + fixes.

* Remove unused functions

- FetchTask::is_finished
- SessionCache::fetch_session_info

* availability-distribution responding side.

* Cleanup + Fixes.

* More fixes.

* More fixes.

adder-collator is running!

* Some docs.

* Docs.

* Fix reporting of bad guys.

* Fix tests

* Make all tests compile.

* Fix test.

* Cleanup + get rid of some warnings.

* state -> requester

* Mostly doc fixes.

* Fix test suite.

* Get rid of now redundant message types.

* WIP

* Rob's review remarks.

* Fix test suite.

* core.relay_parent -> leaf for session request.

* Style fix.

* Decrease request timeout.

* Cleanup obsolete errors.

* Metrics + don't fail on non fatal errors.

* requester.rs -> requester/mod.rs

* Panic on invalid BadValidator report.

* Fix indentation.

* Use typed default timeout constant.

* Make channel size 0, as each sender gets one slot anyways.

* Fix incorrect metrics initialization.

* Fix build after merge.

* More fixes.

* Hopefully valid metrics names.

* Better metrics names.

* Some tests that already work.

* Slightly better docs.

* Some more tests.

* Fix network bridge test.
2021-02-26 11:58:07 -06:00
Bernhard Schuster 31327eb0c7 test: add unit test to catch missing distribution to subsystems faster (#2495)
* test: add unit test to catch missing distribution to subsystems faster

* add a simple count

* introduce proc macro to generate dispatch type

* refactor

* refactor

* chore: add license

* fixup unit test

* fixup merge

* better errors

* better fmt

* fix error spans

* better docs

* better error messages

* ui test foo

* Update node/subsystem/dispatch-gen/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/network/bridge/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/subsystem/Cargo.toml

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/subsystem/dispatch-gen/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/subsystem/dispatch-gen/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Update node/network/bridge/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* fix compilation

* use find_map

* drop the silly 2, use _inner instead

* Update node/network/bridge/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update node/subsystem/dispatch-gen/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* nail deps down

* more into()

* flatten

* missing use statement

* fix messages order

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
2021-02-26 08:10:41 +00:00
Bernhard Schuster 49c6aa9a76 feat/jaeger: more spans, more stages (#2477)
* feat/jaeger: more spans, more stages

Stage numbers are still arbitrarily picked.

* feat/jaeger: additional spans

* chore/spellcheck: improve the dictionary

* fix/jaeger JaegerSpan -> jaeger::Span
2021-02-19 14:19:43 +00:00
Robert Habermeier b7aac51341 A fast-path for requesting AvailableData from backing validators (#2453)
* guide changes for a fast-path requesting from backing validators

* add backing group to availability recovery message

* add new phase to interaction

* typos

* add full data messages

* handle new network messages

* dispatch full data requests

* cleanup

* check chunk index

* test for invalid recovery

* tests

* Typos.

* fix some grumbles

* be more explicit about error handling and control flow

* fast-path param

* use with_chunks_only in Service

Co-authored-by: Robert Klotzner <robert.klotzner@gmx.at>
2021-02-17 13:51:50 -06:00
Robert Habermeier 59e2a810bb remove unused RequestBlockAuthorshipData (#2455) 2021-02-17 11:54:51 -06:00
Bernhard Schuster 1e2161258b refactor/reputation: unify the values used (#2462)
* refactor/reputation: unify the values used

* chore/rep: rename Annoy* to Cost*, make duplicate message Cost*Repeated

* fix/reputation: lost and found, convert at the boundary to substrate

* refactor/rep: move conversion to base reputation one level down, left conversions

* fix/rep: order of magnitude adjustments

Thanks pierre!

* remove spaces

* chore/rep: give rationale for order of magnitude

* refactor/rep: move UnifiedReputationChange to separate file

* fix/rep: order of magnitudes correction
2021-02-17 17:18:13 +01:00
Robert Habermeier a8d3aca13d Disputes High-level rewrite & Disputes runtime (#2424)
* REVERT: comment out graphviz

* rewrite most of protocol-disputes

* write about conclusion and  chain selection

* tie back in overview

* basic disputes module

* guide: InclusionInherent -> ParaInherent

* language

* add ParaInherentData type

* plug parainherentdata into provisioner

* provide_multi_dispute

* tweak

* inclusion pipeline logic for disputes

* be clearer about signature checking

* reject backing of disputed blocks

* some type rejigging

* known-disputes runtime API

* wire up inclusion

* Revert "REVERT: comment out graphviz"

This reverts commit 66203e362f7872cb413d258f74634a0aad70302b.

* timeouts

* include in initialization order

* address grumbles
2021-02-16 18:50:14 +00:00
Robert Habermeier 4f21cc7e5c Integrate Approval Voting into Overseer / Service / GRANDPA (#2412)
* integrate approval voting into overseer

* expose public API and make keystore arc

* integrate overseer in service

* guide: `ApprovedAncestor` returns block number

* return block number along with hash from ApprovedAncestor

* introduce a voting rule for reporting on approval checking

* integrate the delay voting rule

* Rococo configuration

* fix compilation and add slack

* fix web-wasm build

* tweak parameterization

* migrate voting rules to asycn

* remove hack comment
2021-02-15 22:37:13 -06:00
Bastian Köcher 4975521d48 Notify collators about seconded collation (#2430)
* Notify collators about seconded collation

This pr adds functionality to inform a collator that its collation was
seconded by a parachain validator. Before this signed statement was only
gossiped over the validation substream. Now, we explicitly send the
seconded statement to the collator after it was validated successfully.

Besides that it changes the `CollatorFn` to return an optional result
sender that is informed when the build collation was seconded by a
parachain validator.

* Add test

* Make sure we only send `Seconded` statements

* Make sure we only receive valid statements

* Review feedback
2021-02-14 17:36:04 +01:00
Robert Habermeier e48c687504 Implement Approval Voting Subsystem (#2112)
* skeleton

* skeleton aux-schema module

* start approval types

* start aux schema with aux store

* doc

* finish basic types

* start approval types

* doc

* finish basic types

* write out schema types

* add debug and codec impls to approval types

* add debug and codec impls to approval types

also add some key computation

* add debug and codec impls to approval types

* getters for block and candidate entries

* grumbles

* remove unused AssignmentId

* load_decode utility

* implement DB clearing

* function for adding new block entry to aux store

* start `canonicalize` implementation

* more skeleton

* finish implementing canonicalize

* tag TODO

* implement a test AuxStore

* add allow(unused)

* basic loading and deleting test

* block_entry test function

* add a test for `add_block_entry`

* ensure range is exclusive at end

* test clear()

* test that add_block sets children

* add a test for canonicalize

* extract Pre-digest from header

* utilities for extracting RelayVRFStory from the header-chain

* add approval voting message types

* approval distribution message type

* subsystem skeleton

* state struct

* add futures-timer

* prepare service for babe slot duration

* more skeleton

* better integrate AuxStore

* RelayVRF -> RelayVRFStory

* canonicalize

* implement some tick functionality

* guide: tweaks

* check_approval

* more tweaks and helpers

* guide: add core index to candidate event

* primitives: add core index to candidate event

* runtime: add core index to candidate events

* head handling (session window)

* implement `determine_new_blocks`

* add TODO

* change error type on functions

* compute RelayVRFModulo assignments

* compute RelayVRFDelay assignments

* fix delay tranche calc

* assignment checking

* pluralize

* some dummy code for fetching assignments

* guide: add babe epoch runtime API

* implement a current_epoch() runtime API

* compute assignments

* candidate events get backing group

* import blocks and assignments into DB

* push block approval meta

* add message types, no overseer integration yet

* notify approval distribution of new blocks

* refactor import into separate functions

* impl tranches_to_approve

* guide: improve function signatures

* guide: remove Tick from ApprovalEntry

* trigger and broadcast assignment

* most of approval launching

* remove byteorder crate

* load blocks back to finality, except on startup

* check unchecked assignments

* add claimed core to approval voting message

* fix checks

* assign only to backing group

* remove import_checked_assignment from guide

* newline

* import assignments

* abstract out a bit

* check and import approvals

* check full approvals from assignment import too

* comment

* create a Transaction utility

* must_use

* use transaction in `check_full_approvals`

* wire up wakeups

* add Ord to CandidateHash

* wakeup refactoring

* return candidate info from add_block_entry

* schedule wakeups

* background task: do candidate validation

* forward candidate validation requests

* issue approval votes when requested

* clean up a couple TODOs

* fix up session caching

* clean up last unimplemented!() items

* fix remaining warnings

* remove TODO

* implement handle_approved_ancestor

* update Cargo.lock

* fix runtime API tests

* guide: cleanup assignment checking

* use claimed candidate index instead of core

* extract time to a trait

* tests module

* write a mock clock for testing

* allow swapping out the clock

* make abstract over assignment criteria

* add some skeleton tests and simplify params

* fix backing group check

* do backing group check inside check_assignment_cert

* write some empty test functions to implement

* add a test for non-backing

* test that produced checks pass

* some empty test ideas

* runtime/inclusion: remove outdated TODO

* fix compilation

* av-store: fix tests

* dummy cert

* criteria tests

* move `TestStore` to main tests file

* fix unused warning

* test harness beginnings

* resolve slots renaming fallout

* more compilation fixes

* wip: extract pure data into a separate module

* wip: extract pure data into a separate module

* move types completely to v1

* add persisted_entries

* add conversion trait impls

* clean up some warnings

* extract import logic to own module

* schedule wakeups

* experiment with Actions

* uncomment approval-checking

* separate module for approval checking utilities

* port more code to use actions

* get approval pipeline using actions

* all logic is uncommented

* main loop processes actions

* all loop logic uncommented

* separate function for handling actions

* remove last unimplemented item

* clean up warnings

* State gives read-only access to underlying DB

* tests for approval checking

* tests for approval criteria

* skeleton test module for import

* list of import tests to do

* some test glue code

* test reject bad assignment

* test slot too far in future

* test reject assignment with unknown candidate

* remove loads_blocks tests

* determine_new_blocks back to finalized & harness

* more coverage for determining new blocks

* make `imported_block_info` have less reliance on State

* candidate_info tests

* tests for session caching

* remove println

* extricate DB and main TestStores

* rewrite approval checking logic to counteract early delays

* move state out of function

* update approval-checking tests

* tweak wakeups & scheduling logic

* rename check_full_approvals

* test that assignment import updates candidate

* some approval import tests

* some tests for check_and_apply_approval

* add 'full' qualifier to avoid confusion

* extract should-trigger logic to separate function

* some tests for all triggering

* tests for when we trigger assignments

* test wakeups

* add block utilities for testing

* some more tests for approval updates

* approved_ancestor tests

* new action type for launch approval

* process-wakeup tests

* clean up some warnings

* fix in_future test

* approval checking tests

* tighten up too-far-in-future

* special-case genesis when caching sessions

* fix bitfield len

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-02-11 10:21:47 -06:00
Andronik Ordian 6981a1c366 validator_discovery: pass PeerSet to the request (#2372)
* validator_discovery: pass PeerSet to the request

* validator_discovery: track PeerSet of connected peers

* validator_discovery: fix tests

* validator_discovery: fix long line

* some fixes

* some validator_discovery logs

* log validator discovery request

* Also connect to validators on `DistributePoV`.

* validator_discovery: store the whole state per peer_set

* bump spec versions in kusama, polkadot and westend

* Correcting doc.

* validator_discovery: bump channel capacity

* pov-distribution: some cleanup

* this should fix the test, but it does not

* I just got some brain damage while fixing this

Why are you even reading this???

* wrap long line

* address some review nits

Co-authored-by: Robert Klotzner <robert.klotzner@gmx.at>
2021-02-08 08:57:59 +01:00
Robert Klotzner 0cb1ccd122 Generic request/response infrastructure for Polkadot (#2352)
* Move NetworkBridgeEvent to subsystem::messages.

It is not protocol related at all, it is in fact only part of the
subsystem communication as it gets wrapped into messages of each
subsystem.

* Request/response infrastructure is taking shape.

WIP: Does not compile.

* Multiplexer variant not supported by Rusts type system.

* request_response::request type checks.

* Cleanup.

* Minor fixes for request_response.

* Implement request sending + move multiplexer.

Request multiplexer is moved to bridge, as there the implementation is
more straight forward as we can specialize on `AllMessages` for the
multiplexing target.

Sending of requests is mostly complete, apart from a few `From`
instances. Receiving is also almost done, initializtion needs to be
fixed and the multiplexer needs to be invoked.

* Remove obsolete multiplexer.

* Initialize bridge with multiplexer.

* Finish generic request sending/receiving.

Subsystems are now able to receive and send requests and responses via
the overseer.

* Doc update.

* Fixes.

* Link issue for not yet implemented code.

* Fixes suggested by @ordian - thanks!

- start encoding at 0
- don't crash on zero protocols
- don't panic on not yet implemented request handling

* Update node/network/protocol/src/request_response/v1.rs

Use index 0 instead of 1.

Co-authored-by: Andronik Ordian <write@reusable.software>

* Update node/network/protocol/src/request_response.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Fix existing tests.

* Better avoidance of division by zoro errors.

* Doc fixes.

* send_request -> start_request.

* Fix missing renamings.

* Update substrate.

* Pass TryConnect instead of true.

* Actually import `IfDisconnected`.

* Fix wrong import.

* Update node/network/bridge/src/lib.rs

typo

Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>

* Update node/network/bridge/src/multiplexer.rs

Remove redundant import.

Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>

* Stop doing tracing from within `From` instance.

Thanks for the catch @tomaka!

* Get rid of redundant import.

* Formatting cleanup.

* Fix tests.

* Add link to issue.

* Clarify comments some more.

* Fix tests.

* Formatting fix.

* tabs

* Fix link

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Use map_err.

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Improvements inspired by suggestions by @drahnr.

- Channel size is now determined by function.
- Explicitely scope NetworkService::start_request.

Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
2021-02-03 20:21:09 +00:00
Peter Goodspeed-Niklaus 25cfb884af misbehavior: report multiple offenses per validator as necessary (#2222)
* use proper descriptive generic type names

* cleanup

* Table stores a list of detected misbehavior per authority

* add Table::drain_misbehaviors_for

* WIP: unify misbehavior types; report multiple misbehaviors per validator

Code checks, but tests don't yet pass.

* update drain_misbehaviors: return authority id as well as specific misbehavior

* enable unchecked construction of Signed structs in tests

* remove test-features feature & unnecessary generic

* fix backing tests

This took a while to figure out, because where we'd previously been
passing around `SignedFullStatement`s, we now needed to construct
those on the fly within the test, to take advantage of the signature-
checking in the constructor. That, in turn, necessitated changing the
iterable type of `drain_misbehaviors` to return the validator index,
and passing that validator index along within the misbehavior report.

Once that was sorted, however, it became relatively straightforward:
just needed to add appropriate methods to deconstruct the misbehavior
reports, and then we could construct the signed statements directly.

* fix bad merge
2021-01-28 15:19:05 +01:00
Andronik Ordian 3f1e1a6ff7 impl approval distribution (#2160)
* initial impl approval distribution

* initial tests and fixes

* batching seems difficult: different peers have different needs

* bridge: fix test after merge

* some guide updates

* only send assignments to peers who know about the block

* fix a test, add approvals test

* simplify

* do not send assignment to peers for finalized blocks

* guide: protocol input and output

* one more test

* more comments, logs, initial metrics

* fix a typo

* one more thing: early return when reimporting a thing locally
2021-01-25 18:14:32 -05:00
Sergei Shulepov 226af6a877 Remove TransientValidationData (#2272)
* collation-generation: use persisted validation data

* node: remote FullValidationData API

* runtime: remove FullValidationData API

* backing tests: use persisted validation data

* FullCandidateReceipt: use persisted validation data

This is not a big change since this type is not used anywhere

* Remove ValidationData and TransientValidationData

Also update the guide
2021-01-18 18:57:09 -05:00
Fedor Sakharov 90a686266f Availability recovery subsystem (#2122)
* Adds message types

* Add code skeleton

* Adds subsystem code.

* Adds a first test

* Adds interaction result to availability_lru

* Use LruCache instead of a HashMap

* Whitespaces to tabs

* Do not ignore errors

* Change error type

* Add a timeout to chunk requests

* Add custom errors and log them

* Adds replace_availability_recovery method

* recovery_threshold computed by erasure crate

* change core to std

* adds docs to error type

* Adds a test for invalid reconstruction

* refactors interaction run into multiple methods

* Cleanup AwaitedChunks

* Even more fixes

* Test that recovery with wrong root is an error

* Break to launch another requests

* Styling fixes

* Add SessionIndex to API

* Proper relay parents for MakeRequest

* Remove validator_discovery and use message

* Remove a stream on exhaustion

* On cleanup free the request streams

* Fix merge and refactor
2021-01-15 02:06:25 +00:00
Robert Habermeier 9dee08af2f Batch messages to network bridge and introduce a timeout to SubsystemContext::send_message (#2197)
* guide: batch network messages

* bridge: batch

* av-dist: batch outgoing messages

* time-out message sends in subsystem context

* Update node/subsystem/src/messages.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>

* Revert "time-out message sends in subsystem context"

This reverts commit d49be62557ec37c8a350b93718acad723df704ef.

* Update node/network/availability-distribution/src/lib.rs

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2021-01-07 16:01:03 -05:00
Fedor Sakharov b97f52a4c8 Optimizations of av-store (#2223)
* Store all chunks and in a single transaction

* Adds chunks LRU to store

* Add pruning records metrics

* Use honest cache instead of LRU

* Remove unnecessary optional cache

* Fix review nits that are fixable
2021-01-07 21:00:30 +00:00
Bastian Köcher 5be092894e Add one Jaeger span per relay parent (#2196)
* Add one Jaeger span per relay parent

This adds one Jaeger span per relay parent, instead of always creating
new spans per relay parent. This should improve the UI view, because
subsystems are now grouped below one common span.

* Fix doc tests

* Replace `PerLeaveSpan` to `PerLeafSpan`

* More renaming

* Moare

* Update node/subsystem/src/lib.rs

Co-authored-by: Andronik Ordian <write@reusable.software>

* Skip the spans

* Increase `spec_version`

Co-authored-by: Andronik Ordian <write@reusable.software>
2021-01-05 15:09:25 +01:00
Robert Habermeier 41102a6ff9 Improve Network Spans (#2169)
* utility functions for erasure-coding threshold

* add candidate-hash tag to candidate jaeger spans

* debug implementation for jaeger span

* add a span to each live candidate in availability dist.

* availability span covers only our piece

* fix tests

* keep span alive slightly longer

* remove spammy bitfield-gossip-received log

* Revert "remove spammy bitfield-gossip-received log"

This reverts commit 831a2db506d66f64ea516af3caf891e8643f5c43.

* add claimed validator to bitfield-gossip span

* add peer-id to handle-incoming span

* add peer-id to availability distribution span

* Update node/network/availability-distribution/src/lib.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Update erasure-coding/src/lib.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

* Update node/subsystem/src/jaeger.rs

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>

Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
2021-01-04 17:29:53 +00:00
Robert Habermeier 01d271caeb Fix statement distribution: forward statements to other peers. (#2146)
* add candidate hash statement circulation span

* add relay-parent to hash-span

* Some typos and misspellings in docs I found, during my studies. (#2144)

* Fix stale link to overseer docs

* Some typos and mispellings in docs/comments

I found during studying how Polkadot works.

* Rococo V1 (#2141)

* Update to latest master and use 30 minutes sessions

* add bootnodes to chainspec

* Update Substrate

* Update chain-spec

* Update Cargo.lock

* GENESIS

* Change session length to one hour

* Bump spec_version to not fuck anything up ;)

Co-authored-by: Erin Grasmick <erin@parity.io>

* avoid creating duplicate unbacked spans when we see extra statements (#2145)

* improve jaeger spans for statement distribution

* tweak and add failing test for repropagation

* make a change that gets the test passing

* guide: clarify

* remove semicolon

Co-authored-by: Robert Klotzner <eskimor@users.noreply.github.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Erin Grasmick <erin@parity.io>
2020-12-20 18:52:07 -05:00
Robert Habermeier e95be77eb6 overseer: observe stalled subsystems and shut down (#2148)
* overseer: observe stalled subsystems and shut down

* notify on send_message failure as well
2020-12-20 20:30:02 +00:00
Pierre Krieger a141a6fbc9 Fix Jaeger service name (#2140) 2020-12-18 16:09:33 +00:00
Bastian Köcher d0c97539e4 Fix bug and further optimizations in availability distribution (#2104)
* Fix bug and further optimizations in availability distribution

- There was a bug that resulted in only getting one candidate per block
as the candidates were put into the hashmap with the relay block hash as
key. The solution for this is to use the candidate hash and the relay
block hash as key.
- We stored received/sent messages with the candidate hash and chunk
index as key. The candidate hash wasn't required in this case, as the
messages are already stored per candidate.

* Update node/core/bitfield-signing/src/lib.rs

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>

* Remove the reverse map

* major refactor of receipts & query_live

* finish refactoring

remove ancestory mapping,

improve relay-parent cleanup & receipts-cache cleanup,
add descriptor to `PerCandidate`

* rename and rewrite query_pending_availability

* add a bunch of consistency tests

* Add some last changes

* xy

* fz

* Make it compile again

* Fix one test

* Fix logging

* Remove some buggy code

* Make tests work again

* Move stuff around

* Remove dbg

* Remove state from test_harness

* More refactor and new test

* New test and fixes

* Move metric

* Remove "duplicated code"

* Fix tests

* New test

* Change break to continue

* Update node/core/av-store/src/lib.rs

* Update node/core/av-store/src/lib.rs

* Update node/core/bitfield-signing/src/lib.rs

Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>

* update guide to match live_candidates changes

* add comment

* fix bitfield signing

Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
2020-12-17 19:09:17 +00:00
Andronik Ordian 3f5156e866 refactor View to include finalized_number (#2128)
* refactor View to include finalized_number

* guide: update the NetworkBridge on BlockFinalized

* av-store: fix the tests

* actually fix tests

* grumbles

* ignore macro doctest

* use Hash::repeat_bytes more consistently

* broadcast empty leaves updates as well

* fix issuing view updates on empty leaves updates
2020-12-17 12:50:58 -05:00
Pierre Krieger e6e3bda3f0 Improve Jaeger errors and debugging experience (#2127)
* Improve Jaeger errors and debugging experience

* Bind on 0.0.0.0:0 instead
2020-12-17 14:38:56 +00:00
Bernhard Schuster a5fe710cc6 initial jaeger integration (#2047)
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
2020-12-11 17:38:55 +01:00
Bernhard Schuster 35c71bf315 addition error definitions (#2107)
* remove low information density error doc comments

* another round of error dancing

* fix compilation

* remove stale `None` argument

* adjust test, minor slip in command

* only add AvailabilityError for full node features

* another None where none shuld be
2020-12-10 16:57:36 +01:00
Peter Goodspeed-Niklaus e7e9605f87 do not store backed candidates in the provisioner (#1909)
* guide: non-semantic changes

* guide: update per the issue description

* GetBackedCandidates operates on multiple hashes now

* GetBackedCandidates still needs a relay parent

* implement changes specified in guide

* distinguish between various occasions for canceled oneshots

* add tracing info to getbackedcandidates

* REVERT ME: add tracing messages for GetBackedCandidates

Note that these messages are only sometimes actually passed on to the
candidate backing subsystem, with the consequence that it is
unexpectedly frequent that the provisioner fails to create its
provisionable data.

* REVERT ME: more tracing logging

* REVERT ME: log when CandidateBackingJob receives any message at all

* REVERT ME: log when send_msg sends a message to a job

* fix candidate-backing tests

* streamline GetBackedCandidates

This uses table.attested_candidate instead of table.get_candidate, because
it's not obvious how to get a BackedCandidate from just a
CommittedCandidateReceipt.

* REVERT ME: more logging tracing job lifespans

* promote warning about job premature demise

* don't terminate CandiateBackingJob::run_loop in event of failure to process message

* Revert "REVERT ME: more logging tracing job lifespans"

This reverts commit 7365f2fb3dec988d95cfcd317eba75587fe7fd16.

* Revert "REVERT ME: log when send_msg sends a message to a job"

This reverts commit 58e46aad038e6517d6d56390c8be65b046a21884.

* Revert "REVERT ME: log when CandidateBackingJob receives any message at all"

This reverts commit 0d6f38413c7c66b5e9e81dabc587906fa9f82656.

* Revert "REVERT ME: more tracing logging"

This reverts commit 675fd2628e84d1596965280e7314155ef21b28e6.

* Revert "REVERT ME: add tracing messages for GetBackedCandidates"

This reverts commit e09e156493430b33b6c8ab4b5cedb3f2f91afd51.

* formatting

* add logging message to CandidateBackingJob::run_loop start

* REVERT ME: add tracing to candidate-backing job creation

* run candidatebacking loop even if no assignment

* use unique error variants for each canceled oneshot

* Revert "REVERT ME: add tracing to candidate-backing job creation"

This reverts commit 8ce5f4f0bd7186dade134b118751480f72ea1fd6.

* try_runtime_api more to reduce silent exits

* add sanity check that returned backed candidates preserve ordering

* remove redundant err attribute
2020-12-04 11:24:59 +01:00
Bastian Köcher 536dceb4f6 Simplify subsystem jobs (#2037)
* Simplify subsystem jobs

This pr simplifies the subsystem jobs interface. Instead of requiring an
extra message that is used to signal that a job should be ended, a job
now ends when the receiver returns `None`. Besides that it changes the
interface to enforce that messages to a job provide a relay parent.

* Drop ToJobTrait

* Remove FromJob

We always convert this message to FromJobCommand anyway.
2020-11-30 17:01:26 +01:00
Robert Habermeier 0a79d663e4 Move erasure root out of candidate commitments and into descriptor (#2010)
* guide: move erasure-root to candidate descriptor

* primitives: move erasure root to descriptor

* guide: unify candidate commitments and validation outputs

* primitives: unify validation outputs and candidate commitments

* parachains-runtime: fix fallout

* runtimes: fix fallout

* collation generation: fix fallout

* fix stray reference in primitives

* fix fallout in node-primitives

* fix remaining fallout in collation generation

* fix fallout in candidate validation

* fix fallout in runtime API subsystem

* fix fallout in subsystem messages

* fix fallout in candidate backing

* fix fallout in availability distribution

* don't clone

* clone

Co-authored-by: Sergei Shulepov <sergei@parity.io>
2020-11-27 16:39:42 +00:00
Andronik Ordian 39a12b68f6 past-session validator discovery APIs (#2009)
* guide: fix formatting for SessionInfo module

* primitives: SessionInfo type

* punt on approval keys

* ah, revert the type alias

* session info runtime module skeleton

* update the guide

* runtime/configuration: sync with the guide

* runtime/configuration: setters for newly added fields

* runtime/configuration: set codec indexes

* runtime/configuration: update test

* primitives: fix SessionInfo definition

* runtime/session_info: initial impl

* runtime/session_info: use initializer for session handling (wip)

* runtime/session_info: mock authority discovery trait

* guide: update the initializer's order

* runtime/session_info: tests skeleton

* runtime/session_info: store n_delay_tranches in Configuration

* runtime/session_info: punt on approval keys

* runtime/session_info: add some basic tests

* Update primitives/src/v1.rs

* small fixes

* remove codec index annotation on structs

* fix off-by-one error

* validator_discovery: accept a session index

* runtime: replace validator_discovery api with session_info

* Update runtime/parachains/src/session_info.rs

Co-authored-by: Sergei Shulepov <sergei@parity.io>

* runtime/session_info: add a comment about missing entries

* runtime/session_info: define the keys

* util: expose connect_to_past_session_validators

* util: allow session_info requests for jobs

* runtime-api: add mock test for session_info

* collator-protocol: add session_index to test state

* util: fix error message for runtime error

* fix compilation

* fix tests after merge with master

Co-authored-by: Sergei Shulepov <sergei@parity.io>
2020-11-26 11:02:50 +00:00
Bastian Köcher 4ce744818c Some code cleanup in overseer (#2008)
* Some code cleanup in overseer

- Switches to select! in the overseer run loop to be more fair about
message processing between the different sources.
- Added a check to only send `ActiveLeaves` if the update actually
contains any data.

* Move the check

* Restore old behavior

* Simplify message sending and signal sending to subsystems

* Update node/subsystem/src/lib.rs
2020-11-25 09:27:50 +00:00
Andronik Ordian 69b103b1d5 overseer: send_msg should not return an error (#1995)
* send_message should not return an error

* Apply suggestions from code review

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* s/send_logging_error/send_and_log_error

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-11-23 12:42:14 +01:00
Andronik Ordian 97f5bd9047 cleanup validator discovery (#1992)
* use snake_case for log targets

* remove unused continue

* validator_discovery: when disconnecting, use all addresses

* validator_discovery: simplify request revokation

* fix a typo
2020-11-20 18:34:57 +00:00
Peter Goodspeed-Niklaus e49989971d Add tracing support to node (#1940)
* drop in tracing to replace log

* add structured logging to trace messages

* add structured logging to debug messages

* add structured logging to info messages

* add structured logging to warn messages

* add structured logging to error messages

* normalize spacing and Display vs Debug

* add instrumentation to the various 'fn run'

* use explicit tracing module throughout

* fix availability distribution test

* don't double-print errors

* remove further redundancy from logs

* fix test errors

* fix more test errors

* remove unused kv_log_macro

* fix unused variable

* add tracing spans to collation generation

* add tracing spans to av-store

* add tracing spans to backing

* add tracing spans to bitfield-signing

* add tracing spans to candidate-selection

* add tracing spans to candidate-validation

* add tracing spans to chain-api

* add tracing spans to provisioner

* add tracing spans to runtime-api

* add tracing spans to availability-distribution

* add tracing spans to bitfield-distribution

* add tracing spans to network-bridge

* add tracing spans to collator-protocol

* add tracing spans to pov-distribution

* add tracing spans to statement-distribution

* add tracing spans to overseer

* cleanup
2020-11-20 12:02:04 +01:00
Sergei Shulepov fb6b94eaf0 Don't just swallow message in dummy-subsystem, but log (#1935)
* Don't just swallow error in dummy-subsystem, but log

* Add target to log.
2020-11-10 12:17:27 +01:00
Sergei Shulepov c96f8cfcca Implement HRMP (#1900)
* HRMP: Update the impl guide

* HRMP: Incorporate the channel notifications into the guide

* HRMP: Renaming in the impl guide

* HRMP: Constrain the maximum number of HRMP messages per candidate

This commit addresses the HRMP part of https://github.com/paritytech/polkadot/issues/1869

* XCM: Introduce HRMP related message types

* HRMP: Data structures and plumbing

* HRMP: Configuration

* HRMP: Data layout

* HRMP: Acceptance & Enactment

* HRMP: Test base logic

* Update adder collator

* HRMP: Runtime API for accessing inbound messages

Also, removing some redundant fully-qualified names.

* HRMP: Add diagnostic logging in acceptance criteria

* HRMP: Additional tests

* Self-review fixes

* save test refactorings for the next time

* Missed a return statement.

* a formatting blip

* Add missing logic for appending HRMP digests

* Remove the channel contents vectors which became empty

* Tighten HRMP channel digests invariants.

* Apply suggestions from code review

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Remove a note about sorting for channel id

* Add missing rustdocs to the configuration

* Clarify and update the invariant for HrmpChannelDigests

* Make the onboarding invariant less sloppy

Namely, introduce `Paras::is_valid_para` (in fact, it already is present
in the implementation) and hook up the invariant to that.

Note that this says "within a session" because I don't want to make it
super strict on the session boundary. The logic on the session boundary
should be extremely careful.

* Make `CandidateCheckContext` use T::BlockNumber for hrmp_watermark

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-11-06 15:35:36 +00:00
Bastian Köcher 640264f38b Make CandidateHash a real type (#1916)
* Make `CandidateHash` a real type

This pr adds a new type `CandidateHash` that is used instead of the
opaque `Hash` type. This helps to ensure on the type system level that
we are passing the correct types.

This pr also fixes wrong usage of `relay_parent` as `candidate_hash`
when communicating with the av storage.

* Update core-primitives/src/lib.rs

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>

* Wrap the lines

Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
2020-11-05 16:28:45 +01:00