* crate skeleton and type definitions
* add ChainSelectionMessage
* add error type
* run loop
* fix overseer
* simplify determine_new_blocks API
* write an overlay struct and fetch new blocks
* add new function to overlay
* more flow
* add leaves to overlay and add a strong type around leaves-set
* add is_parent_viable
* implement block import, ignoring reversions
* add stagnant-at to overlay
* add stagnant
* add revert consensus log
* flow for reversions
* extract and import block reversions
* recursively update viability
* remove redundant parameter from WriteBlockEntry
* do some removal of viable leaves
* address grumbles
* refactor
* address grumbles
* add comment about non-monotonicity
* extract backend to submodule
* begin the hunt for viable leaves
* viability pivots for updating the active leaves
* remove LeafSearchFrontier
* partially -> explicitly viable and untwist some booleans
* extract tree to submodule
* implement block finality update
* Implement block approval routine
* implement stagnant detection
* ensure blocks pruned on finality are removed from the active leaves set
* write down some planned test cases
* floww
* leaf loading
* implement best_leaf_containing
* write down a few more tests to do
* remove dependence of tree on header
* guide: ChainApiMessage::BlockWeight
* node: BlockWeight ChainAPI
* fix compile issue
* note a few TODOs for the future
* fetch block weight using new BlockWeight ChainAPI
* implement unimplemented
* sort leaves by block number after weight
* remove warnings and add more TODOs
* create test module
* storage for test backend
* wrap inner in mutex
* add write waker query to test backend
* Add OverseerSignal -> FromOverseer conversion
* add test harnes
* add no-op test
* add some more test helpers
* the first test
* more progress on tests
* test two subtrees
* determine-new-blocks: cleaner genesis avoidance and tighter ancestry requests
* don't make ancestry requests when asking for one block
* add a couple more tests
* add to AllMessages in guide
* remove bad spaces from bridge
* compact iterator
* test import with gaps
* more reversion tests
* test finalization pruning subtrees
* fixups
* test clobbering and fix bug in overlay
* exhaustive backend state after finalizaiton tested
* more finality tests
* leaf tests
* test approval
* test ChainSelectionMessage::Leaves thoroughly
* remove TODO
* avoid Ordering::is_ne so CI can build
* comment algorithmic complexity
* Update node/core/chain-selection/src/lib.rs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* skeleton for dispute-coordinator
* add coordinator and participation message types
* begin dispute-coordinator DB
* functions for loading
* implement strongly-typed DB transaction
* add some tests for DB transaction
* core logic for pruning
* guide: update candidate-votes key for coordinator
* update candidate-votes key
* use big-endian encoding for session, and implement upper bound generator
* finish implementing pruning
* add a test for note_current_session
* define state of the subsystem itself
* barebones subsystem definition
* control flow
* more control flow
* implement session-updating logic
* trace
* control flow for message handling
* Update node/core/dispute-coordinator/src/lib.rs
Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>
* Update node/subsystem/src/messages.rs
Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>
* some more control flow
* guide: remove overlay
* more control flow
* implement some DB getters
* make progress on importing statements
* add SignedDisputeStatement struct
* move ApprovalVote to shared primitives
* add a signing-payload API to explicit dispute statements
* add signing-payload to CompactStatement
* add relay-parent hash to seconded/valid dispute variatns
* correct import
* type-safe wrapper around dispute statements
* use checked dispute statement in message type
* extract rolling session window cache to subsystem-util
* extract session window tests
* approval-voting: use rolling session info cache
* reduce dispute window to match runtime in practice
* add byzantine_threshold and supermajority_threshold utilities to primitives
* integrate rolling session window
* Add PartialOrd to CandidateHash
* add Ord to CandidateHash
* implement active dispute update
* add dispute messages to AllMessages
* add dispute stubs to overseer
* inform dispute participation to participate
* implement issue_local_statement
* implement `determine_undisputed_chain`
* fix warnings
* test harness for dispute coordinator tests
* add more helpers to test harness
* add some more helpers
* some tests for dispute coordinator
* ignore wrong validator indices
* test finality voting rule constraint
* add more tests
* add variants to network bridge
* fix test compilation
* remove most dispute coordinator functionality
as of #3222 we can do most of the work within the approval voting subsystem
* Revert "remove most dispute coordinator functionality"
This reverts commit 9cd615e8eb6ca0b382cbaff525d813e753d6004e.
* Use thiserror
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Update node/core/dispute-coordinator/src/lib.rs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* extract tests to separate module
* address nit
* adjust run_iteration API
Co-authored-by: André Silva <123550+andresilva@users.noreply.github.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Create validator_side module
* Subsume Candidate Selection
* Add test to ensure candidate backing logic is correct
* Ensure secondings are adequately cleaned up and address test flakyness
* Address Feedback
* guide: reversion safety
* guide: manage reversion safety in subsystems
* add leaf status to ActivatedLeaf
* add an LRU-cache to overseer for staleness detection
* update ActivatedLeaf usages in tests to contain status field
* add variant where missed accidentally
* add some helpers to LeafStatus
* address grumbles
* Remove signature verification in backing.
`SignedFullStatement` now signals that the signature has already been
checked.
* Remove unused check_payload function.
* Introduced unchecked signed variants.
* Fix inclusion to use unchecked variant.
* More unchecked variants.
* Use unchecked variants in protocols.
* Start fixing statement-distribution.
* Fixup statement distribution.
* Fix inclusion.
* Fix warning.
* Fix backing properly.
* Fix bitfield distribution.
* Make crypto store optional for `RuntimeInfo`.
* Factor out utility functions.
* get_group_rotation_info
* WIP: Collator cleanup + check signatures.
* Convenience signature checking functions.
* Check signature on collator-side.
* Fix warnings.
* Fix collator side tests.
* Get rid of warnings.
* Better Signed/UncheckedSigned implementation.
Also get rid of Encode/Decode for Signed! *party*
* Get rid of dead code.
* Move Signed in its own module.
* into_checked -> try_into_checked
* Fix merge.
* Factor out runtime module into utils.
* First fatal error design.
* Better error handling infra.
* Error handling cleanup.
* Send to peers of our group first.
* Finish backing group prioritization.
* Little cleanup.
* More cleanup.
* Forgot to checkin error.rs.
* Notes.
* Runtime -> RuntimeInfo
* qed in debug assert.
* PolkaErr -> Fault.
* Factor out runtime module into utils.
* Add maybe_authority information to `PeerConnected` event.
We already gather this information in authority discovery, so we might
as well share it with others.
This opens up an easy path to trigger validators differently from normal
nodes, e.g. for prioritization. This change has become more important
now, that we just connect to all validators and therefore just have a
long peer list without any information about those nodes.
* Test fix.
* guide: declare one para as a collator
* add ParaId to Declare messages and clean up
* fix build
* fix the testerinos
* begin adding keystore to collator-protocol
* remove request_x_ctx
* add core_for_group
* add bump_rotation
* add some more helpers to subsystem-util
* change signing_key API to take ref
* determine current and next para assignments
* disconnect collators who are not on current or next para
* add collator peer count metric
* notes for later
* some fixes
* add data & keystore to test state
* add a test utility for answering runtime API requests
* fix existing collator tests
* add new tests
* remove sc_keystore
* update cargo lock
Co-authored-by: Andronik Ordian <write@reusable.software>
* gossip: do not issue a connection request if we are not a validator
* guide updates
* use all relevant authorities when issuing a request
* use AuthorityDiscoveryApi instead
* update comments to the status quo
* overseer: pass messages directly between subsystems
* test that message is held on to
* Update node/overseer/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* give every subsystem an unbounded sender too
* remove metered_channel::name
1. we don't provide good names
2. these names are never used anywhere
* unused mut
* remove unnecessary &mut
* subsystem unbounded_send
* remove unused MaybeTimer
We have channel size metrics that serve the same purpose better now and the implementation of message timing was pretty ugly.
* remove comment
* split up senders and receivers
* update metrics
* fix tests
* fix test subsystem context
* use SubsystemSender in jobs system now
* refactor of awful jobs code
* expose public `run` on JobSubsystem
* update candidate backing to new jobs & use unbounded
* bitfield signing
* candidate-selection
* provisioner
* approval voting: send unbounded for assignment/approvals
* async not needed
* begin bridge split
* split up network tasks into background worker
* port over network bridge
* Update node/network/bridge/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* rename ValidationWorkerNotifications
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
* add number to `ActivatedLeavesUpdate`
* update subsystem util and overseer
* use new ActivatedLeaf everywhere
* sort view
* sorted and limited view in network bridge
* use live block hash only if it's newer
* grumples
* WIP
* availability distribution, still very wip.
Work on the requesting side of things.
* Some docs on what I intend to do.
* Checkpoint of session cache implementation
as I will likely replace it with something smarter.
* More work, mostly on cache
and getting things to type check.
* Only derive MallocSizeOf and Debug for std.
* availability-distribution: Cache feature complete.
* Sketch out logic in `FetchTask` for actual fetching.
- Compile fixes.
- Cleanup.
* Format cleanup.
* More format fixes.
* Almost feature complete `fetch_task`.
Missing:
- Check for cancel
- Actual querying of peer ids.
* Finish FetchTask so far.
* Directly use AuthorityDiscoveryId in protocol and cache.
* Resolve `AuthorityDiscoveryId` on sending requests.
* Rework fetch_task
- also make it impossible to check the wrong chunk index.
- Export needed function in validator_discovery.
* From<u32> implementation for `ValidatorIndex`.
* Fixes and more integration work.
* Make session cache proper lru cache.
* Use proper lru cache.
* Requester finished.
* ProtocolState -> Requester
Also make sure to not fetch our own chunk.
* Cleanup + fixes.
* Remove unused functions
- FetchTask::is_finished
- SessionCache::fetch_session_info
* availability-distribution responding side.
* Cleanup + Fixes.
* More fixes.
* More fixes.
adder-collator is running!
* Some docs.
* Docs.
* Fix reporting of bad guys.
* Fix tests
* Make all tests compile.
* Fix test.
* Cleanup + get rid of some warnings.
* state -> requester
* Mostly doc fixes.
* Fix test suite.
* Get rid of now redundant message types.
* WIP
* Rob's review remarks.
* Fix test suite.
* core.relay_parent -> leaf for session request.
* Style fix.
* Decrease request timeout.
* Cleanup obsolete errors.
* Metrics + don't fail on non fatal errors.
* requester.rs -> requester/mod.rs
* Panic on invalid BadValidator report.
* Fix indentation.
* Use typed default timeout constant.
* Make channel size 0, as each sender gets one slot anyways.
* Fix incorrect metrics initialization.
* Fix build after merge.
* More fixes.
* Hopefully valid metrics names.
* Better metrics names.
* Some tests that already work.
* Slightly better docs.
* Some more tests.
* Fix network bridge test.
* validator_discovery: pass PeerSet to the request
* validator_discovery: track PeerSet of connected peers
* validator_discovery: fix tests
* validator_discovery: fix long line
* some fixes
* some validator_discovery logs
* log validator discovery request
* Also connect to validators on `DistributePoV`.
* validator_discovery: store the whole state per peer_set
* bump spec versions in kusama, polkadot and westend
* Correcting doc.
* validator_discovery: bump channel capacity
* pov-distribution: some cleanup
* this should fix the test, but it does not
* I just got some brain damage while fixing this
Why are you even reading this???
* wrap long line
* address some review nits
Co-authored-by: Robert Klotzner <robert.klotzner@gmx.at>
* collation-generation: use persisted validation data
* node: remote FullValidationData API
* runtime: remove FullValidationData API
* backing tests: use persisted validation data
* FullCandidateReceipt: use persisted validation data
This is not a big change since this type is not used anywhere
* Remove ValidationData and TransientValidationData
Also update the guide
* Add one Jaeger span per relay parent
This adds one Jaeger span per relay parent, instead of always creating
new spans per relay parent. This should improve the UI view, because
subsystems are now grouped below one common span.
* Fix doc tests
* Replace `PerLeaveSpan` to `PerLeafSpan`
* More renaming
* Moare
* Update node/subsystem/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Skip the spans
* Increase `spec_version`
Co-authored-by: Andronik Ordian <write@reusable.software>
* refactor View to include finalized_number
* guide: update the NetworkBridge on BlockFinalized
* av-store: fix the tests
* actually fix tests
* grumbles
* ignore macro doctest
* use Hash::repeat_bytes more consistently
* broadcast empty leaves updates as well
* fix issuing view updates on empty leaves updates
* Do not spam when we can not send a message to a job
There are legal reasons why a job ended. If a job failed, the error is
logged. So, we don't need to log an error when we can not send a message
to a job.
* Review feedback
* Rework `ConnectionsRequests`
Instead of implementing the `Stream` trait, this struct now provides a
function `next()`. This enables us to encode into the type system that
it will always return a value or block indefinitely.
* Review feedback
* guide: non-semantic changes
* guide: update per the issue description
* GetBackedCandidates operates on multiple hashes now
* GetBackedCandidates still needs a relay parent
* implement changes specified in guide
* distinguish between various occasions for canceled oneshots
* add tracing info to getbackedcandidates
* REVERT ME: add tracing messages for GetBackedCandidates
Note that these messages are only sometimes actually passed on to the
candidate backing subsystem, with the consequence that it is
unexpectedly frequent that the provisioner fails to create its
provisionable data.
* REVERT ME: more tracing logging
* REVERT ME: log when CandidateBackingJob receives any message at all
* REVERT ME: log when send_msg sends a message to a job
* fix candidate-backing tests
* streamline GetBackedCandidates
This uses table.attested_candidate instead of table.get_candidate, because
it's not obvious how to get a BackedCandidate from just a
CommittedCandidateReceipt.
* REVERT ME: more logging tracing job lifespans
* promote warning about job premature demise
* don't terminate CandiateBackingJob::run_loop in event of failure to process message
* Revert "REVERT ME: more logging tracing job lifespans"
This reverts commit 7365f2fb3dec988d95cfcd317eba75587fe7fd16.
* Revert "REVERT ME: log when send_msg sends a message to a job"
This reverts commit 58e46aad038e6517d6d56390c8be65b046a21884.
* Revert "REVERT ME: log when CandidateBackingJob receives any message at all"
This reverts commit 0d6f38413c7c66b5e9e81dabc587906fa9f82656.
* Revert "REVERT ME: more tracing logging"
This reverts commit 675fd2628e84d1596965280e7314155ef21b28e6.
* Revert "REVERT ME: add tracing messages for GetBackedCandidates"
This reverts commit e09e156493430b33b6c8ab4b5cedb3f2f91afd51.
* formatting
* add logging message to CandidateBackingJob::run_loop start
* REVERT ME: add tracing to candidate-backing job creation
* run candidatebacking loop even if no assignment
* use unique error variants for each canceled oneshot
* Revert "REVERT ME: add tracing to candidate-backing job creation"
This reverts commit 8ce5f4f0bd7186dade134b118751480f72ea1fd6.
* try_runtime_api more to reduce silent exits
* add sanity check that returned backed candidates preserve ordering
* remove redundant err attribute
* Simplify subsystem jobs
This pr simplifies the subsystem jobs interface. Instead of requiring an
extra message that is used to signal that a job should be ended, a job
now ends when the receiver returns `None`. Besides that it changes the
interface to enforce that messages to a job provide a relay parent.
* Drop ToJobTrait
* Remove FromJob
We always convert this message to FromJobCommand anyway.
* guide: fix formatting for SessionInfo module
* primitives: SessionInfo type
* punt on approval keys
* ah, revert the type alias
* session info runtime module skeleton
* update the guide
* runtime/configuration: sync with the guide
* runtime/configuration: setters for newly added fields
* runtime/configuration: set codec indexes
* runtime/configuration: update test
* primitives: fix SessionInfo definition
* runtime/session_info: initial impl
* runtime/session_info: use initializer for session handling (wip)
* runtime/session_info: mock authority discovery trait
* guide: update the initializer's order
* runtime/session_info: tests skeleton
* runtime/session_info: store n_delay_tranches in Configuration
* runtime/session_info: punt on approval keys
* runtime/session_info: add some basic tests
* Update primitives/src/v1.rs
* small fixes
* remove codec index annotation on structs
* fix off-by-one error
* validator_discovery: accept a session index
* runtime: replace validator_discovery api with session_info
* Update runtime/parachains/src/session_info.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* runtime/session_info: add a comment about missing entries
* runtime/session_info: define the keys
* util: expose connect_to_past_session_validators
* util: allow session_info requests for jobs
* runtime-api: add mock test for session_info
* collator-protocol: add session_index to test state
* util: fix error message for runtime error
* fix compilation
* fix tests after merge with master
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Initial commit
* Remove unnecessary struct
* Some review nits
* Update node/network/pov-distribution/src/lib.rs
* Update parachain/test-parachains/adder/collator/tests/integration.rs
* Review nits
* notify_all_we_are_awaiting
* Both ways of peers connections should work the same
* Add mod-level docs to error.rs
* Avoid multiple connection requests at same parent
* Dont bail on errors
* FusedStream for ConnectionRequests
* Fix build after merge
* Improve error handling
* Remove whitespace formatting
* use snake_case for log targets
* remove unused continue
* validator_discovery: when disconnecting, use all addresses
* validator_discovery: simplify request revokation
* fix a typo
* reexport prometheus-super for ease of use of other subsystems
* add some prometheus timers for collation generation subsystem
* add timing metrics to av-store
* add metrics to candidate backing
* add timing metric to bitfield signing
* add timing metrics to candidate selection
* add timing metrics to candidate-validation
* add timing metrics to chain-api
* add timing metrics to provisioner
* add timing metrics to runtime-api
* add timing metrics to availability-distribution
* add timing metrics to bitfield-distribution
* add timing metrics to collator protocol: collator side
* add timing metrics to collator protocol: validator side
* fix candidate validation test failures
* add timing metrics to pov distribution
* add timing metrics to statement-distribution
* use substrate_prometheus_endpoint prometheus reexport instead of prometheus_super
* don't include JOB_DELAY in bitfield-signing metrics
* give adder-collator ability to easily export its genesis-state and validation code
* wip: adder-collator pushbutton script
* don't attempt to register the adder-collator automatically
Instead, get these values with
```sh
target/release/adder-collator export-genesis-state
target/release/adder-collator export-genesis-wasm
```
And then register the parachain on https://polkadot.js.org/apps/?rpc=ws%3A%2F%2F127.0.0.1%3A9944#/explorer
To collect prometheus data, after running the script, create `prometheus.yml` per the instructions
at https://www.notion.so/paritytechnologies/Setting-up-Prometheus-locally-835cb3a9df7541a781c381006252b5ff
and then run:
```sh
docker run -v `pwd`/prometheus.yml:/etc/prometheus/prometheus.yml:z --network host prom/prometheus
```
Demonstrates that data makes it across to prometheus, though it is likely to be useful in the future
to tweak the buckets.
* Update parachain/test-parachains/adder/collator/src/cli.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* use the grandpa-pause parameter
* skip metrics in tracing instrumentation
* remove unnecessary grandpa_pause cli param
Co-authored-by: Andronik Ordian <write@reusable.software>
* drop in tracing to replace log
* add structured logging to trace messages
* add structured logging to debug messages
* add structured logging to info messages
* add structured logging to warn messages
* add structured logging to error messages
* normalize spacing and Display vs Debug
* add instrumentation to the various 'fn run'
* use explicit tracing module throughout
* fix availability distribution test
* don't double-print errors
* remove further redundancy from logs
* fix test errors
* fix more test errors
* remove unused kv_log_macro
* fix unused variable
* add tracing spans to collation generation
* add tracing spans to av-store
* add tracing spans to backing
* add tracing spans to bitfield-signing
* add tracing spans to candidate-selection
* add tracing spans to candidate-validation
* add tracing spans to chain-api
* add tracing spans to provisioner
* add tracing spans to runtime-api
* add tracing spans to availability-distribution
* add tracing spans to bitfield-distribution
* add tracing spans to network-bridge
* add tracing spans to collator-protocol
* add tracing spans to pov-distribution
* add tracing spans to statement-distribution
* add tracing spans to overseer
* cleanup
* Connect to different validators on different leaves
* Implement ConnectionRequests
* Replace existing connection request
* Do not terminate if there are no ongoing requests
* Adds tests
* Remove the loop
* Add replacement test
* Use find
* Update node/subsystem-util/src/validator_discovery.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Add requests revocation to cleanup
* Revert "Add requests revocation to cleanup"
This reverts commit d0ac1d7a0672f0ba803c923a32ca6ca84538f549.
Co-authored-by: Andronik Ordian <write@reusable.software>
* Adds integration test based on adder collator
This adds an integration test for parachains that uses the adder
collator. The test will start two relay chain nodes and one collator and
waits until 4 blocks are build and enacted by the parachain.
* Make sure the integration test is run in CI
* Fix wasm compilation
* Update parachain/test-parachains/adder/collator/src/lib.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Update cli/src/command.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* backing: extract log target
* bitfield-signing: extract log target
* utils: fix a typo
* provisioner: extract log target
* candidate selection: remove unused error variant
* bitfield-distribution: change the return type of run
* pov-distribution: extract log target
* collator-protocol: simplify runtime request
* collation-generation: do not exit early on error
* collation-generation: do not exit on double init
* collator-protocol: do not exit on errors and rename LOG_TARGET
* collator-protocol: a workaround for ununused imports warning
* Update node/network/bitfield-distribution/src/lib.rs
* collation-generation: elevate warn! to error!
* collator-protocol: fix imports
* post merge fix
* fix compilation