* Don't send `ActiveLeaves` from leaves in db on startup in Overseer. Wait for fresh leaves instead.
* Don't pass initial set of leaves to Overseer
* Fix compilation error in subsystem-test-helpers
* Introduce jemalloc-stats feature flag
* remove unneeded space
* Update node/overseer/src/lib.rs
Co-authored-by: Marcin S. <marcin@bytedude.com>
* Update Cargo.toml
Co-authored-by: Marcin S. <marcin@bytedude.com>
* revert making tikv-jemallocator depend on jemalloc-stats
* conditionally import memory_stats instead of using dead_code
* fix test via expllicit import
* Add jemalloc-stats feature to crates, propagate it from root
* Apply `jemalloc-stats` feature to prepare mem stats; small refactor
* effect changes recommended on PR
* Update node/overseer/src/metrics.rs
Co-authored-by: Marcin S. <marcin@bytedude.com>
* fix compile error on in pipeline for linux. missing import
* Update node/overseer/src/lib.rs
Co-authored-by: Bastian Köcher <git@kchr.de>
* revert to defining collect_memory_stats inline
---------
Co-authored-by: Marcin S. <marcin@bytedude.com>
Co-authored-by: Marcin S <marcin@realemail.net>
Co-authored-by: Bastian Köcher <git@kchr.de>
* Add getrusage and memory tracker for precheck preparation
* Log memory stats metrics after prechecking
* Fix tests
* Try to fix errors (linux-only so I'm relying on CI here)
* Try to fix CI
* Add module docs for `prepare/memory_stats.rs`; fix CI error
* Report memory stats for all preparation jobs
* Use `RUSAGE_SELF` instead of `RUSAGE_THREAD`
Not sure why I did that -- was a brainfart on my end.
* Revert last commit (RUSAGE_THREAD is correct)
* Use exponential buckets
* Use `RUSAGE_SELF` for `getrusage`; enable `max_rss` metric for MacOS
* Increase poll interval
* Revert "Use `RUSAGE_SELF` for `getrusage`; enable `max_rss` metric for MacOS"
This reverts commit becf7a815409ab530fc61370abffcd1b97b9a777.
* Setting up new ChainSelectionMessage
* Partial first pass
* Got dispute conclusion data to provisioner
* Finished first draft for 4804 code
* A bit of polish and code comments
* cargo fmt
* Implementers guide and code comments
* More formatting, and naming issues
* Wrote test for ChainSelection side of change
* Added dispute coordinator side test
* FMT
* Addressing Marcin's comments
* fmt
* Addressing further Marcin comment
* Removing unnecessary test line
* Rough draft addressing Robert changes
* Clean up and test modification
* Majorly refactored scraper change
* Minor fixes for ChainSelection
* Polish and fmt
* Condensing inclusions per candidate logic
* Addressing Tsveto's comments
* Addressing Robert's Comments
* Altered inclusions struct to use nested BTreeMaps
* Naming fix
* Fixing inclusions struct comments
* Update node/core/dispute-coordinator/src/scraping/mod.rs
Add comment to split_off() use
Co-authored-by: Marcin S. <marcin@bytedude.com>
* Optimizing removal at block height for inclusions
* fmt
* Using copy trait
Co-authored-by: Marcin S. <marcin@bytedude.com>
* Add clippy config and remove .cargo from gitignore
* first fixes
* Clippyfied
* Add clippy CI job
* comment out rusty-cachier
* minor
* fix ci
* remove DAG from check-dependent-project
* add DAG to clippy
Co-authored-by: alvicsam <alvicsam@gmail.com>
* Don't import backing statements directly
into the dispute coordinator. This also gets rid of a redundant
signature check. Both should have some impact on backing performance.
In general this PR should make us scale better in the number of parachains.
Reasoning (aka why this is fine):
For the signature check: As mentioned, it is a redundant check. The
signature has already been checked at this point. This is even made
obvious by the used types. The smart constructor is not perfect as
discussed [here](https://github.com/paritytech/polkadot/issues/3455),
but is still a reasonable security.
For not importing to the dispute-coordinator: This should be good as the
dispute coordinator does scrape backing votes from chain. This suffices
in practice as a super majority of validators must have seen a backing
fork in order for a candidate to get included and only included
candidates pose a threat to our system. The import from chain is
preferable over direct import of backing votes for two reasons:
1. The import is batched, greatly improving import performance. All
backing votes for a candidate are imported with a single import.
And indeed we were able to see in metrics that importing votes
from chain is fast.
2. We do less work in general as not every candidate for which
statements are gossiped might actually make it on a chain. The
dispute coordinator as with the current implementation would still
import and keep those votes around for six sessions.
While redundancy is good for reliability in the event of bugs, this also
comes at a non negligible cost. The dispute-coordinator right now is the
subsystem with the highest load, despite the fact that it should not be
doing much during mormal operation and it is only getting worse
with more parachains as the load is a direct function of the number of statements.
We'll see on Versi how much of a performance improvement this PR
* Get rid of dead code.
* Dont send approval vote
* Make it pass CI
* Bring back tests for fixing them later.
* Explicit signature check.
* Resurrect approval-voting tests (not fixed yet)
* Send out approval votes in dispute-distribution.
Use BTreeMap for ordered dispute votes.
* Bring back an important warning.
* Fix approval voting tests.
* Don't send out dispute message on import + test
+ Some cleanup.
* Guide changes.
Note that the introduced complexity is actually redundant.
* WIP: guide changes.
* Finish guide changes about dispute-coordinator
conceputally. Requires more proof read still.
Also removed obsolete implementation details, where the code is better
suited as the source of truth.
* Finish guide changes for now.
* Remove own approval vote import logic.
* Implement logic for retrieving approval-votes
into approval-voting and approval-distribution subsystems.
* Update roadmap/implementers-guide/src/node/disputes/dispute-coordinator.md
Co-authored-by: asynchronous rob <rphmeier@gmail.com>
* Review feedback.
In particular: Add note about disputes of non included candidates.
* Incorporate Review Remarks
* Get rid of superfluous space.
* Tidy up import logic a bit.
Logical vote import is now separated, making the code more readable and
maintainable.
Also: Accept import if there is at least one invalid signer that has not
exceeded its spam slots, instead of requiring all of them to not exceed
their limits. This is more correct and a preparation for vote batching.
* We don't need/have empty imports.
* Fix tests and bugs.
* Remove error prone redundancy.
* Import approval votes on dispute initiated/concluded.
* Add test for approval vote import.
* Make guide checker happy (hopefully)
* Another sanity check + better logs.
* Reasoning about boundedness.
* Use `CandidateIndex` as opposed to `CoreIndex`.
* Remove redundant import.
* Review remarks.
* Add metric for calls to request signatures
* More review remarks.
* Add metric on imported approval votes.
* Include candidate hash in logs.
* More trace log
* Break cycle.
* Add some tracing.
* Cleanup allowed messages.
* fmt
* Tracing + timeout for get inherent data.
* Better error.
* Break cycle in all places.
* Clarified comment some more.
* Typo.
* Break cycle approval-distribution - approval-voting.
Co-authored-by: asynchronous rob <rphmeier@gmail.com>
* foo
* rolling session window
* fixup
* remove use statemetn
* fmt
* split NetworkBridge into two subsystems
Pending cleanup
* split
* chore: reexport OrchestraError as OverseerError
* chore: silence warnings
* fixup tests
* chore: add default timenout of 30s to subsystem test helper ctx handle
* single item channel
* fixins
* fmt
* cleanup
* remove dead code
* remove sync bounds again
* wire up shared state
* deal with some FIXMEs
* use distinct tags
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* use tag
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* address naming
tx and rx are common in networking and also have an implicit meaning regarding networking
compared to incoming and outgoing which are already used with subsystems themselvesq
* remove unused sync oracle
* remove unneeded state
* fix tests
* chore: fmt
* do not try to register twice
* leak Metrics type
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
Co-authored-by: Andronik <write@reusable.software>
* Increase message channel size to 2048
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
* Use unbounded channel for reading data
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
* explicitly tag network requests with version
* fmt
* make PeerSet more aware of versioning
* some generalization of the network bridge to support upgrades
* walk back some renaming
* walk back some version stuff
* extract version from fallback
* remove V1 from NetworkBridgeUpdate
* add accidentally-removed timer
* implement focusing for versioned messages
* fmt
* fix up network bridge & tests
* remove inaccurate version check in bridge
* remove some TODO [now]s
* fix fallout in statement distribution
* fmt
* fallout in gossip-support
* fix fallout in collator-protocol
* fix fallout in bitfield-distribution
* fix fallout in approval-distribution
* fmt
* use never!
* fmt
* Move `trait ParachainHost` to a separate version independent module
`trait ParachainHost` is no longer part of a specific primitives
version. Instead there is a single trait for stable and staging api
versions. The trait contains stable AND staging methods. The latter are
explicitly marked as unstable.
* Fix `use` primitives
`polkadot_primitives::v2` becomes `polkadot_primitives::runtime_api`
* Staging API declaration and stubs
Introduces the concept for 'staging functions' in runtime API. These
functions are still in testing and they are meant to be used only
within test networks (Westend).
They coexist with the stable calls for technical reasons - maintaining
different runtime APIs for different networks is hard to implement.
Check the doc comments in source files for more details how the staging
API should be used.
* Add new staging method - get_session_disputes()
Add `staging_get_session_disputes` to `ParachainHost` as the first
method of the staging API.
* Hide vstaging runtime api implementations behind feature flag
* Fix test runtime
* fn staging_get_session_disputes() is renamed to fn staging_get_disputes()
* split metrics from bitfield signing
* cleanup all logging
* add a unit test for subset generation
* chore: add one more test to assert need is properly represented
* u8 as usize
* chore: overseer fixin
* fix test
* Update node/network/bitfield-distribution/src/metrics.rs
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* Update node/network/bitfield-distribution/src/metrics.rs
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* fallout from suggested rename
* consistency
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* remove v0 primitives from polkadot-primitives
* first pass: remove v0
* fix fallout in erasure-coding
* remove v1 primitives, consolidate to v2
* the great import update
* update runtime_api_impl_v1 to v2 as well
* guide: add `Version` request for runtime API
* add version query to runtime API
* reintroduce OldV1SessionInfo in a limited way
This commit implements the last major piece of #3211: the subsystem that
tracks PVFs that require voting, issues pre-check requests to
candidate-validation and makes sure that the votes are submitted to the
chain.
The function `has_api` checks that the api + version matches, which
isn't true anymore after bumping the version. The fix is to just compare
the runtime api version being at least `1`.
* Companion PR for removing Prometheus metrics prefix
* Was missing some metrics
* Fix missing renames
* Fix test
* Fixes
* Update test
* Update Substrate
* Second time
* remove prefix from intergration test for zombienet
* update zombienet image
* Update Substrate
Co-authored-by: Bastian Köcher <info@kchr.de>
Co-authored-by: Javier Viola <pepoviola@gmail.com>
* remove Default from CandidateHash
* Apply suggestions from code review
Co-authored-by: Andronik Ordian <write@reusable.software>
* chore: fmt
* remove backed candidate default
* Partial migration away from CandidateReceipt::default
* Remove more CandidateReceipt defaults
* fmt
* Mostly remove CommittedCandidateReceipt default usage
* Remove CommittedCandidateReceipt
* Remove more Defaults from polakdot primitives v1 + fmt
* Remove more Default from polkadot primites v1
* WIP trying to get overseer example + tests to compile
* feat: add primitives test helpers
* reduce deps of helper
* update primitive helpers
* make candidate validation compile
* fixup cargo lock
* make av-store compile
* fixup disputes coordinator tests
* test: fixup backing
* test: fixup approval voting
* fixup bitfield signing
* test: fixup runtime-api
* test: fixup availability dist
* foxi[ pverseer test]
* remove some Defaults, remove bounds from `dummy`
All `fn dummy` in primitives need to be removed anyways.
This aids in the transition.
* it's a test helper, so always use std
* test: fixup parachains runtime tests
Excluding benches.
* fix keyring
* fix paras runtime properly, no more default
* Remove fn dummy() usage from approval voting
* Move TestCandidateBuilder out of av store to test helpers
* Make candidate validation tests pass
* Make most dispute coirdinator tests pass
* Make provisioner tests work
* Make availability recovery tests work with test helpers
* Update polkadot-collator-protocol tests
* Update statement distribution tests
* Update polkadot overseer examples and tests
* Derive default for validation code so we don't break unrelated things
* Make para runtime test pass (no bench)
* Some more work
* chore: cargo fmt
* cargo fix
* avoid some Default::default
* fixup dispute coordinator test
* remove unused crate deps
* remove Default::default wherever possible, replace by dummy_* for the most part
* chore: cargo fmt
* Remove some warnings
* Remove CommittedCandidateReceipt dummy
* Remove CandidateReceipt dummy
* Remove CandidateDescriptor dummy
* Remove commented out code
* Fix para runtime tests
* chore: nightly
* Some updates to the builder
* Dynamically adjust mock head data size
* Make dispute cooridinator tests work
* Fix test candidate_backing_reorders_votes work
* +nightly-2021-10-29 fmt
* Spelling and remove a default use in builder
* Various clean up
* More small updates
* fmt
* More small updates
* Doc comments for test helpers
* cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=kusama-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras_inherent --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/kusama/src/weights/runtime_parachains_paras_inherent.rs
* cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=polkadot-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras_inherent --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/polkadot/src/weights/runtime_parachains_paras_inherent.rs
* Update lib.rs
* review comments
* fix warnings
* fix test by using correct candidate receipt relay parent
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: emostov <32168567+emostov@users.noreply.github.com>
Co-authored-by: Parity Bot <admin@parity.io>
Co-authored-by: Gavin Wood <gavin@parity.io>
* Mostly notes.
* Better error messages.
* Introduce Fatal/NonFatal + drop back channel participation
- Fatal/NonFatal - in order to make it easier to use utility functions.
- We drop the back channel in dispute participation as it won't be
needed any more.
* Better error messages.
* Utility function for receiving `CandidateEvent`s.
* Ordering module typechecks.
* cargo fmt
* Prepare spam slots module.
* Implement SpamSlots mechanism.
* Implement queues.
* cargo fmt
* Participation.
* Participation taking shape.
* Finish participation.
* cargo fmt
* Cleanup.
* WIP: Cleanup + Integration.
* Make `RollingSessionWindow` initialized by default.
* Make approval voting typecheck.
* Get rid of lazy_static & fix approval voting tests
* Move `SessionWindowSize` to node primitives.
* Implement dispute coordinator initialization.
* cargo fmt
* Make queues return error instead of boolean.
* Initialized: WIP
* Introduce chain api for getting finalized block.
* Fix ordering to only prune candidates on finalized events.
* Pruning of old sessions in spam slots.
* New import logic.
* Make everything typecheck.
* Fix warnings.
* Get rid of obsolete dispute-participation.
* Fixes.
* Add back accidentelly deleted Cargo.lock
* Deliver disputes in an ordered fashion.
* Add module docs for errors
* Use type synonym.
* hidden docs.
* Fix overseer tests.
* Ordering provider taking `CandidateReceipt`.
... To be kicked on one next commit.
* Fix ordering to use relay_parent
as included block is not unique per candidate.
* Add comment in ordering.rs.
* Take care of duplicate entries in queues.
* Better spam slots.
* Review remarks + docs.
* Fix db tests.
* Participation tests.
* Also scrape votes on first leaf for good measure.
* Make tests typecheck.
* Spelling.
* Only participate in actual disputes, not on every import.
* Don't account backing votes to spam slots.
* Fix more tests.
* Don't participate if we don't have keys.
* Fix tests, typos and warnings.
* Fix merge error.
* Spelling fixes.
* Add missing docs.
* Queue tests.
* More tests.
* Add metrics + don't short circuit import.
* Basic test for ordering provider.
* Import fix.
* Remove dead link.
* One more dead link.
Co-authored-by: Lldenaurois <Ljdenaurois@gmail.com>
* pvf: make execution timeout configurable
* guide: add timeouts to candidate validation params
* add timeouts to candidate validation messages
* fmt
* port backing to use the backing pvf timeout
* port approval-voting to use the execution timeout
* port dispute participation to use the correct timeout
* fmt
* address grumbles & test failure
* Remove incorrect proof about Jemalloc
The truth is that Jemalloc is not always the default allocator. This is
only true for the polkadot binary.
* Fmt
* Rephrase
* overseer: remove mut in connector
* rename SelectRelayChainWFallback -> SelectRelayChain
* split Basics
* introduce the OverseerConnector, use it
* introduce is_relay_chain to RelayChainSelection
* chore: rename var
* avoid dummy import in subsystem
* actually remove Disconnecte/Connected enum
* extract DummySubsystem into mod dummy.
* Handle::Connected -> Handle::new
* chore: fmt
* fix test
* select relay chain takes no arg, simplification
* fmt
* Update node/service/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* chore: improve malus tests
* avoid the deferred setting of `is_relay_chain` in `RelayChainSelection`
* positive assertion is not mandated, only the negative one, to avoid a stall
* chore: fmt
* assure the `RelayChainSelection` is not used before the overseer is up and running
Co-authored-by: Andronik Ordian <write@reusable.software>
* Attempt to add log stats to gossip-support.
* WIP: Keep track of connected validators.
* Clarify metric.
* WIP: Make gossip support report connectivity.
* WIP: Fixing tests.
* Fix network bridge + integrate in overseer.
* Consistent naming.
* Fix logic error
* cargo fmt
* Pretty logs.
* cargo fmt
* Use `Delay` to trigger periodic checks.
* fmt
* Fix warning for authority set size of 1.
* More correct ratio report if there are no resolved validators.
* Prettier rendering of empty set.
* Fix typo.
* Another typo.
* Don't check on every leaf update.
* Make compatible with older rustc.
* Fix tests.
* Demote warning.
* remove connected disconnected state from overseer
* foo
* split new partial
* fix
* refactor init code to not require a `OverseerHandle` when we don't have an overseer
* intermediate
* fixins
* X
* fixup
* foo
* fixup
* docs
* conditional
* Update node/service/src/lib.rs
* review by ladi