We consider BEEFY mature enough to run by default on all nodes
for test networks (Rococo/Wococo/Versi).
Right now, most nodes are not running it since it is opt-in via the
--beefy flag. Switch to an opt-out model for test networks: replace the
--beefy CLI flag with --no-beefy and have the BEEFY client start by
default on test networks.
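The opt-out model above can be sketched in a few lines of std-only Rust (the real binary uses a proper argument parser; the --no-beefy flag name is from this change, everything else here is illustrative):

```rust
use std::env;

// BEEFY runs by default; passing --no-beefy disables it (opt-out model).
fn beefy_enabled(args: &[String]) -> bool {
    !args.iter().any(|a| a == "--no-beefy")
}

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("BEEFY enabled: {}", beefy_enabled(&args));
}
```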
Signed-off-by: acatangiu <adrian@parity.io>
* PVF: Refactor workers into separate crates, remove host dependency
* Fix compile error
* Remove some leftover code
* Fix compile errors
* Update Cargo.lock
* Remove worker main.rs files
I accidentally copied these from the other PR. This PR isn't intended to
introduce standalone workers yet.
* Address review comments
* cargo fmt
* Update a couple of comments
* Update log targets
* impl guide: Update Collator Generation
* Address review comments
* Fix compile errors
I don't remember why I did this. Maybe it only made sense with the async backing
changes.
* Remove leftover glossary
* metrics: tests: Fix flaky runtime_can_publish_metrics
When a re-org happens, wait_for_blocks(2) would actually exit after the second
import of block 1, so the conditions for the metric to exist won't be met,
hence the occasional test failure.
More details in:
https://github.com/paritytech/polkadot/issues/7267
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
* metrics: tests: Clean up unneeded box pin
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
---------
Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
* Make `issue_explicit_statement_with_index` regular function
* Make `issue_backing_statement_with_index` regular function
* Issue `RevertBlocks` as soon as a dispute has `byzantine threshold + 1` invalid votes.
* Remove a comment
* Fix `has_fresh_byzantine_threshold_against()`
* Extend `informs_chain_selection_when_dispute_concluded_against` test
* PVF: Remove `rayon` and some uses of `tokio`
1. We were using `rayon` to spawn a superfluous thread to do execution, so it was removed.
2. We were using `rayon` to set a threadpool-specific thread stack size, and AFAIK we couldn't do that with `tokio` (it's possible [per-runtime](https://docs.rs/tokio/latest/tokio/runtime/struct.Builder.html#method.thread_stack_size) but not per-thread). Since we want to remove `tokio` from the workers [anyway](https://github.com/paritytech/polkadot/issues/7117), I changed it to spawn threads with the `std::thread` API instead of `tokio`.[^1]
[^1]: NOTE: This PR does not totally remove the `tokio` dependency just yet.
3. Since the `std::thread` API is not async, we could no longer `select!` on the threads as futures, so the `select!` was changed to a naive loop.
4. The order of thread selection was flipped to make (3) sound (see note in code).
I left some TODOs related to panics which I'm going to address soon as part of https://github.com/paritytech/polkadot/issues/7045.
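The per-thread stack-size point in (2) can be sketched with `std::thread::Builder`, which is exactly the capability tokio only offers per-runtime (thread name and stack size here are illustrative, not the actual worker configuration):

```rust
use std::thread;

// Spawn a job on a dedicated thread with its own stack size, set per thread
// rather than per pool/runtime.
fn run_on_dedicated_thread() -> u64 {
    let handle = thread::Builder::new()
        .name("execute-job".into())
        .stack_size(2 * 1024 * 1024) // e.g. 2 MiB for this one thread
        .spawn(|| {
            // ... the execution job runs here ...
            7 * 6
        })
        .expect("failed to spawn thread");
    handle.join().expect("thread panicked")
}

fn main() {
    println!("result = {}", run_on_dedicated_thread());
}
```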
* PVF: Vote invalid on panics in execution thread (after a retry)
Also make sure we kill the worker process on panic errors and internal errors to
potentially clear any error states independent of the candidate.
* Address a couple of TODOs
Addresses a couple of follow-up TODOs from
https://github.com/paritytech/polkadot/pull/7153.
* Add some documentation to implementer's guide
* Fix compile error
* Fix compile errors
* Fix compile error
* Update roadmap/implementers-guide/src/node/utility/candidate-validation.md
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* Address comments + couple other changes (see message)
- Measure the CPU time in the prepare thread, so the observed time is not
affected by any delays in joining on the thread.
- Measure the full CPU time in the execute thread.
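The idea of measuring inside the thread can be sketched as follows: the worker takes the measurement itself and sends it over a channel, so any delay in joining on the thread cannot inflate the observed time. `Instant` is a wall-clock stand-in here; the real code measures thread CPU time with an OS-specific clock.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

// Measure in the worker thread, not around the join, so the observed time is
// not affected by delays in joining on the thread.
fn timed_job() -> (u64, Duration) {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || {
        let start = Instant::now();
        let result: u64 = (0..1_000).sum(); // ... the actual job ...
        // Recorded before the host side gets a chance to observe any join delay.
        tx.send(start.elapsed()).expect("receiver alive");
        result
    });
    let in_thread_time = rx.recv().expect("sender alive");
    let result = worker.join().expect("worker panicked");
    (result, in_thread_time)
}

fn main() {
    let (result, elapsed) = timed_job();
    println!("result {result}, measured in-thread: {elapsed:?}");
}
```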
* Implement proper thread synchronization
Use condvars, i.e. `Arc::new((Mutex::new(true), Condvar::new()))`, as per the
std docs.
Considered also using a condvar to signal the CPU thread to end, in place of an
mpsc channel. This was not done because `Condvar::wait_timeout_while` is
documented as being imprecise, and `mpsc::Receiver::recv_timeout` is not
documented as such. Also, we would need a separate condvar, to avoid this case:
the worker thread finishes its job, notifies the condvar, the CPU thread returns
first, and we join on it and not the worker thread. So it was simpler to leave
this part as is.
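The condvar pattern described above, following the std docs, looks roughly like this (names are illustrative, not the actual polkadot-node-core-pvf API). The pair guards a "job pending" flag: the worker clears it and notifies, while the waiting side parks on `wait_while` until the flag turns false.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn run_job() -> u64 {
    let pair = Arc::new((Mutex::new(true), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    let worker = thread::spawn(move || {
        let result = 40 + 2; // ... the actual job ...
        let (lock, cvar) = &*pair2;
        *lock.lock().unwrap() = false;
        cvar.notify_one(); // in the real code this must happen even on panic
        result
    });

    let (lock, cvar) = &*pair;
    // Parks until the predicate turns false, i.e. the job is no longer pending.
    let _guard = cvar
        .wait_while(lock.lock().unwrap(), |pending| *pending)
        .unwrap();
    worker.join().expect("worker panicked")
}

fn main() {
    println!("job result: {}", run_job());
}
```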
* Catch panics in threads so we always notify condvar
* Use `WaitOutcome` enum instead of bool condition variable
* Fix retry timeouts to depend on exec timeout kind
* Address review comments
* Make the API for condvars in workers nicer
* Add a doc
* Use condvar for memory stats thread
* Small refactor
* Enumerate internal validation errors in an enum
* Fix comment
* Add a log
* Fix test
* Update variant naming
* Address a missed TODO
---------
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* Replace `RollingSessionWindow` with `RuntimeInfo` - initial commit
* Fix tests in import
* Fix the rest of the tests
* Remove dead code
* Fix todos
* Simplify session caching
* Comments for `SessionInfoProvider`
* Separate `SessionInfoProvider` from `State`
* `cache_session_info_for_head` becomes freestanding function
* Remove unneeded `mut` usage
* fn session_info -> fn get_session_info() to avoid name clashes. The function also tries to initialize `SessionInfoProvider`
* Fix SessionInfo retrieval
* Code cleanup
* Don't wrap `SessionInfoProvider` in an `Option`
* Remove `earliest_session()`
* Remove pre-caching -> wip
* Fix some tests and code cleanup
* Fix all tests
* Fixes in tests
* Fix comments, variable names and small style changes
* Fix a warning
* impl From<SessionWindowSize> for NonZeroUsize
* Fix logging for `get_session_info` - remove redundant logs and decrease log level to DEBUG
* Code review feedback
There is a `deny(clippy::dbg_macro)` in the crate root, so newer
Clippy fails here since the tests use `dbg!`.
But `dbg!` in tests is fine IMHO.
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* companion for #13384
* adjust parsing RPC address from process output
* update rpc cli
* update lockfile for {"substrate"}
* bump zombienet v1.3.48
* bump zombienet version
* allow zombienet-tests-misc-upgrade-node to fail
* add comment and issue link to allowed_failure
* grumbles: disable failed job
* disabled the correct test
---------
Co-authored-by: parity-processbot <>
Co-authored-by: Javier Viola <javier@parity.io>
* Pass `SessionInfo` directly to `CandidateEnvironment::new`; otherwise it would have to be an async function
* Replace calls to `RollingSessionWindow` with `RuntimeInfo`
Adjust `dispute-coordinator` initialization to use `RuntimeInfo`
* Modify `dispute-coordinator` initialization
* Pass `Hash` to `process_on_chain_votes` so that `RuntimeInfo` calls can be made
Remove some fixmes
* Pass `Hash` to `handle_import_statements` to perform `RuntimeInfo` calls
* remove todo comments
* Remove `error` from `Initialized`
Rework new session handling code
* Remove db code which is no longer used
* Update stale comment and remove unneeded type specification
* Cache SessionInfo on startup
* Use `DISPUTE_WINDOW` from primitives
* Fix caching in `process_active_leaves_update`
* handle_import_statements: leaf_hash -> block_hash
* Restore `ensure_available_session_info`
* Don't interrupt `process_on_chain_votes` if SessionInfo can't be fetched
* Small style improvements in logging
* process_on_chain_votes: leaf_hash -> block_hash
* Restore `note_earliest_session` - it is required to prune disputes and votes
* Cache new sessions only when there is an actual session change
* Fix tests
* `CandidateEnvironment::new` takes `session_idx` and fetches `SessionInfo` by itself, to avoid the case where the input `SessionIndex` and `SessionInfo` parameters don't match
* Fix handling of missing session info
* Move sessions caching in `handle_startup` and fix tests
* Load `relay_parent` from db in `handle_import_statements` instead of passing it as a parameter via two functions
* Don't do two db reads
* Fix the twisted logic in `handle_import_statements`
* fixup
* Small style fix
* Decrease log levels for caching errors to debug and fix a typo
* Update outdated comment
* Remove `ensure_available_session_info`
* Load relay parent from db in `process_on_chain_votes`
* Revert "Load relay parent from db in `process_on_chain_votes`"
This reverts commit 978ad4f223d517faa7a7fbad96e3f8de4fa17501.
* Keep track of highest seen session and last session cached without gaps.
* Apply suggestions from code review
Co-authored-by: ordian <write@reusable.software>
* Handle session caching failure on startup correctly
* Update node/core/dispute-coordinator/src/initialized.rs
Co-authored-by: ordian <write@reusable.software>
* Simplify session caching retries
* Update stale comment
* Fix lower bound calculation for session caching
---------
Co-authored-by: ordian <write@reusable.software>
* PVF: Don't dispute on missing artifact
A dispute should never be raised if the local cache doesn't provide a certain
artifact. You cannot dispute for this reason, as it is a local hardware issue
unrelated to the candidate being checked.
Design:
Currently we assume that if we prepared an artifact, it remains there on-disk
until we prune it, i.e. we never check again if it's still there.
We can change it so that instead of artifact-not-found triggering a dispute, we
retry once (like we do for AmbiguousWorkerDeath, except we don't dispute if it
still doesn't work). And when enqueuing an execute job, we check for the
artifact on-disk, and start preparation if not found.
Changes:
- [x] Integration test (should fail without the following changes)
- [x] Check if artifact exists when executing, prepare if not
- [x] Return an internal error when file is missing
- [x] Retry once on internal errors
- [x] Document design (update impl guide)
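The retry flow described above can be sketched as follows (the error and outcome types are illustrative, not the actual polkadot-node-core-pvf API):

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
enum Outcome {
    Valid,
    // Internal errors are local issues and are never escalated to a dispute.
    InternalError(String),
}

fn execute(artifact: &Path) -> Outcome {
    if !artifact.exists() {
        // Missing artifact on disk: report internally, don't dispute.
        return Outcome::InternalError("artifact missing on disk".into());
    }
    Outcome::Valid
}

fn validate_with_retry(artifact: &Path) -> Outcome {
    match execute(artifact) {
        // On an internal error, re-prepare the artifact and retry once.
        Outcome::InternalError(_) => {
            // ... trigger preparation of the artifact here ...
            execute(artifact)
        }
        outcome => outcome,
    }
}

fn main() {
    let outcome = validate_with_retry(Path::new("/nonexistent/artifact.bin"));
    println!("{outcome:?}");
}
```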
* Add some context to wasm error message (it is quite long)
* Fix impl guide
* Add check for missing/inaccessible file
* Add comment referencing Substrate issue
* Add test for retrying internal errors
---------
Co-authored-by: parity-processbot <>
* Switch to DNS name based bootnodes for Rococo
Signed-off-by: bakhtin <a@bakhtin.net>
* Switch Rococo bootnodes from WS to WSS
Signed-off-by: bakhtin <a@bakhtin.net>
---------
Signed-off-by: bakhtin <a@bakhtin.net>
* globally upgrade syn to 1.0.109
* globally upgrade quote to 1.0.26
* globally upgrade proc-macro2 to 1.0.56
* globally bump syn to v2.0.13
* update expander to v1.0.0
* temporary commit to prove new version of expander works
(new version hasn't been released yet so using git)
* use expander 2.0.0
* upgrade to syn 2.0.14
* update lock file
* Happy New Year!
* Remove year entirely
Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* Remove years from copyright notice in the entire repo
---------
Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* Onchain scraper in `dispute-coordinator` will scrape `SCRAPED_FINALIZED_BLOCKS_COUNT` blocks before finality
This makes it more likely that a `CandidateReceipt` is available for finalized candidates.
For details see: https://github.com/paritytech/polkadot/issues/7009
* Fix off by one error
* Replace `SCRAPED_FINALIZED_BLOCKS_COUNT` with `DISPUTE_CANDIDATE_LIFETIME_AFTER_FINALIZATION`