* Fix a couple of typos
* Retry failed PVF execution
PVF execution that fails due to AmbiguousWorkerDeath should be retried once.
This should reduce the occurrence of failures due to transient conditions.
Closes #6195
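The retry rule described above can be sketched roughly as follows. This is a minimal, synchronous illustration only: `ExecError` and `validate_with_retry` are hypothetical names made up for this sketch, not the actual types or signature of `validate_candidate_with_retry`.

```rust
/// Hypothetical, simplified error type (an assumption for this sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ExecError {
    /// The worker died and it is unclear whether the candidate or the host is at fault.
    AmbiguousWorkerDeath,
    /// The candidate itself is invalid; retrying cannot help.
    Invalid(String),
}

/// Run `execute` and retry exactly once if the first attempt fails with
/// `AmbiguousWorkerDeath`. Any other outcome is returned unchanged.
pub fn validate_with_retry<T, F>(mut execute: F) -> Result<T, ExecError>
where
    F: FnMut() -> Result<T, ExecError>,
{
    match execute() {
        // Possibly a transient condition: give it one more attempt.
        Err(ExecError::AmbiguousWorkerDeath) => execute(),
        // Success or a definite failure: do not retry.
        other => other,
    }
}
```

A caller would pass the execution step as a closure. If the second attempt also fails, its error is returned as-is, so at most two executions ever happen.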
* Address a couple of nits
* Write tests; refactor (add `validate_candidate_with_retry`)
* Update node/core/candidate-validation/src/lib.rs
Co-authored-by: Andronik <write@reusable.software>
Co-authored-by: eskimor <eskimor@users.noreply.github.com>
Co-authored-by: Andronik <write@reusable.software>
* split metrics from collation generation
* move metrics to separate file out of backing
* split bitfield signing metrics
* split candidate validation metrics
* split chain api metrics
* split metrics from runtime API
* util is not used in backed metrics mod
* fmt
* missing types
* sure
* remove v0 primitives from polkadot-primitives
* first pass: remove v0
* fix fallout in erasure-coding
* remove v1 primitives, consolidate to v2
* the great import update
* update runtime_api_impl_v1 to v2 as well
* guide: add `Version` request for runtime API
* add version query to runtime API
* reintroduce OldV1SessionInfo in a limited way
* Companion PR for removing Prometheus metrics prefix
* Was missing some metrics
* Fix missing renames
* Fix test
* Fixes
* Update test
* Update Substrate
* Second time
* remove prefix from integration test for zombienet
* update zombienet image
* Update Substrate
Co-authored-by: Bastian Köcher <info@kchr.de>
Co-authored-by: Javier Viola <pepoviola@gmail.com>
* remove Default from CandidateHash
* Apply suggestions from code review
Co-authored-by: Andronik Ordian <write@reusable.software>
* chore: fmt
* remove backed candidate default
* Partial migration away from CandidateReceipt::default
* Remove more CandidateReceipt defaults
* fmt
* Mostly remove CommittedCandidateReceipt default usage
* Remove CommittedCandidateReceipt
* Remove more Defaults from polkadot primitives v1 + fmt
* Remove more Default from polkadot primitives v1
* WIP trying to get overseer example + tests to compile
* feat: add primitives test helpers
* reduce deps of helper
* update primitive helpers
* make candidate validation compile
* fixup cargo lock
* make av-store compile
* fixup disputes coordinator tests
* test: fixup backing
* test: fixup approval voting
* fixup bitfield signing
* test: fixup runtime-api
* test: fixup availability dist
* fixup overseer test
* remove some Defaults, remove bounds from `dummy`
All `fn dummy` in primitives need to be removed anyways.
This aids in the transition.
* it's a test helper, so always use std
* test: fixup parachains runtime tests
Excluding benches.
* fix keyring
* fix paras runtime properly, no more default
* Remove fn dummy() usage from approval voting
* Move TestCandidateBuilder out of av store to test helpers
* Make candidate validation tests pass
* Make most dispute coordinator tests pass
* Make provisioner tests work
* Make availability recovery tests work with test helpers
* Update polkadot-collator-protocol tests
* Update statement distribution tests
* Update polkadot overseer examples and tests
* Derive default for validation code so we don't break unrelated things
* Make para runtime test pass (no bench)
* Some more work
* chore: cargo fmt
* cargo fix
* avoid some Default::default
* fixup dispute coordinator test
* remove unused crate deps
* remove Default::default wherever possible, replace by dummy_* for the most part
* chore: cargo fmt
* Remove some warnings
* Remove CommittedCandidateReceipt dummy
* Remove CandidateReceipt dummy
* Remove CandidateDescriptor dummy
* Remove commented out code
* Fix para runtime tests
* chore: nightly
* Some updates to the builder
* Dynamically adjust mock head data size
* Make dispute coordinator tests work
* Make test candidate_backing_reorders_votes work
* +nightly-2021-10-29 fmt
* Spelling and remove a default use in builder
* Various clean up
* More small updates
* fmt
* More small updates
* Doc comments for test helpers
* cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=kusama-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras_inherent --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/kusama/src/weights/runtime_parachains_paras_inherent.rs
* cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=polkadot-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras_inherent --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/polkadot/src/weights/runtime_parachains_paras_inherent.rs
* Update lib.rs
* review comments
* fix warnings
* fix test by using correct candidate receipt relay parent
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: emostov <32168567+emostov@users.noreply.github.com>
Co-authored-by: Parity Bot <admin@parity.io>
Co-authored-by: Gavin Wood <gavin@parity.io>
Closes https://github.com/paritytech/polkadot/issues/4293
This PR changes how we treat a certain subset of PVF preparation
errors. Specifically, only the deterministic errors are now treated as
invalid candidates. That is, the errors that are easily
attributable to either the PVF contents or the wasmtime code, but
not e.g. I/O errors that could be triggered by the OS (insufficient
memory, disk failure, too much load, etc.). The latter are treated as
internal errors and thus do not trigger disputes.
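The distinction drawn above could look roughly like this. It is a sketch under assumptions: the `PrepareError` variants, the `Outcome` type and `classify` are hypothetical names chosen for illustration and do not reflect the real error enum.

```rust
/// Hypothetical, simplified preparation failures (an assumption for this sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PrepareError {
    /// The PVF blob itself is malformed (deterministic).
    InvalidPvf(String),
    /// Compiling the PVF failed (deterministic).
    CompilationFailed(String),
    /// An I/O failure on the host, e.g. a disk error (not attributable to the PVF).
    Io(String),
    /// The host ran out of memory (not attributable to the PVF).
    OutOfMemory,
}

/// What the validator concludes from a preparation failure.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    /// The candidate is treated as invalid.
    Invalid(String),
    /// Something went wrong locally; no dispute is triggered.
    InternalError(String),
}

/// Only deterministic errors count against the candidate.
pub fn classify(err: PrepareError) -> Outcome {
    match err {
        PrepareError::InvalidPvf(msg) | PrepareError::CompilationFailed(msg) => {
            Outcome::Invalid(msg)
        }
        PrepareError::Io(msg) => Outcome::InternalError(msg),
        PrepareError::OutOfMemory => Outcome::InternalError("out of memory".to_string()),
    }
}
```

Keeping the mapping in one exhaustive `match` means a newly added error variant has to be classified explicitly before the code compiles again.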
* pvf: make execution timeout configurable
* guide: add timeouts to candidate validation params
* add timeouts to candidate validation messages
* fmt
* port backing to use the backing pvf timeout
* port approval-voting to use the execution timeout
* port dispute participation to use the correct timeout
* fmt
* address grumbles & test failure
* feat/overseer: introduce closure init
Enables removal of the connected/disconnected overseer state.
* feat/overseer: allow replacement logic to access the original
Allows re-using init-once types, which would otherwise error.
* feat/overseer: introduce external connector
Preparation for removal of `AllSubsystems`
which is another prerequisite for removing
the connect/disconnect state.
* fix/test: replace needs closure
* fixup
* simplify
* mea culpa
* all-subsystems-gen test
* Factor out runtime module into utils.
* Add maybe_authority information to `PeerConnected` event.
We already gather this information in authority discovery, so we might
as well share it with others.
This opens up an easy path to trigger validators differently from normal
nodes, e.g. for prioritization. This change has become more important
now that we just connect to all validators and therefore just have a
long peer list without any information about those nodes.
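As an illustration of the prioritization idea mentioned above, here is a small sketch. `PeerId`, `AuthorityId`, the shape of `NetworkEvent` and `prioritize` are simplified, hypothetical stand-ins, not the real network bridge types.

```rust
/// Hypothetical stand-ins for the real peer and authority identifiers.
pub type PeerId = u64;
pub type AuthorityId = String;

/// Simplified event: `maybe_authority` is `Some` when authority discovery
/// already knows that the peer is a validator.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum NetworkEvent {
    PeerConnected { peer: PeerId, maybe_authority: Option<AuthorityId> },
}

/// Order peers so that known authorities come first, keeping the original
/// relative order inside each group (`sort_by_key` is a stable sort).
pub fn prioritize(events: &[NetworkEvent]) -> Vec<PeerId> {
    let mut peers: Vec<(bool, PeerId)> = events
        .iter()
        .map(|event| match event {
            NetworkEvent::PeerConnected { peer, maybe_authority } => {
                (maybe_authority.is_some(), *peer)
            }
        })
        .collect();
    // `false < true`, so sorting by "is not an authority" puts authorities first.
    peers.sort_by_key(|(is_authority, _)| !*is_authority);
    peers.into_iter().map(|(_, peer)| peer).collect()
}
```

Because the flag travels with the event, a consumer does not need a second lookup against authority discovery to tell validators apart from normal nodes.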
* Test fix.
* Implement PVF validation host
* WIP: Diener
* Increase the allotted compilation time
* Add more comments
* Minor clean up
* Apply suggestions from code review
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Fix pruning artifact removal
* Fix formatting and newlines
* Fix the thread pool
* Update node/core/pvf/src/executor_intf.rs
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Remove redundant test declaration
* Don't convert the path into an intermediate string
* Try to work around the test failure
* Use the puppet_worker trick again
* Fix a blip
* Move `ensure_wasmtime_version` under the tests mod
* Add a macro for puppet_workers
* fix build for not real-overseer
* Rename the puppet worker for adder collator
* play it safe with the name of adder puppet worker
* Typo: triggered
* Add more comments
* Do not kill exec worker on every error
* Plumb Duration for timeouts
* typo: critical
* Add proofs
* Clean unused imports
* Revert "WIP: Diener"
This reverts commit b9f54e513366c7a6dfdd117ac19fbdc46b900b4d.
* Sync version of wasmtime
* Update cargo.lock
* Update Substrate
* Merge fixes still
* Update wasmtime version in test
* bastifmt
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Squash spaces
* Trailing new line for testing.rs
* Remove controversial code
* comment about biasing
* Fix suggestion
* Add comments
* make it more clear why unwrap_err
* tmpfile retry
* proper proofs for claim_idle
* Remove mutex from ValidationHost
* Add some more logging
* Extract exec timeout into a constant
* Add some clarifying logging
* Use blake2_256
* Clean up the merge
Specifically the leftovers after removing real-overseer
* Update parachain/test-parachains/adder/collator/Cargo.toml
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
* use compressed blob in candidate-validation
* add some tests for compressed code blobs
* remove CompressedPoV and apply compression in collation-generation
* decompress BlockData before executing
* don't produce oversized collations
* add test for PoV decompression failure
* fix tests and clean up
* fix test
* address review and fix CI
* take this )
* code stored in para + modify CandidateDescriptor.
* WIP: digest + some more impl
* validation_code_hash in payload + check in inclusion
* check in client + refactor
* tests
* fix encoding indices
* remove old todos
* fix test
* fix test
* add test
* fetch validation code inside collation-generation from the relay-chain
* HashMismatch -> PoVHashMismatch + miscompilation
* refactor, store hash when needed
* storage rename: more specific but slightly too verbose
* do not hash on candidate validation, fetch hash instead
* better test
* fix test
* guide updates
* don't panic in runtime
Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
* Remove stuff out of the runtime that does not belong there.
There might be more, but it is a start.
* White space fixes.
* Fix tests.
* Leave whitespace in ui tests alone.
* Add back zstd for no reason.
* Fix browser wasm (hopefully)
* Update shared-memory to new version & refactor
These two are combined in a single commit because the new version of
shared-memory doesn't provide the used functionality anymore.
Therefore in order to update the version of this crate we implement the
functionality that we need by ourselves, providing a cleaner API along
the way.
* Significantly decrease the required memory for a workspace
For some reason it was allocating an entire GiB of memory. I suspect
this has something to do with the current memory size limit of a PVF
execution environment (the prior name suggests that). However, we don't
need anywhere near that amount of memory.
In fact, we could reduce the allocated size even more, but that may be
for next time.
* Unlink shmem just after opening
That will make sure that we don't leak the shmem accidentally.
* Do not compile workspace mod for android and wasm
* Address some review comments
* Fix the test runner
* Fix missed +1 for the attached flag
* Use .expect rather than .unwrap
* Add a rustdoc for the workspace module
* fixup! Use .expect rather than .unwrap
* Add some doc comments to pub members
* Warn on error removing shm_unlink
* Change the alignment implementation
* Fix the comment nit
* PVD: `block_number`->`relay_parent_number`
* ValidationParams: `relay_chain_height`->`relay_parent_number`
* Expose DMQ MQC hash as a well-known-key
This way the relay storage merkle proofs will be able to obtain the DMQ
MQC hash and we will be able to remove it from the
PersistedValidationData struct.
* PersistedValidationData: Remove HRMP MQC heads
* PersistedValidationData: Remove `dmq_mqc_head`
* Expose the HRMP ingress channel index as a well-known-key
This way a parachain (PVF and collator) can find all the parachains that
have an outbound channel to the given one. That allows in turn to find
all the inbound channels for the given para.
Having access to that allows the parachain to get the same information
as the hrmp_mqc_heads now provide.
* Rename `relay_storage_root` to `relay_parent_storage_root`
* refactor View to include finalized_number
* guide: update the NetworkBridge on BlockFinalized
* av-store: fix the tests
* actually fix tests
* grumbles
* ignore macro doctest
* use Hash::repeat_bytes more consistently
* broadcast empty leaves updates as well
* fix issuing view updates on empty leaves updates
* reexport prometheus-super for ease of use of other subsystems
* add some prometheus timers for collation generation subsystem
* add timing metrics to av-store
* add metrics to candidate backing
* add timing metric to bitfield signing
* add timing metrics to candidate selection
* add timing metrics to candidate-validation
* add timing metrics to chain-api
* add timing metrics to provisioner
* add timing metrics to runtime-api
* add timing metrics to availability-distribution
* add timing metrics to bitfield-distribution
* add timing metrics to collator protocol: collator side
* add timing metrics to collator protocol: validator side
* fix candidate validation test failures
* add timing metrics to pov distribution
* add timing metrics to statement-distribution
* use substrate_prometheus_endpoint prometheus reexport instead of prometheus_super
* don't include JOB_DELAY in bitfield-signing metrics
* give adder-collator ability to easily export its genesis-state and validation code
* wip: adder-collator pushbutton script
* don't attempt to register the adder-collator automatically
Instead, get these values with
```sh
target/release/adder-collator export-genesis-state
target/release/adder-collator export-genesis-wasm
```
And then register the parachain on https://polkadot.js.org/apps/?rpc=ws%3A%2F%2F127.0.0.1%3A9944#/explorer
To collect prometheus data, after running the script, create `prometheus.yml` per the instructions
at https://www.notion.so/paritytechnologies/Setting-up-Prometheus-locally-835cb3a9df7541a781c381006252b5ff
and then run:
```sh
docker run -v `pwd`/prometheus.yml:/etc/prometheus/prometheus.yml:z --network host prom/prometheus
```
Demonstrates that data makes it across to prometheus, though it is likely to be useful in the future
to tweak the buckets.
* Update parachain/test-parachains/adder/collator/src/cli.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* use the grandpa-pause parameter
* skip metrics in tracing instrumentation
* remove unnecessary grandpa_pause cli param
Co-authored-by: Andronik Ordian <write@reusable.software>
* drop in tracing to replace log
* add structured logging to trace messages
* add structured logging to debug messages
* add structured logging to info messages
* add structured logging to warn messages
* add structured logging to error messages
* normalize spacing and Display vs Debug
* add instrumentation to the various 'fn run'
* use explicit tracing module throughout
* fix availability distribution test
* don't double-print errors
* remove further redundancy from logs
* fix test errors
* fix more test errors
* remove unused kv_log_macro
* fix unused variable
* add tracing spans to collation generation
* add tracing spans to av-store
* add tracing spans to backing
* add tracing spans to bitfield-signing
* add tracing spans to candidate-selection
* add tracing spans to candidate-validation
* add tracing spans to chain-api
* add tracing spans to provisioner
* add tracing spans to runtime-api
* add tracing spans to availability-distribution
* add tracing spans to bitfield-distribution
* add tracing spans to network-bridge
* add tracing spans to collator-protocol
* add tracing spans to pov-distribution
* add tracing spans to statement-distribution
* add tracing spans to overseer
* cleanup
* Rename ExecutionMode to IsolationStrategy
Execution mode is too generic a name and can imply a lot of different
aspects of execution. The notion of isolation better describes the
meant aspect.
And while I am at it, I also renamed mode -> strategy cause it seems a
bit more appropriate, although that is way more subjective.
* Fix compilation in wasm_executor tests.
* Add a comment to IsolationStrategy
* Update comments on IsolationStrategy
* Update node/core/candidate-validation/src/lib.rs
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Accommodate the point on interruption
* Update parachain/src/wasm_executor/mod.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Naming nits
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
* Adds integration test based on adder collator
This adds an integration test for parachains that uses the adder
collator. The test will start two relay chain nodes and one collator and
wait until 4 blocks are built and enacted by the parachain.
* Make sure the integration test is run in CI
* Fix wasm compilation
* Update parachain/test-parachains/adder/collator/src/lib.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Update cli/src/command.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* HRMP: Update the impl guide
* HRMP: Incorporate the channel notifications into the guide
* HRMP: Renaming in the impl guide
* HRMP: Constrain the maximum number of HRMP messages per candidate
This commit addresses the HRMP part of https://github.com/paritytech/polkadot/issues/1869
* XCM: Introduce HRMP related message types
* HRMP: Data structures and plumbing
* HRMP: Configuration
* HRMP: Data layout
* HRMP: Acceptance & Enactment
* HRMP: Test base logic
* Update adder collator
* HRMP: Runtime API for accessing inbound messages
Also, removing some redundant fully-qualified names.
* HRMP: Add diagnostic logging in acceptance criteria
* HRMP: Additional tests
* Self-review fixes
* save test refactorings for the next time
* Missed a return statement.
* a formatting blip
* Add missing logic for appending HRMP digests
* Remove the channel contents vectors which became empty
* Tighten HRMP channel digests invariants.
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Remove a note about sorting for channel id
* Add missing rustdocs to the configuration
* Clarify and update the invariant for HrmpChannelDigests
* Make the onboarding invariant less sloppy
Namely, introduce `Paras::is_valid_para` (in fact, it already is present
in the implementation) and hook up the invariant to that.
Note that this says "within a session" because I don't want to make it
super strict on the session boundary. The logic on the session boundary
should be extremely careful.
* Make `CandidateCheckContext` use T::BlockNumber for hrmp_watermark
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Parachain improvements
- Set the parachains configuration in Rococo genesis
- Don't stop the overseer when a subsystem job is stopped
- Several small code changes
* Remove unused functionality
* Return error from the runtime instead of printing it
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update primitives/src/v1.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Fix test
* Revert "Update primitives/src/v1.rs"
This reverts commit 11fce2785acd1de481ca57815b8e18400f09fd52.
* Revert "Update primitives/src/v1.rs"
This reverts commit d6439fed4f954360c89fb1e12b73954902c76a41.
* Revert "Return error from the runtime instead of printing it"
This reverts commit cb4b5c0830ac516a6d54b2c24197e9354f2b98cb.
* Revert "Fix test"
This reverts commit 0c5fa1b5566d4cd3c55a55d485e707165ce7a59e.
* Update runtime/parachains/src/runtime_api_impl/v1.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* fix: ensure candidate validation gets code based on occupied core assumption
* guide: runtime API for historical validation code
* add historical runtime API
* integrate into runtime API subsystem
* remove blocked TODO
* fix service build: enable notifications protocol only under real overseer
* Update node/subsystem/src/messages.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* fix compilation
Co-authored-by: Robert Habermeier <robert@Roberts-MacBook-Pro.local>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* UMP: Update the impl guide
* UMP: Incorporate XCM related changes into the guide
* UMP: Data structures and configuration
* UMP: Initial plumbing
* UMP: Data layout
* UMP: Acceptance criteria & enactment
* UMP: Fix dispatcher bug and add the test for it
* UMP: Constrain the maximum size of a UMP message
This commit addresses the UMP part of https://github.com/paritytech/polkadot/issues/1869
* Fix failing test due to misconfiguration
* Make the type of RelayDispatchQueueSize be more apparent in the guide
* Revert renaming `max_upward_queue_capacity` to `max_upward_queue_count`
* convert spaces to tabs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Update runtime/parachains/src/router/ump.rs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* remove pending TODO after the DMP impl merge
* DMP: Update the impl guide
* DMP: Incorporate XCM related changes into the guide
This is the DMP related part of https://github.com/paritytech/polkadot/issues/1702