* start working on building the real overseer
Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.
* fill in AllSubsystems internal constructors
* replace fn make_metrics with Metrics::attempt_to_register
* update to account for #1740
* remove Metrics::register, rename Metrics::attempt_to_register
* add 'static bounds to real_overseer type params
* pass authority_discovery and network_service to real_overseer
It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.
* select a proper database configuration for the availability store db
* use subdirectory for av-store database path
* apply Basti's patch which avoids needing to parameterize everything on Block
* simplify path extraction
* get all tests to compile
* Fix Prometheus double-registry error
for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:
```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
    eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
    err
}),
```
That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!
* authorities must have authority discovery, but not necessarily overseer handlers
* fix broken SpawnedSubsystem impls
Detailed logging showed that, with the `Box::new` style of
future generation, the `self.run` method was never called,
leading to dropped receivers / closed senders for those subsystems,
which caused the overseer to shut down immediately.
This is not the final fix needed to get things working properly,
but it's a good start.
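One general way a `run` method can silently never execute is that Rust
futures do nothing until they are polled. The sketch below shows that
failure mode using only the standard library. `Subsystem`, `start_broken`,
`start_fixed`, `ran_after` and the tiny `block_on` executor are
hypothetical names for illustration, not the overseer's actual API, and
the real bug may have had a different shape.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; sufficient for futures that complete without waiting.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor (illustration only): polls in a loop until the future is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Hypothetical subsystem that records whether `run` ever executed.
struct Subsystem {
    ran: Arc<AtomicBool>,
}

impl Subsystem {
    async fn run(self) {
        self.ran.store(true, Ordering::SeqCst);
    }
}

type BoxFuture = Pin<Box<dyn Future<Output = ()> + Send>>;

/// Broken: the future returned by `run` is created and dropped without being polled.
fn start_broken(s: Subsystem) -> BoxFuture {
    Box::pin(async move {
        let _never_polled = s.run();
    })
}

/// Fixed: awaiting the future drives `run` to completion.
fn start_fixed(s: Subsystem) -> BoxFuture {
    Box::pin(async move {
        s.run().await;
    })
}

/// Drives the boxed future and reports whether `run` was executed.
fn ran_after(start: fn(Subsystem) -> BoxFuture) -> bool {
    let ran = Arc::new(AtomicBool::new(false));
    block_on(start(Subsystem { ran: ran.clone() }));
    ran.load(Ordering::SeqCst)
}
```

In this sketch `ran_after(start_broken)` is false and
`ran_after(start_fixed)` is true, even though both boxed futures are
driven to completion by the executor.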
* use prometheus properly
Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.
We were using them wrong, though. This pattern was repeated in a
variety of places in the code:
```rust
// panics with a cardinality mismatch: two label names are declared, but only one value is supplied
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc();
```
The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.
This commit fixes that.
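To make the distinction between label names and label values concrete,
here is a toy model using only the standard library. `ToyCounterVec` is
a hypothetical stand-in and not the prometheus crate's API, and the
label name `success` is an assumption chosen for illustration.

```rust
use std::collections::HashMap;

/// Toy stand-in for a labelled counter family; NOT the prometheus crate's API.
struct ToyCounterVec {
    /// Names of the labels, fixed at construction time.
    label_names: Vec<String>,
    /// One counter per distinct combination of label values.
    counts: HashMap<Vec<String>, u64>,
}

impl ToyCounterVec {
    fn new(label_names: &[&str]) -> Self {
        ToyCounterVec {
            label_names: label_names.iter().map(|n| n.to_string()).collect(),
            counts: HashMap::new(),
        }
    }

    /// Increments the counter identified by exactly one value per declared label name.
    fn inc(&mut self, label_values: &[&str]) -> Result<(), String> {
        if label_values.len() != self.label_names.len() {
            return Err(format!(
                "cardinality mismatch: expected {} label values, got {}",
                self.label_names.len(),
                label_values.len()
            ));
        }
        let key: Vec<String> = label_values.iter().map(|v| v.to_string()).collect();
        *self.counts.entry(key).or_insert(0) += 1;
        Ok(())
    }

    /// Reads the counter for one combination of label values (0 if never incremented).
    fn get(&self, label_values: &[&str]) -> u64 {
        let key: Vec<String> = label_values.iter().map(|v| v.to_string()).collect();
        self.counts.get(&key).copied().unwrap_or(0)
    }
}
```

In this model, `ToyCounterVec::new(&["succeeded", "failed"])` followed by
`inc(&["succeeded"])` is rejected, while declaring a single label name
such as `success` and passing `"succeeded"` or `"failed"` as its value
works and keeps the two counts separately filterable.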
* get av-store subsystem to actually run properly and not die on first signal
* typo fix: incomming -> incoming
* don't disable authority discovery in test nodes
* Fix rococo-v1 missing session keys
* Update node/core/av-store/Cargo.toml
* try dummying out av-store on non-full-nodes
* overseer and subsystems are required only for full nodes
* Reduce the number of warnings on browser target
* Fix two more warnings
* InclusionInherent should actually have an Inherent module on rococo
* Ancestry: don't return genesis' parent hash
* Update Cargo.lock
* fix broken test
* update test script: specify chainspec as script argument
* Apply suggestions from code review
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Update node/service/src/lib.rs
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* node/service/src/lib: Return error via ? operator
* post-merge blues
* add is_collator flag
* prevent occasional av-store test panic
* simplify fix; expand application
* run authority_discovery in Role::Discover when collating
* distinguish between proposer closed channel errors
* add IsCollator enum, remove is_collator CLI flag
* improve formatting
* remove nop loop
* Fix some stuff
* Adds test parachain adder collator
* Add sudo to Rococo, change session length to 30 seconds and some renaming
* Update to the latest changes on master
* Some fixes
* Fix compilation
* Update parachain/test-parachains/adder/collator/src/lib.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Review comments
* Downgrade transaction version
* Fixes
* MOARE
* Register notification protocols
* utils: remove unused error
* av-store: more resilient to some errors
* address review nits
* address more review nits
Co-authored-by: Peter Goodspeed-Niklaus <peter.r.goodspeedniklaus@gmail.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
Co-authored-by: Sergey Shulepov <s.pepyakin@gmail.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Improve and unify testing facilities
This improves the testing facilities by making the test client easier to
use. It also removes code that is not required for the test client.
Besides that it also moves the test service and test client under
`node/test`.
* Update Cargo.lock
* Update node/test/client/src/block_builder.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Remove explicit lifetime annotation
* Fix warnings and add extra `BlockBuilderExt`
Co-authored-by: Andronik Ordian <write@reusable.software>
* Remove old service, 3rd try
i.e.
Revert "Revert "Remove Old Service, 2nd try (#1732)" (#1758)"
This reverts commit 9a0f08bfe1.
Closes #1757.
We now have some evidence that the polkadot validator was producing
blocks after all; the reason the blocks_constructed metric was 0 was
that as a new metric it hadn't yet been incorporated into that
branch's codebase. See
https://github.com/paritytech/polkadot/issues/1757#issuecomment-700977602
As this PR is based on a newer `master` branch than the previous one,
that should hopefully no longer be an issue.
* paras trait now has an Origin type
* initial work running a two node local net
* use the right incantations so the nodes produce blocks together
* improve internal documentation
Co-authored-by: Bastian Köcher <git@kchr.de>
* Restore "Remove service, migrate all to service-new (#1630)"
i.e.
Revert "Revert "Remove service, migrate all to service-new (#1630)" (#1731)"
This reverts commit b4457f555b.
This allows us to get the changeset from #1630 into a new branch
which can be merged sometime in the future after appropriate burnin
tests have completed.
* remove ',)' from codebase outside of macros
* restore bdfl-preferred formatting
* attempt to improve destructuring formatting
* rename polkadot-service-new -> polkadot-service
* Apply suggestions from code review
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* remove unused import
* Update runtime/rococo-v1/README.md
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* import rococo into chain-spec
* make a few stabs at moving forward
* wip: rococo readme
* remove /service crate
- Move the chain-spec files to node-service
- update sufficient cargo files that polkadot-service-new builds
- not everything else builds yet
* wip: chase down some build errors in polkadot-cli
There's a lot more to go, but some progress has happened.
* make more progress getting polkadot-cli to build
* don't ignore polkadot.json within the res directory
* don't recreate pathbufs
* Prepare Polkadot to be used by Cumulus
This begins to make Polkadot usable from Cumulus.
* Remove old test
* migrate new_chain_ops fix from /service
* partially remove node/test-service
* Reset some changes
* Revert "partially remove node/test-service"
This reverts commit 7b8f9ba5bfc286a309df89853ae11facf3277ffb.
* WIP: replace v0 ParachainHost impl with v1 for test runtime
This is necessary because one of the current errors when building
the test service boils down to:
the trait bound `polkadot_test_runtime::RuntimeApiImpl<...>`:
`polkadot_primitives::v1::ParachainHost<...>` is not satisfied
This is WIP because it appears to be causing some std leakage into
the wasm environment, or something; the compiler is currently
complaining about duplicate definitions of `panic_handler` and `oom`.
Presumably I have to identify all std types (Vec etc) and replace
them with sp_std equivalents.
* fix test runtime build
it wasn't std leakage, after all
* bump westend spec version
* use service-new as service within cli
* to revert: demo that forwarding the test runtime to the real impl blows up
* Revert "to revert: demo that forwarding the test runtime to the real impl blows up"
This reverts commit 68d2f385f378721c7433e3e39133434610cd2a51.
* Revert "Revert "to revert: demo that forwarding the test runtime to the real impl blows up""
This reverts commit 04cb1cbf8873b4429cb9c9fdccb7f4bb137dc720.
Might have just forgotten to disable default features
* More reverts
* MOARE
* plug in the runtime as the generic instantiation
This feels closer to a solution, but it still has problems: in particular,
it's assumed that Runtime implements all appropriate Trait traits,
which this one apparently does not.
* implement necessary traits to get the test runtime compiling
This is almost certainly not correct in some way; it really
looks like I need to mess with the construct_runtime! macro
somehow, to inject the inclusion trait's event type as a Event
variant. Still, better lock down this changeset while it all
compiles.
* add inclusion::Event as variant into Event enum
* implement unimplemented bits in kusama
* implement unimplemented bits in polkadot runtime
* implement unimplemented bits in westend runtime
* migrate client upgrades from master
* update test service with new node changes
* package metadata--that wasn't intended to be removed
* add parachains v1 modules to each runtime
It's not clear what precisely this does, but it's probably the right
thing to do.
* enable cli to opt out of full node features
* adjust rococo chainspec per example
https://github.com/paritytech/polkadot/blob/26f1fa47f7836ab4bee5d4aad127ebce748320dd/service/src/chain_spec.rs#L362
* try to fix Cargo.lock
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Andronik Ordian <write@reusable.software>
* WIP
* The initial implementation of the collator side.
* Improve comments
* Multiple collation requests
* Add more tests and comments to validator side
* Add comments, remove dead code
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Fix build after suggested changes
* Also connect to the next validator group
* Remove a Future impl and move TimeoutExt to util
* Minor nits
* Fix build
* Change FetchCollations back to FetchCollation
* Try this
* Final fixes
* Fix build
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* choose the straightforward candidate selection algorithm for now
* add draft implementation of candidate selection
* fix typo in summary
* more properly report misbehaving collators
* describe how CandidateSelection subsystem becomes aware of candidates
* revise candidate selection / collator protocol interaction pattern
* implement rest of candidate selection per the guide
* review: resolve nits
* start writing test suite, harness
* implement first test
* add second test
* implement third test
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* update primitives
* correct parent_head field
* make hrmp field pub
* refactor validation data: runtime
* refactor validation data: messages
* add arguments to full_validation_data runtime API
* port runtime API
* mostly port over candidate validation
* remove some parameters from ValidationParams
* guide: update candidate validation
* update candidate outputs
* update ValidationOutputs in primitives
* port over candidate validation
* add a new test for no-transient behavior
* update util runtime API wrappers
* candidate backing
* fix missing imports
* change some fields of validation data around
* runtime API impl
* update candidate validation
* fix backing tests
* grumbles from review
* fix av-store tests
* fix some more crates
* fix provisioner tests
* fix availability distribution tests
* port collation-generation to new validation data
* fix overseer tests
* Update roadmap/implementers-guide/src/node/utility/candidate-validation.md
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* update networking types
* port over overseer-protocol message types
* Add the collation protocol to network bridge
* message sending
* stub for ConnectToValidators
* add some helper traits and methods to protocol types
* add collator protocol message
* leaves-updating
* peer connection and disconnection
* add utilities for dispatching multiple events
* implement message handling
* add an observedrole enum with equality and no sentry nodes
* derive partial-eq on network bridge event
* add PartialEq impls for network message types
* add Into implementation for observedrole
* port over existing network bridge tests
* add some more tests
* port bitfield distribution
* port over bitfield distribution tests
* add codec indices
* port PoV distribution
* port over PoV distribution tests
* port over statement distribution
* port over statement distribution tests
* update overseer and service-new
* address review comments
* port availability distribution
* port over availability distribution tests
* sketch out provisioner basics
* handle provisionable data
* stub out select_inherent_data
* split runtime APIs into sub-chapters to improve linkability
* explain SignedAvailabilityBitfield semantics
* add internal link to further documentation
* some more work figuring out how the provisioner can do its thing
* fix broken link
* don't import enum variants where it's one layer deep
* make request_availability_cores a free fn in util
* document more precisely what should happen on block production
* finish first-draft implementation of provisioner
* start working on the full and proper backed candidate selection rule
* Pass number of block under construction via RequestInherentData
* Revert "Pass number of block under construction via RequestInherentData"
This reverts commit 850fe62cc0dfb04252580c21a985962000e693c8.
That initially looked like the better approach--it spent the time
budget for fetching the block number in the proposer, instead of
the provisioner, and that felt more appropriate--but it turns out
not to be obvious how to get the block number of the block under
construction from within the proposer. The Chain API may be less
ideal, but it should be easier to implement.
* wip: get the block under production from the Chain API
* add ChainApiMessage to AllMessages
* don't break the run loop if a provisionable data channel closes
* clone only those backed candidates which are coherent
* propagate chain_api subsystem through various locations
* add delegated_subsystem! macro to ease delegating subsystems
Unfortunately, it doesn't work right:
```
error[E0446]: private type `CandidateBackingJob` in public interface
   --> node/core/backing/src/lib.rs:775:1
    |
86  | struct CandidateBackingJob {
    | - `CandidateBackingJob` declared as private
...
775 | delegated_subsystem!(CandidateBackingJob as CandidateBackingSubsystem);
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't leak private type
```
I'm not sure precisely what's going wrong, here; I suspect the problem is
the use of `<$job as JobTrait>::RunArgs` and `::ToJob`; the failure would be
that it's not reifying the types to verify that the actual types are public,
but instead referring to them via `CandidateBackingJob`, which is in fact private;
that privacy is the point.
Going to see if I can generic my way out of this, but we may be headed for a
quick revert here.
* fix delegated_subsystem
The invocation is a bit more verbose than I'd prefer, but it's also
more explicit about what types need to be public. I'll take it as a win.
* add provisioning subsystem; reduce public interface of provisioner
* deny missing docs in provisioner
* refactor core selection per code review suggestion
This is twice as much code when measured by line, but IMO it is
in fact somewhat clearer to read, so overall a win.
Also adds an improved rule for selecting availability bitfields,
which (unlike the previous implementation) guarantees that the
appropriate postconditions hold there.
* fix bad merge double-declaration
* update guide with (hopefully) complete provisioner candidate selection procedure
* clarify candidate selection algorithm
* Revert "clarify candidate selection algorithm"
This reverts commit c68a02ac9cf42b3a4a28eb197d38633a40d0e3e6.
* clarify candidate selection algorithm
* update provisioner to implement candidate selection per the guide
* add test that no more than one bitfield is selected per validator
* add test that each selected bitfield corresponds to an occupied core
* add test that more set bits win conflicts
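The three properties tested above (one bitfield per validator, only
occupied cores referenced, more set bits win conflicts) can be sketched
as follows. This is a simplified illustration, not the provisioner's
code: `Bitfield`, `count_ones` and `select_bitfields` are hypothetical,
and keeping the earlier bitfield on a tie is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical simplified bitfield: the signing validator and one bit per core.
#[derive(Clone, Debug, PartialEq)]
struct Bitfield {
    validator_index: u32,
    bits: Vec<bool>,
}

/// Number of set bits in a bitfield.
fn count_ones(bitfield: &Bitfield) -> usize {
    bitfield.bits.iter().filter(|bit| **bit).count()
}

/// Selects at most one bitfield per validator.
/// - Bitfields of the wrong length, or with a bit set for an unoccupied core, are skipped.
/// - When a validator has several candidates, the one with strictly more set bits wins.
fn select_bitfields(bitfields: &[Bitfield], occupied: &[bool]) -> Vec<Bitfield> {
    let mut best: BTreeMap<u32, &Bitfield> = BTreeMap::new();
    for bitfield in bitfields {
        if bitfield.bits.len() != occupied.len() {
            continue;
        }
        let only_occupied = bitfield
            .bits
            .iter()
            .zip(occupied)
            .all(|(bit, core_occupied)| !*bit || *core_occupied);
        if !only_occupied {
            continue;
        }
        let replace = match best.get(&bitfield.validator_index) {
            Some(current) => count_ones(bitfield) > count_ones(current),
            None => true,
        };
        if replace {
            best.insert(bitfield.validator_index, bitfield);
        }
    }
    // BTreeMap iteration yields the result ordered by validator index.
    best.into_values().cloned().collect()
}
```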
* add macro for specializing runtime requests; specialize all runtime requests
* add test harness for select_candidates tests
* add first real select_candidates test, fix test_harness
* add mock overseer and test that success is possible
* add test that the candidate selection algorithm picks the right ones
* make candidate selection test somewhat more stringent
* skeleton for candidate-validation
* add to workspace
* implement candidate validation logic
* guide: note occupied-core assumption for candidate validation
* adjust message doc
* wire together `run` asynchronously
* add a Subsystem implementation
* clean up a couple warnings
* fix compilation errors due to merge
* improve candidate-validation.md
* remove old reference to subsystem-test helpers crate
* update Cargo.lock
* add a couple new Runtime API methods
* add a candidate validation message
* fetch validation data from the chain state
* some tests for assumption checking
* make spawn_validate_exhaustive mockable
* more tests on the error handling side
* fix all other grumbles except for wasm validation API change
* wrap a SpawnNamed in candidate-validation
* warn
* amend guide
* squanch warning
* remove duplicate after merge
* type defaults for ParachainHost
* add ValidationCode message
* implement core loop of runtime API subsystem
* subsystem trait implementation for runtime API subsystem
* implement a mock runtime API
* some tests that ensure requests are forwarded to runtime API correctly
* fix dependency grumbles
* improve RuntimeApiError API
* Initial commit
* WIP
* Make atomic transactions
* Remove pruning code
* Fix build and add a Nop to bridge
* Fixes from review
* Move config struct around for clarity
* Rename constructor and warn on missing docs
* Fix a test and rename a message
* Fix some more reviews
* Obviously failed to rebase cleanly
* update guide to reduce confusion and TODOs
* work from previous bitfield signing effort
There were large merge issues with the old bitfield signing PR, so
we're just copying all the work from that onto this and restarting.
Much of the existing work will be discarded because we now have better
tools available, but that's fine.
* start rewriting bitfield signing in terms of the util module
* implement construct_availability_bitvec
It's not an ideal implementation--we can make it much more concurrent--
but at least it compiles.
* implement the unimplemented portions of bitfield signing
* get core availability concurrently, not sequentially
* use sp-std instead of std for a parachain item
* resolve type inference failure caused by multiple From impls
* handle bitfield signing subsystem & AllMessages variant in overseer
* fix more multi-From inference issues
* more concisely handle overflow
Co-authored-by: Andronik Ordian <write@reusable.software>
* Revert "resolve type inference failure caused by multiple From impls"
This reverts commit 7fc77805de5e5074a1b01037f8d4e3919e03e0e1.
* Revert "fix more multi-From inference issues"
This reverts commit f14ffe589e20d664d8a900ed62f68b6fb844a514.
* impl From<i32> for ParaId
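A likely reason this impl helps: once a type has more than one `From`
impl for integer types, an unsuffixed literal no longer has a unique
target type and falls back to `i32`, so `From<i32>` must exist for
calls like `ParaId::from(5)` to compile. The sketch below uses a
hypothetical stand-in `ParaId`; the `From<usize>` impl and the panic on
negative input are assumptions, not necessarily what the real type does.

```rust
/// Hypothetical stand-in for a parachain id newtype.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ParaId(u32);

impl From<u32> for ParaId {
    fn from(x: u32) -> Self {
        ParaId(x)
    }
}

// A second integer impl (assumed here): from this point on, an unsuffixed
// literal can no longer be inferred as `u32` from the impl alone.
impl From<usize> for ParaId {
    fn from(x: usize) -> Self {
        assert!(x <= u32::MAX as usize, "parachain id out of range");
        ParaId(x as u32)
    }
}

// Unsuffixed integer literals default to `i32`, so this impl is the one
// that makes `ParaId::from(5)` and `5.into()` resolve again.
impl From<i32> for ParaId {
    fn from(x: i32) -> Self {
        assert!(x >= 0, "parachain id must not be negative");
        ParaId(x as u32)
    }
}

/// Builds an id from an unsuffixed literal; resolves through `From<i32>`.
fn literal_id() -> ParaId {
    ParaId::from(5)
}
```

Without the `From<i32>` impl, `literal_id` would fail to compile in this
sketch, while suffixed calls such as `ParaId::from(5u32)` would be
unaffected.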
* handle another instance of AllSubsystems
* improve consistency when returning existing options
Co-authored-by: Andronik Ordian <write@reusable.software>
* Enable transfers
Also quash any conviction from Referendum Zero; Sudo was always
going to have been removed so lock-voting doesn't make sense in
this case.
* Add test for migration; remove superfluous comment.
* Fixes
* Bump
* Weekly elections
* get conclude signal working properly; don't allocate a vector
* wip: add test suite / example / explanation for using utility subsystem
Unfortunately, the test fails right now for reasons which seem
very odd. Just have to keep poking at it.
* explicitly import everything
* fix subsystem-util test
The root problem here was two-fold:
- there was a circular dependency from subsystem -> test-helpers/subsystem ->
subsystem
- cfg(test) doesn't propagate between crates
The solution: move the subsystem test helpers into a sub-module
within subsystem. Publicly export them from the previous location
so no other code breaks.
Doing this has an additional benefit: it ensures that no production
code can ever accidentally use the subsystem helpers, as they are compile-
gated on cfg(test).
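A minimal sketch of that layout, with hypothetical names (`double`,
`test_helpers`, `sample_inputs`): the helpers sit in a sub-module of the
same crate and only exist when that crate itself is compiled for
testing.

```rust
/// Production code: always compiled.
pub fn double(x: u32) -> u32 {
    x * 2
}

/// Test helpers as a sub-module of the crate they support.
/// The module is compiled only when this crate is built with `cfg(test)`,
/// so production builds cannot reference it by accident.
#[cfg(test)]
pub mod test_helpers {
    /// Hypothetical helper providing sample inputs for tests.
    pub fn sample_inputs() -> Vec<u32> {
        vec![0, 1, 21]
    }
}

/// A test in the same crate can use the helpers directly.
#[cfg(test)]
#[test]
fn doubles_samples() {
    for x in test_helpers::sample_inputs() {
        assert_eq!(double(x), 2 * x);
    }
}
```

Because `cfg(test)` is set only for the crate under test, a separate
helper crate would not see it, which is why keeping the helpers inside
the crate avoids the problem described above.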
* fully commit to moving test helpers into a subsystem module
* add some more tests
* get rid of log tests in favor of real error forwarding
It's not obvious whether we'll ever really want to chase down
these errors outside a testing context, but having the capability
won't hurt.
* fix issue which caused test to hang on osx
* only require that job errors are PartialEq when testing
also fix polkadot-node-core-backing tests
* get rid of any notion of partialeq
* rethink testing
Combine tests of starting and stopping job: leaving a test executor
with a job running was pretty clearly the cause of the sometimes-hang.
Also, add a timeout so tests _can't_ hang anymore; they just fail
after a while.
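The idea of a test that fails instead of hanging can be sketched with
standard-library threads. The actual tests are asynchronous, so this is
a thread-based stand-in for the same idea rather than the code used
here; `run_with_timeout` is a hypothetical helper.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `work` on a separate thread and gives up after `limit`.
/// Returns `None` on timeout. A panic in `work` drops the sender, which also yields `None`.
fn run_with_timeout<T, F>(limit: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller timed out; ignore that error.
        let _ = tx.send(work());
    });
    rx.recv_timeout(limit).ok()
}
```

A test body wrapped this way can assert that the result is `Some(..)`,
so a stuck job shows up as an ordinary test failure after `limit`
instead of blocking the whole run.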
* rename fwd_errors -> forward_errors
* warn on error propagation failure
* fix unused import leftover from merge
* derive eq for subsystemerror
* Remove Sudo
NOTE: To ensure minimal index changes to pre-existing pallet deployments,
this is done with a "swap_remove" style; the previous last pallet
(Purchase), which is hitherto unused, has been shifted into the old index
of Sudo.
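The index effect of a "swap_remove" style removal can be seen with
`Vec::swap_remove` from the standard library. The pallet list and its
order below are hypothetical and shortened, and `remove_pallet` is an
illustrative helper, not part of the runtime.

```rust
/// Removes `name` using `Vec::swap_remove`: the last entry moves into the freed
/// slot, so every other entry keeps its index. Returns the freed index.
fn remove_pallet(pallets: &mut Vec<&'static str>, name: &str) -> Option<usize> {
    let index = pallets.iter().position(|p| *p == name)?;
    pallets.swap_remove(index);
    Some(index)
}
```

Starting from `["System", "Sudo", "Balances", "Purchase"]`, removing
`"Sudo"` leaves `["System", "Purchase", "Balances"]`: only the last
pallet changes index, whereas an ordinary `remove` would shift every
later entry down by one.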
* Remove CC1 designation.
* Fixes
* Bump
* Fixes
* Fixes
* Fixes
* Fixes
* Fixes
* Fixes
* Fixes
* Fixes
* Fixes
* Fixes