* initial impl approval distribution
* initial tests and fixes
* batching seems difficult: different peers have different needs
* bridge: fix test after merge
* some guide updates
* only send assignments to peers who know about the block
* fix a test, add approvals test
* simplify
* do not send assignment to peers for finalized blocks
* guide: protocol input and output
* one more test
* more comments, logs, initial metrics
* fix a typo
* one more thing: early return when reimporting a thing locally
* skeleton
* skeleton aux-schema module
* start approval types
* start aux schema with aux store
* doc
* finish basic types
* start approval types
* doc
* finish basic types
* write out schema types
* add debug and codec impls to approval types
* add debug and codec impls to approval types
also add some key computation
* add debug and codec impls to approval types
* getters for block and candidate entries
* grumbles
* remove unused AssignmentId
* load_decode utility
* implement DB clearing
* function for adding new block entry to aux store
* start `canonicalize` implementation
* more skeleton
* finish implementing canonicalize
* tag TODO
* implement a test AuxStore
* add allow(unused)
* basic loading and deleting test
* block_entry test function
* add a test for `add_block_entry`
* ensure range is exclusive at end
* test clear()
* test that add_block sets children
* add a test for canonicalize
* Update node/core/approval-voting/src/aux_schema/mod.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update node/core/approval-voting/src/aux_schema/tests.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Update node/core/approval-voting/src/aux_schema/mod.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Adds message types
* Add code skeleton
* Adds subsystem code.
* Adds a first test
* Adds interaction result to availability_lru
* Use LruCache instead of a HashMap
* Whitespaces to tabs
* Do not ignore errors
* Change error type
* Add a timeout to chunk requests
* Add custom errors and log them
* Adds replace_availability_recovery method
* recovery_threshold computed by erasure crate
* change core to std
* adds docs to error type
* Adds a test for invalid reconstruction
* refactors interaction run into multiple methods
* Cleanup AwaitedChunks
* Even more fixes
* Test that recovery with wrong root is an error
* Break to launch another requests
* Styling fixes
* Add SessionIndex to API
* Proper relay parents for MakeRequest
* Remove validator_discovery and use message
* Remove a stream on exhaustion
* On cleanup free the request streams
* Fix merge and refactor
* Add one Jaeger span per relay parent
This adds one Jaeger span per relay parent, instead of always creating
new spans per relay parent. This should improve the UI view, because
subsystems are now grouped below one common span.
* Fix doc tests
* Replace `PerLeaveSpan` to `PerLeafSpan`
* More renaming
* Moare
* Update node/subsystem/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Skip the spans
* Increase `spec_version`
Co-authored-by: Andronik Ordian <write@reusable.software>
* start working on building the real overseer
Unfortunately, this fails to compile right now due to an upstream
failure to compile which is probably brought on by a recent upgrade
to rustc v1.47.
* fill in AllSubsystems internal constructors
* replace fn make_metrics with Metrics::attempt_to_register
* update to account for #1740
* remove Metrics::register, rename Metrics::attempt_to_register
* add 'static bounds to real_overseer type params
* pass authority_discovery and network_service to real_overseer
It's not straightforwardly obvious that this is the best way to handle
the case when there is no authority discovery service, but it seems
to be the best option available at the moment.
* select a proper database configuration for the availability store db
* use subdirectory for av-store database path
* apply Basti's patch which avoids needing to parameterize everything on Block
* simplify path extraction
* get all tests to compile
* Fix Prometheus double-registry error
for debugging purposes, added this to node/subsystem-util/src/lib.rs:472-476:
```rust
Some(registry) => Self::try_register(registry).map_err(|err| {
eprintln!("PrometheusError calling {}::register: {:?}", std::any::type_name::<Self>(), err);
err
}),
```
That pointed out where the registration was failing, which led to
this fix. The test still doesn't pass, but it now fails in a new
and different way!
* authorities must have authority discovery, but not necessarily overseer handlers
* fix broken SpawnedSubsystem impls
detailed logging determined that using the `Box::new` style of
future generation, the `self.run` method was never being called,
leading to dropped receivers / closed senders for those subsystems,
causing the overseer to shut down immediately.
This is not the final fix needed to get things working properly,
but it's a good start.
* use prometheus properly
Prometheus lets us register simple counters, which aren't very
interesting. It also allows us to register CounterVecs, which are.
With a CounterVec, you can provide a set of labels, which can
later be used to filter the counts.
We were using them wrong, though. This pattern was repeated in a
variety of places in the code:
```rust
// panics with an cardinality mismatch
let my_counter = register(CounterVec::new(opts, &["succeeded", "failed"])?, registry)?;
my_counter.with_label_values(&["succeeded"]).inc()
```
The problem is that the labels provided in the constructor are not
the set of legal values which can be annotated, but a set of individual
label names which can have individual, arbitrary values.
This commit fixes that.
* get av-store subsystem to actually run properly and not die on first signal
* typo fix: incomming -> incoming
* don't disable authority discovery in test nodes
* Fix rococo-v1 missing session keys
* Update node/core/av-store/Cargo.toml
* try dummying out av-store on non-full-nodes
* overseer and subsystems are required only for full nodes
* Reduce the amount of warnings on browser target
* Fix two more warnings
* InclusionInherent should actually have an Inherent module on rococo
* Ancestry: don't return genesis' parent hash
* Update Cargo.lock
* fix broken test
* update test script: specify chainspec as script argument
* Apply suggestions from code review
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* Update node/service/src/lib.rs
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* node/service/src/lib: Return error via ? operator
* post-merge blues
* add is_collator flag
* prevent occasional av-store test panic
* simplify fix; expand application
* run authority_discovery in Role::Discover when collating
* distinguish between proposer closed channel errors
* add IsCollator enum, remove is_collator CLI flag
* improve formatting
* remove nop loop
* Fix some stuff
* Adds test parachain adder collator
* Add sudo to Rococo, change session length to 30 seconds and some renaming
* Update to the latest changes on master
* Some fixes
* Fix compilation
* Update parachain/test-parachains/adder/collator/src/lib.rs
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Review comments
* Downgrade transaction version
* Fixes
* MOARE
* Register notification protocols
* utils: remove unused error
* av-store: more resilient to some errors
* address review nits
* address more review nits
Co-authored-by: Peter Goodspeed-Niklaus <peter.r.goodspeedniklaus@gmail.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
Co-authored-by: Robert Habermeier <robert@Roberts-MBP.lan1>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Max Inden <mail@max-inden.de>
Co-authored-by: Sergey Shulepov <s.pepyakin@gmail.com>
Co-authored-by: Sergei Shulepov <sergei@parity.io>
* Improve and unify testing facilities
This improves the testing facilities by making the test client easier to
use. It also removes code that is not required for the test client.
Besides that it also moves the test service and test client under
`node/test`.
* Update Cargo.lock
* Update node/test/client/src/block_builder.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Remove explicit lifetime annotation
* Fix warnings and add extra `BlockBuilderExt`
Co-authored-by: Andronik Ordian <write@reusable.software>
* Remove old service, 3rd try
i.e.
Revert "Revert "Remove Old Service, 2nd try (#1732)" (#1758)"
This reverts commit 9a0f08bfe1.
Closes#1757.
We now have some evidence that the polkadot validator was producing
blocks after all; the reason the blocks_constructed metric was 0 was
that as a new metric it hadn't yet been incorporated into that
branch's codebase. See
https://github.com/paritytech/polkadot/issues/1757#issuecomment-700977602
As this PR is based on a newer `master` branch than the previous one,
that should hopefully no longer be an issue.
* paras trait now has an Origin type
* initial work running a two node local net
* use the right incantations so the nodes produce blocks together
* improve internal documentation
Co-authored-by: Bastian Köcher <git@kchr.de>
* Restore "Remove service, migrate all to service-new (#1630)"
i.e.
Revert "Revert "Remove service, migrate all to service-new (#1630)" (#1731)"
This reverts commit b4457f555b.
This allows us to get the changeset from #1630 into a new branch
which can be merged sometime in the future after appropriate burnin
tests have completed.
* remove ',)' from codebase outside of macros
* restore bdfl-preferred formatting
* attempt to improve destructuring formatting
* rename polkadot-service-new -> polkadot-service
* Apply suggestions from code review
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* remove unused import
* Update runtime/rococo-v1/README.md
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
* import rococo into chain-spec
* make a few stabs at moving forward
* wip: rococo readme
* remove /service crate
- Move the chain-spec files to node-service
- update sufficient cargo files that polkadot-service-new builds
- not everything else builds yet
* wip: chase down some build errors in polkadot-cli
There's a lot more to go, but some progress has happened.
* make more progress getting polkadot-cli to build
* don't ignore polkadot.json within the res directory
* don't recreate pathbufs
* Prepare Polkadot to be used by Cumulus
This begins to make Polkadot usable from Cumulus.
* Remove old test
* migrate new_chain_ops fix from /service
* partially remove node/test-service
* Reset some changes
* Revert "partially remove node/test-service"
This reverts commit 7b8f9ba5bfc286a309df89853ae11facf3277ffb.
* WIP: replace v0 ParachainHost impl with v1 for test runtime
This is necessary because one of the current errors when building
the test service boils down to:
the trait bound `polkadot_test_runtime::RuntimeApiImpl<...>`:
`polkadot_primitives::v1::ParachainHost<...>` is not satisfied
This is WIP because it appears to be causing some std leakage into
the wasm environment, or something; the compiler is currently
complaining about duplicate definitions of `panic_handler` and `oom`.
Presumably I have to identify all std types (Vec etc) and replace
them with sp_std equivalents.
* fix test runtime build
it wasn't std leakage, after all
* bump westend spec version
* use service-new as service within cli
* to revert: demo that forwarding the test runtime to the real impl blows up
* Revert "to revert: demo that forwarding the test runtime to the real impl blows up"
This reverts commit 68d2f385f378721c7433e3e39133434610cd2a51.
* Revert "Revert "to revert: demo that forwarding the test runtime to the real impl blows up""
This reverts commit 04cb1cbf8873b4429cb9c9fdccb7f4bb137dc720.
Might have just forgotten to disable default features
* More reverts
* MOARE
* plug in the runtime as the generic instantiation
This feels closer to a solution, but it still has problems: in particular,
it's assumed that Runtime implements all appropriate Trait traits,
which this one apparently does not.
* implement necessary traits to get the test runtime compiling
This is almost certainly not correct in some way; it really
looks like I need to mess with the construct_runtime! macro
somehow, to inject the inclusion trait's event type as a Event
variant. Still, better lock down this changeset while it all
compiles.
* add inclusion::Event as variant into Event enum
* implement unimplemented bits in kusama
* implement unimplemented bits in polkadot runtime
* implement unimplemented bits in westend runtime
* migrate client upgrades from master
* update test service with new node changes
* package metadata--that wasn't intended to be removed
* add parachains v1 modules to each runtime
It's not clear what precisely this does, but it's probably the right
thing to do.
* enable cli to opt out of full node features
* adjust rococo chainspec per example
https://github.com/paritytech/polkadot/blob/26f1fa47f7836ab4bee5d4aad127ebce748320dd/service/src/chain_spec.rs#L362
* try to fix Cargo.lock
Co-authored-by: Bastian Köcher <git@kchr.de>
Co-authored-by: Andronik Ordian <write@reusable.software>
* WIP
* The initial implementation of the collator side.
* Improve comments
* Multiple collation requests
* Add more tests and comments to validator side
* Add comments, remove dead code
* Apply suggestions from code review
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Fix build after suggested changes
* Also connect to the next validator group
* Remove a Future impl and move TimeoutExt to util
* Minor nits
* Fix build
* Change FetchCollations back to FetchCollation
* Try this
* Final fixes
* Fix build
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* choose the straightforward candidate selection algorithm for now
* add draft implementation of candidate selection
* fix typo in summary
* more properly report misbehaving collators
* describe how CandidateSelection subsystem becomes aware of candidates
* revise candidate selection / collator protocol interaction pattern
* implement rest of candidate selection per the guide
* review: resolve nits
* start writing test suite, harness
* implement first test
* add second test
* implement third test
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* update primitives
* correct parent_head field
* make hrmp field pub
* refactor validation data: runtime
* refactor validation data: messages
* add arguments to full_validation_data runtime API
* port runtime API
* mostly port over candidate validation
* remove some parameters from ValidationParams
* guide: update candidate validation
* update candidate outputs
* update ValidationOutputs in primitives
* port over candidate validation
* add a new test for no-transient behavior
* update util runtime API wrappers
* candidate backing
* fix missing imports
* change some fields of validation data around
* runtime API impl
* update candidate validation
* fix backing tests
* grumbles from review
* fix av-store tests
* fix some more crates
* fix provisioner tests
* fix availability distribution tests
* port collation-generation to new validation data
* fix overseer tests
* Update roadmap/implementers-guide/src/node/utility/candidate-validation.md
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* update networking types
* port over overseer-protocol message types
* Add the collation protocol to network bridge
* message sending
* stub for ConnectToValidators
* add some helper traits and methods to protocol types
* add collator protocol message
* leaves-updating
* peer connection and disconnection
* add utilities for dispatching multiple events
* implement message handling
* add an observedrole enum with equality and no sentry nodes
* derive partial-eq on network bridge event
* add PartialEq impls for network message types
* add Into implementation for observedrole
* port over existing network bridge tests
* add some more tests
* port bitfield distribution
* port over bitfield distribution tests
* add codec indices
* port PoV distribution
* port over PoV distribution tests
* port over statement distribution
* port over statement distribution tests
* update overseer and service-new
* address review comments
* port availability distribution
* port over availability distribution tests
* sketch out provisioner basics
* handle provisionable data
* stub out select_inherent_data
* split runtime APIs into sub-chapters to improve linkability
* explain SignedAvailabilityBitfield semantics
* add internal link to further documentation
* some more work figuring out how the provisioner can do its thing
* fix broken link
* don't import enum variants where it's one layer deep
* make request_availability_cores a free fn in util
* document more precisely what should happen on block production
* finish first-draft implementation of provisioner
* start working on the full and proper backed candidate selection rule
* Pass number of block under construction via RequestInherentData
* Revert "Pass number of block under construction via RequestInherentData"
This reverts commit 850fe62cc0dfb04252580c21a985962000e693c8.
That initially looked like the better approach--it spent the time
budget for fetching the block number in the proposer, instead of
the provisioner, and that felt more appropriate--but it turns out
not to be obvious how to get the block number of the block under
construction from within the proposer. The Chain API may be less
ideal, but it should be easier to implement.
* wip: get the block under production from the Chain API
* add ChainApiMessage to AllMessages
* don't break the run loop if a provisionable data channel closes
* clone only those backed candidates which are coherent
* propagate chain_api subsystem through various locations
* add delegated_subsystem! macro to ease delegating subsystems
Unfortunately, it doesn't work right:
```
error[E0446]: private type `CandidateBackingJob` in public interface
--> node/core/backing/src/lib.rs:775:1
|
86 | struct CandidateBackingJob {
| - `CandidateBackingJob` declared as private
...
775 | delegated_subsystem!(CandidateBackingJob as CandidateBackingSubsystem);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't leak private type
```
I'm not sure precisely what's going wrong, here; I suspect the problem is
the use of `$job as JobTrait>::RunArgs` and `::ToJob`; the failure would be
that it's not reifying the types to verify that the actual types are public,
but instead referring to them via `CandidateBackingJob`, which is in fact private;
that privacy is the point.
Going to see if I can generic my way out of this, but we may be headed for a
quick revert here.
* fix delegated_subsystem
The invocation is a bit more verbose than I'd prefer, but it's also
more explicit about what types need to be public. I'll take it as a win.
* add provisioning subsystem; reduce public interface of provisioner
* deny missing docs in provisioner
* refactor core selection per code review suggestion
This is twice as much code when measured by line, but IMO it is
in fact somewhat clearer to read, so overall a win.
Also adds an improved rule for selecting availability bitfields,
which (unlike the previous implementation) guarantees that the
appropriate postconditions hold there.
* fix bad merge double-declaration
* update guide with (hopefully) complete provisioner candidate selection procedure
* clarify candidate selection algorithm
* Revert "clarify candidate selection algorithm"
This reverts commit c68a02ac9cf42b3a4a28eb197d38633a40d0e3e6.
* clarify candidate selection algorithm
* update provisioner to implement candidate selection per the guide
* add test that no more than one bitfield is selected per validator
* add test that each selected bitfield corresponds to an occupied core
* add test that more set bits win conflicts
* add macro for specializing runtime requests; specailize all runtime requests
* add tests harness for select_candidates tests
* add first real select_candidates test, fix test_harness
* add mock overseer and test that success is possible
* add test that the candidate selection algorithm picks the right ones
* make candidate selection test somewhat more stringent