* Initial attempt to extract grid topology related code
* Use shared code in the approval distribution subsystem
* Fix spellcheck issues
* Moe Aggression stuff back to the approval-distribution subsystem
* Cargo fmt
* explicitly tag network requests with version
* fmt
* make PeerSet more aware of versioning
* some generalization of the network bridge to support upgrades
* walk back some renaming
* walk back some version stuff
* extract version from fallback
* remove V1 from NetworkBridgeUpdate
* add accidentally-removed timer
* implement focusing for versioned messages
* fmt
* fix up network bridge & tests
* remove inaccurate version check in bridge
* remove some TODO [now]s
* fix fallout in statement distribution
* fmt
* fallout in gossip-support
* fix fallout in collator-protocol
* fix fallout in bitfield-distribution
* fix fallout in approval-distribution
* fmt
* use never!
* fmt
* gossip-support: be explicit about dimensions
* some guide updates
* update network-bridge to distinguish x and y dimensions
* get everything to compile
* beginnings
* some TODOs
* polkadot runtime: use relevant_authorities
* make gossip topologies per-session
* better formatting
* gossip support: use current session validators
* expand in comment
* adjust tests and fix index bug
* add past/present/future connection test and clean up code
* fmt
* network bridge: updated types
* update protocols to new gossip topology message
* guide updates
* add session to BlockApprovalMeta
* add session to block info
* refactor knowledge and remove most unify logic
* start replacing gossip_peers with new SessionTopologies
* add routing information to message state
* add some utilities to SessionTopology
* implement new gossip topology logic
* re-implement unify_with_peer
* distribute assignments according to topology
* finish grid topology implementation
* refactor network bridge slightly
* issue connection requests on all past/present/future
* fmt
* address grumbles
* tighten invariants in unify_with_peer
* implement random propagation
* refactor: extract required routing adjustment logic
* some block-age logic
* aggressively propagate messages when finality is slow
* overhaul aggression system to have 3 levels
* add aggression metrics
* remove aggression L3
* reduce random circulation
* remove PeerData
* get approval tests compiling
* use btree_map in known_by to make deterministic
* Revert "use btree_map in known_by to make deterministic"
This reverts commit 330d65343a7bb6fe4dd0f24bd8dbc15c0cbdbd9d.
* test XY grid propagation
* remove stray println
* test unshared dimension propagation
* add random gossip check
* test unify_with_peer better
* test sending after getting gossip topology
* test L1 aggression on originator
* test L1 aggression for non-originators
* test non-originator aggression L2
* fnt
* ~spellcheck
* fix statement-distribution tests
* fix flaky test
* fix metrics typo
* re-send periodically
* test resending
* typo
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* add more metrics about apd messages
* add back unify_with_peer logs
* make Resend an enum
* be more explicit when resending
* fmt
* fix error
* add a TODO for refactoring
* remove debug metrics
* add some guide stuff
* fmt
* update runtime API in test-runtim
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Add mmr_root() to pallet-mmr API to expose root from state
* use the right MmrApi primitives
* bridges: use correct mmr primitives
* rococo: beefy-mmr deposit mmr root digest
* fix lockfile
* update lockfile for {"substrate"}
Co-authored-by: parity-processbot <>
* Move `trait ParachainHost` to a separate version independent module
`trait ParachainHost` is no longer part of a specific primitives
version. Instead there is a single trait for stable and staging api
versions. The trait contains stable AND staging methods. The latter are
explicitly marked as unstable.
* Fix `use` primitives
`polkadot_primitives::v2` becomes `polkadot_primitives::runtime_api`
* Staging API declaration and stubs
Introduces the concept for 'staging functions' in runtime API. These
functions are still in testing and they are meant to be used only
within test networks (Westend).
They coexist with the stable calls for technical reasons - maintaining
different runtime APIs for different networks is hard to implement.
Check the doc comments in source files for more details how the staging
API should be used.
* Add new staging method - get_session_disputes()
Add `staging_get_session_disputes` to `ParachainHost` as the first
method of the staging API.
* Hide vstaging runtime api implementations behind feature flag
* Fix test runtime
* fn staging_get_session_disputes() is renamed to fn staging_get_disputes()
The PVF host is designed to avoid spawning tasks to minimize knowledge
of outer code. Using `async_std::task::spawn` (or Tokio's counterpart)
deemed unacceptable, `SpawnNamed` undesirable. Instead there is only one
task returned that is spawned by the candidate-validation subsystem.
The tasks from the sub-components are polled by that root task.
However, the way the tasks are bundled was incorrect. There was a giant
select that was polling those tasks. Particularly, that implies that as soon as
one of the arms of that select goes into await those sub-tasks stop
getting polled. This is a recipe for a deadlock which indeed happened
here.
Specifically, the deadlock happened during sending messages to the
execute queue by calling
[`send_execute`](https://github.com/paritytech/polkadot/blob/a68d9be35656dcd96e378fd9dd3d613af754d48a/node/core/pvf/src/host.rs#L601).
When the channel to the queue reaches the capacity, the control flow is
suspended until the queue handles those messages. Since this code is
essentially reached from [one of the select
arms](https://github.com/paritytech/polkadot/blob/a68d9be35656dcd96e378fd9dd3d613af754d48a/node/core/pvf/src/host.rs#L371),
the queue won't be given the control and thus no further progress can be
made.
This problem is solved by bundling the tasks one level higher instead,
by `selecting` over those long-running tasks.
We also stop treating returning from those long-running tasks as error
conditions, since that can happen during legit shutdown.
This issue happens when some peer sends a good but already known Seconded statement and the statement-distribution code does not update the statements_received field in the peer_knowledge structure. Subsequently, a Valid statement causes out-of-view message that is incorrectly emitted and causes reputation lose.
This PR also introduces a concept of passing the specific pseudo-random generator to subsystems to make it easier to write deterministic tests. This functionality is not really necessary for the specific issue and unit test but it can be useful for other tests and subsystems.
* Rename to BagError
* Additional parameter for 'revert' command
* Set aux revert param to None
* Align to changes in how the WASM executor is configured in `substrate`
* update lockfile for {"substrate"}
* update lockfile for {"substrate"}
* Update substrate
* Update substrate
Co-authored-by: Keith Yeung <kungfukeith11@gmail.com>
Co-authored-by: Davide Galassi <davxy@datawok.net>
Co-authored-by: Shawn Tabrizi <shawntabrizi@gmail.com>
Co-authored-by: parity-processbot <>
* use coarsetime-saturated for the time being
* Revert "use coarsetime-saturated for the time being"
This reverts commit 114263f0e6a98fd94ac1b29aad2efb1c75ada6fc.
* make coarsetime min 0.1.22