* Move NewGossipTopology -> SessionGridTopology outside as this implementation is shared
* Add method to return peers difference between topologies
* Implement basic grid topology usage for the bitfield distribution
* Fix tests
* Oops, fix tests
* Add some tests for random routing
* Add a unit test for topology distribution
* Store the current and the previous topology to match sessions boundaries
* Update tests
* Update node/network/bitfield-distribution/src/lib.rs
Co-authored-by: Andronik <write@reusable.software>
* Update node/network/protocol/src/grid_topology.rs
Co-authored-by: Andronik <write@reusable.software>
* Update node/network/bitfield-distribution/src/lib.rs
Co-authored-by: Andronik <write@reusable.software>
* Add some debug
* Fix tests as HashSet order is undefined
* Move session bounded topology to the common code part
* Fix tests
* Allow to select routing by peer index
* Implement grid topology in the statement distribution subsystem
* Fix tests compilation
* Fix test
* Refactor API slightly
* Address review comments
* Reduce runtime error logging severity
* Update node/network/protocol/src/grid_topology.rs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Update node/network/bitfield-distribution/src/tests.rs
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Fmt run
* Use named struct
* Fix logging stuff
* One more accidental fmt damage
* Increase active queue size and add metrics
Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
* Revert "Increase active queue size and add metrics"
This reverts commit c4f48e8bded6dfeb9c62814ba2f8d815c34b04cf.
* Use validator index to choose the routing strategy
Noted by: @rphmeier
* Fix test after distribution logic fix
Co-authored-by: Andronik <write@reusable.software>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Andrei Sandu <andrei-mihail@parity.io>
* Double grandpa gossip duration.
* Make resend period slightly larger.
So it won't get triggered by additional grandpa delay.
* Bump other values as well.
* Don't change gossip duration on Polkadot.
(and Westend as it is meant to be a testbed for Polkadot)
* Move NewGossipTopology -> SessionGridTopology outside as this implementation is shared
* Add method to return peers difference between topologies
* Implement basic grid topology usage for the bitfield distribution
* Fix tests
* Oops, fix tests
* Add some tests for random routing
* Add a unit test for topology distribution
* Store the current and the previous topology to match sessions boundaries
* Update tests
* Update node/network/bitfield-distribution/src/lib.rs
Co-authored-by: Andronik <write@reusable.software>
* Update node/network/protocol/src/grid_topology.rs
Co-authored-by: Andronik <write@reusable.software>
* Update node/network/bitfield-distribution/src/lib.rs
Co-authored-by: Andronik <write@reusable.software>
* Add some debug
* Fix tests as HashSet order is undefined
Co-authored-by: Andronik <write@reusable.software>
* Initial attempt to extract grid topology related code
* Use shared code in the approval distribution subsystem
* Fix spellcheck issues
* Moe Aggression stuff back to the approval-distribution subsystem
* Cargo fmt
* explicitly tag network requests with version
* fmt
* make PeerSet more aware of versioning
* some generalization of the network bridge to support upgrades
* walk back some renaming
* walk back some version stuff
* extract version from fallback
* remove V1 from NetworkBridgeUpdate
* add accidentally-removed timer
* implement focusing for versioned messages
* fmt
* fix up network bridge & tests
* remove inaccurate version check in bridge
* remove some TODO [now]s
* fix fallout in statement distribution
* fmt
* fallout in gossip-support
* fix fallout in collator-protocol
* fix fallout in bitfield-distribution
* fix fallout in approval-distribution
* fmt
* use never!
* fmt
* gossip-support: be explicit about dimensions
* some guide updates
* update network-bridge to distinguish x and y dimensions
* get everything to compile
* beginnings
* some TODOs
* polkadot runtime: use relevant_authorities
* make gossip topologies per-session
* better formatting
* gossip support: use current session validators
* expand in comment
* adjust tests and fix index bug
* add past/present/future connection test and clean up code
* fmt
* network bridge: updated types
* update protocols to new gossip topology message
* guide updates
* add session to BlockApprovalMeta
* add session to block info
* refactor knowledge and remove most unify logic
* start replacing gossip_peers with new SessionTopologies
* add routing information to message state
* add some utilities to SessionTopology
* implement new gossip topology logic
* re-implement unify_with_peer
* distribute assignments according to topology
* finish grid topology implementation
* refactor network bridge slightly
* issue connection requests on all past/present/future
* fmt
* address grumbles
* tighten invariants in unify_with_peer
* implement random propagation
* refactor: extract required routing adjustment logic
* some block-age logic
* aggressively propagate messages when finality is slow
* overhaul aggression system to have 3 levels
* add aggression metrics
* remove aggression L3
* reduce random circulation
* remove PeerData
* get approval tests compiling
* use btree_map in known_by to make deterministic
* Revert "use btree_map in known_by to make deterministic"
This reverts commit 330d65343a7bb6fe4dd0f24bd8dbc15c0cbdbd9d.
* test XY grid propagation
* remove stray println
* test unshared dimension propagation
* add random gossip check
* test unify_with_peer better
* test sending after getting gossip topology
* test L1 aggression on originator
* test L1 aggression for non-originators
* test non-originator aggression L2
* fnt
* ~spellcheck
* fix statement-distribution tests
* fix flaky test
* fix metrics typo
* re-send periodically
* test resending
* typo
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* add more metrics about apd messages
* add back unify_with_peer logs
* make Resend an enum
* be more explicit when resending
* fmt
* fix error
* add a TODO for refactoring
* remove debug metrics
* add some guide stuff
* fmt
* update runtime API in test-runtim
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
This issue happens when some peer sends a good but already known Seconded statement and the statement-distribution code does not update the statements_received field in the peer_knowledge structure. Subsequently, a Valid statement causes out-of-view message that is incorrectly emitted and causes reputation lose.
This PR also introduces a concept of passing the specific pseudo-random generator to subsystems to make it easier to write deterministic tests. This functionality is not really necessary for the specific issue and unit test but it can be useful for other tests and subsystems.
* Try to fix out-of-view messages in approval distribution
Suggested by: @ordian
* Cargo fmt
* Add a unit test for the proposed fix
* Spelling fix
* Use a simplier approach to fix the race condition as suggested by @rphmeier
* Cargo fmt run
* remove v0 primitives from polkadot-primitives
* first pass: remove v0
* fix fallout in erasure-coding
* remove v1 primitives, consolidate to v2
* the great import update
* update runtime_api_impl_v1 to v2 as well
* guide: add `Version` request for runtime API
* add version query to runtime API
* reintroduce OldV1SessionInfo in a limited way
* Companion PR for removing Prometheus metrics prefix
* Was missing some metrics
* Fix missing renames
* Fix test
* Fixes
* Update test
* Update Substrate
* Second time
* remove prefix from intergration test for zombienet
* update zombienet image
* Update Substrate
Co-authored-by: Bastian Köcher <info@kchr.de>
Co-authored-by: Javier Viola <pepoviola@gmail.com>
* CI: add spellcheck
* revert me
* CI: explicit command for spellchecker
* spellcheck: edit misspells
* CI: run spellcheck on diff
* spellcheck: edits
* spellcheck: edit misspells
* spellcheck: add rules
* spellcheck: mv configs
* spellcheck: more edits
* spellcheck: chore
* spellcheck: one more thing
* spellcheck: and another one
* spellcheck: seems like it doesn't get to an end
* spellcheck: new words after rebase
* spellcheck: new words appearing out of nowhere
* chore
* review edits
* more review edits
* more edits
* wonky behavior
* wonky behavior 2
* wonky behavior 3
* change git behavior
* spellcheck: another bunch of new edits
* spellcheck: new words are koming out of nowhere
* CI: finding the master
* CI: fetching master implicitly
* CI: undebug
* new errors
* a bunch of new edits
* and some more
* Update node/core/approval-voting/src/approval_db/v1/mod.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update xcm/xcm-executor/src/assets.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Apply suggestions from code review
Co-authored-by: Andronik Ordian <write@reusable.software>
* Suggestions from the code review
* CI: scan only changed files
Co-authored-by: Andronik Ordian <write@reusable.software>
* Simplify some Option / Result / ? operator patterns
When they identically match a combinator on those types.
Tool-aided by [comby-rust](https://github.com/huitseeker/comby-rust).
* adjust review comments
Co-authored-by: Shawn Tabrizi <shawntabrizi@gmail.com>
* Factor out runtime module into utils.
* Add maybe_authority information to `PeerConnected` event.
We already gather this information in authority discovery, so we might
as well share it with others.
This opens up an easy path to trigger validators differently from normal
nodes, e.g. for prioritization. This change has become more important
now, that we just connect to all validators and therefore just have a
long peer list without any information about those nodes.
* Test fix.
* tests/av-store: use future::join instead of future::select
* tests/backing: use future::join instead of future::select
* tests/provisioner: use future::join instead of future::select
* tests/av-dist: use future::join instead of future::select
* tests/av-recovery: use future::join instead of future::select
* tests/bridge: use future::join instead of future::select
* tests/collator-protocol: use future::join instead of future::select
* tests/stmt-dist: use future::join instead of future::select
* fix tests
* extract database from av-store itself
* generalize approval-voting over database type
* modes (without handling) and pruning old wakeups
* rework approval importing
* add our_approval_sig to ApprovalEntry
* import assignment
* guide updates for check-full-approval changes
* some aux functions
* send messages when becoming active.
* guide: network bridge sends view updates only when done syncing
* network bridge: send view updates only when done syncing
* tests for new network-bridge behavior
* add a test for updating approval entry with sig
* fix some warnings
* test load-all-blocks
* instantiate new parachains DB
* fix network-bridge empty view updates
* tweak
* fix wasm build, i think
* Update node/core/approval-voting/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* add some versioning to parachains_db
* warnings
* fix merge changes
* remove versioning again
Co-authored-by: Andronik Ordian <write@reusable.software>
* approval-distribution: limit the amount of packets on unify
* guide: fix a typo
* compilation fix
* grammar
* Update roadmap/implementers-guide/src/node/approval/approval-distribution.md
Co-authored-by: David <dvdplm@gmail.com>
* more grammar
* propagate only local assignments/approvals after a certain depth
* increase the threshold
* guides update
Co-authored-by: David <dvdplm@gmail.com>
* WIP
* availability distribution, still very wip.
Work on the requesting side of things.
* Some docs on what I intend to do.
* Checkpoint of session cache implementation
as I will likely replace it with something smarter.
* More work, mostly on cache
and getting things to type check.
* Only derive MallocSizeOf and Debug for std.
* availability-distribution: Cache feature complete.
* Sketch out logic in `FetchTask` for actual fetching.
- Compile fixes.
- Cleanup.
* Format cleanup.
* More format fixes.
* Almost feature complete `fetch_task`.
Missing:
- Check for cancel
- Actual querying of peer ids.
* Finish FetchTask so far.
* Directly use AuthorityDiscoveryId in protocol and cache.
* Resolve `AuthorityDiscoveryId` on sending requests.
* Rework fetch_task
- also make it impossible to check the wrong chunk index.
- Export needed function in validator_discovery.
* From<u32> implementation for `ValidatorIndex`.
* Fixes and more integration work.
* Make session cache proper lru cache.
* Use proper lru cache.
* Requester finished.
* ProtocolState -> Requester
Also make sure to not fetch our own chunk.
* Cleanup + fixes.
* Remove unused functions
- FetchTask::is_finished
- SessionCache::fetch_session_info
* availability-distribution responding side.
* Cleanup + Fixes.
* More fixes.
* More fixes.
adder-collator is running!
* Some docs.
* Docs.
* Fix reporting of bad guys.
* Fix tests
* Make all tests compile.
* Fix test.
* Cleanup + get rid of some warnings.
* state -> requester
* Mostly doc fixes.
* Fix test suite.
* Get rid of now redundant message types.
* WIP
* Rob's review remarks.
* Fix test suite.
* core.relay_parent -> leaf for session request.
* Style fix.
* Decrease request timeout.
* Cleanup obsolete errors.
* Metrics + don't fail on non fatal errors.
* requester.rs -> requester/mod.rs
* Panic on invalid BadValidator report.
* Fix indentation.
* Use typed default timeout constant.
* Make channel size 0, as each sender gets one slot anyways.
* Fix incorrect metrics initialization.
* Fix build after merge.
* More fixes.
* Hopefully valid metrics names.
* Better metrics names.
* Some tests that already work.
* Slightly better docs.
* Some more tests.
* Fix network bridge test.
* add tracing to approval voting
* notify if session info is not working
* add dispute period to chain specs
* propagate genesis session to parachains runtime
* use `on_genesis_session`
* protect against zero cores in computation
* tweak voting rule to be based off of best and add logs
* genesis configuration should use VRF slots only
* swallow more keystore errors
* add some docs
* make validation-worker args non-optional and update clap
* better tracing for bitfield signing and provisioner
* pass amount of bits in bitfields to inclusion instead of recomputing
* debug -> warn for some logs
* better tracing for availability recovery
* a little av-store tracing
* bridge: forward availability recovery messages
* add missing try_from impl
* some more tracing
* improve approval distribution tracing
* guide: hold onto pending approval messages until NewBlocks
* Hold onto pending approval messages until NewBlocks
* guide: adjust comment
* process all actions for one wakeup at a time
* vec
* fix network bridge test
* replace randomness-collective-flip with Babe
* remove PairNotFound
* feat/view: assure heads in a view are sorted
Allows O(n) comparisons, adds an alternate equiv relation
which takes O(n^2) for integrity verification.
Ref #2133
* revert: remove custom PartialEq impl, there are no duplicates
* fix: do not sort the live_heads, that alters the local view
* refactor/view: heads should not be public
* chore/spellcheck: add unfinalized
* fix/view: add missing len() and is_empty() fns
* quirk
* vec is not view
* Update node/network/approval-distribution/src/tests.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update node/network/bridge/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update node/network/protocol/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* fixup comment
* fix botched test
Co-authored-by: Andronik Ordian <write@reusable.software>
* refactor/reputation: unify the values used
* chore/rep: rename Annoy* to Cost*, make duplicate message Cost*Repeated
* fix/reputation: lost and found, convert at the boundary to substrate
* refactor/rep: move conversion to base reputation one level down, left conversions
* fix/rep: order of magnitude adjustments
Thanks pierre!
* remove spaces
* chore/rep: give rationale for order of magnitude
* refactor/rep: move UnifiedReputationChange to separate file
* fix/rep: order of magnitudes correction