* Initial attempt to extract grid topology related code
* Use shared code in the approval distribution subsystem
* Fix spellcheck issues
* Move Aggression stuff back to the approval-distribution subsystem
* Cargo fmt
* explicitly tag network requests with version
* fmt
* make PeerSet more aware of versioning
* some generalization of the network bridge to support upgrades
* walk back some renaming
* walk back some version stuff
* extract version from fallback
* remove V1 from NetworkBridgeUpdate
* add accidentally-removed timer
* implement focusing for versioned messages
* fmt
* fix up network bridge & tests
* remove inaccurate version check in bridge
* remove some TODO [now]s
* fix fallout in statement distribution
* fmt
* fallout in gossip-support
* fix fallout in collator-protocol
* fix fallout in bitfield-distribution
* fix fallout in approval-distribution
* fmt
* use never!
* fmt
* gossip-support: be explicit about dimensions
* some guide updates
* update network-bridge to distinguish x and y dimensions
* get everything to compile
* beginnings
* some TODOs
* polkadot runtime: use relevant_authorities
* make gossip topologies per-session
* better formatting
* gossip support: use current session validators
* expand in comment
* adjust tests and fix index bug
* add past/present/future connection test and clean up code
* fmt
* network bridge: updated types
* update protocols to new gossip topology message
* guide updates
* add session to BlockApprovalMeta
* add session to block info
* refactor knowledge and remove most unify logic
* start replacing gossip_peers with new SessionTopologies
* add routing information to message state
* add some utilities to SessionTopology
* implement new gossip topology logic
* re-implement unify_with_peer
* distribute assignments according to topology
* finish grid topology implementation
* refactor network bridge slightly
* issue connection requests on all past/present/future
* fmt
* address grumbles
* tighten invariants in unify_with_peer
* implement random propagation
* refactor: extract required routing adjustment logic
* some block-age logic
* aggressively propagate messages when finality is slow
* overhaul aggression system to have 3 levels
* add aggression metrics
* remove aggression L3
* reduce random circulation
* remove PeerData
* get approval tests compiling
* use btree_map in known_by to make deterministic
* Revert "use btree_map in known_by to make deterministic"
This reverts commit 330d65343a7bb6fe4dd0f24bd8dbc15c0cbdbd9d.
* test XY grid propagation
* remove stray println
* test unshared dimension propagation
* add random gossip check
* test unify_with_peer better
* test sending after getting gossip topology
* test L1 aggression on originator
* test L1 aggression for non-originators
* test non-originator aggression L2
* fmt
* ~spellcheck
* fix statement-distribution tests
* fix flaky test
* fix metrics typo
* re-send periodically
* test resending
* typo
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* add more metrics about apd messages
* add back unify_with_peer logs
* make Resend an enum
* be more explicit when resending
* fmt
* fix error
* add a TODO for refactoring
* remove debug metrics
* add some guide stuff
* fmt
* update runtime API in test-runtime
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
This issue occurs when a peer sends a valid but already-known Seconded statement and the statement-distribution code does not update the statements_received field in the peer_knowledge structure. A subsequent Valid statement from that peer is then incorrectly treated as an out-of-view message, which causes a reputation loss.
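The failure mode described above can be illustrated with a minimal sketch. It is a simplified model, not the statement-distribution code: `Statement`, `Outcome`, `PeerKnowledge`, `View`, and `on_statement` are hypothetical names, and candidates are reduced to plain `u32` ids. The point it shows is that per-peer knowledge is recorded before the "already known" check, so a duplicate Seconded statement still counts for that peer.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified statement type; the `u32` stands in for a candidate hash.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub enum Statement {
    Seconded(u32),
    Valid(u32),
}

/// Hypothetical result of processing one incoming statement.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Fresh,
    AlreadyKnown,
    OutOfView,
}

/// Hypothetical per-peer record of which candidates the peer has seconded to us.
#[derive(Default)]
pub struct PeerKnowledge {
    pub seconded_received: HashSet<u32>,
}

/// Hypothetical local store of every statement we already hold, from any peer.
#[derive(Default)]
pub struct View {
    known: HashSet<Statement>,
}

impl View {
    pub fn on_statement(&mut self, peer: &mut PeerKnowledge, statement: Statement) -> Outcome {
        match statement {
            Statement::Seconded(candidate) => {
                // Record what this peer told us BEFORE the dedup check below.
                // Skipping this for duplicates is the bug pattern described above.
                peer.seconded_received.insert(candidate);
            }
            Statement::Valid(candidate) => {
                // A Valid statement is only in view if this peer seconded the candidate.
                if !peer.seconded_received.contains(&candidate) {
                    return Outcome::OutOfView;
                }
            }
        }
        if self.known.insert(statement) {
            Outcome::Fresh
        } else {
            Outcome::AlreadyKnown
        }
    }
}
```

In this model, a second peer that repeats a known Seconded statement gets `AlreadyKnown`, and its later Valid statement for the same candidate is accepted instead of being flagged `OutOfView`.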
This PR also introduces passing a specific pseudo-random generator to subsystems, which makes it easier to write deterministic tests. This functionality is not strictly necessary for this issue or its unit test, but it can be useful for other tests and subsystems.
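As a rough illustration of the generator-injection idea, here is a dependency-free sketch. Everything in it is an assumption made for the example: `RandomSource` is a stand-in for a real RNG trait, `SplitMix64` is a small deterministic generator chosen only to avoid external crates, and `GossipSubsystem` with `pick_random_peers` is a hypothetical subsystem, not the actual subsystem API.

```rust
/// Hypothetical minimal random-source trait (a stand-in for a real RNG trait).
pub trait RandomSource {
    fn next_u64(&mut self) -> u64;
}

/// SplitMix64: a small deterministic generator, used so the sketch needs no crates.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn from_seed(seed: u64) -> Self {
        Self { state: seed }
    }
}

impl RandomSource for SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical subsystem that owns an injected generator instead of using a global one.
pub struct GossipSubsystem<R: RandomSource> {
    rng: R,
}

impl<R: RandomSource> GossipSubsystem<R> {
    pub fn new(rng: R) -> Self {
        Self { rng }
    }

    /// Pick up to `count` distinct peers with a partial Fisher-Yates shuffle.
    /// The modulo introduces a slight bias, which is acceptable for a sketch.
    pub fn pick_random_peers(&mut self, peers: &[u32], count: usize) -> Vec<u32> {
        let mut pool = peers.to_vec();
        let take = count.min(pool.len());
        for i in 0..take {
            let remaining = (pool.len() - i) as u64;
            let j = i + (self.rng.next_u64() % remaining) as usize;
            pool.swap(i, j);
        }
        pool.truncate(take);
        pool
    }
}
```

A test can construct two subsystems from the same seed and expect identical peer selections, while production code would pass in an entropy-seeded generator through the same constructor.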
* split metrics from bitfield signing
* cleanup all logging
* add a unit test for subset generation
* chore: add one more test to assert need is properly represented
* u8 as usize
* chore: overseer fixin
* fix test
* Update node/network/bitfield-distribution/src/metrics.rs
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* Update node/network/bitfield-distribution/src/metrics.rs
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* fallout from suggested rename
* consistency
Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* Use more reasonable buckets for `process_msg` histogram
* Another adjustment of the buckets
* Move histogram timer to a more relevant place
* Add a dedicated collation distribution time metric
* Cargo fmt
* Try to fix out-of-view messages in approval distribution
Suggested by: @ordian
* Cargo fmt
* Add a unit test for the proposed fix
* Spelling fix
* Use a simpler approach to fix the race condition as suggested by @rphmeier
* Cargo fmt run
* remove v0 primitives from polkadot-primitives
* first pass: remove v0
* fix fallout in erasure-coding
* remove v1 primitives, consolidate to v2
* the great import update
* update runtime_api_impl_v1 to v2 as well
* guide: add `Version` request for runtime API
* add version query to runtime API
* reintroduce OldV1SessionInfo in a limited way
* Add a simple metric for statements out-of-view
* Avoid repeated out-of-view peer reputation change messages
* Log reporting status
* Address review comments
* Use counter to store a number of unexpected messages from a peer
* Distinguish different unexpected statements in the metrics
* Fix labels cardinality
* Rename metric name to `statements_unexpected`
* Move metrics to a separate unit, avoid unnecessary enum
* Prefer specific methods in lieu of public constants
* seed commit for fatality based errors
* fatality
* first draft of fatality
* cleanup
* different approach
* simplify
* first working version for enums, with documentation
* add split
* fix simple split test case
* extend README.md
* update fatality impl
* make tests pass
* apply fatality to first subsystem
* fatality fixes
* use fatality in a subsystem
* fix subsystem
* fixup proc macro
* fix/test: log::*! do not execute when log handler is missing
* fix spelling
* rename Runtime2 to something sane
* allow nested split with `forward` annotations
* add free license
* enable and fixup all tests
* use external fatality
Makes this more reviewable.
* bump fatality dep
Avoid duplicate expander compilations.
* migrate availability distribution
* more fatality usage
* chore: bump fatality to 0.0.6
* fixup remaining subsystems
* chore: fmt
* make cargo spellcheck happy
* remove single instance of `#[fatal(false)]`
* last quality sweep
* fixup
* Revert "collator-protocol: fix wrong warning (#4909)"
This reverts commit 128421b5dd.
* Revert "collator-protocol: short-term fixes for connectivity (#4640)"
This reverts commit aff88a864a.
* make the slots great again
Co-authored-by: Andronik <write@reusable.software>