* Happy New Year!
* Remove year entirely
Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* Remove years from copyright notice in the entire repo
---------
Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
* Pass the PerLeafSpan as mutable reference to handle_new_head function
* cargo +nightly fmt --all
* Add mock span for test
* cargo +nightly fmt --all
* add new-blocks-hashes to span
* ref span in match statement, set span to disabled if not passed
* remove second match clause, make handle_new_head_span mutable
* cargo +nightly fmt --all
* improve tag on error and warning
* add imported blocks and info span
* cargo +nightly fmt --all
* Improve error for imported_blocks_and_info trace
* format tags on get_header_span
* add lost-to-finality tag
* add missing bracket
* - Add bitfield child span
- Add block db insertion span
* - fix update-bitfield span tag
* - Fix type conversion to u64
- Add missing argument
* - Cargo fmt
* - Test add_follows_from
* - Revert as relationship between spans not working correctly
* - use drop to test if parent-child relationship can be re-established
* - remove bitfield span, check if parent-child relationship can be reestablished
* - Remove dangling bitfield span which is not used, to see if parent-child relationship can be re-established
* Another dangling bitfield span
* cargo fmt
* - add imported blocks and info span
- add candidate span per candidate
* add tags before moving block_header to push scope
* - Add db-insertion span
* cargo fmt
* fix types
* * Pass mutable reference to span in handle_new_head
* Change get-header-span tags in handle_new_head
* Create cache-session-info span in handle_new_head
* Create optional argument in determine_new_blocks
* Pass mutable reference to handle_new_head_span in determine_new_blocks in handle_new_head function
* Add candidate-hash, candidate-number, lost-to-finality tags to candidate_span in handle_new_head function
* Manually drop db_insertion_span and remove superfluous tags to it, only keeping approved-bitfields tag
* Add ApprovalVoting stage in jaeger
* * Pass mutable reference to jaeger::Span instead of PerLeafSpan
* Add block-import span
* *Pass optional_span (optional argument) to determine_new_blocks util function
* * Add num-candidates int tag to block_import_span
* * Add head tag to cache_session_span
* * Create PerLeafSpan in handle_from_overseer (this is required to establish parent-child relationship between approval-voting span, and leaf-activated root span)
* * Add candidate-import-span as child of block-import-span
* Add candidate-hash and num-approval tags to candidate-import-span
* * Fix num-candidate tag to bitvec-len tag in candidate-import-span
* *Fix imported_blocks_and_info span to create new-block-span as not dealing with candidates
* Consider the future::select! block
* Use HashMap<Hash, jaeger::PerLeafSpan>
* Remove Stage 9
* Add missing spans
* cargo +nightly fmt --all
* Remove optional span argument for determine_new_blocks
* * Remove no-longer needed default PerLeafSpan implementation
* Remove no-longer necessary mock span given re-factoring of handle_new_head() no longer needing mutable span
* Split validation-result and request-data (availability and validation code) spans into two by dropping request_validation_data_spans
* Remove drop statements for cache_session_info_span
* Remove unnecessary span
* Remove another excessively spammy span
* Add missing spans from State in import tests
* Use functional approach to get spans
* - Add functional approach for the approval-voting span
- Add doc on block_numbers given labelling ambiguity
- Add span pruning logic
- Use .add_para_id on validation_result_span
* Replace for hash_set in hash_set_iter with map closure
* cargo +nightly fmt --all
* Change from unconsumed `map` to `.for_each`
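Why the unconsumed `map` had to go can be shown with a minimal sketch. Rust iterator adapters are lazy, so a `map` that is never consumed never runs its closure. The function and variable names below are hypothetical and not taken from the actual change.

```rust
use std::collections::HashSet;

/// Collects every hash in the set by running a side-effecting closure.
/// `visit_all` and `visited` are hypothetical names for illustration.
pub fn visit_all(hashes: &HashSet<u64>) -> Vec<u64> {
    let mut visited = Vec::new();
    // `for_each` consumes the iterator, so the closure runs for every element.
    // A bare `hashes.iter().map(|h| visited.push(*h));` would only build a
    // lazy adapter, never run the closure, and trigger `unused_must_use`.
    hashes.iter().for_each(|h| visited.push(*h));
    // HashSet iteration order is unspecified, so sort for a stable result.
    visited.sort_unstable();
    visited
}
```
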
* cargo +nightly fmt --all
* Refactor add_para_id to validation_result_span
* cargo +nightly fmt --all
* Remove duplicate tag
* Add missing tag to handle-approved-ancestor span
* Refactor span pruning to only invoke retain once
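A rough sketch of single-pass pruning, assuming spans are kept in a `HashMap` keyed by block hash. `Span`, `prune_spans` and the `u64` hash type are placeholders, not the real types.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a per-block tracing span.
pub struct Span(pub &'static str);

/// Drops every span whose block hash is in `pruned`, in a single pass.
pub fn prune_spans(spans: &mut HashMap<u64, Span>, pruned: &HashSet<u64>) {
    // One `retain` call walks the map once; calling `retain` again for each
    // pruned hash would walk the whole map every time.
    spans.retain(|hash, _| !pruned.contains(hash));
}
```
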
* Typo in span name
* - Replace unwrap_or with unwrap_or_else due to lazy evaluation of trace-identifier in polkadot_node_jaeger
- Remove some redundant spans
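The difference between `unwrap_or` and `unwrap_or_else` can be made observable with a counter. This is a standalone sketch: `fallback` is a hypothetical stand-in for whatever expensive value the real code builds, and nothing here uses `polkadot_node_jaeger`.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for an expensive fallback value.
/// It bumps `calls` every time it runs so evaluation can be observed.
pub fn fallback(calls: &Cell<u32>) -> u32 {
    calls.set(calls.get() + 1);
    0
}

/// Eager: the argument to `unwrap_or` is evaluated before the call,
/// even when `value` is `Some`.
pub fn eager(value: Option<u32>, calls: &Cell<u32>) -> u32 {
    value.unwrap_or(fallback(calls))
}

/// Lazy: the closure passed to `unwrap_or_else` runs only for `None`.
pub fn lazy(value: Option<u32>, calls: &Cell<u32>) -> u32 {
    value.unwrap_or_else(|| fallback(calls))
}
```

With `Some(7)`, `eager` still runs `fallback` once, while `lazy` does not run it at all.
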
* Add approval-distribution spans
* - Add unwrap_or_else on note-approved-in-chain-selection
- Use child_with_trace_id to add traceID string tag on span (note this does not change the traceID, but just adds a tag)
* cargo +nightly fmt --all
* - Add traceID tags where necessary in approval-voting and availability-distribution
- Always use block-hash tag instead of relay-parent tag in approval-distribution
* Remove schedule-wakeup span as it will duplicate spans on existing wakeups (which should be a no-op)
* Remove a couple of warnings related to mutability
* Fix failing tests in availability distribution
* Add traceID tag to launch-approval and validation-result
* Reshuffle the validation and validation result spans to where more appropriate and add block-hash tag
* - Add tranche and should-trigger tag to process-wakeup span
- Add candidate-hash and traceID to check-and-import-approval span
* cargo fmt
* - Adjustments after PR comments
* Move span pruning after other pruning logic
* Remove DerefMut - no longer needed
* Relabel request-chunk spans
* - Fix typo in span label
- Add docs for drops
* Add new approval-voting span pruning logic
* Undo removal of !
* cargo fmt
* seed commit for fatality based errors
* fatality
* first draft of fatality
* cleanup
* different approach
* simplify
* first working version for enums, with documentation
* add split
* fix simple split test case
* extend README.md
* update fatality impl
* make tests pass
* apply fatality to first subsystem
* fatality fixes
* use fatality in a subsystem
* fix subsystem
* fixup proc macro
* fix/test: log::*! do not execute when log handler is missing
* fix spelling
* rename Runtime2 to something sane
* allow nested split with `forward` annotations
* add free license
* enable and fixup all tests
* use external fatality
Makes this more reviewable.
* bump fatality dep
Avoid duplicate expander compilations.
* migrate availability distribution
* more fatality usage
* chore: bump fatality to 0.0.6
* fixup remaining subsystems
* chore: fmt
* make cargo spellcheck happy
* remove single instance of `#[fatal(false)]`
* last quality sweep
* fixup
* WIP: Get rid of request multiplexer.
* WIP
* Receiver for handling of incoming requests.
* Get rid of useless `Fault` abstraction.
The things the type system lets us do are not worth abstracting into
their own type. Instead error handling is going to be merely a pattern.
* Make most things compile again.
* Port availability distribution away from request multiplexer.
* Formatting.
* Port dispute distribution over.
* Fixup statement distribution.
* Handle request directly in collator protocol.
+ Only allow fatal errors at top level.
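The "only fatal errors reach the top level" pattern could look roughly like the hand-rolled sketch below. The `Error` variants, `is_fatal` and `log_error` are hypothetical names chosen for illustration; this is not the real subsystem error type and does not use the `fatality` crate.

```rust
use std::fmt;

/// Hypothetical error type; the variants are illustrative only.
#[derive(Debug, PartialEq)]
pub enum Error {
    /// Unrecoverable: the subsystem should shut down.
    ChannelClosed,
    /// Recoverable: log it and carry on.
    InvalidRequest(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::ChannelClosed => write!(f, "channel closed"),
            Error::InvalidRequest(why) => write!(f, "invalid request: {}", why),
        }
    }
}

impl Error {
    /// Whether this error must be propagated to the top level.
    pub fn is_fatal(&self) -> bool {
        matches!(self, Error::ChannelClosed)
    }
}

/// Logs and swallows non-fatal errors; passes fatal ones upwards unchanged.
pub fn log_error(result: Result<(), Error>) -> Result<(), Error> {
    match result {
        Err(err) if !err.is_fatal() => {
            eprintln!("non-fatal error: {}", err);
            Ok(())
        }
        other => other,
    }
}
```

A subsystem's main loop would then use `log_error(step())?`, so that `?` only ever returns on fatal errors.
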
* Use direct request channel for availability recovery.
* Finally get rid of request multiplexer
Fixes #2842 and paves the way for more back pressure possibilities.
* Fix overseer and statement distribution tests.
* Fix collator protocol and network bridge tests.
* Fix tests in availability recovery.
* Fix availability distribution tests.
* Fix dispute distribution tests.
* Add missing dependency
* Typos.
* Review remarks.
* More remarks.
* Remove signature verification in backing.
`SignedFullStatement` now signals that the signature has already been
checked.
* Remove unused check_payload function.
* Introduced unchecked signed variants.
* Fix inclusion to use unchecked variant.
* More unchecked variants.
* Use unchecked variants in protocols.
* Start fixing statement-distribution.
* Fixup statement distribution.
* Fix inclusion.
* Fix warning.
* Fix backing properly.
* Fix bitfield distribution.
* Make crypto store optional for `RuntimeInfo`.
* Factor out utility functions.
* get_group_rotation_info
* WIP: Collator cleanup + check signatures.
* Convenience signature checking functions.
* Check signature on collator-side.
* Fix warnings.
* Fix collator side tests.
* Get rid of warnings.
* Better Signed/UncheckedSigned implementation.
Also get rid of Encode/Decode for Signed! *party*
* Get rid of dead code.
* Move Signed in its own module.
* into_checked -> try_into_checked
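The checked/unchecked split can be pictured as a type-state pattern: data from the wire is `UncheckedSigned`, and a `Signed` value can only be obtained after verification. The sketch below is an assumption-laden toy. The field layout, the `u64` key and the `toy_sign` checksum are all hypothetical, and the checksum is not real cryptography.

```rust
/// Payload plus signature that has NOT been verified yet.
#[derive(Debug, Clone, PartialEq)]
pub struct UncheckedSigned {
    payload: Vec<u8>,
    signature: u64,
}

/// Holding a `Signed` is proof that the signature was checked (or produced
/// locally), because the inner field is private to this module.
#[derive(Debug, Clone, PartialEq)]
pub struct Signed(UncheckedSigned);

/// Toy keyed checksum standing in for a real signature scheme.
fn toy_sign(payload: &[u8], key: u64) -> u64 {
    payload
        .iter()
        .fold(key, |acc, b| acc.wrapping_mul(31).wrapping_add(u64::from(*b)))
}

impl UncheckedSigned {
    /// Verifies the signature; on failure the unchecked value is handed back.
    pub fn try_into_checked(self, key: u64) -> Result<Signed, UncheckedSigned> {
        if toy_sign(&self.payload, key) == self.signature {
            Ok(Signed(self))
        } else {
            Err(self)
        }
    }
}

impl Signed {
    /// Signs a payload locally; no later check is needed for the result.
    pub fn sign(payload: Vec<u8>, key: u64) -> Self {
        let signature = toy_sign(&payload, key);
        Signed(UncheckedSigned { payload, signature })
    }

    /// Read access to the verified payload.
    pub fn payload(&self) -> &[u8] {
        &self.0.payload
    }

    /// Downgrades to the unchecked form, e.g. before sending over the network.
    pub fn into_unchecked(self) -> UncheckedSigned {
        self.0
    }
}
```
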
* Fix merge.
* Factor out runtime module into utils.
* First fatal error design.
* Better error handling infra.
* Error handling cleanup.
* Send to peers of our group first.
* Finish backing group prioritization.
* Little cleanup.
* More cleanup.
* Forgot to checkin error.rs.
* Notes.
* Runtime -> RuntimeInfo
* qed in debug assert.
* PolkaErr -> Fault.
* Factor out runtime module into utils.
* Add maybe_authority information to `PeerConnected` event.
We already gather this information in authority discovery, so we might
as well share it with others.
This opens up an easy path to trigger validators differently from normal
nodes, e.g. for prioritization. This change has become more important
now that we just connect to all validators and therefore just have a
long peer list without any information about those nodes.
* Test fix.
* Indentation fix.
* Prepare request-response for PoV fetching.
* Drop old PoV distribution.
* WIP: Fetch PoV directly from backing.
* Backing compiles.
* Runtime access and connection management for PoV distribution.
* Get rid of seemingly dead code.
* Implement PoV fetching.
Backing does not yet use it.
* Don't send `ConnectToValidators` for empty list.
* Even better - no need to check over and over again.
* PoV fetching implemented.
+ Typechecks
+ Should work
Missing:
- Guide
- Tests
- Do fallback fetching in case fetching from seconding validator fails.
* Check PoV hash upon reception.
* Implement retry of PoV fetching in backing.
* Avoid pointless validation spawning.
* Add jaeger span to pov requesting.
* Add back tracing.
* Review remarks.
* Whitespace.
* Whitespace again.
* Cleanup + fix tests.
* Log to log target in overseer.
* Fix more tests.
* Don't fail if group cannot be found.
* Simple test for PoV fetcher.
* Handle missing group membership better.
* Add test for retry functionality.
* Fix flaky test.
* Spaces again.
* Guide updates.
* Spaces.
* WIP: Whole subsystem test.
* New tests compile.
* Avoid needless runtime queries for no validator nodes.
* Make tx and rx publicly accessible in virtual overseer.
This simplifies mocking in some cases, as tx can be cloned, but rx can
not.
* Whole subsystem test working.
* Update node/network/availability-distribution/src/session_cache.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update node/network/availability-distribution/src/session_cache.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Document better what `None` return value means.
* Get rid of BitVec dependency.
* Update Cargo.lock
* Hopefully fixed implementers guide build.
Co-authored-by: Andronik Ordian <write@reusable.software>
* WIP
* availability distribution, still very wip.
Work on the requesting side of things.
* Some docs on what I intend to do.
* Checkpoint of session cache implementation
as I will likely replace it with something smarter.
* More work, mostly on cache
and getting things to type check.
* Only derive MallocSizeOf and Debug for std.
* availability-distribution: Cache feature complete.
* Sketch out logic in `FetchTask` for actual fetching.
- Compile fixes.
- Cleanup.
* Format cleanup.
* More format fixes.
* Almost feature complete `fetch_task`.
Missing:
- Check for cancel
- Actual querying of peer ids.
* Finish FetchTask so far.
* Directly use AuthorityDiscoveryId in protocol and cache.
* Resolve `AuthorityDiscoveryId` on sending requests.
* Rework fetch_task
- also make it impossible to check the wrong chunk index.
- Export needed function in validator_discovery.
* From<u32> implementation for `ValidatorIndex`.
* Fixes and more integration work.
* Make session cache proper lru cache.
* Use proper lru cache.
* Requester finished.
* ProtocolState -> Requester
Also make sure to not fetch our own chunk.
* Cleanup + fixes.
* Remove unused functions
- FetchTask::is_finished
- SessionCache::fetch_session_info
* availability-distribution responding side.
* Cleanup + Fixes.
* More fixes.
* More fixes.
adder-collator is running!
* Some docs.
* Docs.
* Fix reporting of bad guys.
* Fix tests
* Make all tests compile.
* Fix test.
* Cleanup + get rid of some warnings.
* state -> requester
* Mostly doc fixes.
* Fix test suite.
* Get rid of now redundant message types.
* WIP
* Rob's review remarks.
* Fix test suite.
* core.relay_parent -> leaf for session request.
* Style fix.
* Decrease request timeout.
* Cleanup obsolete errors.
* Metrics + don't fail on non fatal errors.
* requester.rs -> requester/mod.rs
* Panic on invalid BadValidator report.
* Fix indentation.
* Use typed default timeout constant.
* Make channel size 0, as each sender gets one slot anyways.
* Fix incorrect metrics initialization.
* Fix build after merge.
* More fixes.
* Hopefully valid metrics names.
* Better metrics names.
* Some tests that already work.
* Slightly better docs.
* Some more tests.
* Fix network bridge test.
* feat/view: assure heads in a view are sorted
Allows O(n) comparisons, adds an alternate equiv relation
which takes O(n^2) for integrity verification.
Ref #2133
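A minimal sketch of why sorted heads help, assuming heads are unique. `View` here is a simplified stand-in with `u64` heads, not the real type, and `same_heads_unsorted` is a hypothetical helper showing the quadratic alternative.

```rust
/// Hypothetical simplified `View`: heads are private and always sorted.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct View {
    heads: Vec<u64>,
}

impl View {
    /// Sorts on construction, so the derived `PartialEq` can compare two
    /// views element by element in O(n).
    pub fn new(mut heads: Vec<u64>) -> Self {
        heads.sort_unstable();
        View { heads }
    }

    /// Number of heads in the view.
    pub fn len(&self) -> usize {
        self.heads.len()
    }

    /// Whether the view contains no heads.
    pub fn is_empty(&self) -> bool {
        self.heads.is_empty()
    }
}

/// Order-insensitive comparison of unsorted slices in O(n^2).
/// Only correct under the assumption that neither slice has duplicates.
pub fn same_heads_unsorted(a: &[u64], b: &[u64]) -> bool {
    a.len() == b.len() && a.iter().all(|x| b.contains(x))
}
```
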
* revert: remove custom PartialEq impl, there are no duplicates
* fix: do not sort the live_heads, that alters the local view
* refactor/view: heads should not be public
* chore/spellcheck: add unfinalized
* fix/view: add missing len() and is_empty() fns
* quirk
* vec is not view
* Update node/network/approval-distribution/src/tests.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update node/network/bridge/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update node/network/protocol/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* fixup comment
* fix botched test
Co-authored-by: Andronik Ordian <write@reusable.software>
* refactor/reputation: unify the values used
* chore/rep: rename Annoy* to Cost*, make duplicate message Cost*Repeated
* fix/reputation: lost and found, convert at the boundary to substrate
* refactor/rep: move conversion to base reputation one level down, left conversions
* fix/rep: order of magnitude adjustments
Thanks pierre!
* remove spaces
* chore/rep: give rationale for order of magnitude
* refactor/rep: move UnifiedReputationChange to separate file
* fix/rep: order of magnitudes correction
* Move NetworkBridgeEvent to subsystem::messages.
It is not protocol related at all, it is in fact only part of the
subsystem communication as it gets wrapped into messages of each
subsystem.
* Request/response infrastructure is taking shape.
WIP: Does not compile.
* Multiplexer variant not supported by Rust's type system.
* request_response::request type checks.
* Cleanup.
* Minor fixes for request_response.
* Implement request sending + move multiplexer.
Request multiplexer is moved to bridge, as there the implementation is
more straightforward as we can specialize on `AllMessages` for the
multiplexing target.
Sending of requests is mostly complete, apart from a few `From`
instances. Receiving is also almost done, initialization needs to be
fixed and the multiplexer needs to be invoked.
* Remove obsolete multiplexer.
* Initialize bridge with multiplexer.
* Finish generic request sending/receiving.
Subsystems are now able to receive and send requests and responses via
the overseer.
* Doc update.
* Fixes.
* Link issue for not yet implemented code.
* Fixes suggested by @ordian - thanks!
- start encoding at 0
- don't crash on zero protocols
- don't panic on not yet implemented request handling
* Update node/network/protocol/src/request_response/v1.rs
Use index 0 instead of 1.
Co-authored-by: Andronik Ordian <write@reusable.software>
* Update node/network/protocol/src/request_response.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Fix existing tests.
* Better avoidance of division by zero errors.
* Doc fixes.
* send_request -> start_request.
* Fix missing renamings.
* Update substrate.
* Pass TryConnect instead of true.
* Actually import `IfDisconnected`.
* Fix wrong import.
* Update node/network/bridge/src/lib.rs
typo
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
* Update node/network/bridge/src/multiplexer.rs
Remove redundant import.
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
* Stop doing tracing from within `From` instance.
Thanks for the catch @tomaka!
* Get rid of redundant import.
* Formatting cleanup.
* Fix tests.
* Add link to issue.
* Clarify comments some more.
* Fix tests.
* Formatting fix.
* tabs
* Fix link
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Use map_err.
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Improvements inspired by suggestions by @drahnr.
- Channel size is now determined by function.
- Explicitly scope NetworkService::start_request.
Co-authored-by: Andronik Ordian <write@reusable.software>
Co-authored-by: Pierre Krieger <pierre.krieger1708@gmail.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
* Do not send empty view updates to peers
It happened that we sent empty view updates to our peers, because we
only updated our finalized block. This could lead to situations where we
overwhelmed subsystems with too many messages. On Rococo this led to
constant restarts of our nodes, because some node apparently was
finalizing a lot of blocks.
To prevent this, the pr is doing the following:
1. If a peer sends us an empty view, we report this peer and decrease its
reputation.
2. We ensure that we only send a view update when the `heads` changed
and not only the `finalized_number`.
3. We do not send empty `ActiveLeavesUpdates` from the overseer, as it
makes no sense to send these empty updates. If some subsystem is relying
on the finalized block, it needs to listen for the overseer signal.
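The sending rule in point 2 can be sketched as a small decision function. `LocalView` and `view_update` are hypothetical names, and the types are simplified, with `u64` standing in for block hashes. Treating an empty `heads` list as "never send" is an assumption drawn from the title of this change.

```rust
/// Hypothetical simplified view: active heads plus the finalized number.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct LocalView {
    pub heads: Vec<u64>,
    pub finalized_number: u32,
}

/// Returns the view to send to peers, or `None` when nothing should go out.
pub fn view_update(last_sent: &LocalView, current: &LocalView) -> Option<LocalView> {
    // Skip empty views, and skip updates where only `finalized_number` moved.
    if current.heads.is_empty() || current.heads == last_sent.heads {
        None
    } else {
        Some(current.clone())
    }
}
```
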
* Update node/network/bridge/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Don't work if there are no added heads
* Fix test
* Ahhh
* More fixes
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Store all chunks and in a single transaction
* Adds chunks LRU to store
* Add pruning records metrics
* Use honest cache instead of LRU
* Remove unnecessary optional cache
* Fix review nits that are fixable
* Add one Jaeger span per relay parent
This adds one Jaeger span per relay parent, instead of always creating
new spans per relay parent. This should improve the UI view, because
subsystems are now grouped below one common span.
* Fix doc tests
* Replace `PerLeaveSpan` with `PerLeafSpan`
* More renaming
* Moare
* Update node/subsystem/src/lib.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* Skip the spans
* Increase `spec_version`
Co-authored-by: Andronik Ordian <write@reusable.software>
* guide: add candidate information to OccupiedCore
* add descriptor and hash to occupied core type
* guide: add candidate hash to inclusion
* runtime: return candidate info in core state
* bitfield signing: stop querying runtime as much
* minimize going to runtime in availability distribution
* fix availability distribution tests
* guide: remove para ID from Occupied core
* get all crates compiling
* Fix bug and further optimizations in availability distribution
- There was a bug that resulted in only getting one candidate per block
as the candidates were put into the hashmap with the relay block hash as
key. The solution for this is to use the candidate hash and the relay
block hash as key.
- We stored received/sent messages with the candidate hash and chunk
index as key. The candidate hash wasn't required in this case, as the
messages are already stored per candidate.
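The first bug is easy to reproduce in miniature: keying a map by relay block hash alone lets later candidates overwrite earlier ones. The tuple layout and function names below are hypothetical, with `u64` standing in for both hash types.

```rust
use std::collections::HashMap;

/// Items are `(relay_block_hash, candidate_hash, data)`.
/// Buggy keying: candidates sharing a relay block collide, one entry survives.
pub fn keyed_by_relay_parent(items: &[(u64, u64, &'static str)]) -> HashMap<u64, &'static str> {
    items
        .iter()
        .map(|&(relay, _candidate, data)| (relay, data))
        .collect()
}

/// Fixed keying: the composite key keeps every candidate of a relay block.
pub fn keyed_by_candidate_and_relay_parent(
    items: &[(u64, u64, &'static str)],
) -> HashMap<(u64, u64), &'static str> {
    items
        .iter()
        .map(|&(relay, candidate, data)| ((candidate, relay), data))
        .collect()
}
```
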
* Update node/core/bitfield-signing/src/lib.rs
Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
* Remove the reverse map
* major refactor of receipts & query_live
* finish refactoring
remove ancestry mapping,
improve relay-parent cleanup & receipts-cache cleanup,
add descriptor to `PerCandidate`
* rename and rewrite query_pending_availability
* add a bunch of consistency tests
* Add some last changes
* xy
* fz
* Make it compile again
* Fix one test
* Fix logging
* Remove some buggy code
* Make tests work again
* Move stuff around
* Remove dbg
* Remove state from test_harness
* More refactor and new test
* New test and fixes
* Move metric
* Remove "duplicated code"
* Fix tests
* New test
* Change break to continue
* Update node/core/av-store/src/lib.rs
* Update node/core/av-store/src/lib.rs
* Update node/core/bitfield-signing/src/lib.rs
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
* update guide to match live_candidates changes
* add comment
* fix bitfield signing
Co-authored-by: Robert Habermeier <rphmeier@gmail.com>
Co-authored-by: Bernhard Schuster <bernhard@ahoi.io>
Co-authored-by: Fedor Sakharov <fedor.sakharov@gmail.com>
* refactor View to include finalized_number
* guide: update the NetworkBridge on BlockFinalized
* av-store: fix the tests
* actually fix tests
* grumbles
* ignore macro doctest
* use Hash::repeat_bytes more consistently
* broadcast empty leaves updates as well
* fix issuing view updates on empty leaves updates
* use snake_case for log targets
* remove unused continue
* validator_discovery: when disconnecting, use all addresses
* validator_discovery: simplify request revocation
* fix a typo
* reexport prometheus-super for ease of use of other subsystems
* add some prometheus timers for collation generation subsystem
* add timing metrics to av-store
* add metrics to candidate backing
* add timing metric to bitfield signing
* add timing metrics to candidate selection
* add timing metrics to candidate-validation
* add timing metrics to chain-api
* add timing metrics to provisioner
* add timing metrics to runtime-api
* add timing metrics to availability-distribution
* add timing metrics to bitfield-distribution
* add timing metrics to collator protocol: collator side
* add timing metrics to collator protocol: validator side
* fix candidate validation test failures
* add timing metrics to pov distribution
* add timing metrics to statement-distribution
* use substrate_prometheus_endpoint prometheus reexport instead of prometheus_super
* don't include JOB_DELAY in bitfield-signing metrics
* give adder-collator ability to easily export its genesis-state and validation code
* wip: adder-collator pushbutton script
* don't attempt to register the adder-collator automatically
Instead, get these values with
```sh
target/release/adder-collator export-genesis-state
target/release/adder-collator export-genesis-wasm
```
And then register the parachain on https://polkadot.js.org/apps/?rpc=ws%3A%2F%2F127.0.0.1%3A9944#/explorer
To collect prometheus data, after running the script, create `prometheus.yml` per the instructions
at https://www.notion.so/paritytechnologies/Setting-up-Prometheus-locally-835cb3a9df7541a781c381006252b5ff
and then run:
```sh
docker run -v `pwd`/prometheus.yml:/etc/prometheus/prometheus.yml:z --network host prom/prometheus
```
Demonstrates that data makes it across to prometheus, though it is likely to be useful in the future
to tweak the buckets.
* Update parachain/test-parachains/adder/collator/src/cli.rs
Co-authored-by: Andronik Ordian <write@reusable.software>
* use the grandpa-pause parameter
* skip metrics in tracing instrumentation
* remove unnecessary grandpa_pause cli param
Co-authored-by: Andronik Ordian <write@reusable.software>
* drop in tracing to replace log
* add structured logging to trace messages
* add structured logging to debug messages
* add structured logging to info messages
* add structured logging to warn messages
* add structured logging to error messages
* normalize spacing and Display vs Debug
* add instrumentation to the various 'fn run'
* use explicit tracing module throughout
* fix availability distribution test
* don't double-print errors
* remove further redundancy from logs
* fix test errors
* fix more test errors
* remove unused kv_log_macro
* fix unused variable
* add tracing spans to collation generation
* add tracing spans to av-store
* add tracing spans to backing
* add tracing spans to bitfield-signing
* add tracing spans to candidate-selection
* add tracing spans to candidate-validation
* add tracing spans to chain-api
* add tracing spans to provisioner
* add tracing spans to runtime-api
* add tracing spans to availability-distribution
* add tracing spans to bitfield-distribution
* add tracing spans to network-bridge
* add tracing spans to collator-protocol
* add tracing spans to pov-distribution
* add tracing spans to statement-distribution
* add tracing spans to overseer
* cleanup
* Make `CandidateHash` a real type
This pr adds a new type `CandidateHash` that is used instead of the
opaque `Hash` type. This helps to ensure on the type system level that
we are passing the correct types.
This pr also fixes wrong usage of `relay_parent` as `candidate_hash`
when communicating with the av storage.
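The newtype idea can be sketched as follows. `Hash` here is a hypothetical 32-byte stand-in for the opaque hash type, and `chunk_key` is an invented helper representing any API that must receive a candidate hash rather than a relay parent.

```rust
/// Hypothetical stand-in for the opaque 32-byte hash type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Hash(pub [u8; 32]);

/// Newtype wrapper: a candidate hash is no longer interchangeable with any
/// other `Hash`, such as a relay parent.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CandidateHash(pub Hash);

/// Hypothetical storage key builder that only accepts a `CandidateHash`.
pub fn chunk_key(candidate: CandidateHash, chunk_index: u32) -> (CandidateHash, u32) {
    (candidate, chunk_index)
}
```

Passing a plain `Hash` (for example a relay parent) to `chunk_key` is now a compile-time type error, not a silent lookup under the wrong key.
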
* Update core-primitives/src/lib.rs
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>
* Wrap the lines
Co-authored-by: Peter Goodspeed-Niklaus <coriolinus@users.noreply.github.com>