Consolidate subsystem spans so they are all children of the leaf-activated root span (#6458)

* Pass the PerLeafSpan as mutable reference to handle_new_head function

* cargo +nightly fmt --all

* Add mock span for test

* cargo +nightly fmt --all

* add new-blocks-hashes to span

* ref span in match statement, set span to disabled if not passed

* remove second match clause, make handle_new_head_span mutable

* cargo +nightly fmt --all

* improve tag on error and warning

* add imported blocks and info span

* cargo +nightly fmt --all

* Improve error for imported_blocks_and_info trace

* format tags on get_header_span

* add lost-to-finality tag

* add missing bracket

* - Add bitfield child span
- Add block db insertion span

* - fix update-bitfield span tag

* - Fix type conversion to u64
- Add missing argument

* - Cargo fmt

* - Test add_follows_from

* - Revert, as the relationship between spans is not working correctly

* - use drop to test if parent-child relationship can be re-established

* - remove bitfield span, check if parent-child relationship can be reestablished

* - Remove dangling bitfield span which is not used, to see if parent-child relationship can be re-established

* Another dangling bitfield span

* cargo fmt

* - add imported blocks and info span
- add candidate span per candidate

* add tags before moving block_header to push scope

* - Add db-insertion span

* cargo fmt

* fix types

* * Pass mutable reference to span in handle_new_head
* Change get-header-span tags in handle_new_head
* Create cache-session-info span in handle_new_head
* Create optional argument in determine_new_blocks
* Pass mutable reference to handle_new_head_span in determine_new_blocks in handle_new_head function
* Add candidate-hash, candidate-number, lost-to-finality tags to candidate_span in handle_new_head function
* Manually drop db_insertion_span and remove superfluous tags from it, only keeping approved-bitfields tag
* Add ApprovalVoting stage in jaeger

* * Pass mutable reference to jaeger::Span instead of PerLeafSpan
* Add block-import span

* * Pass optional_span (optional argument) to determine_new_blocks util function

* * Add num-candidates int tag to block_import_span

* * Add head tag to cache_session_span

* * Create PerLeafSpan in handle_from_overseer (this is required to establish parent-child relationship between approval-voting span, and leaf-activated root span)

* * Add candidate-import-span as child of block-import-span
* Add candidate-hash and num-approval tags to candidate-import-span

* * Fix num-candidate tag to bitvec-len tag in candidate-import-span

* * Fix imported_blocks_and_info span to create new-block-span, as it is not dealing with candidates

* Consider the future::select! block

* Use HashMap<Hash, jaeger::PerLeafSpan>

* Remove Stage 9

* Add missing spans

* cargo +nightly fmt --all

* Remove optional span argument for determine_new_blocks

* * Remove no-longer needed default PerLeafSpan implementation
* Remove the no-longer-necessary mock span, since the refactored handle_new_head() no longer needs a mutable span
* Split validation-result and request-data (availability and validation code) spans into two by dropping request_validation_data_spans
* Remove drop statements for cache_session_info_span

* Remove unnecessary span

* Remove another excessively spammy span

* Add missing spans from State in import tests

* Use functional approach to get spans

* - Add functional approach for the approval-voting span
- Add doc on block_numbers given labelling ambiguity
- Add span pruning logic
- Use .add_para_id on validation_result_span

* Replace for hash_set in hash_set_iter with map closure

* cargo +nightly fmt --all

* Change from unconsumed `map` to `.for_each`

* cargo +nightly fmt --all

* Refactor add_para_id to validation_result_span

* cargo +nightly fmt --all

* Remove duplicate tag

* Add missing tag to handle-approved-ancestor span

* Refactor span pruning to only invoke retain once

* Typo in span name

* - Replace unwrap_or with unwrap_or_else due to lazy evaluation of trace-identifier in polkadot_node_jaeger
- Remove some redundant spans

* Add approval-distribution spans

* - Add unwrap_or_else on note-approved-in-chain-selection
- Use child_with_trace_id to add traceID string tag on span (note this does not change the traceID, but just adds a tag)

* cargo +nightly fmt --all

* - Add traceID tags where necessary in approval-voting and availability-distribution
- Always use block-hash tag instead of relay-parent tag in approval-distribution

* Remove schedule-wakeup span as it will duplicate spans on existing wakeups (which should be a no-op)

* Remove a couple of warnings related to mutability

* Fix failing tests in availability distribution

* Add traceID tag to launch-approval and validation-result

* Reshuffle the validation and validation result spans to where more appropriate and add block-hash tag

* - Add tranche and should-trigger tag to process-wakeup span
- Add candidate-hash and traceID to check-and-import-approval span

* cargo fmt

* - Adjustments after PR comments

* Move span pruning after other pruning logic

* Remove DerefMut - no longer needed

* Relabel request-chunk spans

* - Fix typo in span label
- Add docs for drops

* Add new approval-voting span pruning logic

* Undo removal of !

* cargo fmt
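
One of the items above swaps `unwrap_or` for `unwrap_or_else` because of lazy evaluation. The sketch below illustrates only the general Rust behaviour behind that change, not the actual `polkadot_node_jaeger` code: `expensive_fallback`, `eager`, and `lazy` are hypothetical names, and the call counter exists purely to make the difference observable.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for an expensive fallback (for example, building a new span).
/// It increments `calls` every time it runs.
fn expensive_fallback(calls: &Cell<u32>) -> String {
    calls.set(calls.get() + 1);
    String::from("fallback-span")
}

/// Eager: the argument of `unwrap_or` is evaluated before the call,
/// even when `existing` is `Some` and the fallback is thrown away.
fn eager(existing: Option<String>, calls: &Cell<u32>) -> String {
    existing.unwrap_or(expensive_fallback(calls))
}

/// Lazy: the closure given to `unwrap_or_else` runs only when `existing` is `None`.
fn lazy(existing: Option<String>, calls: &Cell<u32>) -> String {
    existing.unwrap_or_else(|| expensive_fallback(calls))
}
```

Calling `eager(Some(..), &calls)` still bumps the counter, while `lazy(Some(..), &calls)` leaves it untouched; the fallback only runs in `lazy` when the option is `None`.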
Mattia L.V. Bradascio
2023-03-31 16:54:19 +01:00
committed by GitHub
parent 9fe528d5c7
commit 713f6625fa
12 changed files with 349 additions and 90 deletions
@@ -140,7 +140,18 @@ impl FetchTaskConfig {
         sender: mpsc::Sender<FromFetchTask>,
         metrics: Metrics,
         session_info: &SessionInfo,
+        span: jaeger::Span,
     ) -> Self {
+        let span = span
+            .child("fetch-task-config")
+            .with_trace_id(core.candidate_hash)
+            .with_string_tag("leaf", format!("{:?}", leaf))
+            .with_validator_index(session_info.our_index)
+            .with_uint_tag("group-index", core.group_responsible.0 as u64)
+            .with_relay_parent(core.candidate_descriptor.relay_parent)
+            .with_string_tag("pov-hash", format!("{:?}", core.candidate_descriptor.pov_hash))
+            .with_stage(jaeger::Stage::AvailabilityDistribution);
         let live_in = vec![leaf].into_iter().collect();
         // Don't run tasks for our backing group:
@@ -148,9 +159,6 @@ impl FetchTaskConfig {
             return FetchTaskConfig { live_in, prepared_running: None }
         }
-        let span = jaeger::Span::new(core.candidate_hash, "availability-distribution")
-            .with_stage(jaeger::Stage::AvailabilityDistribution);
         let prepared_running = RunningTask {
             session_index: session_info.session_index,
             group_index: core.group_responsible,
@@ -251,20 +259,18 @@ impl RunningTask {
         let mut bad_validators = Vec::new();
         let mut succeeded = false;
         let mut count: u32 = 0;
-        let mut _span = self
-            .span
-            .child("fetch-task")
-            .with_chunk_index(self.request.index.0)
-            .with_relay_parent(self.relay_parent);
+        let mut span = self.span.child("run-fetch-chunk-task").with_relay_parent(self.relay_parent);
         // Try validators in reverse order:
         while let Some(validator) = self.group.pop() {
-            let _try_span = _span.child("try");
             // Report retries:
             if count > 0 {
                 self.metrics.on_retry();
             }
             count += 1;
+            let _chunk_fetch_span = span
+                .child("fetch-chunk-request")
+                .with_chunk_index(self.request.index.0)
+                .with_stage(jaeger::Stage::AvailabilityDistribution);
             // Send request:
             let resp = match self.do_request(&validator).await {
                 Ok(resp) => resp,
@@ -281,6 +287,12 @@ impl RunningTask {
                     continue
                 },
             };
+            // We drop the span here, so that the span is not active while we recombine the chunk.
+            drop(_chunk_fetch_span);
+            let _chunk_recombine_span = span
+                .child("recombine-chunk")
+                .with_chunk_index(self.request.index.0)
+                .with_stage(jaeger::Stage::AvailabilityDistribution);
             let chunk = match resp {
                 ChunkFetchingResponse::Chunk(resp) => resp.recombine_into_chunk(&self.request),
                 ChunkFetchingResponse::NoSuchChunk => {
@@ -298,6 +310,12 @@ impl RunningTask {
                     continue
                 },
             };
+            // We drop the span so that the span is not active whilst we validate and store the chunk.
+            drop(_chunk_recombine_span);
+            let _chunk_validate_and_store_span = span
+                .child("validate-and-store-chunk")
+                .with_chunk_index(self.request.index.0)
+                .with_stage(jaeger::Stage::AvailabilityDistribution);
             // Data genuine?
             if !self.validate_chunk(&validator, &chunk) {
@@ -308,10 +326,9 @@ impl RunningTask {
             // Ok, let's store it and be happy:
             self.store_chunk(chunk).await;
             succeeded = true;
-            _span.add_string_tag("success", "true");
             break
         }
-        _span.add_int_tag("tries", count as _);
+        span.add_int_tag("tries", count as _);
         if succeeded {
             self.metrics.on_fetch(SUCCEEDED);
             self.conclude(bad_validators).await;
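
The hunks above call `drop` on each chunk span before creating the next one, so that a span is not left active during the following phase. The sketch below shows that pattern in isolation, assuming a span finishes when its value is dropped. The `Span` type here is a hypothetical stand-in that writes to a shared log; it is not `jaeger::Span`, and `run_phases` is an illustrative name.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a tracing span: it logs when it is created
/// and again when it is dropped (i.e. when the span ends).
struct Span {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Span {
    fn new(name: &'static str, log: &Rc<RefCell<Vec<String>>>) -> Self {
        log.borrow_mut().push(format!("start {}", name));
        Span { name, log: Rc::clone(log) }
    }
}

impl Drop for Span {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("end {}", self.name));
    }
}

/// Runs two phases in sequence, ending the first span before the second starts.
fn run_phases(log: &Rc<RefCell<Vec<String>>>) {
    let _fetch_span = Span::new("fetch", log);
    // ... fetch work would happen here ...

    // Without this explicit drop, "fetch" would stay open until the end of
    // the function and overlap with the "recombine" span.
    drop(_fetch_span);

    let _recombine_span = Span::new("recombine", log);
    // ... recombine work would happen here ...
}
```

After `run_phases` returns, the log holds start/end pairs that do not overlap. Removing the `drop` call would instead record both starts before either end, because locals are dropped at the end of the scope in reverse declaration order.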