Fix algorithmic complexity of on-demand scheduler with regard to the number of cores. (#3190)

We witnessed really poor performance on Rococo, where we ended up with
50 on-demand cores. The cause was that the full queue was processed for
each core. With this change, full queue processing happens far less
often (most operations are now O(1) or O(log(n))), and when it does
happen, it is only for one core (in expectation).
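
To illustrate the complexity argument above, here is a minimal sketch assuming heap-based queues. `Order`, `Queues` and the method names are hypothetical and chosen for illustration only; they are not taken from the pallet. The point it shows is that serving a core only needs to peek at and pop the head of two heaps (O(log(n))), instead of scanning the whole queue once per core.

```rust
use std::cmp::Reverse;
use std::collections::{BTreeMap, BinaryHeap};

/// Hypothetical order: `idx` is the enqueue position, `para` a parachain id.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Order {
    idx: u32,
    para: u32,
}

/// Hypothetical queue layout (an assumption, not the pallet's storage).
#[derive(Default)]
struct Queues {
    /// Orders not bound to any core, oldest (lowest `idx`) first.
    free: BinaryHeap<Reverse<Order>>,
    /// Orders bound to one specific core, oldest first per core.
    by_core: BTreeMap<u32, BinaryHeap<Reverse<Order>>>,
}

impl Queues {
    fn push_free(&mut self, order: Order) {
        self.free.push(Reverse(order));
    }

    fn push_for_core(&mut self, core: u32, order: Order) {
        self.by_core.entry(core).or_default().push(Reverse(order));
    }

    /// Pop the oldest order this core may serve. Only the heads of two
    /// heaps are inspected, so no full queue scan is needed.
    fn pop_for_core(&mut self, core: u32) -> Option<Order> {
        let free_idx = self.free.peek().map(|Reverse(o)| o.idx);
        let core_idx = self
            .by_core
            .get(&core)
            .and_then(|heap| heap.peek())
            .map(|Reverse(o)| o.idx);

        // Prefer the core-bound order only if it is older than the free one.
        let take_core = match (core_idx, free_idx) {
            (Some(c), Some(f)) => c < f,
            (Some(_), None) => true,
            _ => false,
        };

        let popped = if take_core {
            self.by_core.get_mut(&core).and_then(|heap| heap.pop())
        } else {
            self.free.pop()
        };
        popped.map(|Reverse(o)| o)
    }
}
```

With 50 cores, each call to `pop_for_core` in this sketch costs a heap pop rather than a pass over every queued order, which is the kind of saving the change aims for.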

Also, the spot price is now updated before each order to ensure economic
back pressure.
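
As a rough illustration of back pressure, the toy model below recomputes the price before every order, so a filling queue immediately makes the next order more expensive. The `Market` type and its linear pricing rule are assumptions made up for this sketch; the pallet's actual spot price formula is not shown in this commit message.

```rust
/// Hypothetical market state with a made-up linear pricing rule.
struct Market {
    base_fee: u64,
    queue_len: u32,
    max_queue: u32,
}

impl Market {
    /// Toy rule: the price grows linearly with queue utilization.
    /// Callers must ensure `max_queue > 0`.
    fn spot_price(&self) -> u64 {
        self.base_fee + self.base_fee * self.queue_len as u64 / self.max_queue as u64
    }

    /// Place an order, paying at most `max_amount`.
    fn place_order(&mut self, max_amount: u64) -> Result<u64, &'static str> {
        if self.queue_len >= self.max_queue {
            return Err("queue full");
        }
        // Recompute the price for this order instead of reusing a stale
        // value, so every queued order raises the price of the next one.
        let price = self.spot_price();
        if price > max_amount {
            return Err("spot price above max amount");
        }
        self.queue_len += 1;
        Ok(price)
    }
}
```

If the price were refreshed only once per block, many orders could be placed at the same low price within that block; refreshing it per order closes that gap in this model.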


TODO:

- [x] Implement
- [x] Basic tests
- [x] Add more tests (see todos)
- [x] Run benchmark to confirm better performance; first results
suggest >100x faster.
- [x] Write migrations
- [x] Bump scale-info version and remove patch in Cargo.toml
- [x] Write PR docs: on-demand performance improved, more on-demand
cores are no longer problematic. If need be, the max queue size can
also be increased again. (Maybe not to 10k)

Optional: Performance could be improved even more if we called
`pop_assignment_for_core()` before calling `report_processed` (avoiding
needless affinity drops). The effect gets smaller the larger the claim
queue, and I would only go for it if it does not add complexity to the
scheduler.

---------

Co-authored-by: eskimor <eskimor@no-such-url.com>
Co-authored-by: antonva <anton.asgeirsson@parity.io>
Co-authored-by: command-bot <>
Co-authored-by: Anton Vilhelm Ásgeirsson <antonva@users.noreply.github.com>
Co-authored-by: ordian <write@reusable.software>
eskimor
2024-03-20 14:53:55 +01:00
committed by GitHub
parent b686bfefba
commit b74353d3e9
13 changed files with 1051 additions and 551 deletions
@@ -29,6 +29,7 @@ use primitives::{
vstaging::{ApprovalVotingParams, NodeFeatures},
AsyncBackingParams, Balance, ExecutorParamError, ExecutorParams, SessionIndex,
LEGACY_MIN_BACKING_VOTES, MAX_CODE_SIZE, MAX_HEAD_DATA_SIZE, MAX_POV_SIZE,
ON_DEMAND_MAX_QUEUE_MAX_SIZE,
};
use sp_runtime::{traits::Zero, Perbill};
use sp_std::prelude::*;
@@ -312,6 +313,8 @@ pub enum InconsistentError<BlockNumber> {
InconsistentExecutorParams { inner: ExecutorParamError },
/// TTL should be bigger than lookahead
LookaheadExceedsTTL,
/// Passed in queue size for on-demand was too large.
OnDemandQueueSizeTooLarge,
}
impl<BlockNumber> HostConfiguration<BlockNumber>
@@ -405,6 +408,10 @@ where
return Err(LookaheadExceedsTTL)
}
if self.scheduler_params.on_demand_queue_max_size > ON_DEMAND_MAX_QUEUE_MAX_SIZE {
return Err(OnDemandQueueSizeTooLarge)
}
Ok(())
}
@@ -630,7 +637,7 @@ pub mod pallet {
/// Set the number of coretime execution cores.
///
/// Note that this configuration is managed by the coretime chain. Only manually change
/// NOTE: that this configuration is managed by the coretime chain. Only manually change
/// this, if you really know what you are doing!
#[pallet::call_index(6)]
#[pallet::weight((
@@ -1133,6 +1140,7 @@ pub mod pallet {
config.scheduler_params.on_demand_queue_max_size = new;
})
}
/// Set the on demand (parathreads) fee variability.
#[pallet::call_index(50)]
#[pallet::weight((