mirror of
https://github.com/pezkuwichain/pezkuwi-subxt.git
synced 2026-04-30 11:57:56 +00:00
Use CPU clock timeout for PVF jobs (#6282)
* Put in skeleton logic for CPU-time-preparation

  Still needed:
  - Flesh out logic
  - Refactor some spots
  - Tests

* Continue filling in logic for prepare worker CPU time changes
* Fix compiler errors
* Update lenience factor
* Fix some clippy lints for PVF module
* Fix compilation errors
* Address some review comments
* Add logging
* Add another log
* Address some review comments; change Mutex to AtomicBool
* Refactor handling response bytes
* Add CPU clock timeout logic for execute jobs
* Properly handle AtomicBool flag
* Use `Ordering::Relaxed`
* Refactor thread coordination logic
* Fix bug
* Add some timing information to execute tests
* Add section about the mitigation to the IG
* minor: Change more `Ordering`s to `Relaxed`
* candidate-validation: Fix build errors
@@ -77,10 +77,18 @@ time they can take. As the time for a job can vary depending on the machine and
 load on the machine, this can potentially lead to disputes where some validators
 successfully execute a PVF and others don't.

-One mitigation we have in place is a more lenient timeout for preparation during
-execution than during pre-checking. The rationale is that the PVF has already
-passed pre-checking, so we know it should be valid, and we allow it to take
-longer than expected, as this is likely due to an issue with the machine and not
-the PVF.
+One dispute mitigation we have in place is a more lenient timeout for
+preparation during execution than during pre-checking. The rationale is that the
+PVF has already passed pre-checking, so we know it should be valid, and we allow
+it to take longer than expected, as this is likely due to an issue with the
+machine and not the PVF.

+#### CPU clock timeouts
+
+Another timeout-related mitigation we employ is to measure the time taken by
+jobs using CPU time, rather than wall clock time. This is because the CPU time
+of a process is less variable under different system conditions. When the
+overall system is under heavy load, the wall clock time of a job is affected
+more than the CPU time.
+
 [CVM]: ../../types/overseer-protocol.md#validationrequesttype
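The idea behind the "CPU clock timeouts" section, combined with the `AtomicBool` and `Ordering::Relaxed` coordination mentioned in the commit message, can be illustrated with a minimal sketch. It is not the code from this commit. The names `WatchOutcome`, `watch_cpu_time`, and `run_simulated_job` are hypothetical, and the CPU clock is passed in as a closure because the Rust standard library has no portable CPU-time API. A real source could be a platform call such as POSIX `clock_gettime` with `CLOCK_PROCESS_CPUTIME_ID`. Here the clock is simulated so the example stays deterministic.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Result of watching a job (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum WatchOutcome {
    /// The job signalled completion within its CPU budget.
    JobFinished,
    /// The job consumed at least `timeout` of CPU time.
    CpuTimedOut,
}

/// Polls a CPU clock until the budget is exhausted or the job sets `finished`.
///
/// `cpu_time` is an injected clock (an assumption of this sketch) that returns
/// the CPU time the job has consumed so far.
fn watch_cpu_time<F>(
    cpu_time: F,
    timeout: Duration,
    poll_interval: Duration,
    finished: &AtomicBool,
) -> WatchOutcome
where
    F: Fn() -> Duration,
{
    loop {
        // Only CPU time counts against the budget, not elapsed wall clock time.
        if cpu_time() >= timeout {
            return WatchOutcome::CpuTimedOut;
        }
        // `Relaxed` suffices here: the flag is a bare signal and no other
        // data is published through it.
        if finished.load(Ordering::Relaxed) {
            return WatchOutcome::JobFinished;
        }
        thread::sleep(poll_interval);
    }
}

/// Simulates a job that takes `wall` of wall clock time while the fake CPU
/// clock constantly reports `cpu_used`.
fn run_simulated_job(cpu_used: Duration, wall: Duration, cpu_timeout: Duration) -> WatchOutcome {
    let finished = Arc::new(AtomicBool::new(false));

    let job = {
        let finished = Arc::clone(&finished);
        thread::spawn(move || {
            // Sleeping consumes wall clock time but (almost) no CPU time,
            // which mimics a job that is starved on a loaded machine.
            thread::sleep(wall);
            finished.store(true, Ordering::Relaxed);
        })
    };

    let outcome = watch_cpu_time(|| cpu_used, cpu_timeout, Duration::from_millis(1), &finished);
    job.join().unwrap();
    outcome
}
```

In this sketch, a simulated job that spends 50 ms of wall clock time but reports no CPU usage is not timed out by a 10 ms CPU budget, while a job that reports 20 ms of CPU usage against the same budget is. This mirrors the rationale in the added section: machine load inflates wall clock time more than it inflates CPU time.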