PVF: Don't dispute on missing artifact (#7011)

* PVF: Don't dispute on missing artifact

A dispute should never be raised because the local cache fails to provide an
artifact. A missing artifact is not valid grounds for a dispute: it is a local
hardware issue and says nothing about the candidate being checked.

Design:

Currently we assume that if we prepared an artifact, it remains there on-disk
until we prune it, i.e. we never check again if it's still there.

We can change it so that instead of artifact-not-found triggering a dispute, we
retry once (like we do for AmbiguousWorkerDeath, except we don't dispute if it
still doesn't work). And when enqueuing an execute job, we check for the
artifact on-disk, and start preparation if not found.
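
The retry policy described above can be sketched roughly as follows. This is a minimal illustration, not the PVF host's code: `Outcome`, `Verdict`, and `validate_with_retry` are hypothetical names, and the sketch assumes that an ambiguous worker death which persists after the retry still ends in a dispute, as the wording above implies.

```rust
/// Hypothetical outcome of a single execution attempt (illustrative only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    Valid,
    /// The candidate itself is bad.
    Invalid(String),
    /// The worker died and we cannot tell whose fault it was.
    AmbiguousWorkerDeath,
    /// A local problem, e.g. the prepared artifact is missing on disk.
    Internal(String),
}

/// Hypothetical final decision after the optional retry.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Valid,
    /// Vote against the candidate.
    Dispute(String),
    /// Local failure: report it, but cast no vote.
    NoVote(String),
}

/// Runs `attempt` once, retries exactly once on a possibly transient
/// failure, and maps the last outcome to a verdict.
pub fn validate_with_retry<F: FnMut() -> Outcome>(mut attempt: F) -> Verdict {
    let mut outcome = attempt();
    if matches!(outcome, Outcome::AmbiguousWorkerDeath | Outcome::Internal(_)) {
        outcome = attempt();
    }
    match outcome {
        Outcome::Valid => Verdict::Valid,
        Outcome::Invalid(reason) => Verdict::Dispute(reason),
        // Assumption: a persistent ambiguous death still disputes.
        Outcome::AmbiguousWorkerDeath =>
            Verdict::Dispute("ambiguous worker death".to_string()),
        // An internal error never disputes, even after the retry.
        Outcome::Internal(err) => Verdict::NoVote(err),
    }
}
```

The point of the separate `NoVote` arm is that internal errors and candidate-related failures share the retry path but diverge in what happens when the retry also fails.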

Changes:

- [x] Integration test (should fail without the following changes)
- [x] Check if artifact exists when executing, prepare if not
- [x] Return an internal error when file is missing
- [x] Retry once on internal errors
- [x] Document design (update impl guide)
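
As a rough illustration of the "check if artifact exists when executing, prepare if not" item, the sketch below decides what to do with an execute request. The names `NextStep` and `on_execute_request` and the `believed_prepared` flag are hypothetical and do not come from the PVF host; only `std::path::Path::is_file` is a real standard-library call.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical decision taken when an execute request arrives.
#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    /// The artifact is on disk: enqueue execution directly.
    Execute(PathBuf),
    /// The artifact is absent or unreadable: (re)start preparation first.
    Prepare(PathBuf),
}

/// `believed_prepared` stands for the host's in-memory record that the
/// artifact was prepared earlier. That record alone is no longer trusted:
/// the file is checked on disk before execution is enqueued.
pub fn on_execute_request(artifact_path: &Path, believed_prepared: bool) -> NextStep {
    // `is_file` returns false both when the file is missing and when its
    // metadata cannot be read, so an inaccessible file also triggers
    // preparation.
    if believed_prepared && artifact_path.is_file() {
        NextStep::Execute(artifact_path.to_path_buf())
    } else {
        NextStep::Prepare(artifact_path.to_path_buf())
    }
}
```

A check like this only narrows the window: the file can still vanish between the check and the execution, which is why the missing-file case is also reported as an internal error and retried.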

* Add some context to wasm error message (it is quite long)

* Fix impl guide

* Add check for missing/inaccessible file

* Add comment referencing Substrate issue

* Add test for retrying internal errors

---------

Co-authored-by: parity-processbot <>
Marcin S
2023-04-20 15:38:31 +02:00
committed by GitHub
parent 023d459857
commit 0940cdd1d7
8 changed files with 286 additions and 99 deletions
+57 -2
@@ -33,7 +33,7 @@ const TEST_EXECUTION_TIMEOUT: Duration = Duration::from_secs(3);
const TEST_PREPARATION_TIMEOUT: Duration = Duration::from_secs(3);
struct TestHost {
-	_cache_dir: tempfile::TempDir,
+	cache_dir: tempfile::TempDir,
host: Mutex<ValidationHost>,
}
@@ -52,7 +52,7 @@ impl TestHost {
f(&mut config);
let (host, task) = start(config, Metrics::default());
let _ = tokio::task::spawn(task);
-	Self { _cache_dir: cache_dir, host: Mutex::new(host) }
+	Self { cache_dir, host: Mutex::new(host) }
}
async fn validate_candidate(
@@ -240,3 +240,58 @@ async fn execute_queue_doesnt_stall_with_varying_executor_params() {
max_duration.as_millis()
);
}
// Test that deleting a prepared artifact does not lead to a dispute when we try to execute it.
#[tokio::test]
async fn deleting_prepared_artifact_does_not_dispute() {
let host = TestHost::new();
let cache_dir = host.cache_dir.path().clone();
let result = host
.validate_candidate(
halt::wasm_binary_unwrap(),
ValidationParams {
block_data: BlockData(Vec::new()),
parent_head: Default::default(),
relay_parent_number: 1,
relay_parent_storage_root: Default::default(),
},
Default::default(),
)
.await;
match result {
Err(ValidationError::InvalidCandidate(InvalidCandidate::HardTimeout)) => {},
r => panic!("{:?}", r),
}
// Delete the prepared artifact.
{
// Get the artifact path (asserting it exists).
let mut cache_dir: Vec<_> = std::fs::read_dir(cache_dir).unwrap().collect();
assert_eq!(cache_dir.len(), 1);
let artifact_path = cache_dir.pop().unwrap().unwrap();
// Delete the artifact.
std::fs::remove_file(artifact_path.path()).unwrap();
}
// Try to validate again, artifact should get recreated.
let result = host
.validate_candidate(
halt::wasm_binary_unwrap(),
ValidationParams {
block_data: BlockData(Vec::new()),
parent_head: Default::default(),
relay_parent_number: 1,
relay_parent_storage_root: Default::default(),
},
Default::default(),
)
.await;
match result {
Err(ValidationError::InvalidCandidate(InvalidCandidate::HardTimeout)) => {},
r => panic!("{:?}", r),
}
}