PVF: re-preparing artifact on failed runtime construction (#3187)

resolve https://github.com/paritytech/polkadot-sdk/issues/3139

- [x] use a distinguishable error for `execute_artifact`
- [x] remove artifact in case of a `RuntimeConstruction` error during
the execution
- [x] augment the `validate_candidate_with_retry` of `ValidationBackend`
with the case of retriable `RuntimeConstruction` error during the
execution
- [x] update the book
(https://paritytech.github.io/polkadot-sdk/book/node/utility/pvf-host-and-workers.html#retrying-execution-requests)
- [x] add a test
- [x] run zombienet tests

---------

Co-authored-by: s0me0ne-unkn0wn <48632512+s0me0ne-unkn0wn@users.noreply.github.com>
maksimryndin
2024-02-28 17:29:27 +01:00
committed by GitHub
parent 14530269b7
commit 426136671a
15 changed files with 294 additions and 49 deletions
@@ -125,6 +125,14 @@ execution request:
reason, which may or may not be independent of the candidate or PVF.
5. **Internal errors:** See "Internal Errors" section. In this case, after the
retry we abstain from voting.
6. **RuntimeConstruction errors:** The precheck handles the general case of a
wrong artifact but doesn't guarantee its consistency between preparation and
execution. Something may happen to the artifact between its preparation and
its execution (e.g. the artifact was corrupted on disk, or a dirty node
upgrade left the prepare worker with a wasmtime version different from the
execute worker's wasmtime version). We treat such an error as possibly
transient due to local issues: the artifact is removed so that it gets
re-prepared, and we retry once.
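
The handling of a `RuntimeConstruction` error can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual PVF host code: `ExecError`, `ArtifactCache`, and `validate_with_retry` are hypothetical names, and the in-memory cache merely stands in for on-disk artifacts.

```rust
use std::collections::HashSet;

/// Hypothetical error type for this sketch; not the polkadot-sdk definition.
#[derive(Debug, PartialEq)]
enum ExecError {
    /// The runtime could not be constructed from the prepared artifact.
    RuntimeConstruction(String),
    /// Any other failure; not retried in this sketch.
    Other(String),
}

/// Hypothetical in-memory stand-in for the artifact cache.
#[derive(Default)]
struct ArtifactCache {
    /// Artifacts that have been prepared.
    prepared: HashSet<u32>,
    /// Prepared artifacts that were damaged after preparation.
    corrupted: HashSet<u32>,
    /// How many times preparation actually ran.
    prepare_count: u32,
}

impl ArtifactCache {
    /// Prepare the artifact unless a prepared copy already exists.
    fn prepare(&mut self, id: u32) {
        if self.prepared.insert(id) {
            self.prepare_count += 1;
        }
    }

    /// Drop the artifact so that the next `prepare` builds a fresh one.
    fn remove(&mut self, id: u32) {
        self.prepared.remove(&id);
        self.corrupted.remove(&id);
    }

    /// Execute against the artifact; a corrupted one fails runtime construction.
    fn execute(&self, id: u32) -> Result<(), ExecError> {
        if !self.prepared.contains(&id) {
            return Err(ExecError::Other(format!("artifact {id} is not prepared")));
        }
        if self.corrupted.contains(&id) {
            return Err(ExecError::RuntimeConstruction(format!("artifact {id} is unusable")));
        }
        Ok(())
    }
}

/// Execute once; on a runtime construction error, remove the artifact,
/// re-prepare it, and retry exactly one more time.
fn validate_with_retry(cache: &mut ArtifactCache, id: u32) -> Result<(), ExecError> {
    cache.prepare(id);
    match cache.execute(id) {
        Err(ExecError::RuntimeConstruction(_)) => {
            cache.remove(id);
            cache.prepare(id);
            cache.execute(id)
        }
        other => other,
    }
}
```

Bounding the retry to a single attempt means that a persistent failure still surfaces as an error instead of looping, while a stale or damaged artifact gets one chance to be rebuilt.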
### Preparation timeouts