PVF: add landlock sandboxing (#7303)

* Begin adding landlock + test
* Move PVF implementer's guide section to own page, document security
* Implement test
* Add some docs
* Do some cleanup
* Fix typo
* Warn on host startup if landlock is not supported
* Clarify docs a bit
* Minor improvements
* Add some docs about determinism
* Address review comments (mainly add warning on landlock error)
* Update node/core/pvf/src/host.rs

  Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* Update node/core/pvf/src/host.rs

  Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
* Fix unused fn
* Update ABI docs to reflect latest discussions
* Remove outdated notes
* Try to trigger new test-linux-oldkernel-stable job

  Job introduced in https://github.com/paritytech/polkadot/pull/7371.

---------

Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
2026-07-25 04:45:42 +00:00 · 2023-07-05 12:57:53 -04:00
parent a40417da96
commit 2b9c4f82a7
10 changed files with 445 additions and 95 deletions
@@ -44,86 +44,10 @@ Once we have all parameters, we can spin up a background task to perform the val
  * The collator signature is valid
  * The PoV provided matches the `pov_hash` field of the descriptor

+For more details please see [PVF Host and Workers](pvf-host-and-workers.md).
+
 ### Checking Validation Outputs

 If we can assume the presence of the relay-chain state (that is, during processing [`CandidateValidationMessage`][CVM]`::ValidateFromChainState`) we can run all the checks that the relay-chain would run at the inclusion time thus confirming that the candidate will be accepted.

-### PVF Host
-
-The PVF host is responsible for handling requests to prepare and execute PVF
-code blobs.
-
-One high-level goal is to make PVF operations as deterministic as possible, to
-reduce the rate of disputes. Disputes can happen due to e.g. a job timing out on
-one machine, but not another. While we do not yet have full determinism, there
-are some dispute reduction mechanisms in place right now.
-
-#### Retrying execution requests
-
-If the execution request fails during **preparation**, we will retry if it is
-possible that the preparation error was transient (e.g. if the error was a panic
-or time out). We will only retry preparation if another request comes in after
-15 minutes, to ensure any potential transient conditions had time to be
-resolved. We will retry up to 5 times.
-
-If the actual **execution** of the artifact fails, we will retry once if it was
-a possibly transient error, to allow the conditions that led to the error to
-hopefully resolve. We use a more brief delay here (1 second as opposed to 15
-minutes for preparation (see above)), because a successful execution must happen
-in a short amount of time.
-
-We currently know of the following specific cases that will lead to a retried
-execution request:
-
-1. **OOM:** The host might have been temporarily low on memory due to other
-   processes running on the same machine. **NOTE:** This case will lead to
-   voting against the candidate (and possibly a dispute) if the retry is still
-   not successful.
-2. **Artifact missing:** The prepared artifact might have been deleted due to
-   operator error or some bug in the system.
-3. **Panic:** The worker thread panicked for some indeterminate reason, which
-   may or may not be independent of the candidate or PVF.
-
-#### Preparation timeouts
-
-We use timeouts for both preparation and execution jobs to limit the amount of
-time they can take. As the time for a job can vary depending on the machine and
-load on the machine, this can potentially lead to disputes where some validators
-successfuly execute a PVF and others don't.
-
-One dispute mitigation we have in place is a more lenient timeout for
-preparation during execution than during pre-checking. The rationale is that the
-PVF has already passed pre-checking, so we know it should be valid, and we allow
-it to take longer than expected, as this is likely due to an issue with the
-machine and not the PVF.
-
-#### CPU clock timeouts
-
-Another timeout-related mitigation we employ is to measure the time taken by
-jobs using CPU time, rather than wall clock time. This is because the CPU time
-of a process is less variable under different system conditions. When the
-overall system is under heavy load, the wall clock time of a job is affected
-more than the CPU time.
-
-#### Internal errors
-
-In general, for errors not raising a dispute we have to be very careful. This is
-only sound, if we either:
-
-1. Ruled out that error in pre-checking. If something is not checked in
-   pre-checking, even if independent of the candidate and PVF, we must raise a
-   dispute.
-2. We are 100% confident that it is a hardware/local issue: Like corrupted file,
-   etc.
-
-Reasoning: Otherwise it would be possible to register a PVF where candidates can
-not be checked, but we don't get a dispute - so nobody gets punished. Second, we
-end up with a finality stall that is not going to resolve!
-
-There are some error conditions where we can't be sure whether the candidate is
-really invalid or some internal glitch occurred, e.g. panics. Whenever we are
-unsure, we can never treat an error as internal as we would abstain from voting.
-So we will first retry the candidate, and if the issue persists we are forced to
-vote invalid.
-
 [CVM]: ../../types/overseer-protocol.md#validationrequesttype
@@ -0,0 +1,127 @@
+# PVF Host and Workers
+
+The PVF host is responsible for handling requests to prepare and execute PVF
+code blobs, which it sends to PVF workers running in their own child processes.
+
+This system has two high-level goals that we will touch on here: *determinism*
+and *security*.
+
+## Determinism
+
+One high-level goal is to make PVF operations as deterministic as possible, to
+reduce the rate of disputes. Disputes can happen due to e.g. a job timing out on
+one machine, but not another. While we do not have full determinism, there are
+some dispute reduction mechanisms in place right now.
+
+### Retrying execution requests
+
+If the execution request fails during **preparation**, we will retry if it is
+possible that the preparation error was transient (e.g. if the error was a panic
+or timeout). We will only retry preparation if another request comes in after
+15 minutes, to ensure any potential transient conditions had time to be
+resolved. We will retry up to 5 times.
+
+If the actual **execution** of the artifact fails, we will retry once if it was
+a possibly transient error, to allow the conditions that led to the error to
+hopefully resolve. We use a shorter delay here (1 second, as opposed to the 15
+minutes for preparation described above), because a successful execution must
+happen in a short amount of time.
+
+We currently know of the following specific cases that will lead to a retried
+execution request:
+
+1. **OOM:** The host might have been temporarily low on memory due to other
+   processes running on the same machine. **NOTE:** This case will lead to
+   voting against the candidate (and possibly a dispute) if the retry is still
+   not successful.
+2. **Artifact missing:** The prepared artifact might have been deleted due to
+   operator error or some bug in the system.
+3. **Panic:** The worker thread panicked for some indeterminate reason, which
+   may or may not be independent of the candidate or PVF.
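
A minimal sketch of the retry rules described above, written as two pure functions. All names (`PREPARE_FAILURE_COOLDOWN`, `can_retry_prepare`, `on_execute_failure`, and so on) are hypothetical and not taken from the PVF host code; only the numbers (15 minutes, 5 retries, 1 second, a single execution retry) come from the text. Reading "up to 5 times" as "retry while at most 5 attempts have failed" is an assumption.

```rust
use std::time::Duration;

// Hypothetical constants mirroring the numbers quoted in the text.
const PREPARE_FAILURE_COOLDOWN: Duration = Duration::from_secs(15 * 60);
const NUM_PREPARE_RETRIES: u32 = 5;
const EXECUTE_RETRY_DELAY: Duration = Duration::from_secs(1);

/// Decides whether a failed preparation may be attempted again when a new
/// request arrives.
///
/// * `since_last_failure`: time elapsed since the most recent failed attempt
/// * `num_failures`: number of attempts that have failed so far
/// * `possibly_transient`: whether the error could have been transient
fn can_retry_prepare(
    since_last_failure: Duration,
    num_failures: u32,
    possibly_transient: bool,
) -> bool {
    possibly_transient
        && since_last_failure >= PREPARE_FAILURE_COOLDOWN
        && num_failures <= NUM_PREPARE_RETRIES
}

/// What to do after an execution attempt has failed.
#[derive(Debug, PartialEq)]
enum ExecuteAction {
    /// Wait for the given delay, then try once more.
    RetryAfter(Duration),
    /// No retry left, or the error is not transient.
    GiveUp,
}

/// Execution is retried at most once, and only for possibly transient errors.
fn on_execute_failure(possibly_transient: bool, retries_so_far: u32) -> ExecuteAction {
    if possibly_transient && retries_so_far == 0 {
        ExecuteAction::RetryAfter(EXECUTE_RETRY_DELAY)
    } else {
        ExecuteAction::GiveUp
    }
}
```

Under these assumptions, `can_retry_prepare(Duration::from_secs(60), 1, true)` is `false` because the cooldown has not elapsed, and a second call to `on_execute_failure(true, 1)` gives up rather than retrying again.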
+
+### Preparation timeouts
+
+We use timeouts for both preparation and execution jobs to limit the amount of
+time they can take. As the time for a job can vary depending on the machine and
+load on the machine, this can potentially lead to disputes where some validators
+successfully execute a PVF and others don't.
+
+One dispute mitigation we have in place is a more lenient timeout for
+preparation during execution than during pre-checking. The rationale is that the
+PVF has already passed pre-checking, so we know it should be valid, and we allow
+it to take longer than expected, as this is likely due to an issue with the
+machine and not the PVF.
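
The lenient-timeout idea above can be sketched as a timeout chosen by job kind. The enum, function, and both durations here are illustrative assumptions: the text states only that preparation during execution gets a more lenient timeout than pre-checking, not what the values are.

```rust
use std::time::Duration;

/// Why a PVF is being prepared (hypothetical type for illustration).
#[derive(Clone, Copy, Debug, PartialEq)]
enum PrepareJobKind {
    /// Pre-checking a newly registered PVF: strict timeout.
    Prechecking,
    /// Preparing an already pre-checked PVF for execution: lenient timeout.
    Compilation,
}

// Placeholder values, NOT taken from the text.
const PRECHECK_PREPARATION_TIMEOUT: Duration = Duration::from_secs(60);
const LENIENT_PREPARATION_TIMEOUT: Duration = Duration::from_secs(360);

/// Returns the preparation timeout that applies to the given job kind.
fn prepare_timeout(kind: PrepareJobKind) -> Duration {
    match kind {
        PrepareJobKind::Prechecking => PRECHECK_PREPARATION_TIMEOUT,
        PrepareJobKind::Compilation => LENIENT_PREPARATION_TIMEOUT,
    }
}
```

The only property that matters for dispute mitigation is the ordering: `prepare_timeout(PrepareJobKind::Compilation)` must exceed `prepare_timeout(PrepareJobKind::Prechecking)`.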
+
+### CPU clock timeouts
+
+Another timeout-related mitigation we employ is to measure the time taken by
+jobs using CPU time, rather than wall clock time. This is because the CPU time
+of a process is less variable under different system conditions. When the
+overall system is under heavy load, the wall clock time of a job is affected
+more than the CPU time.
+
+### Internal errors
+
+In general, we have to be very careful with errors that do not raise a dispute.
+Not raising a dispute is only sound if either:
+
+1. We ruled out that error in pre-checking. If something is not checked in
+   pre-checking, even if independent of the candidate and PVF, we must raise a
+   dispute.
+2. We are 100% confident that it is a hardware/local issue, such as a corrupted
+   file.
+
+Reasoning: Otherwise, first, it would be possible to register a PVF where
+candidates cannot be checked but we don't get a dispute, so nobody gets punished.
+Second, we would end up with a finality stall that is not going to resolve!
+
+There are some error conditions where we can't be sure whether the candidate is
+really invalid or some internal glitch occurred, e.g. panics. Whenever we are
+unsure, we can never treat an error as internal, as we would then abstain from
+voting. So we will first retry the candidate, and if the issue persists we are
+forced to vote invalid.
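
The reasoning in this section can be condensed into a small decision function. This is a sketch under stated assumptions: the `Failure` and `Decision` types and the `decide` function are hypothetical names made up for illustration, and the three-way classification of failures is a simplification of the text.

```rust
/// How a failed validation attempt is classified (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Failure {
    /// Certainly a local/hardware issue, or ruled out in pre-checking.
    Internal,
    /// Could be a glitch or a genuinely invalid candidate, e.g. a panic.
    PossiblyTransient,
    /// The candidate is definitely invalid.
    Invalid,
}

/// What the validator does in response (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Decision {
    /// Do not vote; no dispute is raised.
    Abstain,
    /// Run the candidate again before deciding.
    Retry,
    /// Vote against the candidate, possibly starting a dispute.
    VoteInvalid,
}

/// Only errors known to be internal may lead to abstaining. Anything
/// uncertain is retried once and then treated as invalid.
fn decide(failure: Failure, already_retried: bool) -> Decision {
    match failure {
        Failure::Internal => Decision::Abstain,
        Failure::PossiblyTransient if !already_retried => Decision::Retry,
        Failure::PossiblyTransient | Failure::Invalid => Decision::VoteInvalid,
    }
}
```

Note that there is no path from `PossiblyTransient` to `Abstain`: an uncertain error that survives the retry always ends in a vote, which is what prevents an uncheckable PVF from stalling finality without punishment.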
+
+## Security
+
+With [on-demand parachains](https://github.com/orgs/paritytech/projects/67), it
+is much easier to submit PVFs to the chain for preparation and execution. This
+makes it easier for erroneous disputes and slashing to occur, whether
+intentional (as a result of a malicious attacker) or not (a bug or operator
+error occurred).
+
+Therefore, another goal of ours is to harden our security around PVFs, in order
+to protect the economic interests of validators and increase overall confidence
+in the system.
+
+### Possible attacks / threat model
+
+WebAssembly is already sandboxed, but multiple CVEs enabling remote code
+execution have already been reported. See e.g. these two advisories from
+[Mar 2023](https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-ff4p-7xrq-q5r8)
+and [Jul 2022](https://github.com/bytecodealliance/wasmtime/security/advisories/GHSA-7f6x-jwh5-m9r4).
+
+So what are we actually worried about? Things that come to mind:
+
+1. **Consensus faults** - If an attacker can get some source of randomness they
+   could vote against candidates with 50% chance and cause unresolvable disputes.
+2. **Targeted slashes** - An attacker can target certain validators (e.g. some
+   validators running on vulnerable hardware) and make them vote invalid and get
+   them slashed.
+3. **Mass slashes** - With some source of randomness they can do an untargeted
+   attack. I.e. a baddie can do significant economic damage by voting against
+   candidates with 1/3 chance, without even stealing keys or completely replacing
+   the binary.
+4. **Stealing keys** - That would be pretty bad. Should not be possible with
+   sandboxing. We should at least not allow filesystem access or network access.
+5. **Taking control over the validator** - E.g. replacing the `polkadot` binary
+   with a `polkadot-evil` binary. Should again not be possible with the above
+   sandboxing in place.
+6. **Intercepting and manipulating packages** - Effect very similar to the
+   above, hard to do without also being able to do 4 or 5.
+
+### Restricting file-system access
+
+A basic security mechanism is to make sure that any thread directly interfacing
+with untrusted code does not have access to the file-system. This provides some
+protection against attackers accessing sensitive data or modifying data on the
+host machine.