<!DOCTYPE HTML>
<html lang="en" class="polkadot" dir="ltr">
<head>
<!-- Book generated using mdBook -->
<meta charset="UTF-8">
<title>RFC-0000: Feature Name Here - Polkadot Fellowship RFCs</title>
<!-- Custom HTML head -->
<meta name="description" content="An online book of RFCs approved or proposed within the Polkadot Fellowship.">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="theme-color" content="#ffffff">
<link rel="icon" href="../favicon.svg">
<link rel="shortcut icon" href="../favicon.png">
<link rel="stylesheet" href="../css/variables.css">
<link rel="stylesheet" href="../css/general.css">
<link rel="stylesheet" href="../css/chrome.css">
<link rel="stylesheet" href="../css/print.css" media="print">
<!-- Fonts -->
<link rel="stylesheet" href="../FontAwesome/css/font-awesome.css">
<link rel="stylesheet" href="../fonts/fonts.css">
<!-- Highlight.js Stylesheets -->
<link rel="stylesheet" href="../highlight.css">
<link rel="stylesheet" href="../tomorrow-night.css">
<link rel="stylesheet" href="../ayu-highlight.css">
<!-- Custom theme stylesheets -->
<link rel="stylesheet" href="../theme/polkadot.css">
</head>
<body class="sidebar-visible no-js">
<div id="body-container">
<!-- Provide site root to javascript -->
<script>
var path_to_root = "../";
var default_theme = window.matchMedia("(prefers-color-scheme: dark)").matches ? "polkadot" : "polkadot";
</script>
<!-- Work around some values being stored in localStorage wrapped in quotes -->
<script>
try {
var theme = localStorage.getItem('mdbook-theme');
var sidebar = localStorage.getItem('mdbook-sidebar');
if (theme.startsWith('"') && theme.endsWith('"')) {
localStorage.setItem('mdbook-theme', theme.slice(1, theme.length - 1));
}
if (sidebar.startsWith('"') && sidebar.endsWith('"')) {
localStorage.setItem('mdbook-sidebar', sidebar.slice(1, sidebar.length - 1));
}
} catch (e) { }
</script>
<!-- Set the theme before any content is loaded, prevents flash -->
<script>
var theme;
try { theme = localStorage.getItem('mdbook-theme'); } catch(e) { }
if (theme === null || theme === undefined) { theme = default_theme; }
var html = document.querySelector('html');
html.classList.remove('polkadot')
html.classList.add(theme);
var body = document.querySelector('body');
body.classList.remove('no-js')
body.classList.add('js');
</script>
<input type="checkbox" id="sidebar-toggle-anchor" class="hidden">
<!-- Hide / unhide sidebar before it is displayed -->
<script>
var body = document.querySelector('body');
var sidebar = null;
var sidebar_toggle = document.getElementById("sidebar-toggle-anchor");
if (document.body.clientWidth >= 1080) {
try { sidebar = localStorage.getItem('mdbook-sidebar'); } catch(e) { }
sidebar = sidebar || 'visible';
} else {
sidebar = 'hidden';
}
sidebar_toggle.checked = sidebar === 'visible';
body.classList.remove('sidebar-visible');
body.classList.add("sidebar-" + sidebar);
</script>
<nav id="sidebar" class="sidebar" aria-label="Table of contents">
<div class="sidebar-scrollbox">
<ol class="chapter"><li class="chapter-item expanded affix "><a href="../introduction.html">Introduction</a></li><li class="spacer"></li><li class="chapter-item expanded affix "><li class="part-title">Newly Proposed</li><li class="spacer"></li><li class="chapter-item expanded affix "><li class="part-title">Proposed</li><li class="chapter-item expanded "><a href="../proposed/00xx-smart-contracts-coretime-chain.html">RFC-0002: Smart Contracts on the Coretime Chain</a></li><li class="chapter-item expanded "><a href="../proposed/0103-introduce-core-index-commitment.html">RFC-0103: Introduce a CoreIndex commitment and a SessionIndex field in candidate receipts</a></li><li class="chapter-item expanded "><a href="../proposed/0111-pure-proxy-replication.html">RFC-0111: Pure Proxy Replication</a></li><li class="chapter-item expanded "><a href="../proposed/0112-compress-state-response-message-in-state-sync.html">RFC-0112: Compress the State Response Message in State Sync</a></li><li class="chapter-item expanded "><a href="../proposed/0114-secp256r1-hostfunction.html">RFC-0114: Introduce secp256r1_ecdsa_verify_prehashed Host Function to verify NIST-P256 elliptic curve signatures</a></li><li class="chapter-item expanded "><a href="../proposed/0117-unbrick-collective.html">RFC-0117: The Unbrick Collective</a></li><li class="chapter-item expanded "><a href="../proposed/RFC-114 Adjust Tipper Track Confirmation Periods.html">RFC-114: Adjust Tipper Track Confirmation Periods</a></li><li class="spacer"></li><li class="chapter-item expanded affix "><li class="part-title">Approved</li><li class="chapter-item expanded "><a href="../approved/0001-agile-coretime.html">RFC-1: Agile Coretime</a></li><li class="chapter-item expanded "><a href="../approved/0005-coretime-interface.html">RFC-5: Coretime Interface</a></li><li class="chapter-item expanded "><a href="../approved/0007-system-collator-selection.html">RFC-0007: System Collator Selection</a></li><li class="chapter-item expanded "><a 
href="../approved/0008-parachain-bootnodes-dht.html">RFC-0008: Store parachain bootnodes in relay chain DHT</a></li><li class="chapter-item expanded "><a href="../approved/0010-burn-coretime-revenue.html">RFC-0010: Burn Coretime Revenue</a></li><li class="chapter-item expanded "><a href="../approved/0012-process-for-adding-new-collectives.html">RFC-0012: Process for Adding New System Collectives</a></li><li class="chapter-item expanded "><a href="../approved/0013-prepare-blockbuilder-and-core-runtime-apis-for-mbms.html">RFC-0013: Prepare Core runtime API for MBMs</a></li><li class="chapter-item expanded "><a href="../approved/0014-improve-locking-mechanism-for-parachains.html">RFC-0014: Improve locking mechanism for parachains</a></li><li class="chapter-item expanded "><a href="../approved/0022-adopt-encointer-runtime.html">RFC-0022: Adopt Encointer Runtime</a></li><li class="chapter-item expanded "><a href="../approved/0026-sassafras-consensus.html">RFC-0026: Sassafras Consensus Protocol</a></li><li class="chapter-item expanded "><a href="../approved/0032-minimal-relay.html">RFC-0032: Minimal Relay</a></li><li class="chapter-item expanded "><a href="../approved/0042-extrinsics-state-version.html">RFC-0042: Add System version that replaces StateVersion on RuntimeVersion</a></li><li class="chapter-item expanded "><a href="../approved/0043-storage-proof-size-hostfunction.html">RFC-0043: Introduce storage_proof_size Host Function for Improved Parachain Block Utilization</a></li><li class="chapter-item expanded "><a href="../approved/0045-nft-deposits-asset-hub.html">RFC-0045: Lowering NFT Deposits on Asset Hub</a></li><li class="chapter-item expanded "><a href="../approved/0047-assignment-of-availability-chunks.html">RFC-0047: Assignment of availability chunks to validators</a></li><li class="chapter-item expanded "><a href="../approved/0048-session-keys-runtime-api.html">RFC-0048: Generate ownership proof for SessionKeys</a></li><li class="chapter-item expanded "><a 
href="../approved/0050-fellowship-salaries.html">RFC-0050: Fellowship Salaries</a></li><li class="chapter-item expanded "><a href="../approved/0056-one-transaction-per-notification.html">RFC-0056: Enforce only one transaction per notification</a></li><li class="chapter-item expanded "><a href="../approved/0059-nodes-capabilities-discovery.html">RFC-0059: Add a discovery mechanism for nodes based on their capabilities</a></li><li class="chapter-item expanded "><a href="../approved/0078-merkleized-metadata.html">RFC-0078: Merkleized Metadata</a></li><li class="chapter-item expanded "><a href="../approved/0084-general-transaction-extrinsic-format.html">RFC-0084: General transactions in extrinsic format</a></li><li class="chapter-item expanded "><a href="../approved/0091-dht-record-creation-time.html">RFC-0091: DHT Authority discovery record creation time</a></li><li class="chapter-item expanded "><a href="../approved/0097-unbonding_queue.html">RFC-0097: Unbonding Queue</a></li><li class="chapter-item expanded "><a href="../approved/0099-transaction-extension-version.html">RFC-0099: Introduce a transaction extension version</a></li><li class="chapter-item expanded "><a href="../approved/0100-xcm-multi-type-asset-transfer.html">RFC-0100: New XCM instruction: InitiateAssetsTransfer</a></li><li class="chapter-item expanded "><a href="../approved/0101-xcm-transact-remove-max-weight-param.html">RFC-0101: XCM Transact remove require_weight_at_most parameter</a></li><li class="chapter-item expanded "><a href="../approved/0105-xcm-improved-fee-mechanism.html">RFC-0105: XCM improved fee mechanism</a></li><li class="chapter-item expanded "><a href="../approved/0107-xcm-execution-hints.html">RFC-0107: XCM Execution hints</a></li><li class="chapter-item expanded "><a href="../approved/0108-xcm-remove-testnet-ids.html">RFC-0108: Remove XCM testnet NetworkIds</a></li><li class="spacer"></li><li class="chapter-item expanded affix "><li class="part-title">Stale</li><li 
class="chapter-item expanded "><a href="../stale/0004-remove-unnecessary-allocator-usage.html">RFC-0004: Remove the host-side runtime memory allocator</a></li><li class="chapter-item expanded "><a href="../stale/0006-dynamic-pricing-for-bulk-coretime-sales.html">RFC-0006: Dynamic Pricing for Bulk Coretime Sales</a></li><li class="chapter-item expanded "><a href="../stale/0009-improved-net-light-client-requests.html">RFC-0009: Improved light client requests networking protocol</a></li><li class="chapter-item expanded "><a href="../stale/0015-market-design-revisit.html">RFC-0015: Market Design Revisit</a></li><li class="chapter-item expanded "><a href="../stale/0034-xcm-absolute-location-account-derivation.html">RFC-34: XCM Absolute Location Account Derivation</a></li><li class="chapter-item expanded "><a href="../stale/0035-conviction-voting-delegation-modifications.html"> RFC-0035: Conviction Voting Delegation Modifications</a></li><li class="chapter-item expanded "><a href="../stale/0044-rent-based-registration.html">RFC-0044: Rent based registration model</a></li><li class="chapter-item expanded "><a href="../stale/0054-remove-heap-pages.html">RFC-0054: Remove the concept of "heap pages" from the client</a></li><li class="chapter-item expanded "><a href="../stale/0070-x-track-kusamanetwork.html">RFC-0070: X Track for @kusamanetwork</a></li><li class="chapter-item expanded "><a href="../stale/0073-referedum-deposit-track.html">RFC-0073: Decision Deposit Referendum Track</a></li><li class="chapter-item expanded "><a href="../stale/0074-stateful-multisig-pallet.html">RFC-0074: Stateful Multisig Pallet</a></li><li class="chapter-item expanded "><a href="../stale/0077-increase-max-length-of-identity-pgp-fingerprint-value.html">RFC-0077: Increase maximum length of identity PGP fingerprint values from 20 bytes</a></li><li class="chapter-item expanded "><a href="../stale/0088-broker-pallet-slashable-deposit-purchaser-reputation-reserved-cores.html">RFC-0088: Add 
slashable locked deposit, purchaser reputation, and reserved cores for on-chain identities to broker pallet</a></li><li class="chapter-item expanded "><a href="../stale/0089-flexible-inflation.html">RFC-0089: Flexible Inflation</a></li><li class="chapter-item expanded "><a href="../stale/00xx-secondary-marketplace-for-regions.html">RFC-0001: Secondary Market for Regions</a></li><li class="chapter-item expanded "><a href="../stale/0102-offchain-parachain-runtime-upgrades.html" class="active">RFC-0000: Feature Name Here</a></li><li class="chapter-item expanded "><a href="../stale/0106-xcm-remove-fees-mode.html">RFC-0106: Remove XCM fees mode</a></li><li class="chapter-item expanded "><a href="../stale/0109-xcm-descend-instead-of-clear-origin.html">RFC-0109: Descend XCM origin instead of clearing it where possible</a></li><li class="chapter-item expanded "><a href="../stale/TODO-stale-nomination-reward-curve.html">RFC-TODO: Stale Nomination Reward Curve</a></li></ol>
</div>
<div id="sidebar-resize-handle" class="sidebar-resize-handle"></div>
</nav>
<!-- Track and set sidebar scroll position -->
<script>
var sidebarScrollbox = document.querySelector('#sidebar .sidebar-scrollbox');
sidebarScrollbox.addEventListener('click', function(e) {
if (e.target.tagName === 'A') {
sessionStorage.setItem('sidebar-scroll', sidebarScrollbox.scrollTop);
}
}, { passive: true });
var sidebarScrollTop = sessionStorage.getItem('sidebar-scroll');
sessionStorage.removeItem('sidebar-scroll');
if (sidebarScrollTop) {
// preserve sidebar scroll position when navigating via links within sidebar
sidebarScrollbox.scrollTop = sidebarScrollTop;
} else {
// scroll sidebar to current active section when navigating via "next/previous chapter" buttons
var activeSection = document.querySelector('#sidebar .active');
if (activeSection) {
activeSection.scrollIntoView({ block: 'center' });
}
}
</script>
<div id="page-wrapper" class="page-wrapper">
<div class="page">
<div id="menu-bar-hover-placeholder"></div>
<div id="menu-bar" class="menu-bar sticky">
<div class="left-buttons">
<label id="sidebar-toggle" class="icon-button" for="sidebar-toggle-anchor" title="Toggle Table of Contents" aria-label="Toggle Table of Contents" aria-controls="sidebar">
<i class="fa fa-bars"></i>
</label>
<button id="theme-toggle" class="icon-button" type="button" title="Change theme" aria-label="Change theme" aria-haspopup="true" aria-expanded="false" aria-controls="theme-list">
<i class="fa fa-paint-brush"></i>
</button>
<ul id="theme-list" class="theme-popup" aria-label="Themes" role="menu">
<li role="none"><button role="menuitem" class="theme" id="polkadot">Polkadot</button></li>
<li role="none"><button role="menuitem" class="theme" id="light">Light</button></li>
<li role="none"><button role="menuitem" class="theme" id="rust">Rust</button></li>
<li role="none"><button role="menuitem" class="theme" id="coal">Coal</button></li>
<li role="none"><button role="menuitem" class="theme" id="navy">Navy</button></li>
<li role="none"><button role="menuitem" class="theme" id="ayu">Ayu</button></li>
</ul>
<button id="search-toggle" class="icon-button" type="button" title="Search. (Shortkey: s)" aria-label="Toggle Searchbar" aria-expanded="false" aria-keyshortcuts="S" aria-controls="searchbar">
<i class="fa fa-search"></i>
</button>
</div>
<h1 class="menu-title">Polkadot Fellowship RFCs</h1>
<div class="right-buttons">
<a href="../print.html" title="Print this book" aria-label="Print this book">
<i id="print-button" class="fa fa-print"></i>
</a>
</div>
</div>
<div id="search-wrapper" class="hidden">
<form id="searchbar-outer" class="searchbar-outer">
<input type="search" id="searchbar" name="searchbar" placeholder="Search this book ..." aria-controls="searchresults-outer" aria-describedby="searchresults-header">
</form>
<div id="searchresults-outer" class="searchresults-outer hidden">
<div id="searchresults-header" class="searchresults-header"></div>
<ul id="searchresults">
</ul>
</div>
</div>
<!-- Apply ARIA attributes after the sidebar and the sidebar toggle button are added to the DOM -->
<script>
document.getElementById('sidebar-toggle').setAttribute('aria-expanded', sidebar === 'visible');
document.getElementById('sidebar').setAttribute('aria-hidden', sidebar !== 'visible');
Array.from(document.querySelectorAll('#sidebar a')).forEach(function(link) {
link.setAttribute('tabIndex', sidebar === 'visible' ? 0 : -1);
});
</script>
<div id="content" class="content">
<main>
<p><a href="https://github.com/polkadot-fellows/RFCs/pull/102">(source)</a></p>
<p><strong>Table of Contents</strong></p>
<ul>
<li><a href="#rfc-0000-feature-name-here">RFC-0000: Feature Name Here</a>
<ul>
<li><a href="#summary">Summary</a></li>
<li><a href="#motivation">Motivation</a></li>
<li><a href="#stakeholders">Stakeholders</a></li>
<li><a href="#explanation">Explanation</a>
<ul>
<li><a href="#introduce-a-new-ump-message-type-requestcodeupgrade">Introduce a new UMP message type <code>RequestCodeUpgrade</code></a></li>
<li><a href="#handle-requestcodeupgrade-on-backers">Handle <code>RequestCodeUpgrade</code> on backers</a></li>
<li><a href="#get-the-new-code-to-all-validators">Get the new code to all validators</a></li>
<li><a href="#on-chain-code-upgrade-process">On-chain code upgrade process</a></li>
<li><a href="#handling-new-validators">Handling new validators</a></li>
<li><a href="#how-do-other-parties-get-hold-of-the-pvf">How do other parties get hold of the PVF?</a></li>
<li><a href="#pruning">Pruning</a></li>
</ul>
</li>
<li><a href="#drawbacks">Drawbacks</a></li>
<li><a href="#testing-security-and-privacy">Testing, Security, and Privacy</a></li>
<li><a href="#performance-ergonomics-and-compatibility">Performance, Ergonomics, and Compatibility</a>
<ul>
<li><a href="#performance">Performance</a></li>
<li><a href="#ergonomics">Ergonomics</a></li>
<li><a href="#compatibility">Compatibility</a></li>
</ul>
</li>
<li><a href="#prior-art-and-references">Prior Art and References</a></li>
<li><a href="#unresolved-questions">Unresolved Questions</a></li>
<li><a href="#future-directions-and-related-material">Future Directions and Related Material</a>
<ul>
<li><a href="#further-hardening">Further Hardening</a></li>
<li><a href="#generalize-this-off-chain-storage-mechanism">Generalize this off-chain storage mechanism?</a></li>
</ul>
</li>
</ul>
</li>
</ul>
<h1 id="rfc-0000-feature-name-here"><a class="header" href="#rfc-0000-feature-name-here">RFC-0000: Feature Name Here</a></h1>
<div class="table-wrapper"><table><thead><tr><th></th><th></th></tr></thead><tbody>
<tr><td><strong>Start Date</strong></td><td>13 July 2024</td></tr>
<tr><td><strong>Description</strong></td><td>Implement off-chain parachain runtime upgrades</td></tr>
<tr><td><strong>Authors</strong></td><td>eskimor</td></tr>
</tbody></table>
</div>
<h2 id="summary"><a class="header" href="#summary">Summary</a></h2>
<p>Change the parachain runtime upgrade process so that it becomes an off-chain
process with regard to the relay chain. Upgrades are still contained in
parachain blocks, but no longer need to end up in relay chain blocks or in
relay chain state.</p>
<h2 id="motivation"><a class="header" href="#motivation">Motivation</a></h2>
<p>Having parachain runtime upgrades go through the relay chain has always been
seen as a scalability concern. Due to optimizations in statement
distribution and asynchronous backing, it became less crucial and got
de-prioritized. The original issue can be found
<a href="https://github.com/paritytech/polkadot-sdk/issues/971">here</a>.</p>
<p>With the introduction of Agile Coretime and our general efforts to further
reduce the barrier to entry for Polkadot, the issue becomes more relevant again:
We would like to reduce the required storage deposit for PVF registration, with
the aim of not only making it cheaper to run a parachain (bulk + on-demand
coretime), but also reducing the amount of capital required for the deposit. With
this we would hope for far more parachains to get registered: thousands,
potentially even tens of thousands. With so many PVFs registered, updates are
expected to become more frequent, and even attacks on service quality for other
parachains would become a higher risk.</p>
<h2 id="stakeholders"><a class="header" href="#stakeholders">Stakeholders</a></h2>
<ul>
<li>Parachain Teams</li>
<li>Relay Chain Node implementation teams</li>
<li>Relay Chain runtime developers</li>
</ul>
<h2 id="explanation"><a class="header" href="#explanation">Explanation</a></h2>
<p>The issues with on-chain runtime upgrades are:</p>
<ol>
<li>Needlessly costly.</li>
<li>A single runtime upgrade more or less occupies an entire relay chain block, so it
might also affect other parachains, especially if their candidates are also
not negligible in size (due to messages, for example) or if they want to upgrade their
runtime at the same time.</li>
<li>The signalling of the parachain to notify the relay chain of an upcoming
runtime upgrade already contains the upgrade. Therefore the only way to rate
limit upgrades is to drop an already distributed update, megabytes in size,
with the result that the parachain misses a block and, more
importantly, will try again with the very next block, until it finally
succeeds. If we imagine reducing the capacity for runtime upgrades to, let's say, 1
every 100 relay chain blocks, this results in lots of wasted effort and lost
blocks.</li>
</ol>
<p>We discussed introducing a separate signalling step before submitting the actual
runtime, but I think we should go one step further and make upgrades fully
off-chain. This also helps bring down deposit costs in a secure way, as we
are actually reducing costs for the network.</p>
<h3 id="introduce-a-new-ump-message-type-requestcodeupgrade"><a class="header" href="#introduce-a-new-ump-message-type-requestcodeupgrade">Introduce a new UMP message type <code>RequestCodeUpgrade</code></a></h3>
<p>As part of elastic scaling we are already planning to increase the flexibility of <a href="https://github.com/polkadot-fellows/RFCs/issues/92#issuecomment-2144538974">UMP
messages</a>. We can now use this to our advantage and introduce another UMP message:</p>
<pre><pre class="playground"><code class="language-rust"><span class="boring">#![allow(unused)]
</span><span class="boring">fn main() {
</span>enum UMPSignal {
    // For elastic scaling
    OnCore(CoreIndex),
    // For off-chain upgrades
    RequestCodeUpgrade(Hash),
}
<span class="boring">}</span></code></pre></pre>
<p>We could also make that new message a regular XCM, calling an extrinsic on the
relay chain, but we will want to inspect that message on the node side, right after
validation on the backers, which makes a straightforward semantic message more
apt for the purpose.</p>
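<p>To illustrate why a plain semantic message is convenient on the node side, here is a minimal sketch of a backer mapping each signal to an action. The <code>Hash</code> and <code>CoreIndex</code> definitions are placeholders, and <code>BackerAction</code> and <code>on_signal</code> are hypothetical names that are not part of this RFC.</p>

```rust
/// Placeholder types: the concrete `Hash` and `CoreIndex` definitions are
/// assumptions made only to keep this sketch self-contained.
type Hash = [u8; 32];

#[derive(Clone, Copy, Debug, PartialEq)]
struct CoreIndex(u32);

#[derive(Clone, Copy, Debug, PartialEq)]
enum UMPSignal {
    // For elastic scaling
    OnCore(CoreIndex),
    // For off-chain upgrades
    RequestCodeUpgrade(Hash),
}

/// Hypothetical node-side reaction of a backer to a signal.
#[derive(Debug, PartialEq)]
enum BackerAction {
    /// Remember which core the candidate is meant for.
    NoteCore(CoreIndex),
    /// Fetch the code blob with this hash from whoever sent the collation.
    FetchCode(Hash),
}

/// Maps a signal found in a validated candidate to the action a backer takes.
fn on_signal(signal: UMPSignal) -> BackerAction {
    match signal {
        UMPSignal::OnCore(core) => BackerAction::NoteCore(core),
        UMPSignal::RequestCodeUpgrade(hash) => BackerAction::FetchCode(hash),
    }
}
```

<p>No XCM decoding or execution is involved in this sketch: the backer only needs to match on the signal variant.</p>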
<h3 id="handle-requestcodeupgrade-on-backers"><a class="header" href="#handle-requestcodeupgrade-on-backers">Handle <code>RequestCodeUpgrade</code> on backers</a></h3>
<p>We will introduce a new request/response protocol for both collators and
validators, with the following request/response:</p>
<pre><pre class="playground"><code class="language-rust"><span class="boring">#![allow(unused)]
</span><span class="boring">fn main() {
</span>struct RequestBlob {
    blob_hash: Hash,
}
struct BlobResponse {
    blob: Vec&lt;u8&gt;,
}
<span class="boring">}</span></code></pre></pre>
<p>This protocol will be used by backers to request the PVF from collators under the
following conditions:</p>
<ol>
<li>They received a collation sending <code>RequestCodeUpgrade</code>.</li>
<li>They received a collation, but they don't yet have the code that was
previously registered on the relay chain (e.g. disk pruned, new validator).</li>
</ol>
<p>In case they received the collation via PoV distribution instead of from the
collator itself, they will use the exact same message to fetch from the validator
they got the PoV from.</p>
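<p>The fetch decision described above can be sketched as follows. This is only an illustration: <code>PeerId</code>, <code>CollationSource</code>, <code>FetchPlan</code> and <code>plan_fetch</code> are hypothetical names, and the local blob store is modelled as a plain slice of hashes.</p>

```rust
type Hash = [u8; 32];
/// Hypothetical peer identifier; real nodes use a networking-layer type.
type PeerId = u64;

/// Who delivered the collation (and therefore who is asked for the blob).
#[derive(Clone, Copy, Debug, PartialEq)]
enum CollationSource {
    Collator(PeerId),
    /// Received via PoV distribution from another validator.
    Validator(PeerId),
}

#[derive(Debug, PartialEq)]
struct RequestBlob {
    blob_hash: Hash,
}

/// Hypothetical result of deciding whether a fetch is needed.
#[derive(Debug, PartialEq)]
enum FetchPlan {
    AlreadyStored,
    Request { from: PeerId, request: RequestBlob },
}

/// `needed` is either the hash from `RequestCodeUpgrade` or the hash of the
/// code currently registered on the relay chain; `on_disk` lists local blobs.
fn plan_fetch(needed: Hash, on_disk: &[Hash], source: CollationSource) -> FetchPlan {
    if on_disk.contains(&needed) {
        return FetchPlan::AlreadyStored;
    }
    // The same request goes to the collator or to the forwarding validator.
    let from = match source {
        CollationSource::Collator(peer) | CollationSource::Validator(peer) => peer,
    };
    FetchPlan::Request { from, request: RequestBlob { blob_hash: needed } }
}
```

<p>A real implementation would additionally check that the returned blob hashes to <code>blob_hash</code> before storing it; that step is omitted here.</p>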
<h3 id="get-the-new-code-to-all-validators"><a class="header" href="#get-the-new-code-to-all-validators">Get the new code to all validators</a></h3>
<p>Once the candidate issuing <code>RequestCodeUpgrade</code> got backed on chain, validators
will start fetching the code from the backers as part of availability
distribution.</p>
<p>To mitigate attack vectors we should make sure that serving requests for code
can be treated as low priority requests. Thus I am suggesting the following
scheme:</p>
<p>Validators will notice via a runtime API (TODO: Define) that a new code has been requested, the
API will return the <code>Hash</code> and a counter, which starts at some configurable
value e.g. 10. The validators are now aware of the new hash and start fetching,
but they don't have to wait for the fetch to succeed to sign their bitfield.</p>
<p>Then on each further candidate from that chain that counter gets decremented.
Validators which have not yet succeeded in fetching will now try again. This game
continues until the counter reaches <code>0</code>. From then on it is mandatory to have the code in
order to sign a <code>1</code> in the bitfield.</p>
<p>PVF pre-checking will happen after the candidate which brought the counter to
<code>0</code> has been successfully included and thus is also able to assume that 2/3 of
the validators have the code.</p>
<p>This scheme serves two purposes:</p>
<ol>
<li>Fetching can happen over a longer period of time with low priority. E.g. if
we waited for the PVF at the very first availability distribution, this might
actually affect liveness of other chains on the same core. Distributing
megabytes of data to a thousand validators might take a while. Thus this helps
isolate parachains from each other.</li>
<li>By configuring the initial counter value we can affect how much an upgrade
costs. E.g. forcing the parachain to produce 10 blocks, means 10x the cost
for issuing an update. If too frequent upgrades ever become a problem for the
system, we have a knob to make them more costly.</li>
</ol>
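<p>A minimal sketch of the counter scheme, assuming the runtime API (still to be defined) hands validators a hash and a counter. The names <code>PendingUpgrade</code> and <code>may_sign_available</code> are illustrative only.</p>

```rust
type Hash = [u8; 32];

/// What the (yet to be defined) runtime API is assumed to return.
#[derive(Clone, Copy, Debug, PartialEq)]
struct PendingUpgrade {
    code_hash: Hash,
    counter: u32,
}

impl PendingUpgrade {
    fn new(code_hash: Hash, initial_counter: u32) -> Self {
        PendingUpgrade { code_hash, counter: initial_counter }
    }

    /// Each further candidate of the parachain decrements the counter.
    fn on_further_candidate(&mut self) {
        self.counter = self.counter.saturating_sub(1);
    }

    /// Once the counter is 0, having the code is mandatory.
    fn code_is_mandatory(&self) -> bool {
        self.counter == 0
    }
}

/// Whether a validator may sign a `1` for a candidate of that parachain.
fn may_sign_available(has_chunk: bool, has_new_code: bool, upgrade: &PendingUpgrade) -> bool {
    has_chunk && (has_new_code || !upgrade.code_is_mandatory())
}
```

<p>With an initial counter of 10, a validator that already holds its chunk can keep signing <code>1</code> while the fetch is still in progress; only once the counter has reached <code>0</code> does a missing blob prevent it from doing so.</p>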
<h3 id="on-chain-code-upgrade-process"><a class="header" href="#on-chain-code-upgrade-process">On-chain code upgrade process</a></h3>
<p>First, when a candidate is backed, we need to make the new hash available
(together with a counter) via a
runtime API so validators in availability distribution can check for it and
fetch it if changed (see previous section). For performance reasons, I think we
should not do an additional call, but replace the <a href="https://github.com/paritytech/polkadot-sdk/blob/d2fd53645654d3b8e12cbf735b67b93078d70113/polkadot/node/subsystem-util/src/runtime/mod.rs#L355">existing one</a> with one containing the new additional information (<code>Option&lt;(Hash, Counter)&gt;</code>).</p>
<p>Once the candidate gets included (counter 0), the hash is given to pre-checking,
and only after pre-checking has succeeded (and a full session has passed) is it finally
enacted and the parachain can switch to the new code. (Same process as it used
to be.)</p>
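<p>The life cycle of a single upgrade can be pictured as a small state machine. The sketch below is a simplification under stated assumptions: backing and inclusion of a candidate are collapsed into one event, a failed pre-check is assumed to drop the upgrade, and all type and function names are hypothetical.</p>

```rust
/// Hypothetical phases of one upgrade as tracked by the relay chain runtime.
#[derive(Clone, Copy, Debug, PartialEq)]
enum UpgradeState {
    /// Code is being distributed; `counter` candidates remain.
    Distributing { counter: u32 },
    PreChecking,
    /// Pre-checking succeeded; waiting for a full session to pass.
    AwaitingSession,
    Enacted,
    /// Assumption: a failed pre-check drops the upgrade.
    Rejected,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Event {
    /// A further candidate of the parachain was backed and included.
    CandidateIncluded,
    PreCheckPassed,
    PreCheckFailed,
    SessionPassed,
}

fn step(state: UpgradeState, event: Event) -> UpgradeState {
    match (state, event) {
        (UpgradeState::Distributing { counter }, Event::CandidateIncluded) => {
            // The candidate that brings the counter to 0 hands over to pre-checking.
            match counter.saturating_sub(1) {
                0 => UpgradeState::PreChecking,
                left => UpgradeState::Distributing { counter: left },
            }
        }
        (UpgradeState::PreChecking, Event::PreCheckPassed) => UpgradeState::AwaitingSession,
        (UpgradeState::PreChecking, Event::PreCheckFailed) => UpgradeState::Rejected,
        (UpgradeState::AwaitingSession, Event::SessionPassed) => UpgradeState::Enacted,
        // Any other combination leaves the state unchanged.
        (unchanged, _) => unchanged,
    }
}
```

<p>Starting from <code>Distributing { counter: 2 }</code>, two included candidates lead to <code>PreChecking</code>, and a passed pre-check followed by a passed session leads to <code>Enacted</code>.</p>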
<h3 id="handling-new-validators"><a class="header" href="#handling-new-validators">Handling new validators</a></h3>
<h4 id="backers"><a class="header" href="#backers">Backers</a></h4>
<p>If a backer receives a collation for a parachain for which it does not yet have the code
as enacted on chain (see &quot;On-chain code upgrade process&quot;), it will use the above
request/response protocol to fetch it from the peer it received the collation from.</p>
<h4 id="availablity-distribution"><a class="header" href="#availablity-distribution">Availability Distribution</a></h4>
<p>Validators in availability distribution will be changed to only sign a <code>1</code> in
the bitfield of a candidate if they not only have the chunk, but also the
currently active PVF. They will fetch it from backers in case they don't have it
yet.</p>
<h3 id="how-do-other-parties-get-hold-of-the-pvf"><a class="header" href="#how-do-other-parties-get-hold-of-the-pvf">How do other parties get hold of the PVF?</a></h3>
<p>Two ways:</p>
<ol>
<li>Discover collators via <a href="https://github.com/polkadot-fellows/RFCs/pull/8">relay chain DHT</a> and request from them: Preferred way,
as it is less load on validators.</li>
<li>Request from validators, which will serve on a best effort basis.</li>
</ol>
<h3 id="pruning"><a class="header" href="#pruning">Pruning</a></h3>
<p>We covered how validators get hold of new code, but when can they prune old code?
In principle it is not an issue if some validators prune code, because:</p>
<ol>
<li>We changed it so that a candidate is not deemed available if validators were
not able to fetch the PVF.</li>
<li>Backers can always fetch the PVF from collators as part of the collation
fetching.</li>
</ol>
<p>But the majority of validators should always keep the latest code of any
parachain and only prune the previous one, once the first candidate using the
new code got finalized. This ensures that disputes will always be able to
resolve.</p>
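<p>As a rough sketch of that pruning rule, assuming a validator tracks per parachain the hash of the latest code and whether the first candidate using it has been finalized (<code>CodeStatus</code> and <code>may_prune</code> are hypothetical names):</p>

```rust
type Hash = [u8; 32];

/// Hypothetical per-parachain view a validator keeps for pruning decisions.
struct CodeStatus {
    /// Hash of the latest code of the parachain.
    latest: Hash,
    /// True once the first candidate using `latest` got finalized.
    first_candidate_on_latest_finalized: bool,
}

/// The latest code is always kept; older code may only be pruned once the
/// first candidate using the latest code is finalized.
fn may_prune(stored: &Hash, status: &CodeStatus) -> bool {
    *stored != status.latest && status.first_candidate_on_latest_finalized
}
```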
<h2 id="drawbacks"><a class="header" href="#drawbacks">Drawbacks</a></h2>
<p>The major drawback of this solution is the same as for any solution that moves work
off-chain: it adds complexity to the node. E.g. nodes needing the PVF need to
store it separately, together with their own pruning strategy.</p>
<h2 id="testing-security-and-privacy"><a class="header" href="#testing-security-and-privacy">Testing, Security, and Privacy</a></h2>
<p>Implementations adhering to this RFC will respond to PVF requests with the
actual PVF, if they have it. Requesters will persist received PVFs on disk
until they are replaced by a new one. Implementations must not be lazy
here: if validators only fetched the PVF when needed, they could be prevented from
participating in disputes.</p>
<p>Validators should treat incoming requests for PVFs in general with rather low
priority, but should prefer fetches from other validators over requests from
random peers.</p>
<p>Given that we are altering what set bits in the availability bitfields mean (not
only chunk, but also PVF available), it is important to have enough validators
upgraded before we allow collators to make use of the new runtime upgrade
mechanism. Otherwise we would risk disputes not being able to succeed.</p>
<p>This RFC has no impact on privacy.</p>
<h2 id="performance-ergonomics-and-compatibility"><a class="header" href="#performance-ergonomics-and-compatibility">Performance, Ergonomics, and Compatibility</a></h2>
<h3 id="performance"><a class="header" href="#performance">Performance</a></h3>
<p>This proposal lightens the load on the relay chain and is thus in general
beneficial for the performance of the network. This is achieved by the
following:</p>
<ol>
<li>Code upgrades are still propagated to all validators, but only once, not
twice (First statements, then via the containing relay chain block).</li>
<li>Code upgrades are only communicated to validators and other nodes which are
interested, not any full node as it has been before.</li>
<li>Relay chain block space is preserved. Previously we could only do one runtime
upgrade per relay chain block, occupying almost all of the blockspace.</li>
<li>Signalling an upgrade no longer contains the upgrade, hence if we need to
push back on an upgrade for whatever reason, no network bandwidth and core
time gets wasted because of this.</li>
</ol>
<h3 id="ergonomics"><a class="header" href="#ergonomics">Ergonomics</a></h3>
<p>End users are only affected by better performance and more stable block times.
Parachains will need to implement the introduced request/response protocol and
adapt to the new signalling mechanism via a <code>UMP</code> message, instead of sending
the code upgrade directly.</p>
<p>For parachain operators we should emit events on an initiated runtime upgrade and
on each block, reporting the current counter and how many blocks remain until the
upgrade gets passed to pre-checking. This is especially important for on-demand
chains or bulk users not occupying a full core. Furthermore, the behaviour of
requiring multiple blocks to fully initiate a runtime upgrade needs to be well
documented.</p>
<h3 id="compatibility"><a class="header" href="#compatibility">Compatibility</a></h3>
<p>We will continue to support the old mechanism for code upgrades for a while, but
will start to impose stricter limits over time, with the number of registered
parachains going up. With those limits in place parachains not migrating to the
new scheme might be having a harder time upgrading and will miss more blocks. I
guess we can be lenient for a while still, so the upgrade path for
parachains should be rather smooth.</p>
<p>In total the protocol changes we need are:</p>
<p>For validators and collators:</p>
<ol>
<li>New request/response protocol for fetching PVF data from collators and
validators.</li>
<li>New UMP message type for signalling a runtime upgrade.</li>
</ol>
<p>Only for validators:</p>
<ol>
<li>New runtime API for determining to-be-enacted code upgrades.</li>
<li>Different behaviour of bitfields (only sign a 1 bit, if validator has chunk +
&quot;hot&quot; PVF).</li>
<li>Altered behaviour in availability-distribution: Fetch missing PVFs.</li>
</ol>
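<p>The changed bitfield rule can be sketched as below. This is an illustration only: <code>CoreAvailability</code>, <code>availability_bit</code> and <code>bitfield</code> are hypothetical names, and the sketch assumes that the &quot;hot&quot; PVF is the code of a signalled upgrade that is still being distributed, so the extra condition only applies while such an upgrade is pending.</p>

```rust
/// Hypothetical local view of a validator for one occupied core.
struct CoreAvailability {
    /// The validator holds its erasure chunk for the candidate.
    has_chunk: bool,
    /// `None`: no upgrade is pending for this parachain (assumption).
    /// `Some(b)`: an upgrade is pending and `b` says whether the validator
    /// already holds the corresponding "hot" PVF.
    has_pending_pvf: Option<bool>,
}

/// Sign a 1 bit only if the chunk is present and, when an upgrade is
/// pending, the "hot" PVF is present as well.
fn availability_bit(core: &CoreAvailability) -> bool {
    core.has_chunk && core.has_pending_pvf.unwrap_or(true)
}

/// One bit per core, in core order.
fn bitfield(cores: &[CoreAvailability]) -> Vec<bool> {
    cores.iter().map(availability_bit).collect()
}

fn main() {
    let cores = [
        CoreAvailability { has_chunk: true, has_pending_pvf: None },
        CoreAvailability { has_chunk: true, has_pending_pvf: Some(false) },
    ];
    println!("{:?}", bitfield(&cores));
}
```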
<h2 id="prior-art-and-references"><a class="header" href="#prior-art-and-references">Prior Art and References</a></h2>
<p>Off-chain runtime upgrades have been discussed before. The architecture
described here is simpler, though, as it piggybacks on already existing features,
namely:</p>
<ol>
<li>availability-distribution: No separate <code>I have code</code> messages anymore.</li>
<li>Existing pre-checking.</li>
</ol>
<p>https://github.com/paritytech/polkadot-sdk/issues/971</p>
<h2 id="unresolved-questions"><a class="header" href="#unresolved-questions">Unresolved Questions</a></h2>
<ol>
<li>What about the initial runtime, shall we make that off-chain as well?</li>
<li>Good news: at least after the first upgrade, no code will be stored on chain
any more. This means that we also have to redefine the storage deposit.
We no longer charge for chain storage, but for validator disk storage, which should
be cheaper. Solution to this: Not only store the hash on chain, but also the
size of the data. Then define a price per byte and charge that, but:
<ul>
<li>how do we charge - I guess deposit has to be provided via other means,
runtime upgrade fails if not provided.</li>
<li>how do we signal to the chain that the code is too large for it to reject
the upgrade? Easy: Make available and vote nay in pre-checking.</li>
</ul>
</li>
</ol>
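<p>The size-based deposit from the second question could be sketched as follows. Everything here is an assumption for illustration: the names <code>CodeInfo</code>, <code>required_deposit</code> and <code>check_deposit</code>, the value of <code>PRICE_PER_BYTE</code>, and the choice to fail the upgrade when the provided deposit is too small.</p>

```rust
/// Hypothetical on-chain record: the hash and size of the code, not the code itself.
#[derive(Debug)]
struct CodeInfo {
    hash: [u8; 32],
    size_bytes: u32,
}

/// Hypothetical price per byte of validator disk storage, in the smallest unit.
const PRICE_PER_BYTE: u128 = 1_000;

#[derive(Debug, PartialEq)]
enum UpgradeError {
    InsufficientDeposit { required: u128, provided: u128 },
}

/// Deposit owed for storing code of the recorded size.
fn required_deposit(info: &CodeInfo) -> u128 {
    u128::from(info.size_bytes) * PRICE_PER_BYTE
}

/// The upgrade fails if the deposit provided "via other means" is too small.
fn check_deposit(info: &CodeInfo, provided: u128) -> Result<(), UpgradeError> {
    let required = required_deposit(info);
    if provided >= required {
        Ok(())
    } else {
        Err(UpgradeError::InsufficientDeposit { required, provided })
    }
}

fn main() {
    let info = CodeInfo { hash: [0u8; 32], size_bytes: 2_000_000 };
    println!("{:?} needs a deposit of {}", info, required_deposit(&info));
    println!("{:?}", check_deposit(&info, 1));
}
```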
<p>TODO: Fully resolve these questions and incorporate in RFC text.</p>
<h2 id="future-directions-and-related-material"><a class="header" href="#future-directions-and-related-material">Future Directions and Related Material</a></h2>
<h3 id="further-hardening"><a class="header" href="#further-hardening">Further Hardening</a></h3>
<p>By no longer having code upgrades go through the relay chain, occupying a full relay
chain block, the impact on other parachains is already greatly reduced, if we
make distribution and PVF pre-checking low-priority processes on validators. The
only thing attackers might be able to do is delay upgrades of other parachains.</p>
<p>This seems like a problem to be solved once we actually see it in
the wild (and it can already be mitigated by adjusting the counter). The good thing
is that we have all the ingredients to go further if need be. Signalling no
longer actually includes the code, hence there is no need to reject the
candidate: The parachain can make progress even if we choose not to immediately
act on the request and no relay chain resources are wasted either.</p>
<p>We could for example introduce another UMP Signalling message
<code>RequestCodeUpgradeWithPriority</code> which not just requests a code upgrade, but
also offers some DOT to get ranked up in a queue.</p>
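<p>One way such a queue could behave is sketched below. The types <code>UpgradeRequest</code> and <code>UpgradeQueue</code>, the tie-breaking by arrival order, and treating a plain request as an offer of zero are all assumptions made for this illustration.</p>

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical queue entry for a signalled code upgrade.
#[derive(Debug, PartialEq, Eq)]
struct UpgradeRequest {
    para_id: u32,
    /// Offered amount; a request without priority is modelled as 0.
    offered: u128,
    /// Arrival order, used to break ties in favour of earlier requests.
    seq: u64,
}

impl Ord for UpgradeRequest {
    fn cmp(&self, other: &Self) -> Ordering {
        self.offered
            .cmp(&other.offered)
            // Lower sequence number (earlier arrival) ranks higher.
            .then_with(|| other.seq.cmp(&self.seq))
            .then_with(|| self.para_id.cmp(&other.para_id))
    }
}

impl PartialOrd for UpgradeRequest {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

#[derive(Default)]
struct UpgradeQueue {
    heap: BinaryHeap<UpgradeRequest>,
    next_seq: u64,
}

impl UpgradeQueue {
    /// Enqueue a request; `offered` is 0 for a request without priority.
    fn request(&mut self, para_id: u32, offered: u128) {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.heap.push(UpgradeRequest { para_id, offered, seq });
    }

    /// The parachain whose upgrade should be acted on next, if any.
    fn pop_next(&mut self) -> Option<u32> {
        self.heap.pop().map(|r| r.para_id)
    }
}

fn main() {
    let mut queue = UpgradeQueue::default();
    queue.request(1000, 0);
    queue.request(2000, 5);
    while let Some(para_id) = queue.pop_next() {
        println!("next upgrade: parachain {}", para_id);
    }
}
```

<p>Because signalling does not carry the code, a request that sits in such a queue for a while costs the relay chain nothing beyond the queue entry itself.</p>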
<h3 id="generalize-this-off-chain-storage-mechanism"><a class="header" href="#generalize-this-off-chain-storage-mechanism">Generalize this off-chain storage mechanism?</a></h3>
<p>Making this storage mechanism more general purpose is worth thinking about. E.g.
by resolving the &quot;fee&quot; question above, we might also be able to resolve the pruning
question in a more generic way and thus could indeed open this storage facility
for other purposes as well, e.g. smart contracts: the PoV would only need to
reference contracts by hash, while the actual contract code is stored on validators and
collators and thus no longer needs to be part of the PoV.</p>
<p>A possible avenue would be to change the response to:</p>
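<pre><pre class="playground"><code class="language-rust"><span class="boring">#![allow(unused)]
</span><span class="boring">fn main() {
</span>enum BlobResponse {
    /// The requested hash identifies a single blob: respond with its payload.
    Blob(Vec&lt;u8&gt;),
    /// The requested hash is a merkle root: respond with the tree (hashes only).
    Blobs(MerkleTree),
}
<span class="boring">}</span></code></pre></pre>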
<p>With this the hash specified in the request can also be a merkle root and the
responder will respond with the entire merkle tree (only hashes, no payload).
Then the requester can traverse the leaf hashes and use the same
request/response protocol to request any locally missing blobs in that tree.</p>
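<p>The requester side of this could be sketched as follows. The <code>MerkleTree</code> here is a hypothetical stand-in that only exposes leaf hashes, <code>missing_leaves</code> and the helper <code>h</code> are made-up names, and verifying the tree against the requested root is left out because it needs a cryptographic hash function.</p>

```rust
use std::collections::HashSet;

type Hash = [u8; 32];

/// Hypothetical stand-in for the `MerkleTree` in the response: hashes only,
/// no payload. Inner nodes are omitted for brevity.
struct MerkleTree {
    leaves: Vec<Hash>,
}

/// Leaf hashes that are not in the local store and therefore still have to
/// be requested, each at most once and in tree order.
fn missing_leaves(tree: &MerkleTree, local: &HashSet<Hash>) -> Vec<Hash> {
    let mut seen = HashSet::new();
    tree.leaves
        .iter()
        .filter(|leaf| !local.contains(*leaf) && seen.insert(**leaf))
        .copied()
        .collect()
}

/// Helper producing a dummy hash for the example.
fn h(byte: u8) -> Hash {
    [byte; 32]
}

fn main() {
    // The PVF (leaf 1) is already stored locally; two contracts are not.
    let tree = MerkleTree { leaves: vec![h(1), h(2), h(3)] };
    let local: HashSet<Hash> = [h(1)].into_iter().collect();
    for leaf in missing_leaves(&tree, &local) {
        println!("request blob with hash starting {:02x}", leaf[0]);
    }
}
```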
<p>One leaf would, for example, be the PVF; others could be smart contracts. With a
properly specified format (e.g. which leaf is the PVF?), what we get here is
that a parachain can update not only its PVF but also additional data,
incrementally. E.g. adding another smart contract does not require resubmitting
the entire PVF to validators: only the root hash on the relay chain gets
updated, then validators fetch the merkle tree and only fetch any missing
leaves. That additional data could be made available to the PVF via a to-be-added
host function. The nice thing about this approach is that, while we can
upgrade incrementally, lifetime is still tied to the PVF and we get all the same
guarantees. Assuming the validators store blobs by hash, we even get disk
sharing if multiple parachains use the same data (e.g. same smart contracts).</p>
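<p>Disk sharing with blobs stored by hash could be sketched like this. The <code>BlobStore</code> type and its reference counting are assumptions of this illustration, not something the RFC specifies; the count stands in for &quot;lifetime tied to the PVF&quot;, in that a blob is dropped once no parachain references it any more.</p>

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

/// Hypothetical validator-side store keyed by blob hash. Each entry keeps
/// the payload and the number of parachains currently referencing it.
#[derive(Default)]
struct BlobStore {
    blobs: HashMap<Hash, (Vec<u8>, usize)>,
}

impl BlobStore {
    /// Registers one more user of `hash`. Returns `true` if the payload was
    /// newly written, `false` if an existing copy is shared.
    fn add_ref(&mut self, hash: Hash, payload: Vec<u8>) -> bool {
        match self.blobs.get_mut(&hash) {
            Some(entry) => {
                entry.1 += 1;
                false
            }
            None => {
                self.blobs.insert(hash, (payload, 1));
                true
            }
        }
    }

    /// Drops one user of `hash`; the blob is removed with its last user.
    fn release(&mut self, hash: &Hash) {
        let now_unused = match self.blobs.get_mut(hash) {
            Some(entry) => {
                entry.1 -= 1;
                entry.1 == 0
            }
            None => false,
        };
        if now_unused {
            self.blobs.remove(hash);
        }
    }

    /// Total payload bytes held by the store.
    fn bytes_stored(&self) -> usize {
        self.blobs.values().map(|(payload, _)| payload.len()).sum()
    }
}

fn main() {
    let mut store = BlobStore::default();
    let contract = [42u8; 32];
    // Two parachains reference the same contract; it is stored once.
    store.add_ref(contract, vec![0u8; 1024]);
    store.add_ref(contract, vec![0u8; 1024]);
    println!("bytes stored: {}", store.bytes_stored());
}
```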
</main>
<nav class="nav-wrapper" aria-label="Page navigation">
<!-- Mobile navigation buttons -->
<a rel="prev" href="../stale/00xx-secondary-marketplace-for-regions.html" class="mobile-nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
<i class="fa fa-angle-left"></i>
</a>
<a rel="next prefetch" href="../stale/0106-xcm-remove-fees-mode.html" class="mobile-nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
<i class="fa fa-angle-right"></i>
</a>
<div style="clear: both"></div>
</nav>
</div>
</div>
<nav class="nav-wide-wrapper" aria-label="Page navigation">
<a rel="prev" href="../stale/00xx-secondary-marketplace-for-regions.html" class="nav-chapters previous" title="Previous chapter" aria-label="Previous chapter" aria-keyshortcuts="Left">
<i class="fa fa-angle-left"></i>
</a>
<a rel="next prefetch" href="../stale/0106-xcm-remove-fees-mode.html" class="nav-chapters next" title="Next chapter" aria-label="Next chapter" aria-keyshortcuts="Right">
<i class="fa fa-angle-right"></i>
</a>
</nav>
</div>
<script>
window.playground_copyable = true;
</script>
<script src="../elasticlunr.min.js"></script>
<script src="../mark.min.js"></script>
<script src="../searcher.js"></script>
<script src="../clipboard.min.js"></script>
<script src="../highlight.js"></script>
<script src="../book.js"></script>
<!-- Custom JS scripts -->
</div>
</body>
</html>