The revive compiler documentation (#424)

This PR adds comprehensive project documentation in the form of an
mdBook.

---------

Signed-off-by: xermicus <cyrill@parity.io>
Signed-off-by: Cyrill Leutwiler <bigcyrill@hotmail.com>
Co-authored-by: LJ <81748770+elle-j@users.noreply.github.com>
Co-authored-by: PG Herveou <pgherveou@gmail.com>
xermicus
2025-12-01 14:58:02 +01:00
committed by GitHub
parent 94b14b079b
commit e7e40a0ded
87 changed files with 14012 additions and 43 deletions
@@ -0,0 +1,113 @@
# CLI usage
We aim to keep the `resolc` CLI usage close to `solc`. However, `resolc` has a few options and behaviors worth knowing about that do not exist in the Ethereum world. This chapter explains them in more detail than the CLI help message.
> [!TIP]
>
> For the complete help about CLI options, please see `resolc --help`.
### LLVM optimization levels
```bash
-O, --optimization <OPTIMIZATION>
```
`resolc` exposes the optimization level setting for the LLVM backend. The performance and size of compiled contracts vary widely between different optimization levels.
Valid levels are the following:
- `0`: No optimizations are applied.
- `1`: Basic optimizations for execution time.
- `2`: Advanced optimizations for execution time.
- `3`: Aggressive optimizations for execution time.
- `s`: Optimize for code size.
- `z`: Aggressively optimize for code size.
By default, `-O3` is applied.
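To see how much the levels differ for a given contract, one option is to compile it once per level and compare the size of the emitted artifacts. The following is a minimal sketch rather than an official workflow: the contract path is a placeholder assumption, the output directories are temporary, and only the `-O`, `--bin` and `-o` options are used.

```bash
#!/usr/bin/env bash
set -eu

# Placeholder path (assumption); point this at your own contract.
CONTRACT="${CONTRACT:-contracts/flipper.sol}"

# The optimization levels listed above.
LEVELS="0 1 2 3 s z"

if command -v resolc >/dev/null 2>&1 && [ -f "$CONTRACT" ]; then
  for level in $LEVELS; do
    # Fresh temporary directory per level, so nothing is overwritten.
    out_dir="$(mktemp -d)"
    resolc "-O$level" --bin "$CONTRACT" -o "$out_dir"
    # Report the on-disk size (in KiB) of everything emitted for this level.
    echo "-O$level: $(du -sk "$out_dir" | cut -f1) KiB in $out_dir"
  done
else
  echo "resolc or $CONTRACT not found; nothing compiled"
fi
```

Disk usage is only a rough proxy for code size; inspecting the individual files gives a more precise comparison.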
### Stack size
```bash
--stack-size <STACK_SIZE>
```
PVM is a register machine with a traditional stack memory space for local variables. This controls the total amount of stack space the contract can use.
You are incentivized to keep this value as small as possible:
1. Increasing the stack size will increase startup costs.
2. The stack size contributes to the total memory size a contract can use, which includes the contract's code size.
Default value: 32768
> [!WARNING]
>
> If the contract uses more stack memory than configured, it will compile fine but eventually revert execution at runtime!
### Heap size
```bash
--heap-size <HEAP_SIZE>
```
Unlike the EVM, PVM lacks dynamic memory metering, so PVM contracts emulate the EVM heap memory with a static buffer. Consequently, instead of infinite memory with quadratically growing gas costs, PVM contracts have a finite amount of memory available at constant gas costs.
You are incentivized to keep this value as small as possible:
1. Increasing the heap size will increase startup costs.
2. The heap size contributes to the total memory size a contract can use, which includes the contract's code size.
Default value: 65536
> [!WARNING]
>
> If the contract uses more heap memory than configured, it will compile fine but eventually revert execution at runtime!
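
Because both buffers count towards the total memory a contract can use, it can help to make the budget explicit before passing the values to the compiler. The sketch below adds up the two defaults. It assumes that both values are byte counts, which the defaults suggest but this chapter does not state explicitly, and the contract path is a placeholder.

```bash
#!/usr/bin/env bash
set -eu

STACK_SIZE=32768   # default --stack-size
HEAP_SIZE=65536    # default --heap-size

# Static memory reserved for the stack plus the emulated heap.
# The contract's code size comes on top of this.
TOTAL=$((STACK_SIZE + HEAP_SIZE))
echo "stack + heap = $TOTAL bytes ($((TOTAL / 1024)) KiB)"

# Placeholder contract path (assumption).
CONTRACT="${CONTRACT:-contracts/flipper.sol}"
if command -v resolc >/dev/null 2>&1 && [ -f "$CONTRACT" ]; then
  resolc --stack-size "$STACK_SIZE" --heap-size "$HEAP_SIZE" \
    --bin "$CONTRACT" -o "$(mktemp -d)"
fi
```

Lowering either value shrinks the budget, but per the warnings above a contract that needs more than configured still compiles and only fails at runtime.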
### solc
```bash
--solc <SOLC>
```
Specify the path to the `solc` executable. By default, the one in `${PATH}` is used.
### Debug artifacts
```bash
--debug-output-dir <DEBUG_OUTPUT_DIRECTORY>
```
Dump all intermediary compiler artifacts to files in the specified directory. This includes the YUL IR, optimized and unoptimized LLVM IR, the ELF object and the PVM assembly. Useful for debugging and development purposes.
### Debug info
```bash
-g
```
Generate source based debug information in the output code file. Useful for debugging and development purposes and disabled by default.
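When investigating unexpected behavior or code size, the two debug options can be combined in a single invocation. A minimal sketch, assuming a placeholder contract path and temporary directories:

```bash
#!/usr/bin/env bash
set -eu

CONTRACT="${CONTRACT:-contracts/flipper.sol}"   # placeholder path (assumption)
DEBUG_DIR="$(mktemp -d)"                        # receives the intermediate artifacts
OUT_DIR="$(mktemp -d)"                          # receives the final code blob

if command -v resolc >/dev/null 2>&1 && [ -f "$CONTRACT" ]; then
  # -g adds debug information; --debug-output-dir dumps the intermediate artifacts.
  resolc -g --debug-output-dir "$DEBUG_DIR" --bin "$CONTRACT" -o "$OUT_DIR"
  # List what was dumped (YUL IR, LLVM IR, ELF object, PVM assembly).
  ls -l "$DEBUG_DIR"
else
  echo "resolc or $CONTRACT not found; nothing compiled"
fi
```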
### Deploy time linking
```bash
--link [--libraries <LIBRARIES>] <INPUT_FILES>
```
In Solidity, three things can happen with libraries:
1. They are not `extern`ally callable and thus can be inlined.
   1. The solc Solidity optimizer inlines those (usually the case). Note: `resolc` always activates the solc Solidity optimizer.
   2. If the solc Solidity optimizer is disabled or for some reason fails to inline them (both rare), they are not inlined and require linking.
2. They are `extern`ally callable but still linked at compile time. This is the case if the library address is known at compile time (i.e. `--libraries` is supplied on the CLI or via the corresponding setting in the STD JSON input).
3. They are linked at deploy time. This happens when the compiler does not know the library address (i.e. the `--libraries` flag is missing or the provided libraries are incomplete; the same applies to STD JSON input). This case is rare because it is discouraged and should never be used by production dApps.
In cases `1.2` and `3`:
- Some of the produced code blobs will be in the "unlinked" raw `ELF` object format and not yet deployable.
- To make them deployable, they need to be "linked" (done using the `resolc --link` linker mode explained below).
- The compiler emitted `DELEGATECALL` instructions to call non-inlined (unlinked) libraries. The contract deployer must make sure to deploy any libraries prior to contract deployment.
> [!WARNING]
>
> Using deploy time linking is officially **discouraged**, mainly because bytecode hashes change after the fact. We decided to support it in `resolc` regardless, due to popular request.
Similar to how it works in `solc`, `--libraries` may be used to provide libraries during linking mode.
Unlike with `solc`, where linking implies a simple string substitution mechanism, `resolc` needs to resolve actual missing `ELF` symbols. This is due to how factory dependencies work in PVM. As a consequence, it isn't sufficient to just provide the unlinked blobs to the linker. Instead, they must be provided in the exact same directory structure in which the Solidity source code was found at compile time.
Example:
- The contract `src/foo/bar.sol:Bar` is involved in deploy time linking. It may be a factory dependency.
- The contract blob needs to be provided inside a relative `src/foo/` directory to `--link`. Otherwise symbol resolution may fail.
> [!NOTE]
>
> Tooling is supposed to take care of this. In the future, we may append explicit linkage data to simplify the deploy time linking feature.
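
The directory requirement from the example above can be sketched as a dry run. Everything named here is an illustrative assumption: the staging directory, the blob file name, and the idea of invoking the linker from the staging root. The script only recreates the layout and prints the command it would run; `<LIBRARIES>` is left as the placeholder from the usage line.

```bash
#!/usr/bin/env bash
set -eu

# All names below are illustrative assumptions, not resolc conventions.
STAGING="$(mktemp -d)"      # directory the linker would be invoked from
SOURCE_DIR="src/foo"        # where bar.sol lived at compile time
BLOB_NAME="Bar.unlinked"    # placeholder file name for the unlinked ELF blob

# Recreate the compile time directory structure inside the staging area.
mkdir -p "$STAGING/$SOURCE_DIR"
BLOB_PATH="$SOURCE_DIR/$BLOB_NAME"

# In a real setup, the unlinked blob emitted by the compiler is copied here:
#   cp path/to/emitted/blob "$STAGING/$BLOB_PATH"

# Dry run: print the linker invocation instead of executing it.
echo "cd $STAGING && resolc --link --libraries <LIBRARIES> $BLOB_PATH"
```

The point of the layout is that the blob is passed with the same relative path (`src/foo/`) that its Solidity source had during compilation.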
@@ -0,0 +1,94 @@
# Differences to EVM
This section highlights some potentially observable differences in the [YUL EVM dialect](https://docs.soliditylang.org/en/latest/yul.html#evm-dialect) translation compared to Ethereum Solidity.
Solidity developers deploying dApps to [`pallet-revive`](https://github.com/paritytech/polkadot-sdk/tree/master/substrate/frame/revive) ought to read and understand this section well.
## Deploy code vs. runtime code
Our contract runtime does not differentiate between runtime code and deploy (constructor) code.
Instead, both are emitted into a single PVM contract code blob and live on-chain.
Therefore, in EVM terminology, the deploy code equals the runtime code.
> [!TIP]
>
> In constructor code, the `codesize` instruction will return the call data size instead of the actual code blob size.
## Solidity
We are aware of the following differences in the translation of Solidity code.
### `address.creationCode`
This returns the bytecode keccak256 hash instead.
## YUL functions
The below list contains noteworthy differences in the translation of YUL functions.
> [!NOTE]
>
> Many functions receive memory buffer offset pointer or size arguments. Since the PVM pointer size is 32 bit, supplying memory offset or buffer size values above `2^32-1` will trap the contract immediately.
The `solc` compiler ought to always emit valid memory references, so Solidity dApp authors don't need to worry about this unless they deal with low level `assembly` code.
### `mload`, `mstore`, `msize`, `mcopy` (memory related functions)
In general, revive preserves the memory layout, meaning low level memory operations are supported. However, a few caveats apply:
- The EVM linear heap memory is emulated using a fixed-size byte buffer (64 KiB by default). This implies that the maximum memory a contract can use is limited to the size of that buffer (on Ethereum, contract memory is capped by gas and therefore varies).
- Thus, accessing memory offsets larger than the fixed buffer size will trap the contract at runtime with an `OutOfBound` error.
- The compiler might detect and optimize unused memory reads and writes, leading to a different `msize` compared to what the EVM would see.
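The buffer limit can be illustrated with a little arithmetic. The following is a simplified model, not the actual runtime check: it assumes the default buffer size of 65536 bytes and that an access traps as soon as any of its bytes would fall outside the buffer.

```bash
#!/usr/bin/env bash
set -eu

HEAP_SIZE=65536   # default heap buffer size in bytes

# Simplified model (assumption): an access of `len` bytes at `offset`
# is in bounds only if it ends at or before the end of the buffer.
in_bounds() {
  offset=$1
  len=$2
  [ $((offset + len)) -le "$HEAP_SIZE" ]
}

# mload and mstore operate on 32 byte words.
for offset in 0 65504 65505 65536; do
  if in_bounds "$offset" 32; then
    echo "offset $offset: ok"
  else
    echo "offset $offset: would trap"
  fi
done
```

Under this model, the last word that fits starts at offset `65504`; anything beyond it would end past the buffer.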
### `calldataload`, `calldatacopy`
In the constructor code, the offset is ignored and this always returns `0`.
### `codecopy`
Only supported in constructor code.
### `invalid`
Traps the contract but does not consume the remaining gas.
### `create`, `create2`
Deployments on revive work differently than on the EVM. In a nutshell: instead of supplying the deploy code concatenated with the constructor arguments (the EVM deploy model), the [revive runtime expects two pointers](https://docs.rs/pallet-revive/latest/pallet_revive/trait.SyscallDoc.html#tymethod.instantiate):
1. A buffer containing the code hash to deploy.
2. The constructor arguments buffer.
To make contract instantiation using the `new` keyword in Solidity work seamlessly,
`revive` translates the `dataoffset` and `datasize` instructions so that they assume the contract hash instead of the contract code.
The hash is always of constant size.
Thus, `revive` is able to supply the expected code hash and constructor arguments pointer to the runtime.
> [!WARNING]
>
> This might fall apart in code creating contracts inside `assembly` blocks. **We strongly discourage using the `create` family opcodes to manually craft deployments in `assembly` blocks!** Usually, the reason for using `assembly` blocks is to save gas, which is futile on revive anyway due to lower transaction costs.
### `dataoffset`
Returns the contract hash.
### `datasize`
Returns the contract hash size (constant value of `32`).
### `prevrandao`, `difficulty`
Translates to a constant value of `2500000000000000`.
### `pc`, `extcodecopy`
These are only meaningful on the EVM and have no use case in PVM. Using them produces a compile time error.
### `blobhash`, `blobbasefee`
These are related to the Ethereum rollup model and produce a compile time error. Polkadot offers a superior rollup model, removing the use case for blob data related opcodes.
## Difference regarding the `solc` `via-ir` mode
There are two different compilation pipelines available in `solc` and [there are small differences between them](https://docs.soliditylang.org/en/latest/ir-breaking-changes.html).
Since `resolc` processes the YUL IR, always assume the `solc` IR based codegen behavior for contracts compiled with the `revive` compiler.
@@ -0,0 +1,27 @@
# Installation
Building Solidity contracts for PolkaVM requires installing the following two compilers:
- `solc`: The [Ethereum Solidity reference compiler](https://github.com/argotorg/solidity) implementation.
- `resolc`: The revive Solidity compiler YUL frontend and PolkaVM code generator.
## `resolc` binary releases
`resolc` is supported on all major operating systems and installation is straightforward.
Please find our [binary releases](https://github.com/paritytech/revive/releases) for the following platforms:
- Linux (MUSL)
- MacOS (universal)
- Windows
- Wasm via emscripten
## Installing the `solc` dependency
`resolc` uses `solc` during the compilation process. Please refer to the [Ethereum Solidity documentation](https://docs.soliditylang.org/en/latest/installing-solidity.html) for installation instructions.
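After installing both compilers, a quick check that they can be found via `PATH` can save some debugging later. A minimal sketch; the helper name `have_tool` is made up for this example:

```bash
#!/usr/bin/env bash
set -eu

# Succeeds if the given executable can be found via PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

for tool in solc resolc; do
  if have_tool "$tool"; then
    echo "$tool: $(command -v "$tool")"
  else
    echo "$tool: not found in PATH"
  fi
done
```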
## `revive` NPM package
We distribute the revive compiler as a [Node.js module](https://github.com/paritytech/revive/tree/main/js/resolc).
## Building `resolc` from source
Please follow the build [instructions in the revive `README.md`](https://github.com/paritytech/revive?tab=readme-ov-file#building-from-source).
@@ -0,0 +1,13 @@
# JS NPM package
The `resolc` compiler driver is published as an NPM package under [@parity/resolc](https://www.npmjs.com/package/@parity/resolc).
It's usable from `Node.js` code or directly from the command line:
```shell
npx @parity/resolc@latest --bin crates/integration/contracts/flipper.sol -o /tmp/out
```
> [!NOTE]
>
> While the npm package makes a nice portable option, it doesn't expose all options.
@@ -0,0 +1,15 @@
# Rust contract libraries
> [!NOTE]
>
> This is not yet implemented but something for consideration on the roadmap.
Solidity, being tightly coupled to the EVM, introduces some inherent inefficiencies that are by design and either need to be followed or can't easily be worked around, even with efforts like better optimized compiler and VM implementations. This represents a technical dead end. So far, the EVM has seen no adoption beyond the blockchain industry. Chances are that [the EVM ends up deprecated](https://ethereum-magicians.org/t/long-term-l1-execution-layer-proposal-replace-the-evm-with-risc-v) for technical reasons (or maybe not and the RISC-V idea gets abandoned, who knows).
PVM, however, is a general purpose VM. It supports LLVM based mainstream programming languages like Rust. It's a common software engineering practice to compose applications from pieces written in multiple languages, using each to their own strength. For example, AI solutions traditionally use the python scripting language for convenient developer experience, while the underlying AI models get implemented in a lower level language such as C++.
The same pattern can of course be applied to dApps, where we'd expect application specific languages like Solidity mixed with libraries implementing computationally complex algorithms in a lower level language. Business logic and user interfaces are naturally implemented as regular Solidity dApps which can include (link against) Rust libraries. Rust is a fast, safe low level language and the Polkadot SDK is written in Rust itself, making it an excellent choice.
For example, [ZK proof verifiers](https://en.wikipedia.org/wiki/Zero-knowledge_proof) or expensive [DeFi](https://en.wikipedia.org/wiki/Decentralized_finance) primitives would benefit greatly from Rust implementations.
`revive` provides tooling support and a small Rust contracts SDK for seamless integration with Rust libraries.
@@ -0,0 +1,36 @@
# Standard JSON interface
The `revive` compiler is mostly compatible with the `solc` standard JSON interface. There are a few additional (PVM related) __input__ configurations:
## The `settings.polkavm` object
Used to configure PVM specific compiler settings.
### `settings.polkavm.debugInformation`
A boolean value that enables debug information. Corresponds to `resolc -g`.
### The `settings.polkavm.memoryConfig` object
Used to apply PVM specific memory configuration settings.
#### `settings.polkavm.memoryConfig.heapSize`
A numerical value that configures the contract heap size. Corresponds to `resolc --heap-size`.
#### `settings.polkavm.memoryConfig.stackSize`
A numerical value that configures the contract stack size. Corresponds to `resolc --stack-size`.
## The `settings.optimizer` object
The `settings.optimizer` object is augmented with support for PVM specific optimization settings.
### `settings.optimizer.mode`
A single character value that configures the LLVM optimizer settings. Corresponds to `resolc -O`.
## `settings.llvmArguments`
Allows specifying arbitrary command line arguments for LLVM initialization. Used mainly for development and debugging purposes.
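Putting the settings together, a standard JSON input might look like the file written by the sketch below. This is an illustration under assumptions, not a reference: it assumes that `heapSize` and `stackSize` nest inside `memoryConfig` (as the heading hierarchy suggests), that `mode` is given as a one-character string, and it omits `llvmArguments` because its value type is not described here. The inline `Flipper` contract and the file name are placeholders; `language`, `sources` and `optimizer.enabled` are ordinary `solc` standard JSON fields.

```bash
#!/usr/bin/env bash
set -eu

INPUT="$(mktemp)"

# Quoted heredoc delimiter: the JSON is written verbatim, without shell expansion.
cat > "$INPUT" <<'EOF'
{
  "language": "Solidity",
  "sources": {
    "flipper.sol": {
      "content": "// SPDX-License-Identifier: MIT\npragma solidity ^0.8.0;\ncontract Flipper { bool public value; function flip() external { value = !value; } }"
    }
  },
  "settings": {
    "optimizer": { "enabled": true, "mode": "z" },
    "polkavm": {
      "debugInformation": true,
      "memoryConfig": { "heapSize": 65536, "stackSize": 32768 }
    }
  }
}
EOF

echo "wrote standard JSON input to $INPUT"
```

The resulting file can then be passed to the compiler's standard JSON mode; `resolc --help` lists the exact invocation.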
@@ -0,0 +1,19 @@
# Tooling integration
`resolc` has been integrated with a variety of third-party developer tools.
## Solidity toolkits
Support for `resolc` is available in forks of the [hardhat](https://hardhat.org) and [foundry](https://getfoundry.sh) Solidity toolkits:
- [The Parity Hardhat fork](https://github.com/paritytech/hardhat-polkadot)
- [The Parity Foundry fork](https://github.com/paritytech/foundry-polkadot?tab=readme-ov-file#2-resolc-compiler-integration)
## Compiler explorer
`resolc` is available on [godbolt.org](https://godbolt.org/z/6GM6n4Ka3) for the Solidity and Yul input languages. See also the announcement post on the [forum](https://forum.polkadot.network/t/resolc-is-live-on-compiler-explorer).
## Remix IDE
There is a Remix IDE fork with `resolc` support at [remix.polkadot.io](https://remix.polkadot.io). Unfortunately, it is no longer actively maintained (there might be bugs and outdated `resolc` versions).