pezkuwi-subxt/substrate/frame/contracts/COMPLEXITY.md
Alexander Theißen 04b185e3d4 seal: Change prefix and module name from "ext_" to "seal_" for contract callable functions (#6798)
* seal: Change prefix "ext_" to "seal_" for contract callable functions

The word Ext is a overloaded term in the context of substrate. It usually
is a trait which abstracts away access to external resources usually in order
to mock them away for the purpose of tests. The contract module has its own
`Ext` trait in addition the the substrate `Ext` which makes things even more
confusing.

In order to differentiate the contract callable functions more clearly from
this `Ext` concept we rename them to use the "seal_" prefix instead.

This should change no behaviour at all. This is a pure renaming commit.

* seal: Rename import module from "env" to "seal0"

* seal: Fixup integration test

* seal: Add more tests for new import module names
2020-08-10 13:14:34 +00:00


Complexity

This analysis covers the computational and memory complexity of specific procedures. It provides a rough estimate of the operations performed, focusing especially on DB reads and writes. It also attempts to estimate peak memory consumption.

The primary goal is to come up with pricing that prevents DoS attacks for functions that can be invoked by a user (via extrinsics) or by untrusted code.

Sandboxing

It makes sense to describe the sandboxing module first because the smart-contract module is built upon it.

Memory

set

Copies data from the supervisor's memory to the guest's memory.

complexity: It doesn't allocate, and the computational complexity is proportional to the number of bytes to copy.

get

Copies data from the guest's memory to the supervisor's memory.

complexity: It doesn't allocate, and the computational complexity is proportional to the number of bytes to copy.
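
The two copy operations above can be pictured with a minimal sketch. `GuestMemory` and `OutOfBounds` are hypothetical names introduced only for illustration and are not the sandbox module's actual types; the point is that neither operation allocates and that the work is a single copy whose length equals the number of bytes requested.

```rust
/// Hypothetical stand-in for a guest's linear memory (illustration only).
pub struct GuestMemory {
    bytes: Vec<u8>,
}

/// Returned when the requested range does not fit into guest memory.
#[derive(Debug, PartialEq)]
pub struct OutOfBounds;

impl GuestMemory {
    /// Setup helper: allocates the guest memory once, up front.
    pub fn new(size: usize) -> Self {
        GuestMemory { bytes: vec![0; size] }
    }

    /// Copy `src` (supervisor memory) into guest memory at `offset`.
    /// No allocation; the work is proportional to `src.len()`.
    pub fn set(&mut self, offset: usize, src: &[u8]) -> Result<(), OutOfBounds> {
        let end = offset.checked_add(src.len()).ok_or(OutOfBounds)?;
        let dst = self.bytes.get_mut(offset..end).ok_or(OutOfBounds)?;
        dst.copy_from_slice(src);
        Ok(())
    }

    /// Copy guest memory starting at `offset` into `dst` (supervisor memory).
    /// No allocation; the work is proportional to `dst.len()`.
    pub fn get(&self, offset: usize, dst: &mut [u8]) -> Result<(), OutOfBounds> {
        let end = offset.checked_add(dst.len()).ok_or(OutOfBounds)?;
        let src = self.bytes.get(offset..end).ok_or(OutOfBounds)?;
        dst.copy_from_slice(src);
        Ok(())
    }
}
```

Because the length is chosen by the caller, any pricing derived from this sketch would need to scale with that length.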

Instance

Instantiation

Instantiation of a sandbox module consists of the following steps:

  1. Loading the wasm module in the in-memory representation,
  2. Performing validation of the wasm code,
  3. Setting up the environment which will be used to instantiate the module,
  4. Performing the standard wasm instantiation process, which includes (but is not limited to):
    1. Allocating the memory requested by the instance,
    2. Copying static data from the module to newly allocated memory,
    3. Executing the start function.

Note that the start function can be viewed as a normal function and can do anything that a normal function can do, including allocation of more memory or calling the host environment. The complexity of running the start function should be considered separately.

In order to start the process of instantiation, the supervisor should provide the wasm module code being instantiated and the environment definition for that module. The environment definition is the set of functions and memories (and maybe globals and tables in the future) available for import by the guest module. While the environment definition is typically of constant size (unless mechanisms like dynamic linking are used), the size of the wasm code is not.

Validation and instantiation in WebAssembly are designed so that they can be performed in linear time. The allocation and computational complexity of loading a wasm module depend on the underlying wasm VM being used. For example, for JIT compilers it can and probably will be non-linear because of compilation. However, for wasmi, it should be linear. We can try to use other VMs that are able to compile code with memory and time consumption proportional to the size of the code.

Since the module itself requests memory, the amount of allocation depends on the module code itself. If untrusted code is being instantiated, it's up to the supervisor to limit the amount of memory available to allocate.

complexity: The computational complexity is proportional to the size of wasm code. Memory complexity is proportional to the size of wasm code and the amount of memory requested by the module.

Preparation to invoke

Invocation of an exported function in the sandboxed module consists of the following steps:

  1. Marshalling, copying and unmarshalling the arguments when passing them between the supervisor and executor,
  2. Calling into the underlying VM,
  3. Marshalling, copying and unmarshalling the result when passing it between the executor and supervisor.

Note that the complexity of running the function code itself should be considered separately.

The actual complexity of invocation depends on the underlying VM. Wasmi will reserve a relatively large chunk of memory for the stack before execution of the code, although it's of constant size.

The size of the arguments and the return value depends on the exact function in question, but can be considered as constant.

complexity: Memory and computational complexity can be considered as a constant.

Call from the guest to the supervisor

The executor handles each call from the guest. The execution of it consists of the following steps:

  1. Marshalling, copying and unmarshalling the arguments when passing them between the guest and executor,
  2. Calling into the supervisor,
  3. Marshalling, copying and unmarshalling the result when passing it between the executor and guest.

Note that the complexity of running the supervisor handler should be considered separately.

Because calling into the supervisor requires invoking a wasm VM, the actual complexity of invocation depends on the actual VM used for the runtime/supervisor. Wasmi will reserve a relatively large chunk of memory for the stack before execution of the code, although it's of constant size.

The size of the arguments and the return value depends on the exact function in question, but can be considered as a constant.

complexity: Memory and computational complexity can be considered as a constant.

Transactional Storage

The contracts module makes use of the nested storage transactions feature offered by the underlying storage, which allows efficient rollback of changes made by contracts.

The underlying storage has an overlay layer implemented as a Map. If the runtime reads a storage location and the respective key doesn't exist in the overlay, then the underlying storage performs a DB access, but the value won't be placed into the overlay. The overlay is only filled with writes.

This means that the overlay can be abused in the following ways:

  • The overlay can be inflated by issuing a lot of writes to unique locations.
  • Deliberate cache misses can be induced by reading non-modified storage locations.

It is also worth noting that performance degrades as more state is stored in the trie. Because of this, there is a non-negligible chance that the gas schedule will be updated for all operations that involve storage access.
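
The read and write behaviour of the overlay described in this section can be sketched as follows. `Overlay` and `Db` are hypothetical types written for this illustration, with a `BTreeMap` standing in for both the overlay Map and the database; they are not the real storage implementation. The sketch shows why repeated reads of an unmodified key each cost a DB access, while writes grow the overlay.

```rust
use std::collections::BTreeMap;

/// Hypothetical backing database that counts reads (illustration only).
pub struct Db {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
    reads: u32,
}

/// Hypothetical overlay: holds writes only, `None` marks a deleted key.
pub struct Overlay {
    changes: BTreeMap<Vec<u8>, Option<Vec<u8>>>,
    db: Db,
}

impl Overlay {
    pub fn new(backing: BTreeMap<Vec<u8>, Vec<u8>>) -> Self {
        Overlay {
            changes: BTreeMap::new(),
            db: Db { data: backing, reads: 0 },
        }
    }

    /// A hit in the overlay is served from memory. A miss goes to the DB,
    /// and the fetched value is NOT placed into the overlay.
    pub fn get(&mut self, key: &[u8]) -> Option<Vec<u8>> {
        if let Some(value) = self.changes.get(key) {
            return value.clone();
        }
        self.db.reads += 1;
        self.db.data.get(key).cloned()
    }

    /// Writes only touch the overlay; every unique key adds one entry.
    pub fn set(&mut self, key: &[u8], value: Option<Vec<u8>>) {
        self.changes.insert(key.to_vec(), value);
    }

    pub fn overlay_len(&self) -> usize {
        self.changes.len()
    }

    pub fn db_reads(&self) -> u32 {
        self.db.reads
    }
}
```

Reading the same untouched key twice in this sketch results in two DB reads, which is the deliberate cache miss scenario listed above.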

get_storage, get_code_hash, get_rent_allowance, get_balance, contract_exists

These query the underlying storage for the requested value. If the value was modified in the current block, it is served from the cache. Otherwise a database read is performed.

complexity: The memory complexity is proportional to the size of the value. The computational complexity is proportional to the size of the value; the cost is dominated by the DB read.

set_storage, set_balance, set_rent_allowance

These functions write to the underlying storage, which caches those values and does not write them to the database immediately.

While these functions only modify the local cache, they trigger a database write later when all changes that were not rolled back are written to storage. Moreover, if the balance of the account is changed to be below existential_deposit then that account along with all its storage will be removed, which requires time proportional to the number of storage entries that account has. It should be ensured that pricing accounts for these facts.

complexity: Each lookup takes time logarithmic in the number of already inserted entries. No additional memory is required.

instantiate_contract

Calls contract_exists and, if the contract doesn't exist, does not modify the local Map, similarly to set_rent_allowance.

complexity: The computational complexity is proportional to the depth of the overlay cascade and the size of the value; the cost is dominated by the DB read though. No additional memory is required.

commit

In this function, all values modified in the current transaction are committed to the parent transaction.

This will trigger N inserts into the parent transaction (each of O(log M) complexity) or into the storage, where N is the size of the current transaction and M is the size of the parent transaction. Consider adjusting the price of modifying the current transaction to account for this (since pricing by the count of entries in commit would make the price of commit much less predictable). No additional memory is required.

Note that in the case of a storage modification we need to construct a key in the underlying storage. In order to do that we need to:

  • perform twox_128 hashing over a concatenation of some prefix literal and the AccountId of the storage owner,
  • then perform blake2_256 hashing of the storage key,
  • concatenate these two hashes to form the key in the underlying storage.
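
The key construction above can be sketched as a small helper. To keep the example free of external crates, the two hash functions are passed in as parameters; `underlying_key` and its parameter names are hypothetical and only mirror the listed steps, they are not the module's actual code.

```rust
/// Build the key used in the underlying storage (illustrative sketch).
///
/// `twox_128` and `blake2_256` are supplied by the caller so that this
/// example does not depend on any particular hashing crate.
pub fn underlying_key(
    prefix: &[u8],
    account_id: &[u8],
    storage_key: &[u8],
    twox_128: impl Fn(&[u8]) -> [u8; 16],
    blake2_256: impl Fn(&[u8]) -> [u8; 32],
) -> Vec<u8> {
    // Step 1: hash the concatenation of the prefix literal and the owner.
    let mut owner = Vec::with_capacity(prefix.len() + account_id.len());
    owner.extend_from_slice(prefix);
    owner.extend_from_slice(account_id);
    let owner_hash = twox_128(&owner);

    // Step 2: hash the contract-level storage key.
    let key_hash = blake2_256(storage_key);

    // Step 3: the concatenation of both hashes is the final key (48 bytes).
    let mut key = Vec::with_capacity(owner_hash.len() + key_hash.len());
    key.extend_from_slice(&owner_hash);
    key.extend_from_slice(&key_hash);
    key
}
```

Both hashes run over inputs of bounded size, so under these assumptions the cost of building a key is constant per storage modification.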

There is also a special case to think of: if the balance of some account goes below existential_deposit, then all storage entries of that account will be erased, which requires time proportional to the number of storage entries that account has.

complexity: N inserts into a transaction or eventually into the storage (if committed). Every deleted account will induce removal of all its storage which is proportional to the number of storage entries that account has.

revert

Consists of dropping (in the Rust sense) of the current transaction.

complexity: Computing complexity is proportional to the number of changed entries in the overlay. No additional memory is required.
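
A minimal sketch of nested transactions, assuming each transaction is an ordered map of changes stacked on top of its parent. `Transactions` is a hypothetical type for illustration; the real storage layer is more involved. In this sketch commit performs one insert into the parent per changed entry and revert simply drops the top layer, and both return the number of entries they handled.

```rust
use std::collections::BTreeMap;

/// One layer of pending changes; `None` marks a deleted key.
type Changes = BTreeMap<Vec<u8>, Option<Vec<u8>>>;

/// Hypothetical stack of nested storage transactions (illustration only).
pub struct Transactions {
    layers: Vec<Changes>,
}

impl Transactions {
    /// Starts with a single root layer.
    pub fn new() -> Self {
        Transactions { layers: vec![Changes::new()] }
    }

    /// Open a nested transaction.
    pub fn start(&mut self) {
        self.layers.push(Changes::new());
    }

    /// Record a change in the innermost transaction.
    pub fn set(&mut self, key: &[u8], value: Option<Vec<u8>>) {
        if let Some(top) = self.layers.last_mut() {
            top.insert(key.to_vec(), value);
        }
    }

    /// Look a key up, searching from the innermost layer outwards.
    pub fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.layers
            .iter()
            .rev()
            .find_map(|layer| layer.get(key))
            .cloned()
            .flatten()
    }

    /// Commit: N inserts into the parent, each O(log M). Returns N.
    pub fn commit(&mut self) -> usize {
        if self.layers.len() < 2 {
            return 0; // nothing nested to commit
        }
        let top = self.layers.pop().unwrap_or_default();
        let count = top.len();
        if let Some(parent) = self.layers.last_mut() {
            for (key, value) in top {
                parent.insert(key, value);
            }
        }
        count
    }

    /// Revert: drop the innermost layer. Returns the number of entries dropped.
    pub fn revert(&mut self) -> usize {
        if self.layers.len() < 2 {
            return 0; // the root layer is never dropped
        }
        self.layers.pop().map_or(0, |top| top.len())
    }
}
```

Since the amount of work in both operations depends on how many entries the transaction accumulated, charging at write time keeps the price of commit and revert predictable.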

Executive

Transfer

This function performs the following steps:

  1. Querying source and destination balances from the current transaction (see get_balance).
  2. Querying existential_deposit.
  3. Executing ensure_account_liquid hook.
  4. Updating source and destination balance in the overlay (see set_balance).

Note that the complexity of executing ensure_account_liquid hook should be considered separately.

In the course of the execution this function can perform up to 2 DB reads to get_balance of source and destination accounts. It can also induce up to 2 DB writes via set_balance if flushed to the storage.

Moreover, if the source balance goes below existential_deposit then the transfer is denied and returns with an error.

Assuming the marshaled size of a balance value is constant, we can neglect its effect on performance.

complexity: up to 2 DB reads and up to 2 DB writes (if flushed to the storage) in the standard case. If removal of the source account takes place then it will additionally perform one DB write per storage entry that the account has. Memory-wise it can be assumed to be constant.
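
The standard case of transfer can be sketched as below. `Ledger`, `TransferError` and the `u32` account identifiers are hypothetical and exist only for this illustration; the `ensure_account_liquid` hook is left out because its complexity is considered separately. The read and write counters make the bound of 2 reads and 2 writes visible.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
pub enum TransferError {
    InsufficientBalance,
    BelowExistentialDeposit,
    Overflow,
}

/// Hypothetical balance store that counts accesses (illustration only).
pub struct Ledger {
    balances: BTreeMap<u32, u128>,
    existential_deposit: u128,
    pub reads: u32,
    pub writes: u32,
}

impl Ledger {
    pub fn new(existential_deposit: u128) -> Self {
        Ledger { balances: BTreeMap::new(), existential_deposit, reads: 0, writes: 0 }
    }

    pub fn get_balance(&mut self, who: u32) -> u128 {
        self.reads += 1;
        self.balances.get(&who).copied().unwrap_or(0)
    }

    pub fn set_balance(&mut self, who: u32, value: u128) {
        self.writes += 1;
        self.balances.insert(who, value);
    }

    /// Up to 2 reads and 2 writes; denied if the source would drop below
    /// the existential deposit.
    pub fn transfer(&mut self, from: u32, to: u32, value: u128) -> Result<(), TransferError> {
        let from_balance = self.get_balance(from);
        let to_balance = self.get_balance(to);

        let new_from = from_balance
            .checked_sub(value)
            .ok_or(TransferError::InsufficientBalance)?;
        if new_from < self.existential_deposit {
            return Err(TransferError::BelowExistentialDeposit);
        }
        if from == to {
            return Ok(()); // a self-transfer changes nothing
        }
        let new_to = to_balance.checked_add(value).ok_or(TransferError::Overflow)?;

        self.set_balance(from, new_from);
        self.set_balance(to, new_to);
        Ok(())
    }
}
```

A failed transfer in this sketch still performs both reads, so the reads would have to be paid for regardless of the outcome.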

Initialization

Before a call or instantiate can be performed the execution context must be initialized.

For the first call or instantiation in the handling of an extrinsic, this involves two calls:

  1. <timestamp::Module<T>>::now()
  2. <system::Module<T>>::block_number()

The complexity of initialization depends on the complexity of these functions. In the current implementation they just involve a DB read.

For subsequent calls and instantiations during contract execution, the initialization requires no expensive operations.

Terminate

This function performs the following steps:

  1. Check the calling contract is not already on the callstack by calling is_live.
  2. Transfer funds from caller to the beneficiary.
  3. Flag the caller contract as deleted in the overlay.

is_live neither accesses the database nor allocates memory. It walks up the call stack and therefore executes in time linear in the size of the call stack. Because the call stack has a fixed maximum size, we consider this operation constant time.
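
The walk performed by is_live can be sketched as follows, assuming each execution context holds a reference to its caller. `Frame` and the `u32` account type are hypothetical simplifications for illustration.

```rust
/// Hypothetical execution context: an account plus a link to its caller.
pub struct Frame<'a> {
    pub account: u32,
    pub caller: Option<&'a Frame<'a>>,
}

/// Returns true if `account` appears among the callers of `frame`.
/// No allocation and no storage access: one comparison per stack level.
pub fn is_live(frame: &Frame<'_>, account: u32) -> bool {
    let mut current = frame.caller;
    while let Some(parent) = current {
        if parent.account == account {
            return true;
        }
        current = parent.caller;
    }
    false
}
```

With a bounded stack depth the loop runs a bounded number of times, which is why the operation is treated as constant time.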

complexity: Database accesses as described in Transfer + Removal of the contract. Currently, we are using child trie removal which is linear in the amount of stored keys. Upcoming changes will make the account removal constant time.

Call

This function receives input data for the contract execution. The execution consists of the following steps:

  1. Initialization of the execution context.
  2. Checking rent payment.
  3. Loading code from the DB.
  4. Starting a new storage transaction.
  5. transfer-ing funds between the caller and the destination account.
  6. Executing the code of the destination account.
  7. Committing or rolling back the storage transaction.

Note that the complexity of executing the contract code should be considered separately.

Checking for rent involves 2 unconditional DB reads: ContractInfoOf and block_number. On top of that, at most once per block, it involves:

  • DB reads of free_balance, rent_deposit_offset, rent_byte_price, Currency::minimum_balance and tombstone_deposit.
  • Calls to ensure_can_withdraw, withdraw and make_free_balance_be, which can perform arbitrary logic and should be considered separately.
  • child_storage_root.
  • kill_child_storage.
  • Mutation of ContractInfoOf.

Loading code most likely will trigger a DB read, since the code is immutable and therefore will not get into the cache (unless a suicide removes it, or it has been instantiated in the same call chain).

Also, transfer can make up to 2 DB reads and up to 2 DB writes (if flushed to the storage) in the standard case. If removal of the source account takes place then it will additionally perform a DB write per one storage entry that the account has.

Finally, the current storage transaction is closed. The complexity of this depends on the number of changes performed by the code. Thus, the pricing of storage modification should account for that.

complexity:

  • Only for the first invocation of the contract: up to 5 DB reads and one DB write as well as logic executed by ensure_can_withdraw, withdraw, make_free_balance_be.
  • On top of that for every invocation: Up to 5 DB reads. DB read of the code is of dynamic size. There can also be up to 2 DB writes (if flushed to the storage). Additionally, if the source account removal takes place a DB write will be performed per one storage entry that the account has.

Instantiate

This function takes the code of the constructor and input data. Instantiation of a contract consists of the following steps:

  1. Initialization of the execution context.
  2. Calling DetermineContractAddress hook to determine an address for the contract.
  3. Starting a new storage transaction.
  4. transfer-ing funds between self and the newly instantiated contract.
  5. Executing the constructor code. This will yield the final code of the contract.
  6. Storing the code for the newly instantiated contract in the overlay.
  7. Committing or rolling back the storage transaction.

Note that the complexity of executing the constructor code should be considered separately.

Note that the complexity of DetermineContractAddress hook should be considered separately as well. Most likely it will use some kind of hashing over the code of the constructor and input data. The default SimpleAddressDeterminer does precisely that.

Note that the constructor returns code in the owned form and it's obtained via return facilities, which should already have charged a fee for the return value.

Also, transfer can make up to 2 DB reads and up to 2 DB writes (if flushed to the storage) in the standard case. If removal of the source account takes place then it will additionally perform a DB write per one storage entry that the account has.

Storing the code in the overlay may induce another DB write (if flushed to the storage) with the size proportional to the size of the constructor code.

Finally, the current storage transaction is closed. The complexity of this depends on the number of changes performed by the constructor code. Thus, the pricing of storage modification should account for that.

complexity: Up to 2 DB reads and induces up to 3 DB writes (if flushed to the storage), one of which is dependent on the size of the code. Additionally, if the source account removal takes place a DB write will be performed per one storage entry that the account has.

Contracts API

Each API function invoked from a contract can involve some overhead.

Getter functions

Those are simple getter functions which copy a requested value to contract memory. They all have the following two arguments:

  • output_ptr: Pointer into contract memory where to copy the value.
  • output_len_ptr: Pointer into contract memory where the size of the buffer is stored. The size of the copied value is also stored there.
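
The output_ptr / output_len_ptr convention can be sketched as follows. `write_output` and `Error` are hypothetical names, contract memory is modelled as a plain byte slice, and the length is assumed to be stored as a little-endian u32; none of this is taken from the actual implementation.

```rust
#[derive(Debug, PartialEq)]
pub enum Error {
    OutOfBounds,
    OutputBufferTooSmall,
}

/// Read a little-endian u32 from `memory` at `ptr` (encoding is an assumption).
fn read_u32(memory: &[u8], ptr: usize) -> Result<u32, Error> {
    let end = ptr.checked_add(4).ok_or(Error::OutOfBounds)?;
    let bytes = memory.get(ptr..end).ok_or(Error::OutOfBounds)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Ok(u32::from_le_bytes(buf))
}

/// Copy `value` to `output_ptr` if the buffer size stored at
/// `output_len_ptr` is large enough, then store the copied size there.
pub fn write_output(
    memory: &mut [u8],
    output_ptr: usize,
    output_len_ptr: usize,
    value: &[u8],
) -> Result<(), Error> {
    let capacity = read_u32(memory, output_len_ptr)? as usize;
    if value.len() > capacity {
        return Err(Error::OutputBufferTooSmall);
    }

    let end = output_ptr.checked_add(value.len()).ok_or(Error::OutOfBounds)?;
    memory
        .get_mut(output_ptr..end)
        .ok_or(Error::OutOfBounds)?
        .copy_from_slice(value);

    // `value.len()` fits into u32 because it is bounded by `capacity`.
    let written = (value.len() as u32).to_le_bytes();
    // The range was already validated by `read_u32` above.
    memory[output_len_ptr..output_len_ptr + 4].copy_from_slice(&written);
    Ok(())
}
```

The copy is bounded by the size of the value, which is constant for a given runtime in the case of the getters.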

complexity: The size of the returned value is constant for a given runtime. Therefore we consider its complexity constant even though some of them might involve at most one DB read. Some of those functions call into other pallets of the runtime. The assumption here is that those functions are also linear in regard to the size of the data that is returned and therefore considered constant for a given runtime.

This is the list of getters:

  • seal_caller
  • seal_address
  • seal_weight_to_fee
  • seal_gas_left
  • seal_balance
  • seal_value_transferred
  • seal_now
  • seal_minimum_balance
  • seal_tombstone_deposit
  • seal_rent_allowance
  • seal_block_number

seal_set_storage

This function receives a key and value as arguments. It consists of the following steps:

  1. Reading the sandbox memory for key and value (see sandboxing memory get).
  2. Setting the storage at the given key to the given value (see set_storage).

complexity: Complexity is proportional to the size of the value. This function induces a DB write of size proportional to the value size (if flushed to the storage), so should be priced accordingly.

seal_clear_storage

This function receives a key as argument. It consists of the following steps:

  1. Reading the sandbox memory for key (see sandboxing memory get).
  2. Clearing the storage at the given key (see set_storage).

complexity: Complexity is constant. This function induces a DB write to clear the storage entry (upon being flushed to the storage) and should be priced accordingly.

seal_get_storage

This function receives a key as an argument. It consists of the following steps:

  1. Reading the sandbox memory for key (see sandboxing memory get).
  2. Reading the storage with the given key (see get_storage). It receives back the owned result buffer.
  3. Writing the storage value to contract memory.

Key is of a constant size. Therefore, the sandbox memory load can be considered to be of constant complexity.

Unless the value is cached, a DB read will be performed. The size of the value is not known until the read is performed. Moreover, the DB read has to be synchronous and no progress can be made until the value is fetched.

complexity: The memory and computing complexity is proportional to the size of the fetched value. This function performs a DB read.

seal_transfer

This function receives the following arguments:

  • account buffer of a marshaled AccountId,
  • value buffer of a marshaled Balance.

It consists of the following steps:

  1. Loading account buffer from the sandbox memory (see sandboxing memory get) and then decoding it.
  2. Loading value buffer from the sandbox memory and then decoding it.
  3. Invoking the executive function transfer.

Loading of account and value buffers should be charged. This is because the sizes of buffers are specified by the calling code, even though marshaled representations are, essentially, of constant size. This can be fixed by assigning an upper bound for sizes of AccountId and Balance.
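
Charging for a buffer whose size is chosen by the calling code can be sketched as a base cost plus a per-byte cost, deducted before the copy takes place. `GasMeter`, `charge_read` and both constants are hypothetical; the numbers are placeholders, not values from any real gas schedule.

```rust
#[derive(Debug, PartialEq)]
pub struct OutOfGas;

/// Hypothetical gas meter (illustration only).
pub struct GasMeter {
    gas_left: u64,
}

impl GasMeter {
    pub fn new(limit: u64) -> Self {
        GasMeter { gas_left: limit }
    }

    pub fn gas_left(&self) -> u64 {
        self.gas_left
    }

    /// Deduct `amount`; on failure all remaining gas is consumed.
    pub fn charge(&mut self, amount: u64) -> Result<(), OutOfGas> {
        match self.gas_left.checked_sub(amount) {
            Some(rest) => {
                self.gas_left = rest;
                Ok(())
            }
            None => {
                self.gas_left = 0;
                Err(OutOfGas)
            }
        }
    }
}

/// Placeholder prices, chosen only to make the example concrete.
pub const READ_BASE_COST: u64 = 10;
pub const READ_PER_BYTE_COST: u64 = 1;

/// Charge for reading `len` bytes from sandbox memory before copying them.
pub fn charge_read(meter: &mut GasMeter, len: u32) -> Result<(), OutOfGas> {
    let cost = READ_PER_BYTE_COST
        .checked_mul(u64::from(len))
        .and_then(|per_byte| per_byte.checked_add(READ_BASE_COST))
        .ok_or(OutOfGas)?;
    meter.charge(cost)
}
```

Charging before the copy means an oversized length supplied by the contract runs out of gas instead of causing unpaid work.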

seal_call

This function receives the following arguments:

  • callee buffer of a marshaled AccountId,
  • gas limit, which is a plain u64,
  • value buffer of a marshaled Balance,
  • input_data, an arbitrarily sized byte vector,
  • output_ptr, a pointer to contract memory.

It consists of the following steps:

  1. Loading callee buffer from the sandbox memory (see sandboxing memory get) and then decoding it.
  2. Loading value buffer from the sandbox memory and then decoding it.
  3. Loading input_data buffer from the sandbox memory.
  4. Invoking the executive function call.
  5. Writing output buffer to contract memory.

Loading of callee and value buffers should be charged. This is because the sizes of buffers are specified by the calling code, even though marshaled representations are, essentially, of constant size. This can be fixed by assigning an upper bound for sizes of AccountId and Balance.

Loading input_data should be charged in any case.

complexity: All complexity comes from loading and writing buffers and executing the call executive function. The former component is proportional to the sizes of the callee, value, input_data and output_ptr buffers. The latter component depends entirely on the complexity of the call executive function and dominates the total.

seal_instantiate

This function receives the following arguments:

  • init_code, a buffer which contains the code of the constructor,
  • gas limit, which is a plain u64,
  • value buffer of a marshaled Balance,
  • input_data, an arbitrarily sized byte vector.

It consists of the following steps:

  1. Loading init_code buffer from the sandbox memory (see sandboxing memory get) and then decoding it.
  2. Loading value buffer from the sandbox memory and then decoding it.
  3. Loading input_data buffer from the sandbox memory.
  4. Invoking instantiate executive function.

Loading of the value buffer should be charged. This is because the size of the buffer is specified by the calling code, even though the marshaled representation is, essentially, of constant size. This can be fixed by assigning an upper bound for the size of Balance.

Loading init_code and input_data should be charged in any case.

complexity: All complexity comes from loading buffers and executing the instantiate executive function. The former component is proportional to the sizes of the init_code, value and input_data buffers. The latter component depends entirely on the complexity of the instantiate executive function and dominates the total.

seal_terminate

This function receives the following argument:

  • beneficiary, buffer of a marshaled AccountId

It consists of the following steps:

  1. Loading beneficiary buffer from the sandbox memory (see sandboxing memory get) and then decoding it.

Loading of the beneficiary buffer should be charged. This is because the sizes of buffers are specified by the calling code, even though marshaled representations are, essentially, of constant size. This can be fixed by assigning an upper bound for sizes of AccountId.

complexity: All complexity comes from loading buffers and executing the terminate executive function. The former component is proportional to the size of the beneficiary buffer. The latter component depends entirely on the complexity of the terminate executive function and dominates the total.

seal_input

This function receives a pointer to contract memory. It copies the input to the contract call to this location.

complexity: The complexity is proportional to the size of the input buffer.

seal_return

This function receives a data buffer and flags arguments. Execution of the function consists of the following steps:

  1. Loading data buffer from the sandbox memory (see sandboxing memory get).
  2. Storing the u32 flags value.
  3. Trapping.

complexity: The complexity of this function is proportional to the size of the data buffer.

seal_deposit_event

This function receives a data buffer as an argument. Execution of the function consists of the following steps:

  1. Loading data buffer from the sandbox memory (see sandboxing memory get).
  2. Inserting the event into the nested execution context.
  3. Copying it from the nested to the underlying contexts.
  4. Calling system deposit event.
complexity: The complexity of this function is proportional to the size of the data buffer.

seal_set_rent_allowance

This function receives the following argument:

  • value buffer of a marshaled Balance.

It consists of the following steps:

  1. Loading value buffer from the sandbox memory and then decoding it.
  2. Invoking set_rent_allowance AccountDB function.

complexity: Complexity is proportional to the size of the value. This function induces a DB write of size proportional to the value size (if flushed to the storage), so should be priced accordingly.

Built-in hashing functions

This paragraph concerns the following supported built-in hash functions:

  • SHA2 with 256-bit width
  • KECCAK with 256-bit width
  • BLAKE2 with 128-bit and 256-bit widths

These functions compute a cryptographic hash on the given inputs and copy the resulting hash directly back into the sandboxed Wasm contract output buffer.

Execution of the function consists of the following steps:

  1. Load data stored in the input buffer into an intermediate buffer.
  2. Compute the cryptographic hash H on the intermediate buffer.
  3. Copy back the bytes of H into the contract side output buffer.

complexity: Complexity is proportional to the size of the input buffer in bytes as well as to the size of the output buffer in bytes. Also, different cryptographic algorithms have different inherent complexity, so users must expect the above-mentioned hash functions to have varying gas costs. The complexity of each cryptographic hash function depends heavily on the underlying implementation.
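
A cost model matching this description could be sketched as a base cost plus a per-algorithm cost per input byte plus a copy cost for the input and output buffers. `HashFn`, `hash_cost` and all prices are hypothetical placeholders; only the output sizes follow from the listed bit widths.

```rust
/// The supported built-in hash functions.
#[derive(Clone, Copy)]
pub enum HashFn {
    Sha2_256,
    Keccak256,
    Blake2_256,
    Blake2_128,
}

impl HashFn {
    /// Size of the resulting hash in bytes, derived from the bit width.
    pub fn output_len(self) -> u64 {
        match self {
            HashFn::Blake2_128 => 16,
            HashFn::Sha2_256 | HashFn::Keccak256 | HashFn::Blake2_256 => 32,
        }
    }

    /// Placeholder per-byte hashing price; real values would be benchmarked.
    fn per_byte_cost(self) -> u64 {
        match self {
            HashFn::Sha2_256 => 4,
            HashFn::Keccak256 => 5,
            HashFn::Blake2_256 | HashFn::Blake2_128 => 2,
        }
    }
}

/// Placeholder prices, chosen only to make the example concrete.
pub const HASH_BASE_COST: u64 = 100;
pub const COPY_PER_BYTE_COST: u64 = 1;

/// Cost grows linearly with the input size and includes both copies.
pub fn hash_cost(hash_fn: HashFn, input_len: u32) -> u64 {
    let input_len = u64::from(input_len);
    let hashing = hash_fn.per_byte_cost().saturating_mul(input_len);
    let copying = COPY_PER_BYTE_COST.saturating_mul(input_len + hash_fn.output_len());
    HASH_BASE_COST.saturating_add(hashing).saturating_add(copying)
}
```

Keeping a separate per-byte price for each algorithm lets the schedule reflect that the functions differ in inherent cost.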