Liam Aharon 55f354429c remote-ext: fix state download stall on slow connections and reduce memory usage (#1295)
Original PR https://github.com/paritytech/substrate/pull/14746

---

## Fixing stall

### Introduction
I experienced an apparent stall while downloading state from
`https://rococo-try-runtime-node.parity-chains.parity.io:443`. The node was
having networking difficulties and only responded to my JSON-RPC requests
at 50-200KB/s.

This PR fixes the issue causing the stall, and generally improves
remote-ext performance when it downloads state by greatly reducing the
chances of a timeout occurring.

### Description
Introduces a new `REQUEST_DURATION_TARGET` constant and modifies
`get_storage_data_dynamic_batch_size` to

- Increase the batch size of the next request if the last request
finished faster than the target, and decrease it if it took longer
- Reset the batch size to 1 if the request times out
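
A minimal sketch of the adjustment rule described above, not the code from the PR. The 15-second target, the growth and shrink factors, the `Outcome` type, and the `next_batch_size` helper are all assumptions made for illustration; only the `REQUEST_DURATION_TARGET` name comes from the description.

```rust
use std::time::Duration;

// Assumed values for illustration only; the description above does not state them.
const REQUEST_DURATION_TARGET: Duration = Duration::from_secs(15);
const INCREASE_FACTOR: f64 = 1.1;
const DECREASE_FACTOR: f64 = 0.5;

/// Result of the previous batch request (hypothetical type).
pub enum Outcome {
    /// The request finished after the given elapsed time.
    Completed(Duration),
    /// The request hit the client timeout.
    TimedOut,
}

/// Picks the batch size for the next request.
/// `remaining` is the number of keys still to download.
pub fn next_batch_size(current: usize, outcome: Outcome, remaining: usize) -> usize {
    let proposed = match outcome {
        // A timeout means the upcoming values are much larger than expected:
        // start over from a single key.
        Outcome::TimedOut => 1,
        // Slower than the target: shrink the next batch.
        Outcome::Completed(elapsed) if elapsed > REQUEST_DURATION_TARGET => {
            (current as f64 * DECREASE_FACTOR) as usize
        }
        // At or under the target: grow by at least one key.
        Outcome::Completed(_) => {
            ((current as f64 * INCREASE_FACTOR) as usize).max(current + 1)
        }
    };
    // Never request zero keys, and never more than are left.
    proposed.clamp(1, remaining.max(1))
}
```

Under these assumed factors, a batch of 100 keys that took 30 seconds would be followed by a batch of 50, while a timeout at any size would be followed by a batch of 1.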

This fixes an issue on slow connections that can otherwise cause
multiple timeouts and a stalled download when:

1. The batch size increases rapidly as remote-ext downloads keys with
small associated storage values
2. remote-ext tries to process a large series of subsequent keys all
with extremely large associated storage values (Rococo has a series of
keys 1-5MB large)
3. The huge storage values download for 5 minutes until the request
times out
4. The partially downloaded keys are thrown out and remote-ext tries
again with a smaller batch size, but the batch size is still far too
large and takes 5 minutes to be reduced again
5. The download will be essentially stalled for many hours while the
above step cycles
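
The cost of the cycle in steps 3-5 can be estimated with a small model. This is a rough sketch under stated assumptions, not a measurement: it assumes the batch is halved after each timeout (the description only says "a smaller batch size"), and `timeouts_before_fit` is a hypothetical helper.

```rust
/// Counts how many consecutive timeouts occur before a batch is small enough
/// to finish within the timeout.
///
/// `max_fit` is the largest batch that completes in time. `shrink` is the
/// rule applied after each timeout; it must return a smaller value than its
/// input, otherwise the loop never ends.
pub fn timeouts_before_fit(
    mut batch: usize,
    max_fit: usize,
    shrink: impl Fn(usize) -> usize,
) -> u32 {
    let max_fit = max_fit.max(1); // a single key is assumed to always fit
    let mut timeouts = 0;
    while batch > max_fit {
        timeouts += 1;
        batch = shrink(batch).max(1);
    }
    timeouts
}
```

As an example, assume 5MB values at 100KB/s with the 5-minute timeout: at most 6 values (30MB) fit in one request. Starting from a batch of 1000, `timeouts_before_fit(1000, 6, |b| b / 2)` gives 8 timeouts, or 40 minutes with nothing downloaded, whereas `timeouts_before_fit(1000, 6, |_| 1)` gives a single timeout.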


After this PR, the request size will

- Not grow as large to begin with, as it is regulated downwards as the
request duration exceeds the target
- Drop immediately to 1 if the request times out. A timeout indicates
the keys next in line to download have extremely large storage values
compared to previously downloaded keys, and we need to reset the batch
size to figure out what our new ideal batch size is. Without resetting
to 1, we risk the next request timing out again.

## Reducing memory

As suggested by @bkchr, I changed `get_storage_data_dynamic_batch_size`
from a recursive function to a loop, which allowed removing a number of
clones that were consuming a lot of memory. Previously it used up to 50GB
of swap when downloading Polkadot keys on a slow connection, because it
had to recurse and clone heavily.

After this change it uses only ~1.5GB memory.
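
The general shape of such a loop can be sketched as follows. This is an illustration of the recursion-to-loop idea only, not the PR's implementation: `download_all`, the `fetch` callback, and the "grow by one" policy are hypothetical placeholders.

```rust
/// Processes `items` in batches with a loop instead of recursion.
///
/// `fetch` is a hypothetical stand-in for the RPC call; it returns `Err(())`
/// when the request times out. Each batch is a borrowed sub-slice, so no
/// copy of the remaining items is made per step.
///
/// Note: if `fetch` fails even for a single item, this sketch retries
/// forever; real code would need a retry limit.
pub fn download_all<T, R>(
    items: &[T],
    mut fetch: impl FnMut(&[T]) -> Result<Vec<R>, ()>,
) -> Vec<R> {
    let mut results = Vec::with_capacity(items.len());
    let mut start = 0;
    let mut batch_size = 1;
    while start < items.len() {
        let end = (start + batch_size).min(items.len());
        match fetch(&items[start..end]) {
            Ok(mut batch) => {
                results.append(&mut batch);
                start = end;
                batch_size += 1; // placeholder growth policy
            }
            // Timeout: retry from the same position with a single item.
            Err(()) => batch_size = 1,
        }
    }
    results
}
```

Because the loop only tracks a start index and a batch size, its memory use outside the accumulated results stays constant regardless of how many batches are needed.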
2023-10-11 01:41:58 +11:00

NOTE: We have recently made significant changes to our repository structure. In order to streamline our development process and foster better contributions, we have merged three separate repositories Cumulus, Substrate and Polkadot into this repository. Read more about the changes here.

Polkadot SDK

The Polkadot SDK repository provides all the resources needed to start building on the Polkadot network, a multi-chain blockchain platform that enables different blockchains to interoperate and share information in a secure and scalable way. The Polkadot SDK comprises three main pieces of software:

Polkadot

Implementation of a node for the https://polkadot.network in Rust, using the Substrate framework. This directory currently contains runtimes for the Polkadot, Kusama, Westend, and Rococo networks. In the future, these will be relocated to the runtimes repository.

Substrate

Substrate is the primary blockchain SDK used by developers to create the parachains that make up the Polkadot network. Additionally, it allows for the development of self-sovereign blockchains that operate completely independently of Polkadot.

Cumulus

Cumulus is a set of tools for writing Substrate-based Polkadot parachains.

Upstream Dependencies

Below are the primary upstream dependencies utilized in this project:

Security

The security policy and procedures can be found in docs/SECURITY.md.

Contributing & Code of Conduct

Ensure you follow our contribution guidelines. In every interaction and contribution, this project adheres to the Contributor Covenant Code of Conduct.

Additional Resources

  • For monitoring upcoming changes and current proposals related to the technical implementation of the Polkadot network, visit the Requests for Comment (RFC) repository. While it's maintained by the Polkadot Fellowship, the RFC process welcomes contributions from everyone.