Introduce trie level cache and remove state cache (#11407)

* trie state cache

* Also cache missing access on read.

* fix comp

* bis

* fix

* use has_lru

* remove local storage cache on size 0.

* No cache.

* local cache only

* trie cache and local cache

* storage cache (with local)

* trie cache no local cache

* Add state access benchmark

* Remove warnings etc

* Add trie cache benchmark

* No extra "clone" required

* Change benchmark to use multiple blocks

* Use patches

* Integrate shitty implementation

* More stuff

* Revert "Merge branch 'master' into trie_state_cache"

This reverts commit 947cd8e6d43fced10e21b76d5b92ffa57b57c318, reversing
changes made to 29ff036463.

* Improve benchmark

* Adapt to latest changes

* Adapt to changes in trie

* Add a test that uses iterator

* Start fixing it

* Remove obsolete file

* Make it compile

* Start rewriting the trie node cache

* More work on the cache

* More docs and code etc

* Make data cache an optional

* Tests

* Remove debug stuff

* Recorder

* Some docs and a simple test for the recorder

* Compile fixes

* Make it compile

* More fixes

* More fixes

* Fix fix fix

* Make sure cache and recorder work together for basic stuff

* Test that data caching and recording works

* Test `TrieDBMut` with caching

* Try something

* Fixes, fixes, fixes

* Forward the recorder

* Make it compile

* Use recorder in more places

* Switch to new `with_optional_recorder` fn

* Refactor and cleanups

* Move `ProvingBackend` tests

* Simplify

* Move over all functionality to the essence

* Fix compilation

* Implement estimate encoded size for StorageProof

* Start using the `cache` everywhere

* Use the cache everywhere

* Fix compilation

* Fix tests

* Adds `TrieBackendBuilder` and enhances the tests

* Ensure that recorder drain checks that values are found as expected

* Switch over to `TrieBackendBuilder`

* Start fixing the problem with child tries and recording

* Fix recording of child tries

* Make it compile

* Overwrite `storage_hash` in `TrieBackend`

* Add `storage_cache` to the benchmarks

* Fix `no_std` build

* Speed up cache lookup

* Extend the state access benchmark to also hash a runtime

* Fix build

* Fix compilation

* Rewrite value cache

* Add lru cache

* Ensure that the cache lru works

* Value cache should not be optional

* Add support for keeping the shared node cache in its bounds

* Make the cache configurable

* Check that the cache respects the bounds

* Adds a new test

* Fixes

* Docs and some renamings

* More docs

* Start using the new recorder

* Fix more code

* Take `self` argument

* Remove warnings

* Fix benchmark

* Fix accounting

* Rip off the state cache

* Start fixing fallout after removing the state cache

* Make it compile after trie changes

* Fix test

* Add some logging

* Some docs

* Some fixups and clean ups

* Fix benchmark

* Remove unneeded file

* Use git for patching

* Make CI happy

* Update primitives/trie/Cargo.toml

Co-authored-by: Koute <koute@users.noreply.github.com>

* Update primitives/state-machine/src/trie_backend.rs

Co-authored-by: cheme <emericchevalier.pro@gmail.com>

* Introduce new `AsTrieBackend` trait

* Make the LocalTrieCache not clonable

* Make it work in no_std and add docs

* Remove duplicate dependency

* Switch to ahash for better performance

* Speedup value cache merge

* Output errors on underflow

* Ensure the internal LRU map doesn't grow too much

* Use const fn to calculate the value cache element size

* Remove cache configuration

* Fix

* Clear the cache in between for more testing

* Try to come up with a failing test case

* Make the test fail

* Fix the child trie recording

* Make everything compile after the changes to trie

* Adapt to latest trie-db changes

* Fix on stable

* Update primitives/trie/src/cache.rs

Co-authored-by: cheme <emericchevalier.pro@gmail.com>

* Fix wrong merge

* Docs

* Fix warnings

* Cargo.lock

* Bump pin-project

* Fix warnings

* Switch to released crate version

* More fixes

* Make clippy and rustdocs happy

* More clippy

* Print error when using deprecated `--state-cache-size`

* 🤦

* Fixes

* Fix storage_hash linkings

* Update client/rpc/src/dev/mod.rs

Co-authored-by: Arkadiy Paronyan <arkady.paronyan@gmail.com>

* Review feedback

* encode bound

* Rework the shared value cache

Instead of using a `u64` to represent the key we now use an `Arc<[u8]>`. This arc is also stored in
an extra `HashSet` to de-duplicate the keys across different storage roots. When the last usage of
a key is dropped from the lru, we also remove the key from the `HashSet`.
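The de-duplication described above can be sketched self-contained with the standard library: a `HashSet<Arc<[u8]>>` interns each storage key once, and requests for the same key under different storage roots clone the existing `Arc` instead of allocating again. This is a minimal illustration, not the actual `SharedValueCache` code; `intern_key` is a hypothetical helper:

```rust
use std::collections::HashSet;
use std::sync::Arc;

// Hypothetical sketch of the key de-duplication (not the actual
// `SharedValueCache` code): storage keys live behind `Arc<[u8]>`, and a
// `HashSet` interns them so the same key used under different storage roots
// shares a single allocation.
fn intern_key(interned: &mut HashSet<Arc<[u8]>>, key: &[u8]) -> Arc<[u8]> {
    if let Some(existing) = interned.get(key) {
        // Key already known: hand out another reference to the same bytes.
        return existing.clone();
    }
    let arc: Arc<[u8]> = Arc::from(key);
    interned.insert(arc.clone());
    arc
}

fn main() {
    let mut interned = HashSet::new();
    // The same storage key requested under two different storage roots.
    let a = intern_key(&mut interned, b"balance:alice");
    let b = intern_key(&mut interned, b"balance:alice");
    assert!(Arc::ptr_eq(&a, &b)); // one shared allocation
    assert_eq!(interned.len(), 1); // and only one interned copy
}
```

Dropping the last outside `Arc` would, in the real cache, also remove the interned copy; that bookkeeping is omitted here.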

* Improve the cache by merging the old and new solutions

* FMT

* Please stop coming back all the time :crying:

* Update primitives/trie/src/cache/shared_cache.rs

Co-authored-by: Arkadiy Paronyan <arkady.paronyan@gmail.com>

* Fixes

* Make clippy happy

* Ensure we don't deadlock

* Only use one lock to simplify the code

* Do not depend on `Hasher`

* Fix tests

* FMT

* Clippy 🤦

Co-authored-by: cheme <emericchevalier.pro@gmail.com>
Co-authored-by: Koute <koute@users.noreply.github.com>
Co-authored-by: Arkadiy Paronyan <arkady.paronyan@gmail.com>
This commit is contained in:
Bastian Köcher
2022-08-18 20:59:22 +02:00
committed by GitHub
parent d46f6f0d34
commit 73d9ae3284
55 changed files with 3977 additions and 1344 deletions
@@ -0,0 +1,686 @@
// This file is part of Substrate.
// Copyright (C) 2022 Parity Technologies (UK) Ltd.
// SPDX-License-Identifier: Apache-2.0
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//! Trie Cache
//!
//! Provides an implementation of the [`TrieCache`](trie_db::TrieCache) trait.
//! The implementation is split into three types [`SharedTrieCache`], [`LocalTrieCache`] and
//! [`TrieCache`]. The [`SharedTrieCache`] is the instance that should be kept around for the entire
//! lifetime of the node. It will store all cached trie nodes and values on a global level. Then
//! there is the [`LocalTrieCache`] that should be kept around per state instance requested from the
//! backend. As there are very likely multiple accesses to the state per instance, this
//! [`LocalTrieCache`] is used to cache the nodes and the values before they are merged back to the
//! shared instance. Last but not least there is the [`TrieCache`] that is being used per access to
//! the state. It will use the [`SharedTrieCache`] and the [`LocalTrieCache`] to fulfill cache
//! requests. If neither of them provides the requested data, it will be inserted into the
//! [`LocalTrieCache`] and then later into the [`SharedTrieCache`].
//!
//! The [`SharedTrieCache`] is bound to some maximum number of bytes. It is ensured that it never
//! runs above this limit. However, as long as data is cached inside a [`LocalTrieCache`] it isn't
//! taken into account when limiting the [`SharedTrieCache`]. This means that for the lifetime of a
//! [`LocalTrieCache`] the actual memory usage could be above the allowed maximum.
use crate::{Error, NodeCodec};
use hash_db::Hasher;
use hashbrown::HashSet;
use nohash_hasher::BuildNoHashHasher;
use parking_lot::{Mutex, MutexGuard, RwLockReadGuard};
use shared_cache::{SharedValueCache, ValueCacheKey};
use std::{
collections::{hash_map::Entry as MapEntry, HashMap},
sync::Arc,
};
use trie_db::{node::NodeOwned, CachedValue};
mod shared_cache;
pub use shared_cache::SharedTrieCache;
use self::shared_cache::{SharedTrieCacheInner, ValueCacheKeyHash};
const LOG_TARGET: &str = "trie-cache";
/// The size of the cache.
#[derive(Debug, Clone, Copy)]
pub enum CacheSize {
/// Do not limit the cache size.
Unlimited,
/// Let the cache use at maximum the given number of bytes.
Maximum(usize),
}
impl CacheSize {
/// Returns `true` if the `current_size` exceeds the allowed size.
fn exceeds(&self, current_size: usize) -> bool {
match self {
Self::Unlimited => false,
Self::Maximum(max) => *max < current_size,
}
}
}
/// The local trie cache.
///
/// This cache should be used per state instance created by the backend. One state instance refers
/// to the state of one block. It will cache all accesses to the state that could not be fulfilled
/// by the [`SharedTrieCache`]. These locally cached items are merged back into the shared trie
/// cache when this instance is dropped.
///
/// When using [`Self::as_trie_db_cache`] or [`Self::as_trie_db_mut_cache`], it will lock Mutexes.
/// So, it is important that these methods are not called multiple times, as they would otherwise
/// deadlock.
pub struct LocalTrieCache<H: Hasher> {
/// The shared trie cache that created this instance.
shared: SharedTrieCache<H>,
/// The local cache for the trie nodes.
node_cache: Mutex<HashMap<H::Out, NodeOwned<H::Out>>>,
/// Keeps track of all the trie nodes accessed in the shared cache.
///
/// This will be used to ensure that these nodes are brought to the front of the lru when this
/// local instance is merged back to the shared cache.
shared_node_cache_access: Mutex<HashSet<H::Out>>,
/// The local cache for the values.
value_cache: Mutex<
HashMap<
ValueCacheKey<'static, H::Out>,
CachedValue<H::Out>,
BuildNoHashHasher<ValueCacheKey<'static, H::Out>>,
>,
>,
/// Keeps track of all values accessed in the shared cache.
///
/// This will be used to ensure that these nodes are brought to the front of the lru when this
/// local instance is merged back to the shared cache. This can actually lead to collisions when
/// two [`ValueCacheKey`]s with different storage roots and keys map to the same hash. However,
/// as we only use this set to update the lru position it is fine, even if we bring the wrong
/// value to the top. The important part is that we always get the correct value from the value
/// cache for a given key.
shared_value_cache_access:
Mutex<HashSet<ValueCacheKeyHash, BuildNoHashHasher<ValueCacheKeyHash>>>,
}
impl<H: Hasher> LocalTrieCache<H> {
/// Return self as a [`TrieDB`](trie_db::TrieDB) compatible cache.
///
/// The given `storage_root` needs to be the storage root of the trie this cache is used for.
pub fn as_trie_db_cache(&self, storage_root: H::Out) -> TrieCache<'_, H> {
let shared_inner = self.shared.read_lock_inner();
let value_cache = ValueCache::ForStorageRoot {
storage_root,
local_value_cache: self.value_cache.lock(),
shared_value_cache_access: self.shared_value_cache_access.lock(),
};
TrieCache {
shared_inner,
local_cache: self.node_cache.lock(),
value_cache,
shared_node_cache_access: self.shared_node_cache_access.lock(),
}
}
/// Return self as [`TrieDBMut`](trie_db::TrieDBMut) compatible cache.
///
/// After finishing all operations with [`TrieDBMut`](trie_db::TrieDBMut) and having obtained
/// the new storage root, [`TrieCache::merge_into`] should be called to update this local
/// cache instance. If the function is not called, cached data is just thrown away and not
/// propagated to the shared cache. So, accessing these new items will be slower, but nothing
/// would break because of this.
pub fn as_trie_db_mut_cache(&self) -> TrieCache<'_, H> {
TrieCache {
shared_inner: self.shared.read_lock_inner(),
local_cache: self.node_cache.lock(),
value_cache: ValueCache::Fresh(Default::default()),
shared_node_cache_access: self.shared_node_cache_access.lock(),
}
}
}
impl<H: Hasher> Drop for LocalTrieCache<H> {
fn drop(&mut self) {
let mut shared_inner = self.shared.write_lock_inner();
shared_inner
.node_cache_mut()
.update(self.node_cache.lock().drain(), self.shared_node_cache_access.lock().drain());
shared_inner
.value_cache_mut()
.update(self.value_cache.lock().drain(), self.shared_value_cache_access.lock().drain());
}
}
/// The abstraction of the value cache for the [`TrieCache`].
enum ValueCache<'a, H> {
/// The value cache is fresh, aka not yet associated with any storage root.
/// This is used for example when a new trie is being built, to cache new values.
Fresh(HashMap<Arc<[u8]>, CachedValue<H>>),
/// The value cache is already bound to a specific storage root.
ForStorageRoot {
shared_value_cache_access: MutexGuard<
'a,
HashSet<ValueCacheKeyHash, nohash_hasher::BuildNoHashHasher<ValueCacheKeyHash>>,
>,
local_value_cache: MutexGuard<
'a,
HashMap<
ValueCacheKey<'static, H>,
CachedValue<H>,
nohash_hasher::BuildNoHashHasher<ValueCacheKey<'static, H>>,
>,
>,
storage_root: H,
},
}
impl<H: AsRef<[u8]> + std::hash::Hash + Eq + Clone + Copy> ValueCache<'_, H> {
/// Get the value for the given `key`.
fn get<'a>(
&'a mut self,
key: &[u8],
shared_value_cache: &'a SharedValueCache<H>,
) -> Option<&CachedValue<H>> {
match self {
Self::Fresh(map) => map.get(key),
Self::ForStorageRoot { local_value_cache, shared_value_cache_access, storage_root } => {
let key = ValueCacheKey::new_ref(key, *storage_root);
// We first need to look up in the local cache and then the shared cache.
// It can happen that some value is cached in the shared cache, but the
// weak reference of the data can not be upgraded anymore. This for example
// happens when the node is dropped that contains the strong reference to the data.
//
// So, the logic of the trie will look up the data and the node and store both
// in our local caches.
local_value_cache
.get(unsafe {
// SAFETY
//
// We need to convert the lifetime to make the compiler happy. However, as
// we only use the `key` for looking up the value, this lifetime conversion is
// safe.
std::mem::transmute::<&ValueCacheKey<'_, H>, &ValueCacheKey<'static, H>>(
&key,
)
})
.or_else(|| {
shared_value_cache.get(&key).map(|v| {
shared_value_cache_access.insert(key.get_hash());
v
})
})
},
}
}
/// Insert some new `value` under the given `key`.
fn insert(&mut self, key: &[u8], value: CachedValue<H>) {
match self {
Self::Fresh(map) => {
map.insert(key.into(), value);
},
Self::ForStorageRoot { local_value_cache, storage_root, .. } => {
local_value_cache.insert(ValueCacheKey::new_value(key, *storage_root), value);
},
}
}
}
/// The actual [`TrieCache`](trie_db::TrieCache) implementation.
///
/// If this instance was created for using it with a [`TrieDBMut`](trie_db::TrieDBMut), it needs to
/// be merged back into the [`LocalTrieCache`] with [`Self::merge_into`] after all operations are
/// done.
pub struct TrieCache<'a, H: Hasher> {
shared_inner: RwLockReadGuard<'a, SharedTrieCacheInner<H>>,
shared_node_cache_access: MutexGuard<'a, HashSet<H::Out>>,
local_cache: MutexGuard<'a, HashMap<H::Out, NodeOwned<H::Out>>>,
value_cache: ValueCache<'a, H::Out>,
}
impl<'a, H: Hasher> TrieCache<'a, H> {
/// Merge this cache into the given [`LocalTrieCache`].
///
/// This function is only required to be called when this instance was created through
/// [`LocalTrieCache::as_trie_db_mut_cache`], otherwise this method is a no-op. The given
/// `storage_root` is the new storage root that was obtained after finishing all operations
/// using the [`TrieDBMut`](trie_db::TrieDBMut).
pub fn merge_into(self, local: &LocalTrieCache<H>, storage_root: H::Out) {
let cache = if let ValueCache::Fresh(cache) = self.value_cache { cache } else { return };
if !cache.is_empty() {
let mut value_cache = local.value_cache.lock();
let partial_hash = ValueCacheKey::hash_partial_data(&storage_root);
cache
.into_iter()
.map(|(k, v)| {
let hash =
ValueCacheKeyHash::from_hasher_and_storage_key(partial_hash.clone(), &k);
(ValueCacheKey::Value { storage_key: k, storage_root, hash }, v)
})
.for_each(|(k, v)| {
value_cache.insert(k, v);
});
}
}
}
impl<'a, H: Hasher> trie_db::TrieCache<NodeCodec<H>> for TrieCache<'a, H> {
fn get_or_insert_node(
&mut self,
hash: H::Out,
fetch_node: &mut dyn FnMut() -> trie_db::Result<NodeOwned<H::Out>, H::Out, Error<H::Out>>,
) -> trie_db::Result<&NodeOwned<H::Out>, H::Out, Error<H::Out>> {
if let Some(res) = self.shared_inner.node_cache().get(&hash) {
tracing::trace!(target: LOG_TARGET, ?hash, "Serving node from shared cache");
self.shared_node_cache_access.insert(hash);
return Ok(res)
}
match self.local_cache.entry(hash) {
MapEntry::Occupied(res) => {
tracing::trace!(target: LOG_TARGET, ?hash, "Serving node from local cache");
Ok(res.into_mut())
},
MapEntry::Vacant(vacant) => {
let node = (*fetch_node)();
tracing::trace!(
target: LOG_TARGET,
?hash,
fetch_successful = node.is_ok(),
"Node not found, needed to fetch it."
);
Ok(vacant.insert(node?))
},
}
}
fn get_node(&mut self, hash: &H::Out) -> Option<&NodeOwned<H::Out>> {
if let Some(node) = self.shared_inner.node_cache().get(hash) {
tracing::trace!(target: LOG_TARGET, ?hash, "Getting node from shared cache");
self.shared_node_cache_access.insert(*hash);
return Some(node)
}
let res = self.local_cache.get(hash);
tracing::trace!(
target: LOG_TARGET,
?hash,
found = res.is_some(),
"Getting node from local cache"
);
res
}
fn lookup_value_for_key(&mut self, key: &[u8]) -> Option<&CachedValue<H::Out>> {
let res = self.value_cache.get(key, self.shared_inner.value_cache());
tracing::trace!(
target: LOG_TARGET,
key = ?sp_core::hexdisplay::HexDisplay::from(&key),
found = res.is_some(),
"Looked up value for key",
);
res
}
fn cache_value_for_key(&mut self, key: &[u8], data: CachedValue<H::Out>) {
tracing::trace!(
target: LOG_TARGET,
key = ?sp_core::hexdisplay::HexDisplay::from(&key),
"Caching value for key",
);
self.value_cache.insert(key.into(), data);
}
}
#[cfg(test)]
mod tests {
use super::*;
use trie_db::{Bytes, Trie, TrieDBBuilder, TrieDBMutBuilder, TrieHash, TrieMut};
type MemoryDB = crate::MemoryDB<sp_core::Blake2Hasher>;
type Layout = crate::LayoutV1<sp_core::Blake2Hasher>;
type Cache = super::SharedTrieCache<sp_core::Blake2Hasher>;
type Recorder = crate::recorder::Recorder<sp_core::Blake2Hasher>;
const TEST_DATA: &[(&[u8], &[u8])] =
&[(b"key1", b"val1"), (b"key2", &[2; 64]), (b"key3", b"val3"), (b"key4", &[4; 64])];
const CACHE_SIZE_RAW: usize = 1024 * 10;
const CACHE_SIZE: CacheSize = CacheSize::Maximum(CACHE_SIZE_RAW);
fn create_trie() -> (MemoryDB, TrieHash<Layout>) {
let mut db = MemoryDB::default();
let mut root = Default::default();
{
let mut trie = TrieDBMutBuilder::<Layout>::new(&mut db, &mut root).build();
for (k, v) in TEST_DATA {
trie.insert(k, v).expect("Inserts data");
}
}
(db, root)
}
#[test]
fn basic_cache_works() {
let (db, root) = create_trie();
let shared_cache = Cache::new(CACHE_SIZE);
let local_cache = shared_cache.local_cache();
{
let mut cache = local_cache.as_trie_db_cache(root);
let trie = TrieDBBuilder::<Layout>::new(&db, &root).with_cache(&mut cache).build();
assert_eq!(TEST_DATA[0].1.to_vec(), trie.get(TEST_DATA[0].0).unwrap().unwrap());
}
// Local cache wasn't dropped yet, so there should be nothing in the shared caches.
assert!(shared_cache.read_lock_inner().value_cache().lru.is_empty());
assert!(shared_cache.read_lock_inner().node_cache().lru.is_empty());
drop(local_cache);
// Now we should have the cached items in the shared cache.
assert!(shared_cache.read_lock_inner().node_cache().lru.len() >= 1);
let cached_data = shared_cache
.read_lock_inner()
.value_cache()
.lru
.peek(&ValueCacheKey::new_value(TEST_DATA[0].0, root))
.unwrap()
.clone();
assert_eq!(Bytes::from(TEST_DATA[0].1.to_vec()), cached_data.data().flatten().unwrap());
let fake_data = Bytes::from(&b"fake_data"[..]);
let local_cache = shared_cache.local_cache();
shared_cache.write_lock_inner().value_cache_mut().lru.put(
ValueCacheKey::new_value(TEST_DATA[1].0, root),
(fake_data.clone(), Default::default()).into(),
);
{
let mut cache = local_cache.as_trie_db_cache(root);
let trie = TrieDBBuilder::<Layout>::new(&db, &root).with_cache(&mut cache).build();
// We should now get the "fake_data", because we inserted this manually to the cache.
assert_eq!(b"fake_data".to_vec(), trie.get(TEST_DATA[1].0).unwrap().unwrap());
}
}
#[test]
fn trie_db_mut_cache_works() {
let (mut db, root) = create_trie();
let new_key = b"new_key".to_vec();
// Use some long value to not have it inlined
let new_value = vec![23; 64];
let shared_cache = Cache::new(CACHE_SIZE);
let mut new_root = root;
{
let local_cache = shared_cache.local_cache();
let mut cache = local_cache.as_trie_db_mut_cache();
{
let mut trie = TrieDBMutBuilder::<Layout>::from_existing(&mut db, &mut new_root)
.with_cache(&mut cache)
.build();
trie.insert(&new_key, &new_value).unwrap();
}
cache.merge_into(&local_cache, new_root);
}
// After the local cache is dropped, all changes should have been merged back to the shared
// cache.
let cached_data = shared_cache
.read_lock_inner()
.value_cache()
.lru
.peek(&ValueCacheKey::new_value(new_key, new_root))
.unwrap()
.clone();
assert_eq!(Bytes::from(new_value), cached_data.data().flatten().unwrap());
}
#[test]
fn trie_db_cache_and_recorder_work_together() {
let (db, root) = create_trie();
let shared_cache = Cache::new(CACHE_SIZE);
for i in 0..5 {
// Clear some of the caches.
if i == 2 {
shared_cache.reset_node_cache();
} else if i == 3 {
shared_cache.reset_value_cache();
}
let local_cache = shared_cache.local_cache();
let recorder = Recorder::default();
{
let mut cache = local_cache.as_trie_db_cache(root);
let mut recorder = recorder.as_trie_recorder();
let trie = TrieDBBuilder::<Layout>::new(&db, &root)
.with_cache(&mut cache)
.with_recorder(&mut recorder)
.build();
for (key, value) in TEST_DATA {
assert_eq!(*value, trie.get(&key).unwrap().unwrap());
}
}
let storage_proof = recorder.drain_storage_proof();
let memory_db: MemoryDB = storage_proof.into_memory_db();
{
let trie = TrieDBBuilder::<Layout>::new(&memory_db, &root).build();
for (key, value) in TEST_DATA {
assert_eq!(*value, trie.get(&key).unwrap().unwrap());
}
}
}
}
#[test]
fn trie_db_mut_cache_and_recorder_work_together() {
const DATA_TO_ADD: &[(&[u8], &[u8])] = &[(b"key11", &[45; 78]), (b"key33", &[78; 89])];
let (db, root) = create_trie();
let shared_cache = Cache::new(CACHE_SIZE);
// Run this multiple times so that we also hit the caches filled by previous runs.
for i in 0..5 {
// Clear some of the caches.
if i == 2 {
shared_cache.reset_node_cache();
} else if i == 3 {
shared_cache.reset_value_cache();
}
let recorder = Recorder::default();
let local_cache = shared_cache.local_cache();
let mut new_root = root;
{
let mut db = db.clone();
let mut cache = local_cache.as_trie_db_cache(root);
let mut recorder = recorder.as_trie_recorder();
let mut trie = TrieDBMutBuilder::<Layout>::from_existing(&mut db, &mut new_root)
.with_cache(&mut cache)
.with_recorder(&mut recorder)
.build();
for (key, value) in DATA_TO_ADD {
trie.insert(key, value).unwrap();
}
}
let storage_proof = recorder.drain_storage_proof();
let mut memory_db: MemoryDB = storage_proof.into_memory_db();
let mut proof_root = root;
{
let mut trie =
TrieDBMutBuilder::<Layout>::from_existing(&mut memory_db, &mut proof_root)
.build();
for (key, value) in DATA_TO_ADD {
trie.insert(key, value).unwrap();
}
}
assert_eq!(new_root, proof_root)
}
}
#[test]
fn cache_lru_works() {
let (db, root) = create_trie();
let shared_cache = Cache::new(CACHE_SIZE);
{
let local_cache = shared_cache.local_cache();
let mut cache = local_cache.as_trie_db_cache(root);
let trie = TrieDBBuilder::<Layout>::new(&db, &root).with_cache(&mut cache).build();
for (k, _) in TEST_DATA {
trie.get(k).unwrap().unwrap();
}
}
// Check that all items are there.
assert!(shared_cache
.read_lock_inner()
.value_cache()
.lru
.iter()
.map(|d| d.0)
.all(|l| TEST_DATA.iter().any(|d| l.storage_key().unwrap() == d.0)));
{
let local_cache = shared_cache.local_cache();
let mut cache = local_cache.as_trie_db_cache(root);
let trie = TrieDBBuilder::<Layout>::new(&db, &root).with_cache(&mut cache).build();
for (k, _) in TEST_DATA.iter().take(2) {
trie.get(k).unwrap().unwrap();
}
}
// Ensure that the accessed items are most recently used items of the shared value cache.
assert!(shared_cache
.read_lock_inner()
.value_cache()
.lru
.iter()
.take(2)
.map(|d| d.0)
.all(|l| { TEST_DATA.iter().take(2).any(|d| l.storage_key().unwrap() == d.0) }));
let most_recently_used_nodes = shared_cache
.read_lock_inner()
.node_cache()
.lru
.iter()
.map(|d| *d.0)
.collect::<Vec<_>>();
// Delete the value cache, so that we access the nodes.
shared_cache.reset_value_cache();
{
let local_cache = shared_cache.local_cache();
let mut cache = local_cache.as_trie_db_cache(root);
let trie = TrieDBBuilder::<Layout>::new(&db, &root).with_cache(&mut cache).build();
for (k, _) in TEST_DATA.iter().take(2) {
trie.get(k).unwrap().unwrap();
}
}
// Ensure that the most recently used nodes changed as well.
assert_ne!(
most_recently_used_nodes,
shared_cache
.read_lock_inner()
.node_cache()
.lru
.iter()
.map(|d| *d.0)
.collect::<Vec<_>>()
);
}
#[test]
fn cache_respects_bounds() {
let (mut db, root) = create_trie();
let shared_cache = Cache::new(CACHE_SIZE);
{
let local_cache = shared_cache.local_cache();
let mut new_root = root;
{
let mut cache = local_cache.as_trie_db_cache(root);
{
let mut trie =
TrieDBMutBuilder::<Layout>::from_existing(&mut db, &mut new_root)
.with_cache(&mut cache)
.build();
let value = vec![10u8; 100];
// Ensure we add enough data that would overflow the cache.
for i in 0..CACHE_SIZE_RAW / 100 * 2 {
trie.insert(format!("key{}", i).as_bytes(), &value).unwrap();
}
}
cache.merge_into(&local_cache, new_root);
}
}
let node_cache_size = shared_cache.read_lock_inner().node_cache().size_in_bytes;
let value_cache_size = shared_cache.read_lock_inner().value_cache().size_in_bytes;
assert!(node_cache_size + value_cache_size < CACHE_SIZE_RAW);
}
}
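The byte-bounded eviction that `SharedNodeCache::update` performs on its `LruCache` in the next file can be illustrated with a self-contained sketch. This is a simplified stand-in using `VecDeque` instead of the `lru` crate; `BoundedCache` and its fields are hypothetical names for this sketch, not part of the actual code:

```rust
use std::collections::VecDeque;

// Simplified stand-in for the shared cache's byte-bounded eviction: every
// insertion adds the entry's size to a running total, then least recently
// inserted entries are evicted until the total is back under the limit.
struct BoundedCache {
    entries: VecDeque<(Vec<u8>, usize)>, // (key, size in bytes of key + value)
    size_in_bytes: usize,
    max_size: usize,
}

impl BoundedCache {
    fn new(max_size: usize) -> Self {
        Self { entries: VecDeque::new(), size_in_bytes: 0, max_size }
    }

    fn push(&mut self, key: Vec<u8>, entry_size: usize) {
        self.size_in_bytes += entry_size;
        self.entries.push_back((key, entry_size));
        // Evict on every insertion so the backing structure never grows far
        // past the configured limit.
        while self.size_in_bytes > self.max_size {
            match self.entries.pop_front() {
                // `saturating_sub` guards against an accounting underflow.
                Some((_, freed)) =>
                    self.size_in_bytes = self.size_in_bytes.saturating_sub(freed),
                None => break,
            }
        }
    }
}

fn main() {
    let mut cache = BoundedCache::new(250);
    cache.push(b"a".to_vec(), 100);
    cache.push(b"b".to_vec(), 100);
    cache.push(b"c".to_vec(), 100); // total would be 300, so "a" is evicted
    assert!(cache.size_in_bytes <= 250);
    assert_eq!(cache.entries.len(), 2);
}
```

The real implementation additionally moves `accessed` entries to the front of the lru before inserting the `added` ones, and logs an error instead of silently saturating on underflow.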
@@ -0,0 +1,677 @@
// This file is part of Substrate.
// Copyright (C) 2022 Parity Technologies (UK) Ltd.
// SPDX-License-Identifier: Apache-2.0
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//! Provides the [`SharedNodeCache`], the [`SharedValueCache`] and the [`SharedTrieCache`]
//! that combines both caches and is exported to the outside.
use super::{CacheSize, LOG_TARGET};
use hash_db::Hasher;
use hashbrown::{hash_set::Entry as SetEntry, HashSet};
use lru::LruCache;
use nohash_hasher::BuildNoHashHasher;
use parking_lot::{RwLock, RwLockReadGuard, RwLockWriteGuard};
use std::{
hash::{BuildHasher, Hasher as _},
mem,
sync::Arc,
};
use trie_db::{node::NodeOwned, CachedValue};
lazy_static::lazy_static! {
static ref RANDOM_STATE: ahash::RandomState = ahash::RandomState::default();
}
/// No hashing [`LruCache`].
type NoHashingLruCache<K, T> = lru::LruCache<K, T, BuildNoHashHasher<K>>;
/// The shared node cache.
///
/// Internally this stores all cached nodes in a [`LruCache`]. It ensures that the cache stays
/// within its allowed bounds when it is updated.
pub(super) struct SharedNodeCache<H> {
/// The cached nodes, ordered by least recently used.
pub(super) lru: LruCache<H, NodeOwned<H>>,
/// The size of [`Self::lru`] in bytes.
pub(super) size_in_bytes: usize,
/// The maximum cache size of [`Self::lru`].
maximum_cache_size: CacheSize,
}
impl<H: AsRef<[u8]> + Eq + std::hash::Hash> SharedNodeCache<H> {
/// Create a new instance.
fn new(cache_size: CacheSize) -> Self {
Self { lru: LruCache::unbounded(), size_in_bytes: 0, maximum_cache_size: cache_size }
}
/// Get the node for `key`.
///
/// This doesn't change the least recently used order in the internal [`LruCache`].
pub fn get(&self, key: &H) -> Option<&NodeOwned<H>> {
self.lru.peek(key)
}
/// Update the cache with the `added` nodes and the `accessed` nodes.
///
/// The `added` nodes are the ones that have been collected by doing operations on the trie and
/// now should be stored in the shared cache. The `accessed` nodes are only referenced by hash
/// and represent the nodes that were retrieved from this shared cache through [`Self::get`].
/// These `accessed` nodes are being put to the front of the internal [`LruCache`] like the
/// `added` ones.
///
/// After the internal [`LruCache`] was updated, it is ensured that it stays inside its bounds
/// ([`Self::maximum_cache_size`]).
pub fn update(
&mut self,
added: impl IntoIterator<Item = (H, NodeOwned<H>)>,
accessed: impl IntoIterator<Item = H>,
) {
let update_size_in_bytes = |size_in_bytes: &mut usize, key: &H, node: &NodeOwned<H>| {
if let Some(new_size_in_bytes) =
size_in_bytes.checked_sub(key.as_ref().len() + node.size_in_bytes())
{
*size_in_bytes = new_size_in_bytes;
} else {
*size_in_bytes = 0;
tracing::error!(target: LOG_TARGET, "`SharedNodeCache` underflow detected!",);
}
};
accessed.into_iter().for_each(|key| {
// Access every node in the lru to put it to the front.
self.lru.get(&key);
});
added.into_iter().for_each(|(key, node)| {
self.size_in_bytes += key.as_ref().len() + node.size_in_bytes();
if let Some((r_key, r_node)) = self.lru.push(key, node) {
update_size_in_bytes(&mut self.size_in_bytes, &r_key, &r_node);
}
// Directly ensure that we respect the maximum size. By doing it directly here we ensure
// that the internal map of the [`LruCache`] doesn't grow too much.
while self.maximum_cache_size.exceeds(self.size_in_bytes) {
// This should always be `Some(_)`, otherwise something is wrong!
if let Some((key, node)) = self.lru.pop_lru() {
update_size_in_bytes(&mut self.size_in_bytes, &key, &node);
}
}
});
}
/// Reset the cache.
fn reset(&mut self) {
self.size_in_bytes = 0;
self.lru.clear();
}
}
/// The hash of [`ValueCacheKey`].
#[derive(Eq, Clone, Copy)]
pub struct ValueCacheKeyHash(u64);
impl ValueCacheKeyHash {
pub fn from_hasher_and_storage_key(
mut hasher: impl std::hash::Hasher,
storage_key: &[u8],
) -> Self {
hasher.write(storage_key);
Self(hasher.finish())
}
}
impl PartialEq for ValueCacheKeyHash {
fn eq(&self, other: &Self) -> bool {
self.0 == other.0
}
}
impl std::hash::Hash for ValueCacheKeyHash {
fn hash<Hasher: std::hash::Hasher>(&self, state: &mut Hasher) {
state.write_u64(self.0);
}
}
impl nohash_hasher::IsEnabled for ValueCacheKeyHash {}
/// A type that can only be constructed inside of this file.
///
/// It "requires" that the user has read the docs to prevent fuck ups.
#[derive(Eq, PartialEq)]
pub(super) struct IReadTheDocumentation(());
/// The key type that is being used to address a [`CachedValue`].
///
/// This type is implemented as `enum` to improve the performance when accessing the value cache.
/// The problem is that we would need to calculate the `hash` of [`Self`] in the worst case three
/// times when trying to find a value in the value cache: first to look up the local cache, then
/// the shared cache and, if we found it in the shared cache, a third time to insert it into the
/// list of accessed values. To work around this, each variant stores the `hash` that identifies a
/// unique combination of `storage_key` and `storage_root`. However, be aware that this `hash` can
/// lead to collisions when there are two different `storage_key` and `storage_root` pairs that
/// map to the same `hash`. This type also has a `Hash` variant. This variant should only be used
/// for updating the lru position of a key, because when using only the `Hash` variant to get a
/// value from a hash map, a wrong value could be returned when another key in the same hash map
/// maps to the same `hash`. The [`PartialEq`] implementation is written in a way that when one
/// of the two compared instances is the `Hash` variant, we will only compare the hashes. This
/// ensures that we can use the `Hash` variant to bring values up in the lru.
#[derive(Eq)]
pub(super) enum ValueCacheKey<'a, H> {
/// Variant that stores the `storage_key` by value.
Value {
/// The storage root of the trie this key belongs to.
storage_root: H,
/// The key to access the value in the storage.
storage_key: Arc<[u8]>,
/// The hash that identifies this instance of `storage_root` and `storage_key`.
hash: ValueCacheKeyHash,
},
/// Variant that only references the `storage_key`.
Ref {
/// The storage root of the trie this key belongs to.
storage_root: H,
/// The key to access the value in the storage.
storage_key: &'a [u8],
/// The hash that identifies this instance of `storage_root` and `storage_key`.
hash: ValueCacheKeyHash,
},
/// Variant that only stores the hash that represents the `storage_root` and `storage_key`.
///
/// This should be used with caution, because it can lead to accessing the wrong value in a
/// hash map/set when two different `storage_root` and `storage_key` pairs map to the
/// same `hash`.
Hash { hash: ValueCacheKeyHash, _i_read_the_documentation: IReadTheDocumentation },
}
impl<'a, H> ValueCacheKey<'a, H> {
/// Constructs [`Self::Value`].
pub fn new_value(storage_key: impl Into<Arc<[u8]>>, storage_root: H) -> Self
where
H: AsRef<[u8]>,
{
let storage_key = storage_key.into();
let hash = Self::hash_data(&storage_key, &storage_root);
Self::Value { storage_root, storage_key, hash }
}
/// Constructs [`Self::Ref`].
pub fn new_ref(storage_key: &'a [u8], storage_root: H) -> Self
where
H: AsRef<[u8]>,
{
let hash = Self::hash_data(storage_key, &storage_root);
Self::Ref { storage_root, storage_key, hash }
}
/// Returns a hasher prepared to build the final hash to identify [`Self`].
///
/// See [`Self::hash_data`] for building the hash directly.
pub fn hash_partial_data(storage_root: &H) -> impl std::hash::Hasher + Clone
where
H: AsRef<[u8]>,
{
let mut hasher = RANDOM_STATE.build_hasher();
hasher.write(storage_root.as_ref());
hasher
}
/// Hash the `key` and `storage_root` that identify [`Self`].
///
/// Returns a `u64` which represents the unique hash for the given inputs.
pub fn hash_data(key: &[u8], storage_root: &H) -> ValueCacheKeyHash
where
H: AsRef<[u8]>,
{
let hasher = Self::hash_partial_data(storage_root);
ValueCacheKeyHash::from_hasher_and_storage_key(hasher, key)
}
/// Returns the `hash` that identifies the current instance.
pub fn get_hash(&self) -> ValueCacheKeyHash {
match self {
Self::Value { hash, .. } | Self::Ref { hash, .. } | Self::Hash { hash, .. } => *hash,
}
}
/// Returns the stored storage root.
pub fn storage_root(&self) -> Option<&H> {
match self {
Self::Value { storage_root, .. } | Self::Ref { storage_root, .. } => Some(storage_root),
Self::Hash { .. } => None,
}
}
/// Returns the stored storage key.
pub fn storage_key(&self) -> Option<&[u8]> {
match self {
Self::Ref { storage_key, .. } => Some(&storage_key),
Self::Value { storage_key, .. } => Some(storage_key),
Self::Hash { .. } => None,
}
}
}
// Implement manually to ensure that the `Value` and `Hash` are treated equally.
impl<H: std::hash::Hash> std::hash::Hash for ValueCacheKey<'_, H> {
fn hash<Hasher: std::hash::Hasher>(&self, state: &mut Hasher) {
self.get_hash().hash(state)
}
}
impl<H> nohash_hasher::IsEnabled for ValueCacheKey<'_, H> {}
// Implement manually to ensure that the `Value` and `Hash` are treated equally.
impl<H: PartialEq> PartialEq for ValueCacheKey<'_, H> {
fn eq(&self, other: &Self) -> bool {
// First check if `self` or `other` is only the `Hash`.
// Then we only compare the `hash`. So, there could actually be some collision
// if two different storage roots and keys are mapping to the same key. See the
// [`ValueCacheKey`] docs for more information.
match (self, other) {
(Self::Hash { hash, .. }, Self::Hash { hash: other_hash, .. }) => hash == other_hash,
(Self::Hash { hash, .. }, _) => *hash == other.get_hash(),
(_, Self::Hash { hash: other_hash, .. }) => self.get_hash() == *other_hash,
// If both are not the `Hash` variant, we compare all the values.
_ =>
self.get_hash() == other.get_hash() &&
self.storage_root() == other.storage_root() &&
self.storage_key() == other.storage_key(),
}
}
}
/// The shared value cache.
///
/// The cache ensures that it stays in the configured size bounds.
pub(super) struct SharedValueCache<H> {
/// The cached nodes, ordered by least recently used.
pub(super) lru: NoHashingLruCache<ValueCacheKey<'static, H>, CachedValue<H>>,
/// The size of [`Self::lru`] in bytes.
pub(super) size_in_bytes: usize,
/// The maximum cache size of [`Self::lru`].
maximum_cache_size: CacheSize,
/// All known storage keys that are stored in [`Self::lru`].
///
/// This is used to de-duplicate keys in memory that use the same
/// `storage_key`, but have a different `storage_root`
/// (see [`ValueCacheKey`]).
known_storage_keys: HashSet<Arc<[u8]>>,
}
impl<H: Eq + std::hash::Hash + Clone + Copy + AsRef<[u8]>> SharedValueCache<H> {
/// Create a new instance.
fn new(cache_size: CacheSize) -> Self {
Self {
lru: NoHashingLruCache::unbounded_with_hasher(Default::default()),
size_in_bytes: 0,
maximum_cache_size: cache_size,
known_storage_keys: Default::default(),
}
}
/// Get the [`CachedValue`] for `key`.
///
/// This doesn't change the least recently used order in the internal [`LruCache`].
pub fn get<'a>(&'a self, key: &ValueCacheKey<H>) -> Option<&'a CachedValue<H>> {
debug_assert!(
!matches!(key, ValueCacheKey::Hash { .. }),
"`get` cannot be called with the `Hash` variant as this may return the wrong value."
);
self.lru.peek(unsafe {
// SAFETY
//
// We need to convert the lifetime to make the compiler happy. However, as
// we only use the `key` to look up the value, this lifetime conversion is
// safe.
mem::transmute::<&ValueCacheKey<'_, H>, &ValueCacheKey<'static, H>>(key)
})
}
/// Update the cache with the `added` values and the `accessed` values.
///
/// The `added` values are the ones that have been collected by doing operations on the trie and
/// now should be stored in the shared cache. The `accessed` values are only referenced by the
/// [`ValueCacheKeyHash`] and represent the values that were retrieved from this shared cache
/// through [`Self::get`]. These `accessed` values are being put to the front of the internal
/// [`LruCache`] like the `added` ones.
///
/// After the internal [`LruCache`] was updated, it is ensured that the internal [`LruCache`] is
/// inside its bounds ([`Self::maximum_cache_size`]).
pub fn update(
&mut self,
added: impl IntoIterator<Item = (ValueCacheKey<'static, H>, CachedValue<H>)>,
accessed: impl IntoIterator<Item = ValueCacheKeyHash>,
) {
// The base size in memory per ([`ValueCacheKey<H>`], [`CachedValue`]).
let base_size = mem::size_of::<ValueCacheKey<H>>() + mem::size_of::<CachedValue<H>>();
let known_keys_entry_size = mem::size_of::<Arc<[u8]>>();
let update_size_in_bytes =
|size_in_bytes: &mut usize, r_key: Arc<[u8]>, known_keys: &mut HashSet<Arc<[u8]>>| {
// If `strong_count == 2`, this is the last instance of the key: one
// reference is `r_key` and the other is stored in `known_storage_keys`.
let last_instance = Arc::strong_count(&r_key) == 2;
let key_len = if last_instance {
known_keys.remove(&r_key);
r_key.len() + known_keys_entry_size
} else {
// The key is still in `known_keys`, because it is still used by another
// `ValueCacheKey<H>`.
0
};
if let Some(new_size_in_bytes) = size_in_bytes.checked_sub(key_len + base_size) {
*size_in_bytes = new_size_in_bytes;
} else {
*size_in_bytes = 0;
tracing::error!(target: LOG_TARGET, "`SharedValueCache` underflow detected!",);
}
};
accessed.into_iter().for_each(|key| {
// Access every node in the lru to move it to the front.
// As we are using the `Hash` variant here, it may lead to moving the wrong
// value to the front. However, the only consequence of this is that we may
// prune a recently used value too early.
self.lru.get(&ValueCacheKey::Hash {
hash: key,
_i_read_the_documentation: IReadTheDocumentation(()),
});
});
added.into_iter().for_each(|(key, value)| {
let (storage_root, storage_key, key_hash) = match key {
ValueCacheKey::Hash { .. } => {
// Ignore the hash variant and try the next.
tracing::error!(
target: LOG_TARGET,
"`SharedValueCache::update` was called with a key to add \
that uses the `Hash` variant. This would lead to potential hash collisions!",
);
return
},
ValueCacheKey::Ref { storage_key, storage_root, hash } =>
(storage_root, storage_key.into(), hash),
ValueCacheKey::Value { storage_root, storage_key, hash } =>
(storage_root, storage_key, hash),
};
let (size_update, storage_key) =
match self.known_storage_keys.entry(storage_key.clone()) {
SetEntry::Vacant(v) => {
let len = v.get().len();
v.insert();
// If the key was unknown, we need to also take its length and the size of
// the entry of `known_keys` into account.
(len + base_size + known_keys_entry_size, storage_key)
},
SetEntry::Occupied(o) => {
// Key is known
(base_size, o.get().clone())
},
};
self.size_in_bytes += size_update;
if let Some((r_key, _)) = self
.lru
.push(ValueCacheKey::Value { storage_key, storage_root, hash: key_hash }, value)
{
if let ValueCacheKey::Value { storage_key, .. } = r_key {
update_size_in_bytes(
&mut self.size_in_bytes,
storage_key,
&mut self.known_storage_keys,
);
}
}
// Directly ensure that we respect the maximum size. By doing it directly here we
// ensure that the internal map of the [`LruCache`] doesn't grow too much.
while self.maximum_cache_size.exceeds(self.size_in_bytes) {
// This should always be `Some(_)`, otherwise something is wrong!
if let Some((r_key, _)) = self.lru.pop_lru() {
if let ValueCacheKey::Value { storage_key, .. } = r_key {
update_size_in_bytes(
&mut self.size_in_bytes,
storage_key,
&mut self.known_storage_keys,
);
}
}
}
});
}
/// Reset the cache.
fn reset(&mut self) {
self.size_in_bytes = 0;
self.lru.clear();
self.known_storage_keys.clear();
}
}
/// The inner of [`SharedTrieCache`].
pub(super) struct SharedTrieCacheInner<H: Hasher> {
node_cache: SharedNodeCache<H::Out>,
value_cache: SharedValueCache<H::Out>,
}
impl<H: Hasher> SharedTrieCacheInner<H> {
/// Returns a reference to the [`SharedValueCache`].
pub(super) fn value_cache(&self) -> &SharedValueCache<H::Out> {
&self.value_cache
}
/// Returns a mutable reference to the [`SharedValueCache`].
pub(super) fn value_cache_mut(&mut self) -> &mut SharedValueCache<H::Out> {
&mut self.value_cache
}
/// Returns a reference to the [`SharedNodeCache`].
pub(super) fn node_cache(&self) -> &SharedNodeCache<H::Out> {
&self.node_cache
}
/// Returns a mutable reference to the [`SharedNodeCache`].
pub(super) fn node_cache_mut(&mut self) -> &mut SharedNodeCache<H::Out> {
&mut self.node_cache
}
}
/// The shared trie cache.
///
/// It should be instantiated once per node. It holds the trie nodes and values of all
/// operations on the state. To avoid using all available memory, it ensures that it stays
/// within the bounds given via the [`CacheSize`] at startup.
///
/// The instance of this object can be shared between multiple threads.
pub struct SharedTrieCache<H: Hasher> {
inner: Arc<RwLock<SharedTrieCacheInner<H>>>,
}
impl<H: Hasher> Clone for SharedTrieCache<H> {
fn clone(&self) -> Self {
Self { inner: self.inner.clone() }
}
}
impl<H: Hasher> SharedTrieCache<H> {
/// Create a new [`SharedTrieCache`].
pub fn new(cache_size: CacheSize) -> Self {
let (node_cache_size, value_cache_size) = match cache_size {
CacheSize::Maximum(max) => {
// Allocate 20% for the value cache.
let value_cache_size_in_bytes = (max as f32 * 0.20) as usize;
(
CacheSize::Maximum(max - value_cache_size_in_bytes),
CacheSize::Maximum(value_cache_size_in_bytes),
)
},
CacheSize::Unlimited => (CacheSize::Unlimited, CacheSize::Unlimited),
};
Self {
inner: Arc::new(RwLock::new(SharedTrieCacheInner {
node_cache: SharedNodeCache::new(node_cache_size),
value_cache: SharedValueCache::new(value_cache_size),
})),
}
}
/// Create a new [`LocalTrieCache`](super::LocalTrieCache) instance from this shared cache.
pub fn local_cache(&self) -> super::LocalTrieCache<H> {
super::LocalTrieCache {
shared: self.clone(),
node_cache: Default::default(),
value_cache: Default::default(),
shared_node_cache_access: Default::default(),
shared_value_cache_access: Default::default(),
}
}
/// Returns the used memory size of this cache in bytes.
pub fn used_memory_size(&self) -> usize {
let inner = self.inner.read();
let value_cache_size = inner.value_cache.size_in_bytes;
let node_cache_size = inner.node_cache.size_in_bytes;
node_cache_size + value_cache_size
}
/// Reset the node cache.
pub fn reset_node_cache(&self) {
self.inner.write().node_cache.reset();
}
/// Reset the value cache.
pub fn reset_value_cache(&self) {
self.inner.write().value_cache.reset();
}
/// Reset the entire cache.
pub fn reset(&self) {
self.reset_node_cache();
self.reset_value_cache();
}
/// Returns the read locked inner.
pub(super) fn read_lock_inner(&self) -> RwLockReadGuard<'_, SharedTrieCacheInner<H>> {
self.inner.read()
}
/// Returns the write locked inner.
pub(super) fn write_lock_inner(&self) -> RwLockWriteGuard<'_, SharedTrieCacheInner<H>> {
self.inner.write()
}
}
#[cfg(test)]
mod tests {
use super::*;
use sp_core::H256 as Hash;
#[test]
fn shared_value_cache_works() {
let base_size = mem::size_of::<CachedValue<Hash>>() + mem::size_of::<ValueCacheKey<Hash>>();
let arc_size = mem::size_of::<Arc<[u8]>>();
let mut cache = SharedValueCache::<sp_core::H256>::new(CacheSize::Maximum(
(base_size + arc_size + 10) * 10,
));
let key = vec![0; 10];
let root0 = Hash::repeat_byte(1);
let root1 = Hash::repeat_byte(2);
cache.update(
vec![
(ValueCacheKey::new_value(&key[..], root0), CachedValue::NonExisting),
(ValueCacheKey::new_value(&key[..], root1), CachedValue::NonExisting),
],
vec![],
);
// Ensure that the basics are working
assert_eq!(1, cache.known_storage_keys.len());
assert_eq!(3, Arc::strong_count(cache.known_storage_keys.get(&key[..]).unwrap()));
assert_eq!(base_size * 2 + key.len() + arc_size, cache.size_in_bytes);
// Just accessing a key should not change the size or the number of entries.
cache.update(vec![], vec![ValueCacheKey::hash_data(&key[..], &root0)]);
assert_eq!(1, cache.known_storage_keys.len());
assert_eq!(3, Arc::strong_count(cache.known_storage_keys.get(&key[..]).unwrap()));
assert_eq!(base_size * 2 + key.len() + arc_size, cache.size_in_bytes);
// Add 9 other entries and this should move out the key for `root1`.
cache.update(
(1..10)
.map(|i| vec![i; 10])
.map(|key| (ValueCacheKey::new_value(&key[..], root0), CachedValue::NonExisting)),
vec![],
);
assert_eq!(10, cache.known_storage_keys.len());
assert_eq!(2, Arc::strong_count(cache.known_storage_keys.get(&key[..]).unwrap()));
assert_eq!((base_size + key.len() + arc_size) * 10, cache.size_in_bytes);
assert!(matches!(
cache.get(&ValueCacheKey::new_ref(&key, root0)).unwrap(),
CachedValue::<Hash>::NonExisting
));
assert!(cache.get(&ValueCacheKey::new_ref(&key, root1)).is_none());
cache.update(
vec![(ValueCacheKey::new_value(vec![10; 10], root0), CachedValue::NonExisting)],
vec![],
);
assert!(cache.known_storage_keys.get(&key[..]).is_none());
}
#[test]
fn value_cache_key_eq_works() {
let storage_key = &b"something"[..];
let storage_key2 = &b"something2"[..];
let storage_root = Hash::random();
let value = ValueCacheKey::new_value(storage_key, storage_root);
// Ref gets the same hash, but a different storage key
let ref_ =
ValueCacheKey::Ref { storage_root, storage_key: storage_key2, hash: value.get_hash() };
let hash = ValueCacheKey::Hash {
hash: value.get_hash(),
_i_read_the_documentation: IReadTheDocumentation(()),
};
// Ensure that the hash variant is equal to `value`, `ref_` and itself.
assert!(hash == value);
assert!(value == hash);
assert!(hash == ref_);
assert!(ref_ == hash);
assert!(hash == hash);
// But when we compare `value` and `ref_` the different storage key is detected.
assert!(value != ref_);
assert!(ref_ != value);
}
}