Skip to content

Commit 5b8046a

Browse files
committed
Merge bitcoin#30611: validation: write chainstate to disk every hour
e976bd3 validation: add randomness to periodic write interval (Andrew Toth) 2e2f410 refactor: replace m_last_write with m_next_write (Andrew Toth) b557fa7 refactor: rename fDoFullFlush to should_write (Andrew Toth) d73bd9f validation: write chainstate to disk every hour (Andrew Toth) 0ad7d7a test: chainstate write test for periodic chainstate flush (Andrew Toth) Pull request description: Since bitcoin#28233, periodically writing the chainstate to disk every 24 hours does not clear the dbcache. Since bitcoin#28280, periodically writing the chainstate to disk is proportional only to the amount of dirty entries in the cache. Due to these changes, it is no longer beneficial to only write the chainstate to disk every 24 hours. The periodic flush interval was necessary because every write of the chainstate would clear the dbcache. Now, we can get rid of the periodic flush interval and simply write the chainstate along with blocks and block index at least every hour. Three benefits of doing this: 1. For IBD or reindex-chainstate with a combination of large dbcache setting, slow CPU, slow internet speed/unreliable peers, it could be up to 24 hours until the chainstate is persisted to disk. A power outage or crash could potentially lose up to 24 hours of progress. If there is a very large amount of dirty cache entries, writing to disk when a flush finally does occur will take a very long time. Crashing during this window of writing can cause bitcoin#11600. By syncing every hour in unison with the block index we avoid this problem. Only a maximum of one hour of progress can be lost, and the window for crashing during writing is much smaller. For IBD with lower dbcache settings, faster CPU, or better internet speed/reliable peers, chainstate writes are already triggered more often than every hour so this change will have no effect on IBD. 2. Based on discussion in bitcoin#28280, writing only once every 24 hours during long running operation of a node causes IO spikes. Writing smaller chainstate changes every hour like we do with blocks and block index will reduce IO spikes. 3. Faster shutdown speeds. All dirty chainstate entries must be persisted to disk on shutdown. If we have a lot of dirty entries, such as when close to 24 hours or if we sync with a large dbcache, it can take a long time to shutdown. By keeping the chainstate clean we avoid this problem. Inspired by [this comment](bitcoin#28280 (comment)). Resolves bitcoin#11600 ACKs for top commit: achow101: ACK e976bd3 davidgumberg: utACK bitcoin@e976bd3 sipa: utACK e976bd3 l0rinc: ACK e976bd3 Tree-SHA512: 5bccd8f1dea47f9820a3fd32fe3bb6841c0167b3d6870cc8f3f7e2368f124af1a914bca6acb06889cd7183638a8dbdbace54d3237c3683f2b567eb7355e015ee
2 parents fc6346d + e976bd3 commit 5b8046a

File tree

4 files changed

+92
-51
lines changed

4 files changed

+92
-51
lines changed

src/test/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@ add_executable(test_bitcoin
2525
blockmanager_tests.cpp
2626
bloom_tests.cpp
2727
bswap_tests.cpp
28+
chainstate_write_tests.cpp
2829
checkqueue_tests.cpp
2930
cluster_linearize_tests.cpp
3031
coins_tests.cpp

src/test/chainstate_write_tests.cpp

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
// Copyright (c) The Bitcoin Core developers
2+
// Distributed under the MIT software license, see the accompanying
3+
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
4+
5+
#include <test/util/setup_common.h>
6+
#include <validation.h>
7+
#include <validationinterface.h>
8+
9+
#include <boost/test/unit_test.hpp>
10+
11+
BOOST_AUTO_TEST_SUITE(chainstate_write_tests)
12+
13+
BOOST_FIXTURE_TEST_CASE(chainstate_write_interval, TestingSetup)
14+
{
15+
struct TestSubscriber final : CValidationInterface {
16+
bool m_did_flush{false};
17+
void ChainStateFlushed(ChainstateRole, const CBlockLocator&) override
18+
{
19+
m_did_flush = true;
20+
}
21+
};
22+
23+
const auto sub{std::make_shared<TestSubscriber>()};
24+
m_node.validation_signals->RegisterSharedValidationInterface(sub);
25+
auto& chainstate{Assert(m_node.chainman)->ActiveChainstate()};
26+
BlockValidationState state_dummy{};
27+
28+
// The first periodic flush sets m_next_write and does not flush
29+
chainstate.FlushStateToDisk(state_dummy, FlushStateMode::PERIODIC);
30+
m_node.validation_signals->SyncWithValidationInterfaceQueue();
31+
BOOST_CHECK(!sub->m_did_flush);
32+
33+
// The periodic flush interval is between 50 and 70 minutes (inclusive)
34+
SetMockTime(GetTime<std::chrono::minutes>() + 49min);
35+
chainstate.FlushStateToDisk(state_dummy, FlushStateMode::PERIODIC);
36+
m_node.validation_signals->SyncWithValidationInterfaceQueue();
37+
BOOST_CHECK(!sub->m_did_flush);
38+
39+
SetMockTime(GetTime<std::chrono::minutes>() + 70min);
40+
chainstate.FlushStateToDisk(state_dummy, FlushStateMode::PERIODIC);
41+
m_node.validation_signals->SyncWithValidationInterfaceQueue();
42+
BOOST_CHECK(sub->m_did_flush);
43+
}
44+
45+
BOOST_AUTO_TEST_SUITE_END()

src/validation.cpp

Lines changed: 44 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -90,10 +90,12 @@ using node::SnapshotMetadata;
9090

9191
/** Size threshold for warning about slow UTXO set flush to disk. */
9292
static constexpr size_t WARN_FLUSH_COINS_SIZE = 1 << 30; // 1 GiB
93-
/** Time to wait between writing blocks/block index to disk. */
94-
static constexpr std::chrono::hours DATABASE_WRITE_INTERVAL{1};
95-
/** Time to wait between flushing chainstate to disk. */
96-
static constexpr std::chrono::hours DATABASE_FLUSH_INTERVAL{24};
93+
/** Time window to wait between writing blocks/block index and chainstate to disk.
94+
* Randomize writing time inside the window to prevent a situation where the
95+
* network over time settles into a few cohorts of synchronized writers.
96+
*/
97+
static constexpr auto DATABASE_WRITE_INTERVAL_MIN{50min};
98+
static constexpr auto DATABASE_WRITE_INTERVAL_MAX{70min};
9799
/** Maximum age of our tip for us to be considered current for fee estimation */
98100
static constexpr std::chrono::hours MAX_FEE_ESTIMATION_TIP_AGE{3};
99101
const std::vector<std::string> CHECKLEVEL_DOC {
@@ -2807,7 +2809,6 @@ bool Chainstate::FlushStateToDisk(
28072809
try {
28082810
{
28092811
bool fFlushForPrune = false;
2810-
bool fDoFullFlush = false;
28112812

28122813
CoinsCacheSizeState cache_state = GetCoinsCacheSizeState();
28132814
LOCK(m_blockman.cs_LastBlockFile);
@@ -2852,26 +2853,17 @@ bool Chainstate::FlushStateToDisk(
28522853
}
28532854
}
28542855
}
2855-
const auto nNow{SteadyClock::now()};
2856-
// Avoid writing/flushing immediately after startup.
2857-
if (m_last_write == decltype(m_last_write){}) {
2858-
m_last_write = nNow;
2859-
}
2860-
if (m_last_flush == decltype(m_last_flush){}) {
2861-
m_last_flush = nNow;
2862-
}
2856+
const auto nNow{NodeClock::now()};
28632857
// The cache is large and we're within 10% and 10 MiB of the limit, but we have time now (not in the middle of a block processing).
28642858
bool fCacheLarge = mode == FlushStateMode::PERIODIC && cache_state >= CoinsCacheSizeState::LARGE;
28652859
// The cache is over the limit, we have to write now.
28662860
bool fCacheCritical = mode == FlushStateMode::IF_NEEDED && cache_state >= CoinsCacheSizeState::CRITICAL;
2867-
// It's been a while since we wrote the block index to disk. Do this frequently, so we don't need to redownload after a crash.
2868-
bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow > m_last_write + DATABASE_WRITE_INTERVAL;
2869-
// It's been very long since we flushed the cache. Do this infrequently, to optimize cache usage.
2870-
bool fPeriodicFlush = mode == FlushStateMode::PERIODIC && nNow > m_last_flush + DATABASE_FLUSH_INTERVAL;
2871-
// Combine all conditions that result in a full cache flush.
2872-
fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicFlush || fFlushForPrune;
2873-
// Write blocks and block index to disk.
2874-
if (fDoFullFlush || fPeriodicWrite) {
2861+
// It's been a while since we wrote the block index and chain state to disk. Do this frequently, so we don't need to redownload or reindex after a crash.
2862+
bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow >= m_next_write;
2863+
// Combine all conditions that result in a write to disk.
2864+
bool should_write = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicWrite || fFlushForPrune;
2865+
// Write blocks, block index and best chain related state to disk.
2866+
if (should_write) {
28752867
// Ensure we can write block index
28762868
if (!CheckDiskSpace(m_blockman.m_opts.blocks_dir)) {
28772869
return FatalError(m_chainman.GetNotifications(), state, _("Disk space is too low!"));
@@ -2901,35 +2893,38 @@ bool Chainstate::FlushStateToDisk(
29012893

29022894
m_blockman.UnlinkPrunedFiles(setFilesToPrune);
29032895
}
2904-
m_last_write = nNow;
2905-
}
2906-
// Flush best chain related state. This can only be done if the blocks / block index write was also done.
2907-
if (fDoFullFlush && !CoinsTip().GetBestBlock().IsNull()) {
2908-
if (coins_mem_usage >= WARN_FLUSH_COINS_SIZE) LogWarning("Flushing large (%d GiB) UTXO set to disk, it may take several minutes", coins_mem_usage >> 30);
2909-
LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d coins, %.2fKiB)",
2910-
coins_count, coins_mem_usage >> 10), BCLog::BENCH);
2911-
2912-
// Typical Coin structures on disk are around 48 bytes in size.
2913-
// Pushing a new one to the database can cause it to be written
2914-
// twice (once in the log, and once in the tables). This is already
2915-
// an overestimation, as most will delete an existing entry or
2916-
// overwrite one. Still, use a conservative safety factor of 2.
2917-
if (!CheckDiskSpace(m_chainman.m_options.datadir, 48 * 2 * 2 * CoinsTip().GetCacheSize())) {
2918-
return FatalError(m_chainman.GetNotifications(), state, _("Disk space is too low!"));
2919-
}
2920-
// Flush the chainstate (which may refer to block index entries).
2921-
const auto empty_cache{(mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical};
2922-
if (empty_cache ? !CoinsTip().Flush() : !CoinsTip().Sync()) {
2923-
return FatalError(m_chainman.GetNotifications(), state, _("Failed to write to coin database."));
2896+
2897+
if (!CoinsTip().GetBestBlock().IsNull()) {
2898+
if (coins_mem_usage >= WARN_FLUSH_COINS_SIZE) LogWarning("Flushing large (%d GiB) UTXO set to disk, it may take several minutes", coins_mem_usage >> 30);
2899+
LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d coins, %.2fKiB)",
2900+
coins_count, coins_mem_usage >> 10), BCLog::BENCH);
2901+
2902+
// Typical Coin structures on disk are around 48 bytes in size.
2903+
// Pushing a new one to the database can cause it to be written
2904+
// twice (once in the log, and once in the tables). This is already
2905+
// an overestimation, as most will delete an existing entry or
2906+
// overwrite one. Still, use a conservative safety factor of 2.
2907+
if (!CheckDiskSpace(m_chainman.m_options.datadir, 48 * 2 * 2 * CoinsTip().GetCacheSize())) {
2908+
return FatalError(m_chainman.GetNotifications(), state, _("Disk space is too low!"));
2909+
}
2910+
// Flush the chainstate (which may refer to block index entries).
2911+
const auto empty_cache{(mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical};
2912+
if (empty_cache ? !CoinsTip().Flush() : !CoinsTip().Sync()) {
2913+
return FatalError(m_chainman.GetNotifications(), state, _("Failed to write to coin database."));
2914+
}
2915+
full_flush_completed = true;
2916+
TRACEPOINT(utxocache, flush,
2917+
int64_t{Ticks<std::chrono::microseconds>(NodeClock::now() - nNow)},
2918+
(uint32_t)mode,
2919+
(uint64_t)coins_count,
2920+
(uint64_t)coins_mem_usage,
2921+
(bool)fFlushForPrune);
29242922
}
2925-
m_last_flush = nNow;
2926-
full_flush_completed = true;
2927-
TRACEPOINT(utxocache, flush,
2928-
int64_t{Ticks<std::chrono::microseconds>(SteadyClock::now() - nNow)},
2929-
(uint32_t)mode,
2930-
(uint64_t)coins_count,
2931-
(uint64_t)coins_mem_usage,
2932-
(bool)fFlushForPrune);
2923+
}
2924+
2925+
if (should_write || m_next_write == NodeClock::time_point::max()) {
2926+
constexpr auto range{DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN};
2927+
m_next_write = FastRandomContext().rand_uniform_delay(NodeClock::now() + DATABASE_WRITE_INTERVAL_MIN, range);
29332928
}
29342929
}
29352930
if (full_flush_completed && m_chainman.m_options.signals) {

src/validation.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@
3131
#include <util/fs.h>
3232
#include <util/hasher.h>
3333
#include <util/result.h>
34+
#include <util/time.h>
3435
#include <util/translation.h>
3536
#include <versionbits.h>
3637

@@ -802,8 +803,7 @@ class Chainstate
802803
void UpdateTip(const CBlockIndex* pindexNew)
803804
EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
804805

805-
SteadyClock::time_point m_last_write{};
806-
SteadyClock::time_point m_last_flush{};
806+
NodeClock::time_point m_next_write{NodeClock::time_point::max()};
807807

808808
/**
809809
* In case of an invalid snapshot, rename the coins leveldb directory so

0 commit comments

Comments
 (0)