About a month ago, while I was visiting Panama,(i) a friend whom I'd previously helped get set up with a Bitcoin node dusted it off to get synced up with the network again, only for the machine to crash with a severe if seemingly transient kernel error. Some time after restart, the sync process ran aground, displaying the uncomfortably familiar "safe mode" warning and failing to accept any further blocks.
The plot thickened when a full run of bitcoind -checkblocks=0 returned successfully. Based on recent work I was under the impression that this was a fairly thorough if not quite exhaustive check of the node's data integrity, so I became suspicious of the default lazy hypothesis that flaky hardware had caused some kind of corruption.
We worked the problem a bit on IRC,(ii) and this furnished me with enough detail to go for a dig in the codebase with at least some idea of what I was looking for,(iii) whether actual defects or at least explanations for the observed phenomena. Nonetheless it was difficult going; the problem seemed to be the large number of related functions involved, some of them fairly lengthy (though I've certainly seen worse), with blurry boundaries between their roles, seemingly little consideration given to what goes where and what to call things,(iv) and little commentary to help the reader see whatever structure or model might have been there in the author's head. Basically, I couldn't keep track of enough of the whole at once to figure out the meaning of any given part; I needed a map.
What I drew up amounted to a partial call graph with some extra hints and notes, covering the functions of most interest in view of my goal: to find the circumstances in which the InvalidChainFound routine can be invoked. Tracing outward from that starting point, what came up were the high-level functions dealing with how incoming blocks are examined, accepted or rejected, catalogued, and moved in or out of the active chain as the continuous mining competition proceeds: arguably the very heart of what a Bitcoin node does, at least from the technical perspective. I started with noting things in a plain-text file as I read; once reasonably satisfied with the scope and results and inclined to publish, I promoted it to HTML with all the links, then finally added a diagram using the "dot" program from Graphviz.
As for the tradeoffs of potential automation, I found the first step of gathering the data wasn't that bad to do manually, since it came naturally through the study of the code and produced a nicely compact yet helpful set. An automated tool, even if sophisticated enough to follow C++ semantics rather than blindly pattern-matching, would struggle with the finesse-based decisions of what's worth including; although a blunt yet complete map would have its uses as well.(v) The markup step though was tedious; next time around I'd probably start out in some lightweight yet machine-readable format so as to mechanize that.
Without further ado, here's the full map. Call lists are selective, while I've taken care that the back references ("called by") are exhaustive, covering the whole codebase; I figure this is helpful because unlike the forward references they're not locally apparent when looking at the code.
void InvalidChainFound (main.cpp)
- Calls
- CTxDB::WriteBestInvalidWork
- Called by
- CBlock::SetBestChain
- Reads globals
- bnBestInvalidWork
- Sets globals
- bnBestInvalidWork
void SetBestChain (main.cpp) [boring]
- Calls
- Called by
void CWallet::SetBestChain (wallet.h) [boring]
- Calls
- CWalletDB::WriteBestBlock
- Called by
bool CBlock::SetBestChain (main.cpp)
- Calls
- CTxDB::TxnBegin
- on genesis block - CTxDB::WriteHashBestChain, CTxDB::TxnCommit
- on extending current branch - CBlock::ConnectBlock, CTxDB::TxnAbort, InvalidChainFound, CTxDB::TxnCommit, CTransaction::RemoveFromMemoryPool
- else - Reorganize, CTxDB::TxnAbort, InvalidChainFound
- IsInitialBlockDownload
- SetBestChain
- GetTime
- Called by
- CBlock::AddToBlockIndex
- CheckDiskBlocks
- Reads globals
- pindexGenesisBlock
- Sets globals
- pindexGenesisBlock
- hashBestChain
- pindexBest
- nBestHeight
- bnBestChainWork
- nTimeBestReceived
- nTransactionsUpdated
bool CBlock::AddToBlockIndex (main.cpp)
- Calls
- mapBlockIndex.count
- mapBlockIndex.insert
- mapBlockIndex.find
- CTxDB::TxnBegin
- CTxDB::WriteBlockIndex
- CTxDB::TxnCommit
- if chain work is greater than previous best - CBlock::SetBestChain
- Called by
- CBlock::AcceptBlock
- LoadBlockIndex
- Reads globals
- bnBestChainWork
- pindexBest
- Notes
- Woah, is it rewriting all the block index db records as it loads them up on startup??(vi)
bool CBlock::AcceptBlock (main.cpp)
- Calls
- mapBlockIndex.count
- mapBlockIndex.find
- GetNextWorkRequired
- CBlock::GetBlockTime
- CBlockIndex::GetMedianTimePast
- CTransaction::IsFinal
- Checkpoints::CheckBlock
- CheckDiskSpace
- CBlock::WriteToDisk
- CBlock::AddToBlockIndex
- if it's the best chain - CNode::PushInventory
- Called by
- ProcessBlock
- Reads globals
- hashPrevBlock
- hashBestChain
- vNodes
- nBestHeight
bool ProcessBlock (main.cpp)
- Calls
- mapBlockIndex.count
- CBlock::CheckBlock
- Checkpoints::GetLastCheckpoint
- CBlock::GetBlockTime
- CNode::Misbehaving
- CNode::PushGetBlocks
- CBlock::AcceptBlock
- Called by
- ProcessMessage (net)
- CheckWork (miner)
- getmemorypool (rpc)
- eatblock (rpc)
- Reads globals
- hashBestChain
bool CBlock::CheckBlock (main.cpp) ["checks that are independent of context that can be verified before saving an orphan block"]
- Calls
- CheckProofOfWork
- CBlock::GetBlockTime
- GetAdjustedTime
- CTransaction::IsCoinBase
- CTransaction::CheckTransaction
- CBlock::GetSigOpCount
- CBlock::BuildMerkleTree
- Called by
- ProcessBlock
- CBlock::ConnectBlock
- CheckDiskBlocks
bool CheckProofOfWork (main.cpp)
- Called by
- CBlock::CheckBlock
- CBlock::ReadFromDisk
- CBlockIndex::CheckIndex
- Reads globals
- bnProofOfWorkLimit
bool CBlock::ConnectBlock (main.cpp)
- Calls
- CBlock::CheckBlock
- CTxDB::ReadTxIndex
- CTransaction::ConnectInputs
- CTxDB::UpdateTxIndex
- CTransaction::GetValueOut
- GetBlockValue
- CTxDB::WriteBlockIndex
- SyncWithWallets
- Called by
- CBlock::SetBestChain
- Reorganize
bool Reorganize (main.cpp)
- Calls
- CBlock::ReadFromDisk
- CBlock::DisconnectBlock
- CTransaction::IsCoinBase
- CBlock::ConnectBlock
- CTxDB::TxnAbort
- CTxDB::WriteHashBestChain
- CTxDB::TxnCommit
- CTransaction::AcceptToMemoryPool
- CTransaction::RemoveFromMemoryPool
- Called by
- CBlock::SetBestChain
- Reads globals
- pindexBest
- Notes
- Assumes height of pindexNew is greater or equal to that of pindexBest, in LCA search(vii) for pfork (couldn't height be shorter if total work is higher?)
- Looking closer, it works properly even if "plonger" is in fact shorter, because the heights are compared at the start of each iteration. Ironically, what that comparison actually ensures is that plonger isn't longer at the point of stepping back pfork (see the sketch just after the map). So it seems to be just a confusingly named variable and, IMO, an awkward implementation, not actually a bad assumption.
- vResurrect and vDelete could guzzle memory in a deep reorg(viii)
- vResurrect could be pre-filtered by the normal mempool acceptance rules, or given limits since it's basically a courtesy for reverted txns
- vDelete could be pre-filtered by what's actually in the mempool
- Why is txdb.TxnAbort() called only in case of ConnectBlock failure but not the other errors?
- The caller takes care of it in any case. The possible existence of a transaction stack (CDB::vTxn) obscures things, but this functionality appears to be unused (i.e. TxnBegin isn't called repeatedly without an intervening TxnCommit or TxnAbort).
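To make the fork-search note above concrete, here's roughly the shape of that loop; a paraphrase from my reading of Reorganize, not a verbatim quote:

    // Sketch of the fork (lowest common ancestor) search in Reorganize,
    // paraphrased; variable names follow the original.
    CBlockIndex* pfork = pindexBest;    // tip of the currently active chain
    CBlockIndex* plonger = pindexNew;   // tip of the challenger branch
    while (pfork != plonger)
    {
        // Walk the "longer" side back until it's no higher than pfork; this is
        // why it still works even when plonger is in fact the shorter one.
        while (plonger->nHeight > pfork->nHeight)
            if (!(plonger = plonger->pprev))
                return error("Reorganize() : plonger->pprev is null");
        if (pfork == plonger)
            break;
        // Only then is pfork stepped back one block, and the cycle repeats.
        if (!(pfork = pfork->pprev))
            return error("Reorganize() : pfork->pprev is null");
    }
    // On exit, pfork points to the last block common to both branches.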
As an illustration of how the map is used, recall the original question of how InvalidChainFound is called. From the map we can readily see what the call stack must be; then from a look at the functions involved we pick up further conditions that must hold for those calls to be made, such as values returned by intervening completed calls, shown here in parentheses:
ProcessBlock
(CBlock::CheckBlock => true)
CBlock::AcceptBlock
(GetNextWorkRequired => this->nBits)
(CBlock::WriteToDisk => true)
CBlock::AddToBlockIndex
(mapBlockIndex.insert)
(CTxDB::WriteBlockIndex)
CBlock::SetBestChain
(CBlock::ConnectBlock or CTxDB::WriteHashBestChain or Reorganize => false)
InvalidChainFound
Not bad... but on this quest, each question answered only led to more questions, at least for a few rounds before I was satisfied. What floored me when looking at that trace was that an incoming block that passed the cursory checks in CBlock::CheckBlock would get duly written to disk, as well as to the in-memory block index and its backing database, before ever getting to CBlock::ConnectBlock, the part that does the heavy lifting of checking transaction references and signatures!
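In code terms, the tail of CBlock::AcceptBlock runs roughly as follows (a paraphrase, not the verbatim code):

    // Paraphrased tail of CBlock::AcceptBlock: the block hits the disk and the
    // index before any transaction-level validation, which happens only later,
    // in ConnectBlock, if and when the block contends for the best chain.
    if (!CheckDiskSpace(::GetSerializeSize(*this, SER_DISK)))
        return error("AcceptBlock() : out of disk space");
    unsigned int nFile = -1;
    unsigned int nBlockPos = 0;
    if (!WriteToDisk(nFile, nBlockPos))
        return error("AcceptBlock() : WriteToDisk failed");
    if (!AddToBlockIndex(nFile, nBlockPos))
        return error("AcceptBlock() : AddToBlockIndex failed");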
In other words: the database can perfectly well contain blocks that aren't valid, never were valid, and never could be valid. One would imagine that these can't show up on the active chain, but at the very least my orphans and losers classification was incomplete, and the "dumpblock" flaw worse than previously realized: there's a third category of, shall we say, hopeful blocks, with validity unknown as they've never been on the node's active chain. I don't expect there's any way to distinguish these from the once-active loser blocks just from what's in the database. At least my worst-case analysis for possible impact on gbw-node users was sufficiently pessimistic as to arguably stand intact, in that it doesn't particularly matter whether a false-positive payment confirmation is due to a transaction having been reversed or never having been confirmed at all.
But clearly I had to get to the bottom of this one, and it was at this point that I added the more detailed look at Reorganize to the map, aiming to understand how it works and especially what happens to those invalid blocks once written. The answer was roughly as I'd suspected: they're just left around; each time a descendant of an invalid block arrives with enough cumulative work to challenge the main chain, a reorg is attempted; it disconnects active blocks back to the fork, then fails on the step of connecting that first invalid block, the database transaction is aborted and things carry on as they were.
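The connect phase of Reorganize, again paraphrased rather than quoted, shows where this plays out:

    // Once the active blocks back to the fork have been disconnected, the
    // challenger branch is connected block by block, all within one database
    // transaction.
    for (unsigned int i = 0; i < vConnect.size(); i++)
    {
        CBlockIndex* pindex = vConnect[i];
        CBlock block;
        if (!block.ReadFromDisk(pindex))
            return error("Reorganize() : ReadFromDisk for connect failed");
        if (!block.ConnectBlock(txdb, pindex))
        {
            // Invalid block: abort the database transaction, so the previously
            // active chain stands exactly as before...
            txdb.TxnAbort();
            return error("Reorganize() : ConnectBlock failed");
        }
    }
    // ...while the caller (CBlock::SetBestChain) reports the failure through
    // InvalidChainFound, and the invalid block stays on disk and in the index.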
I've even begun to see why it was implemented this way. When a block comes in that doesn't build on the best chain, you can't reject it simply because it doesn't have enough work to become the new best, because there may be more blocks coming that will cause the fork to overtake the original.(ix) At the same time, it's not clear how you'd go about validating its transactions, because you'd have to check the spent status of the previous outputs as of the fork point (common ancestor), whereas the database only tracks this as of the tip. You could perhaps expand the DB, or do a "trial reorg", disconnecting active blocks so as to reconstruct that state then aborting the DB transaction; but besides adding complexity, this could open a severe denial-of-service vector, since fork blocks with valid proof-of-work are easily constructed against the early chain and each one would then trigger a write-intensive rewind of the whole thing.(x)
Returning at last to my original confusion of how "checkblocks" could miss a case of corrupted block data despite all the layers of hashes on top of hashes, it's quite simply because it checks only the active chain, none of those "losers" or "hopefuls".(xi)
Still I'm looking for positive proof that it's really hardware-level data corruption to blame for this stuck node, and of what sort. One thought was to add an RPC command to fetch raw block data based on hash rather than height, so as to view the problematic block off the active chain. I'm uneasy about doing this, given the warnings that would perhaps need to be attached about how it can return invalid blocks even when correctly functioning; further, it wouldn't do anything to help un-wedge my friend's node despite its ailment being at last so precisely identified. So I thought instead to find the block directly within the data files, as bitcoind itself does, by using the coordinates stored in the block index (CBlockIndex objects in memory). The only problem then is the opacity of this structure to the user, but this is easily remedied by adding a getblockindex command.
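To give the idea, a handler along these lines is what I have in mind; a rough sketch with illustrative field names, not the exact code that went into the patch (it would live in bitcoinrpc.cpp and need registering in the RPC call table):

    // Hypothetical sketch of a getblockindex command: look up the in-memory
    // CBlockIndex entry by block hash and report its contents.
    Value getblockindex(const Array& params, bool fHelp)
    {
        if (fHelp || params.size() != 1)
            throw runtime_error(
                "getblockindex <hash>\n"
                "Returns the block index entry (disk coordinates and chain linkage) for the given block hash.");

        uint256 hash(params[0].get_str());
        map<uint256, CBlockIndex*>::iterator mi = mapBlockIndex.find(hash);
        if (mi == mapBlockIndex.end())
            throw JSONRPCError(-5, "Block hash not found in index");

        CBlockIndex* pindex = (*mi).second;
        Object result;
        result.push_back(Pair("hash", pindex->GetBlockHash().GetHex()));
        result.push_back(Pair("height", pindex->nHeight));
        result.push_back(Pair("nfile", (int)pindex->nFile));          // which block data file
        result.push_back(Pair("nblockpos", (int)pindex->nBlockPos));  // byte offset within it
        result.push_back(Pair("chainwork", pindex->bnChainWork.ToString()));
        if (pindex->pprev)
            result.push_back(Pair("previousblockhash", pindex->pprev->GetBlockHash().GetHex()));
        if (pindex->pnext)
            result.push_back(Pair("nextblockhash", pindex->pnext->GetBlockHash().GetHex()));
        return result;
    }

Invoked as "bitcoind getblockindex <hash>", such a command would report among other things the nFile and nBlockPos coordinates needed to locate the raw block within the block data files, whether or not it's on the active chain.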
JSON encoding issues made this a bit trickier than I'd anticipated; still it's straightforward enough, and the field naming and formatting follows existing precedent. To my mind this command is another one of those things that should have been there from the start; so barring any valid objections, I will consider it official.
I also snuck in a couple minor tweaks for annoyances that came to light through this experience, which you can see for yourself in the patch.
Patch | Seals | Tree |
---|---|---|
bitcoin_getblockindex_etc.vpatch | jfw | Browse |
(i) To present at the conference we hosted, among other things.
(ii) Which itself brought in a mini-project to implement in-house archival of pasted material (and any other URLs for that matter), and troubleshooting of various other technical difficulties.
(iii) An earlier attempt of mine to read the Bitcoin code in full ran out of steam fairly quickly, owing in part to the maddening volume of pointless yet elaborate cruft still present.
(iv) Or what exactly would you suppose are the differences between "processing", "accepting", "adding", "connecting" and "checking" a block? Then add to this a side of context-dependent identifiers resulting from the heavy use of C++ features (overloads, classes and other namespaces, templates) in addition to the bad old-fashioned C macros.
(v) Back in 2015 MP offered a decent bounty for this sort of thing based on Stan's request; the result technically squeaked by on what had been spelled out for the job, yet failed to be of any use.
(vi) Area for future work number one.
(vii) Lowest Common Ancestor, with which I was familiar from implementing dynamic extents in gscm.
(viii) Area for future work number two.
(ix) Recall that this notion of "fork" versus "original" is only relative to a particular node's view of the sequence of events, as the proof of work along with the signature is the ultimate authority and final arbiter of truth in Bitcoin.
(x) Come to think of it, I wonder if this already is a DoS of a lesser sort, in that these can be thrown at your database and chew up unbounded storage space... this topic has possibly been beaten to death already under the banner of "checkpoints", but they remain at least conceptually a non-solution, going as they do against the authority of the PoW.
(xi) Area for future work number three.
Hello jfw!
The modified printf() line in the patch is missing a newline code.
Apropos, if you weren't already familiar, you might also care to take a peek at my blog, for related writing.
Comment by cgra — 2022-04-12 @ 12:48
Hi cgra,
Indeed, and I was just noticing this in the extended testing.
Comment by Jacob Welsh — 2022-04-12 @ 17:52
[...] a closer look at the relevant code. This wasn't the first time either, but whether due to my recent mapping of the nearby swamps or simply due to the repeat effort, I made much more headway on this pass. In brief, it's a wonder [...]
Pingback by The passive threads that would like to be given some blocks « Fixpoint — 2022-05-05 @ 18:29
Adding the links to where this was confirmed as a single-bit error in the block data on disk, but merely flipping it back didn't work to recover the node.
Comment by Jacob Welsh — 2024-05-08 @ 18:01