In case one remotely exploitable leak wasn't enough, the Bitcoin codebase keeps on giving! The price chart, of course, keeps on taking (and quite unyieldingly so) from those who didn't get on board: since that last fix, every shoe, shirt, horse and buggy you didn't sell for BTC has lost three fourths of its value, not even considering normal depreciation, while even those shiny gold dental fillings are down by half.
Let's have a look at the present problem, first announced on IRC, and what might be done to help keep the ship of open networks and decentralization afloat, daunting as this may seem as the environment shovels ever larger piles of dumb money into central and generally obsequious hands.
Nodes(i) establish a fluctuating set of "connection" relationships among each other, forming a gossip network in which each passes an item of interest on to some or all of its friends until it eventually reaches all interested parties. Some message types are small and are relayed directly when a node first hears them, such as announcements of the address of a new potential peer. The more interesting message types, though, are for blocks and transactions, which can reach the full 1 MB block size limit or close to it. Multiply that by a desirably large number of peer connections and you get major spikes of bandwidth use in the best case, or else growing relay latency as your network capacity saturates. Thus, there's an effort to economize by not sending redundant copies of an item to peers that may already have it. This is realized using an inventory notification or "inv" message, broadcasting a fixed-size 32-byte hash identifying the new item and leaving it up to each peer to ask for the full contents with a "getdata" message, as needed and at its convenience.
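To make the wire exchange concrete, here's a minimal sketch in the codebase's own C++. CInv and the "inv"/"getdata" commands follow the historical code; Peer, Announce and OnInv are simplified stand-ins for illustration, not the actual implementation.

```cpp
// Minimal sketch of the inv/getdata scheme; Peer, Announce, OnInv are
// illustrative stand-ins, not the historical code.
#include <array>
#include <cstdint>
#include <set>

using uint256 = std::array<uint8_t, 32>;   // the fixed-size 32-byte identifier

enum InvType { MSG_TX = 1, MSG_BLOCK = 2 };

struct CInv {
    InvType type;
    uint256 hash;
    bool operator<(const CInv& o) const {
        return type != o.type ? type < o.type : hash < o.hash;
    }
};

struct Peer {
    std::set<CInv> setInventoryKnown;       // items this peer is believed to have
    void Send(const char*, const CInv&) { /* wire I/O elided */ }
};

// Announce a new item: send only the small inv, never the full contents.
void Announce(Peer& peer, const CInv& inv) {
    if (peer.setInventoryKnown.insert(inv).second)   // skip peers who know of it
        peer.Send("inv", inv);
}

// On hearing an announcement: request the full data only if we lack the item.
void OnInv(Peer& peer, const CInv& inv, bool fAlreadyHave) {
    peer.setInventoryKnown.insert(inv);
    if (!fAlreadyHave)
        peer.Send("getdata", inv);   // the real code defers this ask; see below
}
```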
The scheme sounds nice in principle, but looking now from the standpoint of the node receiving the inv, it opens up a slew of decisions, considerations, and bookkeeping requirements in order to realize that promise of savings without undue performance loss. Whom and when to ask, when receiving the same inv from many peers? How long to allow before giving up and asking elsewhere? How to handle a flood of messages of unknown value, recalling that an open network is just as open to bad or useless actors as to honest or useful ones, for any chosen definition of honesty or utility?
The prototype codebase as left by Satoshi,(ii) still the best reference we've got, proceeds roughly as follows (a condensed sketch in code follows the list):
- Any peer can send any number of inv messages and we'll accept them all into its mapAskFor queue. Because everybody is friends on the Internet.
- Ten times per second, loop through all peers checking if there's anything in their queues to send. Because most nodes will be honest so it's not like we'll need very many peer connections.
- Ask for the data from whichever peer sent the inv first, regardless of whether it may already be busy with other requests.
- Defer each subsequent follow-up for the same inv by 2 minutes. Thus, a single sluggish peer can delay us by a full 20% of the ten-minute target block interval - which is fine because we live in modern times, so all those honest nodes also have fast connections and never get congested either.
- To determine what's first and what needs deferring, collect all invs in a global mapAlreadyAskedFor, separate from the peerwise mapAskFor queues. Clean them out only after receiving and accepting the block or transaction. Not if it turns out unusable, and not even after disconnecting all peers that announced it. It's no big deal if this chews up all available memory, because the code is so shoddy otherwise that it's going to need frequent manual restarts anyway, forcibly reclaiming resources by just forgetting everything that's not nailed down into the block chain.
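For concreteness, the scheduling logic just described might be condensed as follows. The names mapAskFor and mapAlreadyAskedFor and the two-minute retry are as in the list above; the surrounding structure is a paraphrase rather than the verbatim original.

```cpp
// Condensed paraphrase, not the verbatim original. Times in microseconds.
#include <algorithm>
#include <array>
#include <cstdint>
#include <map>
#include <utility>

struct CInv {                               // type tag plus 32-byte hash, as above
    int type;
    std::array<uint8_t, 32> hash;
    bool operator<(const CInv& o) const {
        return type != o.type ? type < o.type : hash < o.hash;
    }
};

// Global: the earliest time each inv may next be requested. Entries are
// erased only once the item is received AND accepted (never on rejection,
// timeout or disconnect), so this map grows without bound.
std::map<CInv, int64_t> mapAlreadyAskedFor;

struct CNode {
    // Per-peer queue: request time -> inv. A loop running ten times per
    // second scans every peer and sends "getdata" for any entry whose time
    // has come (loop not shown).
    std::multimap<int64_t, CInv> mapAskFor;

    void AskFor(const CInv& inv, int64_t nNowMicros) {
        // The first asker gets the request immediately (0 + 2 minutes is
        // still far in the past); each later asker for the same inv is
        // scheduled 2 minutes after the previous, however slow that peer is.
        int64_t& nRequestTime = mapAlreadyAskedFor[inv];
        nRequestTime = std::max(nRequestTime + int64_t(2) * 60 * 1000000,
                                nNowMicros);
        mapAskFor.insert(std::make_pair(nRequestTime, inv));
    }
};
```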
For better and worse, we have the fortune of living to see the day when none of the above assumptions are particularly fine, if indeed they ever were. It's taken me some days - at the desk, on the couch, in the hammock, in chats, on the whiteboard, and in the trusty notebook - to chart a path forward without getting lost in the woods, and it seems to be finally coming together. The basic principles are 1) nothing may be expended without limit and 2) absent the possibility of weighing the value of things by their content, weigh by their source. In other words, the node maintains its own dynamic Web of Trust, at least a first layer thereof, rating its peers and adjusting with experience.
- For each peer, queue up to a certain allowance of inv requests, both for deferred asks and for those already sent and awaiting response. The allowance is proportional to the peer's rating -- or perhaps better, its share of total rating points currently issued.
- These queues can remain per-peer for now, if convenient for implementation. Eventually it seems preferable to move toward global queues, to most efficiently find what's ready to send at a given time regardless of peer count.
- Consider a peer "busy" if either its "asked" queue or its outbound sending buffers are full (a write on the socket would block, or the like). Put an incoming inv in "deferred" if either the peer is busy or the inv has already been asked of another peer. Otherwise, go ahead and send the ask. This may be decided by full linear scan if necessary; clear and correct memory management beats algorithmic efficiency for the first pass.
- If the "deferred" queue is full, pick one of the same peer's deferred requests to evict without penalty. The best choice for eviction would probably be something like the inv heard from the greatest number of peers, rating adjusted, since that would reflect the least unique of this peer's contributions and the most likely to still be retrievable elsewhere.
- Items exit the "asked" queue only on receiving a response, on peer disconnect, or on a set timeout in seconds (perhaps configurable based on bandwidth).
- Successful responses add to a peer's rating while timeouts subtract. There's some discretion involved for valid but unusable responses, depending on the cause of rejection: perhaps a transaction simply falls below our minimum fee threshold, or is no longer valid after the latest block update. Rather than simple addition of rating points, an exponential moving average seems advantageous, as it establishes a natural floor and ceiling, converging toward the ratio of successes to failures, while remaining simple to compute (see the sketch following this list).
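To gather the moving parts in one place, here's one possible shape for the per-peer state, the admission decision, and the rating update. All names, queue bounds and the smoothing factor ALPHA are illustrative assumptions, not settled design.

```cpp
// Illustrative sketch only: the names, queue bounds and smoothing factor
// ALPHA are assumptions, not settled design.
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>

struct CInv {                               // as in the earlier sketches
    int type;
    std::array<uint8_t, 32> hash;
    bool operator<(const CInv& o) const {
        return type != o.type ? type < o.type : hash < o.hash;
    }
};

struct PeerState {
    std::deque<CInv>        deferred;       // invs awaiting a free "asked" slot
    std::map<CInv, int64_t> asked;          // in-flight invs -> timeout deadline
    double                  rating = 0.5;   // initial rating: assumed value

    // Queue allowance proportional to this peer's share of all rating points.
    size_t Allowance(double totalRating, size_t totalSlots) const {
        return static_cast<size_t>(
            totalSlots * rating / std::max(totalRating, 1e-9));
    }
};

// Admission decision for an incoming inv: ask at once only if this peer has
// a free slot and no other peer is already being asked for the same item.
enum Action { ASK_NOW, DEFER, EVICT_THEN_DEFER };
Action OnInv(const PeerState& p, bool askedElsewhere,
             size_t maxAsked, size_t maxDeferred) {
    bool busy = p.asked.size() >= maxAsked;     // or: socket write would block
    if (!busy && !askedElsewhere) return ASK_NOW;
    if (p.deferred.size() < maxDeferred) return DEFER;
    return EVICT_THEN_DEFER;   // evict the least unique deferred inv, no penalty
}

// Exponential moving average over outcomes (success = 1, failure = 0): the
// rating stays within (0,1) and converges toward the recent success ratio.
void UpdateRating(PeerState& p, bool success) {
    const double ALPHA = 0.1;                   // smoothing factor: a guess
    p.rating = ALPHA * (success ? 1.0 : 0.0) + (1.0 - ALPHA) * p.rating;
}
```

With ALPHA at 0.1, for instance, a peer that times out on every request decays from 0.5 to below a 0.1 disconnection threshold in about 16 events, while a consistently responsive one climbs toward 1; the floor and ceiling fall out of the bounded average, with no explicit clamping needed.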
One point of this setup is that it avoids a possible status-quo-entrenching positive feedback loop, where a low rating reduces the odds of a successful and timely response, which in turn further hurts the rating. Basically, the rating is viewed as credit granting access to a certain share of buffer resources; having more is advantageous for increasing throughput, but having less doesn't prevent communication altogether, much like the TCP sending window. Parameters available for tuning include the initial rating of new peers (unknown or previously known), the low-rating threshold for disconnection, maximum queue slots (per peer or total), the request timeout, and the deferral interval if different.
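Collected as a configuration record, those knobs might look like the following; the RelayTuning name and every default shown are placeholders for illustration, not recommendations.

```cpp
// The tunable parameters named above, gathered in one record. Every default
// shown is a placeholder for illustration, not a recommendation.
#include <cstddef>
#include <cstdint>

struct RelayTuning {
    double  initialRatingUnknown = 0.5;   // brand-new, never-seen peer
    double  initialRatingKnown   = 0.7;   // previously well-behaved peer
    double  disconnectThreshold  = 0.1;   // drop peers rated below this
    size_t  maxAskedPerPeer      = 50;    // in-flight request slots
    size_t  maxDeferredPerPeer   = 200;   // deferred request slots
    int64_t requestTimeoutSecs   = 60;    // give up and penalize after this
    int64_t deferralIntervalSecs = 60;    // if different from the timeout
};
```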
It's also noted that it may still be desirable to disconnect even good peers from time to time, to make room for mixing things up, better exploring the space, and not becoming too predictable.
- For those confused on the matter: just as you do not own any Bitcoin (the currency) for which you do not hold the private key, so too you have no existence in Bitcoin (the transfer network) if you do not hold a computer running a node. You rely on the good will and behavior of others to allow you to send transactions, confirm receipt, and perhaps dodgiest of all, to uphold the advertised rules of the protocol such as supply and block size limits, digital signatures, and proof of work. In short, you abdicate responsibility, which inevitably carries a cost. Unfortunately it's the kind of cost that seems difficult for many to evaluate or even connect to the original cause, delayed feedback and all, but the piper always comes to collect in the end. [^]
- More or less. To expand on that finding, The Real Bitcoin project didn't go so far as to exclude all post-Satoshi work for its starting point, which presumably had its costs and benefits. On the costs side there's this change, where Matt Corallo claims to implement an original intent while speaking falsehood about the current code. Thus, however quick or dirty the original may have been, it likely wasn't quite as confused as what we have to deal with now. [^]