Smartening up gbw-node

Filed under: Bitcoin, Data, Software — Jacob Welsh @ 01:37


It started out with a desire to explore adding new functionality to gbw-node. This did not happen, because a number of old debts stood in the way. Or if they weren't all in the way in the strictest sense, still, seeing them anew caused refreshed bouts of pain which overpowered the original desire. The stick motivates more than the carrot, so they say.

A flurry of coding and testing ensued, and by the time it was done I had a stack of five patches, bringing: multiple usability improvements, improved portability to older Python interpreters, a minor reduction in scan time per byte, a dramatic reduction in scan time overhead per block, and finally a path toward full removal of the sorest point, the notorious dumpblock horror.

Said flurry of coding took a day - a Sunday, in fact, perhaps the only slot it could find that was sufficiently well defended against more routine things - while the refining, structuring and documenting of what all happened took the better part of a week, amid assorted eagerly welcomed distractions. So it goes. Or at least, so it goes for me, from where I'm at.

Act I.

The change I had wanted to explore concerned the SQL database structure and usage. Currently, the gbw-node user must specify or "watch" a list of Bitcoin addresses of possible interest; when scanning a block, any included transactions affecting a watched address are recorded and duly indexed in the database, allowing later query of both (1) current balance and (2) full history of inflows and outflows on a given address. The need to list addresses in advance was not a desired aspect, but rather a concession to the practical costs of what would amount to keeping the books for all Bitcoin users for the entire history of the network, as many block explorer services attempt to do.

The idea, then, is to strike a better balance by tracking #1 (current state) for all addresses while keeping #2 (full history) for watched ones only. That is, it's to maintain a full index of unspent transaction outputs, or where the "coins" are right now.

On one hand, this would reduce dependence on third-party block explorers, increasing the value of operating a node by allowing the user to get more interesting data directly from his own computer, and without signalling to the outside world what data he's interested in. At the same time, it would reduce the security requirements of that computer, since it would no longer need to record any personal information whatsoever (i.e. address lists).

On the other hand, for the unwashed, lazy, or intellectually stunted but nonetheless still (for now) monied masses, this would enable us to host it as a trusted alternative for such a service, able to answer requests on demand for unspent output data in exactly the format consumed by gbw-signer.

But when I dusted off my gbw-node coding workspace, I found a change made somewhere around last year, unheralded and not even committed to a patch, which fixed a blatant mistake in a specific error reporting case. So I figured I'd better clean house properly before gleefully waltzing off into schema changes.

Act II.

def cmd_tags(argv):
	if len(argv) > 0:
		addr_id = get_address_id(parse_address(argv.pop(0)))
		if addr_id is None:
			die('address not found: %r' % name)

Did you spot it ?

I'll wait, just a little...

So this function, which implements the "tags" subcommand, takes a single optional argument giving the address for which to look up known tags; and if the address is not found in the database at all, this condition (possibly an error) is to be reported cleanly. But, the address argument is removed (popped) from the list and passed immediately to another function to resolve its database identifier; when this fails, the actual argument is no longer available, neither in a "name" variable where the code would clearly like it to be, nor anywhere else for that matter. (In a statically compiled language, even C, such an undefined reference would have been caught straightaway at compile or at worst link time.)

The fix would be simple enough, of course. But there would still be a more subtle problem, which I saw as related, affecting all the command implementations: excess arguments are silently ignored. This manifested as a noted usability issue. By properly informing users of their mistakes, we could help them learn faster.

Error reporting

I concluded that the two issues - inaccessible argument values and ignored excess arguments - shared a common root cause. Basically, each command implementation is left to do its own ad-hoc argument list processing with minimal support from the common infrastructure. I chose to bolster this support while keeping things simple, by exploiting Python's inherent dynamic argument unpacking capability, rather than importing or reinventing some fancier getopt/argparse style routines. Thus, each command declares its named parameter list just like any other Python function; the command-line arguments are automatically applied to those function parameters, and any excess raises a TypeError with informative traceback.

For comparison, here's the new form of the cmd_tags prologue:

def cmd_tags(addr=None):
	if addr:
		addr_id = get_address_id(parse_address(addr))
		if addr_id is None:
			die('address not found: %r' % addr)
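The dispatch side of this can be sketched roughly as follows. This is a minimal illustration and not the literal patch code: the COMMANDS table and dispatch function are hypothetical names, and the cmd_tags body is a stand-in for the real database lookup.

```python
def cmd_tags(addr=None):
    # Stand-in body: the real command looks up tags in the database.
    if addr:
        return 'tags for %s' % addr
    return 'all tags'

# Hypothetical command table; the real program may organize this differently.
COMMANDS = {'tags': cmd_tags}

def dispatch(argv):
    """Run the command named by argv[0], passing the rest as positional args."""
    name, args = argv[0], argv[1:]
    # Python itself checks the argument count: too many (or too few)
    # arguments raise TypeError without any per-command parsing code.
    return COMMANDS[name](*args)
```

With this shape, dispatch(['tags', 'someaddr', 'extra']) raises a TypeError naming cmd_tags and the argument counts, instead of silently dropping the extra argument.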

While on the subject of error handling, I also took the occasion to include more context in address decoding error reports, and clarify the help text for watch and push commands.


Error tolerance

Through experience with using the program and demonstrating it to others, I had realized I didn't like the bail-on-first-error behavior of "watch" and "push", the commands that read lists from standard input. If I mistype or mispaste something, I want to see the error noted promptly, indeed, but without getting kicked out from data entry mode altogether. The prior behavior was for simplicity's sake on the first pass, as aborting the program certainly ensures that an error won't pass silently, but a more refined approach seemed to be in order.

Implementing this involved more fully embracing exceptions in place of the instadeath function seen above. This was somewhat tricky because it's preferable to distinguish user errors from program errors, that is, the expected from the unexpected exceptions. At the extreme, there's the idea that the user should never see uncaught exceptions, with their accompanying tracebacks referring to code internals, with such occurrence always constituting a program bug. I don't go for that style here; rather I err on the side of letting the tracebacks show, as they provide maximum context no matter what the cause of the error. There just need to be clear enough messages for the expected cases, whether through the exception object or directly logged, so that users can figure out their own errors without having to dig up the code.
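A minimal sketch of the tolerant input loop, assuming a dedicated exception class for the expected user errors. UserError, parse_line and read_entries are hypothetical names for illustration, and the validation rule is a toy one.

```python
import sys

class UserError(Exception):
    """Hypothetical marker for expected, user-caused errors."""

def parse_line(line):
    # Stand-in validator: accept any non-empty token without spaces.
    token = line.strip()
    if not token or ' ' in token:
        raise UserError('bad input line: %r' % line)
    return token

def read_entries(lines, log=sys.stderr):
    """Collect valid entries, reporting bad lines without aborting."""
    entries = []
    for lineno, line in enumerate(lines, 1):
        try:
            entries.append(parse_line(line))
        except UserError as e:
            # Expected error: note it promptly and keep reading.
            log.write('line %d: %s\n' % (lineno, e))
        # Any other exception propagates with its full traceback.
    return entries
```

Only the expected class of error is caught, so a genuine program bug still surfaces with its traceback rather than being swallowed by the loop.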


Act III.

Automatic database initialization

Next on the usability front, I realized I couldn't really justify why the install process involved extra manual steps to populate the database tables from the packaged schema script - after all, the code has access to that same script and can perfectly well access the database. There was perhaps some user-education value in it, in that it forced one to have the user-facing sqlite3 shell installed (rather than just the library required by the Python module) and see an example of how to use it and where the database and schema files are located. But I don't much believe in forced learning; for he who wants, the information is still readily at hand, while for he who doesn't (or who had quite enough learning already just getting it all installed), merely copying a magic command is hardly a meaningful activity. So I added automatic database initialization on startup, with due attention to acceptable behavior in the cases of interrupted or concurrent execution.

By some thought process I can't quite reconstruct, I came to think the only airtight way to do this was from outside the database, by creating it as a temporary file then renaming. While it does work, on a fresh look I couldn't escape the feeling that I was awkwardly reinventing transaction functionality already provided by the database. Why couldn't the whole schema script simply be enclosed in a transaction? And wait... it already is! Perhaps I had tested out whether it actually worked as desired, and found something to recoil from. In any case, at what had started out as writing time, I circled back to check, and boy did I find more than I'd bargained for.

$ python
Python 2.7.13 (default, Jul  1 2019, 08:43:28)
[GCC 4.7.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> d=sqlite3.connect('db')
>>> d.execute('begin exclusive').fetchone()

This begins a transaction, with the strictest possible isolation level for good measure: basically nothing else can be accessing the database until we're done.

>>> d.execute('create table foo(bar)').fetchone()
>>> d.rollback()

Explicitly rolling back the transaction should undo the effects of creating the table, which we'll test by trying to create it again:

>>> d.execute('begin exclusive').fetchone()
>>> d.execute('create table foo(bar)').fetchone()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlite3.OperationalError: table foo already exists

The fuck ?!

>>> d.execute('begin exclusive').fetchone()
>>> d.execute('begin exclusive').fetchone()

Truly fucking exclusive transactions, these.

So I reopened the docs both for SQLite itself and Python's sqlite3 interface module, along with some web gossip, becoming even more perplexed by seeming contradictions left and right. So I continued the live testing:

$ rm db
$ python
Python 2.7.13 (default, Jul  1 2019, 08:43:28)
[GCC 4.7.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> d=sqlite3.connect('db',isolation_level=None)

Supposedly setting isolation_level to None is the thing to do if you want autocommit mode. Which is the exact opposite of what I want - but at least it's exact, so worth a try!

>>> d.execute('begin exclusive').fetchone()
>>> d.execute('begin exclusive').fetchone()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlite3.OperationalError: cannot start a transaction within a transaction

Looks like a transaction was actually started this time.

>>> d.execute('create table foo(bar)').fetchone()
>>> d.rollback()
>>> d.execute('begin exclusive').fetchone()
>>> d.execute('create table foo(bar)').fetchone()
>>> d.close()

As it should be - we rolled back the first table creation, so the second succeeds. Then we close, and since the second creation still wasn't committed, it should implicitly roll back too. Now we'll reopen with... not manual commit mode, no, but whatever strange beast this default might be.

>>> d=sqlite3.connect('db')
>>> d.execute('insert into foo values (1)').fetchone()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlite3.OperationalError: no such table: foo

As expected.

>>> d.execute('create table foo(bar)').fetchone()
>>> d.execute('insert into foo values (1)').fetchone()
>>> d.rollback()
>>> d.execute('select bar from foo').fetchone()

The query returned no results. What do you know - the insert *does* roll back, just not the create table! So how does this all look at the sqlite shell?

$ rm db
$ sqlite3 db
SQLite version 3.21.0 2017-10-24 18:55:49
Enter ".help" for usage hints.
sqlite> begin exclusive;
sqlite> create table foo(bar);
sqlite> rollback;
sqlite> begin exclusive;
sqlite> create table foo(bar);

How 'bout that - works as documented once we lose the Python.

sqlite> begin exclusive;
Error: cannot start a transaction within a transaction

And that works too.

Returning for a second pass on the docs I was finally able to make some sense of them. For starters, where sqlite speaks of autocommit mode being disabled, it only means for the duration of an explicit transaction (BEGIN followed by COMMIT or ROLLBACK). Outside of this, it always autocommits and there's no way to change this. In other circles, "autocommit mode" refers to this whole arrangement; after all, it couldn't possibly make sense to autocommit changes once the user has explicitly begun a transaction, so the "autocommit status" is hardly interesting in that sense. This overall mode of operation is then contrasted with a manual commit mode, where data modification statements will implicitly begin a transaction if needed but won't commit until the word is given. This manual mode is what the gbw-node code currently assumes, and it is encouraged by the structure of Python's DB-API, which has .commit() and .rollback() methods but no equivalent .begin().

On the Python side, where they say "if you want autocommit mode, then set isolation_level to None", what they actually mean is "if you want us to not break things by poorly emulating a manual commit mode on top of sqlite's autocommit, set the misnamed isolation_level parameter to None." Isolation level can still be controlled perfectly fine per transaction using an appropriately qualified BEGIN statement. The docs do indicate some of the undesirable aspects of the default behavior, where it diverges from a true manual mode: it implicitly opens transactions only before data modification statements (such as INSERT, but not CREATE which is considered data definition language), and it commits them implicitly in some situations ("before a non-DML, non-query statement"). What they utterly fail to mention, to their eternal shame,(i) is the part where it also just so happens to ENTIRELY BREAK THE BEGIN STATEMENT, leaving transactional DDL entirely impossible.

Talk about doing basic science on your libraries to see how they work. The PYed piper always comes for his pay.

Since my code unwittingly adopted the shoddy default mode, I'm not going to jump to cutting it out right now, but the sanest approach I can see to aim for (at least while sticking with SQLite) is to always use isolation_level=None and explicitly begin transactions where they are desired. Those .commit() and .rollback() methods become suspect too, as they don't raise errors when called without an open transaction, unlike when the underlying SQL commands are passed to .execute() ; this could make it harder to weed out cases of autocommit where an explicit transaction was intended. Alternatively, there'd be fixing Python's sqlite3 driver code to emulate a manual mode correctly. This would likely require a deeper dive to implement and then require a patched Python to use, so it seems hard to justify (especially from the business standpoint) if it's just for this one application.
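As a sketch of what that explicit style could look like, here is transactional DDL with isolation_level=None, using an in-memory database. The table_exists helper is a hypothetical name added for the illustration.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling,
# so BEGIN/COMMIT/ROLLBACK pass through to SQLite unaltered.
conn = sqlite3.connect(':memory:', isolation_level=None)

def table_exists(conn, name):
    row = conn.execute(
        "select count(*) from sqlite_master where type='table' and name=?",
        (name,)).fetchone()
    return row[0] == 1

conn.execute('begin exclusive')
conn.execute('create table foo(bar)')
inside = table_exists(conn, 'foo')   # visible within the transaction
# Issue ROLLBACK as SQL so that a missing transaction would raise an error.
conn.execute('rollback')
after = table_exists(conn, 'foo')    # the table creation was undone
```

Passing the rollback through .execute() rather than calling .rollback() follows the reasoning above: the SQL form complains if no transaction is open.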

Long story short, the Unixy approach of locking through exclusive creation of a temporary file, initializing it and then renaming may not be the prettiest, but it has reason to stay for now.
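The create-then-rename idea can be sketched as below. This is an illustration under assumptions rather than the patch itself: init_db and the '.init' suffix are hypothetical, and a temporary file left behind by a crashed run would need manual cleanup in this simple form.

```python
import os
import sqlite3

def init_db(path, schema_sql):
    """Create and populate the database at path if it does not exist yet.

    Returns True if this call created it, False if it was already there.
    """
    if os.path.exists(path):
        return False
    tmp = path + '.init'   # hypothetical temporary name
    # O_EXCL makes creation fail with OSError if another process (or a
    # crashed earlier run) already holds the temporary file.
    fd = os.open(tmp, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)
    try:
        conn = sqlite3.connect(tmp)
        conn.executescript(schema_sql)
        conn.close()
        # rename is atomic on POSIX: readers see the whole schema or no file.
        os.rename(tmp, path)
    except Exception:
        os.unlink(tmp)
        raise
    return True
```

The exclusive create serves as the lock against concurrent initializers, and the rename is the commit point.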

On updating the README, a further improvement that snuck in was to show my recommended ~/.sqliterc settings and rationale.


Act IV.

Replacing memoryview

Next up came a renewed desire to eliminate the also unwittingly adopted dependence on Python 2.7, a desire heightened by a freshly envisioned approach to making it happen. I'd recently written some unpacking code for binary formats in Ada(ii) based on the streaming abstraction for input. That's an old pattern, well enough supported by Python and seen also in how bitcoind does its own object serialization in C++. The idea is to constrain client code to sequential access to the external data, always just pulling one next item out of an abstract kind of pipe. That way, the pipe object itself (or whatever is behind it) can handle any required position pointer bookkeeping, and the interface works just as well for files on disk, strings in memory or "true" streams like a Unix pipe, TCP socket or serial port where the data really is only readable once. Compared to the Python 2.7 memoryview object I'd been using (basically a range-checked pointer, or more efficient substring/slicing operator due to its not copying the underlying data), stream processing makes for an imperative rather than functional style, as read/write calls inherently have side effects so their ordering is significant. But for the right shape of problem, the resulting simplification can be worth it, as I trust the patch will illustrate.
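To make the stream style concrete, here is a small sketch of sequential readers for two common Bitcoin wire types. It uses io.BytesIO as the in-memory stream for portability of the example (the patch itself uses cStringIO), and the function names are my own for illustration.

```python
import io
import struct

def read_exact(stream, n):
    """Read exactly n bytes from the stream or fail loudly."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError('wanted %d bytes, got %d' % (n, len(data)))
    return data

def read_u32(stream):
    # Little-endian unsigned 32-bit integer.
    return struct.unpack('<I', read_exact(stream, 4))[0]

def read_varint(stream):
    """Read a Bitcoin variable-length integer (CompactSize)."""
    first = struct.unpack('<B', read_exact(stream, 1))[0]
    if first < 0xfd:
        return first
    fmt, size = {0xfd: ('<H', 2), 0xfe: ('<I', 4), 0xff: ('<Q', 8)}[first]
    return struct.unpack(fmt, read_exact(stream, size))[0]

# Each call consumes its bytes; the stream tracks the position for us.
s = io.BytesIO(b'\x01\x00\x00\x00' + b'\x02' + b'\xfd\x00\x01')
values = (read_u32(s), read_varint(s), read_varint(s))
```

No offsets are passed around or returned: the order of the calls alone determines which bytes each one sees, which is the imperative flavor mentioned above.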

Better still, the change turned out to slightly improve performance, even without specifically aiming for it. This contrasts with an earlier (undocumented) experiment where I replaced memoryviews with plain strings for Python 2.6 compatibility; that did work but with a severe performance penalty. Basically the code now benefits from using the language in a more straightforward and traditionally intended way.

There was one case where it seemed the conversion to stream style was running aground. A Bitcoin transaction ID is computed as a hash of the full transaction in its binary, serialized form. At the same time, the byte length of a transaction within a larger block can't be known up front, i.e. before reading it out of the stream in full, because transactions contain a variable-length list of input records followed by another of output records. Thus, by the time we know how many bytes to put through the hash, those bytes have already been consumed from the stream and built up into a structured object for meaningful manipulation within the programming language. Flattening this back down to bytes would require implementing the whole reverse (output) side of things, adding otherwise unused code and risking mistakes, besides being kind of silly. Building up a copy of the bytes as they're consumed would require complications to the whole family of input functions, which are decomposed by data type.

Instead, we get away with just a little cheating: using the secondary capabilities of some streams to disclose their internal position ("tell") and skip to a given position ("seek"). These seem to exist exactly as a compromise for this sort of situation, exposing the random-access capabilities of the underlying medium, when present. Thus we read each transaction twice over: first to decode and measure it, then to hash it.(iii)
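The tell/seek trick can be sketched like so. The read_record function is a toy stand-in for the real transaction decoder (a one-byte count followed by that many bytes), chosen only because its length is likewise unknown until it has been decoded.

```python
import hashlib
import io
import struct

def read_record(stream):
    # Toy stand-in for a transaction decoder: a one-byte count followed by
    # that many payload bytes, so the length is unknown until decoded.
    count = struct.unpack('<B', stream.read(1))[0]
    return stream.read(count)

def read_record_with_hash(stream):
    """Decode one record, then rewind and hash the raw bytes it occupied."""
    start = stream.tell()
    record = read_record(stream)     # first pass: decode and measure
    end = stream.tell()
    stream.seek(start)
    raw = stream.read(end - start)   # second pass: same bytes, for hashing
    # Bitcoin hashes transactions with double SHA-256.
    digest = hashlib.sha256(hashlib.sha256(raw).digest()).digest()
    return record, digest
```

After the second read the stream sits at the end of the record again, so the caller can carry on with the next one as if nothing unusual had happened.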


As far as I know this was the only thing still requiring Python 2.7, though I have not yet tested with an older version, for instance because I'm still calling for at least sqlite 3.7.0 which isn't found on CentOS 6. Nonetheless, hopefully this will at least reduce the amount of questionable newer code that needs to be imported to run on systems from that time period.

Test methodology - memoryview replacement

Some basic testing was in order, first to check performance and then to verify that the new input code could in fact still process the full blockchain to date. Since it's all fairly simple I wasn't expecting any trouble on that score, but it was easy enough to test so why not.

My local gbw-node database happened to be 813 blocks behind the current tip minus the six-block safety margin. I isolated my Bitcoin node from the network so as to maintain a constant height and minimize competing demands, and made a copy of the gbw-node database so as to run multiple tests from the same starting point. From there, I ran "time gbw-node scan", twice each for the old and new versions.

Then for the full scan test, I purged all records from the transaction tables (tx, output, input), reset the scan pointer, and compacted the database file for good measure. For that compaction I did a .dump and restored to a new file, though I later found I could have just used a VACUUM; statement.

In all cases, scanning was against the same set of 349 addresses.

Test results - memoryview replacement

All times are shown in seconds. "Blocked" is derived as the difference between real (total) time and user+system time. That is, it's the time that the program wasn't actively running in either user or kernel mode and thus either waiting for external input, sleeping, or scheduled out by some unrelated task.(iv)

version  trial  real   user   system  blocked
old      1      173.5  115.5  30.4    27.6
old      2      173.7  115.5  30.1    28.1
new      1      159.7  102.7  29.8    27.2
new      2      159.6  101.9  30.3    27.4

System and blocked times are not significantly changed, which makes sense. Taking the lowest figures (which is the way to do it, as the tightest approximation of "true" resource consumption attributable to the program without the variable overhead of running on a shared, multi-tasking system), we see savings of 11.8% in user time and 8.0% overall.
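For anyone wanting to check the arithmetic, the derived column and the savings figures can be reproduced from the table above. The helper names here are just for illustration.

```python
# (real, user, system) times in seconds, from the table above.
trials = {
    ('old', 1): (173.5, 115.5, 30.4),
    ('old', 2): (173.7, 115.5, 30.1),
    ('new', 1): (159.7, 102.7, 29.8),
    ('new', 2): (159.6, 101.9, 30.3),
}
REAL, USER, SYSTEM = 0, 1, 2

def blocked(real, user, system):
    # Time spent neither in user mode nor in kernel mode.
    return real - user - system

def best(version, column):
    # Lowest figure across trials for the given version.
    return min(t[column] for (v, _), t in trials.items() if v == version)

def saving_pct(old, new):
    return 100.0 * (old - new) / old

user_saving = saving_pct(best('old', USER), best('new', USER))
real_saving = saving_pct(best('old', REAL), best('new', REAL))
```

Rounded to one decimal place, user_saving and real_saving come out at the 11.8 and 8.0 percent quoted.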

Proceeding to the full rescan, things got more interesting. Nothing misbehaved outright, no, but it's a lengthy process and of course I had to check in on it while it was going and contemplate the signals I was seeing. The first thing that jumped out was that I didn't see any of the expected write-ahead log (WAL) related files alongside the slowly growing SQLite database file. I determined that although the WAL journal mode persists once activated for a given database file, the PRAGMA directive that does it is not included in the .dump output. Thus it had been lost in my dump and restore, or perhaps even an earlier such operation. So I made the code always activate it when connecting to (opening) the database, instead of doing it in the schema script, since this appears to be the only reliable way to consistently apply the choice of journal mode; this change went into the subsequent patch.
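Setting the journal mode at connection time can be sketched as below. The connect wrapper is a hypothetical name and not the actual gbw-node function.

```python
import sqlite3

def connect(path):
    """Open the database and request WAL journal mode on every connect."""
    conn = sqlite3.connect(path)
    # The pragma reports the mode actually in effect afterward; SQLite keeps
    # another mode where WAL is unavailable (e.g. in-memory databases).
    mode = conn.execute('pragma journal_mode=wal').fetchone()[0]
    return conn, mode
```

Checking the returned mode is a cheap way to notice when the request did not take, which is exactly the kind of silent loss that the dump and restore had caused.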

For a performance comparison of the journal mode fix, I interrupted the scan at this point, which had reached a height of merely 26`974 blocks in 3`924 seconds, and temporarily hard-coded this as the height limit. After wiping the database again and restarting with the new code, it reached the same height in 3`616 seconds. Not too bad... but the next thing to jump out was that this included only 14 seconds of user time! Those blocks from Bitcoin's early days - up to around 195`000 - are small, indeed mostly empty, so it stands to reason that the external overhead of retrieving them from bitcoind would take a significant share of the time, but the degree seemed extreme. Indeed, looking at top as it ran confirmed that bitcoind was pegged at nearly full utilization of one CPU while gbw-node was hardly doing anything.

Act V.

To answer why the costs might be quite so skewed ("unfair!!") between the two programs, I refreshed myself on the dumpblock RPC command implementation and saw the obvious "this is O(n^2)" comment. What it really means is that its search of the index is O(n) or linear in the number of indexed blocks; thus calling it for each block in the chain compounds into quadratic time complexity. Doesn't sound like much of an "index", does it? The problem is that the block header records are indexed only by hash, not by height.

An obvious first thought, then, was to add such an index as another in-memory data structure maintained by the node. Since block height values are natural numbers, densely packed, it could be as simple as a std::vector. However, this would require some care to implement correctly, such that the index is always updated when the underlying data changes, such as in a reorganization (rewind of previously accepted blocks). Manual index management is no SQL one-liner. Further, the change would touch on code that's directly involved in block validation and acceptance; that is, a mistake could result in a chain fork. To the extent possible, I'd rather leave such shenanigans to the Power Rangers and stick with gbw-node as the place to add new features, safely downstream from the hot reactor core.(v)

Another thought was to tweak the way dumpblock iterates the block index. Currently it uses an iterator on the map structure which means it visits the entries ordered by block hash, 0x000 to 0xFFF. But the index entry for each block in the active chain includes pointers to the entries for its two neighbors, next and previous, forming a doubly linked list. If (as the comment suggests) we walk down that list from the top for height values over the halfway point, and up from the bottom for those below, it would cut the average number of steps in half (or a little more, due to not even visiting inactive blocks). Not a bad start, perhaps, but it would still be a fundamentally quadratic algorithm, and constant factor speedups really aren't that interesting when algorithmic inefficiency is still in play.

But what about combining these two observations: we have efficient (O(log n)) lookup by hash; and we have efficient (O(1)) single-step traversal by height. And gbw-node scans in order by height anyway. If only bitcoind came with RPCs to expose that "next" block hash, and to retrieve block contents by hash rather than height...
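The difference between the two access patterns can be shown on a toy index. The data and function names below are made up for the illustration; a Python dict stands in for the node's map, and only the active chain is modeled.

```python
# Toy block index keyed by hash; each entry knows its height and the hash
# of its successor in the active chain (None at the tip).
index = {
    'h0': {'height': 0, 'next': 'h1'},
    'h1': {'height': 1, 'next': 'h2'},
    'h2': {'height': 2, 'next': None},
}

def find_by_height(index, height):
    """Linear search of the whole index, as lookup by height must do."""
    for block_hash, entry in index.items():
        if entry['height'] == height:
            return block_hash
    return None

def walk_from(index, block_hash):
    """One hash lookup and one pointer hop per block."""
    while block_hash is not None:
        yield block_hash
        block_hash = index[block_hash]['next']
```

Calling find_by_height once per block costs a pass over the index each time, hence quadratic overall, while walk_from visits each block once for a linear total.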

Direct block read

Thanks to my earlier correct observation and resolution of this problem, it already did, at least for the first part: querying the block index. Then for getting block contents, that index entry includes the exact file coordinates where the data begins. And guess what: since we've converted to streams, we don't even need to know the block size and can simply point the stream reader directly at the file.
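Pointing a stream reader at the block file might look like the sketch below. The path and offset are taken as plain parameters, since how they are obtained from the index entry is not shown here; the function names are hypothetical. Only the fixed 80-byte block header is decoded, with the transactions left on the stream for sequential readers to consume.

```python
import struct

# Block header layout: version, previous block hash, merkle root,
# timestamp, difficulty bits, nonce; 80 bytes, little-endian.
HEADER = struct.Struct('<i32s32sIII')

def open_block_stream(path, offset):
    """Open the block file positioned where the block's data begins."""
    f = open(path, 'rb')
    f.seek(offset)
    return f

def read_header(stream):
    raw = stream.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise EOFError('truncated block header')
    version, prev_hash, merkle_root, timestamp, bits, nonce = HEADER.unpack(raw)
    return {'version': version, 'prev_hash': prev_hash,
            'merkle_root': merkle_root, 'time': timestamp,
            'bits': bits, 'nonce': nonce}
```

Nothing here needs the block's total size: the decoder simply stops reading once it has consumed the last transaction.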

From this new vantage point, we can circle back to answer the evidently good questions I raised last time around about direct block reading:

We could read directly from the blk####.dat files, but this would go against the "loose coupling" design. Concretely, this manifests as several messes waiting to happen: do we know for sure that only validated blocks get written to these files?

Indeed we don't: hopeful blocks can be written there, which passed proof-of-work but were otherwise not validated because they didn't yet amount to a longer chain.

However, by walking the active chain using the block index rather than sequentially scanning the raw files in full, we don't run into them.

How will we detect and handle blocks that were once valid but abandoned when a better chain was found?

Again, due to walking the active chain, we won't run into such loser blocks.

Might we read a corrupt block if the node is concurrently writing?

It's tempting to appeal to the block index again, but this one does warrant further verification. There's only one call to CBlock::WriteToDisk, besides the special case of initially recording the genesis block. It's the same for CBlock::AddToBlockIndex. WriteToDisk can now be relied on to commit before returning, and its call comes immediately before the AddToBlockIndex. Thus, if a block is in the index, its data is already fully flushed (and not merely to the wider system but all the way to disk).(vi)

Unfortunately we still need dumpblock in order to "prime the pump" on startup, because it remains the only way to obtain a block hash by height, which is how scan state is recorded. An alternative would be to hardcode the genesis block hash and record the scan pointer by hash instead of height. This would remove the ability to easily set a desired height to scan from, which so far seems like a nice thing to have. But that in turn could be addressed by indexing block metadata as such in gbw-node, finally providing an efficient lookup by height. Then we could even remove dumpblock altogether, perhaps providing a non-retarded replacement in gbw-node that works by hash or height. Although, I could see an argument that it be left in the main program for debugging help in a pinch.

For now, we can at least simplify the mechanism for catching dumped blocks, due to its now infrequent use. The scanner can dump to regular files and become truly single-threaded, in place of the named pipe with reader thread.


Satisfied that the change is not only simple but also sound, we can proceed to looking at performance. Hold on to your hats, ladies and gentlemen!

Test results - direct block read

The scan from zero to 26`974, previously observed at 3`616 s real / 14 s user time in the best case, now reaches the same height in 26.8 s real / 11.6 s user time, a 135x speedup.

It may be noted that this gives a distorted picture since it includes only early blocks where the index lookup overhead far exceeded block decoding, filtering and database update times, and this balance changes as the blocks get larger. Still, extrapolating the saved overhead per block to the current ~780`000 blocks suggests around 29 hours shaved off the full scan.
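The speedup and the extrapolation follow from the measured figures, assuming (as a simplification) that the overhead saved per block stays constant across the whole chain:

```python
blocks_scanned = 26974
old_real, new_real = 3616.0, 26.8   # seconds, before and after the change
chain_length = 780000               # approximate current block count

speedup = old_real / new_real
saved_per_block = (old_real - new_real) / blocks_scanned   # seconds
saved_hours = saved_per_block * chain_length / 3600.0
```

This gives a speedup of about 135 and a saving of about 29 hours, matching the figures in the text.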

As to that full scan, it completed in 71`395 s real / 63`573 s user / 7`262 s system / 560 s blocked time, or 19.8 hours.


Finally, for curiosity's sake, let's return for a proper comparison of the SQLite journal modes, now that the elimination of other waste has increased the database's relative share of the overall time cost. For some kind of common reference, I used the same scan range as in the memoryview replacement experiment from Act IV. Unfortunately I'd deleted my starting database snapshot and finished the full scan, so the results are not exactly commensurate and they don't reflect any recording of new transactions; indeed it may be that the update of the scan pointer on each block is the only write activity going on. Yet even with this fairly light load, a difference is noticeable.

journal_mode  real   user   system  blocked
WAL           110.5  99.4   10.5    0.6
delete        136.2  103.7  29.7    2.8
truncate      138.5  104.0  31.6    2.9
persist       175.0  115.5  55.1    4.4

WAL wins, as expected; and at least on this SSD based system, delete (the default) and truncate beat persist. Time spent in a blocked state is significantly lower than before, also as expected since we're no longer waiting around for bitcoind to walk on the order of a million headers just to locate each subsequent block.

Curtain call: full patch listing

Patch Seals Tree
gbw-node_error_reporting.vpatch(vii) jfw Browse
gbw-node_error_tolerance.vpatch(viii) jfw Browse
gbw-node_db_auto_init.vpatch(ix) jfw Browse
gbw-node_memoryview_replacement.vpatch(x) jfw Browse
gbw-node_direct_block_read.vpatch(xi) jfw Browse

I'll most likely do a fetch-bitcoind update after letting it all marinate a little. I was hoping to defer this until after full removal of dumpblock, but now that I'm thinking of a schema change as the best path there, I'm less inclined to block release of these improvements on what's likely to be a somewhat heavier upgrade.

  i. No one is coming to fix it, of course. Supposedly the equivalent bug was finally fixed in the very recent version 3.12 of a vaguely Python-like language, but the actual Python is abandonware, unmaintained for some years now.
  ii. So far I've only teased at it here, but to tease just a bit more: I'm decoding captured Eulora2 network traffic dumps from before and after a protocol change, looking for what patterns might be found in repeated ciphertext blocks.
  iii. An optimization came to mind of deferring hashing until the transaction proves to be of interest, i.e. that it affects a watched address. This however would be counterproductive, recalling the original goal of finding an acceptably performing way to index all unspent outputs, as this will require storing their TXID by which to delete them when spent.
  iv. This accounting is valid given that it's a sequential program; it gets more complicated for parallel programs. The astute observer might note the presence of a secondary getblock_reader thread, but its execution is almost entirely exclusive of the main thread; or for the even easier out, Python's use of a global interpreter lock (GIL) implies that within a single process, only one thread executes at a time, although they may be scheduled on independent processor cores.
  v. If the idea of swimming downstream of a reactor core doesn't instill complete tranquility (it SHOULD be perfectly safe...) then I'd think the metaphor is working just about right.
  vi. Note that in my V-tree, the fsync fix comes before the getblockindex addition, so there's no risk of breaking the assumptions here, either it works properly or not at all. But if using someone else's bitcoind, you may need to give the situation a closer look.
  vii. Fix undefined reference on reporting a bad address in cmd_tags. More generally, simplify arglist processing and fix the silent ignoring of excess CLI arguments, by having Python unpack the arglist automatically into named parameters of each command handler function. Report more context for address decoding errors. Clarify help text regarding input format for watch and push commands.
  viii. Don't die immediately on bad input lines.
  ix. Automate database initialization from schema script; update and expand related documentation.
  x. Replace memoryview with cStringIO for loading block data, eliminating the main cause of Python 2.7 dependence, simplifying the code and even somewhat improving performance.
  xi. Use the newer getblockindex RPC to walk the chain in linear time and read blocks directly from files on disk. This bypasses the dumpblock hack, except for finding the initial hash; in turn, it makes sense to remove the named pipe and threading complications and just dump to a regular file. Also enable WAL mode more reliably by setting it on connect. Bump version by way of adding a minor number (for instance because there are no schema changes or significant interface or behavior changes).

1 Comment »

  1. I'll most likely do a fetch-bitcoind update after letting it all marinate a little.

    And it's out ; signed.

    I was hoping to defer this until after full removal of dumpblock, but now that I'm thinking of a schema change as the best path there, I'm less inclined to block release of these improvements on what's likely to be a somewhat heavier upgrade.

    Behind the scenes, the schema and code updates for tracking the full set of block metadata and current unspent outputs are drafted, with a variety of functionality and performance testing ongoing. This includes a process to cleanly migrate user data from the prior version, although a full rescan will certainly still be required. Preliminary results indicate around 5 days to scan and 24 GB for the database. And it does indeed do away with use of dumpblock!

    Comment by Jacob Welsh — 2023-04-25 @ 01:39


Powered by MP-WP. Copyright Jacob Welsh.