The database schema for the "node" part of Gales Bitcoin Wallet covered, we now proceed to the frontend program that puts it to work: collecting data from bitcoind, parsing its various binary encodings and extracting something useful.
Source: draft1/gbw-node.py, draft2/gbw-node.py (and the previous schema-node.sql).
You'd be well advised to read the downloaded thing prior to executing, especially since it's in an unsigned draft state. As for what's necessary to graduate to a vpatch I'd be ready to sign, my thinking is that it's this review and annotation process itself, plus whatever important changes come out of it, and the previously suggested schema tweaks (since changing that is the most obnoxious part once deployed).
At present there's not much of an installation process and the database is initialized manually. I'd suggest creating some directory to hold the two sources. Then from that directory:
$ chmod +x gbw-node.py
$ mkdir ~/.gbw
$ sqlite3 ~/.gbw/db < schema-node.sql
$ ./gbw-node.py help
In preparing this code for publication I observed that I had continued by force of habit (and editor settings) with the Python style guidelines of four-space indents and some fixed line width limit, in opposition to Republican doctrine. I've attempted to clean it up such that line breaks occur only for good reasons, though I can't say I'm happy with how my browser wraps the long lines. And it's not like I expect the poor thing to know good indentation rules for every possible programming language now... wut do?!
Prologue
We start with the usual Pythonistic pile of imports. The ready libraries are a big reason the language is hard to beat for getting things working quickly, and at the same time a dangerous temptation toward thinking you don't need to care what's inside them.
#!/usr/bin/python2
# J. Welsh, December 2019
from os import getenv, open as os_open, O_RDONLY, O_WRONLY, mkdir, mkfifo, read, write, close, stat
from stat import S_ISDIR, S_ISFIFO
from sys import argv, stdin, stdout, stderr, exit
from socket import socket
from threading import Thread, Event
from binascii import a2b_hex, b2a_hex
from base64 import b64encode
from struct import Struct
from hashlib import sha256 as _sha256
from decimal import Decimal
from inspect import getdoc
import errno
import signal
import string
import json
import sqlite3
from sqlite3 import IntegrityError
The above are all in the standard library, assuming they're enabled on your system. The ones that stick out like sore thumbs to me are threading and decimal; more on these to come.
As the comments say:
# Safety level: scanning stops this many blocks behind tip
CONFIRMS = 6
# There's no provision for handling forks/reorgs. In the event of one deeper than CONFIRMS, a heavy workaround would be:
# $ sqlite3 ~/.gbw/db
# sqlite> DELETE FROM output;
# sqlite> DELETE FROM input;
# sqlite> DELETE FROM tx;
# sqlite> .exit
# $ gbw-node reset
# $ gbw-node scan
At least a semi-automated and lighter-touch recovery procedure would certainly be nice there.
gbw_home = getenv('HOME') + '/.gbw'
bitcoin_conf_path = getenv('HOME') + '/.bitcoin/bitcoin.conf'
# Further knobs in main() for database tuning.
db = None
For reasons I don't quite recall (probably interpreting hashes as integers, combined with pointer type punning - an unportable C programming practice common in Windows-land), bitcoind ended up reversing byte order compared to the internal representation for hex display of certain things including transaction and block hashes. Thus we have "bytes to little-endian hex" wrappers.
b2lx = lambda b: b2a_hex(b[::-1])
lx2b = lambda x: a2b_hex(x)[::-1]
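To see the reversal concretely, here is a minimal sketch applying the same two wrappers to a short byte string. The sample bytes are arbitrary, and the snippet is written so that it also runs under Python 3, where `b2a_hex` returns bytes rather than str.

```python
from binascii import a2b_hex, b2a_hex

# Same definitions as in the listing above.
b2lx = lambda b: b2a_hex(b[::-1])
lx2b = lambda x: a2b_hex(x)[::-1]

raw = b'\x01\x02\x03\x04'   # arbitrary sample, in internal byte order
shown = b2lx(raw)           # byte-reversed hex, the order bitcoind displays
assert shown == b'04030201'
assert lx2b(shown) == raw   # the two wrappers are inverses of each other
```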
Not taking any chances with display of monetary amounts, a function to convert integer Satoshi values to fixed-point decimal BTC notation. The remainder/modulus operators have varying definitions between programming languages (sometimes even between implementations of the same language!) when it comes to negative inputs, so we bypass the question.
def format_coin(v):
	neg = False
	if v < 0:
		v = -v
		neg = True
	s = '%d.%08d' % divmod(v, 100000000)
	if neg:
		return '-' + s
	return s
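A short illustration of why the sign is handled separately: in Python, `divmod` floors toward negative infinity, so feeding it a negative amount directly gives a misleading result. The sample amounts below are made up for the demonstration.

```python
def format_coin(v):
    neg = False
    if v < 0:
        v = -v
        neg = True
    s = '%d.%08d' % divmod(v, 100000000)
    if neg:
        return '-' + s
    return s

# Naive formatting of -1 satoshi: divmod(-1, 100000000) is (-1, 99999999).
naive = '%d.%08d' % divmod(-1, 100000000)

print(naive)                    # -1.99999999, which is wrong
print(format_coin(-1))          # -0.00000001
print(format_coin(150000000))   # 1.50000000
```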
Preloading and giving more intelligible names to some "struct" based byte-packing routines.
u16 = Struct('<H')
u32 = Struct('<I')
u64 = Struct('<Q')
s64 = Struct('<q')
unpack_u16 = u16.unpack
unpack_u32 = u32.unpack
unpack_u64 = u64.unpack
unpack_s64 = s64.unpack
unpack_header = Struct('<i32s32sIII').unpack
unpack_outpoint = Struct('<32sI').unpack
Some shorthand for hash functions.
def sha256(v):
	return _sha256(v).digest()

def sha256d(v):
	return _sha256(_sha256(v).digest()).digest()
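As a sanity check of how the pieces so far fit together, the following sketch unpacks the 80-byte header of the genesis block and computes its hash. The header bytes are the widely published genesis header, typed in here by hand rather than taken from the program; the snippet is written to run under Python 3 as well.

```python
from binascii import a2b_hex, b2a_hex
from hashlib import sha256 as _sha256
from struct import Struct

unpack_header = Struct('<i32s32sIII').unpack
b2lx = lambda b: b2a_hex(b[::-1])

def sha256d(v):
    return _sha256(_sha256(v).digest()).digest()

# Serialized genesis block header, field by field.
genesis_header = a2b_hex(
    '01000000'                                                          # version
    '0000000000000000000000000000000000000000000000000000000000000000'  # previous block hash
    '3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a'  # merkle root
    '29ab5f49'                                                          # time
    'ffff001d'                                                          # target (compact form)
    '1dac2b7c')                                                         # nonce

version, prev, root, time, target, nonce = unpack_header(genesis_header)
block_hash = b2lx(sha256d(genesis_header))   # reversed hex, as bitcoind shows it

print(version, time, nonce)
print(block_hash)
```

The leading zeros of the familiar block hash only appear after the byte reversal; in the internal order they sit at the end.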
An exception type to indicate certain "should not happen" database inconsistencies.
class Conflict(ValueError): pass
For reading a complete stream from a low-level file descriptor; experience has led me to be suspicious of Python's file objects.
def read_all(fd):
	parts = []
	while True:
		part = read(fd, 65536)
		if len(part) == 0:
			break
		parts.append(part)
	return ''.join(parts)
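The loop can be exercised against a pipe, as in this small sketch. It is adapted for Python 3 (joining with `b''` instead of `''`), and the pipe and sample data are stand-ins invented for the demonstration.

```python
import os

def read_all(fd):
    parts = []
    while True:
        part = os.read(fd, 65536)   # up to 65536 bytes; empty at end of stream
        if len(part) == 0:
            break
        parts.append(part)
    return b''.join(parts)

r, w = os.pipe()
os.write(w, b'hello ')
os.write(w, b'world')
os.close(w)             # closing the write end is what signals end of stream
data = read_all(r)
os.close(r)
print(data)
```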
Ensuring needed filesystem objects exist.
def require_dir(path):
	try:
		mkdir(path)
	except OSError, e:
		if e.errno != errno.EEXIST:
			raise
		if not S_ISDIR(stat(path).st_mode):
			die('not a directory: %r' % path)

def require_fifo(path):
	try:
		mkfifo(path)
	except OSError, e:
		if e.errno != errno.EEXIST:
			raise
		if not S_ISFIFO(stat(path).st_mode):
			die('not a fifo: %r' % path)
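The pattern is "try to create, tolerate EEXIST, then verify the type of what's there". A rough sketch of it in Python 3 syntax follows; since `die` is defined elsewhere in the program, this version raises a `RuntimeError` instead, and the temporary paths are invented for the demonstration.

```python
import errno
import os
import tempfile
from stat import S_ISDIR

def require_dir(path):
    try:
        os.mkdir(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
        if not S_ISDIR(os.stat(path).st_mode):
            # Substitute for the program's die(), which is not shown here.
            raise RuntimeError('not a directory: %r' % path)

base = tempfile.mkdtemp()
target = os.path.join(base, 'gbw')
require_dir(target)   # first call creates the directory
require_dir(target)   # second call finds it already there and is content
```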
RPC client
Bitcoind uses a password-authenticated JSON-RPC protocol. I expect this is one of the more concise client implementations around.
class JSONRPCError(Exception):
	"Error returned in JSON-RPC response"
	def __init__(self, error):
		super(JSONRPCError, self).__init__(error['code'], error['message'])
	def __str__(self):
		return 'code: {}, message: {}'.format(*self.args)
Some of this code was cribbed from earlier experiments on my shelf. The fancy exception class above doesn't really look like my style; it may have hitchhiked from an outside JSON-RPC library.
The local bitcoin.conf is parsed to get the node's credentials. This is done lazily to avoid unnecessary error conditions for the many commands that won't be needing it.
bitcoin_conf = None
def require_conf():
	global bitcoin_conf
	if bitcoin_conf is None:
		bitcoin_conf = {}
		with open(bitcoin_conf_path) as f:
			for line in f:
				line = line.split('#', 1)[0].rstrip()
				if not line:
					continue
				k, v = line.split('=', 1)
				bitcoin_conf[k.strip()] = v.lstrip()
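To show what the line handling does with comments and whitespace, here is the same parsing logic pulled out into a standalone function and fed a few sample lines. The function name `parse_conf` and the sample settings are hypothetical, made up for this illustration.

```python
def parse_conf(lines):
    # Same per-line logic as require_conf, minus the file and the global.
    conf = {}
    for line in lines:
        line = line.split('#', 1)[0].rstrip()   # drop comment and trailing space
        if not line:
            continue
        k, v = line.split('=', 1)
        conf[k.strip()] = v.lstrip()
    return conf

sample = [
    '# example settings\n',
    'rpcuser = alice   # trailing comment\n',
    'rpcpassword=s3cret\n',
    '\n',
    'rpcport=8332\n',
]
conf = parse_conf(sample)
print(conf)
```

One consequence worth knowing: with this logic a `#` inside a value, such as a password, would be taken as the start of a comment.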
Side note: I detest that "global" keyword hack. It's "necessary" only because variable definition is conflated with mutation in the single "=" operator, and completely misses the case of a nested function setting a variable in an outer but not global scope. ("So they added 'nonlocal' in Python 3, solves your problem!!")
def rpc(method, *args):
	require_conf()
	host = bitcoin_conf.get('rpcconnect', '127.0.0.1')
	port = int(bitcoin_conf.get('rpcport', 8332))
	auth = 'Basic ' + b64encode('%s:%s' % (
		bitcoin_conf.get('rpcuser', ''),
		bitcoin_conf.get('rpcpassword', '')))
	payload = json.dumps({'method': method, 'params': args})
	headers = [
		('Host', host),
		('Content-Type', 'application/json'),
		('Content-Length', len(payload)),
		('Connection', 'close'),
		('Authorization', auth),
	]
	msg = 'POST / HTTP/1.1\r\n%s\r\n\r\n%s' % ('\r\n'.join('%s: %s' % kv for kv in headers), payload)
	sock = socket()
	sock.connect((host, port))
	sock.sendall(msg)
	response = read_all(sock.fileno())
	sock.close()
	headers, payload = response.split('\r\n\r\n', 1)
	r = json.loads(payload, parse_float=Decimal)
	if r['error'] is not None:
		raise JSONRPCError(r['error'])
	return r['result']
I could see removing the "parse_float=Decimal", and thus the corresponding import, as we won't be calling here any of the problematic interfaces that report monetary values as JSON numbers. But then, I'd also see value in one RPC client implementation that can just be copied for whatever use without hidden hazards.
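The hazard in question can be shown without a running node. In this sketch the JSON payload is a made-up stand-in for a response that reports an amount as a bare JSON number; it is not output from bitcoind.

```python
import json
from decimal import Decimal

# Hypothetical response body carrying a monetary value as a JSON number.
payload = '{"result": {"amount": 0.1}, "error": null}'

as_float = json.loads(payload)['result']['amount']
as_decimal = json.loads(payload, parse_float=Decimal)['result']['amount']

print(as_float * 3 == 0.3)                 # False: binary floating point rounding
print(as_decimal * 3 == Decimal('0.3'))    # True: exact decimal arithmetic
```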
Bitcoin data parsing
Now things might get interesting. To parse the serialized data structures in a manner similar to the C++ reference implementation and hopefully efficient besides, I used memory views, basically bounds-checking pointers.(i)
# "load" functions take a memoryview and return the object and number of bytes consumed.

def load_compactsize(v):
	# serialize.h WriteCompactSize
	size = ord(v[0])
	if size < 253:
		return size, 1
	elif size == 253:
		return unpack_u16(v[1:3])[0], 3
	elif size == 254:
		return unpack_u32(v[1:5])[0], 5
	else:
		return unpack_u64(v[1:9])[0], 9

def load_string(v):
	# serialize.h Serialize, std::basic_string and CScript overloads
	n, i = load_compactsize(v)
	return v[i:i+n].tobytes(), i+n

def vector_loader(load_element):
	# serialize.h Serialize_impl
	def load_vector(v):
		n, i = load_compactsize(v)
		r = [None]*n
		for elem in xrange(n):
			r[elem], delta = load_element(v[i:])
			i += delta
		return r, i
	return load_vector

def load_txin(v):
	# main.h CTxIn
	i = 36
	txid, pos = unpack_outpoint(v[:i])
	scriptsig, delta = load_string(v[i:])
	i += delta
	i += 4 # skipping sequence
	return (txid, pos, scriptsig), i

load_txins = vector_loader(load_txin)

def load_txout(v):
	# main.h CTxOut
	i = 8
	value, = unpack_s64(v[:i])
	scriptpubkey, delta = load_string(v[i:])
	return (value, scriptpubkey), i+delta

load_txouts = vector_loader(load_txout)

def load_transaction(v):
	# main.h CTransaction
	i = 4 # skipping version
	txins, delta = load_txins(v[i:])
	i += delta
	txouts, delta = load_txouts(v[i:])
	i += delta
	i += 4 # skipping locktime
	hash = sha256d(v[:i])
	return (hash, i, txins, txouts), i

load_transactions = vector_loader(load_transaction)

def load_block(v):
	# main.h CBlock
	i = 80
	head = v[:i]
	version, prev, root, time, target, nonce = unpack_header(head)
	hash = sha256d(head)
	txs, delta = load_transactions(v[i:])
	return (hash, prev, time, target, txs), i+delta
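To get a feel for the "object plus bytes consumed" convention, here is a cut-down sketch of the first three loaders run on hand-made input. It is adapted for Python 3, where indexing a memoryview of bytes already yields an integer (so no `ord`) and `range` replaces `xrange`; `load_strings` and the sample bytes are hypothetical, made up for the illustration.

```python
from struct import Struct

unpack_u16 = Struct('<H').unpack
unpack_u32 = Struct('<I').unpack
unpack_u64 = Struct('<Q').unpack

def load_compactsize(v):
    size = v[0]   # an int under Python 3
    if size < 253:
        return size, 1
    elif size == 253:
        return unpack_u16(v[1:3])[0], 3
    elif size == 254:
        return unpack_u32(v[1:5])[0], 5
    else:
        return unpack_u64(v[1:9])[0], 9

def load_string(v):
    n, i = load_compactsize(v)
    return v[i:i+n].tobytes(), i+n

def vector_loader(load_element):
    def load_vector(v):
        n, i = load_compactsize(v)
        r = [None]*n
        for elem in range(n):
            r[elem], delta = load_element(v[i:])   # slicing a memoryview copies nothing
            i += delta
        return r, i
    return load_vector

load_strings = vector_loader(load_string)   # hypothetical, for illustration only

# A vector of two strings, "ab" and "xyz", followed by one unrelated trailing byte.
data = memoryview(b'\x02' b'\x02ab' b'\x03xyz' b'\xff')
items, consumed = load_strings(data)
print(items, consumed)

# The three-byte form: 0xfd prefix, then a little-endian 16-bit count.
print(load_compactsize(memoryview(b'\xfd\x00\x01')))
```

The returned count lets each caller advance its own offset, which is how the nested structures (block, transactions, inputs and outputs, scripts) are walked without any shared cursor state.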
The code dig to come up with this magic for identifying standard pay-to-pubkey-hash outputs and extracting the enclosed addresses was ugly.
def out_script_address(s):
	# Standard P2PKH script: OP_DUP OP_HASH160 20 ... OP_EQUALVERIFY OP_CHECKSIG
	if len(s) == 25 and s[:3] == '\x76\xA9\x14' and s[23:] == '\x88\xAC':
		return s[3:23]
	return None
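A quick sketch of the check in action, using a placeholder 20-byte hash rather than any real address. It uses bytes literals so that it also runs under Python 3; the second script is a different standard form (OP_HASH160, 20-byte push, OP_EQUAL), included only to show that the function passes over anything that isn't P2PKH.

```python
def out_script_address(s):
    # Standard P2PKH script: OP_DUP OP_HASH160 20 ... OP_EQUALVERIFY OP_CHECKSIG
    if len(s) == 25 and s[:3] == b'\x76\xA9\x14' and s[23:] == b'\x88\xAC':
        return s[3:23]
    return None

pubkey_hash = bytes(bytearray(range(20)))   # placeholder hash, not a real address

p2pkh = b'\x76\xA9\x14' + pubkey_hash + b'\x88\xAC'
other = b'\xA9\x14' + pubkey_hash + b'\x87'

print(out_script_address(p2pkh) == pubkey_hash)   # True
print(out_script_address(other))                  # None
```

Note that what comes back is the raw 20-byte hash; any encoding into the familiar text form of an address happens elsewhere.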
To be continued.(ii)
Updated for errata.
(i) I'm just now noticing these were added in 2.7, ugh... sorry, 2.6 users.
(ii) My blog will be going on hiatus as far as new articles until early January. There's quite a ways to go on this file and I might not make it all the way through on this pass. If the suspense gnaws, you can always keep reading the source!