Bitcoin Developer Examples
Find examples of how to build programs using Bitcoin.
The following guide aims to provide examples to help you start building Bitcoin-based applications. To make the best use of this document, you may want to install the current version of Bitcoin Core, either from source or from a pre-compiled executable.
Once installed, you’ll have access to three programs: bitcoind, bitcoin-qt, and bitcoin-cli.
- bitcoin-qt provides a combination full Bitcoin peer and wallet frontend. From the Help menu, you can access a console where you can enter the RPC commands used throughout this document.
- bitcoind is more useful for programming: it provides a full peer which you can interact with through RPCs to port 8332 (or 18332 for testnet).
- bitcoin-cli allows you to send RPC commands to bitcoind from the command line. For example, bitcoin-cli help
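The RPC interface described in the list above accepts JSON-RPC requests over HTTP. As a rough sketch (not Bitcoin Core's own client code), the Python below builds the request body that a call such as bitcoin-cli help corresponds to and picks the port from the numbers given above. The function names, the localhost address, and the id value are illustrative assumptions, and nothing is sent over the network.

```python
import json

# Default RPC ports named in the text above.
RPC_PORTS = {"mainnet": 8332, "testnet": 18332}

def rpc_url(network="mainnet", host="127.0.0.1"):
    """Return the HTTP endpoint of a local node for the given network."""
    return "http://%s:%d/" % (host, RPC_PORTS[network])

def build_rpc_request(method, params=None, request_id="example"):
    """Serialize one JSON-RPC call (for example `help`) into a request body."""
    payload = {
        "jsonrpc": "1.0",
        "id": request_id,          # hypothetical id; any string works
        "method": method,
        "params": params or [],
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")

body = build_rpc_request("help")
print(rpc_url("testnet"))
print(body.decode("utf-8"))
```

A real client would POST this body to the URL using HTTP basic authentication with the RPC password configured in bitcoin.conf.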
All three programs get settings from bitcoin.conf in the Bitcoin application directory:
- Windows: %APPDATA%\Bitcoin\
- OSX: $HOME/Library/Application Support/Bitcoin/
- Linux: $HOME/.bitcoin/
To use bitcoind and bitcoin-cli, you will need to add an RPC password to your bitcoin.conf file. Both programs will read from the same file if both run on the same system as the same user, so any long random password will work:
rpcpassword=change_this_to_a_long_random_password
You should also make the bitcoin.conf file only readable to its owner. On Linux, Mac OSX, and other Unix-like systems, this can be accomplished by running the following command in the Bitcoin application directory:
chmod 0600 bitcoin.conf
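The two steps above (choosing a long random password and restricting the file's permissions) can also be scripted. The following is a minimal sketch for Unix-like systems; the helper name write_conf is hypothetical, and the demo writes to a temporary directory rather than to the real application directory.

```python
import os
import secrets
import stat
import tempfile

def write_conf(directory, filename="bitcoin.conf"):
    """Create a conf file with a random rpcpassword, readable only by its owner."""
    path = os.path.join(directory, filename)
    password = secrets.token_urlsafe(32)   # 32 random bytes, URL-safe encoded
    # Create the file with mode 0600 from the start and refuse to overwrite.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as conf:
        conf.write("rpcpassword=%s\n" % password)
    os.chmod(path, 0o600)                  # same effect as: chmod 0600 bitcoin.conf
    return path

demo_dir = tempfile.mkdtemp()              # stand-in for the application directory
conf_path = write_conf(demo_dir)
print(oct(stat.S_IMODE(os.stat(conf_path).st_mode)))
```

Because the sketch refuses to overwrite an existing file, it will not clobber a configuration you already have.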
For development, it’s safer and cheaper to use Bitcoin’s test network (testnet) or regression test mode (regtest) described below.
Questions about Bitcoin use are best sent to the BitcoinTalk forum and IRC channels. Errors or suggestions related to documentation on this site can be submitted as an issue or posted to the bitcoin-documentation mailing list.
In the following documentation, some strings have been shortened or wrapped: “[…]” indicates extra data was removed, and lines ending in a single backslash “\” are continued below.
Testing Applications
Bitcoin Core provides testing tools designed to let developers test their applications with reduced risks and limitations.
Testnet
When run with no arguments, all Bitcoin Core programs default to Bitcoin’s main network (mainnet). However, for development, it’s safer and cheaper to use Bitcoin’s test network (testnet) where the satoshis spent have no real-world value. Testnet also relaxes some restrictions (such as standard transaction checks) so you can test functions which might currently be disabled by default on mainnet.
To use testnet, use the argument -testnet with bitcoin-cli, bitcoind or bitcoin-qt or add testnet=1 to your bitcoin.conf file as described earlier. To get free satoshis for testing, use Piotr Piasecki’s testnet faucet. Testnet is a public resource provided for free by members of the community, so please don’t abuse it.
Regtest Mode
For situations where interaction with random peers and blocks is unnecessary or unwanted, Bitcoin Core’s regression test mode (regtest mode) lets you instantly create a brand-new private block chain with the same basic rules as testnet—but one major difference: you choose when to create new blocks, so you have complete control over the environment.
Many developers consider regtest mode the preferred way to develop new applications. The following example will let you create a regtest environment after you first configure bitcoind.
Start bitcoind in regtest mode to create a private block chain.
## Bitcoin Core 0.10.1 and earlier
bitcoin-cli -regtest setgenerate true 101
## Bitcoin Core master (as of commit 48265f3)
bitcoin-cli -regtest generate 101
Generate 101 blocks using a special RPC which is only available in regtest mode. This takes less than a second on a generic PC. Because this is a new block chain using Bitcoin’s default rules, the first blocks pay a block reward of 50 bitcoins. Unlike mainnet, in regtest mode only the first 150 blocks pay a reward of 50 bitcoins. However, a block must have 100 confirmations before that reward can be spent, so we generate 101 blocks to get access to the coinbase transaction from block #1.
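The reasoning behind the number 101 can be checked with a little arithmetic. The sketch below assumes that a coinbase output becomes spendable once 100 further blocks have been built on top of its block, and that the regtest reward halves every 150 blocks as described above; the function names are illustrative and this is not Bitcoin Core's own code.

```python
COIN = 100_000_000          # satoshis per bitcoin
HALVING_INTERVAL = 150      # assumed regtest halving interval, per the text above
COINBASE_MATURITY = 100     # assumed: blocks that must be built on top of a coinbase

def block_subsidy(height):
    """Block reward in satoshis at the given height under the assumed schedule."""
    halvings = height // HALVING_INTERVAL
    return (50 * COIN) >> halvings if halvings < 64 else 0

def spendable_heights(tip_height):
    """Heights (starting at block #1) whose coinbase has matured at this tip."""
    return range(1, tip_height - COINBASE_MATURITY + 1)

def spendable_balance(tip_height):
    """Total matured coinbase value in satoshis if we mined every block."""
    return sum(block_subsidy(h) for h in spendable_heights(tip_height))

print(spendable_balance(100) / COIN)   # 0.0
print(spendable_balance(101) / COIN)   # 50.0
```

With 100 blocks nothing is spendable yet; the 101st block is what matures the reward from block #1.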
Verify that we now have 50 bitcoins available to spend.
You can now use Bitcoin Core RPCs prefixed with bitcoin-cli -regtest.
Regtest wallets and block chain state (chainstate) are saved in the regtest subdirectory of the Bitcoin Core configuration directory. You can safely delete the regtest subdirectory and restart Bitcoin Core to start a new regtest. (See the Developer Examples Introduction for default configuration directory locations on various operating systems. Always back up mainnet wallets before performing dangerous operations such as deleting.)
Transactions
Transaction Tutorial
Creating transactions is something most Bitcoin applications do. This section describes how to use Bitcoin Core’s RPC interface to create transactions with various attributes.
Your applications may use something besides Bitcoin Core to create transactions, but in any system, you will need to provide the same kinds of data to create transactions with the same attributes as those described below.
To use this tutorial, you will need to set up Bitcoin Core and create a regression test mode environment with 50 BTC in your test wallet.
Simple Spending
Bitcoin Core provides several RPCs which handle all the details of spending, including creating change outputs and paying appropriate fees. Even advanced users should use these RPCs whenever possible to decrease the chance that satoshis will be lost by mistake.
Get a new Bitcoin address and save it in the shell variable $NEW_ADDRESS.
Send 10 bitcoins to the address using the sendtoaddress RPC. The returned hex string is the transaction identifier (txid).
The sendtoaddress RPC automatically selects an unspent transaction output (UTXO) from which to spend the satoshis. In this case, it withdrew the satoshis from our only available UTXO, the coinbase transaction for block #1 which matured with the creation of block #101. To spend a specific UTXO, you would need the raw transaction RPCs described in the subsections below; the sendfrom RPC selects funds by account rather than by UTXO.
Use the listunspent RPC to display the UTXOs belonging to this wallet. The list is empty because it defaults to only showing confirmed UTXOs and we just spent our only confirmed UTXO.
Re-running the listunspent RPC with the argument “0” to also display unconfirmed transactions shows that we have two UTXOs, both with the same txid. The first UTXO shown is a change output that sendtoaddress created using a new address from the key pool. The second UTXO shown is the spend to the address we provided. If we had spent those satoshis to someone else, that second output would not be displayed in our list of UTXOs.
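The effect of the minimum-confirmations argument can be pictured as a simple filter. The sketch below is only a model of that behaviour, not the wallet's implementation; the txid is a placeholder and the "role" field is a hypothetical label added for readability.

```python
def list_unspent(utxos, minconf=1):
    """Model of listunspent: keep UTXOs with at least `minconf` confirmations."""
    return [utxo for utxo in utxos if utxo["confirmations"] >= minconf]

PLACEHOLDER_TXID = "ab" * 32   # not a real transaction identifier

# Both outputs belong to the same, still unconfirmed, transaction.
wallet_utxos = [
    {"txid": PLACEHOLDER_TXID, "role": "change", "confirmations": 0},
    {"txid": PLACEHOLDER_TXID, "role": "payment", "confirmations": 0},
]

print(len(list_unspent(wallet_utxos)))      # 0: default hides unconfirmed UTXOs
print(len(list_unspent(wallet_utxos, 0)))   # 2: argument 0 includes them
```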
Create a new block to confirm the transaction above (takes less than a second) and clear the shell variable.
Simple Raw Transaction
The raw transaction RPCs allow users to create custom transactions and delay broadcasting those transactions. However, mistakes made in raw transactions may not be detected by Bitcoin Core, and a number of raw transaction users have permanently lost large numbers of satoshis, so please be careful using raw transactions on mainnet.
This subsection covers one of the simplest possible raw transactions.
Re-run listunspent. We now have three UTXOs: the two outputs we created before plus the coinbase transaction from block #2. We save the txid and output index number (vout) of that coinbase UTXO to shell variables.
Get a new address to use in the raw transaction.
Using two arguments to the createrawtransaction RPC, we create a new raw format transaction. The first argument (a JSON array) references the txid of the coinbase transaction from block #2 and the index number (0) of the output from that transaction we want to spend. The second argument (a JSON object) creates the output with the address (public key hash) and number of bitcoins we want to transfer.
We save the resulting raw format transaction to a shell variable.
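The shape of the two createrawtransaction arguments described above can be sketched as follows. The helper name is hypothetical, the txid is a placeholder, and the address is simply the example address used elsewhere in this document; the code only builds the JSON strings and does not call the RPC.

```python
import json

def create_raw_tx_args(utxo_txid, utxo_vout, address, amount):
    """Return the (inputs, outputs) JSON strings for createrawtransaction."""
    inputs = [{"txid": utxo_txid, "vout": utxo_vout}]   # first argument: JSON array
    outputs = {address: amount}                         # second argument: JSON object
    return json.dumps(inputs), json.dumps(outputs)

UTXO_TXID = "cd" * 32                                   # placeholder txid
NEW_ADDRESS = "mjSk1Ny9spzU2fouzYgLqGUD8U41iR35QN"      # stand-in for $NEW_ADDRESS

inputs_json, outputs_json = create_raw_tx_args(UTXO_TXID, 0, NEW_ADDRESS, 49.9999)
print(inputs_json)
print(outputs_json)
```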
Warning: createrawtransaction does not automatically create change outputs, so you can easily accidentally pay a large transaction fee. In this example, our input had 50.0000 bitcoins and our output ($NEW_ADDRESS) is being paid 49.9999 bitcoins, so the transaction will include a fee of 0.0001 bitcoins. If we had paid $NEW_ADDRESS only 10 bitcoins with no other changes to this transaction, the transaction fee would be a whopping 40 bitcoins. See the Complex Raw Transaction subsection below for how to create a transaction with multiple outputs so you can send the change back to yourself.
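The fee in the warning above is simply whatever the inputs hold that the outputs do not claim. A minimal sketch of that arithmetic, using decimal strings to avoid floating-point rounding (the function names are illustrative):

```python
from decimal import Decimal

def implied_fee(input_amounts, output_amounts):
    """Fee = sum of inputs minus sum of outputs (amounts given as strings)."""
    fee = sum(map(Decimal, input_amounts)) - sum(map(Decimal, output_amounts))
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee

def change_amount(input_amounts, payment, fee):
    """Value a change output needs so that only `fee` is left unclaimed."""
    return sum(map(Decimal, input_amounts)) - Decimal(payment) - Decimal(fee)

print(implied_fee(["50.0000"], ["49.9999"]))            # 0.0001
print(implied_fee(["50.0000"], ["10"]))                 # 40.0000
print(change_amount(["50.0000"], "10", "0.0001"))       # 39.9999
```

Running a check like this before signing is one way to notice a missing change output.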
Use the decoderawtransaction RPC to see exactly what the transaction we just created does.
Use the signrawtransaction RPC to sign the transaction created by createrawtransaction and save the returned “hex” raw format signed transaction to a shell variable.
Even though the transaction is now complete, the Bitcoin Core node we’re connected to doesn’t know anything about the transaction, nor does any other part of the network. We’ve created a spend, but we haven’t actually spent anything because we could simply unset the $SIGNED_RAW_TX variable to eliminate the transaction.
Send the signed transaction to the connected node using the sendrawtransaction RPC. After accepting the transaction, the node would usually then broadcast it to other peers, but we’re not currently connected to other peers because we started in regtest mode.
Generate a block to confirm the transaction and clear our shell variables.
Complex Raw Transaction
In this example, we’ll create a transaction with two inputs and two outputs. We’ll sign each of the inputs separately, as might happen if the two inputs belonged to different people who agreed to create a transaction together (such as a CoinJoin transaction).
For our two inputs, we select two UTXOs by placing the txid and output index numbers (vouts) in shell variables. We also save the addresses corresponding to the public keys (hashed or unhashed) used in those transactions. We need the addresses so we can get the corresponding private keys from our wallet.
Use the dumpprivkey RPC to get the private keys corresponding to the public keys used in the two UTXOs we will be spending. We need the private keys so we can sign each of the inputs separately.
Warning: Users should never manually manage private keys on mainnet. As dangerous as raw transactions are (see warnings above), making a mistake with a private key can be much worse, as in the case of an HD wallet cross-generational key compromise. These examples are to help you learn, not for you to emulate on mainnet.
For our two outputs, get two new addresses.
Create the raw transaction using createrawtransaction much the same as before, except now we have two inputs and two outputs.
Signing the raw transaction with signrawtransaction gets more complicated as we now have three arguments:
- The unsigned raw transaction.
- An empty array. We don’t do anything with this argument in this operation, but some valid JSON must be provided to get access to the later positional arguments.
- The private key we want to use to sign one of the inputs.
The result is a raw transaction with only one input signed; the fact that the transaction isn’t fully signed is indicated by the value of the complete JSON field. We save the incomplete, partly-signed raw transaction hex to a shell variable.
To sign the second input, we repeat the process we used to sign the first input using the second private key. Now that both inputs are signed, the complete result is true.
Clean up the shell variables used. Unlike previous subsections, we’re not going to send this transaction to the connected node with sendrawtransaction. This will allow us to illustrate in the Offline Signing subsection below how to spend a transaction which is not yet in the block chain or memory pool.
Offline Signing
We will now spend the transaction created in the Complex Raw Transaction subsection above without sending it to the local node first. This is the same basic process used by wallet programs for offline signing—which generally means signing a transaction without access to the current UTXO set.
Offline signing is safe. However, in this example we will also be spending an output which is not part of the block chain because the transaction containing it has never been broadcast. That can be unsafe:
Warning: Transactions which spend outputs from unconfirmed transactions are vulnerable to transaction malleability. Be sure to read about transaction malleability and adopt good practices before spending unconfirmed transactions on mainnet.
Put the previously signed (but not sent) transaction into a shell variable.
Decode the signed raw transaction so we can get its txid. Also, choose a specific one of its UTXOs to spend and save that UTXO’s output index number (vout) and hex pubkey script (scriptPubKey) into shell variables.
Get a new address to spend the satoshis to.
Create the raw transaction the same way we’ve done in the previous subsections.
Attempt to sign the raw transaction without any special arguments, the way we successfully signed the raw transaction in the Simple Raw Transaction subsection. If you’ve read the Transaction section of the guide, you may know why the call fails and leaves the raw transaction hex unchanged.
As illustrated above, the data that gets signed includes the txid and vout from the previous transaction. That information is included in the createrawtransaction raw transaction. But the data that gets signed also includes the pubkey script from the previous transaction, even though it doesn’t appear in either the unsigned or signed transaction.
In the other raw transaction subsections above, the previous output was part of the UTXO set known to the wallet, so the wallet was able to use the txid and output index number to find the previous pubkey script and insert it automatically.
In this case, you’re spending an output which is unknown to the wallet, so it can’t automatically insert the previous pubkey script.
Successfully sign the transaction by providing the previous pubkey script and other required input data.
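The extra input data mentioned above is passed as the second positional argument of signrawtransaction: an array of objects naming each previous output and its pubkey script. The sketch below only assembles that parameter list; the helper name is hypothetical and every hex value is a placeholder, not data from a real transaction.

```python
import json

def offline_sign_params(raw_tx_hex, txid, vout, script_pubkey_hex, private_keys=None):
    """Build the positional parameters for signing an output the wallet cannot look up."""
    prevtxs = [{
        "txid": txid,                       # transaction containing the output
        "vout": vout,                       # index of the output being spent
        "scriptPubKey": script_pubkey_hex,  # previous pubkey script, supplied by hand
    }]
    params = [raw_tx_hex, prevtxs]
    if private_keys is not None:
        params.append(list(private_keys))   # optional third argument
    return params

params = offline_sign_params(
    raw_tx_hex="00" * 10,                                   # placeholder raw transaction
    txid="ef" * 32,                                         # placeholder txid
    vout=0,
    script_pubkey_hex="76a914" + "00" * 20 + "88ac",        # placeholder P2PKH script
)
print(json.dumps(params, indent=2))
```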
This specific operation is typically what offline signing wallets do. The online wallet creates the raw transaction and gets the previous pubkey scripts for all the inputs. The user brings this information to the offline wallet. After displaying the transaction details to the user, the offline wallet signs the transaction as we did above. The user takes the signed transaction back to the online wallet, which broadcasts it.
Attempt to broadcast the second transaction before we’ve broadcast the first transaction. The node rejects this attempt because the second transaction spends an output which is not a UTXO the node knows about.
Broadcast the first transaction, which succeeds, and then broadcast the second transaction—which also now succeeds because the node now sees the UTXO.
We have once again not generated an additional block, so the transactions above have not yet become part of the regtest block chain. However, they are part of the local node’s memory pool.
Remove old shell variables.
P2SH Multisig
In this subsection, we will create a P2SH multisig address, spend satoshis to it, and then spend those satoshis from it to another address.
Creating a multisig address is easy. Multisig outputs have two parameters, the minimum number of signatures required (m) and the number of public keys to use to validate those signatures (n). This is called m-of-n, and in this case we’ll be using 2-of-3.
Generate three new P2PKH addresses. P2PKH addresses cannot be used with the multisig redeem script created below. (Hashing each public key is unnecessary anyway—all the public keys are protected by a hash when the redeem script is hashed.) However, Bitcoin Core uses addresses as a way to reference the underlying full (unhashed) public keys it knows about, so we get the three new addresses above in order to use their public keys.
Recall from the Guide that the hashed public keys used in addresses obfuscate the full public key, so you cannot give an address to another person or device as part of creating a typical multisig output or P2SH multisig redeem script. You must give them a full public key.
Use the validateaddress RPC to display the full (unhashed) public key for one of the addresses. This is the information which will actually be included in the multisig redeem script. This is also the information you would give another person or device as part of creating a multisig output or P2SH multisig redeem script.
We save the address returned to a shell variable.
Use the createmultisig RPC with two arguments, the number (m) of signatures required and a list of addresses or public keys. Because P2PKH addresses can’t be used in the multisig redeem script created by this RPC, the only addresses which can be provided are those belonging to a public key in the wallet. In this case, we provide two addresses and one public key, all of which will be converted to public keys in the redeem script.
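To make the structure of the redeem script concrete, here is a hedged sketch that assembles an m-of-n multisig script by hand: the opcode for m, a push of each full public key, the opcode for n, and OP_CHECKMULTISIG. The public keys below are placeholder byte strings, not real keys, and the sketch does not hash the script into a P2SH address or enforce standardness limits.

```python
OP_CHECKMULTISIG = 0xAE

def small_int_opcode(value):
    """Opcode for the small integers 1 through 16 (OP_1 is 0x51)."""
    if not 1 <= value <= 16:
        raise ValueError("value must be between 1 and 16")
    return 0x50 + value

def multisig_redeem_script(m, pubkeys):
    """Serialize: OP_m <pubkey 1> ... <pubkey n> OP_n OP_CHECKMULTISIG."""
    if not 1 <= m <= len(pubkeys):
        raise ValueError("m must be between 1 and the number of keys")
    script = bytearray([small_int_opcode(m)])
    for pubkey in pubkeys:
        if len(pubkey) not in (33, 65):
            raise ValueError("expected a full compressed or uncompressed key")
        script.append(len(pubkey))   # single-byte push of the key's length
        script += pubkey
    script.append(small_int_opcode(len(pubkeys)))
    script.append(OP_CHECKMULTISIG)
    return bytes(script)

# Three placeholder 33-byte "public keys" (NOT valid keys).
placeholder_pubkeys = [bytes([0x02]) + bytes([i]) * 32 for i in (1, 2, 3)]
redeem_script = multisig_redeem_script(2, placeholder_pubkeys)
print(redeem_script.hex())
```

Note that the order of the keys changes the bytes of the script, which is why the same order must be used to recreate it.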
The P2SH address is returned along with the redeem script which must be provided when we spend satoshis sent to the P2SH address.
Warning: You must not lose the redeem script, especially if you don’t have a record of which public keys you used to create the P2SH multisig address. You need the redeem script to spend any bitcoins sent to the P2SH address. If you lose the redeem script, you can recreate it by running the same command above, with the public keys listed in the same order. However, if you lose both the redeem script and even one of the public keys, you will never be able to spend satoshis sent to that P2SH address.
Neither the address nor the redeem script is stored in the wallet when you use createmultisig. To store them in the wallet, use the addmultisigaddress RPC instead. If you add an address to the wallet, you should also make a new backup.
Paying the P2SH multisig address with Bitcoin Core is as simple as paying a more common P2PKH address. Here we use the same command (but different variable) we used in the Simple Spending subsection. As before, this command automatically selects a UTXO, creates a change output to a new one of our P2PKH addresses if necessary, and pays a transaction fee if necessary.
We save that txid to a shell variable as the txid of the UTXO we plan to spend next.
We use the getrawtransaction RPC with the optional second argument (true) to get the decoded transaction we just created with sendtoaddress. We choose one of the outputs to be our UTXO and get its output index number (vout) and pubkey script (scriptPubKey).
We generate a new P2PKH address to use in the output we’re about to create.
We generate the raw transaction the same way we did in the Simple Raw Transaction subsection.
We get the private keys for two of the public keys we used to create the transaction, the same way we got private keys in the Complex Raw Transaction subsection. Recall that we created a 2-of-3 multisig pubkey script, so signatures from two private keys are needed.
Reminder: Users should never manually manage private keys on mainnet. See the warning in the complex raw transaction section.
We make the first signature. The input argument (JSON object) takes the additional redeem script parameter so that it can append the redeem script to the signature script after the two signatures.
The signrawtransaction call used here is nearly identical to the one used above. The only difference is the private key used. Now that the two required signatures have been provided, the transaction is marked as complete.
We send the transaction spending the P2SH multisig output to the local node, which accepts it.
Payment Processing
Payment Protocol
To request payment using the payment protocol, you use an extended (but backwards-compatible) bitcoin: URI. For example:
bitcoin:mjSk1Ny9spzU2fouzYgLqGUD8U41iR35QN\
?amount=0.10\
&label=Example+Merchant\
&message=Order+of+flowers+%26+chocolates\
&r=https://example.com/pay.php/invoice%3Dda39a3ee
The browser, QR code reader, or other program processing the URI opens the spender’s Bitcoin wallet program on the URI. If the wallet program is aware of the payment protocol, it accesses the URL specified in the r parameter, which should provide it with a serialized PaymentRequest served with the MIME type application/bitcoin-paymentrequest.
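As an illustration of how a program might pull the fields out of the example URI above, the following sketch splits it with Python's standard library. The function name is hypothetical, and a real wallet would validate the address and amount rather than just returning strings.

```python
from urllib.parse import parse_qs

def parse_bitcoin_uri(uri):
    """Split a bitcoin: URI into its address and a dict of decoded parameters."""
    scheme, _, rest = uri.partition(":")
    if scheme.lower() != "bitcoin":
        raise ValueError("not a bitcoin: URI")
    address, _, query = rest.partition("?")
    # parse_qs percent-decodes values and turns "+" into spaces.
    params = {name: values[0] for name, values in parse_qs(query).items()}
    return address, params

uri = ("bitcoin:mjSk1Ny9spzU2fouzYgLqGUD8U41iR35QN"
       "?amount=0.10"
       "&label=Example+Merchant"
       "&message=Order+of+flowers+%26+chocolates"
       "&r=https://example.com/pay.php/invoice%3Dda39a3ee")

address, params = parse_bitcoin_uri(uri)
print(address)
print(params["r"])   # the URL a payment-protocol-aware wallet would fetch
```

A wallet that does not understand the r parameter can still fall back to the address and amount, which is what keeps the URI backwards-compatible.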
Resource: Gavin Andresen’s Payment Request Generator generates custom example URIs and payment requests for use with testnet.
PaymentRequest & PaymentDetails
The PaymentRequest is created with data structures built using Google’s Protocol Buffers. BIP70 describes these data structures in the non-sequential way they’re defined in the payment request protocol buffer code, but the text below will describe them in a more linear order using a simple (but functional) Python CGI program. (For brevity and clarity, many normal CGI best practices are not used in this program.)
The full sequence of events is illustrated below, starting with the spender clicking a bitcoin: URI or scanning a bitcoin: QR code.
For the script to use the protocol buffer, you will need a copy of Google’s Protocol Buffer compiler (protoc), which is available in most modern Linux package managers and directly from Google. Non-Google protocol buffer compilers are available for a variety of programming languages. You will also need a copy of the PaymentRequest Protocol Buffer description from the Bitcoin Core source code.
Initialization Code
With the Python code generated by protoc, we can start our simple CGI program.
The startup code above is quite simple, requiring nothing but the epoch (Unix date) time function, the standard out file descriptor, a few functions from the OpenSSL library, and the data structures and functions created by protoc.
Configuration Code
Next, we’ll set configuration settings which will typically only change when the receiver wants to do something differently. The code pushes a few settings into the request (PaymentRequest) and details (PaymentDetails) objects. When we serialize them, PaymentDetails will be contained within the PaymentRequest.
Each line is described below.
pki_type: (optional) tell the receiving wallet program what Public-Key Infrastructure (PKI) type you’re using to cryptographically sign your PaymentRequest so that it can’t be modified by a man-in-the-middle attack.
If you don’t want to sign the PaymentRequest, you can choose a pki_type of none (the default).
If you do choose to sign the PaymentRequest, you currently have two options defined by BIP70: x509+sha1 and x509+sha256. Both options use the X.509 certificate system, the same system used for HTTP Secure (HTTPS). To use either option, you will need a certificate signed by a certificate authority or one of their intermediaries. (A self-signed certificate will not work.)
Each wallet program may choose which certificate authorities to trust, but it’s likely that they’ll trust whatever certificate authorities their operating system trusts. If the wallet program doesn’t have a full operating system, as might be the case for small hardware wallets, BIP70 suggests they use the Mozilla Root Certificate Store. In general, if a certificate works in your web browser when you connect to your webserver, it will work for your PaymentRequests.
network: (optional) tell the spender’s wallet program what Bitcoin network you’re using; BIP70 defines “main” for mainnet (actual payments) and “test” for testnet (like mainnet, but fake satoshis are used). If the wallet program doesn’t run on the network you indicate, it will reject the PaymentRequest.
payment_url: (required) tell the spender’s wallet program where to send the Payment message (described later). This can be a static URL, as in this example, or a variable URL such as https://example.com/pay.py?invoice=123. It should usually be an HTTPS address to prevent man-in-the-middle attacks from modifying the message.
payment_details_version: (optional) tell the spender’s wallet program what version of the PaymentDetails you’re using. As of this writing, the only version is version 1.
x509certificates: (required for signed PaymentRequests) you must provide the public SSL key/certificate corresponding to the private SSL key you’ll use to sign the PaymentRequest. The certificate must be in ASN.1/DER format.
You must also provide any intermediate certificates necessary to link your certificate to the root certificate of a certificate authority trusted by the spender’s software, such as a certificate from the Mozilla root store.
The certificates must be provided in a specific order: the same order used by Apache’s SSLCertificateFile directive and other server software. The figure below shows the certificate chain of the www.bitcoin.org X.509 certificate and how each certificate (except the root certificate) would be loaded into the X509Certificates protocol buffer message.
To be specific, the first certificate provided must be the X.509 certificate corresponding to the private SSL key which will make the signature, called the leaf certificate. Any intermediate certificates necessary to link that signed public SSL key to the root certificate (the certificate authority) are attached separately, with each certificate in DER format bearing the signature of the certificate that follows it all the way to (but not including) the root certificate.
(Required for signed PaymentRequests) you will need a private SSL key in a format your SSL library supports (DER format is not required). In this program, we’ll load it from a PEM file. (Embedding your passphrase in your CGI code, as done here, is obviously a bad idea in real life.)
The private SSL key will not be transmitted with your request. We’re only loading it into memory here so we can use it to sign the request later.
Code Variables
Now let’s look at the variables your CGI program will likely set for each payment.
Each line is described below.
amount: (optional) the amount you want the spender to pay. You’ll probably get this value from your shopping cart application or fiat-to-BTC exchange rate conversion tool. If you leave the amount blank, the wallet program will prompt the spender how much to pay (which can be useful for donations).
script: (required) You must specify the pubkey script you want the spender to pay; any valid pubkey script is acceptable. In this example, we’ll request payment to a P2PKH pubkey script.
First we get a pubkey hash. The hash above is the hash form of the address used in the URI examples throughout this section, mjSk1Ny9spzU2fouzYgLqGUD8U41iR35QN.
Next, we plug that hash into the standard P2PKH pubkey script using hex, as illustrated by the code comments.
Finally, we convert the pubkey script from hex into its serialized form.
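The three steps above (obtain the pubkey hash, wrap it in the standard P2PKH opcodes, convert from hex to bytes) can be sketched end to end. This version derives the hash by Base58Check-decoding the example address rather than hard-coding it; the function names are illustrative and the decoder assumes an ordinary 25-byte P2PKH address.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def address_to_pubkey_hash(address):
    """Base58Check-decode a P2PKH address into (version byte, 20-byte hash)."""
    number = 0
    for char in address:
        number = number * 58 + B58_ALPHABET.index(char)
    raw = number.to_bytes(25, "big")           # version + hash + 4-byte checksum
    payload, checksum = raw[:21], raw[21:]
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    if checksum != expected:
        raise ValueError("address checksum mismatch")
    return payload[0], payload[1:]

def p2pkh_script(pubkey_hash):
    """OP_DUP OP_HASH160 <push 20 bytes> <hash> OP_EQUALVERIFY OP_CHECKSIG."""
    if len(pubkey_hash) != 20:
        raise ValueError("pubkey hash must be 20 bytes")
    return bytes.fromhex("76a914") + pubkey_hash + bytes.fromhex("88ac")

version, pubkey_hash = address_to_pubkey_hash("mjSk1Ny9spzU2fouzYgLqGUD8U41iR35QN")
script = p2pkh_script(pubkey_hash)             # serialized form for the output
print(hex(version), script.hex())
```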
outputs: (required) add the pubkey script and (optional) amount to the PaymentDetails outputs array.
It’s possible to specify multiple scripts and amounts as part of a merge avoidance strategy, described later in the Merge Avoidance subsection. However, effective merge avoidance is not possible under the base BIP70 rules in which the spender pays each script the exact amount specified by its paired amount. If the amounts are omitted from all amount/script pairs, the spender will be prompted to choose an amount to pay.
memo: (optional) add a memo which will be displayed to the spender as plain UTF-8 text. Embedded HTML or other markup will not be processed.
merchant_data: (optional) add arbitrary data which should be sent back to the receiver when the invoice is paid. You can use this to track your invoices, although you can more reliably track payments by generating a unique address for each payment and then tracking when it gets paid.
The memo field and the merchant_data field can be arbitrarily long, but if you make them too long, you’ll run into the 50,000 byte limit on the entire PaymentRequest, which includes the often several kilobytes given over to storing the certificate chain. As will be described in a later subsection, the memo field can be used by the spender after payment as part of a cryptographically-proven receipt.
Derivable Data
Next, let’s look at some information your CGI program can automatically derive.
Each line is described below.
time: (required) PaymentRequests must indicate when they were created in number of seconds elapsed since 1970-01-01T00:00 UTC (Unix epoch time format).
expires: (optional) the PaymentRequest may also set an expires time after which it’s no longer valid. You probably want to give receivers the ability to configure the expiration time delta; here we used the reasonable choice of 10 minutes. If this request is tied to an order total based on a fiat-to-satoshis exchange rate, you probably want to base this on a delta from the time you got the exchange rate.
serialized_payment_details: (required) we’ve now set everything we need to create the PaymentDetails, so we’ll use the SerializeToString function from the protocol buffer code to store the PaymentDetails in the appropriate field of the PaymentRequest.
pki_data: (required for signed PaymentRequests) serialize the certificate chain PKI data and store it in the PaymentRequest.
We’ve filled out everything in the PaymentRequest except the signature, but before we sign it, we have to initialize the signature field by setting it to a zero-byte placeholder.
signature: (required for signed PaymentRequests) now we make the signature by signing the completed and serialized PaymentRequest. We’ll use the private key we stored in memory in the configuration section and the same hashing formula we specified in pki_type (sha256 in this case).
Output Code
Now that we have the PaymentRequest all filled out, we can serialize it and send it along with the HTTP headers, as shown in the code below.
(Required) BIP71 defines the content types for PaymentRequests, Payments, and PaymentACKs.
request: (required) now, to finish, we just dump out the serialized PaymentRequest (which contains the serialized PaymentDetails). The serialized data is in binary, so we can’t use Python’s print() because it would add an extraneous newline.
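The output step can be sketched as a small function that writes the header block and then the raw bytes, with no trailing newline. This is a Python 3 illustration under assumptions: the function name is hypothetical, only the Content-Type header named in this section is sent, and the body is a placeholder rather than a real serialized PaymentRequest.

```python
import io

CONTENT_TYPE = "application/bitcoin-paymentrequest"

def write_payment_request(out, serialized_request):
    """Write CGI headers followed by the binary body to a bytes stream."""
    headers = "Content-Type: %s\r\n\r\n" % CONTENT_TYPE
    out.write(headers.encode("ascii"))
    out.write(serialized_request)      # written as-is: no newline is appended

demo = io.BytesIO()                    # stand-in for the real output stream
write_payment_request(demo, b"\x08\x01")   # placeholder bytes only
print(demo.getvalue())
```

In an actual Python 3 CGI script the stream would be sys.stdout.buffer and the body would come from the request object's SerializeToString call.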
The following screenshot shows how the authenticated PaymentDetails created by the program above appears in the GUI from Bitcoin Core 0.9.
P2P Network
Creating A Bloom Filter
In this section, we’ll use variable names that correspond to the field names in the filterload message documentation.
Each code block precedes the paragraph describing it.
We start by setting some maximum values defined in BIP37: the maximum number of bytes allowed in a filter and the maximum number of hash functions used to hash each piece of data. We also set nFlags to zero, indicating we don’t want the remote node to update the filter for us. (We won’t use nFlags again in the sample program, but real programs will need to use it.)
We define the number (n) of elements we plan to insert into the filter and the false positive rate (p) we want to help protect our privacy. For this example, we will set n to one element and p to a rate of 1-in-10,000 to produce a small and precise filter for illustration purposes. In actual use, your filters will probably be much larger.
Using the formula described in BIP37, we calculate the ideal size of the filter (in bytes) and the ideal number of hash functions to use. Both are truncated down to the nearest whole number and both are also constrained to the maximum values we defined earlier. The results of this particular fixed computation are 2 filter bytes and 11 hash functions. We then use nFilterBytes to create a little-endian bit array of the appropriate size.
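The sizing computation described above can be reproduced in a few lines. This sketch uses the BIP37 formulas with the limits of 36,000 filter bytes and 50 hash functions; the function name is illustrative, while the variable names follow the filterload field names.

```python
from math import log

MAX_FILTER_BYTES = 36000   # BIP37 limit on filter size in bytes
MAX_HASH_FUNCS = 50        # BIP37 limit on the number of hash functions

def bloom_parameters(n, p):
    """Return (nFilterBytes, nHashFuncs) for n elements and false positive rate p."""
    n_filter_bytes = int(min((-1 / log(2) ** 2 * n * log(p)) / 8, MAX_FILTER_BYTES))
    n_hash_funcs = int(min(n_filter_bytes * 8 / n * log(2), MAX_HASH_FUNCS))
    return n_filter_bytes, n_hash_funcs

print(bloom_parameters(1, 0.0001))   # (2, 11)
```

Before truncation the two values are roughly 2.4 bytes and 11.1 hash functions, which is where the 2 and 11 in the text come from.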
We also should choose a value for nTweak. In this case, we’ll simply use zero.
We set up our hash function template using the formula and the 0xfba4c795 constant defined in BIP37. Note that we limit the size of the seed to four bytes and that we return the result of the hash modulo the size of the filter in bits.
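As a rough illustration of the hash function template, the sketch below uses a pure-Python 32-bit MurmurHash3 (the hash BIP37 specifies) so that it has no external dependencies. The helper names rotl32, murmur3_32, and bloom_hash are assumptions of this sketch, and the 32-byte element is a placeholder rather than a real TXID.

```python
def rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & 0xffffffff

def murmur3_32(data: bytes, seed: int) -> int:
    """32-bit MurmurHash3 of data with the given seed."""
    c1, c2 = 0xcc9e2d51, 0x1b873593
    h = seed & 0xffffffff
    full = len(data) - len(data) % 4
    for i in range(0, full, 4):            # body: 4-byte little-endian blocks
        k = int.from_bytes(data[i:i + 4], "little")
        k = rotl32((k * c1) & 0xffffffff, 15)
        k = (k * c2) & 0xffffffff
        h = rotl32(h ^ k, 13)
        h = (h * 5 + 0xe6546b64) & 0xffffffff
    tail = data[full:]
    if tail:                               # tail: the remaining 1 to 3 bytes
        k = int.from_bytes(tail, "little")
        k = rotl32((k * c1) & 0xffffffff, 15)
        k = (k * c2) & 0xffffffff
        h ^= k
    h ^= len(data)                         # finalization mix
    h ^= h >> 16
    h = (h * 0x85ebca6b) & 0xffffffff
    h ^= h >> 13
    h = (h * 0xc2b2ae35) & 0xffffffff
    h ^= h >> 16
    return h

def bloom_hash(nHashNum: int, data: bytes, nTweak: int = 0, nFilterBytes: int = 2) -> int:
    """Bit index selected by hash function number nHashNum."""
    seed = (nHashNum * 0xfba4c795 + nTweak) & 0xffffffff  # seed limited to four bytes
    return murmur3_32(data, seed) % (nFilterBytes * 8)    # modulo filter size in bits

element = bytes(range(32))  # placeholder 32-byte element, not a real TXID
indexes = [bloom_hash(i, element) for i in range(11)]
```

Each value of nHashNum yields a different seed, so the same element is mapped to up to nHashFuncs distinct bit positions.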
For the data to add to the filter, we’re adding a TXID. Note that the TXID is in internal byte order.
Now we use the hash function template to run a slightly different hash function for nHashFuncs times. The result of each function being run on the transaction is used as an index number: the bit at that index is set to 1. We can see this in the printed debugging output:
Notice that in iterations 8 and 9, the filter did not change because the corresponding bit was already set in a previous iteration (5 and 7, respectively). This is a normal part of bloom filter operation.
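The bit-setting step can be sketched in isolation. In the sketch below, the index list is a set of hypothetical values standing in for the output of the eleven hash functions (it is chosen so that iterations 8 and 9 repeat earlier indexes), and set_filter_bits is a name we made up for illustration.

```python
def set_filter_bits(vData: bytearray, indexes) -> list:
    """Set one bit per index; return the iterations that changed nothing."""
    unchanged = []
    for nHashNum, nIndex in enumerate(indexes):
        before = bytes(vData)
        # Little-endian bit array: bit i lives in byte i // 8 at position i % 8.
        vData[nIndex >> 3] |= 1 << (nIndex & 7)
        if bytes(vData) == before:
            unchanged.append(nHashNum)
        print("{0:>2}  {1:#x}  {2}".format(nHashNum, nIndex, vData.hex()))
    return unchanged

# Hypothetical indexes, one per hash function (assumed values, not computed here).
example_indexes = [0x7, 0x9, 0xa, 0x2, 0xb, 0x5, 0x0, 0x8, 0x5, 0x8, 0x4]
vData = bytearray(2)
unchanged = set_filter_bits(vData, example_indexes)
```

With these example indexes the filter ends up as the two bytes b5 0f, and the returned list identifies the iterations whose bit was already set.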
We only added one element to the filter above, but we could repeat the process with additional elements and continue to add them to the same filter. (To maintain the same false-positive rate, you would need a larger filter size as computed earlier.)
Note: for a more optimized Python implementation with fewer external dependencies, see python-bitcoinlib’s bloom filter module which is based directly on Bitcoin Core’s C++ implementation.
Using the filterload message format, the complete filter created above would be the binary form of the annotated hexdump shown below:
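To make the field layout concrete, here is a hedged sketch that packs a filterload payload: a compactSize byte count, the filter bytes, then nHashFuncs and nTweak as little-endian 32-bit integers and nFlags as a single byte. The function names and the example filter bytes (b5 0f) are assumptions for illustration.

```python
import struct

def compact_size(n: int) -> bytes:
    """Encode n as a Bitcoin compactSize unsigned integer."""
    if n < 0xfd:
        return struct.pack("<B", n)
    if n <= 0xffff:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xffffffff:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)

def filterload_payload(vData: bytes, nHashFuncs: int, nTweak: int, nFlags: int) -> bytes:
    """Serialize the filterload fields in message order."""
    return (compact_size(len(vData)) + bytes(vData)
            + struct.pack("<IIB", nHashFuncs, nTweak, nFlags))

# Example values assumed for illustration: a 2-byte filter and 11 hash functions.
payload = filterload_payload(bytes.fromhex("b50f"), 11, 0, 0)
print(payload.hex())
```

The payload still needs the standard 24-byte message header before it can be sent to a peer.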
Evaluating A Bloom Filter
Using a bloom filter to find matching data is nearly identical to constructing a bloom filter—except that at each step we check to see if the calculated index bit is set in the existing filter.
Using the bloom filter created above, we import its various parameters. Note, as indicated in the section above, we won’t actually use nFlags to update the filter.
We define a function to check an element against the provided filter. When checking whether the filter might contain an element, we test to see whether a particular bit in the filter is already set to 1 (if it isn’t, the match fails).
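The bit test at the heart of that function might look like the sketch below. To keep it short it takes the bit indexes directly, on the assumption that they were produced by the same hash function template used during construction. The filter bytes, the index lists, and the name filter_might_contain are all illustrative assumptions.

```python
def filter_might_contain(vData: bytes, indexes) -> bool:
    """Return False as soon as any required bit is unset; True otherwise."""
    for nHashNum, nIndex in enumerate(indexes):
        # Same little-endian bit addressing as when the filter was built.
        if not (vData[nIndex >> 3] & (1 << (nIndex & 7))):
            print("MATCH FAILURE: index {0:#x} not set (hash function {1})".format(
                nIndex, nHashNum))
            return False
    return True  # possible match (could still be a false positive)

vData = bytes.fromhex("b50f")  # example filter, assumed for illustration

# Indexes of an element that was inserted: every bit is set, so no output.
inserted = filter_might_contain(vData, [0x7, 0x9, 0xa, 0x2, 0xb, 0x5, 0x0, 0x8, 0x5, 0x8, 0x4])

# Indexes of some other element: bit 0x6 is not set, so the match fails.
other = filter_might_contain(vData, [0x0, 0x6])
```

A True result only means the element may be in the set; a False result is definitive.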
Testing the filter against the data element we previously added, we get no output (indicating a possible match). Recall that bloom filters have a zero false negative rate—so they should always match the inserted elements.
Testing the filter against an arbitrary element, we get the failure output below. Note: we created the filter with a 1-in-10,000 false positive rate (which was rounded up somewhat when we truncated), so it was possible this arbitrary string would’ve matched the filter anyway. It is not possible to set a bloom filter to a false positive rate of zero, so your program will always have to deal with false positives. The output below shows us that one of the hash functions returned an index number of 0x06, but that bit wasn’t set in the filter, causing the match failure:
Retrieving A MerkleBlock
For the merkleblock message documentation on the reference page, an actual merkle block was retrieved from the network and manually processed. This section walks through each step of the process, demonstrating basic network communication and merkle block processing.
To connect to the P2P network, the trivial Python function above was developed to compute message headers and send payloads decoded from hex.
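A function of that kind can be sketched as follows, assuming Python 3 and the mainnet start string f9beb4d9. The name build_message is our own; the header layout (start string, 12-byte null-padded command, little-endian payload length, first four bytes of the double SHA-256 of the payload) follows the P2P message header format.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # use the testnet value on testnet

def build_message(command: str, payload_hex: str) -> bytes:
    """Return header + payload for one P2P message."""
    payload = bytes.fromhex(payload_hex)
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    header = (MAINNET_MAGIC
              + command.encode("ascii").ljust(12, b"\x00")  # null-padded command
              + struct.pack("<I", len(payload))             # payload size
              + checksum)
    return header + payload

verack = build_message("verack", "")  # a message with an empty payload
print(verack.hex())
```

Because the output is raw bytes, a script using it should write to a binary stream (such as sys.stdout.buffer) when piping to another program.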
Peers on the network will not accept any requests until you send them a version message. The receiving node will reply with their version message and a verack message.
We’re not going to validate their version message with this simple script, but we will sleep briefly and then send back our own verack message as if we had accepted their version message.
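For readers who want to see what goes into a version payload, the following is a minimal sketch under several assumptions: protocol version 70001, no advertised services, placeholder loopback addresses, an empty user agent, and relay disabled so that the peer waits for our filter before sending transactions. The helper names are hypothetical.

```python
import os
import struct
import time

def net_addr(services: int, ipv4: str, port: int) -> bytes:
    """Network address as used inside a version message (no timestamp)."""
    ip_bytes = bytes(10) + b"\xff\xff" + bytes(int(x) for x in ipv4.split("."))
    return struct.pack("<Q", services) + ip_bytes + struct.pack(">H", port)

def version_payload(protocol_version=70001, services=0, user_agent=b"",
                    start_height=0, relay=False) -> bytes:
    """Serialize the version message fields in order."""
    if len(user_agent) >= 0xfd:
        raise ValueError("this sketch only handles short user agents")
    return (struct.pack("<iQq", protocol_version, services, int(time.time()))
            + net_addr(0, "127.0.0.1", 8333)        # addr_recv (placeholder)
            + net_addr(0, "127.0.0.1", 8333)        # addr_from (placeholder)
            + os.urandom(8)                         # random nonce
            + bytes([len(user_agent)]) + user_agent # one-byte length + string
            + struct.pack("<i?", start_height, relay))

payload = version_payload()
```

After sending this payload with a version header, the script would pause (for example with time.sleep) and then send a verack message with an empty payload.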
We set a bloom filter with the filterload message. This filter is described in the two preceding sections.
We request a merkle block for transactions matching our filter, completing our script.
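The request itself is a getdata message whose single inventory entry has type MSG_FILTERED_BLOCK (3). A minimal sketch of building that payload follows; the function name is an assumption and the block hash is a placeholder, not a real block.

```python
import struct

MSG_FILTERED_BLOCK = 3  # inventory type that asks for a merkleblock reply

def getdata_merkleblock_payload(block_hash_rpc_hex: str) -> bytes:
    """getdata payload requesting one filtered block."""
    # Hashes shown by RPCs are byte-reversed relative to internal byte order.
    block_hash = bytes.fromhex(block_hash_rpc_hex)[::-1]
    if len(block_hash) != 32:
        raise ValueError("block hash must be 32 bytes")
    return (b"\x01"                                  # inventory count (compactSize)
            + struct.pack("<I", MSG_FILTERED_BLOCK)  # inventory type
            + block_hash)

# Placeholder hash for illustration only.
payload = getdata_merkleblock_payload("00" * 31 + "ff")
```

The peer replies with a merkleblock message, followed by tx messages for any matched transactions it has not already sent.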
To run the script, we simply pipe it to the Unix netcat command or one of its many clones, one of which is available for practically any platform. For example, with the original netcat and using hexdump (hd) to display the output:
Part of the response is shown in the section below.
Parsing A MerkleBlock
In the section above, we retrieved a merkle block from the network; now we will parse it. Most of the block header has been omitted. For a more complete hexdump, see the example in the merkleblock message section.
We parse the above merkleblock message using the following instructions. Each illustration is described in the paragraph below it.
We start by building the structure of a merkle tree based on the number of transactions in the block.
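The shape of the tree follows directly from the transaction count: each row has half as many nodes as the row below it, rounded up. A small sketch, with tree_widths as an illustrative name and seven transactions as an assumed example count:

```python
def tree_widths(n_tx: int) -> list:
    """Node count of each row, from the TXID row up to the merkle root."""
    widths = [n_tx]
    while widths[-1] > 1:
        # An odd row pairs its last node with itself, so round up.
        widths.append((widths[-1] + 1) // 2)
    return widths

widths = tree_widths(7)  # seven transactions, assumed for illustration
print(widths)
```

A block with seven transactions therefore has rows of 7, 4, 2, and 1 nodes, giving the three levels of descent used in the walk-through.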
The first flag is a 1 and the merkle root is (as always) a non-TXID node, so we will need to compute the hash later based on this node’s children. Accordingly, we descend into the merkle root’s left child and look at the next flag for instructions.
The next flag in the example is a 0 and this is also a non-TXID node, so we apply the first hash from the merkleblock message to this node. We also don’t process any child nodes: according to the peer which created the merkleblock message, none of those nodes will lead to TXIDs of transactions that match our filter, so we don’t need them. We go back up to the merkle root and then descend into its right child and look at the next (third) flag for instructions.
The third flag in the example is another 1 on another non-TXID node, so we descend into its left child.
The fourth flag is also a 1 on another non-TXID node, so we descend again—we will always continue descending until we reach a TXID node or a non-TXID node with a 0 flag (or we finish filling out the tree).
Finally, on the fifth flag in the example (a 1), we reach a TXID node. The 1 flag indicates this TXID’s transaction matches our filter and that we should take the next (second) hash and use it as this node’s TXID.
The sixth flag also applies to a TXID, but it’s a 0 flag, so this TXID’s transaction doesn’t match our filter; still, we take the next (third) hash and use it as this node’s TXID.
We now have enough information to compute the hash for the fourth node we encountered—it’s the hash of the concatenated hashes of the two TXIDs we filled out.
Moving to the right child of the third node we encountered, we fill it out using the seventh flag and final hash—and discover there are no more child nodes to process.
We hash as appropriate to fill out the tree. Note that the eighth flag is not used—this is acceptable as it was required to pad out a flag byte.
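The whole walk can be expressed as a short recursive function. The sketch below is our own illustration of the flag and hash rules described above, not the guide's code: the names are hypothetical, the seven TXIDs are placeholders built just for the demo, and the flag byte 0x1d encodes the sequence 1, 0, 1, 1, 1, 0, 0 plus one padding bit.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash used for merkle tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def tree_width(n_tx: int, height: int) -> int:
    """Number of nodes at a given height (height 0 is the TXID row)."""
    return (n_tx + (1 << height) - 1) >> height

def parse_partial_merkle_tree(n_tx, hashes, flag_bytes):
    """Return (computed merkle root, list of matched TXIDs)."""
    height = 0
    while tree_width(n_tx, height) > 1:
        height += 1
    # Flags are packed eight per byte, least significant bit first.
    flags = iter([(byte >> bit) & 1 for byte in flag_bytes for bit in range(8)])
    remaining = iter(hashes)
    matched = []

    def descend(h, pos):
        flag = next(flags)
        if h == 0 or flag == 0:
            node = next(remaining)    # use the next hash for this node
            if h == 0 and flag == 1:
                matched.append(node)  # TXID of a transaction matching the filter
            return node
        left = descend(h - 1, pos * 2)
        if pos * 2 + 1 < tree_width(n_tx, h - 1):
            right = descend(h - 1, pos * 2 + 1)
        else:
            right = left              # no right child: pair the left with itself
        return dsha256(left + right)

    return descend(height, 0), matched

# Demo tree built from seven placeholder TXIDs (internal byte order).
txids = [dsha256(bytes([i])) for i in range(7)]
row1 = [dsha256(txids[0] + txids[1]), dsha256(txids[2] + txids[3]),
        dsha256(txids[4] + txids[5]), dsha256(txids[6] + txids[6])]
row2 = [dsha256(row1[0] + row1[1]), dsha256(row1[2] + row1[3])]
expected_root = dsha256(row2[0] + row2[1])

# Four hashes in the order the walk-through consumes them.
hashes = [row2[0], txids[4], txids[5], row1[3]]
root, matched = parse_partial_merkle_tree(7, hashes, bytes([0x1d]))
```

A production parser would also check that every hash and every non-padding flag was consumed; this sketch omits those checks for brevity.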
The final steps would be to ensure the computed merkle root is identical to the merkle root in the header and check the other steps of the parsing checklist in the merkleblock message section.
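The merkle root comparison can be sketched as below, assuming the standard 80-byte block header layout in which the merkle root occupies bytes 36 through 67 in internal byte order. The function names and the dummy header are illustrative only.

```python
import struct

def header_merkle_root(header: bytes) -> bytes:
    """Extract the 32-byte merkle root from an 80-byte block header."""
    if len(header) != 80:
        raise ValueError("a block header is exactly 80 bytes")
    # Layout: version (4), previous block hash (32), merkle root (32),
    # time (4), nBits (4), nonce (4).
    return header[36:68]

def merkle_root_matches(header: bytes, computed_root: bytes) -> bool:
    """Compare a computed root against the one committed to in the header."""
    return header_merkle_root(header) == computed_root

# Dummy header for illustration: only the merkle root field is meaningful.
dummy_root = b"\x11" * 32
dummy_header = struct.pack("<i32s32sIII", 2, bytes(32), dummy_root, 0, 0, 0)
ok = merkle_root_matches(dummy_header, dummy_root)
```

If the roots differ, the merkleblock message should be rejected, since the supplied hashes do not belong to the block the header describes.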