Learn How to Program Bitcoin from Scratch
Jimmy Song
Editors: Mike Loukides and Michele Cronin
Production Editor: Kristen Brown
Copyeditor: James Fraleigh
Proofreader: Rachel Head
Indexer: Judy McConville
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest
Book Details
Price
|
5.00 USD |
---|---|
Pages
| 321 p |
File Size
|
13,030 KB |
File Type
|
PDF format |
ISBN
| 978-1-492-03149-9 |
Copyright
| 2019 Jimmy Song |
It’s fun to be a science fiction writer. To build a society where wealth is no longer a
mirage erected on the empty promises of governments and manipulations of central
banks, where exchanges of value can be completed among the trustless without paying
a tax to middlemen, where code can be law, where collective decision making is
not subject to the distortions of centralization…all I have to do is to open up a text
editor and start making up stuff.
But compelling stories require more than imagination. They require knowledge of
the world. “Worldbuilding” isn’t about literary verisimilitude or strings of technobabble—
it’s about piercing through the superficial to ask “what if ” questions that get at
the heart of how the world works. The more a writer understands the mechanisms
and codes that make up the world, the more interesting the questions they ask become.
Changing the real world is much, much harder than writing fiction, but it similarly
requires knowledge. Beyond wisdom, idealism, grit, discipline, and single-minded
determination in the face of doubt, a would-be world-changer needs understanding:
of the available tools, their capabilities, and their limits.
The world of Bitcoin and blockchain today is still largely a world of fiction. Pundits
selling hope and hype, with no understanding of the underlying reality, are far louder
and more influential than those who are doing the hard work of bringing about
change. Politically motivated screeds premised on fear and get-rich-quick schemes
appealing to greed pass for knowledge with the help of a sprinkling of technobabble and trending hashtags.
But you can no more understand blockchain by reading whitepapers or thinkpieces
than you can learn to build a company by going to business school and watching
PowerPoints.
You have to code.
There is no better way to understand a technology than to build something useful to
you in it. Until you’ve coded the fundamental building blocks of a blockchain application
with your own hands, you will not be able to intuit the difference between empty
hype and realizable possibility.
This book is the most efficient and comprehensive way to learn about Bitcoin and
blockchain through coding. With great skill and insight, Jimmy Song has crafted a
learning path that will take you from the basic math behind Bitcoin to the latest
extensions and forks. Along the way, the exercises—refined through extensive work
with live students—will not only teach you the mechanics of working with the blockchain,
but also an intuition for the elegance and beauty of this technology.
The journey will not be easy. Even with a teacher as adept as Jimmy to guide you, this
isn’t a book to be flipped through when you’re bored from bingeing on Netflix.
It requires you to put in considerable work to get the most out of it. There is no shortcut,
no CliffsNotes. But that is very much in line with the constitutive principle of
Bitcoin: you must have skin in the game; you must demonstrate proof-of-work.
Only then can you trust your knowledge.
Happy coding!
— Ken Liu
A winner of the Nebula, Hugo, and World Fantasy awards, Ken Liu is the author of The
Dandelion Dynasty, a silkpunk epic fantasy series in which the magic is engineering,
and The Paper Menagerie and Other Stories, a collection. His SF story about blockchain,
“Byzantine Empathy”, was originally published by the MIT Press.
Preface
This book will teach you the technology of Bitcoin at a fundamental level. It doesn’t
cover the monetary, economic, or social dynamics of Bitcoin, but knowing how Bitcoin
works under the hood should give you greater insight into what’s possible.
There’s a tendency to hype Bitcoin and blockchain without really understanding
what’s going on; this book is meant to be an antidote to that tendency.
After all, there are lots of books about Bitcoin, covering the history and the economic
aspects and giving technical descriptions. The aim of this book is to get you to understand
Bitcoin by coding all of the components necessary for a Bitcoin library. The
library is not meant to be exhaustive or efficient. The aim of the library is to help you learn.
Who Is This Book For?
This book is for programmers who want to learn how Bitcoin works by coding it
themselves. You will learn Bitcoin by coding the “bare metal” stuff in a Bitcoin library
you will create from scratch. This is not a reference book where you can look up the
specification for a particular feature.
The material from this book has been largely taken from my two-day seminar where I
teach developers all about Bitcoin. The material has been refined extensively, as I’ve
taught this course more than 20 times, to over 400 people as of this writing.
By the time you’re done with the book, you’ll not only be able to create transactions,
but also get all the data you need from peers and send the transactions over the network.
It covers everything needed to accomplish this, including the math, parsing,
network connectivity, and block validation.
Table of Contents
Foreword. . . . . . . xi
Preface. . .. . . . xiii
1. Finite Fields. . . . . . . . . . 1
Learning Higher-Level Math 1
Finite Field Definition 2
Defining Finite Sets 3
Constructing a Finite Field in Python 3
Exercise 1 5
Modulo Arithmetic 5
Modulo Arithmetic in Python 7
Finite Field Addition and Subtraction 8
Exercise 2 9
Coding Addition and Subtraction in Python 9
Exercise 3 10
Finite Field Multiplication and Exponentiation 10
Exercise 4 11
Exercise 5 11
Coding Multiplication in Python 12
Exercise 6 12
Coding Exponentiation in Python 12
Exercise 7 13
Finite Field Division 13
Exercise 8 15
Exercise 9 16
Redefining Exponentiation 16
Conclusion 17
2. Elliptic Curves. .. . . 19
Definition 19
Coding Elliptic Curves in Python 26
Exercise 1 27
Exercise 2 27
Point Addition 27
Math of Point Addition 31
Coding Point Addition 33
Exercise 3 34
Point Addition for When x1≠x2 35
Exercise 4 36
Coding Point Addition for When x1≠x2 36
Exercise 5 36
Point Addition for When P1 = P2 37
Exercise 6 38
Coding Point Addition for When P1 = P2 38
Exercise 7 39
Coding One More Exception 39
Conclusion 40
3. Elliptic Curve Cryptography. . . 41
Elliptic Curves over Reals 41
Elliptic Curves over Finite Fields 42
Exercise 1 44
Coding Elliptic Curves over Finite Fields 44
Point Addition over Finite Fields 45
Coding Point Addition over Finite Fields 47
Exercise 2 47
Exercise 3 47
Scalar Multiplication for Elliptic Curves 47
Exercise 4 49
Scalar Multiplication Redux 50
Mathematical Groups 51
Identity 51
Closure 52
Invertibility 53
Commutativity 54
Associativity 55
Exercise 5 56
Coding Scalar Multiplication 57
Defining the Curve for Bitcoin 58
Working with secp256k1 60
Public Key Cryptography 61
Signing and Verification 62
Inscribing the Target 63
Verification in Depth 65
Verifying a Signature 66
Exercise 6 67
Programming Signature Verification 67
Signing in Depth 68
Creating a Signature 68
Exercise 7 69
Programming Message Signing 70
Conclusion 72
4. Serialization. . . . . . 73
Uncompressed SEC Format 73
Exercise 1 75
Compressed SEC Format 75
Exercise 2 79
DER Signatures 79
Exercise 3 81
Base58 81
Transmitting Your Public Key 81
Exercise 4 83
Address Format 83
Exercise 5 84
WIF Format 84
Exercise 6 85
Big- and Little-Endian Redux 85
Exercise 7 86
Exercise 8 86
Exercise 9 86
Conclusion 86
5. Transactions. . . . . 87
Transaction Components 87
Version 90
Exercise 1 90
Inputs 90
Parsing Script 95
Exercise 2 96
Outputs 96
Exercise 3 97
Locktime 98
Exercise 4 98
Exercise 5 98
Coding Transactions 99
Transaction Fee 100
Calculating the Fee 102
Exercise 6 102
Conclusion 102
6. Script. . . . . . . . 103
Mechanics of Script 103
How Script Works 105
Example Operations 105
Coding Opcodes 106
Exercise 1 107
Parsing the Script Fields 107
Coding a Script Parser and Serializer 108
Combining the Script Fields 111
Coding the Combined Instruction Set 111
Standard Scripts 111
p2pk 112
Coding Script Evaluation 115
Stack Elements Under the Hood 117
Exercise 2 118
Problems with p2pk 118
Solving the Problems with p2pkh 120
p2pkh 120
Scripts Can Be Arbitrarily Constructed 124
Exercise 3 127
Utility of Scripts 127
Exercise 4 127
SHA-1 Piñata 128
Conclusion 128
7. Transaction Creation and Validation. .. . 129
Validating Transactions 129
Checking the Spentness of Inputs 130
Checking the Sum of the Inputs Versus the Sum of the Outputs 130
Checking the Signature 131
Exercise 1 135
Exercise 2 135
Verifying the Entire Transaction 135
Creating Transactions 136
Constructing the Transaction 136
Making the Transaction 139
Signing the Transaction 141
Exercise 3 141
Creating Your Own Transactions on testnet 141
Exercise 4 142
Exercise 5 142
Conclusion 142
8. Pay-to-Script Hash. . . . 143
Bare Multisig 143
Coding OP_CHECKMULTISIG 148
Exercise 1 148
Problems with Bare Multisig 148
Pay-to-Script-Hash (p2sh) 149
Coding p2sh 156
More Complicated Scripts 157
Addresses 157
Exercise 2 158
Exercise 3 158
p2sh Signature Verification 158
Exercise 4 161
Exercise 5 161
Conclusion 161
9. Blocks. . . . . . . . . 163
Coinbase Transactions 164
Exercise 1 164
ScriptSig 165
BIP0034 165
Exercise 2 166
Block Headers 166
Exercise 3 167
Exercise 4 167
Exercise 5 167
Version 168
Exercise 6 169
Exercise 7 169
Exercise 8 169
Previous Block 169
Merkle Root 169
Timestamp 169
Bits 170
Nonce 170
Proof-of-Work 170
How a Miner Generates New Hashes 171
The Target 172
Exercise 9 173
Difficulty 173
Exercise 10 173
Checking That the Proof-of-Work Is Sufficient 174
Exercise 11 174
Difficulty Adjustment 174
Exercise 12 176
Exercise 13 176
Conclusion 176
10. Networking. . . . . . . . 177
Network Messages 177
Exercise 1 179
Exercise 2 179
Exercise 3 179
Parsing the Payload 179
Exercise 4 181
Network Handshake 181
Connecting to the Network 181
Exercise 5 184
Getting Block Headers 184
Exercise 6 185
Headers Response 185
Conclusion 188
11. Simplified Payment Verification. . . 189
Motivation 189
Merkle Tree 190
Merkle Parent 191
Exercise 1 192
Merkle Parent Level 192
Exercise 2 193
Merkle Root 193
Exercise 3 194
Merkle Root in Blocks 194
Exercise 4 195
Using a Merkle Tree 195
Merkle Block 197
Merkle Tree Structure 199
Exercise 5 199
Coding a Merkle Tree 199
The merkleblock Command 205
Exercise 6 206
Using Flag Bits and Hashes 206
Exercise 7 210
Conclusion 210
12. Bloom Filters. . . . . . 211
What Is a Bloom Filter? 211
Exercise 1 213
Going a Step Further 214
BIP0037 Bloom Filters 215
Exercise 2 216
Exercise 3 216
Loading a Bloom Filter 216
Exercise 4 217
Getting Merkle Blocks 217
Exercise 5 218
Getting Transactions of Interest 218
Exercise 6 220
Conclusion 220
13. Segwit. . . . . . . . . 221
Pay-to-Witness-Pubkey-Hash (p2wpkh) 221
Transaction Malleability 222
Fixing Malleability 222
p2wpkh Transactions 223
p2sh-p2wpkh 226
Coding p2wpkh and p2sh-p2wpkh 231
Pay-to-Witness-Script-Hash (p2wsh) 235
p2sh-p2wsh 239
Coding p2wsh and p2sh-p2wsh 244
Other Improvements 246
Conclusion 246
14. Advanced Topics and Next Steps. . . . 247
Suggested Topics to Study Next 247
Wallets 247
Payment Channels and Lightning Network 248
Contributing 248
Suggested Next Projects 249
Testnet Wallet 249
Block Explorer 249
Web Shop 249
Utility Library 250
Finding a Job 250
Conclusion 250
A. Solutions. . . . . . 251
Index. . . . . 289
What Do I Need to Know?
A prerequisite for this book is that you know programming—Python, in particular.
The library itself is written in Python 3, and a lot of the exercises can be done in a
controlled environment like a Jupyter notebook. An intermediate knowledge of
Python is preferable, but even a beginning knowledge is probably sufficient to get a
lot of the concepts.
Some knowledge of math is required, especially for Chapters 1 and 2. These chapters
introduce mathematical concepts probably not familiar to those who didn’t major in
mathematics. Math knowledge around algebra level should suffice to understand the
new concepts and to code the exercises covered in those chapters.
General computer science knowledge, for example, of hash functions, will come in
handy but is not strictly necessary to complete the exercises in this book.
How Is the Book Arranged?
This book is split into 14 chapters. Each is meant to build on the previous one as the
Bitcoin library gets built from scratch all the way to the end.
Roughly speaking, Chapters 1–4 establish the mathematical tools that we need; Chapters
5–8 cover transactions, which are the fundamental unit of Bitcoin; and Chapters
9–12 cover blocks and networking. The last two chapters cover some advanced topics
but don’t actually require you to write code.
Chapters 1 and 2 cover some math that we need. Finite fields and elliptic curves are
needed to understand elliptic curve cryptography in Chapter 3. After we’ve established
the public key cryptography at the end of Chapter 3, Chapter 4 adds parsing
and serialization, which are how cryptographic primitives are stored and transmitted.
Chapter 5 covers the transaction structure. Chapter 6 goes into the smart contract
language behind Bitcoin, called Script. Chapter 7 builds on all the previous chapters,
showing how to validate and create transactions based on the elliptic curve cryptography
from the first four chapters. Chapter 8 establishes how pay-to-script-hash
(p2sh) works, which is a way to make more powerful smart contracts.
Chapter 9 covers blocks, which are groups of ordered transactions. Chapter 10 covers
network communication in Bitcoin. Chapters 11 and 12 go into how a light client, or
software without access to the entire blockchain, might request and broadcast data to
and from nodes that store the entire blockchain.
Chapter 13 covers Segwit, a backward-compatible upgrade introduced in 2017, and
Chapter 14 provides suggestions for further study. These chapters are not strictly necessary,
but are included as a way to give you a taste of what more there is to learn.
Chapters 1–12 have exercises that require you to build up the library from scratch.
The answers are in Appendix A and in the corresponding chapter directory in the
GitHub repo. You will be writing many Python classes and building toward not just
validating transactions/blocks, but also creating your own transactions and broadcasting
them on the network.
The last exercise in Chapter 12 specifically asks you to connect to another node on
the testnet network, calculate what you can spend, construct and sign a transaction of
your devising, and broadcast that on the network. The first 11 chapters essentially
prepare you for this exercise.
There will be a lot of unit tests that your code will need to pass. The book has been
designed this way so you can do the “fun” part of coding. To aid your progress, we
will be looking at a lot of code and diagrams throughout.