How Bitcoin Works

Introduction

You are probably already familiar with how Bitcoin works at a high level, that it uses proof of work, has miners, etc. But to be able to start building with it, it's important to get an in-depth knowledge of how it works under the hood.

This tutorial is going to serve as an introduction to how Bitcoin operates under the hood. We'll look at how mining works, the hashing algorithm Bitcoin uses, transactions, UTXOs, SPVs, basic public key cryptography and address generation, and more.

We'll go through step by step, looking at how each component works and how they all interact to create the blockchain itself.

This is a pretty long tutorial, but by the end of it, you'll have a solid grasp of how Bitcoin operates and will be in a great place to start building robust decentralized apps with it.

Nodes

The first basic building block of Bitcoin is the network of nodes that comprises it. Remember that the entire point of Bitcoin is that it is a distributed network of computers. These computers are called nodes. A node is just a computer that is running the Bitcoin software and is connected to other nodes doing the same.

Anybody can run a Bitcoin node, and the requirements to do so are relatively low. This is a major part of the reason why Bitcoin can be so decentralized. As blockchain protocols add complexity and speed, the hardware requirements to run a node increase, so fewer people are able to run one.

But why are more nodes good, and what role do they play in Bitcoin?

Nodes are the backbone of Bitcoin and what makes it secure and decentralized. They do three main things:

Follow the rules of the protocol
Broadcast new transactions
Store confirmed transactions

The Bitcoin protocol has pre-defined rules that nodes are required to follow. If they receive a broadcasted transaction that does not follow these rules, they simply ignore it.

If the transaction is legit and follows the rules, they broadcast it to the rest of the network. The transaction gets passed from node to node, and each node holds it in what's called the mempool until it is confirmed and written to a block.

A node's mempool is where it collects all of the new transactions it has received while it waits to package them up into a candidate block.

We'll get more into blocks next, but at a high level, they are a collection of different transactions.

Each mining node will then try to get its block added to the chain through the mining process.

Nodes can also broadcast confirmed transactions, which are transactions that have been verified as legitimate and accurate. These transactions are batched into blocks and broadcasted that way by nodes.

Note that any node can be a miner, but not all nodes have to mine. Some nodes are only responsible for relaying newly created blocks and storing the history of the chain, but they don't participate in the mining process, as this can be costly.

Next, we'll get into the makeup of blocks and how nodes decide which transactions and blocks to include.

Finally, nodes also keep a copy of all confirmed blocks of transactions, which is what is collectively called the blockchain. The fact that all nodes are required to maintain this history of transactions is what makes Bitcoin immutable.

There is a verifiable history of transactions that all nodes agree on, and no single node can change.

Note also that you don't have to be a node to utilize Bitcoin. Anybody that has an address can use Bitcoin and initiate new transactions. If you want to be able to independently verify all transactions and the history of the chain yourself, and contribute to Bitcoin's decentralization, you should run a node, but you don't need one just to use Bitcoin.

Addresses and the public key cryptography that generates them are separate topics, and we'll cover them later. For now, just know that a Bitcoin address corresponds to what can be thought of as a user account, and those user accounts can initiate transactions that get relayed to nodes.

Then the nodes conduct the process we went over above.

Transactions

Before we look at how blocks are constructed, let's first take a look at how Bitcoin transactions work.

As we noted above, any Bitcoin address can generate a new transaction to be broadcasted to the network.

A transaction is just a set of data that indicates the amount of Bitcoin being sent, what address it is being sent from, and what address it is being sent to.

The entire collective history of these transactions is what forms the Bitcoin blockchain.

All you are doing when you make a transaction is sending these pieces of data to the Bitcoin network. Eventually, a miner will add this transaction to a block and it will be confirmed on-chain, living on in perpetuity.

A transaction is a record of moving coins from one wallet to another, but it's not like how you might picture a bank account, where you are just moving a number from one bucket to another.

Instead, Bitcoin addresses (covered more below) are a record of every transaction that the address has ever sent or received, and this is how we can read the balance of an address.

Additionally, when you send bitcoins from your address to someone else, you are not directly sending a portion of your total bitcoins. Instead, you are batching transactions you have already received and sending that as a new transaction to someone else.

That's a little confusing, so let's lay it out with an example.

Let's say I receive three bitcoin transactions for 1, 2, and 3 BTC. I might have 6 BTC total, but it is recorded simply as me having received these three transactions.

So if I want to send 5 BTC to someone, the 2 BTC and 3 BTC I received will be batched together as the inputs of one new transaction, which creates a new 5 BTC output for the recipient. Each amount that a transaction pays out is called an output.

What if I want to send 5.5 BTC to someone? In that case, Bitcoin will use all three of my received outputs (6 BTC in total) as inputs, create a new output of 5.5 BTC for the recipient, and then create an additional output to send the remaining 0.5 BTC back to myself (ignoring fees).

It's weird but works better from a programming perspective.

Each transaction is locked using cryptography, so I can't just submit a new transaction that sends 1 BTC from your address to mine. We'll cover this in more detail below in the Addresses section.

If you've seen the term UTXO, that stands for "unspent transaction output".

If we modify our example above and I have a fourth transaction in my address for another 3 BTC, that output was not used in me paying the 5.5 BTC, so that is known as an unspent transaction output.

All of my other outputs are considered spent and cannot be used again, but this other one can because it is unspent.

Similarly, the 0.5 BTC that I sent back to myself is now a new UTXO that I can spend in the future.

UTXOs are critical because the total number of bitcoins an address owns is simply the sum of all of its UTXOs.
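The example above can be sketched in a few lines of Python. This is a simplified model, not how a real wallet is implemented: the names Output, balance, and spend are made up for illustration, owners are plain strings instead of cryptographic locks, outputs are picked in order, and transaction fees are ignored.

```python
from dataclasses import dataclass

SATS_PER_BTC = 100_000_000  # work in whole satoshis to avoid float rounding


@dataclass(frozen=True)
class Output:
    owner: str  # hypothetical stand-in for the lock on the output
    sats: int


def balance(utxos, owner):
    """An address's balance is the sum of its unspent outputs."""
    return sum(o.sats for o in utxos if o.owner == owner)


def spend(utxos, sender, recipient, amount_sats):
    """Return the new UTXO set after sender pays recipient (fees ignored)."""
    chosen, total = set(), 0
    for i, o in enumerate(utxos):
        if o.owner == sender and total < amount_sats:
            chosen.add(i)  # whole outputs are consumed as inputs
            total += o.sats
    if total < amount_sats:
        raise ValueError("insufficient funds")
    created = [Output(recipient, amount_sats)]
    if total > amount_sats:
        created.append(Output(sender, total - amount_sats))  # change output
    unspent = [o for i, o in enumerate(utxos) if i not in chosen]
    return unspent + created


utxos = [Output("me", n * SATS_PER_BTC) for n in (1, 2, 3, 3)]
utxos = spend(utxos, "me", "alice", 550_000_000)  # send 5.5 BTC
print(balance(utxos, "me") / SATS_PER_BTC)     # 3.5
print(balance(utxos, "alice") / SATS_PER_BTC)  # 5.5
```

After the payment, my balance is the untouched 3 BTC output plus the new 0.5 BTC change output, which matches the UTXO accounting described above.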

The Mempool

When someone first initiates a new transaction, it does not immediately get written to a block. Instead, it gets sent to the mempool, which is a waiting room for transactions until they get picked up by a miner to be written to a block.

Blocks

Bitcoin blocks consist of a couple of key components: transactions and a header.

Transactions are bundled together and written to a block in the form of a merkle tree. Starting at the bottom, we have all of the transactions in a block and each is hashed with the SHA-256 algorithm. Then, two of those resulting hashes are hashed together, and so on up the tree until we reach a single root.

You can visualize this using this graphic from Subhan Nadeem's excellent guide to Bitcoin mining.

This merkle root is the foundation of how the Bitcoin blockchain maintains perpetual data integrity. This single merkle root is the unique "summary" of all of the transactions contained in this block. If even one byte of data in any of these transactions was changed, the merkle root would be completely different, rendering a block invalid.

Since this violates the rules of the Bitcoin protocol, a node would catch this change and reject that block as invalid.
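Here is a minimal sketch of how a merkle root can be computed. A few assumptions to flag: the byte strings are placeholders rather than real serialized transactions, Bitcoin applies SHA-256 twice at each step (so the helper below does the same), and when a level has an odd number of hashes the last one is paired with itself.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as Bitcoin does for transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions):
    """Hash every transaction, then hash pairs upward until one root remains."""
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Placeholder "transactions" for illustration only.
txs = [b"alice pays bob 1", b"bob pays carol 2", b"carol pays dave 3"]
root = merkle_root(txs)
tampered = merkle_root([b"alice pays bob 9"] + txs[1:])
print(root.hex())
print(root != tampered)  # changing a single byte changes the root
```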

Next is the block header, which is made of 6 components:

Bitcoin software version number
Block timestamp
The merkle root of this block's transactions
The hash of the previous block
A nonce
The target

By containing the hash of the previous block in every block header, we can create an immutable chain of blocks and an ongoing ledger of every transaction that has ever occurred. This is where we get the term blockchain.
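To see why linking each block to the previous block's hash makes history hard to edit, here is a toy sketch. It is deliberately simplified: the "blocks" are dictionaries with a text payload, a single SHA-256 is used, and the function names are invented for this example.

```python
import hashlib

NO_PARENT = "0" * 64  # the first block has no previous block to point to


def block_hash(prev_hash: str, payload: str) -> str:
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def build_chain(payloads):
    chain, prev = [], NO_PARENT
    for payload in payloads:
        current = block_hash(prev, payload)
        chain.append({"prev_hash": prev, "payload": payload, "hash": current})
        prev = current
    return chain


def is_valid(chain) -> bool:
    """Every block must point at its parent and match its own recorded hash."""
    prev = NO_PARENT
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        if block_hash(block["prev_hash"], block["payload"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True


chain = build_chain(["block 1 txs", "block 2 txs", "block 3 txs"])
print(is_valid(chain))          # True
chain[0]["payload"] = "edited"  # try to rewrite history in the first block
print(is_valid(chain))          # False
```

Even if the tampered block's own hash were recomputed, the next block would still point at the old hash, so every later block would have to be rebuilt as well.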

Nonce and target are two components that are critical to the mining process, and we'll cover them below when we dive deeper into mining.

Miners

As you probably know, Bitcoin uses proof-of-work mining. What does that mean? The Bitcoin protocol uses the SHA-256 hashing algorithm. SHA-256 is a one-way deterministic hashing algorithm that takes an input and returns a hash in the form of a 256-bit number.

That means:

One-way: we provide an input to the hashing function, and it gives us an output, but we cannot then take that output and get back the original input
Deterministic: If we provide the same input, we will get the same output every time. If we change even one character, we get a completely different hash

Let's look at an example of this using this hash calculator.

If we type "Hello World" into the input box, we get a hash back.

We get the hash a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e.

Let's see what happens if we change even one little character by making the "H" lowercase.

SHA-256 gives us a completely different hash. But if we switch back to a capital "H", you'll see that we get the same hash that we got before. That's the deterministic part of the hash function.
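If you would rather try this locally than in a web calculator, Python's standard hashlib module can reproduce the experiment. The helper name sha256_hex is just a convenience for this example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


print(sha256_hex("Hello World"))  # same input, same hash every time
print(sha256_hex("hello World"))  # one character changed, unrelated hash
print(sha256_hex("Hello World") == sha256_hex("Hello World"))  # True
```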

But we said the output is a 256-bit number, so what's with the letters?

The output is represented as a hexadecimal number, which uses base 16 instead of the base 10 representation most of us are used to with the decimal system. In decimal notation we can only represent values with 0-9, and with hexadecimal we expand that to use the symbols a-f to represent the values 10-15.

So in a hexadecimal number, 3a is made up of the digits 3 and 10, which works out to 3 × 16 + 10 = 58 in decimal.
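A quick way to convince yourself that a hash is just a number is to convert between the two notations in Python. The digest below is the "Hello World" hash from earlier in this section.

```python
# Hexadecimal and decimal are two ways of writing the same number.
print(int("3a", 16))  # 58
print(hex(58))        # 0x3a

digest = "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e"
as_number = int(digest, 16)
print(as_number)              # the same hash written in base 10
print(as_number < 2**256)     # True: it fits in 256 bits
```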

Okay, so what do these hash values have to do with mining?

When someone becomes a miner, their Bitcoin software will take a megabyte worth of transactions (remember that transactions are sent to all nodes), run it through SHA-256, and continue trying different pieces of data until it gets a number that the network accepts, in which case the miner is given the block reward.

The block reward is how new Bitcoin is minted and is also the financial incentive for miners to spend computational energy to try to guess this hash value.

But what data is getting submitted to the hash function by the miners? And how does the network decide which is acceptable? More importantly, how does this process ensure that only valid data is written to the chain?

Bitcoin Mining In-Depth

Let's go back up to the target and nonce fields of the block header we mentioned above.

Now that we have some context into how hashing plays a role in mining, we can see what these fields are used for.

Remember that the output of a SHA-256 function is just a number, usually written in hexadecimal.

The target field of the block header also encodes a number (stored in a compact form), so a hash and the target can be compared directly, whatever notation we write them in.

The goal of miners is to take the information contained in the current block header, add a number to it called a nonce, and calculate the hash. If the hash value is lower than the target value, then the miner writes the block to the chain and is rewarded with the block subsidy.

The "work" part of the proof of work consensus mechanism consists of mining software trying an extremely large number of different nonces to get a hash value that is lower than the target.

So a miner will start with a nonce of 0, and then increase the nonce one at a time until the hash outputs a number that is lower than the set target.
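The nonce search can be sketched as a simple loop. This is a toy, not real mining software: the header is a placeholder byte string, the target is a made-up easy value so the loop finishes quickly, and the function name mine is invented for this example.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_without_nonce: bytes, target: int, max_nonce: int = 2**32):
    """Count up from nonce 0 until the hash, read as a number, is below target."""
    for nonce in range(max_nonce):
        candidate = header_without_nonce + nonce.to_bytes(4, "little")
        value = int.from_bytes(double_sha256(candidate), "little")
        if value < target:
            return nonce, value
    return None  # ran out of nonces without finding a valid hash


# Hypothetical easy target: roughly 1 in 65,536 hashes will qualify.
easy_target = 2**240
result = mine(b"toy header", easy_target)
if result is not None:
    nonce, value = result
    print("winning nonce:", nonce)
    print("hash below target:", value < easy_target)
```

Lowering easy_target makes the loop run longer on average, which is exactly the lever the network uses to control how hard mining is.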

Even the Genesis block, mined when the target was the highest (easiest) it will ever be, has a final nonce of 2,083,236,893, which means about two billion attempts if the miner counted up from 0.

Miners get their rewards by adding a transaction to their block rewarding themselves. This is a special transaction called a generation transaction, and it is accepted by the network upon publishing a valid block.

So that is how things work from the miner side, but how do the other nodes know that the transactions contained in this published block are legitimate?

The rules that transactions need to follow are coded into every Bitcoin node, so a receiving node will first check to make sure that all transactions contained in the new block follow the rules.

If they do, it will then double-hash the header of the published block to verify that it is lower than the target. The miner publishes the successful nonce with their header, so the validating nodes only need to run the hash function with the provided nonce to ensure it is correct.
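As a concrete check, here is a sketch that rebuilds the Genesis block header and verifies its proof of work with a single double hash. The field values are the widely published Genesis header fields (the nonce is the one quoted above), and the helper names are made up for this example. The serialization details (little-endian integers, byte-reversed hashes, compact target encoding) are stated as I understand them, so treat this as an illustration rather than a reference implementation.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact 4-byte target stored in the header."""
    exponent = bits >> 24
    mantissa = bits & 0xFFFFFF
    return mantissa * 256 ** (exponent - 3)


def header_hash(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce):
    """Serialize the 80-byte header and return its hash in display order."""
    header = (
        struct.pack("<I", version)
        + bytes.fromhex(prev_hash_hex)[::-1]    # hashes are stored byte-reversed
        + bytes.fromhex(merkle_root_hex)[::-1]
        + struct.pack("<III", timestamp, bits, nonce)
    )
    return double_sha256(header)[::-1].hex()


GENESIS = dict(
    version=1,
    prev_hash_hex="00" * 32,  # no previous block
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)

genesis_hash = header_hash(**GENESIS)
target = bits_to_target(GENESIS["bits"])
print(genesis_hash)
print(int(genesis_hash, 16) < target)  # the published nonce satisfies the target
```

Finding the nonce took the miner billions of hashes, but checking it takes a validating node just one double hash.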

Even if a miner goes through the costly process of trying to mine a new block, they still need to include valid transactions or their efforts will be wasted, since the validating nodes won't accept an invalid block.

Difficulty Adjustment

Trying all the different inputs required to generate an accepted hash requires an immense amount of energy, and this energy expenditure is the core of what secures the Bitcoin chain.

Bitcoin has a built-in difficulty adjustment, which automatically adjusts the difficulty of guessing the correct hash depending on how many miners are online.

Every 2,016 blocks (roughly two weeks), Bitcoin looks at how long those blocks took to mine and adjusts the difficulty so that the average time to write a block stays at 10 minutes.

It does this by raising or lowering the target number so that, at the current hashrate of the Bitcoin network, finding a valid nonce takes about 10 minutes on average.
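The adjustment itself is simple proportional arithmetic. Here is a sketch under a few assumptions: the function name retarget is made up, the change per adjustment is limited to a factor of 4 in either direction, and the cap on the maximum possible target is left out for brevity.

```python
TARGET_BLOCK_SECONDS = 10 * 60
RETARGET_INTERVAL = 2016  # blocks between adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL  # two weeks


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how fast or slow the last interval was mined."""
    # Limit the change to a factor of 4 in either direction.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN


one_week = EXPECTED_TIMESPAN // 2
print(retarget(1_000_000, EXPECTED_TIMESPAN))  # 1000000: on schedule, no change
print(retarget(1_000_000, one_week))           # 500000: blocks came too fast, so mining gets harder
```

A lower target means fewer hashes qualify, so a faster network is automatically given a harder puzzle.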

Because of this, we can verify the real Bitcoin chain by determining which is the longest (strictly speaking, the one with the most accumulated work). We'll get into this a bit more in the Blockchain section below.

Addresses and Keys

So other than being a node, how do we actually become a Bitcoin user and participate in the network?

And since transactions are just lines of data, what's to stop me from initiating a transaction sending BTC from your address to mine?

This is where addresses and keys come in.

Unlike a traditional web app, where you might be identified by a username or an email address and authenticate with a password, Bitcoin uses the concept of addresses with public and private keys to establish identities and verify transactions.

Let's see how this works.

Bitcoin identities or accounts are composed of two primary components: a public key and a private key. You can think of a public key as your username. It's publicly visible and is used to establish your identity and separate it from everyone else.

Your private key should not be shared with anyone; it is how you prove that you are the owner of said public key. You can receive bitcoins with a public key, but you can't send them without also having access to the private key.

Public keys are long and unwieldy, so the Bitcoin protocol also generates condensed versions of these called addresses. This is usually how you will interact with other Bitcoin users. Addresses contain a few other neat little tricks that make things easier, and Learn Me A Bitcoin has some interesting information on this.

How do these get generated?

The first step is to generate a private key, which is a really big random number, usually written in hexadecimal format.

Then we use this private key to generate our public key, which is publicly visible. Since we don't want anyone to be able to determine our private key from our public key, we use a one-way mathematical function to generate this public key.

This is a deterministic function, so anytime we pass our private key into this function, we will get this same public key.

That's how we can authenticate our transactions. Anytime we send out a new transaction, it gets sent out with a lock on it. Remember that all of the bitcoins we own are really just sets of outputs that were sent to us.

Well, when they were sent, the sender placed a lock that said that only we can open it, or only this address can open it. And the key we use to open that lock is our private key.

So even though outputs are being broadcast to the entire network, they are being broadcast with these locks so that only the owners can take them and use them as inputs in new transactions.
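The one-way function that turns a private key into a public key is elliptic curve multiplication on a curve called secp256k1. Below is a minimal pure-Python sketch of that step using the curve's published constants. It is for learning only: it is not constant-time or otherwise hardened, the function names are made up, and it stops at the public key (turning that into an address involves further hashing and encoding that is not shown).

```python
import secrets

# Published secp256k1 parameters: field prime, group order, generator point.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, P - 2, P) % P
    else:
        slope = (y2 - y1) * pow(x2 - x1, P - 2, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def public_key(private_key: int):
    """Multiply the generator by the private key using double-and-add."""
    if not 1 <= private_key < N:
        raise ValueError("private key out of range")
    result, addend, k = None, G, private_key
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def compressed(point) -> str:
    """Compressed public key: a parity prefix plus the x coordinate."""
    x, y = point
    return ("02" if y % 2 == 0 else "03") + format(x, "064x")


private_key = secrets.randbelow(N - 1) + 1  # random number in [1, N - 1]
print(compressed(public_key(private_key)))
print(public_key(private_key) == public_key(private_key))  # deterministic: True
```

Going from the private key to the public key takes a few hundred point additions. Going the other way would mean solving the elliptic curve discrete logarithm problem, which is what makes the function one-way in practice.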

Blockchain

Now that we've covered the essential components of Bitcoin, let's zoom out and take a look at the Bitcoin blockchain as a whole.

The blockchain is a file containing every Bitcoin transaction that has ever occurred, added in blocks, which are all connected to each other, hence, blockchain.

When you run a Bitcoin node, the first thing it will do is download a copy of the entire blockchain.

It will do this by connecting to other nodes on the network and asking for a copy of the blockchain from them. As part of this process, nodes communicate the height (how many blocks) of their chain. Nodes do this continuously so they are always sharing the current state of the chain and replicating it across every node in the network.

As we discussed above in the mining section, new blocks are generated by miners, which then broadcast that new block to the other nodes, and they add it to their copy of the blockchain.

One thing we didn't address in the mining section is what happens when two blocks are mined at the same time. It is possible, and normal, for two miners to solve for two blocks at roughly the same time.

Since it takes time for changes to propagate across the network, nodes will receive different blocks at different times.

When this happens, nodes will take the first block they receive as part of their chain, and they will also accept the second, but it won't be considered part of the active chain.

At this point, some nodes in the network will be in disagreement about which block belongs at the tip of the chain.

How does this problem get solved?

When the next block is mined, it will be mined on top of only one of these two blocks, which now makes that particular chain the longest.

As a result, the nodes will drop the other chain since it is no longer the longest. This process of removing blocks from an older, inactive chain in favor of the blocks from the newer, active chain is called a chain reorganization.
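This tie-breaking behavior can be sketched with a toy node that only tracks block heights. The class and method names are invented for this example, and real nodes compare accumulated proof of work rather than a simple block count.

```python
class ToyNode:
    """Tracks competing blocks and follows the longest chain it knows about."""

    def __init__(self):
        self.parent = {"genesis": None}
        self.height = {"genesis": 0}
        self.active_tip = "genesis"

    def receive(self, block_id: str, parent_id: str) -> None:
        self.parent[block_id] = parent_id
        self.height[block_id] = self.height[parent_id] + 1
        # Only a strictly longer chain replaces the tip, so on a tie
        # the block that arrived first stays active.
        if self.height[block_id] > self.height[self.active_tip]:
            self.active_tip = block_id

    def active_chain(self):
        chain, block = [], self.active_tip
        while block is not None:
            chain.append(block)
            block = self.parent[block]
        return chain[::-1]


node = ToyNode()
node.receive("A", "genesis")  # first block seen at height 1
node.receive("B", "genesis")  # competing block at the same height
print(node.active_chain())    # ['genesis', 'A']
node.receive("C", "B")        # next block builds on B
print(node.active_chain())    # ['genesis', 'B', 'C'] after the reorganization
```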

What happens to the transactions in the block that was dropped? According to the network, they are invalid and do not exist. So if you try to spend bitcoins from an output that was contained in this dropped (orphan) block, it won't work.

But when two blocks are mined at about the same time they usually contain the same transactions, so nothing usually happens.

But even if there were some transactions contained in the orphan block that were not contained in the accepted block, they would just get sent back to the mempool to be picked up again, so the worst-case scenario is that it takes a little longer for the transaction to be processed.

But, this is not a guarantee, so it's a good idea to wait for your transaction to be included more than one block deep before considering it final.

So technically, Bitcoin blocks can be replaced. If, theoretically, you were able to somehow produce enough blocks to create a longer chain than the one currently accepted by the nodes, then you could take over the network and put whatever data you wanted in those blocks.

The problem is this is technically next to impossible due to one of the key innovations we briefly touched on above, the difficulty adjustment.

The difficulty adjustment ensures that a certain amount of time has passed to create the current state of the chain. That's why I can't just create my own private Bitcoin chain and then push it out onto the network for nodes to adopt.

Okay but couldn't I just take the existing chain and build a new chain on top of it? You could, but you would need to be able to outpace the entire network of miners building on the longest chain.

Remember that the difficulty adjustment uses the average hashrate over the last two weeks, meaning that you would get nowhere trying to outpace the network at the current hashrate unless you had over 50% of the mining power. At that point it would just be a matter of time until you were able to perform a re-org, depending on how far down the chain you wanted to replace transactions.

This is known as a 51% attack, and while it is technically possible, it has never been done to Bitcoin.

Note that this is another thing that differentiates proof of work from proof of stake. In proof of stake protocols, we don't have this proof of the passage of time functionality built-in, so I can create as many false copies of a proof of stake chain as I want without having to expend any resources.

This means that a validator in a proof of stake system able to gain a majority of the required staked asset can swap the entire chain with one of their choosing and there would be no recourse, aside from hoping that the community would voluntarily adopt an honest version of the chain.

In a proof of work system, even if an attacker was able to successfully conduct a 51% attack, they still have to continue to maintain that power over a progressively longer period to reorg further down the chain. In a proof of stake system, it's game over.

Script

Script is a stack-based mini programming language built into the Bitcoin protocol. It is primarily used for locking outputs and setting certain rules that must be met to unlock them.

A locking script is placed on every output and must be unlocked with an unlocking script before that output can be spent and used as an input. All conditions on both scripts must be valid for the output to be unlocked and used.

Script is a very basic language and consists of two basic building blocks:

Data (signatures, public keys)
Opcodes (simple functions that operate on that same data, here's a list of all opcodes)

We'll go over the basics of how Script works here and we'll experiment with it and dive deeper in the next lesson.

Script is a bit funky, especially if you are used to more web-oriented programming languages like JavaScript, Solidity, PHP, etc. It's a stack-based programming language.

What does that mean?

A stack is just a low-level data structure or a way to store data.

You write a script and it is read from left to right, interacting with the data via a stack. You can think of the stack as an empty silo that data is pushed into.

So you have a piece of data on the left of a script, and then OPCODES can pull data out of the stack, do something with it, and push new data back onto the stack.

Stacks follow the LIFO principle, or last-in, first-out, to determine the order in which the pieces of data are operated on. You have to add to the top of the stack and you can only pull from the top of the stack.

A script follows this process until it reaches the end and is considered valid if the only value left in the stack is a 1 or greater.

Here is a good visualization from Learn Me A Bitcoin:

This is what's called a P2PKH script, and is the most common script used. This locking script is the one used any time you send bitcoins to someone.

I recommend learning how it works before moving on.

There are two primary pieces of functionality that we can use to work with stacks: push and pop. Push will add a piece of data to the top, and pop will pull a piece of data from the top and return it.

Then we can take this data and run it through an OPCODE function to turn it into something else, and then push it back onto the stack.
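To make the push/pop idea concrete, here is a toy interpreter for a tiny subset of Script. OP_ADD, OP_DUP, and OP_EQUAL are real opcode names, but everything else is simplified for illustration: real Script operates on byte strings, has many more opcodes, and is not written as a space-separated string. This sketch treats a script as valid when the value left on top of the stack is non-zero.

```python
def run_script(script: str) -> bool:
    """Evaluate a space-separated toy script from left to right."""
    stack = []
    for token in script.split():
        if token == "OP_ADD":
            b, a = stack.pop(), stack.pop()  # pop the top two items
            stack.append(a + b)              # push their sum
        elif token == "OP_DUP":
            stack.append(stack[-1])          # copy the top item
        elif token == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        else:
            stack.append(int(token))         # plain data is pushed as-is
    return bool(stack) and stack[-1] != 0


print(run_script("2 3 OP_ADD 5 OP_EQUAL"))       # True
print(run_script("2 3 OP_ADD 6 OP_EQUAL"))       # False
print(run_script("4 OP_DUP OP_ADD 8 OP_EQUAL"))  # True
```

You can read the first script as a lock ("OP_ADD 5 OP_EQUAL") that is opened by any unlocking data ("2 3") whose sum is 5, which is the same lock/unlock pattern real scripts use with signatures and public keys.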

This is all a bit complex and is difficult to wrap your head around at first, but this basic lock/unlock process using scripts is how every output is sent and unlocked by the appropriate user. Don't worry too much about completely understanding this right now. We'll do some concrete practice with Script in the next tutorial when we begin building our app.

We have these scripts instead of just basic public/private key verification so that we can create different types of locks that do different things. This is how Bitcoin offers programmability.

But part of the reason that we can't do nearly as much with Bitcoin as we can with Ethereum is that these scripts and OPCODES are very limited. And, as an additional limitation, there is a very small subset of OPCODE combinations called standard scripts that nodes relay.

There are 5 standard scripts that nodes will relay, and each offers a different piece of functionality. You can read more about these on Learn Me A Bitcoin, but the basic reason for the restriction is safety and security.

Not all scripts have been tested, so it's a security risk to allow all these different combinations and open up attack vectors.

This limitation is part of what makes Bitcoin extremely secure, but it also makes it so that we can't build robust smart contracts on Bitcoin.

When people talk about smart contracts on Bitcoin, this Script language is what they are referring to, and it is extremely limited by design.

Let's look at two common use cases for modern smart contracts, DeFi and DAOs.

Let's say we want to build a DeFi application that allows us to lend our bitcoins in exchange for interest. Right now, we can't do that without giving someone else custody of our bitcoins and letting them do it for us.

This introduces a trusted intermediary which defeats the entire purpose of having decentralized money. Ethereum users realize this and they have created a robust ecosystem of DeFi applications that allow users to earn on their assets.

How can we do this trustlessly with how limited Bitcoin's Script language is?

This is where Bitcoin layers like Stacks, RSK, and Liquid come in. They allow us to build separate networks that hook into Bitcoin and expand its functionality.

We'll dive deeper into these limitations and how we can still build robust smart contracts with layers (Stacks specifically) in future lessons in this series.

Updates and Forks

While Bitcoin is generally resistant to change and doesn't change as often as other chains might, that doesn't mean it is completely stagnant. There have been several changes and upgrades over the years.

In general, these changes can be divided into two categories: hard forks and soft forks.

Hard Forks

A hard fork is where the chain splits and creates an entirely new version. The new chain is not backward compatible and nodes need to choose between the two.

There have only been a few long-lasting hard forks in Bitcoin's history. Bitcoin Cash is one of them, created after the heated blocksize war, the subject of the book The Blocksize War, which is a fascinating read.

Bitcoin's chain also splits by accident relatively frequently for a short amount of time. We discussed this when we talked about chain reorganization above.

Hard forks are a radical change that should only be carried out when absolutely necessary, because the entire network has to agree to switch to the new chain. Otherwise, the result is two competing versions, as with Bitcoin Cash.

Soft Forks

In contrast, soft forks are backward compatible changes where nodes see both versions of that chain as valid. SegWit, a major upgrade to the Bitcoin network that we'll talk a bit more about below, was a soft fork.

Bitcoin also has a very cool ability called user-activated soft forks. UASFs allow users of Bitcoin like wallet operators, exchanges, businesses, and other users running full nodes to move to a new version of a chain that will have some activation point in the future.

This forces miners to utilize this new forked version or else end up mining a separate chain, killing their profits. This is primarily to prevent control of the network from being in the hands of the miners, which was a major concern during the blocksize war.

I highly recommend anybody interested in the surprisingly dramatic history of Bitcoin forks read the book linked above, The Blocksize War.

Two major upgrades to Bitcoin that I recommend you read more about are SegWit and Taproot, as they both have impacted how Bitcoin works in relatively major ways.

Wrapping Up

You now have a basic familiarity with how Bitcoin works, but we only scratched the surface here. If you are interested in diving into the technical aspects of Bitcoin, Learn Me A Bitcoin is an excellent free resource.

Throughout the rest of this series, we are going to be building a full-stack Bitcoin application using the Bitcoin base layer and Stacks. As we begin to build out this functionality we'll dive deeper into the specifics of how Bitcoin works and how you can use it to build robust decentralized applications on top of the world's most decentralized, most secure blockchain.

The app we're going to build is Terrapin, which will serve as an MVP to create a Bitcoin-based DAO for funding and conducting scientific studies.

We'll look at how to set up a Bitcoin dapp and see what we can accomplish with Bitcoin alone. Then we'll take a look at some of the functionality we need to build robust, decentralized dapps that the Bitcoin base layer is not capable of, and how Stacks and Clarity fit in to solve that problem.

We'll cover everything from start to finish using the following tools, among others:

React
Tailwind CSS
Stacks.js
Bitcoin
Clarity
sBTC

Coming up next, we're going to get Bitcoin running on our local machines and look at how we can interact with it, getting hands-on with the concepts we discussed above.