The Bitcoin network bundles transactions together into a distributed database known as the block chain. When viewed from within the network, transactions simply represent electronic cash payments. Outside the Bitcoin network, more complex interpretations are possible. Adding application-specific data to transactions opens the door to using Bitcoin not just for electronic cash payments, but also for new kinds of financial, property, and legal transactions. Questions around how - and even whether - to support these uses have been hotly debated for years.
The 0.9.0 release of Bitcoin Core took the first step toward settling the debate through the standardization of a new transaction type. This article reviews the problem of extending the block chain, and the importance of a solution.
One Man’s Trash
Back in 2010, an ambitious proposal for building a decentralized domain name service on top of Bitcoin appeared on the bitcointalk forum. The proposal consisted of two parts:
- BitX would expand the Bitcoin protocol to support multiple niche applications running on top of the same block chain.
- BitDNS would be one such application focused on solving the domain name registration problem.
BitX would have required sweeping changes to the Bitcoin protocol. Over time, interest in this part of the proposal faded. Later, the basic idea would resurface as a centerpiece of Ethereum and other projects.
BitDNS received more immediate attention. A detailed plan to implement it, still available here, called for tight integration of a decentralized domain name service with the Bitcoin block chain. Domain records would be encoded within ordinary Bitcoin transactions.
Opponents of this plan rejected the idea of cluttering the block chain with data that only a few users cared about. Satoshi himself weighed in on the subject:
Piling every proof-of-work quorum system in the world into one dataset doesn’t scale.
Bitcoin and BitDNS can be used separately. Users shouldn’t have to download all of both to use one or the other. BitDNS users may not want to download everything the next several unrelated networks decide to pile in either.
Instead, Satoshi believed that BitDNS should use its own independent block chain. He even offered the first proposal for merged mining as a way to secure the new block chain. Although it would take some time for this idea to be implemented, BitDNS eventually evolved into Namecoin, the first altcoin.
Since BitDNS, several schemes for layering new functionality onto the Bitcoin block chain have been proposed. A few were even implemented. However, all of these systems faced the same barrier: transactions lacked a standard method for carrying a data payload.
Transactions are far more flexible, and complex, than they might appear on the surface. Within each transaction lies a small program, or script, written in a custom language. More precisely, a transaction supplies one part of a two-part script. A payment defines a challenge script that places conditions on how a coin can be redeemed. The transaction redeeming the payment provides a response script that satisfies the conditions imposed by the challenge script.
Full nodes (e.g., those running Bitcoin Core) verify a transaction by combining challenge and response scripts into a validation script. The validation script is then executed during verification, just like a program. If the script finishes running without raising errors or returning a non-true result, then the transaction is considered valid.
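The challenge and response mechanism can be illustrated with a toy interpreter. The sketch below is not Bitcoin Core's implementation: it handles only data pushes and two opcodes (OP_SHA256 and OP_EQUAL, both of which exist in Bitcoin's script language), it simplifies the rule for what counts as a true result, and the names `run_script` and `validate` are invented for this example.

```python
import hashlib

def run_script(tokens, stack):
    """Run a toy script against a shared stack; return False on any error."""
    for tok in tokens:
        if isinstance(tok, bytes):
            # Raw bytes are data pushes.
            stack.append(tok)
        elif tok == "OP_SHA256":
            if not stack:
                return False
            stack.append(hashlib.sha256(stack.pop()).digest())
        elif tok == "OP_EQUAL":
            if len(stack) < 2:
                return False
            a, b = stack.pop(), stack.pop()
            stack.append(b"\x01" if a == b else b"")
        else:
            # Unknown opcode: treat as a script error.
            return False
    return True

def validate(response, challenge):
    """Run the response script, then the challenge script, on one stack."""
    stack = []
    ok = run_script(response, stack) and run_script(challenge, stack)
    # Simplified truth test: the top stack item must contain a non-zero byte.
    return ok and len(stack) > 0 and any(stack[-1])

# A payment that can be redeemed by anyone who knows the secret.
secret = b"open sesame"
challenge = ["OP_SHA256", hashlib.sha256(secret).digest(), "OP_EQUAL"]

print(validate([secret], challenge))    # response satisfies the challenge
print(validate([b"wrong"], challenge))  # response fails the challenge
```

Real payments typically use signature checks rather than a hash puzzle, but the division of labor is the same: the challenge states the conditions and the response supplies the data that meets them.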
Validation scripts offer an obvious method for encoding application-specific data, but other techniques have been used. For example, one scheme stored data by writing it into two payment addresses.
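To make the address technique concrete, here is a rough sketch of how such a scheme could work. It is an assumption for illustration, not a description of any particular deployed system: the payload is cut into 20-byte chunks, and each chunk is placed where a pay-to-pubkey-hash challenge script normally carries the hash of a public key. The function name `data_to_fake_p2pkh_scripts` is hypothetical.

```python
def data_to_fake_p2pkh_scripts(payload):
    """Hide payload bytes in the 20-byte hash field of P2PKH-style scripts."""
    scripts = []
    for start in range(0, len(payload), 20):
        # Pad the final chunk with zero bytes so every field is 20 bytes long.
        chunk = payload[start:start + 20].ljust(20, b"\x00")
        # OP_DUP OP_HASH160 <push 20 bytes> ... OP_EQUALVERIFY OP_CHECKSIG
        scripts.append(bytes([0x76, 0xA9, 0x14]) + chunk + bytes([0x88, 0xAC]))
    return scripts

payload = b"forty bytes of application-specific data"
scripts = data_to_fake_p2pkh_scripts(payload)
print(len(scripts))  # number of outputs needed to carry the payload
```

Because nobody knows a public key that hashes to these values, coins sent to such outputs can never be redeemed, yet nodes cannot tell them apart from ordinary payments.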
Although these ad hoc methods solved the problem of adding application-specific data to the block chain, the long-term costs were a cause for concern. For example, the block chain grows at an exponential rate, doubling in size over the last year alone. This growth puts upward pressure on storage space and, more importantly, network bandwidth.
A more serious concern stems from the need to keep application-specific data within the unspent transaction output (UTXO) set. The UTXO set contains references to all spendable coins, and so should be kept small to ensure fast transaction validation. Embedding application data into addresses or challenge scripts forces each full node to add an entry to the UTXO set. Because such outputs may never be spent, these entries can remain there indefinitely.
Validation scripts can choose from a diverse palette of predefined functions. Due to security concerns, only a handful of functions are permitted in standard transactions. Because nodes relay, and most miners include, only standard transactions by default, many of the most useful script functions remained off-limits.
The 0.9.0 release of Bitcoin Core added a new standard transaction type granting access to a previously disallowed script function, OP_RETURN. This function accepts a user-defined sequence of up to 40 bytes. When a transaction containing a challenge script with an OP_RETURN function is mined into a block, the accompanying byte sequence enters the block chain.
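As a minimal sketch, an OP_RETURN challenge script can be assembled from three parts: the OP_RETURN opcode (byte 0x6a), a single-byte push opcode equal to the data length, and the data itself. The helper name `op_return_script` and the constant `MAX_DATA` are invented for this example; the 40-byte cap mirrors the limit described above.

```python
OP_RETURN = 0x6A
MAX_DATA = 40  # limit described in this article for Bitcoin Core 0.9.0

def op_return_script(data):
    """Build a challenge script of the form: OP_RETURN <push len(data)> <data>."""
    if not 1 <= len(data) <= MAX_DATA:
        raise ValueError("data must be between 1 and 40 bytes")
    # Push opcodes 0x01 through 0x4b mean "push this many bytes".
    return bytes([OP_RETURN, len(data)]) + data

print(op_return_script(b"hello").hex())  # 6a0568656c6c6f
```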
Although stored in the block chain, OP_RETURN bytes are excluded from the UTXO set, conserving a scarce resource. As a side effect, an output using an OP_RETURN challenge script becomes unspendable. For this reason, the value of an OP_RETURN output is usually set to 0.
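The effect on the UTXO set can be sketched with a dictionary keyed by transaction ID and output index. This is a simplified model, not the data structure Bitcoin Core uses, and `add_outputs` is a hypothetical helper: outputs whose challenge script begins with OP_RETURN are provably unspendable, so they are never added.

```python
OP_RETURN_BYTE = b"\x6a"

def add_outputs(utxo_set, txid, outputs):
    """Record a transaction's spendable outputs; skip OP_RETURN outputs."""
    for index, (value, script) in enumerate(outputs):
        if script[:1] == OP_RETURN_BYTE:
            # Provably unspendable: nothing will ever need to look it up.
            continue
        utxo_set[(txid, index)] = (value, script)

utxo_set = {}
payment_script = bytes([0x76, 0xA9, 0x14]) + b"\x11" * 20 + bytes([0x88, 0xAC])
data_script = bytes([0x6A, 0x05]) + b"hello"

# One ordinary payment output and one zero-value data output.
add_outputs(utxo_set, "example-txid", [(50000, payment_script), (0, data_script)])
print(len(utxo_set))  # only the payment output is tracked
```

Data hidden in a fake address, by contrast, looks like an ordinary payment, so a node has no way of knowing that the entry can be dropped.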
The OP_RETURN Compromise
OP_RETURN and its 40-byte limit represent a compromise between two opposing visions of Bitcoin’s future.
One camp sees the block chain as a secure, decentralized data store on which numerous financial and social applications can be built. Promoting the growth of these new applications helps ensure Bitcoin’s long-term relevance. Allowing transactions to carry application-specific data in a standard way advances this goal.
The other camp views the Bitcoin block chain exclusively as a medium for recording electronic cash payments. Even so, important scalability issues will need to be addressed sooner or later. Trying to accommodate the data requirements of arbitrary application layers only raises the cost of maintaining the network today, while pushing forward the eventual day of reckoning.
The 40-byte limit on OP_RETURN data constrains the use of the block chain as a data store. OP_RETURN was originally expected to support 80 bytes of data. One of the strongest criticisms of the later 40-byte limit came from Counterparty, which claimed that 40 bytes was not enough to support the system of peer-to-peer markets and financial instruments it had created.
However, a 40-byte sequence more than suffices to encode an identifier such as a hash value. This value can uniquely represent any digital document, from an image, to a poem, to an abstract data structure. Embedded hash values in turn offer a method to link the block chain to other data stores such as distributed hash tables.
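A short sketch shows why 40 bytes is enough for this purpose. A SHA-256 digest is 32 bytes long, which leaves 8 bytes for an application marker. The marker `DEMOTAG:` and the function `document_payload` are hypothetical and chosen only for illustration.

```python
import hashlib

TAG = b"DEMOTAG:"  # hypothetical 8-byte application marker

def document_payload(document):
    """Return a 40-byte payload: an 8-byte tag plus the SHA-256 digest."""
    digest = hashlib.sha256(document).digest()  # always 32 bytes
    return TAG + digest

payload = document_payload(b"Shall I compare thee to a summer's day?")
print(len(payload))  # 40
```

Anyone holding the original document can recompute the digest and compare it with the embedded value, while the document itself stays outside the block chain.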
The Bitcoin Core 0.9.0 release notes attempted to clarify the purpose of OP_RETURN:
On OP_RETURN: There was been some confusion and misunderstanding in the community, regarding the OP_RETURN feature in 0.9 and data in the blockchain. This change is not an endorsement of storing data in the blockchain. The OP_RETURN change creates a provably-prunable output, to avoid data storage schemes – some of which were already deployed – that were storing arbitrary data such as images as forever-unspendable TX outputs, bloating bitcoin’s UTXO database.
Storing arbitrary data in the blockchain is still a bad idea; it is less costly and far more efficient to store non-currency data elsewhere.
These guidelines re-iterate Satoshi’s original view that external data should not be stored in the block chain.
Uses of OP_RETURN
Since its introduction in March 2014, OP_RETURN has been examined from a number of angles. At least one service, Proof of Existence, now uses OP_RETURN to permanently link digital documents to the block chain. Counterparty’s lead developer announced the discovery of a method to run the system on 40 bytes of OP_RETURN data. Mastercoin, another project relying on embedded data, started a discussion dedicated to finding a way to use OP_RETURN.
Stealth addresses offer another example of OP_RETURN in action. This scheme enables payments to be received without publicly revealing the receiver’s public key or address. Data needed to make this system work are encoded within a call to OP_RETURN. In essence, Bitcoin does double duty as a secure messaging protocol.
Although the best uses of OP_RETURN may take some time to materialize, one thing is clear. Many Bitcoin users see value in adding data payloads to transactions, and some have started using OP_RETURN for this purpose.
OP_RETURN usage can be monitored via Coin Secrets.
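A monitoring tool needs some way to recognize OP_RETURN outputs and read their payloads. The sketch below is one possible approach and does not describe how Coin Secrets works. It accepts only the simple form of OP_RETURN followed by a single short data push, and the function name `extract_op_return_data` is hypothetical.

```python
def extract_op_return_data(script):
    """Return the payload of an OP_RETURN script, or None if it is not one."""
    if len(script) < 2 or script[0] != 0x6A:
        return None
    length = script[1]
    # Accept only a single direct push (opcodes 0x01 through 0x4b)
    # that accounts for the rest of the script.
    if not 1 <= length <= 0x4B or len(script) != 2 + length:
        return None
    return script[2:]

print(extract_op_return_data(bytes.fromhex("6a0568656c6c6f")))  # b'hello'
```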
Bitcoin’s long-running debate over acceptable uses of the block chain has received some much-needed clarity. Applications can now inexpensively add a 40-byte data payload to transactions using the OP_RETURN script function. On a technical level, OP_RETURN doesn’t enable anything that wasn’t previously possible. Instead, OP_RETURN provides a standard interface through which new services can potentially be layered onto the block chain, and a central point of focus for future work on integration tools.
This move brings the vision of Bitcoin as a universal platform for mediating complex agreements one step closer to reality. The extent to which the Bitcoin community will embrace this vision remains an open question.