Transactions support a rich palette of composable features. Yet this system of programmable money remains difficult to reason about and apply. A visual language could simplify transaction design and documentation. This article introduces one that was developed for the book Owning Bitcoin.
Before proceeding, recall that Bitcoin is an electronic cash system. Physical cash allows users to exchange value by giving and accepting metal coins and paper bank notes. Electronic cash replaces these physical tokens with tokens composed of digital data.
In electronic cash systems, a transaction provides the medium through which value is exchanged. The purpose of a transaction is twofold: (1) to demonstrate the right to unlock an existing coin; and (2) to generate and relock one or more new coins. Transactions contain two kinds of entries: inputs and outputs. The purpose of an input is to unlock an existing coin. The purpose of an output is to lock a new coin.
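The input/output structure described above can be sketched as a small data model. This is a minimal illustration, not Bitcoin's wire format: the class names, the `lock` and `unlock` string labels, and the transaction id `"carlos-tx"` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Output:
    value: int  # face value of the new coin (hypothetical units)
    lock: str   # hypothetical label for the locking condition


@dataclass(frozen=True)
class Input:
    txid: str    # id of the transaction that created the coin being spent
    index: int   # position of that coin among the parent's outputs
    unlock: str  # hypothetical label for the data that satisfies the lock


@dataclass(frozen=True)
class Transaction:
    inputs: List[Input]
    outputs: List[Output]


# Alice unlocks one existing coin and locks one new coin to Bob.
tx = Transaction(
    inputs=[Input(txid="carlos-tx", index=0, unlock="signature:alice")],
    outputs=[Output(value=5, lock="pubkey:bob")],
)
```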
The most common operation in a Bitcoin transaction is to lock an output to an approved identity, represented by a public key. Unlocking the output requires a signature consistent with the approved identity. This is the basis of a signature lock.
Transaction diagrams are laid out as follows. Inputs appear to the left of the vertical line and outputs appear to the right. A circled number represents the face value to be either locked or unlocked. An avatar below the face value represents an identity. An avatar appearing on the input side means that a signature is required and was given. An avatar appearing on the output side means that a signature will be required as a condition for unlocking.
In the examples presented here, participants use self-generated identities known as “pseudonyms.” Every party may (and should) use a unique pseudonym for each coin they control. Personally-identifying avatars are used only to make ownership explicit.
Joint control of money can be essential in a number of situations. For example, corporations and families receive and spend funds by mutual consent. Bitcoin supports this form of collaborative control through multisignature locks.
An output protected by a multisignature lock will be unlocked by a signature from one or more named identities. Signature requirements are expressed using m-of-n notation, where m is the number of required signatures (or threshold) and n is the number of allowed identities. For example, Alice and Bob may run a business together and want to ensure that either party can spend monthly cash flow. They create a 1-of-2 multisignature output, allowing either Alice or Bob to spend unilaterally. Alternatively, Alice and Bob may want to require both signatures for capital purchases. A 2-of-2 multisignature lock accomplishes this goal.
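The m-of-n rule can be illustrated with a short sketch. It assumes signatures have already been verified and models each one simply as the name of the identity that signed; the function name `multisig_satisfied` is hypothetical and no real cryptography is performed.

```python
from typing import Set


def multisig_satisfied(m: int, allowed: Set[str], signers: Set[str]) -> bool:
    """Return True when at least m distinct allowed identities have signed."""
    if not 1 <= m <= len(allowed):
        raise ValueError("threshold m must be between 1 and n")
    # Signatures from identities outside the allowed set do not count.
    return len(signers & allowed) >= m


partners = {"alice", "bob"}

# 1-of-2: either partner can spend monthly cash flow unilaterally.
cash_flow_ok = multisig_satisfied(1, partners, {"alice"})

# 2-of-2: capital purchases need both signatures.
capital_ok = multisig_satisfied(2, partners, {"alice"})
```

Here `cash_flow_ok` is true and `capital_ok` is false, because Alice alone meets a threshold of one but not a threshold of two.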
An input requiring a signature may exist in unsigned form. This becomes especially important for outputs protected by multisignature locks, when a partially-signed transaction will need to be stored or passed between signers until a threshold is obtained.
Chain of Ownership
A transaction may be depicted in isolation, but it should always be regarded as part of a chain of ownership in which each input unlocks a previously-locked output. For example, a more complete chain of ownership would include not only Alice’s transaction but that of Carlos, which produced the coin she’s spending.
To emphasize the parent-child relationship, Carlos’ transaction appears to the left of Alice’s. This ordering allows chains of ownership to be read chronologically from left to right.
An arrow connects Alice’s input to the output Carlos locked. Consistent with the fact that the input references the output it spends, the arrow points toward the output – not the other way around. In other words, the arrow does not represent the “flow” of money, but rather the direction in which references are made. The input of Alice’s transaction depends on the output of Carlos’ transaction, but the output of Carlos’ transaction exists independently of Alice’s transaction.
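The direction of reference can be made concrete with a toy ledger. In this sketch, which uses hypothetical transaction ids and a simplified dictionary layout, only inputs carry references; outputs know nothing about who spends them.

```python
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]  # (parent transaction id, output index)

# Hypothetical toy ledger. Each input is just a reference to a parent output.
transactions: Dict[str, Dict[str, list]] = {
    "carlos-tx": {"inputs": [], "outputs": [5]},
    "alice-tx": {"inputs": [("carlos-tx", 0)], "outputs": [5]},
}


def parents(txid: str) -> List[str]:
    """Follow the arrows: list the transactions this one's inputs point to."""
    return [ref[0] for ref in transactions[txid]["inputs"]]


alice_parents = parents("alice-tx")    # Alice's input points at Carlos' output
carlos_parents = parents("carlos-tx")  # Carlos' record does not mention Alice
```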
Chain of ownership links two of Bitcoin’s most important concepts: security and privacy. Tracing a coin’s chain of ownership to a valid issuance (coinbase) transaction authenticates it. But the same capability also enables money flows between users to be traced. Obfuscating chain of ownership therefore increases user privacy, a topic that will be discussed in the future.
A chain of ownership doesn’t by itself prevent two or more sibling inputs from spending the same parent output. This situation is known as double spending. Detecting sibling spends and ensuring that at most one of them is confirmed is the main purpose of the Bitcoin network. Nevertheless, unpublished sibling transactions held privately between two or more users can be very useful in advanced applications.
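Detecting sibling spends amounts to grouping transactions by the outputs they reference. The following is a minimal sketch under the assumption that each transaction is reduced to a list of referenced outpoints; the function name and transaction ids are hypothetical, and this is not how the network itself resolves conflicts.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]  # (parent transaction id, output index)


def find_double_spends(txs: Dict[str, List[OutPoint]]) -> Dict[OutPoint, List[str]]:
    """Map each outpoint referenced by more than one transaction to its spenders."""
    spenders: Dict[OutPoint, List[str]] = defaultdict(list)
    for txid, inputs in txs.items():
        for outpoint in inputs:
            spenders[outpoint].append(txid)
    return {op: ids for op, ids in spenders.items() if len(ids) > 1}


txs = {
    "alice-tx-1": [("carlos-tx", 0)],
    "alice-tx-2": [("carlos-tx", 0)],  # sibling: same parent output
    "bob-tx": [("dave-tx", 1)],
}
conflicts = find_double_spends(txs)
```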
A user might want to pay for an item costing less than the lowest-valued coin on hand. In this case, an output can be added to collect change. The payer ensures that the change can later be spent by locking the output to their own signature.
Sometimes the value to be tendered exceeds the value of any coin on hand. In these situations, two or more coins can be spent by adding inputs. Change can still be recovered by adding an appropriately-valued output. To prevent the arbitrary creation of new money, the cumulative value of inputs must equal or exceed the cumulative value of outputs.
Should the cumulative value of inputs exceed the cumulative value of outputs, the difference is claimed by the network as a fee. Fees in Bitcoin are implicitly defined this way. In keeping with this model, the visual language does not explicitly capture the value paid in fees. Most scenarios will ignore fees altogether unless their presence is required.
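The arithmetic behind change and fees is simple enough to sketch directly. The function name `implied_fee` and the coin values are hypothetical; values are plain integers in an unspecified unit.

```python
from typing import Iterable


def implied_fee(input_values: Iterable[int], output_values: Iterable[int]) -> int:
    """Return the fee implied by a transaction's input and output values."""
    total_in = sum(input_values)
    total_out = sum(output_values)
    if total_out > total_in:
        # Outputs may never exceed inputs; that would create new money.
        raise ValueError("outputs exceed inputs")
    return total_in - total_out


# Two coins worth 3 and 4 are spent to pay 5 and return 1 in change.
fee = implied_fee([3, 4], [5, 1])
```

Here the inputs total 7 and the outputs total 6, so the implied fee is 1 even though no output names it.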
Sometimes a payee wants to delay the confirmation of a transaction until some point in the future. Bitcoin supports such uses with time locks. A transaction bearing an unexpired time lock will be considered invalid until the time lock expires.
Bitcoin supports two forms of time lock: absolute and relative. An absolute time lock expires after a deadline. A relative time lock expires after a specific output maturity. Maturity refers to the length of time since the output was confirmed, or included in the block chain. Both forms of time lock can be expressed in units of block count or elapsed seconds.
Relative and absolute time locks differ in scope. An absolute time lock applies to the entire transaction, whereas a relative time lock applies only to a single input. The visual placement of the time lock reflects this difference, with an absolute time lock spanning the bottom of the transaction and a relative time lock just spanning the bottom of the input to which it applies.
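The difference between the two forms can be sketched in terms of block heights. This is a simplified model with hypothetical function names: it treats a lock as expired once the relevant height is reached and ignores the exact boundary rules and the seconds-based variants of the real protocol.

```python
def absolute_lock_expired(deadline_height: int, current_height: int) -> bool:
    """An absolute lock depends only on the current chain height."""
    return current_height >= deadline_height


def relative_lock_expired(required_maturity: int,
                          confirmed_height: int,
                          current_height: int) -> bool:
    """A relative lock depends on how long ago the spent output was confirmed."""
    maturity = current_height - confirmed_height
    return maturity >= required_maturity


# Absolute: applies to the whole transaction, expires at height 1000.
abs_ok = absolute_lock_expired(deadline_height=1000, current_height=990)

# Relative: applies to one input whose parent output confirmed at height 980
# and must mature for 10 blocks.
rel_ok = relative_lock_expired(required_maturity=10,
                               confirmed_height=980,
                               current_height=990)
```

At height 990 the absolute lock is still in force, while the relative lock on an output that has matured for 10 blocks has expired.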
Time Locked Output
A time lock allows a payee to delay a transaction’s confirmation, but it can’t guarantee the spendability of any referenced output. For example, nothing prevents a payer from applying a time lock to a transaction, then double spending one or more outputs in the interim. This problem can be solved with a time locked output.
The way this works is subtle and perhaps confusing. A time locked output requires any spending transaction to impose the appropriate time lock on itself. Until the time lock expires, the spending transaction will remain invalid. With no way to create a spending transaction that will be valid before the time lock expires, the output remains unspent.
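The two-step reasoning above can be expressed as a pair of checks. This sketch uses block heights and a hypothetical function `can_spend`; it models only the logic described here, not Bitcoin's script rules.

```python
def can_spend(output_required_lock: int, tx_lock: int, current_height: int) -> bool:
    """Check whether a spending transaction is valid against a time locked output."""
    # 1. The output demands that the spender impose at least this time lock.
    imposes_lock = tx_lock >= output_required_lock
    # 2. A transaction with an unexpired time lock is itself invalid.
    lock_expired = current_height >= tx_lock
    return imposes_lock and lock_expired


# The output requires a time lock of height 100; the chain is at height 50.
no_lock = can_spend(output_required_lock=100, tx_lock=0, current_height=50)
too_early = can_spend(output_required_lock=100, tx_lock=100, current_height=50)
on_time = can_spend(output_required_lock=100, tx_lock=100, current_height=100)
```

A spender who omits the lock fails the first check, and one who includes it fails the second until height 100, so no valid spend exists before then.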
An output locked to a signature doesn’t require the spender to reveal a secret (private key) but only prove knowledge of it. In certain situations, however, forcing a payee to reveal a secret as an unlocking condition can be useful. Single-use secret challenges are available to outputs secured by hash locks.
A hash lock requires a spender to solve a mathematical puzzle: given a hash value h produced by a given hash function, present the message m that generated it. A hash function transforms a message m into a hash value h deterministically. Cryptographically-secure hash functions such as SHA-256 and RIPEMD-160 resist preimage attacks, in which the message that produced a particular hash value (the “preimage”) is discovered by some means more efficient than brute-force guessing.
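The puzzle can be demonstrated with Python's standard `hashlib` module. The helper names `make_hash_lock` and `unlocks` and the sample message are hypothetical; only the SHA-256 call itself is real.

```python
import hashlib


def make_hash_lock(message: bytes) -> bytes:
    """Compute the hash value h that will be placed in the lock."""
    return hashlib.sha256(message).digest()


def unlocks(h: bytes, candidate: bytes) -> bool:
    """A candidate message unlocks the output only if it hashes to h."""
    return hashlib.sha256(candidate).digest() == h


m = b"a secret message"
h = make_hash_lock(m)

right_guess = unlocks(h, m)
wrong_guess = unlocks(h, b"not the secret")
```

Note that unlocking publishes m itself, unlike a signature lock, where the private key is never revealed.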
The main application of hash locks is to securely bind two otherwise unrelated transactions together. For example, imagine that Alice needs to pay Carol indirectly through Bob, and that none of the parties trust each other. Alice worries that Bob will just steal her payment, and Bob worries that Carol will just steal his payment.
The parties can solve their problem by using the same hash lock for both transactions. Carol begins by generating the hash value h of a secret message m. She then gives h to Alice, who uses it to lock her payment to Bob. Likewise, Bob uses h to lock his payment to Carol. Spending Bob’s payment requires Carol to publish m, also allowing Bob to spend Alice’s payment.
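The relay can be walked through in a few lines. This sketch models each payment as a dictionary carrying the shared hash value; the field names and the `claim` helper are hypothetical, and signatures are omitted to keep the focus on the shared lock.

```python
import hashlib
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# Carol generates the secret m and shares only its hash value h.
m = secrets.token_bytes(32)
h = sha256(m)

# Both payments are locked to the same hash value.
alice_to_bob = {"payer": "alice", "payee": "bob", "hash_lock": h}
bob_to_carol = {"payer": "bob", "payee": "carol", "hash_lock": h}


def claim(payment: dict, preimage: bytes) -> bool:
    """A payment can be claimed only by presenting the preimage of its lock."""
    return sha256(preimage) == payment["hash_lock"]


# Carol claims Bob's payment, which publishes m ...
carol_paid = claim(bob_to_carol, m)
# ... and Bob reuses the now-public m to claim Alice's payment.
bob_paid = claim(alice_to_bob, m)
```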
Payment relay of this sort is both contrived and insecure. However, it does lay the groundwork for much more robust and useful protocols including the Lightning Network, atomic swaps, and zero-knowledge contingent payments.
The future is uncertain. Sometimes the parties controlling an output want a lock to be compatible with one of several possible responses. This kind of flexibility is available with a conditional lock.
For example, Alice and Bob may want to place a deadline on the joint control over their business capital. For one year, spending an output requires signatures from both partners. After one year, either partner can spend unilaterally. They can accomplish these goals by securing an output with a conditional lock. One branch of the lock will be satisfied immediately given both partners’ signatures. The other branch requires only one signature, but can only be used after one year.
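Alice and Bob's two-branch lock can be sketched as a single predicate. As before, signatures are modeled as names of identities assumed to have signed validly, elapsed time is counted in days for readability, and the function name is hypothetical.

```python
from typing import Set

PARTNERS = {"alice", "bob"}
ONE_YEAR_DAYS = 365


def conditional_lock_satisfied(signers: Set[str], elapsed_days: int) -> bool:
    """Return True if either branch of the conditional lock is satisfied."""
    valid = signers & PARTNERS
    # Branch 1: both partners sign; usable immediately.
    joint_branch = len(valid) == 2
    # Branch 2: one partner signs; usable only after one year.
    solo_branch = len(valid) >= 1 and elapsed_days >= ONE_YEAR_DAYS
    return joint_branch or solo_branch


both_now = conditional_lock_satisfied({"alice", "bob"}, elapsed_days=0)
alice_early = conditional_lock_satisfied({"alice"}, elapsed_days=100)
alice_late = conditional_lock_satisfied({"alice"}, elapsed_days=365)
```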
A payer will often want a refund in the event of non-delivery of goods or services. This can be accomplished by including a conditional branch that requires the payer’s signature and a time lock.
Imagine that Alice wants to pay Bob for the preimage of a hash value, but she’s worried that Bob may never deliver it. Alice won’t proceed without a guaranteed refund after a deadline. She can get it with a conditionally-locked output. One branch pays Bob for revealing the preimage. The other refunds Alice after two days.
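A refundable payment of this kind combines a hash lock, signature locks, and a time lock. The sketch below assumes Bob claims the payment by revealing the preimage; the function name `branch_taken`, the sample secret, and the use of whole days are hypothetical simplifications.

```python
import hashlib
from typing import Optional

REFUND_DELAY_DAYS = 2


def branch_taken(h: bytes, signer: str, preimage: Optional[bytes],
                 elapsed_days: int) -> Optional[str]:
    """Return which branch of the lock a spend satisfies, or None if neither."""
    # Payment branch: Bob signs and reveals the preimage of h.
    if (signer == "bob" and preimage is not None
            and hashlib.sha256(preimage).digest() == h):
        return "payment"
    # Refund branch: Alice signs after the deadline has passed.
    if signer == "alice" and elapsed_days >= REFUND_DELAY_DAYS:
        return "refund"
    return None


secret = b"what Alice is paying for"
h = hashlib.sha256(secret).digest()

bob_delivers = branch_taken(h, "bob", secret, elapsed_days=0)
alice_too_soon = branch_taken(h, "alice", None, elapsed_days=1)
alice_refund = branch_taken(h, "alice", None, elapsed_days=2)
```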
This kind of refund can be added to almost any transaction. For example, Bob may run a security service that only co-signs outputs after performing risk-analysis steps. This requires a 2-of-2 multisignature lock. But Alice is worried that something may happen to Bob that would prevent him from signing. She solves this problem by adding a refund branch whose time lock expires after 30 days.
Sometimes the purpose of a transaction isn’t to transfer value, but rather to publish data. This use case is supported through a data output. A data output is a coin of zero face value that can never be spent. Up to 80 bytes of data may be associated with the output, as described previously.
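The constraints on a data output can be captured in a small constructor. This is an illustrative model only: the dictionary layout and the function name `make_data_output` are hypothetical, and the 80-byte limit is taken from the text above.

```python
MAX_DATA_BYTES = 80  # limit stated in the text


def make_data_output(data: bytes) -> dict:
    """Build a zero-value, unspendable output that carries published data."""
    if len(data) > MAX_DATA_BYTES:
        raise ValueError(f"data exceeds {MAX_DATA_BYTES} bytes")
    return {"value": 0, "spendable": False, "data": data}


note = make_data_output(b"hello, block chain")
```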
Nick Szabo coined the term “smart contract” in a 1996 essay:
New institutions, and new ways to formalize the relationships that make up these institutions, are now made possible by the digital revolution. I call these new contracts “smart”, because they are far more functional than their inanimate paper-based ancestors. No use of “artificial intelligence” is implied. A smart contract is a set of promises, specified in digital form, including protocols within which the parties perform on the other promises.
The examples presented here demonstrate that even the simplest Bitcoin transaction should be viewed as a smart contract. This point must be emphasized due to the common misconception that Bitcoin is too simple to support smart contracts.
On the contrary, even more sophisticated contracts can be created by composing the building blocks described here. Presenting these contracts using a simple graphical language will hopefully make them easier to understand. Future posts will explore this idea in detail.
The Noun Project supplied some of the vector graphics, which can be used under a Creative Commons license: