The Transaction Graph for Modeling Blockchain Semantics

The advent of Bitcoin paved the way for a plethora of blockchain systems supporting diverse applications beyond cryptocurrencies. Although in-depth studies of the consensus protocols as well as the privacy of blockchain transactions are available, there is no formal model of the transaction semantics that a blockchain is supposed to guarantee. In this work, we fill this gap, motivated by the observation that the semantics of transactions in blockchain systems can be captured by a directed acyclic graph. Such a transaction graph, or TDAG, generally consists of the states and the transactions as transitions between the states, together with conditions for the consistency and validity of transactions. We instantiate the TDAG model for three prominent blockchain systems: Bitcoin, Ethereum, and Hyperledger Fabric. We specify the states and transactions as well as the validity conditions of the TDAG for each one. This demonstrates the applicability of the model and formalizes the transaction-level semantics that these systems aim for.


INTRODUCTION
The success of Bitcoin [Nakamoto 2008] has sparked the development of many other blockchain systems. Whereas the first blockchains after Bitcoin (called alt-coins) resembled the cryptocurrency functionality offered by Bitcoin and mostly differed in the choice of certain parameters, Ethereum [Ethereum 2017] was the pioneer of so-called smart contract systems that support arbitrary (deterministic) computation on the blockchain. Platforms for running smart contracts are seen to be of wide-spread interest for replacing trusted parties, whether in public blockchains where participation is open to anyone or in private blockchains inside a consortium.
Blockchain systems have attracted attention not only from industry but also from academia. Many works have analyzed blockchains from different perspectives, for example, focusing on the underlying consensus protocols [Garay et al. 2015; Eyal et al. 2016], their privacy guarantees [Meiklejohn et al. 2013; Ben-Sasson et al. 2014; Ruffing and Moreno-Sanchez 2017], and many more aspects. This collection is necessarily partial; excellent surveys exist in the literature [Bonneau et al. 2015; Tschorsch and Scheuermann 2016; Armknecht et al. 2015; Narayanan et al. 2016].
What is, surprisingly, missing to date is a formal model of the semantics of a blockchain, addressing the transaction-level consistency guarantees that it aims to achieve. These guarantees are intuitive and easy to grasp in the context of Bitcoin: given a proper modeling of the mining of new coins, the overall amount of bitcoins must remain invariant. For the newer, generic, and more complex blockchains, such as Ethereum or Hyperledger Fabric, a proper model of the guarantees they provide appears necessary. For instance, such a model should allow for reasoning about whether the intuitively expected guarantees are indeed achieved. It should also model the operation of a blockchain at an appropriate level, such that the properties of a system appear concisely and differences across platforms become visible. In particular, it has to describe the criteria that determine whether a transaction that manipulates state is considered valid and consequently executed by the nodes.
Our contributions. We introduce a formal model, called the transaction graph or TDAG for short, a directed acyclic graph that models the transactions occurring on a blockchain and how they interact through states. In a nutshell, a TDAG is a graph consisting of transactions that link states to each other. Each transaction may consume, observe, or produce states, and occurs only with respect to an external input that triggers the transaction. The model abstracts the transaction validation into a predicate that can be evaluated locally in the graph, in the sense that validation only considers the relevant states; this corresponds to how many blockchains work, during the process of transaction validation and consensus, which must be efficient and based on local state. The TDAG is a generic model to encode properties expected from every blockchain system, such as notions of validity and consistency, and for characterizing the invariants that must be enforced in a blockchain.
We instantiate the TDAG model for three different prominent blockchains: Bitcoin, Ethereum, and Hyperledger Fabric. For each system, we formally define the states and transactions of the TDAG, specify the notion of consistency, and describe the validity of transactions. This shows the broad applicability of our model, and results in an abstract description of these real-world systems.
Related work. Atzei et al. provided a formal transaction model for Bitcoin [Atzei et al. 2018]. While their model covers certain aspects of Bitcoin, like scripts and multi-signature, in more detail than ours, it does not lend itself to modeling other blockchain systems or to comparing them. The TDAG can be seen as a refinement of the precedence graph (or serialization graph) from database concurrency theory [Elmasri and Navathe 2011], which relates transactions with conflicting data access. The TDAG in addition contains states as vertices, as one goal of the TDAG (besides formalizing conflicts) is to make statements about the consistency of the states.

TRANSACTION GRAPHS
This section introduces the transaction directed acyclic graph, abbreviated transaction graph or TDAG, for representing the semantics of a blockchain. It models the context held by the blockchain and its evolution through transactions that obey validation rules.
We start by introducing some notation. Let E ⊆ X × Y be a relation between sets X and Y. For the predicate (x, y) ∈ E, we also write xEy. Furthermore, we denote the set {y : xEy} by xE! and its size by |xE!|. Analogously, !Ey denotes the set {x : xEy} and |!Ey| its size.

Definition
A transaction graph or TDAG is a directed acyclic graph G = (V, E). The vertices V can be partitioned into states S and witnesses W, that is, V = S ∪ W. At a high level, the edges E represent transitions between states. More precisely, an edge e ∈ E represents the relation between a state and a witness in the context of a transaction, and an edge may connect a state to a witness or vice versa.
The edges can be partitioned into consuming, observing, and producing edges, denoted E_C, E_O, and E_P, respectively, such that E = E_C ∪ E_O ∪ E_P. We now introduce the elements of G informally.
States ○. The first type of vertex, s ∈ S, denotes an atomic state represented by the blockchain and is depicted by a circle ○. It models an individual asset, a digital coin, some coins controlled by a particular cryptographic key, a variable of a smart contract at a moment in time, and so on. The complete context of the blockchain consists of all states that exist at a particular time. A state results from a transaction on the blockchain and can transition to other states through a transaction.
There is a special genesis state s_g ∈ S, which represents the initial state of the blockchain. There is a single genesis state by design, because the blockchain system can be initialized exactly once.
Witnesses □. The second kind of vertex, w ∈ W, denotes a witness in the context of a transaction and is depicted by a rectangle □. It represents any data included in a transaction that is required for the transaction to be valid according to the validation rules of the blockchain system. Every transaction of the blockchain system contains exactly one witness.
Consuming edges ○ −−−→ □. A consuming edge e ∈ E_C connects a state to a witness and models that the state ○ is consumed by the transaction that involves witness □, i.e., the unique transaction that corresponds to □. A state can be consumed at most once, i.e., it is not available for being consumed by another transaction once it has been consumed. Consuming a state means that the state is "updated" or "overwritten" by the transaction.
Observing edges ○ −−→ □. An observing edge e ∈ E_O also connects a state to a witness; it models that the state enters into the transaction represented by the witness, but that it remains available for consumption by another transaction. A state can be observed by many transactions, independently of whether it is also consumed or not. Intuitively, a transaction that observes a state "reads" it.
Producing edges □ −−−→ ○. A producing edge e ∈ E_P connects a witness to a state, and denotes that the state is created or produced by the transaction corresponding to the witness. Every state apart from the genesis state is produced exactly once.
With these notions, a transaction represents a transition from one state, or from some set of states, in a TDAG to another set of states according to the blockchain system. The transaction is linked to a unique witness, which makes it "valid" as described later. We say that a transaction has input states that are consumed or observed by the transaction and output states that are produced by the transaction. More formally, a transaction is also a weakly connected DAG, i.e., a DAG that is connected as a graph.
Definition 2.1 (Transaction). A weakly connected DAG T = (V, E) with a set of input states S_I, a set of output states S_O, and a witness w is called a transaction whenever
- every input state in S_I is a source (has indegree zero);
- every output state in S_O is a sink (has outdegree zero);
- V = S_I ∪ S_O ∪ {w}; and
- every edge in E is either a consuming edge or an observing edge and links some input state s_i ∈ S_I to w, or it is a producing edge and links w to some output state s_o ∈ S_O.
As the name suggests, a transaction graph consists of many transactions.
Definition 2.2 (TDAG). A transaction graph (TDAG) is a directed unweighted graph G = (V, E), where V = S ∪ W are the vertices and E = E_C ∪ E_O ∪ E_P are the edges. The set S denotes the states and contains a special state s_g called genesis. The set W denotes the witnesses. Edges are partitioned into three subsets, where E_C ⊆ S × W denotes consuming edges, E_O ⊆ S × W denotes observing edges, and E_P ⊆ W × S denotes the producing edges.
It satisfies the following conditions:
(1) s_g does not have any producing or observing edges and it has a single consuming edge, i.e., ∄w ∈ W : wE_P s_g ∨ s_g E_O w, and ∃!w ∈ W : s_g E_C w.
(2) Every state except for the genesis state has exactly one producing edge, i.e., ∀s ∈ S \ {s_g} ∃!w ∈ W : wE_P s.
(3) Every state except for the genesis state may have multiple successors, but at most one among them is connected with a consuming edge, i.e., ∀s ∈ S : |sE_C!| ≤ 1.
(4) G is weakly connected.
(5) G has no cycles.
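The five conditions of Definition 2.2 can be checked mechanically. The following Python sketch shows one possible way to do so. The representation of edges as sets of pairs, the function name is_tdag, and the label "s_g" for the genesis state are assumptions made for illustration; they are not part of the model.

```python
from collections import defaultdict

GENESIS = "s_g"  # assumed label for the genesis state


def is_tdag(states, witnesses, consuming, observing, producing):
    """Check conditions (1)-(5) of Definition 2.2.

    consuming, observing: sets of (state, witness) pairs;
    producing: set of (witness, state) pairs.
    All edge endpoints are assumed to be listed in states or witnesses.
    """
    produced_by = defaultdict(set)  # state -> witnesses that produce it
    consumed_by = defaultdict(set)  # state -> witnesses that consume it
    for w, s in producing:
        produced_by[s].add(w)
    for s, w in consuming:
        consumed_by[s].add(w)

    # (1) genesis: no producing or observing edge, exactly one consuming edge
    if produced_by[GENESIS] or any(s == GENESIS for s, _ in observing):
        return False
    if len(consumed_by[GENESIS]) != 1:
        return False
    # (2) every other state has exactly one producing edge
    if any(len(produced_by[s]) != 1 for s in states - {GENESIS}):
        return False
    # (3) every state has at most one consuming edge
    if any(len(consumed_by[s]) > 1 for s in states):
        return False

    edges = set(consuming) | set(observing) | set(producing)
    vertices = set(states) | set(witnesses)

    # (4) weak connectivity: traverse the graph ignoring edge direction
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, stack = {GENESIS}, [GENESIS]
    while stack:
        for v in neighbours[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    if seen != vertices:
        return False

    # (5) acyclicity: Kahn's algorithm removes every vertex iff no cycle exists
    indegree = {v: 0 for v in vertices}
    successors = defaultdict(list)
    for u, v in edges:
        indegree[v] += 1
        successors[u].append(v)
    ready = [v for v in vertices if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(vertices)
```

Conditions (1) to (3) only inspect the edges incident to a single state, whereas (4) and (5) are global properties of the graph.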
The consuming and observing edges incident to a state are also called the outgoing edges of that state. Similarly, the consuming and observing edges incident to a witness are called incoming edges of that witness. The producing edges of a witness are outgoing edges of the witness. There is no order among the edges incident to a vertex in a TDAG. The unconsumed states in a TDAG are the states without an incident consuming edge.
In a TDAG every witness w corresponds to a unique transaction t(w). The next definition follows naturally and is easily seen to be equivalent to Definition 2.1.
Definition 2.3 (Transaction in a TDAG). Given a TDAG G = (S ∪ W, E) and a witness w ∈ W, the transaction with witness w is the unique subgraph t = (S′ ∪ {w}, E′) ⊆ G, where
- w ∈ W is the witness of the transaction;
- S′ is the set of states connected to w, i.e., S′ = {s ∈ S : sE_C w ∨ sE_O w ∨ wE_P s}; and
- E′ are the edges with both endpoints in S′ ∪ {w}.
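As an illustration of Definition 2.3, the following Python sketch collects the states of t(w) from the three edge sets of a TDAG. The function name transaction_of, the dictionary layout of the result, and the toy graph are hypothetical choices made for this example only.

```python
def transaction_of(w, consuming, observing, producing):
    """Collect the states of t(w) following Definition 2.3.

    consuming, observing: sets of (state, witness) pairs;
    producing: set of (witness, state) pairs.
    """
    consumed = {s for s, x in consuming if x == w}
    observed = {s for s, x in observing if x == w}
    produced = {s for x, s in producing if x == w}
    return {
        "witness": w,
        "consumed": consumed,
        "observed": observed,
        "inputs": consumed | observed,  # states consumed or observed by t(w)
        "outputs": produced,            # states produced by t(w)
    }


# Hypothetical toy graph: w0 consumes the genesis state, w1 consumes s0.
C = {("s_g", "w0"), ("s0", "w1")}
O = set()
P = {("w0", "s0"), ("w1", "s1"), ("w1", "s2")}

t1 = transaction_of("w1", C, O, P)
print(sorted(t1["inputs"]), sorted(t1["outputs"]))  # ['s0'] ['s1', 's2']
```

Since every edge of a TDAG joins a state and a witness, the edges E′ of t(w) are exactly the edges incident to w, so the three sets above determine the subgraph.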
The input states of t(w) are the states being observed or consumed by t(w), and the output states of t(w) are the states being produced by t(w). With this terminology a transaction t ⊆ G can have one of the following five types, which depend mostly on the number of input and output states:

INIT. A unique initialization transaction exists in every non-empty TDAG, consisting of a consuming edge that links the genesis state to a witness w and a set of producing edges that link w to a set of states.
SISO. A single-input, single-output transaction consists of one consuming edge that links one input state to a witness w and one producing edge that links w to an output state.
SIMO. A single-input, multi-output transaction consists of one consuming edge that links an input state s to a witness w, and a set of producing edges that link w to a set of output states.
MISO. A multi-input, single-output transaction contains a set of multiple consuming and observing edges that link distinct input states to a witness w and one producing edge that links w to an output state.
MIMO. A multi-input, multi-output transaction contains a set of multiple consuming and observing edges that link distinct input states to a witness w, and a set of producing edges that link w to a set of output states.
Fig. 1 shows the possible transaction types in a TDAG. The initialization transaction plays a special role; it represents the creation of the blockchain, which typically creates all assets represented by the states. Modeling initialization through a specific transaction is a deliberate design choice that will become clear later, in the context of transaction validation. The other types represent "ordinary" transactions that consume (and possibly observe) one or more states and produce one or more states. We note that SISO and SIMO transactions have a single input state and have no observing edges. This models that a transaction must update or overwrite at least one state for its inclusion in the blockchain to make sense, as simple read queries can be handled by inspecting the blockchain.
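The five types can be told apart from the input and output states of a transaction alone. The Python sketch below is one way to express this classification. The function name transaction_type and the genesis label "s_g" are assumptions for illustration, and the error cases reflect the remark that an ordinary transaction consumes at least one state and produces at least one state.

```python
def transaction_type(consumed, observed, produced, genesis="s_g"):
    """Classify a transaction as INIT, SISO, SIMO, MISO, or MIMO.

    consumed, observed, produced: sets of state labels attached to one witness.
    """
    if not consumed:
        raise ValueError("a transaction must consume at least one state")
    if not produced:
        raise ValueError("a transaction must produce at least one state")
    if genesis in consumed:
        return "INIT"
    inputs = set(consumed) | set(observed)
    in_part = "SI" if len(inputs) == 1 else "MI"
    out_part = "SO" if len(produced) == 1 else "MO"
    return in_part + out_part


print(transaction_type({"s0"}, set(), {"s1", "s2"}))  # SIMO
```

A transaction with one consumed and one observed state has two input states and is therefore classified as MISO or MIMO, in line with the note that single-input transactions have no observing edges.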
For the moment, it suffices to say that the initialization transaction typically creates all "assets" modeled by the blockchain or the "states" that it holds, setting them to a predefined value. This allows a subsequent transaction to be linked only with the state to which it refers and that it consumes. Otherwise, all transactions that modify any state would be linked from the genesis state (with a consuming edge), contrary to the condition that every state has at most one consuming edge. We consider this an important property of the TDAG model. A further argument for modeling only one initialization transaction goes as follows. If there were multiple INIT transactions, then it would not be easily possible to assess whether one INIT transaction is "valid" without looking also at the other ones. For instance, an INIT transaction that creates a new asset is only valid if no other INIT transaction has created the same asset beforehand. Therefore, we purposely restrict the model so that it has a single initialization transaction for simplicity, but without loss of generality, as this unique initialization transaction can create as many states as required throughout the lifetime of the blockchain.
Fig. 2 shows an illustrative example of a TDAG modeling a Bitcoin execution with four transactions. First, t_0(w_0) represents the creation of the Bitcoin blockchain by minting all available bitcoins into a Bitcoin address containing unmined bitcoins (s_0). Here, w_0 represents the Bitcoin creation rules. Second, t_1(w_1) represents a transaction that transfers some unmined bitcoins (s_0) to the Bitcoin address of a user u that successfully mined the first Bitcoin block (s_2); t_1(w_1) saves the remaining unmined bitcoins (s_1) for subsequent block creations. Here, w_1 represents proof-of-work in the block mined by u. Third, t_2(w_2) represents a transaction where u transfers some of her bitcoins (s_2) to another Bitcoin address (s_4). The associated transaction fee is modeled as another address (s_3). Here, w_2 represents the authorization of the transaction in the form of a digital signature by u. Finally, t_3(w_3) represents a transaction that rewards a user for creating a Bitcoin block containing t_2(w_2). In that sense, t_3(w_3) is similar to t_1(w_1), with the difference that t_3(w_3) also captures the fact that the user receives the fees associated with t_2(w_2).
We note that this example does not contain any observing edge. This results from the fact that read-only operations are not supported in Bitcoin.
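The example can also be written down as edge sets, which makes the unconsumed states easy to compute. The Python sketch below follows the textual description of the four transactions. The inputs of the last transaction (unmined bitcoins and the fee state) are inferred from that description, and the labels s5 and s6 for its two outputs are assumptions, since the text does not name them.

```python
# Consuming edges (state, witness) of the four transactions.
consuming = {
    ("s_g", "w0"),               # t0: INIT consumes the genesis state
    ("s0", "w1"),                # t1: coinbase consumes the unmined bitcoins
    ("s2", "w2"),                # t2: user u spends her bitcoins
    ("s1", "w3"), ("s3", "w3"),  # t3: coinbase consumes unmined bitcoins and the fee
}
# Producing edges (witness, state).
producing = {
    ("w0", "s0"),                # all bitcoins, still unmined
    ("w1", "s1"), ("w1", "s2"),  # remaining unmined bitcoins, reward of u
    ("w2", "s3"), ("w2", "s4"),  # fee, payment to another address
    ("w3", "s5"), ("w3", "s6"),  # assumed labels for the outputs of t3
}
observing = set()                # the example has no observing edges

states = {s for s, _ in consuming} | {s for _, s in producing}
unconsumed = states - {s for s, _ in consuming}
print(sorted(unconsumed))        # ['s4', 's5', 's6']
```

The unconsumed states play the role of the unspent outputs that later transactions may consume.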

Conflicts and validity
A central goal of blockchain systems is to prevent conflicts among transactions and to ensure validity for all transactions, as a result of a consensus process executed among the participating entities. The TDAG model permits a closer look at the semantics of conflicts and validity; modeling consensus is outside the scope of this work.
Intuitively, a conflict in a blockchain underlying a cryptocurrency such as Bitcoin occurs in an attempt to "double-spend" money. According to the example describing Bitcoin from before (and expanded in Section 3), assume that a state s in a TDAG corresponds to bitcoins held by a particular Bitcoin address. Two transactions that double-spend such bitcoins map to two transactions that both consume s. But every state in a TDAG can be consumed at most once, hence, the TDAG model already prevents this form of conflict.
In blockchains for arbitrary smart contracts, a conflict corresponds to a situation where generic validation rules for transactions are violated. Such rules may refer to coins (such as an amount of Ether in Ethereum) or to other assets modeled in the blockchain. The TDAG model for these blockchains also imposes that every state can be consumed at most once.
When one considers an arbitrary set of transactions (not arising from the same transaction graph), such as transactions that have merely been proposed and are not executed on the blockchain yet, then conflicts among them could exist. This is the case in a cryptocurrency like Bitcoin when a miner searches for the next block, for example, and two transactions might be floating around in the network that both attempt to consume the same state s. Similarly, conflicting transactions exist in smart-contract platforms during the process of reaching consensus on a valid blockchain execution.
We now consider a set of transactions (in the form of a graph) and define what it means for them to be conflict-free.
Definition 2.4 (Conflict-freedom). Consider a DAG T = (S_T ∪ W_T, E_T) with states S_T, witnesses W_T, producing edges E_P ⊆ E_T, and consuming edges E_C ⊆ E_T that contains a transaction for every witness w ∈ W_T. We say that T has no conflicts if every state has at most one producing edge and at most one consuming edge, i.e., ∀s ∈ S_T : |!E_P s| ≤ 1 ∧ |sE_C!| ≤ 1.
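Conflict-freedom is a simple counting condition on the producing and consuming edges. A minimal Python sketch, assuming the same pair-based edge representation as in the definition (the function name is_conflict_free is hypothetical):

```python
from collections import Counter


def is_conflict_free(consuming, producing):
    """Check Definition 2.4: at most one producing and one consuming edge per state.

    consuming: set of (state, witness) pairs;
    producing: set of (witness, state) pairs.
    """
    times_consumed = Counter(s for s, _ in consuming)
    times_produced = Counter(s for _, s in producing)
    return (all(n <= 1 for n in times_consumed.values())
            and all(n <= 1 for n in times_produced.values()))


# Two proposed transactions that both try to consume state s: a double-spend.
double_spend = {("s", "w1"), ("s", "w2")}
print(is_conflict_free(double_spend, {("w1", "a"), ("w2", "b")}))  # False
```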
A conflict-free set of transactions can be added to a TDAG. To ensure that its addition does not cause any conflicts with the TDAG only simple and local conditions have to be verified.
Definition 2.5 (Adding transactions to a TDAG). Consider a TDAG G = (S ∪ W, E) and a DAG T = (S_T ∪ W_T, E_T) containing a conflict-free set of transactions such that [...]. Then the result of adding T to G is the DAG Ḡ = (S̄ ∪ W̄, Ē), with S̄ = S ∪ S_T, W̄ = W ∪ W_T, and Ē = E ∪ E_T.
THEOREM 2.6. When a conflict-free set of transactions T = (S_T ∪ W_T, E_T) is added to a TDAG G = (S ∪ W, E), then the resulting graph Ḡ = (S̄ ∪ W̄, Ē) is also a TDAG.
PROOF. Here we show that Ḡ satisfies the conditions to be a TDAG.
(1) The genesis state must not have producing or observing edges and it must have a single consuming edge. This condition is fulfilled since G is a TDAG and T does not contain the genesis state if it is already consumed in G.
(2) Every state, other than genesis, must have a single producing edge. This condition is fulfilled in G and in T by definition. Now, the addition of T to G does not create new edges. Therefore, this condition holds also in Ḡ.
(3) Every state, other than the genesis, can have multiple successors, but at most one among them is connected with a consuming edge. It is easy to see that Ḡ fulfills this condition by an argument similar to the previous one.
(4) The graph must be weakly connected. Note that by the definition of TDAG, each vertex v ∈ S ∪ W is weakly connected to every unconsumed state in G. Moreover, every vertex v′ in S_T ∪ W_T is weakly connected to at least one input state of T. Now, as the set of input states in T is a subset of the unconsumed states in G, it follows that Ḡ is weakly connected.
(5) The graph must not have cycles. According to the assumptions on T and because G is a DAG, and through the way in which Ḡ is constructed, it is easy to see that Ḡ has no cycles.
We now introduce the notion of validity for transactions in a TDAG, which models the fact that on a blockchain only "valid" transactions are executed. As an important design choice of the model, the validity of a transaction in a TDAG must be decidable locally, that is, from the transaction alone, considering only its input states, the witness, and the output states. To capture this, we assume that the blockchain context defines a boolean validation predicate P(·) on the space of all transactions.
Definition 2.7 (Validity). Let t be a transaction in a TDAG G. Then t is valid whenever P(t) = TRUE. Furthermore, G is a valid transaction graph if all transactions in G are valid.
Combined with the locally checkable conditions for adding transactions to a TDAG, the fact that the validity of a transaction is locally decidable defines, in an influential way, how many blockchain systems work during consensus, validation, and execution of new transactions. The only steps needed for validation are to ensure the validity predicate of a candidate transaction plus the checks according to Definition 2.5 involving the states to which the transaction refers.
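The local nature of these checks can be illustrated by a small Python sketch that decides whether a single candidate transaction may be appended. The concrete conditions used here (the witness is new, consumed inputs are unconsumed states of the graph, observed inputs exist, outputs are fresh) are assumptions chosen for illustration and are not a restatement of Definition 2.5; the function name can_add and the dictionary layout are likewise hypothetical.

```python
def can_add(tx, states, consumed, witnesses, predicate):
    """Local checks before appending one candidate transaction to a TDAG.

    tx: dict with keys "witness", "consumed", "observed", "produced" (sets of labels).
    states, consumed, witnesses: sets describing the current graph.
    predicate: the validation predicate P, a function from tx to bool.
    The individual conditions below are illustrative assumptions.
    """
    if tx["witness"] in witnesses:
        return False                        # the witness must be new
    if not tx["consumed"] <= states - consumed:
        return False                        # consumed inputs must be unconsumed states
    if not tx["observed"] <= states:
        return False                        # observed inputs must exist
    if tx["produced"] & states:
        return False                        # outputs must be new states
    return bool(predicate(tx))              # validity predicate P(t)


graph_states = {"s_g", "s0"}
graph_consumed = {"s_g"}
graph_witnesses = {"w0"}
candidate = {"witness": "w1", "consumed": {"s0"}, "observed": set(), "produced": {"s1"}}
print(can_add(candidate, graph_states, graph_consumed, graph_witnesses, lambda t: True))  # True
```

Every check refers only to the candidate transaction and the states it names, so no traversal of the whole graph is needed.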
Transaction validation also relies on the property that all states in the TDAG are distinct. In a typical blockchain, the validation function relies on a cryptographic hash of the states to which it refers; this directly ensures uniqueness. For example, consider an execution of a smart contract that holds state on the blockchain in the form of a local variable var. The contract may update var multiple times, and it may write the same value to var more than once. To make the resulting states in the TDAG different, the model will usually include a version number in the state that makes each assignment unique.
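The role of the version number can be seen in a short Python sketch. The encoding of a state as a string of contract name, variable name, value, and version, as well as the use of SHA-256, are assumptions for illustration; the model only requires that distinct assignments yield distinct states.

```python
import hashlib


def state_id(contract: str, var: str, value: int, version: int) -> str:
    """Derive an identifier for one assignment to a contract variable."""
    data = f"{contract}|{var}|{value}|{version}".encode()
    return hashlib.sha256(data).hexdigest()


# The contract writes the same value to var twice.
first = state_id("contract-A", "var", 7, version=1)
second = state_id("contract-A", "var", 7, version=2)
print(first != second)  # True
```

Without the version field both assignments would map to the same identifier, and the second state would be indistinguishable from the first.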
At this point, let us review our design choice of a single INIT transaction. Using a single transaction to create all assets represented by the states makes it possible to check the validity of the initialization of the blockchain locally, and it preserves the locally checkable conditions for further transactions consuming those states.

Composition of transaction graphs
In Bitcoin (and many other cryptocurrencies), all the miners participate in the consensus protocol to decide about the validity of every single transaction. The permissionless nature of this consensus mechanism heavily limits the transaction throughput. One alternative to overcome this scalability issue is called sharding and consists of organizing disjoint sets of miners, letting each of these sets reach consensus about a subset of the transactions. The composition of those subsets of transactions is then required to form the blockchain.
In the following, we describe the composition of transaction graphs, which states the conditions under which two TDAGs can be merged into a single one. One may then reason about their consistency and validity in a unified manner. Composition of transaction graphs can be used to model the goal of protocols for cross-chain transactions, namely that the combined state of both chains achieves the expected consistency properties.
Definition 2.8 (TDAG composition). Consider two TDAGs G := (S ∪ W, E) and G′ := [...] and the output states are the union of output states from t(w) and t′(w′). Then, the composition of G and G′ is the TDAG [...].
THEOREM 2.9 (COMPOSITION OF TWO TDAGS INTO ONE TDAG). The composition of two TDAGs G and G′ results in a graph Ĝ, which is also a TDAG.
PROOF. Here we show that Ĝ satisfies the conditions to be a TDAG.
(1) The genesis state must not have producing or observing edges and it must have a single consuming edge. This condition is fulfilled by our definition of the INIT transaction t̂(ŵ).
(2) Every state, other than genesis, must have a single producing edge. As G and G′ are two TDAGs, it is easy to see that each state in T_G \ {t(w)} and T_G′ \ {t′(w′)} has a single producing edge. Moreover, by definition of the INIT transaction, each output state in t̂(ŵ) has a single producing edge.
(3) Every state, other than the genesis, can have multiple successors, but at most one among them is connected with a consuming edge. It is easy to see that Ĝ fulfills this condition along the lines of the previous argument.
(4) The graph must be weakly connected.
(5) The graph must not have cycles. T_G \ {t(w)} and T_G′ \ {t′(w′)} are acyclic by definition, as G and G′ are two TDAGs. Moreover, the addition of t̂(ŵ) clearly does not introduce any cycle.

APPLICATIONS
In this section, we describe how executions of different blockchain systems are modeled by transaction graphs. We cover three prominent blockchains: Bitcoin, Ethereum, and Hyperledger Fabric (HLF). They differ in how they store assets in their state. Bitcoin, for example, does not have state "variables" but maintains an asset only in the context of the transaction that created it. Ethereum, on the other hand, uses variables and accounts for its state. The data model in HLF is a key-value store (KVS), which can be mapped to a local database on each node. Due to lack of space, this section only gives a short overview; more details appear in the full version. Throughout this section, we denote by H a cryptographic, collision-resistant hash function that takes as input a bit-string x ∈ {0, 1}* of arbitrary length and returns a fixed-length string y ∈ {0, 1}^l, written y ← H(x).

Bitcoin
Since Bitcoin (bitcoin.org) is the prototype of all blockchain systems, there are many publicly available descriptions [Nakamoto 2008;Antonopoulos 2014] and we keep the background short. Likewise, the discussion here applies to all alt-coins patterned after Bitcoin.
Bitcoin combines transaction validation, coin mining, and agreement on the ledger with the "Nakamoto protocol" that uses proof-of-work and ensures consensus. A block in Bitcoin can hold two types of transactions:
- A coinbase transaction transfers yet unmined bitcoins to a Bitcoin address as chosen by the miner of the corresponding block, as a reward for creating the block. This transaction is valid if (i) it transfers a number of bitcoins according to the height of the block to a Bitcoin address, and (ii) it is accompanied by the solution to the proof-of-work puzzle for successful mining of the block.
- A regular transaction transfers bitcoins from a set of Bitcoin (input) addresses to another set of Bitcoin (output) addresses. It also incurs a fee, defined as the difference between the bitcoin amounts in the input and output, which is assigned to the miner of the block in which the transaction appears. A regular transaction is valid if it includes a confirmation for each input for the amount and output and if it does not create new bitcoins.
Bitcoin value exists in the blockchain in the form of unspent transaction output, often abbreviated UTXO, which has been assigned to an address, representing a digital-signature public key. This value is controlled by the holder of the corresponding private key. It can be spent and transferred to another address by signing a transaction with the private key.
In the TDAG modeling Bitcoin, we let every state be a tuple of the form
(addr, val, hash, height),
where addr denotes an address, val denotes the amount of bitcoins held in this state, hash is the cryptographic hash of other states (whose UTXO is transferred by the transaction), and height denotes the index of the block in which the state was produced.
In contrast to the Bitcoin code, we model transaction fees and unmined bitcoins as held by or associated with an (imaginary) address. This allows a coherent model for the TDAG. Thus, the state resulting from the special INIT transaction is fixed to (ADDR_0, 21M, H(∅), 0), holding all 21M bitcoins that will ever exist.
The form of a witness depends on the transaction type: The witness for a coinbase transaction is the solution for the proof-of-work to assign the bitcoins to the address designated by the miner. For a regular transaction, the witness consists of a set of confirmations for the transfer of bitcoin, in the form of a digital signature for each UTXO, over the input and output addresses of the transfer. Finally, the INIT transaction does not require any witness.
The TDAG for Bitcoin contains producing and consuming edges but no observing edges. For a coinbase transaction, the input states are the unconsumed state of unmined bitcoins and the fee states for the transactions included in the mined block. One producing edge leads to a state for collecting the fees and the mining reward, another one to a state containing the remaining unmined bitcoins. Its witness is the mining proof. For a regular transaction, the input states are the unconsumed states representing the transaction inputs and the produced output states correspond to the transaction's output addresses. The witness holds a set of confirmations (digital signatures), confirming for each input state the transfer of some bitcoins to the corresponding output addresses.
The transaction predicate incorporates the validation rules of Bitcoin, as expressed in the states, witnesses, and transactions of the TDAG.
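One ingredient of such a predicate can be sketched in Python. Because fees and unmined bitcoins are modeled as states, a transaction in this TDAG neither creates nor destroys bitcoins, so the values of its input and output states must add up to the same total. The sketch below checks this together with the hash link from outputs to inputs. The class name BtcState, the serialization used inside H, the address labels, and the reward of 50 bitcoins are illustrative assumptions; signature and proof-of-work verification are deliberately omitted.

```python
import hashlib
from typing import NamedTuple, Sequence

SATOSHI = 100_000_000  # 1 bitcoin = 10^8 satoshi


class BtcState(NamedTuple):
    addr: str    # address holding the value
    val: int     # amount in satoshi
    hash: str    # hash of the states consumed by the producing transaction
    height: int  # index of the block in which the state was produced


def H(states: Sequence[BtcState]) -> str:
    """Hash a collection of states (assumed serialization: repr of the sorted list)."""
    return hashlib.sha256(repr(sorted(states)).encode()).hexdigest()


def preserves_value(inputs: Sequence[BtcState], outputs: Sequence[BtcState]) -> bool:
    """Check that outputs carry exactly the input value and link to the inputs."""
    link = H(inputs)
    same_total = sum(s.val for s in inputs) == sum(s.val for s in outputs)
    return same_total and all(s.hash == link for s in outputs)


# Output of INIT, followed by a coinbase transaction with an illustrative reward.
s0 = BtcState("ADDR_0", 21_000_000 * SATOSHI, H([]), 0)
s1 = BtcState("ADDR_0", s0.val - 50 * SATOSHI, H([s0]), 1)  # remaining unmined bitcoins
s2 = BtcState("ADDR_u", 50 * SATOSHI, H([s0]), 1)           # reward for miner u
print(preserves_value([s0], [s1, s2]))  # True
```

A full predicate would additionally verify the witness, that is, the proof-of-work for a coinbase transaction or the digital signatures for a regular transaction.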
With these definitions, one can then show the intuitive result that except with negligible probability, every (legal) execution of Bitcoin, considering only bitcoin transactions that are "deep enough" in the blockchain (e.g., six blocks deep) [Garay et al. 2015] gives rise to a TDAG constructed like this. The formal analysis of this result exploits that the DAG formed by the hash-function applications among states has no cycles, and therefore satisfies the properties of a TDAG.

Ethereum
Ethereum [Ethereum 2017] is the most prominent public blockchain and cryptocurrency supporting generic smart contracts today (ethereum.org). In Ethereum there exist two types of accounts, called externally owned accounts and contract accounts. Externally owned accounts largely resemble the accounts of other cryptocurrencies such as Bitcoin, in which users maintain their currency balance in Ether, owned by them. But the main innovation of Ethereum lies in contract accounts, each of which represents a smart contract (an arbitrary piece of code in the platform-specific language) and executes a set of instructions upon receiving suitable input. A contract account also holds and controls its own Ether balance and specifies a gas price, which determines the cost of executing its code for anyone who invokes the contract.
Ethereum supports several types of transactions. First, a transaction in Ethereum can be used to transfer Ether between two externally owned accounts. This type of transaction is like the exchange of coins in other cryptocurrencies. Second, a transaction can be used to create a contract with the code of the contract and an externally owned account as inputs. It outputs a contract account with the information required to initialize the implemented code (e.g., the inputs for the init function). Finally, a transaction can be used to invoke an existing contract on the blockchain.
An Ethereum transaction includes as input the sender's address (an externally owned account), a recipient address (another account), a transaction value to be transferred from the sender's address to the recipient, some arguments with parameters for the contract, and a gas limit, specifying a maximum price for the execution. A contract may also call functions of other contracts; however, this will not give rise to new transactions, as these calls take place in the context of the original transaction.
To model an Ethereum execution as a TDAG, we let each state consist of a tuple
(addr, account-type, code, local-state, gas-price, val).
Here, addr denotes the account address that produced the state, account-type determines whether this is a state of an external account or a contract account, code is a hash of the smart contract's code, local-state denotes collectively all variables held by the contract, gas-price is the price for executing transactions with this contract, and val is the Ether balance held by the account after the execution that produced the state. If account-type specifies an externally owned account, then the smart contract is the fixed logic to validate payments from such accounts.
There is also a genesis state that models the creation of an Ethereum blockchain. In contrast to Bitcoin, there is currently no bound on the amount of Ether that will exist in the public Ethereum blockchain; the creation of new Ether is therefore subsumed into the mining operation and its validation.
A transaction in the TDAG is determined by the witness. It corresponds to an invocation of a smart contract and contains a gas limit and regular input arguments that validate the transaction. For instance, these arguments must contain a digital signature valid under the public key associated to the invoking external account that runs the transaction.
The transaction contains the state of the invoking account and the state of the contract as input states, with consuming edges to the witness. It also produces two states, an updated state of the invoking account and an updated state of the contract, as resulting from running the contract with the given gas limit and input arguments. If the contract calls functions of other contracts and they modify their state, then the states representing these contracts are also part of the transaction in the TDAG (as input states and output states). The validation predicate simply executes the code.
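The two-in, two-out shape of such a transaction can be illustrated with a minimal Python sketch. The class EthState, the function invoke, and the callback run_code are hypothetical names introduced only for this illustration; gas accounting and calls to further contracts are deliberately omitted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EthState:
    """One TDAG state: (addr, account-type, code, local-state, gas-price, val)."""
    addr: str
    account_type: str      # "external" or "contract"
    code: str              # hash of the contract code
    local_state: tuple     # contract variables as (name, value) pairs
    gas_price: int
    val: int               # Ether balance after the producing execution


def invoke(invoker, contract, value, run_code):
    """Consume two input states and return their two successor states.

    run_code is a hypothetical stand-in for executing the contract: it maps
    (local_state, value) to the new local_state. Gas accounting is omitted.
    """
    if not 0 <= value <= invoker.val:
        raise ValueError("invoker cannot cover the transferred value")
    new_invoker = replace(invoker, val=invoker.val - value)
    new_contract = replace(
        contract,
        local_state=run_code(contract.local_state, value),
        val=contract.val + value,
    )
    return new_invoker, new_contract
```

Because the states are immutable, the input states remain in the graph as consumed vertices, while the returned objects play the role of the newly produced states.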
For mining new Ether, running transactions, and collecting the corresponding fees, similar states and validation logic as in the TDAG model of Bitcoin are added. Given these notions one can show that every (legal) execution of Ethereum, considering as in Bitcoin only those transactions that are deep enough in the blockchain, produces a valid TDAG.

Hyperledger Fabric
Hyperledger Fabric (www.hyperledger.org/projects/fabric), or HLF for short, is a permissioned blockchain framework, designed to support modular implementations of different components, including its consensus protocol, membership provider, and cryptography library [Cachin 2016]. The nodes executing the HLF blockchain are called peers.
An instance of HLF may contain multiple channels that may run on different sets of peers, where each channel operates like a blockchain system independent of the others, apart from using some of the same code infrastructure, ordering protocol, and other components. We therefore consider only one channel here, modeling one blockchain.
On a channel, a configuration transaction (configtx) sets the initial values used for transaction processing, such as the credentials of the peers or organizations controlling the channel, the implementation of its ordering service, and so on. Once a channel has been prepared like this, it is ready to execute operations on its peers. Transactions in HLF are executed by smart contracts called chaincode.
Chaincode is first installed on the peer and may later be upgraded; it must be instantiated for a specific channel before it can process transactions. Once instantiated on the channel, a chaincode supports two types of transactions: init and invoke. An init transaction is executed once after the chaincode has been installed or upgraded; it specifies an endorsement policy that determines how any subsequent transaction of this chaincode should be authorized. A chaincode determines through the endorsement policy on which peers it executes: whether all peers in the channel execute it, or only some, and which peers or which set of peers are sufficient to authorize the execution of the transaction.
An invoke transaction is used to execute a computation that may read and modify the state of the chaincode, which is a set of key-value pairs. The operations to access the state are GETSTATE(k) → v (given a key k, return the last value v written to it) and PUTSTATE(k, v) (write the value v to storage under the key k).
The processing of a transaction on HLF proceeds like this [Androulaki et al. 2016]: (1) A client creates and signs a transaction for a particular chaincode and sends it to the respective endorsing peers.
(2) The endorsing peers simulate the transaction on their current copy of the key-value store (KVS), verifying that the client is authorized to execute it. If successful, each endorsing peer returns the result of the execution to the client. This is also called an endorsement. It comes in the form of a signed readset and writeset (with the key-value pairs accessed during simulation, including a version for every value in the readset, determined by the logical time when this value was written). The endorsement serves as a static representation of the chaincode execution.
(3) When the client has assembled enough endorsements that produce the same KVS changes and that satisfy the endorsement policy, it combines them into a transaction proposal. Then the client broadcasts this transaction proposal to the ordering service, which simply orders transactions without considering their semantics. Currently an ordering service based on Apache Kafka (kafka.apache.org) running in a cluster is supported, and an ordering service using BFT consensus is under development [Vukolić 2017;].
(4) The ordering service disseminates an ordered stream of transactions (grouped into blocks) to the peers on the channel. Each peer on its own then validates each transaction by verifying that the endorsement policy is satisfied and that the key-value pairs contained in the readset have not changed since the transaction was simulated.
(5) If successful, the peer appends the block to the blockchain (of the channel) and applies the updates from the writeset to its local copy of the KVS. This assigns a version to the modified key-value pairs. Since the validation is deterministic, the states and versions are the same for all correct peers.
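Steps (2), (4), and (5) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the KVS is a dictionary from keys to (value, version) pairs, the version is taken to be the number of the committing block, and signatures as well as the endorsement policy check are left out. The function names simulate and validate_and_commit are hypothetical and do not correspond to the HLF API.

```python
def simulate(kvs, chaincode):
    """Step (2): run the chaincode against the KVS, recording all accesses."""
    readset, writeset = {}, {}

    def get_state(key):
        value, version = kvs[key]
        readset[key] = version        # remember the version that was read
        return value

    def put_state(key, value):
        writeset[key] = value         # buffered; the KVS is not modified

    chaincode(get_state, put_state)
    return readset, writeset


def validate_and_commit(kvs, readset, writeset, block_no):
    """Steps (4)-(5): version check on the readset, then apply the writeset."""
    for key, version in readset.items():
        if kvs[key][1] != version:
            return False              # key changed since simulation
    for key, value in writeset.items():
        kvs[key] = (value, block_no)  # the write assigns a new version
    return True
```

If two transactions are simulated on the same KVS snapshot and touch the same key, the one ordered second fails validation, since the version in its readset is stale by then.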
In the TDAG for HLF, the states correspond to the entries in the KVS. Every state is a tuple containing at least (key, version).
It is assumed that an init transaction implicitly initializes every key used by the chaincode later with a default value (−). The init transaction is always valid.
Furthermore, every invoke transaction that reads or writes a set of keys K contains an observing edge for every k ∈ K accessed by an operation GETSTATE(k) but not by an operation PUTSTATE(k, !), and a consuming edge for every k that is written using an operation PUTSTATE(k, !). In other words, every key is implicitly read before it is written; thus, a transaction in the TDAG modeling an HLF execution has as many consuming edges as producing edges.
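As an illustration, the edges of one invoke transaction can be derived from the keys it reads and writes. The sketch below assumes that states are (key, version) pairs and that the witness is identified by a label; the function name invoke_edges and its parameters are hypothetical.

```python
def invoke_edges(witness, versions, read_keys, written_keys, new_version):
    """Return (observing, consuming, producing) edge sets for one invoke.

    versions maps each key to its current version; states are (key, version)
    pairs, and every edge is a (source, target) pair.
    """
    written = set(written_keys)
    read_only = set(read_keys) - written
    observing = {((k, versions[k]), witness) for k in read_only}
    consuming = {((k, versions[k]), witness) for k in written}
    producing = {(witness, (k, new_version)) for k in written}
    return observing, consuming, producing
```

By construction the consuming and producing sets have the same size, which is the balance property stated above.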
A witness in the TDAG corresponds to a valid endorsement, in the form of signatures from the endorsers issued on the same readset/writeset pair from the transaction proposal. The validation predicate P(·) contains the steps that each peer takes to validate a transaction coming from the ordering service, with respect to its local KVS. Notice that this validation only accesses the versions in the readset, but no other state entry in the KVS. Since these states are also contained in the transaction in the TDAG, the evaluation of P(·) in the graph is local.
Given that the ordering service of HLF outputs the same stream of blocks with transactions to every connected peer, it is easy to verify that the graph resulting from any execution of HLF is a TDAG.

CONCLUSION
Blockchains and distributed ledger platforms are of great interest for the financial industry today because their resilience to attacks and subversion lets them act as trustless intermediaries. To gain confidence in a new technology, it is paramount to study its security with formal models. This work has proposed transaction graphs, or TDAGs, as a discrete model for the semantics of the interactions in a blockchain system. In contrast to existing event-based models for generic distributed and concurrent systems, the TDAG model explicitly takes into account the validation of transactions, which is an important aspect of blockchains. For instance, it allows modeling assets and their transfer among different entities. It also facilitates comparisons among the different technologies available today.
We envision that richer semantics can be expressed by refining the TDAG model. For instance, one may reason about further invariants of the blockchain system as properties of the TDAG, similar to modeling Bitcoin's fixed coin supply. One might also use a TDAG to formally model the provenance of generic assets that are handled by smart contracts, building on the paths through which the asset was transferred in the TDAG. One could also leverage a TDAG to formally describe the guarantees provided by a blockchain equipped with a pruning mechanism, reasoning about the states that remain in the TDAG after pruning. Finally, we foresee that the TDAG can be extended to model invariants required for payment channels; for instance, payment channel transactions should be free of conflicts with those included in the TDAG.

A. TRANSACTION GRAPH FOR BITCOIN
We start with the description of an execution of the Bitcoin system as represented by the corresponding blockchain. A Bitcoin blockchain is composed of blocks, where each block is created as a result of successfully executing the Bitcoin mining process [Nakamoto 2008]. The miner of such a block (i.e., the user showing a valid proof of successful mining) chooses a set of regular transactions to be added to the block along with a single coinbase transaction. There exists a special block, called the genesis block, that represents the initialization of the blockchain.
A coinbase transaction transfers unmined bitcoins to a Bitcoin address (or a set of addresses) chosen by the corresponding miner, as a reward for creating the block. A coinbase transaction is valid if it transfers only the number of bitcoins set as the reward for the height of the mined block. A regular transaction transfers bitcoins from a set of Bitcoin addresses (the input addresses) to another set of Bitcoin addresses (the output addresses). A regular transaction is valid if (i) it includes a confirmation for each input address and (ii) it does not create new bitcoins. Finally, a regular transaction has an associated fee, namely the difference between the bitcoins held at the input addresses and those held at the output addresses.
We now describe our modeling of a given execution of Bitcoin as a TDAG. A state represents a Bitcoin address that holds a group of bitcoins, a transaction fee, or the yet unmined bitcoins. We note that fees and unmined bitcoins are not associated with an address in the real Bitcoin system, but we model them as held by an address to obtain a coherent transaction graph model. The genesis state represents a Bitcoin address holding the 21M bitcoins that will ever exist in the Bitcoin system. Each witness represents either a proof of successful mining for a block or the (set of) confirmations required in a regular transaction. Finally, we consider two types of edges: consuming and producing edges. A consuming edge links the unconsumed addresses for unmined bitcoins and transaction fees to the mining proof of the corresponding coinbase transaction, or an input address to the corresponding confirmation in a regular transaction. A producing edge links a mining proof to the Bitcoin addresses receiving the reward, or a set of confirmations to the corresponding output addresses receiving (part of) the transferred bitcoins.
Definition A.2 (Transaction graph for Bitcoin). We model an execution of the Bitcoin system as a graph G_BTC := (S_BTC ∪ W_BTC, E_BTC) defined as follows:
State. Each state s ∈ S_BTC is defined as a tuple (addr, val, hash, height), where addr denotes a Bitcoin address, val denotes the amount of bitcoins held at addr, hash denotes the result of applying H to a set of vertices S′_BTC ∪ {w} with S′_BTC ⊂ S_BTC, and height denotes a block index.
Witness. Each witness w ∈ W_BTC is defined by a tuple (txtype, F), where txtype denotes the type of the transaction and determines the content of F. In particular, (TINITX, ∅) is the witness for the initialization transaction, (TCBTX, MP) denotes a witness for a coinbase transaction, and (TRTX, {CF_i}) denotes the witness for a regular transaction.
Edge. Each edge e ∈ E_BTC is defined as either a consuming edge or a producing edge.
The transaction graph presented here determines the modeling of the possible transactions in a Bitcoin execution. The next definition maps the transactions in a Bitcoin execution to the transaction types supported in a TDAG. Finally, we complete our description of the Bitcoin context with the corresponding transaction predicate P. For that, we use VerifyContract(ADDR, CF) as a function that, on input a Bitcoin address ADDR and a confirmation CF, returns TRUE if CF encodes a valid confirmation to spend the bitcoins held at ADDR, and FALSE otherwise. Additionally, we use VerifyWork(MP) as a function that, on input a mining proof MP, returns TRUE if MP is a valid proof-of-work for the corresponding block, and FALSE otherwise. We thereby abstract away the implementation details of the validation of Bitcoin scripts and mining proofs.
(1) If t is a regular transaction (w.txtype = TRTX), the witness holds a valid confirmation for each input state, i.e., ∀s ∈ !Ew, ∃CF ∈ w.F : VerifyContract(s.addr, CF).
(4) The sum of bitcoins held at the input states must be equal to the sum of bitcoins held at the output states, i.e., ∑_{s ∈ !Ew} s.val = ∑_{s′ ∈ wE!} s′.val.
(5) Each output state contains the evaluation of the hash function over the input states and the witness, i.e., ∀s ∈ wE! : s.hash = H(!Ew ∪ {w}).
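Conditions (1), (4), and (5) for a regular transaction can be sketched as follows. This is an illustration only: SHA-256 over a sorted textual encoding is an assumed stand-in for H, verify_contract is passed in as a callback in place of VerifyContract, and the names State, h, and check_regular are hypothetical. The remaining conditions of the predicate are not covered.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    addr: str
    val: int
    hash: str
    height: int


def h(inputs, witness):
    """Stand-in for H: SHA-256 over a sorted, canonical encoding (an assumption)."""
    parts = sorted(repr(s) for s in inputs) + [repr(witness)]
    return hashlib.sha256("|".join(parts).encode()).hexdigest()


def check_regular(inputs, witness, outputs, verify_contract):
    txtype, confirmations = witness
    if txtype != "TRTX":
        return False
    # (1) every input state has a valid confirmation in the witness
    if not all(any(verify_contract(s.addr, cf) for cf in confirmations)
               for s in inputs):
        return False
    # (4) the value held at the inputs equals the value held at the outputs
    if sum(s.val for s in inputs) != sum(s.val for s in outputs):
        return False
    # (5) every output state commits to the input states and the witness
    expected = h(inputs, witness)
    return all(s.hash == expected for s in outputs)
```

Note that the fee appears as an ordinary output state, so condition (4) is an equality rather than an inequality.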

A.1. Model analysis
We start this section by showing that the transaction graph defined in the previous section is a TDAG. Here, we consider as legal a Bitcoin execution that contains only transactions that are "deep enough" in the blockchain (e.g., six blocks deep). We thereby enable the study of any Bitcoin execution in terms of the properties of a TDAG, such as conflict-freedom or validity.
THEOREM A.5. Assume H is a collision-resistant hash function [Goldwasser and Bellare] and assume that L_BTC is a legal Bitcoin execution. Then, the graph G_BTC resulting from modeling L_BTC is a TDAG.
PROOF. Here, we show that G_BTC = (S_BTC ∪ W_BTC, E_BTC) fulfills the conditions to be a TDAG.
(1) The genesis state must not have producing or observing edges and it must have a single producing edge. Our designed INIT transaction ensures this.
(2) Every state, other than the genesis, must have a single producing edge. Assume by contradiction that this condition is not fulfilled. Then, there is a state s ∈ S_BTC with at least two producing edges, which implies that there exist two different sets V := S ∪ {w} and V′ := S′ ∪ {w′} such that H(V) = H(V′). However, V and V′ contradict the assumption that H is collision resistant.
(3) Every state other than the genesis can have multiple successors, but at most one among them is connected with a consuming edge. Each Bitcoin address is consumed only once in a legal Bitcoin execution. Therefore, this condition is fulfilled.
(4) The graph must be weakly connected. Each new transaction consumes a previously unconsumed state in the graph, i.e., it either spends an unspent Bitcoin address or mines yet unmined bitcoins and consumes unclaimed fees. Therefore, the overall graph is weakly connected.
(5) The graph must not have cycles. Assume by contradiction that there is a cycle in G_BTC. This, however, implies that there are two different transactions t and t′ that produce the same state. However, as we have seen before, this contradicts the fact that H is collision resistant.
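Conditions (2), (3), and (5) are purely structural and can be checked mechanically on a finite graph. The following sketch assumes that edges are given as explicit (source, target) pairs and uses Kahn's algorithm for the acyclicity test; the function name check_structure is hypothetical, and conditions (1) and (4) are not covered.

```python
from collections import Counter, defaultdict, deque


def check_structure(genesis, states, producing, consuming, observing=()):
    """Check conditions (2), (3), and (5) on explicit edge lists."""
    # (2) every non-genesis state has exactly one producing edge
    produced = Counter(s for (_w, s) in producing)
    if any(produced[s] != 1 for s in states if s != genesis):
        return False
    # (3) at most one consuming edge leaves any state
    consumed = Counter(s for (s, _w) in consuming)
    if any(n > 1 for n in consumed.values()):
        return False
    # (5) Kahn's algorithm: acyclic iff every vertex can be removed
    edges = list(producing) + list(consuming) + list(observing)
    succ, indeg = defaultdict(list), Counter()
    vertices = set(states)
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
        vertices.update((u, v))
    queue = deque(v for v in vertices if indeg[v] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return removed == len(vertices)
```

A double spend shows up as a violation of (3), while two transactions producing the same state violate (2).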
Remember from Theorem 2.7 that a TDAG is valid if each transaction individually is valid according to a transaction predicate P. Next, we show that validating Bitcoin transactions individually in our model suffices to safely consider that unconsumed states represent all bitcoins in the system.
Definition A.6 (Unspent bitcoins). Consider G_BTC a TDAG modeling a Bitcoin execution. Then, the unspent bitcoins in G_BTC are the sum of bitcoins held at unconsumed states of G_BTC.
THEOREM A.7 (UNSPENT BITCOINS ARE ALL BITCOINS IN THE SYSTEM). Consider G_BTC a valid TDAG that models a Bitcoin execution. Then, the amount of unspent bitcoins in G_BTC is equal to all bitcoins ever existing in the system. More formally, let S′ be the set of unconsumed states in G_BTC; then ∑_{s ∈ S′} s.val = s_g.val.
PROOF. Assume by contradiction that Theorem A.7 does not hold. Then, there must exist a transaction t := (S ∪ {w}, E) in T_{G_BTC} such that ∑_{s ∈ !Ew} s.val ≠ ∑_{s′ ∈ wE!} s′.val. This, however, clearly implies that P(t) returns FALSE, which contradicts the assumption that G_BTC is a valid TDAG.
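The invariant of Theorem A.7 can be checked on a small hand-made graph. In the sketch below, the state names, the reward of 50, and the fee of 1 are illustrative assumptions: a coinbase witness w1 consumes the genesis state s0 and produces a reward s1 and the remaining unmined bitcoins s2, and a regular witness w2 consumes s1 and produces a payment s3 and a fee s4.

```python
def unspent_total(values, consuming_edges):
    """Sum of val over all states that no consuming edge leaves."""
    consumed = {state for (state, _witness) in consuming_edges}
    return sum(v for state, v in values.items() if state not in consumed)


GENESIS = 21_000_000
values = {
    "s0": GENESIS,        # genesis state holding all bitcoins
    "s1": 50,             # block reward
    "s2": GENESIS - 50,   # remaining unmined bitcoins
    "s3": 49,             # payment to the payee
    "s4": 1,              # transaction fee
}
consuming = [("s0", "w1"), ("s1", "w2")]

# Holds because every transaction conserves value
assert unspent_total(values, consuming) == GENESIS
```

Since each transaction replaces its consumed states by produced states of equal total value, the sum over the unconsumed states never changes.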

A.2. Modeling an example of bitcoin execution
Here, we describe our modeling for an illustrative example of Bitcoin execution. We assume for simplicity that the block reward is fixed to a value of 50 bitcoins as it was the first reward set in the Bitcoin system. Additionally, we assume that the transaction fee is fixed to 1 bitcoin. We stress, however, that the TDAG model is expressive enough to relax these assumptions.
We focus on the illustrative example depicted in Fig. 3. In particular, Fig. 3a shows a possible Bitcoin execution L_BTC := {B_g, B_1, B_2}, where B_g := (∅, {t_0}), B_1 := (MP, {t_1}), and B_2 := (MP′, {t_2, t_3, t_4}). We note that this example is similar to that in Fig. 2 and, due to lack of space, we do not describe it here again. However, we remark that it is expanded here with an extra MIMO transaction (i.e., t_3(w_3)) to show how we model transactions that involve multiple payers and multiple payees. Instead, we focus on the description of G_BTC := (S ∪ W, E_P ∪ E_C), a transaction graph modeling the aforementioned Bitcoin execution, as depicted in Fig. 3b.
t_1 := ({s_0, s_1, s_2, w_1}, {(s_0, w_1), (w_1, s_1), (w_1, s_2)}), where (s_0, w_1) ∈ E_C and {(w_1, s_1), (w_1, s_2)} ⊆ E_P. A SIMO transaction that issues bitcoins to Alice after she has successfully mined a block.
In a bit more detail,

B. TRANSACTION GRAPH FOR HYPERLEDGER FABRIC
In this section, we study the blockchain-based system Hyperledger Fabric (HLF) [Cachin 2016]. We start with the description of an execution of HLF. An execution of HLF is represented as a set of blockchains, one per channel. However, as each blockchain evolves independently of the others, we restrict our description here to a single blockchain. This description, however, can be easily extended to model an HLF execution with multiple channels. A blockchain is composed of blocks. We denote the first block as the genesis block, and each subsequent block is created by the ordering service. The ordering service chooses the sorted set of transactions to be included in each block. HLF supports two types of transactions: init and invoke. An init transaction is included in the genesis block; it initializes every key used in the blockchain to a default value − and includes an endorsement policy that determines how any subsequent transaction should be authorized. We consider that an init transaction is always valid.
An invoke transaction is used to carry out updates to a set of key-value pairs in the local key-value store (KVS) through two operations: (i) GETSTATE(k) → v, which, given a key k, provides the most current value v associated with it; and (ii) PUTSTATE(k, v), which updates the value associated with a given key k to the newly provided value v. An invoke transaction is valid if it contains enough endorsements from the set of endorsers specified in the endorsement policy.
We continue by describing the modeling of an HLF execution. Informally, each state in our model represents a key-value pair. Each witness represents the set of endorsements required for a transaction to be valid. Finally, here we consider three types of edges: observing, consuming, and producing edges. An observing edge links a key k to the endorsements of a transaction that reads k but does not modify it (e.g., an invoke transaction that contains only a GETSTATE(k) operation). If the key k is modified (e.g., by an invoke transaction that contains a PUTSTATE(k, !) operation), a consuming edge links the key k to the endorsements for that transaction. Finally, a producing edge links the endorsements to a key k that the transaction has modified (e.g., by means of a PUTSTATE(k, !) operation).
Definition B.2 (Model for HLF execution). We model an HLF execution L_HLF as a graph G_HLF := (S_HLF ∪ W_HLF, E_HLF) defined as follows:
States. Each state s ∈ S_HLF is defined as a tuple (key, version), where key denotes the key part of a key-value pair and version denotes the current version number of the key-value pair. The genesis state is defined as s_g := (params, 0) and denotes a special key-value pair that holds the configuration parameters for a channel as indicated in the channel initialization.
Witness. Each witness w ∈ W_HLF is defined as a tuple (txtype, F), where txtype set to TINITX indicates an init transaction and txtype set to TINVTX indicates an invoke transaction. F denotes an endorsement policy EP if txtype = TINITX, or a set of endorsements {END_i} if txtype = TINVTX. For simplicity, we assume that an endorsement END also contains the corresponding set of operations GETSTATE(!) and PUTSTATE(!, !).
Edges. Each edge e ∈ E_HLF is defined as either an observing, a consuming, or a producing edge.
Definition B.3 (Transaction types). An invoke transaction is modeled as a SISO, MISO, or MIMO transaction depending on the set of operations GETSTATE(!) and PUTSTATE(!, !) that it uses. For instance, a SISO transaction models a transaction that uses a single PUTSTATE(k, !) operation for a key k. A MISO transaction models a transaction that updates a single key k and reads at least one additional key k′ (e.g., {GETSTATE(k′), PUTSTATE(k, v)}). Finally, a MIMO transaction models a transaction that updates several keys and possibly reads other additional keys (e.g., {GETSTATE(k), PUTSTATE(k′, v), PUTSTATE(k′′, v′)}). An init transaction is of type INIT and is defined as t := ({s_g, w} ∪ {s_i}, {(s_g, w)} ∪ {(w, s_1), . . . , (w, s_n)}), where (s_g, w) ∈ E_C, {(w, s_1), . . . , (w, s_n)} ⊆ E_P, w := (TINITX, EP), and each s_i := (k_i, −). The genesis state s_g is defined in Definition B.2.
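The classification of invoke transactions can be written down directly from the sets of keys read and written. The function classify_invoke below is a hypothetical helper for illustration; it treats read-only invocations as outside the definition, since the three types all involve at least one update.

```python
def classify_invoke(read_keys, written_keys):
    """Classify an invoke transaction as SISO, MISO, or MIMO."""
    written = set(written_keys)
    extra_reads = set(read_keys) - written   # keys that are read but not updated
    if not written:
        raise ValueError("read-only invocations are not covered by the definition")
    if len(written) > 1:
        return "MIMO"
    return "MISO" if extra_reads else "SISO"
```

A transaction that reads and then updates the same single key is classified as SISO here, because the read of an updated key is already implied by its consuming edge.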
We make two observations about the definition of the transaction types. First, the MISO and MIMO types are restricted in the sense that they must have the same number of consuming and producing edges. This is because we model each PUTSTATE(!, !) operation as a consuming edge from the state of the key being updated and a producing edge to the state corresponding to the updated key-value pair. We note, however, that this is a characteristic inherent to all systems based on key-value stores and not a particular limitation of HLF.
Second, as in any system based on a key-value store, each key must exist only once. For that, we model our initialization transaction such that all the keys used in the given HLF execution are created and initialized to a fixed initial value (−).
Now, we finalize the description of our model by defining the transaction predicate for HLF. Here, we denote by VerifyEndorsement({END_i}) a boolean function that takes a set of endorsements {END_i} and returns TRUE if {END_i} represents a valid set of endorsements according to the endorsement policy EP, and FALSE otherwise. We assume that EP is obtained from the initialization transaction included in the corresponding HLF execution.
Definition B.4 (Transaction predicate in HLF). Consider a transaction t := (S ∪ {w}, E_O ∪ E_P ∪ E_C). Then, P(t) returns TRUE if the following conditions hold and FALSE otherwise.
(2) If t is an invoke transaction, each output state must represent an update of a key included in an input state. Moreover, the version number of the output state must be larger than the version number of the input state representing the same key, i.e., w.txtype = TINVTX ⇒ ∀s′ ∈ wE_P!, ∃s ∈ !E_C w : s′.key = s.key ∧ s′.version > s.version.
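Condition (2) can be sketched as a check over the consumed and produced states of one invoke transaction, each given as a (key, version) pair. The function name check_invoke_outputs is a hypothetical choice for this illustration, and the other conditions of the predicate are not covered.

```python
def check_invoke_outputs(consumed, produced):
    """Condition (2): every produced state updates a consumed key
    and carries a strictly larger version number."""
    versions = {key: version for key, version in consumed}
    return all(
        key in versions and version > versions[key]
        for key, version in produced
    )
```

For example, consuming ("a", 0) and producing ("a", 3) passes, whereas producing a state for a key that was not consumed, or reusing the old version number, fails.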

B.1. Model Analysis
In this section we analyze our model for the execution of the HLF system. We start by showing that any legal HLF execution modeled as described above results in a TDAG. Here, we consider as legal an HLF execution that contains only blocks in the blockchain that have been produced by the ordering service.
THEOREM B.5. Assume that L_HLF is a legal HLF execution. Then, the graph G_HLF modeling L_HLF is a TDAG.
PROOF. Here, we show that G_HLF fulfills all the conditions required in Theorem 2.2.
(1) The genesis state must not have any producing or observing edges and it must have a single producing edge. This condition is ensured by our definition of the initialization transaction.
(2) Every state, other than the genesis, must have a single producing edge. Assume by contradiction that ∃s ∈ S_HLF \ {s_g} : |!E_P s| > 1. This implies that there are at least two transactions t and t′ in G_HLF that update the same key-value pair simultaneously. This, however, contradicts the assumption that a legal execution contains only transactions sorted by the ordering service.
(3) Every state other than the genesis can have multiple successors, but at most one among them is connected with a consuming edge. The proof for this condition follows along the same lines as for the previous condition.
(4) The graph must be weakly connected. Each new transaction reads or updates a key represented by an unconsumed state in the graph. Therefore, the overall graph is weakly connected.
(5) The graph must not have cycles. Assume by contradiction that there is a cycle in G_HLF. This necessarily implies that there are two transactions that produce the same state. However, as we