Most methods for Byzantine fault tolerance (BFT) in the partial synchrony setting divide the local state of the nodes into views, and the transition from one view to the next dictates a leader change. In order to provide liveness, all honest nodes need to stay in the same view for a sufficiently long time. This requires view synchronization, a requisite of BFT that we extract and formally define here.
Existing approaches for Byzantine view synchronization incur quadratic communication (in , the number of parties). A cascade of view changes may thus result in communication complexity. This paper presents a new Byzantine view synchronization algorithm named that has optimistically linear communication complexity and constant latency. Faced with benign failures, has expected linear communication and constant latency.
The result here serves as an important step towards reaching solutions that have overall quadratic communication, the known lower bound on Byzantine fault tolerant consensus. is particularly useful for a family of BFT protocols that already exhibit linear communication under various circumstances, but suffer quadratic overhead due to view synchronization.
Keywords: Distributed systems
Logical synchronization is a requisite for progress to be made in asynchronous state machine replication (SMR). Previous Byzantine fault tolerant (BFT) synchronization mechanisms incur quadratic message complexities, frequently dominating over the linear cost of the consensus cores of BFT solutions. In this work, we define the view synchronization problem and provide the first solution in the Byzantine setting, whose latency is bounded and communication cost is linear, under a broad set of scenarios.
Many practical reliable distributed systems do not rely on network synchrony because networks go through outages and periods of Distributed Denial-of-Service (DDoS) attacks; and because synchronous protocols have hard coded steps that wait for a maximum delay. Instead, asynchronous replication solutions via state machine replication (SMR) [1] usually optimize for stability periods. This approach is modeled as partial synchrony [2]. It allows for periods of asynchrony in which progress might be compromised, but consistency never does.
In the crash-failure model, this paradigm underlies most successful industrial solutions, for example, the Google Chubbie lock service [1], Yahoo’s ZooKeeper [1], etcd [3], Google’s Spanner [1], Apache Cassandra [4], and others. The algorithmic cores of these systems, e.g., Paxos [5], Viewstamp Replication [6], or Raft [7], revolve around a view-based paradigm. In the Byzantine model, where parties may act arbitrarily, this paradigm underlies many blockchain systems, including VMware’s Concord [8], Hyperledger Fabric [9], Cypherium [10][11], Celo [12], PaLa [13], and Libra [14]. The algorithmic cores of these BFT system are view-based, e.g., PBFT [15], SBFT [16], and HotStuff [17].
The advantage of the view-based paradigm is that each view has a designated leader from the parties that can drive a decision efficiently. Indeed, in both models, there are protocols that have per-view linear message and communication complexity, which is optimal.
In order to guarantee progress, nodes must give up when a view does not reach a decision after a certain timeout period. Mechanisms for changing the view whose communication is linear exist both for the crash model (all the above) and, recently, for the Byzantine model (HotStuff [17]). An additional requirement for progress is that all nodes overlap for a sufficiently long period. Unfortunately, all of the above protocols incur quadratic message complexity for view synchronization.
In order to address this, we first define the view synchronization problem independently of any specific protocol and in a fault-model agnostic manner. We then introduce a view synchronization algorithm called whose message complexity is linear in expectation, as well as in the worst-case under a broad set of conditions.
We introduce the problem of view synchronization. All nodes start at view zero. A view change occurs as an interplay between the synchronizer, which implements a view synchronization algorithm, and the outer consensus solutions. The consensus solution must signal that it wishes to end the current view via a notification. The synchronizer eventually invokes a consensus signal to indicate when a new view starts. View synchronization requires to eventually bring all honest nodes to execute the same view for a sufficiently long time, for the outer consensus protocol to be able to drive progress.
The two measures of interest to us are latency and communication complexity between these two events. Latency is measured only during periods of synchrony, when a bound on message transmission delays is known to all nodes, and is expressed in units.
View synchronization extends the PaceMaker abstraction presented in [17], formally defines the problem it solves, and captures it as a separate component. It is also related to the seminal work of Chandra & Toueg [18], [19] on failure detectors. Like failure detectors, it is an abstraction capturing the conditions under which progress is guaranteed, without involving explicit engineering solutions details such as packet transmission delays, timers, and computation. Specifically, Chandra & Toueg define a leader election abstraction, denoted , where eventually all non-faulty nodes trust the same non-faulty node as the leader. was shown to be the weakest failure detector needed in order to solve consensus. Whereas Chandra & Toueg’s seminal work focuses on the possibility/impossibility of an eventually elected leader, here we care about how quickly it takes for a good leader to emerge (i.e., the latency), at what communication cost, and how to do so repeatedly, allowing the extension of one time single-shot consensus to SMR.
We tackle the view synchronization problem against asynchrony and the most severe type of faults, Byzantine [20][21]. This makes the synchronizers we develop particularly suited for Byzantine Fault Tolerance (BFT) consensus systems relevant in today’s cryptoeconomic systems.
More specifically, we assume a system of nodes that need to form a sequence of consensus decisions that implement SMR. We assume up to nodes are Byzantine, the upper bound on the number of Byzantine nodes in which Byzantine agreement is solvable [22]. The challenge is that during “happy” periods, progress might be made among a group of Byzantine nodes cooperating with a “fast” sub-group of the honest nodes. Indeed, many solutions advance when a leader succeeds in proposing a value to a quorum of nodes, but it is possible that only the “fast” honest nodes learn it and progress to the next view. The remaining “slow” honest nodes might stay behind, and may not even advance views at all. Then at some point, the Byzantine nodes may stop cooperating. A mechanism is needed to bring the “slow” nodes to the same view as the “fast” ones.
Thus, our formalism and algorithms may be valuable for the consensus protocols mentioned above, as well as others, such as Casper [23] and Tendermint [24][25], which reported problems around liveness [26][27].
We first extract two synchronization mechanisms that borrow from previous BFT consensus protocols, casting them into our formalism and analyzing them.
One is a straw-man mechanism that requires no communication at all and achieves synchronization albeit with unbounded latency. This synchronizer works simply by doubling the duration of each view. Eventually, it guarantees a sufficiently long period in which all the nodes are in the same view.
The second is the broadcast-based synchronization mechanism built into PBFT [15] and similar Byzantine protocols, such as [16]. This synchronizer borrows from the Bracha reliable broadcast algorithm [28]. Once a node hears of nodes who wish to enter the same view, it relays the wish reliably so that all the honest nodes enter the view within a bounded time.
The properties of these synchronizers in terms of latency and communication costs are summarized in Table 1. For brevity, these algorithms and their analysis are deferred to Appendix A.
The main contribution of our work is , which is a leader-based view synchronization algorithm. utilizes views that have an honest leader to relay messages, instead of broadcasting them. When a node wishes to advance a view, it sends the message to the leader of the view, and not to all the other nodes. If the leader is honest, it will gather the messages from the nodes and multicast them (send the same message to all the other nodes) using a threshold signature [29][30][31] to the rest of the nodes, incurring only a linear communication cost. The protocol implements additional mechanisms to advance views despite faulty leaders.
The latency and communication complexity of this algorithm depend on the number of actual failures and their type. In the best case, the latency is constant and communication is linear. Faced with benign failures, in expectation, the communication is linear and in worst-case , as mandated by the lower bound of Dolev & Reischuk [32]; the latency is expected constant and in the worst-case. Byzantine failures do not change the latency, but they can drive the communication to an expected complexity and in the worst-case up to . It remains open whether a worst-case linear synchronizer whose latency is constant is possible.
To summarize, performs just as well as a broadcast-based synchronizer in terms of latency and message complexity, and in certain scenarios shows up-to better results in terms of message complexity. summarizes the properties of all three synchronizers.
The contributions of this paper are as follows:
To the best of our knowledge, this is the first paper to formally define the problem of view synchronization.
It includes two natural synchronizer algorithms cast into this framework and uses them as a basis for comparison.
It introduces , a leader-based Byzantine synchronizer exhibiting faultless and expected linear communication complexity and constant latency.
The rest of this paper is structured as follows: Section 2 discusses the model; Section 3 formally presents the view synchronization problem; Section 4 presents the view synchronization algorithm with formal correctness proof latency and communication cost analysis; Section 5 describes real-world implementations where the view synchronization algorithms can be integrated; Section 6 presents related work; and Section 7 concludes the paper. The description of the two natural view synchronization algorithms, view doubling and broadcast-based, are presented in Appendix A.
We follow the eventually synchronous model [2] in which the execution is divided into two durations: first, an unbounded period of asynchrony, where messages do not have a bounded time until delivered; and then, a period of synchrony, where messages are delivered within a bounded time, denoted as The switch between the first and second periods occurs at a moment named Global Stabilization Time (). We assume all messages sent before GST arrive at or before
Our model consists of a set of nodes, and a known mapping, denoted by : that continuously rotates among the nodes. Formally, . We use a cryptographic signing scheme, a public key infrastructure (PKI) to validate signatures, as well as a threshold signing scheme [29][30][31]. The threshold signing scheme is used in order to create a compact signature of -of- nodes and is used in other consensus protocols such as [30]. Usually or .
We assume a non-adaptive adversary who can corrupt up to nodes at the beginning of the execution. This corruption is done without the knowledge of the mapping . The set of remaining honest nodes is denoted . We assume the honest nodes may start their local execution at different times.
In addition, as in [1][30], we assume the adversary is polynomial-time bounded, i.e., the probability it will break the cryptographic assumptions in this paper (e.g., the cryptographic signatures, threshold signatures, etc.) is negligible.
We define a synchronizer, which solves the view synchronization problem, to be a long-lived task with an API that includes a operation and a signal, where . Nodes may repeatedly invoke , and in return get a possibly infinite sequence of signals. Informally, the synchronizer should be used by a high-level abstraction (e.g., BFT state-machine replication protocol) to synchronize view numbers in the following way: All nodes start in view , and whenever they wish to move to the next view they invoke . However, they move to view only when they get a signal.
Formally, a time interval consists of a starting time and an ending time and all the time points between them. ’s length is . We say if begins after or when begins, and ends before or when ends. We denote by the time when node gets the signal , and assume that all nodes get at the beginning of their execution. We denote time as the time when the last honest node began its execution, formally . We further denote as the time interval in which node is in view , i.e., begins at and ends at . We say node is at view at time , or executes view at time , if .
We are now ready to define the two properties that any synchronizer must achieve. The first property, named view synchronization ensures that there is an infinite number of views with an honest leader that all the correct nodes execute for a sufficiently long time:
Property 1 (View Synchronization): For every there exists and an infinite number of time intervals and views , such that if the interval between every two consecutive calls to by an honest node is , then for any and any the following holds:
The second property ensures that a synchronizer will only signal a new view if an honest node wished to advance to it. Formally:
Property 2 (Synchronization Validity): The synchronizer signals only if there exists an honest node and some view s.t. calls at least times while executing view .
The parameter , which is used in Property 1 is the time an honest node waits between two successive invocations of , and may differ between view synchronization algorithms. This parameter is needed to make sure that is called an infinite number of times in an infinite run. In reality, it is likely that in most view synchronization algorithms is larger than some value which is a function of the message delivery bound , and also of from Property 1, i.e., the synchronization algorithm will work for any . In this case, a consensus protocol using the synchronizer can execute the same view as long as progress is made, and trigger a new view synchronization in case liveness is lost. See Appendix A.3 for concrete examples.
The requirement that the leader of all the synchronized views is honest is needed to ensure that once a view is synchronized, the leader of that view will drive progress in the upper-layer protocol, thus ensuring liveness. Without this condition, a synchronizer might only synchronize views with faulty leaders.
Synchronization validity (Property 2) ensures that the synchronizer does not suggest a new view to the upper-layer protocol unless an honest node running that upper-layer protocol wanted to advance to that view.
In order to define how the latency and message communication complexity are calculated, we first define to be the time at which the -th view synchronization is reached. Formally, , where is defined according to Property 1.
With this we can define the latency of a synchronizer implementation:
Definition 3.1 (Synchronizer Latency): The latency of a synchronizer is defined as .
Next, in order to define communication complexity, we first need to introduce a few more notations. Let be the total number of messages sent between and . In addition, denote as the total number of messages sent by from the beginning of ’s execution and .
With this, we define the communication complexity of a synchronizer implementation:
Definition 3.2 (Synchronizer communication complexity): Denote the view in which view synchronization occurs (Property 1). The message communication cost of a synchronizer is defined as .
This concludes the formal definition of the view synchronization problem. Next, we present , a view synchronization algorithm with expected constant latency and linear communication complexity in a variety of scenarios.
Before presenting , it is worth mentioning that we assume that all messages between nodes are signed and verified; for brevity, we omit the details about the cryptographic signatures. In the algorithm, when a node collects messages from senders, it is implied that these messages carry distinct signatures. We also assume that the mapping is based on a permutation of the nodes such that every consecutive views have at least one honest leader, e.g., . The algorithm can be easily altered to a scenario where this is not the case.
is a new approach to view synchronization that leverages leaders to optimistically achieve linear communication. The key idea is that instead of nodes broadcasting synchronization messages all-to-all and incurring quadratic communication, nodes send messages to the leader of the view they wish to enter. If the leader is honest, it will relay a single broadcast containing an aggregate of all the messages it received, thus incurring only linear communication.
If a leader of a view is Byzantine, it might not help as a relay. In this case, the nodes time out and then try to enlist the leaders of subsequent views, one by one, up to view , to help with relaying. Since at least one of those leaders is honest, one of them will successfully relay the aggregate.
The full protocol is presented in , and consists of several message types. The first two are sent from a node to a leader. They are used to signal to the leader that the node is ready to advance to the next stage in the protocol. Those messages are named and where is the view the message refers to.
The other two message types are ones that are sent from leaders to nodes. The first is called (short for “Time Certificate”) and is sent when the leader receives messages; and the second is called (short for “Quorum Certificate”) and is sent when the leader receives messages. In both cases, a leader aggregates the messages it receives using threshold signatures such that each broadcast message from the leader contains only one signature.
The general flow of the protocol is as follows: When is invoked, the node sends to , where is the view succeeding (Line 5). Next, there are two options: (i) If forms a , it broadcast it to all nodes (Line 7). The nodes then respond with message to the leader (Line 10) (ii) Otherwise, if time elapses after sending to without receiving , a node gives up and sends to the next leader, i.e., (Line 24). It then waits again before forwarding to , and so on, until is received.
Whenever has been received, a node sends (even if it did not send ) to . Additionally, as above, it enlists leaders one by one until is obtained. Here, the node sends leaders as well as . When a node finally receives from a leader, it enters view immediately (Line 17).
We will prove that achieves eventual view synchronization (Property 1) for any as well as synchronization validity (Property 2). Thus, the claims and lemmas below assume this.
We start by proving that if an honest node entered a new view, and the leader of that view is honest, then all the other honest nodes will also enter that view within a bounded time.
Claim 4.1: After , if an honest node enters view at time , and the leader of view is honest then all the honest nodes enter view by , i.e., if then .
PROOF: Let be the first honest node that entered view at time . entered view since it received from such that (Line 17).
If then we are done, since when sent it also sent it to all the other honest nodes (Line 16), which will be received by , and all the honest nodes will enter view .
Next, if then the only way for to send is if it gathered messages (), meaning at least of the messages were sent by honest nodes. An honest node will send a message only after first receiving from s.t. (Line 10).
Since when receiving a an honest node sends the to (Line 12), then will receive by , will forward it to all other nodes by , who will send to by and by all honest nodes will receive from and enter view .
Next, assuming an honest node entered a new view, we bound the time it takes to at least honest nodes to enter the same view. Note that this time we do not assume anything on the leader of the new view, and it might be Byzantine.
Claim 4.2: After , when an honest node enters view at time , at least honest nodes enter view by , i.e., after for every there exists a group S of honest nodes s.t. and .
PROOF: Let be the first node that entered view at time . entered since it received from and (Line 17). If is honest then we are done, since multicasted to all honest nodes (Line 16), and within all honest nodes will also enter view by .
Next, if is Byzantine, then it might have sent to a subset of the honest nodes, potentially only to . In order to form a , had to receive messages (Line 14), meaning that at least honest nodes sent to . Denote as the group of those honest nodes.
Each node in sent message since it received from for (Line 10). Note that different nodes in might have received from a different leader, i.e., might not be the same leader for each node in .
After a node in sent it will either receive a within and enter view , or timeout after and send with to (Line 30). They will continue to do so when not receiving for the next views after . This ensures that at least one honest leader will receive after at most . Then, this honest leader will multicast the it received (Line 7) and at most by , all the honest nodes will receive The honest nodes will then send to the honest leader, which will be able to create and multicast it. The will thus be received by all the honest nodes by and we are done.
Next, we show that during the execution, an honest node will enter some new view.
Claim 4.3: After , some honest node enters a new view.
PROOF: From Claim 4.2, if an honest node enters some view , the time by which at least another other honest nodes also enter is bounded. Eventually, those honest nodes will timeout and will be invoked (Line 5), which will cause them to send to .
If is honest, then it will send a to all the nodes (Line 7) which will be followed by the leader sending a (), and all honest nodes will enter view .
If is not honest then the protocol dictates that the honest nodes that wished to enter will continue to forward their message to the next leaders (up to , ) until each of them receives This is guaranteed since at least one of those leaders is honest.
The same process is then followed for (Line 28), and eventually all of those honest nodes will enter view .
Lemma 4.4: achieves eventual view synchronization (Property 1).
PROOF: From Claim 4.3 an honest node eventually will enter a new view, and by at least honest nodes will enter the same view within a bounded time. By applying Claim 4.3 recursively and again, eventually, a view with an honest leader is reached and by Claim 4.1 all honest nodes will enter the view within .
Thus, for any , if the protocol is run with it is guaranteed that all honest nodes will eventually execute the same view for .
The above arguments can be applied inductively, i.e., there exists an infinite number of such intervals and views in which view synchronization is reached, also ensuring that the views that synchronized also have an honest leader.
Lemma 4.5: achieves synchronization validity (Property 2).
To enter a new view a is needed, which is consisted of messages i.e., at least are from honest nodes. An honest node will send message only when it receives a message, that requires message, meaning at least one of those messages came from an honest node.
An honest node will send when the upper-layer protocol invokes while it was in view .
This concludes the proof that is a synchronizer for any . Similar to the broadcast-based synchronizer, it allows upper-layer protocols to determine the time they spend in each view.
Let be the maximum view an honest node is in at , and let denote the number of consecutive Byzantine leaders after . Assuming that leaders are randomly allocated to views, then is a random variable of a geometrical distribution with a mean of . This means that in the worst case of , then .
Since when honest nodes at view want to advance to view , and if is honest, all honest nodes enter view in constant time (Claim 4.1), the latency for view synchronization, in general, is . For the same reasoning, this is also the case for any two intervals between view synchronizations (see Definition 3.1).
In the worst-case of , where is the number of actual failures during the run, then latency is linear in the view duration, i.e., . But, in the expected case of a constant number of consecutive Byzantine leaders after , the expected latency is .
For communication complexity, there is a difference between Byzantine failures and benign ones. If a Byzantine leader of a view obtains for , then it can forward the to all the leaders that follow view and those leaders will multicast the message (Line 7), leading to expected communication complexity, in the case of at least one Byzantine leader after . In the worst-case of a cascade of failures after , the communication complexity is .
In the case of benign failures, communication complexity is dependent on , since the first correct leader after will get all nodes to enter his view and achieve view synchronization, and the benign leaders before it will only cause delays in terms of latency, but will not increase the overall number of messages sent. Thus, in general, the communication complexity with benign failures is . In the worst-case of communication complexity is , but in the average case it is linear, i.e., . For the same reasoning, this is also the case between any consecutive occurrences of view synchronization (see Definition 3.2).
To sum up, the expected latency for both benign and Byzantine failures is , and worst-case . Communication complexity for Byzantine nodes is optimistically , expected , and worst-case and for benign failures is expected and worst-case .
achieves expected constant latency and linear communication under a broad set of assumptions. It is another step in the direction of reaching the quadratic communication lower bound of Byzantine consensus in an asynchronous model [32].
In addition to we present in Appendix A two more view synchronization algorithms. The first one is view doubling, where nodes simply double their view duration when entering a new view, which guarantees that eventually all nodes will be in the same view for sufficiently long. The other algorithm is borrowed from consensus protocols such as PBFT [15] and SBFT [16]. In Appendix A.3 we present a comprehensive discussion on all three algorithms.
In this section, we describe real-world usages of the view synchronization algorithms. Many times, in different works, the terms “phase,” “round,” and “view” are mixed. In this work when “view” is mentioned, the meaning is that all the nodes agree on some integer value, mapped to a specific node that acts as the leader.
There are SMR protocols where as long as the leader is driving progress in the protocol it is not changed. This will correspond to all the nodes staying in the same view, and this view can be divided into many phases, e.g., in PBFT [15] a single-shot consensus consists of two phases. In an SMR protocol based on PBFT a view can consist of many more phases, all with the same leader as long as progress is made, and there is no bound on the view duration.
As mentioned in Section 1.2, in HotStuff [17], the view synchronization logic is encapsulated in a module named a PaceMaker, but does not provide a formal definition of what the PaceMaker does, nor an implementation. The most developed work which adopted HotStuff as the core of its consensus protocol is LibraBFT [33]. In LibraBFT, a module also named a PaceMaker is in charge of advancing views. In this module whenever a node timeouts of its current view, say view , it sends a message named “TimeoutMsg, ”, and whenever it receives of these messages, it advances to view . In addition, the node sends an aggregated signature of these messages to the leader of view , which according to the paper, if the leader of is honest, guarantees that all other nodes will enter view within . The current implementation of the PaceMaker is linear communication as long as there are honest leaders, but quadratic upon reaching a view with a Byzantine one. The latency is constant.
Many other works on consensus rely on view synchronization as part of their design. For example, in [34] a doubling view synchronization technique is used: “For the view-change process, each replica will start with a timeout and double this timeout after each view-change (exponential backoff). When communication becomes reliable, exponential backoff guarantees that all replicas will eventually view-change to the same view at the same time.”
The idea of doubling round duration to cope with partial synchrony borrows from the DLS work [2], and has been employed in PBFT [15] and in various works based on DLS/PBFT [33][25][17]. In these works, nodes double the length of each view when no progress is made. The broadcast-based synchronization algorithm is also employed as part of the consensus protocol in works such as PBFT.
HotStuff [17] encapsulates view synchronization in a separate module named a PaceMaker. Here, we provide a formal definition, concrete solutions, and performance analysis of such a module. HotStuff is the core consensus protocol of various works such as Cypherium [11], PaLa [13], and LibraBFT [33]. Other consensus protocols such as Tendermint [25] and Casper [23] reported issues related to the liveness of their design [26][27].
Causal ordering is a notion designed to give partial ordering to events in a distributed system. The most known protocols to provide such ordering are Lamport Timestamps [35] and vector clocks [36]. Both works assume a non-crash setting.
Another line of work stemmed from Awerbuch’s work on synchronizers [37]. The synchronizer in Awerbuch’s work is designed to allow an algorithm that is designed to run in a synchronous network to run in an asynchronous network without any changes to the synchronous protocol itself. This work is orthogonal to the work in this paper.
Recently, Ford published preliminary work on Threshold Logical Clocks (TLC) [38]. In a crash-fail asynchronous setting, TLC places a barrier on view advancement, i.e., nodes advance to view only after a threshold of them reached view . A few techniques are also described on how to convert TLCs to work in the presence of Byzantine nodes. The TLC notion of a view “barrier” is orthogonal to view synchronization, though a 2-phase TLC is very similar to our reliable broadcast synchronizer.
The seminal work of Chandra & Toueg [18], [19] introduces the leader election abstraction, denoted , and proves it is the weakest failure detector needed to solve consensus. By using , consensus protocols can usually be written in a more natural way. The view synchronization problem is similar to , but differs in several ways. First, it lacks any notion of leader and isolates the view synchronization component. Second, view synchronization adds recurrence to the problem definition. Third, it has a built-in notion of view-duration: nodes commit to spend a constant tine in a view before moving to the next. Last, this paper focuses on latency and communication costs of synchronizer implementations.
Dutta et al. [39] look at the number of rounds it takes to reach consensus in the crash-fail model after a time defined as GSR (Global Stabilization Round) which only correct nodes enter. This work provides an upper and a lower bound for reaching consensus in this setting. Other works such as [40][41] further discuss the latency for reaching consensus in the crash-fail model. These works focus on the latency for reaching consensus after . Both bounds are tangential to our performance measures, as they analyze round latency. GIRAF [42][43] is a view-based framework to analyze consensus protocols, and specifically analyzes protocols in the crash-fail model.
Dolev et al. [32] showed a quadratic lower bound on communication complexity to reach deterministic Byzantine broadcast, which can be reduced to consensus. This lower bound is an intuitive baseline for work like ours, though it remains open to prove a quadratic lower bound on view synchronization per se.
The clock synchronization problem [44] in a distributed system requires that the maximum difference between the local clock of the participating nodes is bounded throughout the execution, which is possible since most works assume a synchronous setting. The clock synchronization problem is well-defined and well-treaded, and there are many different algorithms to ensure this in different models, e.g., [45][46][47]. In practical distributed networks, the most prevalent protocol is NTP [48]. Again, clock synchronization is an orthogonal notion to view synchronization, the latter guaranteeing to and stay in a view within a bounded window, but does not place any bound on the views of different nodes at any point in time.
We formally defined the Byzantine view synchronization problem, which bridges classic works on failure detectors aimed to solve one-time consensus, and SMR which consists of multiple one-time consensus instances. We presented which is a view synchronization algorithm that displays linear communication cost and constant latency under a broad variety of scenarios.
This project was partially funded by a grant from the Technion Hiroshi Fujiwara Cyber Security Research Center.
In this section we place into the view synchronization framework two view synchronization algorithms which are used in various consensus protocols, and prove their correctness, as well as discuss their latency and message complexity.
All protocol messages between nodes are signed and verified; for brevity, we omit the details about the cryptographic signatures.
A solution approach inspired by PBFT [15] is to use view doubling as the view synchronization technique. In this approach, each view has a timer, and if no progress is made the node tries to move to the next view and doubles the timer time for the next view. Whenever progress is made, the node resets its timer. This approach is intertwined with the consensus protocol itself, making it hard to separate, as the messages of the consensus protocol are part of the mechanism used to reset the timer.
We adopt this approach and turn it into an independent synchronizer that requires no messages. Fist, the nodes need to agree on some predefined constant which is the duration of the first view. Next, there exists some global view duration mapping , which maps a view to its duration: . A node in a certain view must move to the next view once this duration passes, regardless of the outer protocol actions.
The view doubling protocol is described in Algorithm 2. A node starts at view () and a view duration of (Line 4). Next, when is called, a counter named is incremented (Line 5). This counter guarantees validity by moving to a view only when the counter reaches . Every time a view ends (Line 7), an internal counter is incremented, and if the allows it, the synchronizer outputs with a new view .
We show that the view doubling protocol achieves the properties required by a synchronizer.
Lemma A.1: The view doubling protocol achieves view synchronization (Property 1).
PROOF: Since this protocol does not require sending messages between nodes, the Byzantine nodes cannot affect the behavior of the honest nodes, and we can treat all nodes as honest.
Recall that denotes the time by which all the honest nodes started their local execution of . Let be the view at which node is at during . W.l.o.g assume at time . It follows from the definition of and the sum of a geometric series that
We begin by showing that for every the following condition holds: for any view . Let and . From the ordering of the node starting times, for all . We get:
Hence, for , since at node had a view number larger than , then will start all future views before .
Next, let and , i.e., the minimal view and the maximal view at respectively. To prove that the first interval of view synchronization is achieved, it suffices to show that for any constant there exists a time interval and a view such that and . Using this, we will show that there exists an infinite number of such intervals and views that will conclude the proof. This also ensures that there is an infinite number of such views with honest leaders.
Indeed, first note that as shown above, node will start view before any other node in the system. The left-hand side of the equation is the time length in which both node and node execute together view . If the left-hand side is negative, then there does not exist an overlap, and if it is positive then an overlap exists.
We get
For any there exists a minimum view number such that the inequality holds, and since is the minimum view number at this solution holds for any other node as well. In addition, for any the inequality also holds, meaning there is an infinite number of solutions for it, including an infinite number of views with an honest leader.
If is called in intervals with then by the time the value of reaches some view value , will always be bigger than , meaning the condition in will always be true, and the synchronizer will always propose view by the time stated in Equation 1.
Lemma A.2: The view doubling protocol achieves synchronization validity (Property 2).
PROOF: The if condition in Line 10 ensures that the output of the synchronizer will always be a view that a node wished to advance to.
This concludes the proof that view doubling is a synchronizer for any .
Since the protocol sends no messages between the nodes, it is immediate that the communication complexity is .
As for latency, the minimal satisfying Equation 2 grows with . Since the initial view-gap is unbounded, so is the view in which synchronization is reached. The latency to synchronization is , also unbounded.
Another leaderless approach is based on the Bracha reliable broadcast protocol [28] and is presented in Algorithm 3. In this protocol, when a node wants to advance to the next view it multicasts a message (multicast means to send the message to all the nodes including the sender) (Line 3). When at least messages are received by an honest node, it multicasts as well (Line 5). A node advances to view upon receiving messages (Line 7).
We start by showing that the broadcast-based synchronizer achieves eventual view synchronization (Property 1) for any . Thus, the claims and lemmas below assume this.
Claim A.3: After GST, whenever an honest node enters view at time , all other honest nodes enter view by , i.e.,
PROOF: Suppose an honest node enters view at time , then it received messages, from at least honest nodes (Line 7).
Since the only option for an honest node to disseminate message is by multicasting it, then by all nodes will receive at least messages. Then, any left honest nodes (at most nodes) will thus receive enough to multicast the message on their own (Line 5) which will be received by by all the nodes. This ensures that all the honest nodes receive messages and enter view by .
Claim A.4: After GST, eventually an honest node enters some new view.
PROOF: All honest nodes begin their local execution at view , potentially at different times. Based on the protocol eventually at least nodes (some of them might be Byzantine) send . This is because is called every . Thus, eventually all honest nodes will reach view , and from Claim A.3 the difference between their entry is at most after .
The above argument can be applied inductively. Suppose at time node is at view . We again know that by all other honest nodes are also at view , and once are sent all honest nodes will eventually enter view , and we are done.
Lemma A.5: The broadcast-based protocol achieves view synchronization (Property 1).
Proof. From Claim A.4 an honest node will eventually advance to some new view and from Claim A.3 after all other honest nodes will join it. For any , if the honest nodes call every then it is guaranteed that all the honest nodes will execute view together for at least time, since it requires messages to move to view , i.e., at least one message is sent from an honest node.
This argument can be applied inductively, and each view after is synchronized, thus making an infinite number of time intervals and views which all honest leaders execute at the same time.
Lemma A.6: The broadcast-based synchronizer achieves synchronization validity (Property 2).
PROOF: In order for an honest node to advance to view it has to receive messages (Line 7). From those, at least originated from honest nodes. An honest node can send on two scenarios:
(i) was called when the node was at view (Line 3) and we are done.
(ii) It received messages (Line 5), meaning at least one honest node which already sent the message was at view and called and again we are done.
This concludes the proof that the broadcast-based synchronizer is a view synchronizer for any .
The broadcast-based algorithm synchronizes every view after within Since the leaders of each view are allocated by the mapping , in expectation every nodes have an honest leader (see the communication complexity analysis done for in Section 4). Therefore, for latency, the broadcast-based synchronizer will take an expected constant time to reach view synchronization after , as we have proved, and also the same between every two consecutive occurrences of view synchronization. Thus, the latency of this protocol is expected . In the worst-case of consecutive failures, the latency is .
For communication costs, the protocol requires that every node sends one message to all the other nodes, and since the latency is expected constant, the overall communication costs are also expected quadratic, i.e., . In the worst-case of consecutive failures, the communication complexity is .
The three presented synchronizers in the paper have tradeoffs in their latency and communication costs, which are summarized in Table 1. Hence, a protocol designer may choose a synchronizer based on its needs and constraints. It might be possible to create combinations of the three protocols and achieve hybrid characteristics; we leave such variations for future work.
In addition, there are differences in the constraints on the parameter in these protocols, which is the time interval between two successive calls to (see Property 1). The view doubling synchronizer prescribes a precise , which results in each view duration to be exactly twice as its predecessor. In the other two synchronizers there is only a lower bound on : in the broadcast-based it is , and in it is .
This difference is significant. Suppose an upper-layer protocol utilizing the synchronizer wishes to spend an unbounded amount of time in each view as long as progress is made, and triggers a view-change upon detecting that progress is lost. While the broadcast-based and algorithms allow this upper-layer behavior, the view doubling technique does not, and thus may influence the decision on which view synchronization algorithm to choose.
Another difference between the algorithms is that the view-doubling and the broadcast-based synchronizers both guarantee that after the first synchronized view, all subsequent views are also synchronized, regardless if the leaders are honest or not. only guarantees synchronization after GST in views that have an honest leader. For most leader-based consensus protocols, this guarantee suffices to ensure progress, other protocols using a synchronizer might find the strengthened guarantee preferable.