[Paper Reading] The Honey Badger of BFT Protocols

Article link: https://blog.csdn.net/miracleoa/article/details/127222512

Paper title: The Honey Badger of BFT Protocols

Abstract

加密货币的惊人成功引发了人们对为关键任务应用(如金融交易)部署大规模、高鲁棒、拜占庭容错(BFT)协议的兴趣激增。

The surprising success of cryptocurrencies has led to a surge of interest in deploying large scale, highly robust, Byzantine fault tolerant (BFT) protocols for mission-critical applications, such as financial transactions.

尽管传统的做法是在PBFT(或其变体)等(弱)同步协议的基础上构建，但此类协议严重依赖于网络定时假设，并且仅在网络行为符合预期时才能保证活性。

Although the conventional wisdom is to build atop a (weakly) synchronous protocol such as PBFT (or a variation thereof), such protocols rely critically on network timing assumptions, and only guarantee liveness when the network behaves as expected.

我们认为这些协议不适合这种部署场景。

We argue these protocols are ill-suited for this deployment scenario.

我们提出了一个替代方案，HoneyBadgerBFT，这是第一个实用的异步BFT协议，它在不做任何时间假设的情况下保证了活性。

We present an alternative, HoneyBadgerBFT, the first practical asynchronous BFT protocol, which guarantees liveness without making any timing assumptions.

我们的解决方案基于一种新颖的原子广播协议，它能达到最优的渐近效率。

We base our solution on a novel atomic broadcast protocol that achieves optimal asymptotic efficiency.

我们给出了一个实现和实验结果，表明我们的系统可以实现每秒数万个事务的吞吐量，并扩展到广域网上的100多个节点。

We present an implementation and experimental results to show our system can achieve throughput of tens of thousands of transactions per second, and scales to over a hundred nodes on a wide area network.

我们甚至在Tor上进行BFT实验，不需要调优任何参数。

We even conduct BFT experiments over Tor, without needing to tune any parameters.

与其他替代方案不同，HoneyBadgerBFT根本不关心底层网络。

Unlike the alternatives, HoneyBadgerBFT simply does not care about the underlying network.

1. Introduction

分布式容错协议是关键任务基础设施(如金融事务数据库)的有前途的解决方案。

Distributed fault tolerant protocols are promising solutions for mission-critical infrastructure, such as financial transaction databases.

传统上，它们以相对较小的规模部署，通常部署在单一的管理领域，对抗性攻击可能不是主要问题。

Traditionally, they have been deployed at relatively small scale, and typically in a single administrative domain where adversarial attacks might not be a primary concern.

作为一个代表性的例子，谷歌的容错锁服务Chubby[14]的部署由五个节点组成，最多可容忍两个崩溃故障。

As a representative example, a deployment of Google’s fault tolerant lock service, Chubby [14], consists of five nodes, and tolerates up to two crash faults.

近年来，一种被称为“加密货币”或“区块链”的分布式系统的新体现出现了，首先是比特币的显著成功[43]。

In recent years, a new embodiment of distributed systems called “cryptocurrencies” or “blockchains” has emerged, beginning with Bitcoin’s phenomenal success [43].

这样的加密货币系统代表了[12]令人惊讶和有效的突破，打开了我们对分布式系统理解的新篇章。

Such cryptocurrency systems represent a surprising and effective breakthrough [12], and open a new chapter in our understanding of distributed systems.

加密货币系统挑战了我们对容错协议部署环境的传统看法。

Cryptocurrency systems challenge our traditional belief about the deployment environment for fault tolerance protocols.

与经典的“谷歌中的5个胖胖的节点”环境不同，加密货币揭示并刺激了对广域网络上的共识协议的新需求，在大量相互不信任的节点之间，此外，网络连接可能比经典的LAN设置更不可预测，甚至是对抗的。

Unlike the classic “5 Chubby nodes within Google” environment, cryptocurrencies have revealed and stimulated a new demand for consensus protocols over a wide area network, among a large number of nodes that are mutually distrustful, and moreover, network connections can be much more unpredictable than the classical LAN setting, or even adversarial.

这种新的设置提出了有趣的新挑战，并呼吁我们重新考虑容错协议的设计。

This new setting poses interesting new challenges, and calls upon us to rethink the design of fault tolerant protocols.

健壮是一等公民。

Robustness is a first-class citizen.

加密货币证明了对一种不同寻常的操作点的需求和可行性，该操作点将稳健性置于一切之上，即使是以牺牲性能为代价。

Cryptocurrencies demonstrate the demand for and viability of an unusual operating point that prioritizes robustness above all else, even at the expense of performance.

事实上，按照分布式系统的标准，比特币提供了糟糕的性能:一个交易平均需要10分钟才能提交，而整个系统的吞吐量为每秒10个交易。

In fact, Bitcoin provides terrible performance by distributed systems standards: a transaction takes on average 10 minutes to be committed, and the system as a whole achieves throughput on the order of 10 transactions per second.

然而，与传统的容错部署场景相比，加密货币在高度敌对的环境中蓬勃发展，在这种环境中，动机良好的恶意攻击是预料之中的(如果不是普遍的)。

However, in comparison with traditional fault tolerant deployment scenarios, cryptocurrencies thrive in a highly adversarial environment, where well-motivated and malicious attacks are expected (if not commonplace).

由于这个原因，许多比特币的狂热支持者将其称为“金钱中的蜜糖獾”[41]。

For this reason, many of Bitcoin’s enthusiastic supporters refer to it as the “Honey Badger of Money” [41].

我们注意到，对健壮性的需求往往与对分散化的需求密切相关- -因为分散化通常需要广域网络中大量不同的参与者的参与。

We note that the demand for robustness is often closely related to the demand for decentralization — since decentralization would typically require the participation of a large number of diverse participants in a wide-area network.

优先考虑吞吐量而不是延迟。

Favor throughput over latency.

大多数现有的关于可扩展容错协议的研究[6,49]都关注于优化由单一管理域控制的局域网环境中的可伸缩性。

Most existing works on scalable fault tolerance protocols [6, 49] focus on optimizing scalability in a LAN environment controlled by a single administrative domain.

由于带宽供应充足，这些工作通常专注于减少(加密)计算和在竞争(即请求竞争同一对象)下最小化响应时间。

Since bandwidth provisioning is ample, these works often focus on reducing (cryptographic) computations and minimizing response time while under contention (i.e., requests competing for the same object).

相比之下，区块链激发了人们对一类金融应用程序的兴趣，这些应用程序的响应时间和争用不是最关键的因素，例如支付和结算网络[1]。

In contrast, blockchains have stirred interest in a class of financial applications where response time and contention are not the most critical factors, e.g., payment and settlement networks [1].

事实上，一些金融应用程序有意地在提交事务时引入延迟，以允许可能的回滚/回退操作。

In fact, some financial applications intentionally introduce delays in committing transactions to allow for possible rollback/chargeback operations.

尽管这些应用程序的延迟并不严重，但银行和金融机构已表示对区块链技术的高吞吐量替代方案感兴趣，以能够维持高容量的请求。

Although these applications are not latency critical, banks and financial institutions have expressed interest in a high-throughput alternative of the blockchain technology, to be able to sustain high volumes of requests.

例如，Visa平均处理2,000 tx/sec，峰值为59,000 tx/sec[1]。

For example, Visa processes 2,000 tx/sec on average, with a peak of 59,000 tx/sec [1].

1.1我们的贡献

1.1 Our Contributions

时间假设被认为是有害的。

Timing assumptions considered harmful.

大多数现有的拜占庭容错(BFT)系统，甚至那些被称为“鲁棒”的系统，都假设弱同步的某些变化，其中，粗略地说，保证消息在特定的∆之后被交付，但∆可能是时变的或协议设计者未知的。

Most existing Byzantine fault tolerant (BFT) systems, even those called “robust,” assume some variation of weak synchrony, where, roughly speaking, messages are guaranteed to be delivered within a certain bound ∆, but ∆ may be time-varying or unknown to the protocol designer.

我们认为，基于时间假设的协议不适合去中心化的加密货币设置，在这种情况下，网络链接可能不可靠，网络速度变化迅速，网络延迟甚至可能是由对抗引起的。

We argue that protocols based on timing assumptions are unsuitable for decentralized, cryptocurrency settings, where network links can be unreliable, network speeds change rapidly, and network delays may even be adversarially induced.

首先，当预期的时间假设被违反(例如，由于恶意的网络对手)时，弱同步协议的活性属性可能完全失效。

First, the liveness properties of weakly synchronous protocols can fail completely when the expected timing assumptions are violated (e.g., due to a malicious network adversary).

为了演示这一点，我们显式地构造了一个违反假设的对抗性“间歇同步”网络，这样现有的弱同步协议(如PBFT[20])就会逐渐停止(第3节)。

To demonstrate this, we explicitly construct an adversarial “intermittently synchronous” network that violates the assumptions, such that existing weakly synchronous protocols such as PBFT [20] would grind to a halt (Section 3).

其次，即使弱同步假设在实践中得到满足，当底层网络不可预测时，弱同步协议在吞吐量方面也会显著降低。

Second, even when the weak synchrony assumptions are satisfied in practice, weakly synchronous protocols degrade significantly in throughput when the underlying network is unpredictable.

理想情况下，我们希望协议的吞吐量能够密切跟踪网络性能，即使在快速变化的网络条件下也是如此。

Ideally, we would like a protocol whose throughput closely tracks the network’s performance even under rapidly changing network conditions.

不幸的是，弱同步协议需要的超时参数很难调优，特别是在加密货币应用程序设置中;当选择的超时值过长或过短时，吞吐量就会受到影响。

Unfortunately, weakly synchronous protocols require timeout parameters that are finicky to tune, especially in cryptocurrency application settings; and when the chosen timeout values are either too long or too short, throughput can be hampered.

作为一个具体的例子，我们展示了即使在满足弱同步假设的情况下，此类协议从暂态网络分区中恢复的速度也很慢(第3节)。

As a concrete example, we show that even when the weak synchrony assumptions are satisfied, such protocols are slow to recover from transient network partitions (Section 3).

实用的异步BFT。

Practical asynchronous BFT.

我们提出了HoneyBadgerBFT，这是第一个在异步设置中提供最佳渐近效率的BFT原子广播协议。

We propose HoneyBadgerBFT, the first BFT atomic broadcast protocol to provide optimal asymptotic efficiency in the asynchronous setting.

因此，我们直接反驳这种协议必然不切实际的普遍观点。

We therefore directly refute the prevailing wisdom that such protocols are necessarily impractical.

我们对Cachin等人[15]提出的已知最好的异步原子广播协议进行了显著的效率改进，该协议要求每个节点对每个提交的事务传输O(N^2)位，基本上限制了除最小网络外的所有网络的吞吐量。

We make significant efficiency improvements on the best previously known asynchronous atomic broadcast protocol, due to Cachin et al. [15], which requires each node to transmit O(N^2) bits for each committed transaction, substantially limiting its throughput for all but the smallest networks.

这种低效率有两个根本原因。

This inefficiency has two root causes.

第一个原因是各方之间的冗余工作。

The first cause is redundant work among the parties.

然而，以朴素(naïve)的方式消除冗余会损害公平性属性，并允许有针对性的审查攻击。

However, a naïve attempt to eliminate the redundancy compromises the fairness property, and allows for targeted censorship attacks.

我们发明了一种新的解决方案来克服这个问题，使用阈值公开密钥加密来防止这些攻击。

We invent a novel solution to overcome this problem by using threshold public-key encryption to prevent these attacks.

第二个原因是使用了异步公共子集(ACS)子组件的次优实例化。

The second cause is the use of a suboptimal instantiation of the Asynchronous Common Subset (ACS) subcomponent.

我们展示了如何通过结合现有但被忽视的技术有效地实例化ACS:使用擦除码[18]的高效可靠广播，以及从多方计算文献[9]中从ACS简化为可靠广播。

We show how to efficiently instantiate ACS by combining existing but overlooked techniques: efficient reliable broadcast using erasure codes [18], and a reduction from ACS to reliable broadcast from the multi-party computation literature [9].

HoneyBadgerBFT的设计针对类似加密货币的部署场景进行了优化，其中网络带宽是稀缺资源，但计算量相对充足。

HoneyBadgerBFT’s design is optimized for a cryptocurrency-like deployment scenario where network bandwidth is the scarce resource, but computation is relatively ample.

这允许我们利用密码构建块(特别是阈值公钥加密)，而在经典的容错数据库设置中，这种设置的主要目标是即使在争用情况下也要最小化响应时间，因此会被认为太昂贵。

This allows us to take advantage of cryptographic building blocks (in particular, threshold public-key encryption) that would be considered too expensive in a classical fault-tolerant database setting where the primary goal is to minimize response time even under contention.

在异步网络中，消息最终被传递，但不做其他计时假设。

In an asynchronous network, messages are eventually delivered but no other timing assumption is made.

与现有的弱同步协议(其中参数调优非常繁琐)不同，HoneyBadgerBFT不关心这些。

Unlike existing weakly synchronous protocols where parameter tuning can be finicky, HoneyBadgerBFT does not care.

不管网络条件如何波动，HoneyBadgerBFT的吞吐量总是密切跟踪网络的可用带宽。

Regardless of how network conditions fluctuate, HoneyBadgerBFT’s throughput always closely tracks the network’s available bandwidth.

不精确地说，只要消息最终得到传递，HoneyBadgerBFT最终就会取得进展;而且，一旦消息被传递，它就会取得进展。

Imprecisely speaking, HoneyBadgerBFT eventually makes progress as long as messages eventually get delivered; moreover, it makes progress as soon as messages are delivered.

我们正式证明了我们的HoneyBadgerBFT协议的安全性和活性，并通过实验表明，即使在最乐观的情况下，它也比经典的PBFT协议[20]提供了更好的吞吐量。

We formally prove the security and liveness of our HoneyBadgerBFT protocol, and show experimentally that it provides better throughput than the classical PBFT protocol [20] even in the optimistic case.

实施和大规模实验。我们提供了HoneyBadgerBFT的完整实现，它将在不久的将来作为免费的开源软件发布。我们展示了分布在5个大洲的超过100个节点的亚马逊AWS部署的实验结果。为了演示它的多功能性和鲁棒性，我们还在Tor匿名中继网络上部署了HoneyBadgerBFT，而不更改任何参数，并给出了吞吐量和延迟结果。

Implementation and large-scale experiments. We provide a full-fledged implementation of HoneyBadgerBFT, which we will release as free open source software in the near future. We demonstrate experimental results from an Amazon AWS deployment with more than 100 nodes distributed across 5 continents. To demonstrate its versatility and robustness, we also deployed HoneyBadgerBFT over the Tor anonymous relay network without changing any parameters, and present throughput and latency results.

1.2建议部署场景

1.2 Suggested Deployment Scenarios

在众多可能的应用程序中，我们强调了银行、金融机构和完全去中心化加密货币倡导者所追求的两种可能的部署场景。

Among numerous conceivable applications, we highlight two likely deployment scenarios that are sought after by banks, financial institutions, and advocates for fully decentralized cryptocurrencies.

联盟加密货币。

Confederation cryptocurrencies.

比特币等去中心化加密货币的成功，激发了银行和金融机构以新的眼光审视自己的交易处理和结算基础设施。

The success of decentralized cryptocurrencies such as Bitcoin has inspired banks and financial institutions to inspect their transaction processing and settlement infrastructure in a new light.

“联邦加密货币”是一个经常被引用的愿景[24,25,47]，在这个愿景中，金融机构联合为一个拜占庭协议协议做出贡献，以允许快速和健壮的交易结算。

“Confederation cryptocurrency” is an oft-cited vision [24, 25, 47], where a conglomerate of financial institutions jointly contribute to a Byzantine agreement protocol to allow fast and robust settlement of transactions.

人们对这种方式的热情高涨，认为它将简化目前缓慢而笨拙的银行间结算基础设施。

Passions are running high that this approach will streamline today’s slow and clunky infrastructure for inter-bank settlement.

因此，一些新的开源项目致力于为这种设置构建一个合适的BFT协议，例如IBM的open区块链和Hyperledger项目[40]。

As a result, several new open source projects aim to build a suitable BFT protocol for this setting, such as IBM’s Open Blockchain and the Hyperledger project [40].

联盟加密货币将需要部署在广域网上的BFT协议，可能涉及数百到数千个共识节点。

A confederation cryptocurrency would require a BFT protocol deployed over the wide-area network, possibly involving hundreds to thousands of consensus nodes.

在这种设置中，注册可以很容易地控制，使得共识节点集是事先已知的，通常称为“许可型”(permissioned)区块链。

In this setting, enrollment can easily be controlled, such that the set of consensus nodes is known a priori, often referred to as the “permissioned” blockchain.

显然，HoneyBadgerBFT是在这种联盟加密货币中使用的自然候选对象。

Clearly HoneyBadgerBFT is a natural candidate for use in such confederation cryptocurrencies.

适用于无许可区块链。

Applicability to permissionless blockchains.

相比之下，去中心化加密货币如比特币和以太坊选择了“未经许可”的区块链，注册对任何人开放，节点可以动态和频繁地加入和离开。

By contrast, decentralized cryptocurrencies such as Bitcoin and Ethereum opt for a “permissionless” blockchain, where enrollment is open to anyone, and nodes may join and leave dynamically and frequently.

为了在这种设置下实现安全性，已知的共识协议依赖于工作量证明来击败Sybil攻击，并在吞吐量和延迟方面付出了巨大的代价，例如，比特币每10分钟提交一次交易，其吞吐量限制为7 tx/秒，即使在当前块大小最大化的情况下。

To achieve security in this setting, known consensus protocols rely on proofs-of-work to defeat Sybil attacks, and pay an enormous price in terms of throughput and latency, e.g., Bitcoin commits transactions every ~10 min, and its throughput is limited to 7 tx/sec even when the current block size is maximized.
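As a rough sanity check on the 7 tx/sec figure, the following back-of-the-envelope sketch derives a throughput ceiling from the block interval. The 1 MB block size cap and the 250-byte average transaction size are assumptions made here for illustration; only the ~10 minute interval comes from the text.

```python
def max_tx_per_sec(block_size_bytes: int, avg_tx_bytes: int,
                   block_interval_sec: float) -> float:
    """Upper bound on throughput when every block is completely full."""
    txs_per_block = block_size_bytes // avg_tx_bytes
    return txs_per_block / block_interval_sec


# Assumed parameters (not stated in the text): 1 MB blocks, 250-byte transactions.
BLOCK_SIZE_BYTES = 1_000_000
AVG_TX_BYTES = 250
# From the text: a block roughly every 10 minutes.
BLOCK_INTERVAL_SEC = 600

rate = max_tx_per_sec(BLOCK_SIZE_BYTES, AVG_TX_BYTES, BLOCK_INTERVAL_SEC)
print(f"throughput ceiling: {rate:.1f} tx/sec")
```

Under these assumed sizes the ceiling lands just under 7 tx/sec, which is consistent with the limit quoted above; a different average transaction size would shift the number proportionally.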

最近的几项研究提出了一个很有前途的想法，即利用较慢的外部区块链(如比特币)或涉及基础货币本身的经济“权益证明”假设[32,32,35,37]，通过选择一个随机委员会在每个不同的时代执行BFT来引导更快的BFT协议。

Several recent works have suggested the promising idea of leveraging either a slower, external blockchain such as Bitcoin or economic “proof-of-stake” assumptions involving the underlying currency itself [32, 32, 35, 37] to bootstrap faster BFT protocols, by selecting a random committee to perform BFT in every different epoch.

这些方法承诺在开放注册、去中心化网络以及与经典BFT协议相匹配的吞吐量和响应时间方面实现两方面的最佳效果。

These approaches promise to achieve the best of both worlds, security in an open enrollment, decentralized network, and the throughput and response time matching classical BFT protocols.

HoneyBadgerBFT也是一个自然的选择，因为随机选择的委员会可能在地理上是不同的。

Here too HoneyBadgerBFT is a natural choice since the randomly selected committee can be geographically heterogeneous.

2. 背景和相关工作

2. BACKGROUND AND RELATED WORK

我们的总体目标是构建一个复制状态机，客户端生成并提交事务，节点网络接收并处理事务。

Our overall goal is to build a replicated state machine, where clients generate and submit transactions and a network of nodes receives and processes them.

它从特定于应用程序的细节(例如如何表示状态和计算转换)中抽象出来，足以构建一个完全全局一致的、完全有序的、仅可追加的事务日志。

Abstracting away from application specific details (such as how to represent state and compute transitions), it suffices to build a globally-consistent, totally-ordered, append-only transaction log.

传统上，这样的原语称为总阶或原子广播[23];用比特币的说法，我们称之为区块链。

Traditionally, such a primitive is called total order or atomic broadcast [23]; in Bitcoin parlance, we would call it a blockchain.

容错状态机复制协议提供了强大的安全性和活性保证，允许分布式系统在网络延迟和一些节点故障的情况下提供正确的服务。

Fault tolerant state machine replication protocols provide strong safety and liveness guarantees, allowing a distributed system to provide correct service in spite of network latency and the failure of some nodes.

大量的工作研究了这样的协议，提供了不同的性能权衡，容忍不同形式的故障和攻击，并对底层网络做出了不同的假设。

A vast body of work has studied such protocols, offering different performance tradeoffs, tolerating different forms of failures and attacks, and making varying assumptions about the underlying network.

我们在下面解释与我们最密切相关的努力。

We explain below the most closely related efforts to ours.

2.1健壮的BFT协议

2.1 Robust BFT Protocols

Paxos[36]、Raft[45]和许多其他著名的协议容忍崩溃故障，而拜占庭容错协议(BFT)，从PBFT[20]开始，甚至容忍任意(例如，恶意)损坏的节点。

While Paxos [36], Raft [45], and many other well-known protocols tolerate crash faults, Byzantine fault tolerant protocols (BFT), beginning with PBFT [20], tolerate even arbitrary (e.g., maliciously) corrupted nodes.

许多后续协议提供了改进的性能，通常是通过乐观的执行，在没有故障、客户端竞争不太多、网络行为良好、至少在其他方面有一些进步的情况下提供了出色的性能[2,5,33,39,51]。

Many subsequent protocols offer improved performance, often through optimistic execution that provides excellent performance when there are no faults, clients do not contend much, and the network is well-behaved, and at least some progress otherwise [2, 5, 33, 39, 51].

一般来说，BFT系统是在部署场景中评估的，其中延迟和CPU是瓶颈[49]，因此最有效的协议减少了轮数并将昂贵的加密操作最小化。

In general, BFT systems are evaluated in deployment scenarios where latency and CPU are the bottleneck [49], thus the most effective protocols reduce the number of rounds and minimize expensive cryptographic operations.

Clement等人[22]发起了一项最新的工作[4,6,10,21,22,50]，主张改进最坏情况下的性能，即使在系统受到攻击时也提供服务质量保证——即使这是以在乐观情况下牺牲性能为代价的。

Clement et al [22] initiated a recent line of work [4, 6, 10, 21, 22, 50] by advocating improvement of the worst-case performance, providing service quality guarantees even when the system is under attack — even if this comes at the expense of performance in the optimistic case.

然而，尽管这种类型的“健壮BFT”协议优雅地容忍受损节点，但它们仍然依赖于对底层网络的时序假设。

However, although the “Robust BFT” protocols in this vein gracefully tolerate compromised nodes, they still rely on timing assumptions about the underlying network.

我们的工作进一步采用了这种方法，即使在完全异步的网络中也能保证良好的吞吐量。

Our work takes this approach further, guaranteeing good throughput even in a fully asynchronous network.

2.2 随机的协议。

2.2 Randomized Agreement

对于大多数任务[27]，确定性异步协议是不可能的。

Deterministic asynchronous protocols are impossible for most tasks [27].

虽然绝大多数实际BFT协议通过做时间假设来避免这种不可能的结果，但随机性(特别是密码学)提供了另一种途径。

While the vast majority of practical BFT protocols steer clear of this impossibility result by making timing assumptions, randomness (and, in particular, cryptography) provides an alternative route.

事实上，我们知道异步BFT协议可用于各种任务，如二进制协议(ABA)、可靠广播(RBC)等[13,15,16]。

Indeed we know of asynchronous BFT protocols for a variety of tasks such as binary agreement (ABA), reliable broadcast (RBC), and more [13, 15, 16].

我们的工作与SINTRA[17]密切相关，SINTRA是一个基于Cachin等人(CKPS01)[15]的异步原子广播协议的系统实现。

Our work is most closely related to SINTRA [17], a system implementation based on the asynchronous atomic broadcast protocol from Cachin et al (CKPS01) [15].

该协议由原子广播协议(ABC)简化为公共子集协议(ACS)， ACS简化为多值验证协议(MVBA)。

This protocol consists of a reduction from atomic broadcast (ABC) to common subset agreement (ACS), as well as a reduction from ACS to multi-value validated agreement (MVBA).

我们贡献的关键发明是从ABC到ACS的新颖简化，通过批处理提供了更好的效率(通过一个O(N)因子)，同时使用阈值加密来保持审查弹性(见第4.4节)。

The key invention we contribute is a novel reduction from ABC to ACS that provides better efficiency (by an O(N) factor) through batching, while using threshold encryption to preserve censorship resilience (see Section 4.4).

我们还通过从改进的子组件实例化的文献中筛选获得了更好的效率。

We also obtain better efficiency by cherry-picking from the literature improved instantiations of subcomponents.

特别是，我们通过使用ACS[9]和高效的RBC[18]来避开昂贵的MVBA原语，详见4.4节。

In particular, we sidestep the expensive MVBA primitive by using an alternative ACS [9] along with an efficient RBC [18] as explained in Section 4.4.

Table 1 summarizes the asymptotic performance of HoneyBadgerBFT with several other atomic broadcast protocols. Here “Comm.compl.” denotes the expected communication complexity (i.e., total bytes transferred) per committed transaction. Since PBFT relies on weak synchrony assumptions, it may therefore fail to make progress at all in an asynchronous network. Protocols KS02 [34] and RC05 [46] are optimistic, falling back to an expensive recovery mode based on MVBA. As mentioned the protocol of Cachin et al (CKPS01) [15] can be improved using a more efficient ACS construction [9, 18]. We also obtain another O(N) improvement through our novel reduction.

表1总结了HoneyBadgerBFT与其他几个原子广播协议的渐近性能。这里 “Comm.compl.”表示每个提交的事务的预期通信复杂度(即传输的总字节数)。由于PBFT依赖于较弱的同步假设，因此在异步网络中它可能根本无法取得进展。协议KS02[34]和RC05[46]是乐观的，回落到基于MVBA的昂贵恢复模式。如前所述，Cachin等人(CKPS01)的[15]协议可以使用更有效的ACS结构进行改进[9,18]。通过我们的新还原，我们还获得了另一个O(N)的改进。

最后，King和Saia[30,31]最近通过在稀疏图上路由通信开发了消息数量小于二次的协议协议。

Finally, King and Saia [30,31] have recently developed agreement protocols with less-than-quadratic number of messages by routing communications over a sparse graph.

然而，将这些结果扩展到异步设置仍然是一个开放的问题。

However, extending these results to the asynchronous setting remains an open problem.

表1:原子广播协议的渐近通信复杂度(每事务位，预期)。

Table 1: Asymptotic communication complexity (bits per transaction, expected) for atomic broadcast protocols

3.异步和弱同步网络模型之间的差距

3. THE GAP BETWEEN ASYNCHRONOUS AND WEAKLY SYNCHRONOUS NETWORK MODELS

几乎所有的现代BFT协议都依赖于时间假设(如部分或弱同步)来保证活性。

Almost all modern BFT protocols rely on timing assumptions (such as partial or weak synchrony) to guarantee liveness.

近年来，纯异步BFT协议受到的关注要少得多。

Purely asynchronous BFT protocols have received considerably less attention in recent years.

考虑下面的论点，如果它成立，将证明这种缩小的焦点是正确的:[X]弱同步假设是不可避免的，因为在任何违反这些假设的网络中，即使异步协议也会提供不可接受的性能。

Consider the following argument, which, if it held, would justify this narrowed focus: [X] Weak synchrony assumptions are unavoidable, since in any network that violates these assumptions, even asynchronous protocols would provide unacceptable performance.

在这一节中，我们提出了两个反驳上述前提的论点。

In this section, we make two counterarguments that refute the premise above.

首先，我们说明了异步和弱同步网络模型之间的理论分离。

First, we illustrate the theoretical separation between the asynchronous and weakly synchronous network models.

具体来说，我们构建了一个对抗网络调度器，它违反了PBFT的弱同步假设(并确实导致它失败)，但在这种假设下，任何纯异步协议(如HoneyBadgerBFT)都能取得良好的进展。

Specifically we construct an adversarial network scheduler that violates PBFT’s weak synchrony assumption (and indeed causes it to fail) but under which any purely asynchronous protocol (such as HoneyBadgerBFT) makes good progress.

其次，我们做了一个实际的观察:即使他们的假设得到了满足，弱同步协议从网络分区中恢复的速度很慢，而异步协议在消息被交付后就会迅速恢复。

Second, we make a practical observation: even when their assumptions are met, weakly synchronous protocols are slow to recover from a network partition once it heals, whereas asynchronous protocols make progress as soon as messages are delivered.

3.1多种计时假设形式

3.1 Many Forms of Timing Assumptions

在继续之前，我们回顾了计时假设的各种标准形式。

Before proceeding we review the various standard forms of timing assumptions.

在异步网络中，对手可以在任何时间以任何顺序传递消息，但最终必须传递在正确节点之间发送的每条消息。

In an asynchronous network, the adversary can deliver messages in any order and at any time, but nonetheless must eventually deliver every message sent between correct nodes.

异步网络中的节点实际上并不使用“实时”时钟，只能根据它们接收到的消息的顺序采取行动。

Nodes in an asynchronous network effectively have no use for “real time” clocks, and can only take actions based on the ordering of messages they receive.
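The asynchronous model described above can be sketched as a message pool controlled by a scheduler. This is only an illustration: the class name `AsyncNetwork` and its methods are hypothetical, and a seeded random choice stands in for the adversary's arbitrary choice of delivery order. Note that no clock or timeout appears anywhere; receivers see only the order of deliveries.

```python
import random


class AsyncNetwork:
    """Toy asynchronous network: arbitrary delivery order, eventual delivery."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)  # stand-in for the adversary's choices
        self._in_flight = []             # messages sent but not yet delivered

    def send(self, sender: int, receiver: int, payload: str) -> None:
        self._in_flight.append((sender, receiver, payload))

    def pending(self) -> int:
        return len(self._in_flight)

    def deliver_one(self):
        """Deliver any one pending message, in an order the sender cannot predict."""
        index = self._rng.randrange(len(self._in_flight))
        return self._in_flight.pop(index)

    def drain(self):
        """Deliver every pending message: the 'eventual delivery' guarantee."""
        delivered = []
        while self._in_flight:
            delivered.append(self.deliver_one())
        return delivered


net = AsyncNetwork(seed=1)
sent = [(0, 1, "a"), (0, 2, "b"), (1, 2, "c"), (2, 0, "d"), (1, 0, "e")]
for message in sent:
    net.send(*message)

delivered = net.drain()
print(delivered)  # some permutation of `sent`
```

The order of `delivered` may differ from the order of `sent`, but every message sent is eventually delivered exactly once, which is the only guarantee the model provides.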

众所周知的FLP[27]结果排除了原子广播和许多其他任务使用确定性异步协议的可能性。

The well-known FLP [27] result rules out the possibility of deterministic asynchronous protocols for atomic broadcast and many other tasks.

因此，确定性协议必须做出一些更强的时间假设。

A deterministic protocol must therefore make some stronger timing assumptions.

一个方便的(但非常强大的)网络假设是同步:∆-同步网络保证每条发送的消息最多在∆(其中∆是实时度量)延迟后才被发送。

A convenient (but very strong) network assumption is synchrony: a ∆-synchronous network guarantees that every message sent is delivered after at most a delay of ∆ (where ∆ is a measure of real time).

较弱的时间假设有几种形式。

Weaker timing assumptions come in several forms.

在未知-∆模型中，协议无法使用延迟边界作为参数。

In the unknown-∆ model, the protocol is unable to use the delay bound as a parameter.

或者，在最终同步模型中，消息延迟边界∆只保证在某个(未知)瞬间(称为“全局稳定时间”)后保持不变。

Alternatively, in the eventually synchronous model, the message delay bound ∆ is only guaranteed to hold after some (unknown) instant, called the “Global Stabilization Time.”

这两个模型统称为部分同步[26]。

Collectively, these two models are referred to as partial synchrony [26].

另一种变化是弱同步[26]，其中延迟边界随时间变化，但最终增长速度不超过时间[20]的多项式函数。

Yet another variation is weak synchrony [26], in which the delay bound is time varying, but eventually does not grow faster than a polynomial function of time [20].

就可行性而言，上述两者是等价的——在一种情况下成功的协议可以系统地适用于另一种情况。

In terms of feasibility, the above are equivalent — a protocol that succeeds in one setting can be systematically adapted for another.

然而，就具体性能而言，调整为弱同步意味着随着时间的推移逐渐增加超时参数(例如，通过“指数后退”策略)。

In terms of concrete performance, however, adjusting for weak synchrony means gradually increasing the timeout parameter over time (e.g., by an “exponential back-off” policy).

正如我们稍后所展示的，这将导致从暂态网络分区恢复时出现延迟。

As we show later, this results in delays when recovering from transient network partitions.

协议通常以超时事件的形式显示这些假设。

Protocols typically manifest these assumptions in the form of a timeout event.

例如，如果当事各方发现在某一段时间内没有取得进展，那么它们就采取纠正行动，例如选举新的领导人。

For example, if parties detect that no progress has been made within a certain interval, then they take a corrective action such as electing a new leader.

异步协议不依赖于计时器，无论实际时钟时间如何，只要消息被传递就会取得进展。

Asynchronous protocols do not rely on timers, and make progress whenever messages are delivered, regardless of actual clock time.

异步网络中的轮数。

Counting rounds in asynchronous networks.

尽管最终交付的保证与“实时”的概念是分离的，但仍然需要描述异步协议的运行时间。

Although the guarantee of eventual delivery is decoupled from notions of “real time,” it is nonetheless desirable to characterize the running time of asynchronous protocols.

标准的方法(如Canetti和Rabin[19]所解释的)是对手给每个消息分配一个虚拟整数，但必须在发送任何(r + 1)-消息之前在正确节点之间传递每个(r−1)-消息。

The standard approach (e.g., as explained by Canetti and Rabin [19]) is for the adversary to assign each message a virtual round number, subject to the condition that every (r − 1)-message between correct nodes must be delivered before any (r + 1)-message is sent.
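The round-numbering condition can be made concrete with a small checker. This is a sketch under assumptions: the trace format (a time-ordered list of `send` and `deliver` events, each carrying a message id and its assigned round) and the function name are hypothetical, chosen here only to illustrate the rule that every (r − 1)-message must be delivered before any (r + 1)-message is sent.

```python
from typing import Dict, List, Tuple

# (kind, message id, round assigned by the adversary); kind is "send" or "deliver"
Event = Tuple[str, str, int]


def respects_round_condition(trace: List[Event]) -> bool:
    """Check that every (r-1)-message is delivered before any (r+1)-message is sent."""
    round_of: Dict[str, int] = {}
    delivered_at: Dict[str, int] = {}
    for index, (kind, message, round_number) in enumerate(trace):
        round_of[message] = round_number
        if kind == "deliver":
            delivered_at[message] = index

    never = len(trace)  # position used for messages that are never delivered
    for index, (kind, _message, round_number) in enumerate(trace):
        if kind != "send":
            continue
        for other, other_round in round_of.items():
            # A message two rounds earlier must already have been delivered.
            if other_round == round_number - 2 and delivered_at.get(other, never) > index:
                return False
    return True


good_trace = [
    ("send", "a", 1), ("deliver", "a", 1),
    ("send", "b", 2), ("send", "c", 3),
    ("deliver", "b", 2), ("deliver", "c", 3),
]
bad_trace = [
    ("send", "a", 1), ("send", "b", 2),
    ("send", "c", 3),            # sent while round-1 message "a" is still in flight
    ("deliver", "a", 1),
]

print(respects_round_condition(good_trace))
print(respects_round_condition(bad_trace))
```

In the second trace a round-3 message is sent while a round-1 message is still undelivered, so the numbering is rejected. The rule leaves the adversary free to delay messages arbitrarily in real time, while still giving a meaningful unit for running time.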

3.2 当弱同步失败

3.2 When Weak Synchrony Fails

我们现在继续描述为什么弱同步BFT协议可以失败(或遭受性能下降)当网络条件是敌对的(或不可预测的)。

We now proceed to describe why weakly synchronous BFT protocols can fail (or suffer from performance degradation) when network conditions are adversarial (or unpredictable).

这就是为什么这些协议不适合第1节中描述的面向加密货币的应用程序场景。

This motivates why such protocols are unsuited for the cryptocurrency-oriented application scenarios described in Section 1.

阻止PBFT的网络调度器。

A network scheduler that thwarts PBFT.

我们使用实用拜占庭容错(PBFT)[20]，经典的基于领导者的BFT协议，它是一个代表性的例子，来描述一个对抗网络调度器如何导致一类基于领导者的BFT协议[4,6,10,22,33,50]陷入停滞。

We use Practical Byzantine Fault Tolerance (PBFT) [20], the classic leader-based BFT protocol, as a representative example to describe how an adversarial network scheduler can cause a class of leader-based BFT protocols [4, 6, 10, 22, 33, 50] to grind to a halt.

在任何给定的时间，指定的领导负责提议下一批事务。

At any given time, the designated leader is responsible for proposing the next batch of transactions.

如果由于leader故障或网络停滞而没有取得进展，那么节点将尝试选举新的leader。

If progress isn’t made, either because the leader is faulty or because the network has stalled, then the nodes attempt to elect a new leader.

PBFT协议严重依赖于弱同步网络的活性。

The PBFT protocol critically relies on a weakly synchronous network for liveness.

我们构造了一个违反这个假设的对抗性调度器，它确实完全阻止了PBFT取得任何进展，但HoneyBadgerBFT(事实上，任何异步协议)在这方面都表现良好。

We construct an adversarial scheduler that violates this assumption, and indeed prevents PBFT from making any progress at all, but for which HoneyBadgerBFT (and, in fact, any asynchronous protocol) performs well.

当时序假设被违反时，基于时序假设的协议失败就不足为奇了;然而，演示显式攻击有助于激发我们的异步构造。

It is unsurprising that a protocol based on timing assumptions fails when those assumptions are violated; however, demonstrating an explicit attack helps motivate our asynchronous construction.

我们的调度程序背后的直觉很简单。

The intuition behind our scheduler is simple.

首先，我们假设单个节点已经崩溃。

First, we assume that a single node has crashed.

然后，每当一个正确的节点是leader时，网络就会延迟消息，阻止进程并使下一个节点按循环顺序成为新的leader。

Then, the network delays messages whenever a correct node is the leader, preventing progress and causing the next node in round-robin order to become the new leader.

当崩溃的节点是下一个成为leader的节点时，调度程序立即修复网络分区，并在诚实的节点之间非常快速地传递消息;然而，由于leader已经坠毁，这里也没有任何进展。

When the crashed node is the next up to become the leader, the scheduler immediately heals the network partition and delivers messages very rapidly among the honest nodes; however, since the leader has crashed, no progress is made here either.

这种攻击违反了弱同步假设，因为它必须在每个周期中延迟越来越长的消息，因为PBFT在每次失败的leader选举后都会扩大其超时间隔。

This attack violates the weak synchrony assumption because it must delay messages for longer and longer each cycle, since PBFT widens its timeout interval after each failed leader election.

另一方面，它也提供了越来越长时间的同步。

On the other hand, it provides larger and larger periods of synchrony as well.

然而，由于这些同步期发生在不方便的时间，PBFT无法利用它们。

However, since these periods of synchrony occur at inconvenient times, PBFT is unable to make use of them.

展望未来，HoneyBadgerBFT(实际上是任何异步协议)将能够在这些同步的机会期取得进展。

Looking ahead, HoneyBadgerBFT, and indeed any asynchronous protocol, would be able to make progress during these opportunistic periods of synchrony.
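The intuition behind the scheduler can be sketched as a toy simulation. This is not PBFT itself: it models only round-robin leader rotation, a single crashed node, and a timeout that doubles after each failed view (the doubling is an assumption standing in for "widens its timeout interval"). The function and variable names are hypothetical.

```python
def simulate(n_nodes: int = 4, crashed: int = 3, views: int = 8,
             base_timeout: int = 1):
    """Return (commits, synchronous_time, delayed_time) for a toy leader-based protocol."""
    timeout = base_timeout
    commits = 0
    synchronous_time = 0  # time during which the scheduler delivers messages promptly
    delayed_time = 0      # time during which the scheduler withholds messages

    for view in range(views):
        leader = view % n_nodes               # round-robin leader rotation
        leader_alive = leader != crashed
        network_fast = not leader_alive       # scheduler is fast only under the crashed leader

        if leader_alive and network_fast:
            commits += 1                      # never reached under this scheduler
        if network_fast:
            synchronous_time += timeout
        else:
            delayed_time += timeout

        timeout *= 2                          # assumed back-off after each failed view

    return commits, synchronous_time, delayed_time


commits, synchronous_time, delayed_time = simulate()
print(commits, synchronous_time, delayed_time)
```

A commit requires a live leader and a prompt network in the same view, and the scheduler never allows both at once. The synchronous periods keep growing, yet the leader-based protocol wastes all of them, whereas a protocol that makes progress whenever messages are delivered could use that time.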

为了证实我们的分析，我们实现了这个恶意调度器作为一个代理，它拦截和延迟所有视图更改消息到新的leader，并在一个1200行的Python PBFT实现上测试它。

To confirm our analysis, we implemented this malicious scheduler as a proxy that intercepted and delayed all view change messages to the new leader, and tested it against a 1200 line Python implementation of PBFT.

我们观察到的结果和消息日志与上面的分析一致;我们的副本陷入了请求视图更改的循环中，但从未成功。

The results and message logs we observed were consistent with the above analysis; our replicas became stuck in a loop requesting view changes that never succeeded.

在附录A中，我们给出了PBFT的完整描述，并解释了它在这种攻击下的行为。

In Appendix A we give a complete description of PBFT and explain how it behaves under this attack.

从网络分区恢复缓慢。

Slow recovery from network partitions.

即使弱同步假设最终得到满足，依赖于它的协议也可能在从暂态网络分区中恢复时速度较慢。

Even if the weak synchrony assumption is eventually satisfied, protocols that rely on it may also be slow to recover from transient network partitions.

考虑以下场景，这只是上面描述的攻击的有限前缀:一个节点崩溃，网络被临时分区，持续时间为2^D ∆。

Consider the following scenario, which is simply a finite prefix of the attack described above: one node is crashed, and the network is temporarily partitioned for a duration of 2^D ∆.

当轮到崩溃的节点成为leader时，我们的调度程序会精确地治疗网络分区。

Our scheduler heals the network partition precisely when it is the crashed node’s turn to become leader.

由于此时的超时间隔现在是2^(D+1) ∆，协议必须等待另一个2^(D+1) ∆间隔后才能开始选举新的leader，尽管在此间隔期间网络是同步的。

Since the timeout interval at this point is now 2^(D+1) ∆, the protocol must wait for another 2^(D+1) ∆ interval before beginning to elect a new leader, even though the network is synchronous during this interval.
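The cost of this scenario is easy to put into numbers. The sketch below simply evaluates the two quantities from the text, a partition of 2^D ∆ and a timeout of 2^(D+1) ∆ in force when it heals. The function name and the example values (D = 5, ∆ = 100 ms) are assumptions chosen for illustration.

```python
def recovery_wait(d: int, delta_ms: int) -> dict:
    """Durations, in milliseconds, for the transient-partition scenario."""
    partition_ms = (2 ** d) * delta_ms           # how long the partition lasts
    timeout_wait_ms = (2 ** (d + 1)) * delta_ms  # timeout in force when it heals
    return {
        "partition_ms": partition_ms,
        "timeout_wait_ms": timeout_wait_ms,
        # Idealization: an asynchronous protocol waits on no timer at all,
        # so it adds no delay beyond message delivery itself.
        "async_extra_wait_ms": 0,
    }


result = recovery_wait(d=5, delta_ms=100)
print(result)
```

With these example values the partition lasts 3.2 seconds, but the timeout-based protocol idles for a further 6.4 seconds after the network has healed, twice the length of the outage itself.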

健壮性和响应性之间的权衡。我们在上面观察到的这些行为并不是PBFT特有的，而是依赖超时来处理崩溃的协议从根本上固有的。不管协议变体是什么，从业者都必须根据某种权衡来调整他们的超时策略。在一种极端情况下(最终同步)，从业者对网络时延∆进行了具体的估计。如果估计过低，那么系统可能根本没有进展;如果太高，就不能利用可用的带宽。在另一个极端(弱同步)，执行者避免指定任何绝对延迟，但仍然必须选择影响系统跟踪变化条件的速度的“增益”。异步协议避免了对这些参数进行调优的需要。

The tradeoff between robustness and responsiveness. The behaviors we observe above are not specific to PBFT, but rather are fundamentally inherent to protocols that rely on timeouts to cope with crashes. Regardless of the protocol variant, a practitioner must tune their timeout policy according to some tradeoff. At one extreme (eventual synchrony), the practitioner makes a specific estimate about the network delay ∆. If the estimate is too low, then the system may make no progress at all; too high, and it does not utilize the available bandwidth. At the other extreme (weak synchrony), the practitioner avoids specifying any absolute delay, but nonetheless must choose a “gain” that affects how quickly the system tracks varying conditions. An asynchronous protocol avoids the need to tune such parameters.

4. HoneyBadgerBFT协议

4. The HoneyBadgerBFT Protocol

在本节中，我们介绍了HoneyBadgerBFT协议，这是第一个实现最佳渐近效率的异步原子广播协议。

In this section we present HoneyBadgerBFT, the first asynchronous atomic broadcast protocol to achieve optimal asymptotic efficiency.

4.1问题定义:原子广播

4.1 Problem Definition: Atomic Broadcast

我们首先定义我们的网络模型和原子广播问题。

We first define our network model and the atomic broadcast problem.

我们的设置涉及一个由N个指定节点组成的网络，这些节点具有不同的已知标识($P_0$ 到 $P_{N-1}$)。

Our setting involves a network of N designated nodes, with distinct well-known identities ($P_0$ through $P_{N-1}$).

节点将事务作为输入接收，它们的目标是就这些事务的顺序达成共同的协议。

The nodes receive transactions as input, and their goal is to reach common agreement on an ordering of these transactions.

我们的模型特别适合“受许可的区块链”的部署场景，其中事务可以由任意客户机提交，但负责执行协议的节点是固定的。

Our model particularly matches the deployment scenario of a “permissioned blockchain” where transactions can be submitted by arbitrary clients, but the nodes responsible for carrying out the protocol are fixed.

原子广播原语允许我们抽象出任何特定于应用程序的细节，比如如何解释事务(例如，为了防止重放攻击，应用程序可以定义包含签名和序列号的事务)。

The atomic broadcast primitive allows us to abstract away any application-specific details, such as how transactions are to be interpreted (to prevent replay attacks, for example, an application might define a transaction to include signatures and sequence numbers).

就我们的目的而言，事务只是惟一的字符串。

For our purposes, transactions are simply unique strings.

在实践中，客户机将生成事务并将其发送到所有节点，并在从大多数节点收集签名后将其视为已提交。

In practice, clients would generate transactions and send them to all of the nodes, and consider them committed after collecting signatures from a majority of nodes.

为了简化表示，我们没有显式地对客户机建模，而是假设事务是由对手选择的，并作为节点的输入提供。

To simplify our presentation, we do not explicitly model clients, but rather assume that transactions are chosen by the adversary and provided as input to the nodes.

同样，事务一旦被节点输出，就被视为已提交。

Likewise, a transaction is considered committed once it is output by a node.

我们的系统模型做了以下假设:

Our system model makes the following assumptions:

(纯异步网络)我们假设每对节点由一个可靠的、经过认证的点对点信道连接，该信道不会丢弃消息。交付时间表完全由对手决定，但正确节点之间发送的每条消息最终必须被交付。我们关注的是用异步轮数来刻画协议的运行时间(如第2节所述)。由于网络可能以任意的延迟将消息排队，我们还假设节点有无界的缓冲区，并且能够处理它们收到的所有消息。

(Purely asynchronous network) We assume each pair of nodes is connected by a reliable authenticated point-to-point channel that does not drop messages. The delivery schedule is entirely determined by the adversary, but every message sent between correct nodes must eventually be delivered. We will be interested in characterizing the running time of protocols based on the number of asynchronous rounds (as described in Section 2). As the network may queue messages with arbitrary delay, we also assume nodes have unbounded buffers and are able to process all the messages they receive.

(静态拜占庭故障)对手被给予最多f个故障节点的完全控制，其中f是一个协议参数。

(Static Byzantine faults) The adversary is given complete control of up to f faulty nodes, where f is a protocol parameter.

注意，在此设置中，3f + 1 ≤ N(我们的协议达到了这一界限)是广播协议的下限。

Note that 3f + 1 ≤ N (which our protocol achieves) is the lower bound for broadcast protocols in this setting.

(可信设置)为了便于表示，我们假设在特定于协议的初始设置阶段，节点可能与可信的经销商交互，我们将使用该设置阶段建立公钥和秘密共享。

(Trusted setup) For ease of presentation, we assume that nodes may interact with a trusted dealer during an initial protocol-specific setup phase, which we will use to establish public keys and secret shares.

注意，在实际部署中，如果实际的受信任方不可用，则可以使用分布式密钥生成协议(cf. Boldyreva [11])。

Note that in a real deployment, if an actual trusted party is unavailable, then a distributed key generation protocol could be used instead (cf. Boldyreva [11]).

我们所知道的所有分布式密钥生成协议都依赖于时间假设;幸运的是，这些假设只需要在设置过程中成立。

All the distributed key generation protocols we know of rely on timing assumptions; fortunately these assumptions need only to hold during setup.

定义1。原子广播协议必须满足以下性质，在异步网络中，面对任意对手，这些性质都应以高概率(即关于安全参数λ的函数 1 − negl(λ))成立:

DEFINITION 1. An atomic broadcast protocol must satisfy the following properties, all of which should hold with high probability (as a function 1 − negl(λ) of a security parameter λ) in an asynchronous network and in spite of an arbitrary adversary:

•(一致性)如果任何正确的节点输出一个事务tx，那么每个正确的节点都输出tx。

•(全序)如果一个正确的节点输出了事务序列 $\langle tx_0, tx_1, \ldots, tx_j \rangle$，另一个正确的节点输出了 $\langle tx'_0, tx'_1, \ldots, tx'_{j'} \rangle$，那么对于所有 $i \le \min(j, j')$，有 $tx_i = tx'_i$。

•(审查弹性)如果一个事务tx被输入到N − f个正确的节点，那么它最终会被每个正确的节点输出。

• (Agreement) If any correct node outputs a transaction tx, then every correct node outputs tx.

• (Total Order) If one correct node has output the sequence of transactions $\langle tx_0, tx_1, \ldots, tx_j \rangle$ and another has output $\langle tx'_0, tx'_1, \ldots, tx'_{j'} \rangle$, then $tx_i = tx'_i$ for $i \le \min(j, j')$.

• (Censorship Resilience) If a transaction tx is input to N − f correct nodes, then it is eventually output by every correct node.
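The total order property can be read as a prefix condition on the logs of any two correct nodes. A minimal sketch of that check, assuming each log is simply a list of transaction strings (the function name is hypothetical):

```python
from typing import Sequence


def consistent_prefix(log_a: Sequence[str], log_b: Sequence[str]) -> bool:
    """Total order: two logs must agree on every position both have output."""
    shared = min(len(log_a), len(log_b))
    return list(log_a[:shared]) == list(log_b[:shared])
```

A node that is merely behind (a shorter log) passes the check; a node that committed a different transaction at some shared index does not.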

审查弹性属性是一种活动属性，可以防止对手阻止提交哪怕是单个事务。这个属性有其他名称，例如Cachin等人[15]的“公平性”，但我们更喜欢这个更具描述性的短语。

性能指标。我们将主要对分析原子广播协议的效率和事务延迟感兴趣。

•(效率)假设每个诚实节点的输入缓冲区足够满Ω(poly(N，λ))。那么效率就是每个节点的预期通信成本分摊到所有已提交的事务上。

The censorship resilience property is a liveness property that prevents an adversary from blocking even a single transaction from being committed. This property has been referred to by other names, for example “fairness” by Cachin et al [15], but we prefer this more descriptive phrase.

Performance metrics. We will primarily be interested in analyzing the efficiency and transaction delay of our atomic broadcast protocol.

• (Efficiency) Assume that the input buffers of each honest node are sufficiently full Ω(poly(N,λ)). Then efficiency is the expected communication cost for each node amortized over all committed transactions.

由于每个节点必须输出每个事务，O(1)效率(我们的协议实现的)是渐近最优的。上面的效率定义假设网络处于负载状态，反映了我们的主要目标:在充分利用网络可用带宽的同时保持高吞吐量。由于我们通过批处理实现了良好的吞吐量，在事务很少到达的低需求期间，我们的系统对每个提交的事务使用了更多的带宽。如果我们的目标是最小化成本(例如，基于使用的计费)，那么一个没有此限定的更强的定义将是合适的。

在实践中，网络链接的容量是有限的，如果提交的事务超过了网络的处理能力，一般情况下，对确认时间的保证就不能成立。因此，我们将事务延迟定义为相对于已在相关事务之前输入的事务的数量。有限的事务延迟意味着审查弹性。

Since each node must output each transaction, O(1) efficiency (which our protocol achieves) is asymptotically optimal. The above definition of efficiency assumes the network is under load, reflecting our primary goal: to sustain high throughput while fully utilizing the network’s available bandwidth. Since we achieve good throughput by batching, our system uses more bandwidth per committed transaction during periods of low demand when transactions arrive infrequently. A stronger definition without this qualification would be appropriate if our goal was to minimize costs (e.g., for usage-based billing).

In practice, network links have limited capacity, and if more transactions are submitted than the network can handle, a guarantee on confirmation time cannot hold in general. Therefore we define transaction delay below relative to the number of transactions that have been input ahead of the transaction in question. A finite transaction delay implies censorship resilience.

•(事务延迟)假设对手将事务tx作为输入传递给N − f个正确的节点。设T为“backlog”，即之前输入到任何正确节点的事务总数与已提交的事务数量之间的差值。那么事务延迟就是每个正确节点输出tx之前期望的异步轮数，作为T的函数。

• (Transaction delay) Suppose an adversary passes a transaction tx as input to N − f correct nodes. Let T be the “backlog”, i.e. the difference between the total number of transactions previously input to any correct node and the number of transactions that have been committed. Then transaction delay is the expected number of asynchronous rounds before tx is output by every correct node, as a function of T.

4.2概述和直观感受

4.2 Overview and Intuition

在HoneyBadgerBFT中，节点将事务作为输入接收并将其存储在它们的(无界)缓冲区中。

In HoneyBadgerBFT, nodes receive transactions as input and store them in their (unbounded) buffers.

协议以epoch为单位进行，在每个epoch之后，一个新的事务批次被追加到已提交的日志中。

The protocol proceeds in epochs, where after each epoch, a new batch of transactions is appended to the committed log.

在每个epoch开始时，节点从它们的缓冲区中选择事务的一个子集(通过我们稍后将定义的策略)，并将其作为输入提供给一个随机化一致性协议的实例。

At the beginning of each epoch, nodes choose a subset of the transactions in their buffer (by a policy we will define shortly), and provide them as input to an instance of a randomized agreement protocol.

在一致性协议结束时，选出该epoch的最终事务集。

At the end of the agreement protocol, the final set of transactions for this epoch is chosen.

在这个高级别上，我们的方法类似于现有的异步原子广播协议，特别是Cachin等[15]，这是大规模事务处理系统(SINTRA)的基础。

At this high level, our approach is similar to existing asynchronous atomic broadcast protocols, and in particular to Cachin et al [15], the basis for a large scale transaction processing system (SINTRA).

和我们的协议一样，Cachin的协议是围绕异步公共子集(ACS)原语的一个实例展开的。

Like ours, Cachin’s protocol is centered around an instance of the Asynchronous Common Subset (ACS) primitive.

粗略地说，ACS原语允许每个节点提出一个值，并保证每个节点输出一个公共向量，其中包含至少N − 2f个正确节点的输入值。

Roughly speaking, the ACS primitive allows each node to propose a value, and guarantees that every node outputs a common vector containing the input values of at least N − 2f correct nodes.

从这个原语构建原子广播非常简单——每个节点只是从队列前端提出一个事务子集，并输出商定向量中元素的并集。

It is trivial to build atomic broadcast from this primitive: each node simply proposes a subset of transactions from the front of its queue, and outputs the union of the elements in the agreed-upon vector.

然而，有两个重要的挑战。

However, there are two important challenges.

挑战1:实现审查弹性。

Challenge 1: Achieving censorship resilience.

ACS的成本直接取决于每个节点提议的事务集的大小。

The cost of ACS depends directly on the size of the transaction sets proposed by each node.

由于输出向量至少包含N−f个这样的集合，因此我们可以通过确保节点提出的事务集大多互不相连，从而在同一批处理中提交更多不同的事务，从而提高整体效率。

Since the output vector contains at least N − f such sets, we can therefore improve the overall efficiency by ensuring that nodes propose mostly disjoint sets of transactions, thus committing more distinct transactions in one batch for the same cost.

因此，不是简单地从其缓冲区中选择第一个元素(如CKPS01[15])，协议中的每个节点提出一个随机选择的样本，这样，平均而言，每个事务只由一个节点提出。

Therefore instead of simply choosing the first element(s) from its buffer (as in CKPS01 [15]), each node in our protocol proposes a randomly chosen sample, such that each transaction is, on average, proposed by only one node.
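The claim that each transaction is proposed by one node on average can be checked with a short calculation. As a simplifying assumption (not stated in the text), suppose N divides B and a given transaction tx lies among the first B entries of every node's buffer. Each node samples B/N of those B entries uniformly at random, so

$$
\Pr[\text{node } i \text{ proposes } tx] = \frac{B/N}{B} = \frac{1}{N},
\qquad
\mathbb{E}[\#\{i : \text{node } i \text{ proposes } tx\}] = N \cdot \frac{1}{N} = 1 .
$$

By contrast, if every node proposed the first B/N entries of identical buffers, each proposed transaction would be proposed N times.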

然而，如果以朴素的方式实现，这种优化将损害审查弹性，因为ACS原语允许对手选择最终包含哪些节点的提议。

However, implemented naïvely, this optimization would compromise censorship resilience, since the ACS primitive allows the adversary to choose which nodes’ proposals are ultimately included.

对手可以有选择地审查一个交易，排除提出该交易的节点。

The adversary could selectively censor a transaction by excluding whichever node(s) propose it.

我们通过使用阈值加密来避免这个陷阱，它阻止了对手知道哪些事务是由哪些节点提出的，直到已经达成协议。

We avoid this pitfall by using threshold encryption, which prevents the adversary from learning which transactions are proposed by which nodes, until after agreement is already reached.

完整的协议将在4.3节中描述。

The full protocol will be described in Section 4.3.

挑战2:实际吞吐量。

Challenge 2: Practical throughput.

虽然异步ACS和原子广播在理论上的可行性已经为人所知[9,15,17]，但其实际性能尚不明确。

Although the theoretical feasibility of asynchronous ACS and atomic broadcast has been known [9, 15, 17], their practical performance has not.

据我们所知，唯一实现ACS的其他工作是由Cachin和Poritz[17]完成的，他们表明他们可以在广域网上实现0.4 tx/sec的吞吐量。

To the best of our knowledge, the only other work that implemented ACS was by Cachin and Poritz [17], who showed that they could attain a throughput of 0.4 tx/sec over a wide area network.

因此，一个有趣的问题是，这样的协议能否在实践中获得高吞吐量。

Therefore, an interesting question is whether such protocols can attain high throughput in practice.

在本文中，我们展示了通过将精心选择的子组件数组拼接在一起，我们可以有效地实例化ACS，并在渐近和实践中获得更大的吞吐量。

In this paper, we show that by stitching together a carefully chosen array of sub-components, we can efficiently instantiate ACS and attain much greater throughput both asymptotically and in practice.

值得注意的是，我们将ACS的渐近代价(每个节点)从 $O(N^2)$ (如Cachin等[15,17])降低到 $O(1)$。

Notably, we improve the asymptotic cost (per node) of ACS from $O(N^2)$ (as in Cachin et al [15, 17]) to $O(1)$.

因为我们挑选的组件之前没有一起展示过(据我们所知)，所以我们在第4.4节中提供了整个结构的自包含描述。

Since the components we cherry-pick have not been presented together before (to our knowledge), we provide a self-contained description of the whole construction in Section 4.4.

模块化的协议组成。

Modular protocol composition.

现在我们准备正式展示我们的结构。

We are now ready to present our constructions formally.

在这样做之前，我们对我们的演示风格做一个评论。

Before doing so, we make a remark about the style of our presentation.

我们以模块化的方式定义我们的协议，其中每个协议可以运行其他(子)协议的几个实例。

We define our protocols in a modular style, where each protocol may run several instances of other (sub)protocols.

外部协议可以向子协议提供输入并从子协议接收输出。

The outer protocol can provide input to and receive output from the subprotocol.

一个节点甚至可以在向它提供输入之前就开始执行(子)协议(例如，如果它从其他节点接收到消息)。

A node may begin executing a (sub)protocol even before providing it input (e.g., if it receives messages from other nodes).

隔离此类(子)协议实例非常重要，以确保属于一个实例的消息不能在另一个实例中重播。

It is essential to isolate such (sub)protocol instances to ensure that messages pertaining to one instance cannot be replayed in another.

这在实践中是通过给每个(子)协议实例关联一个唯一的字符串(会话标识符)来实现的，用这个标识符标记在这个(子)协议中发送或接收的任何消息，并相应地路由消息。

This is achieved in practice by associating to each (sub)protocol instance a unique string (a session identifier), tagging any messages sent or received in this (sub)protocol with this identifier, and routing messages accordingly.
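The tagging-and-routing idea can be sketched as a small dispatcher. This is an illustration under stated assumptions, not the prototype's code: the `Router` class, the session-identifier strings and the handler signature are all hypothetical. Messages that arrive before the local instance is registered are buffered, since a node may receive messages for a (sub)protocol before it has provided input to it.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Handler = Callable[[int, Any], None]  # called with (sender id, payload)


class Router:
    """Routes messages to (sub)protocol instances by session identifier."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        # Messages for instances that have not been started locally yet.
        self._pending: Dict[str, List[Tuple[int, Any]]] = defaultdict(list)

    def register(self, sid: str, handler: Handler) -> None:
        """Start an instance and replay anything that arrived early."""
        self._handlers[sid] = handler
        for sender, payload in self._pending.pop(sid, []):
            handler(sender, payload)

    def deliver(self, sid: str, sender: int, payload: Any) -> None:
        """Hand a tagged message to its instance, or buffer it."""
        handler = self._handlers.get(sid)
        if handler is None:
            self._pending[sid].append((sender, payload))
        else:
            handler(sender, payload)
```

Because every message carries its session identifier, a message recorded from one instance (say `"epoch-7/RBC[2]"`) can never be handed to another instance's handler.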

为了便于阅读，我们在协议描述中抑制了这些消息标记。

We suppress these message tags in our protocol descriptions for ease of reading.

我们使用括号来区分子协议的标记实例。

We use brackets to distinguish between tagged instances of a subprotocol.

例如，RBC[i]表示RBC子协议的第i个实例。

For example, RBC[i] denotes an ith instance of the RBC subprotocol.

我们隐式地假设各方之间的异步通信是通过身份验证的异步通道进行的。

We implicitly assume that asynchronous communications between parties are over authenticated asynchronous channels.

实际上，这样的通道可以使用TLS套接字来实例化，例如，我们将在第5节中讨论。

In reality, such channels could be instantiated using TLS sockets, for example, as we discuss in Section 5.

为了区分协议各方之间发送的不同消息类型，我们使用打字机字体的标签(例如，VAL(m)表示类型VAL的消息m)。

To distinguish different message types sent between parties within a protocol, we use a label in typewriter font (e.g., VAL(m) indicates a message m of type VAL).

4.3从异步公共子集构造HoneyBadgerBFT。

4.3 Constructing HoneyBadgerBFT from Asynchronous Common Subset

构建块:ACS。

Building block: ACS.

我们的主要构建块是一个称为异步公共子集(ACS)的原语。

Our main building block is a primitive called asynchronous common subset (ACS).

构建ACS的理论可行性已在若干著作中得到论证[9,15]。

The theoretical feasibility of constructing ACS has been demonstrated in several works [9, 15].

在本节中，我们将给出ACS的正式定义，并将其用作构造HoneyBadgerBFT的黑箱。

In this section, we will present the formal definition of ACS and use it as a blackbox to construct HoneyBadgerBFT.

在后面的4.4节中，我们将展示通过结合过去有些被忽略的几个结构，我们可以高效地实例化ACS!

Later in Section 4.4, we will show that by combining several constructions that were somewhat overlooked in the past, we can instantiate ACS efficiently!

更正式地说，ACS协议满足以下性质:

•(有效性)如果一个正确节点输出集合v，那么|v| ≥ N − f，并且v包含至少N − 2f个正确节点的输入。

•(一致性)如果一个正确的节点输出v，那么每个节点都输出v。

•(完全性)如果N − f个正确的节点接收到输入，那么所有正确的节点都产生输出。

More formally, an ACS protocol satisfies the following properties:

• (Validity) If a correct node outputs a set v, then |v| ≥ N − f and v contains the inputs of at least N − 2f correct nodes.

• (Agreement) If a correct node outputs v, then every node outputs v.

• (Totality) If N − f correct nodes receive an input, then all correct nodes produce an output.

构建块:阈值加密。阈值加密方案TPKE是一种加密原语，允许任何一方将消息加密到一个主公钥，这样网络节点必须协同工作来解密它。一旦f + 1个正确节点计算并揭示密文的解密共享，就可以恢复明文;在至少一个正确的节点揭示其解密共享之前，攻击者对明文一无所知。阈值方案提供以下接口:

Building block: threshold encryption. A threshold encryption scheme TPKE is a cryptographic primitive that allows any party to encrypt a message to a master public key, such that the network nodes must work together to decrypt it. Once f + 1 correct nodes compute and reveal decryption shares for a ciphertext, the plaintext can be recovered; until at least one correct node reveals its decryption share, the attacker learns nothing about the plaintext. A threshold scheme provides the following interface:

• TPKE.Setup($1^{\lambda}$) → PK, {$SK_i$}:生成一个公共加密密钥PK，以及每一方的私钥 $SK_i$

• TPKE.Enc(PK, m) → C:加密消息m

• TPKE.DecShare($SK_i$, C) → $\sigma_i$:产生第i个解密份额(如果C格式错误，则输出⊥)

• TPKE.Dec(PK, C, {i, $\sigma_i$}) → m:组合来自至少f + 1方的一组解密份额{i, $\sigma_i$}，得到明文m(如果其中包含无效份额，则识别出这些无效份额)

• TPKE.Setup($1^{\lambda}$) → PK, {$SK_i$} generates a public encryption key PK, along with secret keys $SK_i$ for each party

• TPKE.Enc(PK, m) → C encrypts a message m

• TPKE.DecShare($SK_i$, C) → $\sigma_i$ produces the i-th share of the decryption (or ⊥ if C is malformed)

• TPKE.Dec(PK, C, {i, $\sigma_i$}) → m combines a set of decryption shares {i, $\sigma_i$} from at least f + 1 parties to obtain the plaintext m (or, if C contains invalid shares, then the invalid shares are identified).

在我们的具体实例中，我们使用Baek和Zheng[7]的阈值加密方案。这个方案也是稳健的（正如我们的协议所要求的），这意味着即使对于一个对抗性生成的密码文本C，最多可以恢复一个明文（除了⊥）。注意，我们假设TPKE.Dec能有效地识别输入中无效的解密份额。最后，该方案满足明显的正确性属性，以及IND-CPA游戏的阈值版本。

In our concrete instantiation, we use the threshold encryption scheme of Baek and Zheng [7]. This scheme is also robust (as required by our protocol), which means that even for an adversarially generated ciphertext C, at most one plaintext (besides ⊥) can be recovered. Note that we assume TPKE.Dec effectively identifies invalid decryption shares among the inputs. Finally, the scheme satisfies the obvious correctness properties, as well as a threshold version of the IND-CPA game.
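The "any f + 1 shares suffice" behaviour can be illustrated with a toy example. The sketch below is not the Baek and Zheng scheme and is not a threshold encryption scheme at all: it swaps in plain Shamir secret sharing over a prime field, with no ciphertexts and no share verification, purely to show the threshold. The names `deal` and `combine` and the choice of prime are assumptions made for illustration.

```python
import secrets
from typing import List, Tuple

P = 2 ** 127 - 1  # a Mersenne prime, used as the field modulus for this toy


def deal(secret: int, n: int, f: int) -> List[Tuple[int, int]]:
    """Split `secret` into n shares such that any f + 1 of them recover it."""
    # Random polynomial of degree f with constant term equal to the secret.
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(f)]

    def poly(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation
            acc = (acc * x + c) % P
        return acc

    return [(i, poly(i)) for i in range(1, n + 1)]


def combine(shares: List[Tuple[int, int]]) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret
```

With N = 4 and f = 1, any two of the four shares returned by `deal` reconstruct the secret through `combine`, while a single share reveals nothing about it. A real TPKE additionally lets each node derive its share from a public ciphertext and lets the combiner detect invalid shares.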

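Let $B = \Omega(\lambda N^2 \log N)$ be the batch size parameter.

Let PK be the public key received from TPKE.Setup (executed by a dealer), and let SK_i be the secret key for P_i.

Let buf := [ ] be a FIFO queue of input transactions.

In each epoch r:

// Step 1: Random selection and encryption

• let proposed be a random selection of ⌊B/N⌋ transactions from the first B elements of buf

• encrypt x := TPKE.Enc(PK, proposed)

// Step 2: Agreement on ciphertexts

• pass x as input to ACS[r] // see Figure 4

• receive {v_j} for j ∈ S, where S ⊂ [1..N], from ACS[r]

// Step 3: Decryption

• for each j ∈ S: let e_j := TPKE.DecShare(SK_i, v_j); multicast DEC(r, j, i, e_j); wait to receive at least f + 1 messages of the form DEC(r, j, k, e_{j,k}); decode y_j := TPKE.Dec(PK, {(k, e_{j,k})})

• let block_r := sorted(∪_{j∈S} {y_j}), such that block_r is sorted in a canonical order (e.g., lexicographically)

• set buf := buf − block_r

Figure 1: HoneyBadgerBFT.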

来自ACS的原子广播。

Atomic broadcast from ACS.

现在我们更详细地描述原子广播协议，如图1所示。

We now describe in more detail our atomic broadcast protocol, defined in Figure 1.

如前所述，该协议以ACS实例为中心。

As mentioned, this protocol is centered around an instance of ACS.

为了获得可扩展的效率，我们选择了批处理策略。

In order to obtain scalable efficiency, we choose a batching policy.

我们将B设为批处理大小，并在每个epoch中提交Ω(B)个事务。

We let B be a batch size, and will commit Ω(B) transactions in each epoch.

每个节点从其队列中提出B/N事务。

Each node proposes B/N transactions from its queue.

为了确保节点提出的事务大多不同，我们从每个队列的第一个B中随机选择这些事务。

To ensure that nodes propose mostly distinct transactions, we randomly select these transactions from the first B in each queue.

正如我们将在第4.4节中看到的，我们的ACS实例化的总通信成本为 $O(N^2|v| + \lambda N^3 \log N)$，其中|v|是任何节点输入大小的上界。

As we will see in Section 4.4, our ACS instantiation has a total communication cost of $O(N^2|v| + \lambda N^3 \log N)$, where |v| bounds the size of any node’s input.

因此，我们选择批处理大小 $B = \Omega(\lambda N^2 \log N)$，以便每个节点的贡献(B/N)吸收这一附加开销。

We therefore choose a batch size $B = \Omega(\lambda N^2 \log N)$ so that the contribution from each node (B/N) absorbs this additive overhead.
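The arithmetic behind this choice can be spelled out. As a simplifying assumption, treat each transaction as having constant size, so that a proposal of B/N transactions has size $|v| = O(B/N)$. Substituting into the ACS cost gives

$$
O\!\left(N^2 \cdot \frac{B}{N} + \lambda N^3 \log N\right) = O\!\left(NB + \lambda N^3 \log N\right).
$$

If $B = \Omega(\lambda N^2 \log N)$, then $\lambda N^3 \log N = O(NB)$, so the total cost is $O(NB)$, that is, $O(B)$ per node. Since $\Omega(B)$ transactions are committed per epoch, the amortized cost per node per committed transaction is $O(1)$.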

为了防止对手影响结果，我们使用阈值加密方案，如下所述。

In order to prevent the adversary from influencing the outcome we use a threshold encryption scheme, as described below.

简而言之，每个节点选择一组事务，然后对其进行加密。

In a nutshell, each node chooses a set of transactions, and then encrypts it.

然后，每个节点将加密作为输入传递给ACS子例程。

Each node then passes the encryption as input to the ACS subroutine.

因此，ACS的输出是一个密文向量。

The output of ACS is therefore a vector of ciphertexts.

一旦ACS完成，密文就会被解密。

The ciphertexts are decrypted once the ACS is complete.

这保证了在对手了解每个节点提出的建议的特定内容之前，就可以完全确定事务集。

This guarantees that the set of transactions is fully determined before the adversary learns the particular contents of the proposals made by each node.

这保证了一旦事务位于队列的最前面的足够正确的节点上，对手就不能有选择地阻止事务的提交。

This guarantees that an adversary cannot selectively prevent a transaction from being committed once it is in the front of the queue at enough correct nodes.
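The local, non-cryptographic parts of one epoch (the random selection in Step 1 and the block construction at the end of Step 3) can be sketched as follows. This is a sketch under stated assumptions: ACS and the threshold encryption are omitted entirely, the function names are hypothetical, and `decrypted` stands for the plaintext proposals y_j recovered after decryption.

```python
import random
from typing import Iterable, List, Tuple


def select_proposal(buf: List[str], batch_size: int, n: int,
                    rng: random.Random) -> List[str]:
    """Step 1: pick floor(B/N) transactions at random from the first B of buf."""
    window = buf[:batch_size]
    k = min(batch_size // n, len(window))
    return rng.sample(window, k)


def finish_epoch(buf: List[str],
                 decrypted: Iterable[Iterable[str]]) -> Tuple[List[str], List[str]]:
    """End of Step 3: build the block in canonical order and prune the buffer."""
    block = sorted(set().union(*decrypted))  # union of proposals, deduplicated
    committed = set(block)
    remaining = [tx for tx in buf if tx not in committed]  # keeps FIFO order
    return block, remaining
```

Sorting the union gives every correct node the same block from the same agreed set of proposals, and removing committed transactions from `buf` keeps them from being proposed again.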

4.4高效实例化ACS。

4.4 Instantiating ACS Efficiently

Cachin等人提出了一个我们称之为CKPS01的协议，该协议(隐含地)将ACS归约为多值验证拜占庭协议(MVBA)[15]。

粗略地说，MVBA允许节点提出满足一个谓词的值，最终会选择其中一个。

归约很简单:验证谓词要求输出必须是来自至少N − f方的签名输入组成的向量。

不幸的是，MVBA原语成为了瓶颈，因为我们所知的唯一构造会带来 $O(N^3|v|)$ 的开销。

Cachin et al present a protocol we call CKPS01 that (implicitly) reduces ACS to multi-valued validated Byzantine agreement (MVBA) [15].

Roughly speaking, MVBA allows nodes to propose values satisfying a predicate, one of which is ultimately chosen.

The reduction is simple: the validation predicate says that the output must be a vector of signed inputs from at least N − f parties.

Unfortunately, the MVBA primitive becomes a bottleneck, because the only construction we know of incurs an overhead of $O(N^3|v|)$.

我们通过使用ACS的替代实例来避免这个瓶颈，它完全绕过了MVBA。

We avoid this bottleneck by using an alternative instantiation of ACS that sidesteps MVBA entirely.

我们使用的实例化是由Ben-Or等人[9]提供的，在我们看来，它在某种程度上被忽略了。

The instantiation we use is due to Ben-Or et al [9] and has, in our view, been somewhat overlooked.

事实上，它早于CKPS01[15]，最初是为了一个几乎不相关的目的(作为实现高效异步多方计算[9]的工具)而开发的。

In fact, it predates CKPS01 [15], and was initially developed for a mostly unrelated purpose (as a tool for achieving efficient asynchronous multi-party computation [9]).

该协议是由ACS到可靠广播(RBC)和异步二进制拜占庭协议(ABA)的简化。

This protocol is a reduction from ACS to reliable broadcast (RBC) and asynchronous binary Byzantine agreement (ABA).

直到最近，我们才知道这些子组件的高效结构，我们稍后将对此进行解释。

Only recently do we know of efficient constructions for these subcomponents, which we explain shortly.

在较高的级别上，ACS协议分两个主要阶段进行。

At a high level, the ACS protocol proceeds in two main phases.

在第一阶段，每个节点Pi使用RBC向其他节点传播其提议的值，然后由ABA决定一个位向量，该位向量表示哪些RBC已成功完成。

In the first phase, each node Pi uses RBC to disseminate its proposed value to the other nodes, followed by ABA to decide on a bit vector that indicates which RBCs have successfully completed.

在更详细地解释Ben-Or协议之前，我们现在简要地解释RBC和ABA结构。

We now briefly explain the RBC and ABA constructions before explaining the Ben-Or protocol in more detail.

通信最优的可靠广播。异步可靠广播信道满足以下性质:

•(一致性)如果任意两个正确的节点交付v和v′，则v = v′。

•(完全性)如果任何正确的节点交付v，那么所有正确的节点都交付v。

•(有效性)如果发送者是正确的且输入v，那么所有正确的节点都交付v。

Communication-optimal reliable broadcast. An asynchronous reliable broadcast channel satisfies the following properties:

• (Agreement) If any two correct nodes deliver v and v′, then v = v′.

• (Totality) If any correct node delivers v, then all correct nodes deliver v.

• (Validity) If the sender is correct and inputs v, then all correct nodes deliver v.

虽然Bracha[13]的经典可靠广播协议需要 $O(N^2|v|)$ 比特的总通信量来广播一个大小为|v|的消息，但Cachin和Tessaro[18]观察到，即使在最坏的情况下，擦除编码也可以将这个成本降低到仅为 $O(N|v| + \lambda N^2 \log N)$。这对大消息来说是一个重大的改进(即当 $|v| \gg \lambda N \log N$ 时)，这一点(回顾第4.3节)指导了我们对批处理规模的选择。这里使用擦除编码最多引入一个小的常数因子的开销，等于 $\frac{N}{N-2f} < 3$。

While Bracha’s [13] classic reliable broadcast protocol requires $O(N^2|v|)$ bits of total communication in order to broadcast a message of size |v|, Cachin and Tessaro [18] observed that erasure coding can reduce this cost to merely $O(N|v| + \lambda N^2 \log N)$, even in the worst case. This is a significant improvement for large messages (i.e., when $|v| \gg \lambda N \log N$), which (looking back to Section 4.3) guides our choice of batch size. The use of erasure coding here induces at most a small constant factor of overhead, equal to $\frac{N}{N-2f} < 3$.
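The bound on the erasure-coding overhead follows from the fault threshold. Assuming $N \ge 3f + 1$, as in the system model, we have $f \le (N-1)/3$ and therefore

$$
N - 2f \;\ge\; N - \frac{2(N-1)}{3} \;=\; \frac{N+2}{3},
\qquad\text{so}\qquad
\frac{N}{N-2f} \;\le\; \frac{3N}{N+2} \;<\; 3 .
$$

For example, with N = 4 and f = 1 the factor is 4/2 = 2, and with N = 100 and f = 33 it is 100/34 ≈ 2.94.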

如果发送者是正确的，总运行时间是三(异步)轮；并且在任何情况下，在第一个正确的节点输出值和最后一个正确的节点输出值之间最多经过两轮。

If the sender is correct, the total running time is three (asynchronous) rounds; and in any case, at most two rounds elapse between when the first correct node outputs a value and the last outputs a value.

图2所示的可靠广播算法。

The reliable broadcast algorithm is shown in Figure 2.

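Algorithm RBC (for party P_i, with sender P_Sender)

• upon input(v) (if P_i = P_Sender): let {s_j} for j ∈ [N] be the blocks of an (N − 2f, N)-erasure coding scheme applied to v; let h be a Merkle tree root computed over {s_j}; send VAL(h, b_j, s_j) to each party P_j, where b_j is the j-th Merkle tree branch

• upon receiving VAL(h, b_i, s_i) from P_Sender, multicast ECHO(h, b_i, s_i)

• upon receiving ECHO(h, b_j, s_j) from party P_j, check that b_j is a valid Merkle branch for root h and leaf s_j, and otherwise discard

• upon receiving valid ECHO(h, ·, ·) messages from N − f distinct parties: interpolate {s′_j} from any N − 2f leaves received; recompute the Merkle root h′ and if h′ ≠ h then abort; if READY(h) has not yet been sent, multicast READY(h)

• upon receiving f + 1 matching READY(h) messages, if READY has not yet been sent, multicast READY(h)

• upon receiving 2f + 1 matching READY(h) messages, wait for N − 2f ECHO messages, then decode v

Figure 2: Reliable broadcast algorithm, adapted from Bracha’s broadcast [13], with erasure codes to improve efficiency [18].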
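The Merkle-branch check in Figure 2 can be sketched with the standard library. This is a minimal illustration, not the prototype's code: the use of SHA-256, the domain-separation prefixes, the duplication of the last node on odd-sized levels, and all function names are assumptions. It also assumes a non-empty list of leaves.

```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """All tree levels, from the hashed leaves up to the single root."""
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # pad odd levels by repeating the last node
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_branch(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Sibling hashes on the path from leaf `index` to the root."""
    branch = []
    for level in levels[:-1]:
        branch.append(level[index ^ 1])
        index //= 2
    return branch


def verify_branch(root: bytes, leaf: bytes, index: int,
                  branch: List[bytes]) -> bool:
    """Check that `leaf` sits at position `index` under `root`."""
    node = _h(b"\x00" + leaf)
    for sibling in branch:
        if index % 2 == 0:
            node = _h(b"\x01" + node + sibling)
        else:
            node = _h(b"\x01" + sibling + node)
        index //= 2
    return node == root
```

In terms of Figure 2, the sender would compute `levels = merkle_levels(blocks)`, take `h = levels[-1][0]`, and send block `s_j` together with `merkle_branch(levels, j)`; a receiver discards any ECHO for which `verify_branch` returns False.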

二进制一致性协议。二进制一致性协议是一种标准的原语，它允许节点就单个比特的值达成一致。更正式地说，二进制一致性协议保证三个性质:

•(一致性)如果任何正确的节点输出比特b，那么每个正确的节点都输出b。

•(终止性)如果所有正确的节点都接收到输入，那么每个正确的节点都输出一个比特。

•(有效性)如果任何正确的节点输出b，则至少有一个正确的节点接收了b作为输入。

Binary Agreement. Binary agreement is a standard primitive that allows nodes to agree on the value of a single bit. More formally, binary agreement guarantees three properties:

• (Agreement) If any correct node outputs the bit b, then every correct node outputs b.

• (Termination) If all correct nodes receive input, then every correct node outputs a bit.

• (Validity) If any correct node outputs b, then at least one correct node received b as input.

有效性属性意味着一致性:如果所有正确的节点接收相同的输入值b，那么b必须是决定的值。

The validity property implies unanimity: if all of the correct nodes receive the same input value b, then b must be the decided value.

另一方面，如果在任何时候两个节点接收到不同的输入，那么对手可能甚至在其余节点接收到输入之前就强制决定任一值。

On the other hand, if at any point two nodes receive different inputs, then the adversary may force the decision to either value even before the remaining nodes receive input.

我们用Mostéfaoui等人[42]的协议实例化了这个原语，该协议基于一个密码学公共硬币。

We instantiate this primitive with a protocol from Mostéfaoui et al [42], which is based on a cryptographic common coin.

我们将这个实例化的解释推迟到附录中。

We defer explanation of this instantiation to the Appendix.

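它的期望运行时间为O(1)，实际上以 $1 - 2^{-k}$ 的概率在O(k)轮内完成。每个节点的通信复杂度为O(Nλ)，这主要来自公共硬币中使用的阈值密码学。

Its expected running time is O(1), and in fact it completes within O(k) rounds with probability $1 - 2^{-k}$. The communication complexity per node is O(Nλ), due primarily to the threshold cryptography used in the common coin.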

就提议值的子集达成一致。

Agreeing on a subset of proposed values.

综上所述，我们使用Ben-Or等人[9]的协议来商定一组值，其中包含至少N − f个节点的完整提议。

Putting the above pieces together, we use a protocol from Ben-Or et al [9] to agree on a set of values containing the entire proposals of at least N − f nodes.

在高层次上，该协议分两个主要阶段进行。

At a high level, this protocol proceeds in two main phases.

在第一阶段，每个节点Pi使用可靠广播将其建议值传播给其他节点。

In the first phase, each node Pi uses Reliable Broadcast to disseminate its proposed value to the other nodes.

在第二阶段，使用二进制拜占庭协议的N个并发实例来商定一个位向量 $\{b_j\}_{j \in [1..N]}$，其中 $b_j = 1$ 表示 $P_j$ 的提议值包含在最终集合中。

In the second stage, N concurrent instances of binary Byzantine agreement are used to agree on a bit vector $\{b_j\}_{j \in [1..N]}$, where $b_j = 1$ indicates that $P_j$’s proposed value is included in the final set.

实际上，上面的简单描述隐藏了一个微妙的挑战，对此，Ben-Or提供了一个巧妙的解决方案。

Actually, the simple description above conceals a subtle challenge, for which Ben-Or et al provide a clever solution.

对上述思路的一个朴素实现是让每个节点等待前(N − f)个广播完成，然后为与之对应的二进制一致性实例提议1，为所有其他实例提议0。

A naïve attempt at an implementation of the above sketch would have each node wait for the first (N − f) broadcasts to complete, and then propose 1 for the binary agreement instances corresponding to those and 0 for all the others.

然而，正确的节点可能观察到广播以不同的顺序完成。

However, correct nodes might observe the broadcasts complete in a different order.

由于二进制一致性仅保证在所有正确节点一致建议1时输出为1，因此有可能得到的位向量为空。

Since binary agreement only guarantees that the output is 1 if all the correct nodes unanimously propose 1, it is possible that the resulting bit vector could be empty.

为避免这一问题，节点在确定最终向量将至少有N − f位被置为1之前，不会提议0。

为了给这个协议的流程提供一些直觉，我们在图3中叙述了几个可能的场景。图4给出了Ben-Or等人[9]的算法。期望运行时间为 $O(\log N)$，因为它必须等待所有二进制一致性实例完成。当用上面描述的可靠广播和二进制一致性结构实例化时，假设|v|是任何节点输入的最大尺寸，总通信复杂度是 $O(N^2|v| + \lambda N^3 \log N)$。

To avoid this problem, nodes abstain from proposing 0 until they are certain that the final vector will have at least N − f bits set.

To provide some intuition for the flow of this protocol, we narrate several possible scenarios in Figure 3. The algorithm from Ben-Or et al [9] is given in Figure 4. The running time is $O(\log N)$ in expectation, since it must wait for all binary agreement instances to finish. When instantiated with the reliable broadcast and binary agreement constructions described above, the total communication complexity is $O(N^2|v| + \lambda N^3 \log N)$ assuming |v| is the largest size of any node’s input.
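The logarithmic running time can be seen from a short union-bound argument. As a simplifying assumption, measure time in units such that each binary agreement instance is still running after k units with probability at most $2^{-k}$, and let R be the time until all N instances have finished. Then

$$
\Pr[R > k] \le \min\!\left(1,\; N \cdot 2^{-k}\right),
\qquad
\mathbb{E}[R] = \sum_{k \ge 0} \Pr[R > k] \;\le\; \lceil \log_2 N \rceil + \sum_{i \ge 0} 2^{-i} \;=\; \lceil \log_2 N \rceil + 2 ,
$$

where the first $\lceil \log_2 N \rceil$ terms are each bounded by 1 and the remaining terms form a geometric series. Hence $\mathbb{E}[R] = O(\log N)$, even though each individual instance finishes in expected constant time.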

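Figure 3: (Illustrated examples of ACS executions.) Each execution of our protocol involves running N concurrent instances of reliable broadcast (RBC), as well as N instances of Byzantine agreement (BA), which in turn use an expected constant number of common coins. We illustrate several possible examples of how these instances play out, from the viewpoint of Node 0. (a) In the ordinary case, Node 0 receives value V1 (Node 1’s proposed value) from the reliable broadcast at index 1. Node 0 therefore provides input “Yes” to BA1, which outputs “Yes”. (b) RBC2 takes too long to complete, and Node 0 has already received (N − f) “Yes” outputs, so it votes “No” for BA2. However, other nodes have seen RBC2 complete successfully, so BA2 results in “Yes” and Node 0 must wait for V2. (c) BA3 concludes with “No” before RBC3 completes.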

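Algorithm ACS (for party P_i)

Let {RBC_i} refer to N instances of the reliable broadcast protocol, where P_i is the sender of RBC_i. Let {BA_i} refer to N instances of the binary Byzantine agreement protocol.

• upon receiving input v_i, input v_i to RBC_i // see Figure 2

• upon delivery of v_j from RBC_j, if input has not yet been provided to BA_j, then provide input 1 to BA_j // see Figure 11

• upon delivery of value 1 from at least N − f instances of BA, provide input 0 to each instance of BA that has not yet been provided input

• once all instances of BA have completed, let C ⊂ [1..N] be the indexes of each BA that delivered 1. Wait for the output v_j for each RBC_j such that j ∈ C. Finally output ∪_{j∈C} v_j.

Figure 4: Common subset agreement protocol (from Ben-Or et al [9])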
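The event-driven rules of Figure 4 can be sketched as local bookkeeping for a single node. This is an illustration only: the RBC and BA instances themselves are not implemented, the class and method names are hypothetical, and nodes are indexed from 0 rather than 1.

```python
from typing import Dict, Optional


class ACSState:
    """Local bookkeeping for one node following the rules of Figure 4."""

    def __init__(self, n: int, f: int) -> None:
        self.n, self.f = n, f
        self.rbc_values: Dict[int, bytes] = {}  # values delivered by RBC_j
        self.ba_inputs: Dict[int, int] = {}     # our input to BA_j, once chosen
        self.ba_outputs: Dict[int, int] = {}    # decided bit of BA_j

    def on_rbc_deliver(self, j: int, value: bytes) -> None:
        self.rbc_values[j] = value
        self.ba_inputs.setdefault(j, 1)  # input 1 unless we already gave input

    def on_ba_output(self, j: int, bit: int) -> None:
        self.ba_outputs[j] = bit
        ones = sum(1 for b in self.ba_outputs.values() if b == 1)
        if ones >= self.n - self.f:
            # Enough instances decided 1: input 0 everywhere we have not voted.
            for k in range(self.n):
                self.ba_inputs.setdefault(k, 0)

    def output(self) -> Optional[Dict[int, bytes]]:
        """The agreed subset, or None while something is still outstanding."""
        if len(self.ba_outputs) < self.n:
            return None
        chosen = sorted(j for j, b in self.ba_outputs.items() if b == 1)
        if any(j not in self.rbc_values for j in chosen):
            return None  # must keep waiting for a late RBC (case (b) in Figure 3)
        return {j: self.rbc_values[j] for j in chosen}
```

The `setdefault` calls capture the two "if input has not yet been provided" conditions: a node never changes an input it has already given to a BA instance, and it inputs 0 only after at least N − f instances have decided 1.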

4.5分析

4.5 Analysis

首先，我们观察到一致性和全序性质直接来自ACS的定义和TPKE方案的鲁棒性。

定理1。(一致性和全序)。HoneyBadgerBFT协议满足一致性和全序性质，除了可以忽略的概率。

First we observe that the agreement and total order properties follow immediately from the definition of ACS and robustness of the TPKE scheme.

THEOREM 1. (Agreement and total order). The HoneyBadgerBFT protocol satisfies the agreement and total order properties, except for negligible probability.

证明。PROOF.

这两个属性直接来自于高级协议的属性，ACS和TPKE。

These two properties follow immediately from properties of the high-level protocols, ACS and TPKE.

每个ACS实例保证节点同意每个时期中的密文向量(步骤2)。

Each ACS instance guarantees that nodes agree on a vector of ciphertexts in each epoch (Step 2).

TPKE的鲁棒性保证了每个正确的节点将这些密文解密为一致的值(步骤3)。

The robustness of TPKE guarantees that each correct node decrypts these ciphertexts to consistent values (Step 3).

这足以确保一致性和全序。

This suffices to ensure agreement and total order.

定理2。(复杂度)。假设批量大小为 $B = \Omega(\lambda N^2 \log N)$，每个HoneyBadgerBFT epoch的期望运行时间为 $O(\log N)$，总的期望通信复杂度为O(B)。

THEOREM 2. (Complexity). Assuming a batch size of $B = \Omega(\lambda N^2 \log N)$, the running time for each HoneyBadgerBFT epoch is $O(\log N)$ in expectation, and the total expected communication complexity is O(B).

证明。ACS的成本和运行时间在第4.4节中解释。阈值解密的N个实例带来额外的一轮和 $O(\lambda N^2)$ 的额外成本，这不影响总的渐近成本。

PROOF. The cost and running time of ACS are explained in Section 4.4. The N instances of threshold decryption incur one additional round and an additional cost of $O(\lambda N^2)$, which does not affect the overall asymptotic cost.

HoneyBadgerBFT协议可以在单个时期内提交多达B个事务。然而，实际数量可能少于此，因为一些正确的节点可能提出重叠的事务集，其他节点可能响应得太晚，并且被破坏的节点可能提出空集。幸运的是，我们(在附录中)证明了假设每个正确节点的队列都已满，那么B/4就是一个时期内提交的事务的预期数量的下限。

The HoneyBadgerBFT protocol may commit up to B transactions in a single epoch. However, the actual number may be less than this, since some correct nodes may propose overlapping transaction sets, others may respond too late, and corrupted nodes may propose an empty set. Fortunately, we prove (in the Appendix) that assuming each correct node’s queue is full, then B/4 serves as a lower bound for the expected number of transactions committed in an epoch.

定理3。(效率)。假设每个正确节点的队列包含至少B个不同的事务，则在一个epoch中提交的事务的期望数量至少为B/4，从而得到常数效率。

THEOREM 3. (Efficiency). Assuming each correct node’s queue contains at least B distinct transactions, then the expected number of transactions committed in an epoch is at least B/4, resulting in constant efficiency.

最后，我们证明(在附录中)对手不能显著延迟任何事务的提交。

Finally, we prove (in the Appendix) that the adversary cannot significantly delay the commit of any transaction.

定理4。(审查弹性)。假设对手将事务tx作为输入传递给N-f个正确的节点。设T为“积压”的大小，即先前输入到任何正确节点的事务总数和已经提交的事务数之差。则tx在O(T /B + λ)个时期内被提交，除非概率可以忽略。

THEOREM 4. (Censorship Resilience). Suppose an adversary passes a transaction tx as input to N − f correct nodes. Let T be the size of the “backlog”, i.e. the difference between the total number of transactions previously input to any correct node and the number of transactions that have been committed. Then tx is committed within O(T/B + λ) epochs except with negligible probability.

5.实施和评估

5. Implementation and Evaluation

在本节中，我们使用HoneyBadgerBFT协议的原型实现进行了几个实验和性能测量。

In this section we carry out several experiments and performance measurements using a prototype implementation of the HoneyBadgerBFT protocol.

除非另有说明，否则本节中报告的数字默认用于乐观情况，即所有节点行为诚实。

Unless otherwise noted, numbers reported in this section are by default for the optimistic case where all nodes are behaving honestly.

首先，我们通过在广域网中进行实验来证明HoneyBadgerBFT确实是可扩展的，包括五大洲多达104个节点。

First we demonstrate that HoneyBadgerBFT is indeed scalable by performing an experiment in a wide area network, including up to 104 nodes in five continents.

即使在这种情况下，HoneyBadgerBFT也可以达到每秒数千个事务的峰值吞吐量。

Even under these conditions, HoneyBadgerBFT can reach peak throughputs of thousands of transactions per second.

此外，通过与代表性的部分同步协议PBFT进行比较，HoneyBadgerBFT的性能仅差一个小的常数因子。

Furthermore, by a comparison with PBFT, a representative partially synchronous protocol, HoneyBadgerBFT performs only a small constant factor worse.

最后，我们演示了在Tor匿名通信层上运行异步BFT的可行性。

Finally, we demonstrate the feasibility of running asynchronous BFT over the Tor anonymous communication layer.

实施细节。

Implementation details.

我们用Python开发了HoneyBadgerBFT的原型实现，使用gevent库处理并发任务。

We developed a prototype implementation of HoneyBadgerBFT in Python, using the gevent library for concurrent tasks.

对于确定性的纠删编码，我们使用zfec库[52]，它实现了Reed-Solomon码。为了实例化公共硬币(common coin)原语，我们实现了Boldyreva的基于配对的阈值签名方案[11]。对于交易的阈值加密，我们使用Baek和Zheng的方案[7]来加密一个256位的临时密钥，然后在实际的有效载荷上使用CBC模式的AES-256。

我们使用Charm [3]为PBC库[38]提供的Python包装器来实现这些阈值密码学方案。对于阈值签名，我们使用所提供的MNT224曲线，签名(和签名份额)只有65字节，并启发式地提供112比特的安全性。我们的阈值加密方案需要一个对称双线性群，因此我们使用SS512群，它启发式地提供80比特的安全性[44]。

For deterministic erasure coding, we use the zfec library [52], which implements Reed-Solomon codes. For instantiating the common coin primitive, we implement Boldyreva’s pairing-based threshold signature scheme [11]. For threshold encryption of transactions, we use Baek and Zheng’s scheme [7] to encrypt a 256-bit ephemeral key, followed by AES-256 in CBC mode over the actual payload.

We implement these threshold cryptography schemes using the Charm [3] Python wrappers for the PBC library [38]. For threshold signatures, we use the provided MNT224 curve, resulting in signatures (and signature shares) of only 65 bytes, and heuristically providing 112 bits of security. Our threshold encryption scheme requires a symmetric bilinear group: we therefore use the SS512 group, which heuristically provides 80 bits of security [44].

在我们的EC2实验中，我们使用普通的(未经认证的)TCP套接字。

In our EC2 experiments, we use ordinary (unauthenticated) TCP sockets.

在实际部署中，我们会使用带有客户端和服务器双向身份验证的TLS，这对长连接会话增加的开销微不足道。

In a real deployment we would use TLS with both client and server authentication, adding insignificant overhead for long-lived sessions.

类似地，在我们的Tor实验中，每个套接字只有一个端点被认证(通过“隐藏服务”地址)。

Similarly, in our Tor experiment, only one endpoint of each socket is authenticated (via the “hidden service” address).

我们的理论模型假设节点有无限的缓冲区。

Our theoretical model assumes nodes have unbounded buffers.

在实践中，每当内存消耗达到一个水印时(例如，每当75%满时)，更多的资源可以被动态地添加到一个节点，尽管我们的原型实现还不包括这个特性。

In practice, more resources could be added dynamically to a node whenever memory consumption reaches a watermark (e.g., whenever it is 75% full), though our prototype implementation does not yet include this feature.

未能提供足够的缓冲将被计入故障预算f。

Failure to provision an adequate buffer would count against the failure budget f .

5.1带宽分解和评估。

5.1 Bandwidth Breakdown and Evaluation

我们首先分析系统的带宽成本。

We first analyze the bandwidth costs of our system.

在所有实验中，我们假设交易大小恒定为 $m_T = 250$ 字节，足以容纳一个ECDSA签名、两个公钥以及应用程序有效载荷(即，大约为典型比特币交易的大小)。

In all experiments, we assume a constant transaction size of $m_T = 250$ bytes each, which would admit an ECDSA signature, two public keys, as well as an application payload (i.e., approximately the size of a typical Bitcoin transaction).

我们的实验使用参数 N = 4f，每一方提出一批 B/N 个事务。

Our experiments use the parameter N = 4f, and each party proposes a batch of B/N transactions.

为了模拟最坏情况，节点从大小为 B 的相同队列开始。我们将运行时间记录为从实验开始到第 (N − f) 个节点输出值的时间。

To model the worst case scenario, nodes begin with identical queues of size B. We record the running time as the time from the beginning of the experiment to when the (N − f)-th node outputs a value.

带宽及其分解的结果。

Bandwidth and breakdown findings.

每个节点消耗的总带宽包括固定的附加开销以及依赖于事务的开销。

The overall bandwidth consumed by each node consists of a fixed additive overhead as well as a transaction dependent overhead.

对于我们考虑的所有参数值，附加开销由 $O(\lambda N^2)$ 项支配，该项由ABA阶段和随后的解密阶段中的阈值密码学操作产生。

For all parameter values we considered, the additive overhead is dominated by an $O(\lambda N^2)$ term resulting from the threshold cryptography in the ABA phases and the decryption phase that follows.

在ABA阶段中，每个节点预期发送 $4N^2$ 个签名份额。

The ABA phase involves each node transmitting $4N^2$ signature shares in expectation.

只有RBC阶段会产生与事务相关的开销，等于纠删码扩展因子 $r = \frac{N}{N - 2f}$。

Only the RBC phase incurs a transaction-dependent overhead, equal to the erasure coding expansion factor $r = \frac{N}{N - 2f}$.

由于ECHO消息中包含Merkle树分支，RBC阶段还会给开销带来 $N^2 \log N$ 个哈希。

The RBC phase also contributes $N^2 \log N$ hashes to the overhead because of Merkle tree branches included in the ECHO messages.

总通信成本(每个节点)估计为：

The total communication cost (per node) is estimated as:

[equation image]

其中 $m_E$ 和 $m_D$ 分别是TPKE方案中密文和解密份额的大小，$m_S$ 是TSIG签名份额的大小。

当我们增大所提议的批量 B 时，系统的有效吞吐量随之增加，成本中与事务相关的部分逐渐占主导地位。如图5所示，对于 N = 128，在批量不超过1024个事务时，与事务无关的带宽仍然在总成本中占主导地位。然而，当批量大小达到16384时，依赖于事务的部分开始占主导地位，这主要来自节点传输纠删码块的RBC.ECHO阶段。

where $m_E$ and $m_D$ are respectively the size of a ciphertext and decryption share in the TPKE scheme, and $m_S$ is the size of a TSIG signature share.

The system’s effective throughput increases as we increase the proposed batch size B, such that the transaction-dependent portion of the cost dominates. As Figure 5 shows, for N = 128, for batch sizes up to 1024 transactions, the transaction-independent bandwidth still dominates the overall cost. However, when the batch size reaches 16384, the transaction-dependent portion begins to dominate, largely resulting from the RBC.ECHO stage where nodes transmit erasure-coded blocks.
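The per-node cost formula itself appears only as an image in the source, so the sketch below is a reconstruction from the terms the text describes (the expansion factor r, the N² log N Merkle hashes, the 4N² signature shares, and one ciphertext and one round of decryption shares per proposer), not the authors' exact expression. The values `m_t=250` and `m_s=65` come from the text; `m_e`, `m_d` and `m_h` are hypothetical placeholder sizes chosen only for illustration.

```python
import math


def expansion_factor(n: int, f: int) -> float:
    """Erasure-coding expansion factor r = N / (N - 2f) quoted in the text."""
    return n / (n - 2 * f)


def estimated_cost_bytes(n, f, batch, m_t=250, m_e=256, m_d=128, m_s=65, m_h=32):
    """Return (transaction_dependent, fixed) per-node cost in bytes.

    Assumed reconstruction:
        r * (B * m_T + N * m_E) + N^2 * ((1 + log2 N) * m_H + m_D + 4 * m_S)
    m_e, m_d and m_h are illustrative placeholders, not values from the paper.
    """
    r = expansion_factor(n, f)
    tx_dependent = r * batch * m_t
    fixed = r * n * m_e + n ** 2 * ((1 + math.log2(n)) * m_h + m_d + 4 * m_s)
    return tx_dependent, fixed


if __name__ == "__main__":
    for batch in (1024, 16384, 2 ** 20):
        dep, fixed = estimated_cost_bytes(128, 32, batch)
        share = dep / (dep + fixed)
        print(f"B={batch}: transaction-dependent share = {share:.3f}")
```

Under these assumed sizes the fixed term dominates for small batches and the transaction-dependent term takes over as B grows, which is the qualitative behaviour the paragraph describes. The crossover point depends on the placeholder sizes and should not be read as a reproduction of Figure 5.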

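图5:不同批量的估计通信成本，单位为兆字节(每个节点)。对于小批量，固定成本随着 $O(N^2 \log N)$ 增长。饱和时，开销系数接近 $\frac{N}{N - 2f} < 3$。

Figure 5: Estimated communication cost in megabytes (per node) for varying batch sizes. For small batch sizes, the fixed cost grows with $O(N^2 \log N)$. At saturation, the overhead factor approaches $\frac{N}{N - 2f} < 3$.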

5.2 Experiments on Amazon EC2

5.2在亚马逊EC2上的实验

为了了解我们的设计有多实用，我们在亚马逊EC2服务上部署了我们的协议并全面测试了其性能。

我们在32、40、48、56、64和104个亚马逊EC2 t2.medium实例上运行HoneyBadgerBFT，这些实例均匀分布在其跨越5大洲的8个地区。在我们的实验中，我们改变了批量大小，使每个节点提出256、512、1024、2048、4096、8192、16384、32768、65536或131072个交易。

To see how practical our design is, we deployed our protocol on Amazon EC2 services and comprehensively tested its performance.

We ran HoneyBadgerBFT on 32, 40, 48, 56, 64, and 104 Amazon EC2 t2.medium instances uniformly distributed throughout its 8 regions spanning 5 continents. In our experiments, we varied the batch size such that each node proposed 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, or 131072 transactions.

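图6:吞吐量(每秒提交的事务)与提议的事务数量的关系。误差线表示95%的置信区间。

Figure 6: Throughput (transactions committed per second) vs. number of transactions proposed. Error bars indicate 95% confidence intervals.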

吞吐量。吞吐量被定义为单位时间内提交的事务数量。在我们的实验中，如果没有另外指定，我们使用“每秒确认的事务”作为度量单位。图6显示了吞吐量和所有N方提出的事务总数之间的关系。容错参数设置为f = N/4。

结果。从图6中我们可以看到，对于每个设置，吞吐量随着提议事务数量的增加而增加。对于多达40个节点的中型网络，我们实现的吞吐量超过每秒20,000个事务。对于一个104节点的大型网络，我们每秒可以处理超过1,500个事务。给定无限的批量大小，所有网络规模最终都将收敛到一个共同的上限，只受可用带宽的限制。

虽然网络中消耗的总带宽随着每个额外的节点而增加(线性)，但是额外的节点也贡献了额外的带宽容量。

Throughput. Throughput is defined as the number of transactions committed per unit of time. In our experiment, we use “confirmed transactions per second” as our measure unit if not specified otherwise. Figure 6 shows the relationship between throughput and total number of transactions proposed by all N parties. The fault tolerance parameter is set to be f = N/4.

Findings. From Figure 6 we can see for each setting, the throughput increases as the number of proposed transactions increases. We achieve throughput exceeding 20,000 transactions per second for medium size networks of up to 40 nodes. For a large 104 node network, we attain more than 1,500 transactions per second. Given an infinite batch size, all network sizes would eventually converge to a common upper bound, limited only by available bandwidth.

Although the total bandwidth consumed in the network increases (linearly) with each additional node, the additional nodes also contribute additional bandwidth capacity.

吞吐量、延迟和规模的权衡。延迟被定义为从第一个节点收到客户端请求到第 (N − f) 个节点完成共识协议的时间间隔。这是合理的，因为第 (N − f) 个节点完成协议意味着诚实各方已经完成共识。

Throughput, latency, and scale tradeoffs. Latency is defined as the time interval between the time the first node receives a client request and when the (N − f )-th node finishes the consensus protocol. This is reasonable because the (N − f )-th node finishing the protocol implies the accomplishment of the consensus for the honest parties.
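The latency definition above can be written down directly. The following is a minimal sketch, assuming per-node finish timestamps have already been collected; the function name and arguments are illustrative and do not come from the authors' prototype.

```python
def epoch_latency(first_request_time, finish_times, n, f):
    """Latency as defined in the text: time from the first node receiving the
    client request until the (N - f)-th node finishes the consensus protocol.

    finish_times holds one timestamp per node that finished (any order).
    """
    if len(finish_times) < n - f:
        raise ValueError("fewer than N - f nodes have finished")
    # The (N - f)-th finisher is at index N - f - 1 once the times are sorted.
    return sorted(finish_times)[n - f - 1] - first_request_time


if __name__ == "__main__":
    # Four nodes, one tolerated fault: latency is set by the third finisher.
    print(epoch_latency(0.0, [5.0, 3.0, 9.0, 4.0], n=4, f=1))
```

Using the (N − f)-th finisher rather than the last one means up to f slow or faulty nodes do not inflate the measurement.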

Figure 7: Latency vs. throughput for experiments over wide area networks. Error bars indicate 95% confidence intervals.

图7:广域网上实验的延迟与吞吐量。误差线表示95%的置信区间。

图7显示了在 N 和 f = N/4 的不同选择下延迟与吞吐量之间的关系。正斜率表明我们的实验尚未使可用带宽完全饱和，即使采用更大的批量，我们仍能获得更高的吞吐量。图7还显示，延迟随着节点数量的增加而增加，这主要源于协议的ABA阶段。事实上，在 N = 104 时，对于我们尝试的批量大小范围，我们的系统是CPU受限的，而不是带宽受限的，因为我们的实现是单线程的，并且必须验证 $O(N^2)$ 个阈值签名。无论如何，我们最大的104节点实验在6分钟内完成。

Figure 7 shows the relationship between latency and throughput for different choices of N and f = N/4. The positive slopes indicate that our experiments have not yet fully saturated the available bandwidth, and we would attain better throughput even with larger batch sizes. Figure 7 also shows that latency increases as the number of nodes increases, largely stemming from the ABA phase of the protocol. In fact, at N = 104, for the range of batch sizes we tried, our system is CPU bound rather than bandwidth bound because our implementation is single threaded and must verify $O(N^2)$ threshold signatures. Regardless, our largest experiment with 104 nodes completes in under 6 minutes.

尽管在不影响最大可达吞吐量的情况下，可以向网络中添加更多的节点(在带宽供应相同的情况下)，但提交一批所需的最小带宽(以及由此带来的延迟)会以 $O(N^2 \log N)$ 增长。这个约束意味着可扩展性存在上限，具体取决于带宽成本和用户对延迟的容忍度。

Although more nodes (with equal bandwidth provisioning) could be added to the network without affecting maximum attainable throughput, the minimal bandwidth consumed to commit one batch (and therefore the latency) increases with $O(N^2 \log N)$. This constraint implies a limit on scalability, depending on the cost of bandwidth and users’ latency tolerance.

图8:在EC2上与PBFT的比较。

Figure 8: Comparison with PBFT on EC2s.

与PBFT的比较。图8显示了与PBFT协议的比较，后者是用于部分同步网络的经典BFT协议。我们使用Croman等人[24]的Python实现，运行在平均分布在Amazon AWS区域的8、16、32和64个节点上。选择批量大小是为了使网络的可用带宽饱和。

Comparison with PBFT. Figure 8 shows a comparison with the PBFT protocol, a classic BFT protocol for partially synchronous networks. We use the Python implementation from Croman et al. [24], running on 8, 16, 32, and 64 nodes evenly distributed among Amazon AWS regions. Batch sizes were chosen to saturate the network’s available bandwidth.

从根本上说，虽然PBFT和我们的协议在总体上具有相同的渐进通信复杂度，但我们的协议在网络链路中均匀地分配这一负载，而PBFT则在领导者的可用带宽上有瓶颈。因此，PBFT的可达到的吞吐量随着节点数量的增加而减少，而HoneyBadgerBFT的吞吐量则大致保持不变。

Fundamentally, while PBFT and our protocol have the same asymptotic communication complexity in total, our protocol distributes this load evenly among the network links, whereas PBFT bottlenecks on the leader’s available bandwidth. Thus PBFT’s attainable throughput diminishes with the number of nodes, while HoneyBadgerBFT’s remains roughly constant.

请注意，这个实验只反映了乐观的情况，没有故障或网络中断。即使对于小型网络，HoneyBadgerBFT在第3节所述的对抗性条件下也能提供明显更好的健壮性。特别是，面对一个对抗性的异步调度器，PBFT的吞吐量将为零，而HoneyBadgerBFT仍将以稳定的速率完成各个时期(epoch)。

Note that this experiment reflects only the optimistic case, with no faults or network interruptions. Even for small networks, HoneyBadgerBFT provides significantly better robustness under adversarial conditions as noted in Section 3. In particular, PBFT would achieve zero throughput against an adversarial asynchronous scheduler, whereas HoneyBadgerBFT would complete epochs at a regular rate.

5.3 Tor上的实验

5.3 Experiments over Tor

为了证明HoneyBadgerBFT的健壮性，我们运行了在Tor(最成功的匿名通信网络)上执行的容错共识协议的第一个实例(据我们所知)。

To demonstrate the robustness of HoneyBadgerBFT, we run the first instance (to our knowledge) of a fault tolerant consensus protocol carried out over Tor (the most successful anonymous communication network).

与我们最初的AWS部署相比，Tor增加了大量不同的延迟。

Tor adds significant and varying latency compared to our original AWS deployment.

不管怎样，我们证明了我们可以在不调整任何参数的情况下运行HoneyBadgerBFT。

Regardless, we show that we can run HoneyBadgerBFT without tuning any parameters.

将HoneyBadgerBFT节点隐藏在Tor后面可能会提供更好的健壮性。

Hiding HoneyBadgerBFT nodes behind the shroud of Tor may offer even better robustness.

因为它帮助节点隐藏它们的IP地址，所以它可以帮助它们避免有目标的网络攻击和涉及它们的物理位置的攻击。

Since it helps the nodes to conceal their IP addresses, it can help them avoid targeted network attacks and attacks involving their physical location.

Tor的简要背景。

Brief background on Tor.

Tor网络由大约6,500个中继组成，这些中继列在公共目录服务中。

The Tor network consists of approximately 6,500 relays, which are listed in a public directory service.

Tor支持“隐藏服务”，即通过Tor接受连接以隐藏其位置的服务器。

Tor enables “hidden services,” which are servers that accept connections via Tor in order to conceal their location.

当客户端建立到隐藏服务的连接时，客户端和服务器都会构建到一个公共“会合点”的3跳线路。因此，每个到隐藏服务的连接都会通过5个随机选择的中继来路由数据。

When a client establishes a connection to a hidden service, both the client and the server construct 3-hop circuits to a common “rendezvous point.” Thus each connection to a hidden service routes data through 5 randomly chosen relays.

Tor为中继节点提供了一种方法来通告它们的容量和利用率，这些自我报告的指标由Tor项目汇总。

Tor provides a means for relay nodes to advertise their capacity and utilization, and these self-reported metrics are aggregated by the Tor project.

根据这些指标，网络的总容量约为145Gbps，当前利用率约为65Gbps。

According to these metrics, the total capacity of the network is ∼145Gbps, and the current utilization is ∼65Gbps.

Tor实验设置。

Tor experiment setup.

我们设计实验设置时，力求能够在一台运行Tor守护程序的台式机上运行所有N个HoneyBadgerBFT节点，同时真实地反映Tor的中继路径。

We designed our experiment setup such that we could run all N HoneyBadgerBFT nodes on a single desktop machine running the Tor daemon software, while being able to realistically reflect Tor relay paths.

为此，我们将机器配置为监听N个隐藏服务(在我们的模拟网络中，每个HoneyBadgerBFT节点对应一个隐藏服务)。

To do this, we configured our machine to listen on N hidden services (one hidden service for each HoneyBadgerBFT node in our simulated network).

由于每个HoneyBadgerBFT节点都与其他每个节点建立连接，因此我们在每个实验中总共构建了 $N^2$ 条Tor线路，它们都从我们的机器开始并在我们的机器结束，中间经过5个随机中继。

Since each HoneyBadgerBFT node forms a connection to each other node, we construct a total of $N^2$ Tor circuits per experiment, beginning and ending with our machine, and passing through 5 random relays.

总之，所有成对的覆盖链路都穿过由随机中继节点组成的真实Tor线路，这样设计是为了使所获得的性能能够代表Tor上真实的HoneyBadgerBFT部署(尽管所有模拟节点都运行在同一台主机上)。

In summary, all pairwise overlay links traverse real Tor circuits consisting of random relay nodes, designed so that the performance obtained is representative of a real HoneyBadgerBFT deployment over Tor (despite all simulated nodes running on a single host machine).

由于Tor为许多用户提供了重要的公共服务，因此确保在实时网络上进行的研究实验不会对其产生负面影响非常重要。

Since Tor provides a critical public service for many users, it is important to ensure that research experiments conducted on the live network do not adversely impact it.

我们仅从单一观测点建立连接(从而避免接收)，并且只运行持续时间短(几分钟)、参数小(在我们最大的实验中仅形成256条线路)的实验。

We formed connections from only a single vantage point (and thus avoid receiving), and ran experiments of short duration (several minutes) and with small parameters (only 256 circuits formed in our largest experiment).

总的来说，我们的实验通过Tor传输了大约5 GB的数据，不到其日常利用率的1E-5。

In total, our experiments involved transferring approximately 5 GB of data through Tor, less than a 1E-5 fraction of its daily utilization.

图9:在Tor上运行HoneyBadgerBFT的实验的延迟与吞吐量。

Figure 9: Latency vs. throughput for experiments running HoneyBadgerBFT over Tor.

图9显示了延迟如何随吞吐量变化。与节点拥有充足带宽的EC2实验不同，Tor线路受限于线路中最慢的链路。在Tor上，我们达到了每秒超过800个事务的最大吞吐量。

Figure 9 shows how latency changes with throughput. In contrast to our EC2 experiment where nodes have ample bandwidth, Tor circuits are limited by the slowest link in the circuit. We attain a maximum throughput of over 800 transactions per second over Tor.

一般来说，通过Tor的中继网络传输的消息往往具有显著且高度可变的延迟。

In general, messages transmitted over Tor’s relay network tend to have significant and highly variable latency.

例如，在我们对每方提出16384个事务的8方进行的实验中，单个消息可能延迟316.18秒，延迟方差超过2208，而平均延迟只有12秒。

For instance, during our experiment on 8 parties proposing 16384 transactions per party, a single message can be delayed for 316.18 seconds and the delay variance is over 2208 while the average delay is only 12 seconds.

我们强调，我们的协议不需要像传统的最终同步协议那样针对这样的网络条件进行调整。

We stress that our protocol did not need to be tuned for such network conditions, as would a traditional eventually-synchronous protocol.

6. 结论

6. CONCLUSION

我们提出了HoneyBadgerBFT，这是第一个高效、高吞吐量的异步BFT协议。

We have presented HoneyBadgerBFT, the first efficient and high-throughput asynchronous BFT protocol.

通过我们的实现和实验结果，我们证明了HoneyBadgerBFT可以成为新兴的、受加密货币启发的容错事务处理系统部署中的合适组件。

Through our implementation and experimental results we demonstrate that HoneyBadgerBFT can be a suitable component in incipient cryptocurrency-inspired deployments of fault tolerant transaction processing systems.

更一般地说，我们相信我们的工作展示了基于异步协议构建可靠的事务处理系统的前景。

More generally, we believe our work demonstrates the promise of building dependable transaction processing systems based on asynchronous protocols.

附录

APPENDIX

A. 攻击PBFT

A. ATTACKING PBFT

PBFT。

PBFT.

PBFT协议由两个主要工作流组成:在乐观情况下(当网络同步且领导者功能正常时)提供良好性能的“快速路径”，以及改变领导者的“视图-改变”过程。

The PBFT protocol consists of two main workflows: a “fast path” that provides good performance in the optimistic case (when the network is synchronous and the leader functions correctly), and a “view-change” procedure to change leaders.

快速路径由三轮通信组成:PRE_PREPARE、PREPARE和COMMIT。

The fast path consists of three rounds of communication: PRE_PREPARE, PREPARE, and COMMIT.

给定视图的领导者负责对所有请求进行完全排序。

The leader of a given view is responsible for totally ordering all requests.

在接收到客户机请求时，领导者向所有其他复制品多播指定请求和序列号的PRE_PREPARE消息，这些复制品通过多播相应的PREPARE消息来响应。

Upon receiving a client request, the leader multicasts a PRE_PREPARE message specifying the request and a sequence number to all other replicas, who respond by multicasting a corresponding PREPARE message.

副本在收到 2f 条PREPARE消息(除了相应的PRE_PREPARE消息之外)时多播一条COMMIT消息，并在收到 2f + 1 条COMMIT消息(包括它们自己的)时执行请求。

Replicas multicast a COMMIT message on receipt of 2f PREPARE messages (in addition to the corresponding PRE_PREPARE message), and execute the request on receipt of 2f + 1 COMMIT messages (including their own).

当请求执行时间过长(即超过超时间隔)、先前发起的视图更改耗时过长，或者副本收到 f + 1 条具有更高视图编号的VIEW_CHANGE消息时，副本会增加其视图编号并多播VIEW_CHANGE消息以选举新的领导者。

Replicas increment their view number and multicast a VIEW_CHANGE message to elect a new leader when a request takes too long to execute (i.e., longer than a timeout interval), a previously initiated view change has taken too long, or they receive f + 1 VIEW_CHANGE messages with a higher view number.

下一个视图的领导者是由视图编号对副本的数量取模来确定的(因此，领导权以循环方式转移)。

The leader of the next view is determined by the view number modulo the number of replicas (thus, leadership is transferred in a round-robin manner).

一旦接收到 2f + 1 条VIEW_CHANGE消息，新的领导者就多播NEW_VIEW消息，并将这些消息包含在内，作为视图有效的证明。

The new leader multicasts a NEW_VIEW message once it receives 2f + 1 VIEW_CHANGE messages and includes them as proof of a valid view.

如果NEW_VIEW消息的视图编号等于或大于副本自己的当前视图编号，则副本接受该消息，并恢复正常的消息处理；视图编号较低的消息则会被忽略。

A replica accepts the NEW_VIEW message if its number is equal to or greater than its own current view number, and resumes processing messages as normal; however, messages with lower view numbers are ignored.

超时间隔被初始化为一个固定值(∆)，但每经历一次连续失败的领导者选举，超时间隔就增加到原来的2倍。

The timeout interval is initialized to a fixed value (∆), but increases by a factor of 2 with each consecutive unsuccessful leader election.
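The two rules just described, round-robin leader rotation and the doubling timeout, are simple enough to sketch. This is a minimal illustration only; the function names are hypothetical and are not taken from any PBFT implementation.

```python
def leader_of(view: int, num_replicas: int) -> int:
    """Round-robin leadership: the leader is the view number modulo the
    number of replicas."""
    return view % num_replicas


def timeout_after(failed_elections: int, delta: float) -> float:
    """Timeout starts at delta and doubles with each consecutive
    unsuccessful leader election."""
    return delta * 2 ** failed_elections


if __name__ == "__main__":
    for view in range(6):
        print(view, leader_of(view, 4), timeout_after(view, 1.0))
```

Because the timeout only grows geometrically, a scheduler that is allowed to delay the intended leader's messages for longer than the current timeout can keep every election unsuccessful, which is the weakness the attack below exploits.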

阻碍PBFT的间歇性同步网络。

An intermittently synchronous network that thwarts PBFT.

调度器不会丢弃或重新排序任何消息，只是延迟将消息传递给当前的领导者节点。

The scheduler does not drop or reorder any messages, but simply delays delivering messages to whichever node is the current leader.

特别是，每当当前的领导者是一个故障节点时，这意味着所有诚实节点之间的消息被立即传递。

In particular, whenever the current leader is a faulty node, this means that messages among all honest nodes are delivered immediately.

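稍后我们将详细说明PBFT协议在我们的攻击下的行为。

Shortly we provide a detailed illustration of how the PBFT protocol behaves under our attack.

为了证实我们的分析，我们实现了这个恶意的调度器作为代理，它拦截并延迟了发送给新领导者的所有视图更改消息，并针对PBFT的1200行Python实现对其进行了测试。

To confirm our analysis, we implemented this malicious scheduler as a proxy that intercepted and delayed all view change messages to the new leader, and tested it against a 1200-line Python implementation of PBFT.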

我们观察到的结果和消息日志与上述分析一致；我们的副本陷入了请求视图更改的循环中，但从未成功。

The results and message logs we observed were consistent with the above analysis; our replicas became stuck in a loop requesting view changes that never succeeded.

因为这个调度器是间歇同步的，所以任何纯异步协议(包括HoneyBadgerBFT)在同步期间都会取得很好的进展，不管前面的间隔是多少。

Since this scheduler is intermittently synchronous, any purely asynchronous protocol (including HoneyBadgerBFT) would make good progress during periods of synchrony, regardless of preceding intervals.

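图10:一个间歇同步的调度器，它违反了PBFT的假设，并确实阻止了它取得进展。只显示了前四个阶段，该行为会无限重复下去。在粉色区域，发给领导者的消息被延迟(超过超时时间，因此违反了最终同步假设)。然而，所有其他消息都以正常速率在诚实方之间传递，因此是“间歇同步的”。

Figure 10: An intermittently synchronous scheduler that violates PBFT’s assumptions and indeed prevents it from making progress. Only the first four phases are shown; the behavior continues to repeat indefinitely. During the pink regions, messages to the leader are delayed (for longer than the timeout, thus violating the eventual-synchrony assumption). However, all other messages are delivered at the ordinary rate between honest parties, hence it is “intermittently synchronous.”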

PBFT在攻击下的表现。

How PBFT behaves under attack.

在图10中，我们展示了我们对PBFT的攻击。

In Figure 10, we illustrate our attack on PBFT.

调度器不会丢弃或重新排序任何消息，只是延迟将消息传递给当前的领导者节点。

The scheduler does not drop or reorder any messages, but simply delays delivering messages to whichever node is the current leader.

特别是，每当当前的领导者是一个故障节点时，这意味着所有诚实节点之间的消息被立即传递。

In particular, whenever the current leader is a faulty node, this means that messages among all honest nodes are delivered immediately.

我们将客户端请求缩写为“Req”，NEW_VIEW消息缩写为“N”，VIEW_CHANGE消息缩写为“V”，PRE_PREPARE消息缩写为“PP”。消息上的下标表示发送该消息时所在的视图。这里，? 后跟一条消息，表示该消息已由列号指定的副本在行号乘以固定超时间隔∆所指定的时间广播到所有其他节点(称为副本)。同样，• 后跟一条消息，表示该消息已在该行指定的时间传递到列号指定的副本。由于给定视图的多条VIEW_CHANGE消息会被发送到每个节点，•Vn表示视图编号为n的所有VIEW_CHANGE消息的传递。附加到已传递消息的红色“X”表示该消息被忽略，因为其视图编号与该副本的当前视图不匹配。“*”表示定时器已因所传递的消息而启动。“**”表示副本的视图编号因所传递的消息而增加。红色区域表示此时来自该副本的所有广播操作将延迟∆。粉色区域表示所有消息的接收将延迟∆。

We abbreviate client requests as “Req,” NEW_VIEW messages as “N,” VIEW_CHANGE messages as “V,” and PRE_PREPARE messages as “PP.” The subscript on a message indicates the view in which it was sent. Here, ? followed by a message indicates that this message has been broadcast to all other nodes (called replicas) by the replica specified by the column number, at the time specified by the row number multiplied by the fixed timeout interval ∆. Similarly, • followed by a message indicates that this message has been delivered to the replica specified by the column number, at the time specified by the row. As multiple VIEW_CHANGE messages for a given view are sent to each individual node, •Vn indicates the delivery of all VIEW_CHANGE messages with view number n. A red “X” appended to a delivered message indicates that the message is ignored because the view number does not match that replica’s current view. A “*” indicates that a timer has been started as a result of the delivered message. “**” indicates that a replica’s view number has incremented as a result of the delivered message(s). A red region indicates that all broadcast operations from this replica at this time will be delayed by ∆. A pink region indicates that the receipt of all messages will be delayed by ∆.

在这个例子中，有问题的副本0最初是领导者，并扣留了一个PRE_PREPARE消息，时间超过了超时周期∆。这触发了所有节点增加他们的视图计数器，并为视图1多播一个VIEW_CHANGE消息。然后，调度器推迟了所有VIEW_CHANGE消息对副本1（视图1的领导者）的传递。其余节点的视图改变操作超时，因为它们没有从副本1收到有效的NEW_VIEW消息。然后，节点0、2和3将他们的视图计数器增加到2，并组播另一个VIEW_CHANGE消息。此时，视图1的VIEW_CHANGE消息被传递给副本1，副本1通过在视图1中组播一个NEW_VIEW和一个PRE_PREPARE消息来回应。这些消息随后被传递，随后被所有其他节点忽略，因为它们已经进展到了视图编号2。然后，副本1将收到视图2的VIEW_CHANGE消息，并相应增加其视图计数器。然后，调度器会推迟所有VIEW_CHANGE消息对副本2的传递，确保所有其他节点的视图改变操作再次超时。这个过程将持续到有问题的副本0再次被选为领导者，此时调度器将以加速的速度传递所有消息，而副本0则扣留相应的NEW_VIEW和PRE_PREPARE消息以触发另一个视图变化，并重复这个循环。只要调度器扣留预定的非故障领导者的VIEW_CHANGE消息的时间超过（指数级增加的）超时间隔，这个循环就可以无限期地继续下去，阻止任何视图改变成功，并阻止协议取得任何进展，尽管在副本0是领导者的时间间隔内（0∆,8∆,64∆…）所有非故障副本都能够不受任何干扰地进行通信。

In this example, the faulty replica 0 is initially the leader and withholds a PRE_PREPARE message for longer than the timeout period ∆. This triggers all nodes to increment their view counter and multicast a VIEW_CHANGE message for view number 1. The scheduler then delays the delivery of all VIEW_CHANGE messages to replica 1 (the leader in view 1). The view change operation for the remaining nodes times out, as they do not receive a valid NEW_VIEW message from replica 1. Nodes 0, 2, and 3 then increment their view counters to 2, and multicast another VIEW_CHANGE message. At this point, the VIEW_CHANGE messages for view 1 are delivered to replica 1, which responds by multicasting a NEW_VIEW and a PRE_PREPARE message in view 1. These messages are then delivered and subsequently ignored by all other nodes, as they have progressed to view number 2. Replica 1 will then receive the VIEW_CHANGE messages for view 2, and increment its view counter accordingly. The scheduler then delays the delivery of all VIEW_CHANGE messages to replica 2, ensuring that the view change operation of all other nodes times out again. This process will continue until the faulty replica 0 is again elected leader, at which point the scheduler will deliver all messages at an accelerated rate while replica 0 withholds the corresponding NEW_VIEW and PRE_PREPARE messages to trigger another view change and repeat this cycle. The cycle may continue indefinitely so long as the scheduler withholds VIEW_CHANGE messages from the intended non-faulty leader for longer than the (exponentially increasing) timeout interval, preventing any view changes from succeeding and stopping the protocol from making any progress, despite the fact that at time intervals where replica 0 is the leader (0∆, 8∆, 64∆, …) all non-faulty replicas are able to communicate without any interference.

间歇同步网络。

Intermittently synchronous networks.

为了更清楚地说明异步网络之间的差异，我们引入了一个新的网络性能假设：∆-间歇同步，它甚至严格弱于弱同步。

To more clearly illustrate the difference between asynchronous networks, we introduce a new network performance assumption, ∆-intermittent synchrony, which is strictly weaker than even weak synchrony.

其思想是，∆-间歇同步网络近似于∆-同步网络，即平均而言，它以1/∆的速率传递消息。

The idea is that a ∆-intermittently synchronous network approximates a ∆-synchronous network in the sense that on average it delivers messages at a rate of 1/∆.

然而，传递速率可能在时间上分布不均匀(例如“突发”)，在一些时间间隔期间根本不传递消息，而在其他时间间隔期间快速传递消息。

However, the delivery rate may be unevenly distributed in time (e.g., “bursty”), delivering no messages at all during some time intervals and delivering messages rapidly during others.

定义2。如果对于任何初始时间T0和任何持续时间D，都存在区间[T0，T1]使得 T1 − T0 ≥ D，并且在[T0，T1]期间推进的异步轮数至少为 (T1 − T0)/∆，则网络是∆-间歇同步的。

DEFINITION 2. A network is ∆-intermittently synchronous if for any initial time T0, and for any duration D, there exists an interval [T0,T1] such that T1 − T0 ≥ D and the number of asynchronous rounds advanced during [T0,T1] is at least (T1 − T0)/∆.
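To make the condition in Definition 2 concrete, the sketch below checks it for one given interval of a recorded trace. It is only an illustration: the definition quantifies over every T0 and D and asks for the existence of a suitable T1, which a finite trace cannot establish, and the helper name and trace format are assumptions.

```python
from bisect import bisect_left, bisect_right


def interval_satisfies(round_times, t0, t1, delta):
    """Check whether at least (t1 - t0) / delta asynchronous rounds advanced
    during [t0, t1].

    round_times is a sorted list of timestamps at which a round completed.
    """
    rounds = bisect_right(round_times, t1) - bisect_left(round_times, t0)
    return rounds >= (t1 - t0) / delta


if __name__ == "__main__":
    # A bursty trace: nothing for 9 time units, then 10 rounds in quick succession.
    bursty = [9 + i / 10 for i in range(10)]
    print(interval_satisfies(bursty, 0.0, 10.0, delta=1.0))
    print(interval_satisfies(bursty, 0.0, 10.0, delta=0.5))
```

The bursty trace meets the average-rate requirement for ∆ = 1 over the whole interval even though no round completes during the first nine time units, which is exactly the kind of uneven delivery the definition is meant to permit.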

很明显，每个∆-同步网络也是∆-间歇同步的，因为对于持续时间为∆的每个间隔，在该间隔之前发送的消息在该间隔结束时被传送。

It is clearly the case that every ∆-synchronous network is also ∆-intermittently synchronous, since for every interval of duration ∆, messages sent prior to that interval are delivered by the end of that interval.

同样清楚的是，任何间歇同步网络都保证最终的交付(也就是说，它不比异步模型弱)。

It is also clear that any intermittently synchronous network guarantees eventual delivery (i.e., it is no weaker than the asynchronous model).

每当传递一轮又一轮的消息时，异步协议就会取得进展。

Asynchronous protocols make progress whenever rounds of messages are delivered.

由于间歇同步网络保证消息平均在∆内传递，这意味着任何异步协议也以平均∆的速率取得进展。

Since an intermittently-synchronous network guarantees messages are delivered on average within ∆, this means any asynchronous protocol also makes progress at an average rate of ∆.

B.延期证明。

B. DEFERRED PROOFS

我们现在重申并证明最初在4.5节中陈述的定理。

We now restate and prove the theorems originally stated in Section 4.5.

定理3。(效率)。假设每个正确节点的队列包含至少B个不同的事务，则在一个时期中提交的事务的预期数量至少为 B/4，从而导致恒定的效率。

THEOREM 3. (Efficiency). Assuming each correct node’s queue contains at least B distinct transactions, then the expected number of transactions committed in an epoch is at least B/4, resulting in constant efficiency.

证明。首先，我们考虑一个实验，其中阈值加密的密文被替换为随机明文的加密。在这种情况下，对手无法获知每个诚实方所提议批次的任何信息。我们首先证明，在这个实验中，一个时期内提交的事务的预期数量至少是 $\frac{1}{4}B$。

PROOF. First, we consider an experiment where the threshold-encrypted ciphertexts are replaced with encryptions of random plaintexts. In this case, the adversary does not learn any information about the proposed batch for each honest party. We will first show that in this experiment, the expected number of transactions committed in an epoch is at least $\frac{1}{4}B$.

实验一。每个正确的节点从buf[: B]中随机选择由 B/N 个不同事务组成的子集，其中buf[: B]表示其队列中的前B个元素。对手选择 N − 2f 个正确的节点，令S表示它们所提议事务的并集。回想一下，ACS协议保证达成一致的集合至少包含 N − 2f 个正确节点提议的事务。设 $X_1$ 表示S中不同事务的数量。

Experiment 1. Each correct node selects a random subset of B/N distinct transactions from buf[: B], where buf[: B] denotes the first B elements in its queue. The adversary selects N − 2f correct nodes, and we let S denote the union of their proposed transactions; recall that the ACS protocol guarantees that the agreed set contains at least the transactions proposed by N − 2f correct nodes. Let $X_1$ denote the number of distinct transactions in S.

buf[: B]的内容可以被对抗性地选择，显然，最坏的情况是buf[: B]对所有诚实方都是相同的；因为否则E[X1]只能更大。

The contents of buf[: B] can be adversarially chosen, and clearly, the worst case is when buf[: B] is identical for all honest parties; since otherwise E[X1] can only be greater.

我们现在考虑一个稍微不同的实验：每个诚实方不再从buf[: B]中选择 B/N 个不同的元素，而是从buf[: B]中有放回地选择 B/N 个元素。在这个随机过程中，达成一致的集合中不同元素的预期数量只会更小。另请注意，由于 N > 3f，我们有 (N − 2f)(B/N) > B/3。因此，我们将使用以下更简单的实验来给出实验1中达成一致的集合里不同元素数量的界：

We now consider a slightly different experiment where, instead of choosing B/N distinct elements from buf[: B], each honest party chooses a set of B/N elements from buf[: B] with replacement. The expected number of distinct elements in the agreed set can only be smaller in this stochastic process. Also note that we can bound (N − 2f)(B/N) > B/3 since N > 3f. Therefore, we will bound the number of distinct items in the agreed set in Experiment 1 with the following, much simpler experiment:

实验二。将 B/3 个球投向B个箱子。令 $X_2$ 表示至少有一个球的箱子的数量。显然，$E[X_2] \le E[X_1]$。

我们现在给出 $E[X_2]$ 的界。因为每个箱子为空的概率是 $(1 - \frac{1}{B})^{B/3}$，所以至少有一个球的箱子的期望数量是 $E[X_2] = B\left(1 - (1 - \frac{1}{B})^{B/3}\right) > B(1 - e^{-1/3}) > \frac{1}{4}B$。

Experiment 2. Throw B/3 balls at B bins. Let $X_2$ denote the number of bins with at least one ball. Clearly, $E[X_2] \le E[X_1]$.

We now bound $E[X_2]$. Since for each bin, the probability of being empty is $(1 - \frac{1}{B})^{B/3}$, the expected number of bins with at least one ball is $E[X_2] = B\left(1 - (1 - \frac{1}{B})^{B/3}\right) > B(1 - e^{-1/3}) > \frac{1}{4}B$.
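The balls-in-bins bound is easy to check numerically. The sketch below evaluates the closed form for E[X2] and runs a small Monte Carlo version of Experiment 1 in its worst case (identical queues). Since the adversary learns nothing about the proposals in this experiment, the simulation simply takes the first N − 2f nodes; the function names, trial count and seed are illustrative choices.

```python
import random


def expected_nonempty_bins(b: int) -> float:
    """E[X2] when B/3 balls are thrown uniformly at random at B bins."""
    return b * (1 - (1 - 1 / b) ** (b / 3))


def simulate_experiment1(n, f, b, trials=200, seed=0):
    """Average number of distinct transactions proposed by N - 2f nodes that
    each sample B/N distinct items from the same queue of B transactions."""
    rng = random.Random(seed)
    queue = range(b)  # worst case: every honest queue is identical
    total = 0
    for _ in range(trials):
        agreed = set()
        for _ in range(n - 2 * f):
            agreed.update(rng.sample(queue, b // n))
        total += len(agreed)
    return total / trials


if __name__ == "__main__":
    for b in (30, 300, 3000):
        print(b, expected_nonempty_bins(b), b / 4)
    print(simulate_experiment1(n=4, f=1, b=400), 400 / 4)
```

Both quantities stay above B/4, as the proof requires; the simulated value for Experiment 1 is larger than the balls-in-bins bound because sampling without replacement within each node can only increase the number of distinct transactions.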

我们现在断言，当密文用真实加密而非随机加密来实例化时，任何多项式时间的对手都无法使一个时期中提交事务的预期数量小于 B/4。我们可以用反证法证明这一点。如果某个多项式时间对手A可以使期望值为 B/4 或更小，那么我们可以构造一个区分器D，它通过让A运行 $\Omega(\lambda)$ 个时期来区分随机密文和真实密文。如果这些时期内提交事务的平均数量小于 $\frac{1}{4}B$，D猜测密文是真实的；否则它猜测密文是随机的。根据标准的Hoeffding界，D以 $1 - \exp(-\Omega(\lambda))$ 的概率成功。注意，我们只依赖于底层阈值加密方案的语义安全性(即IND-CPA)(而不是依赖于像IND-CCA2这样更强的定义)；这是因为在ACS子协议完成之前，对手不能解密该时期内的任何密文。

We now claim that when the ciphertexts are instantiated with real encryptions rather than random ones, no polynomial-time adversary can cause the expected number of committed transactions in an epoch to be smaller than B/4. We can prove this by contradiction. If some polynomial-time adversary A can cause the expectation to be B/4 or smaller, then we can construct a distinguisher D that can distinguish random vs. real ciphertexts by running A for $\Omega(\lambda)$ many epochs. If the average number of transactions across these epochs is smaller than $\frac{1}{4}B$, D guesses that the ciphertexts are real; otherwise it guesses they are random. By a standard Hoeffding bound, D succeeds with $1 - \exp(-\Omega(\lambda))$ probability. Note that we rely only on the semantic security (i.e., IND-CPA) of the underlying threshold encryption scheme (not on a stronger definition like IND-CCA2); this is because the adversary cannot decrypt any ciphertexts in an epoch until the ACS subprotocol completes.

在每个时期的开始，每个正确的节点处于两种状态之一：要么(类型1) tx出现在其队列的前部(即前B个元素中)，要么(类型2)其队列中排在tx前面的元素多于B个。

主要思想是，在每个时期中，对手必须包含至少 $\lceil N/6 \rceil$ 个类型1节点的提议(类型1时期)，或至少 $\lceil N/6 \rceil$ 个类型2节点的提议(类型2时期)。在类型1时期中，tx以至少 $1 - e^{-1/6}$ 的概率被提交。显然，经过 O(λ) 个这样的时期之后，tx很可能已经被提交。而在类型2时期中，我们预期从初始积压中清除至少 $B(1 - e^{-1/6})$ 个事务。

因此，我们将证明，在 O(T/B + λ) 个类型2时期之后，所有T个事务都将以高概率被提交。

引理1。在最多 O(T/B + λ) 个类型2时期之后，积压中的T个事务将以高概率被提交。

At the beginning of each epoch, each correct node can be in one of two states: either (Type 1) tx appears in the front of its queue (i.e., the first B elements), or else (Type 2) its queue has more than B elements placed in front of tx.

The main idea is that in each epoch the adversary must include the proposals of either at least $\lceil N/6 \rceil$ Type 1 nodes (a Type 1 epoch), or at least $\lceil N/6 \rceil$ Type 2 nodes (a Type 2 epoch). In a Type 1 epoch, tx is committed with probability at least $1 - e^{-1/6}$. Clearly after O(λ) such epochs, tx will likely have been committed. However, in a Type 2 epoch, we expect to clear at least $B(1 - e^{-1/6})$ transactions from the initial backlog.

We will therefore show that after O(T/B + λ) Type 2 epochs, with high probability all T transactions will have been committed.

LEMMA 1. After at most O(T/B + λ) Type 2 epochs, T transactions from the backlog will have been committed with high probability.

设 ε > 0 是一个常数，我们将用它作为尾界分析的安全裕度。设X表示如上所述的k个时期后提交的事务总数。利用定理3的期望分析，X的期望值满足 $E[X] \ge \frac{kB}{8}$。

我们选择等待的时期数为 $k = \max\left(\lambda, \frac{8T}{(1 - \varepsilon)B}\right)$，以确保 $k \ge \lambda$ 且 $E[X] - T \ge \varepsilon E[X]$。

虽然对手可能将其行为从一个时期关联到下一个时期，但 E[X] 的界仅取决于各方的随机选择，而这些选择是相互独立的。因此，利用Hoeffding不等式，我们有

[公式图片]

从而得到我们想要的界。

Let ε > 0 be a constant, which we will use as a safety margin for our tail bound analysis. Let X denote total number of committed transactions after k epochs as described. Using the expectation analysis from Theorem 3, the expected value of X is E[X] ≥ kB8 .

We choose the number of epochs to wait as k = max(λ, 8T (1−ε)B ), which ensures that k ≥ λ and that E[X] − T ≥ εE[X].

Although the adversary may correlate its behavior from one epoch to the next, the bound on E[X] depends only on the random choices of the parties, which are independent. Therefore using Hoeffding’s inequality, we have
[Equation image not recovered: the Hoeffding tail bound on X]
giving us the desired bound.
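The displayed inequality is not available in this copy, so the following is a reconstruction from the surrounding definitions rather than the paper's exact formula. It assumes that X is a sum of k independent per-epoch counts, each lying in [0, B], and applies the standard one-sided Hoeffding bound $\Pr[E[X] - X \ge t] \le \exp(-2t^2 / (kB^2))$ with $t = \varepsilon E[X]$; the constants in the original may differ.

```latex
\begin{align*}
\Pr[X < T]
  &= \Pr\big[E[X] - X > E[X] - T\big] \\
  &\le \Pr\big[E[X] - X \ge \varepsilon E[X]\big]
     && \text{(since } E[X] - T \ge \varepsilon E[X]\text{)} \\
  &\le \exp\!\left(-\frac{2\varepsilon^{2} E[X]^{2}}{k B^{2}}\right)
     && \text{(Hoeffding)} \\
  &\le \exp\!\left(-\frac{2\varepsilon^{2} (kB/8)^{2}}{k B^{2}}\right)
   = \exp\!\left(-\frac{\varepsilon^{2} k}{32}\right)
   \le \exp\!\left(-\frac{\varepsilon^{2} \lambda}{32}\right)
     && \text{(since } E[X] \ge kB/8,\ k \ge \lambda\text{)}.
\end{align*}
```

The failure probability is therefore negligible in λ for any fixed ε, which is the sense of "with high probability" in Lemma 1.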

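图11:基于公共硬币的二进制拜占庭协议(图片缺失)。

注意，在算法中，b的取值范围为{0，1}。该协议使用了一系列公共硬币，记为$\mathrm{Coin}_r$。

Figure 11: Binary Byzantine agreement from a common coin (image missing). Note that in the algorithm, b ranges over {0, 1}. The protocol makes use of a sequence of common coins, labeled $\mathrm{Coin}_r$.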

C.异步二进制拜占庭协议。

C. ASYNCHRONOUS BINARY BYZANTINE AGREEMENT

从公共硬币实现二进制协议。二进制协议允许节点对单个比特的值达成一致。更正式地说，二进制协议保证三个性质:

• (协定)如果任何正确的节点输出比特b，那么每个正确的节点都输出b。

• (终止)如果所有正确的节点都接收到输入，那么每个正确的节点都输出一个比特。

• (有效性)如果任何正确的节点输出b，则至少有一个正确的节点接收到b作为输入。

Realizing binary agreement from a common coin. Binary agreement allows nodes to agree on the value of a single bit. More formally, binary agreement guarantees three properties:

• (Agreement) If any correct node outputs the bit b, then every correct node outputs b.

• (Termination) If all correct nodes receive input, then every correct node outputs a bit.

• (Validity) If any correct node outputs b, then at least one correct node received b as input.

有效性属性意味着一致性:如果所有正确的节点接收相同的输入值b，那么b必须是决定的值。

另一方面，如果在任何时候两个节点接收到不同的输入，那么对手可能甚至在其余节点接收到输入之前就强制决定任一值。

我们用一个基于加密公共硬币的协议来实例化这个原语，它本质上充当同步小工具。对手只有在大多数正确的节点参与投票后才知道下一枚硬币的价值——如果硬币与大多数投票相匹配，那么这就是决定的价值。对手可以影响每一轮的多数票，但只能到硬币被揭开为止。

Mostéfaoui等人[42]的拜占庭协议算法如图11所示。它的预期运行时间为O(1)，实际上以$1 - 2^{-k}$的概率在O(k)轮内完成。当使用下面定义的公共硬币实例化时，总通信复杂度为$O(\lambda N^2)$，因为它使用恒定数量的公共硬币。

The validity property implies unanimity: if all of the correct nodes receive the same input value b, then b must be the decided value.

On the other hand, if at any point two nodes receive different inputs, then the adversary may force the decision to either value even before the remaining nodes receive input.

We instantiate this primitive with a protocol based on a cryptographic common coin, which essentially acts as a synchronizing gadget. The adversary only learns the value of the next coin after a majority of correct nodes have committed to a vote; if the coin matches the majority vote, then that is the decided value. The adversary can influence the majority vote each round, but only until the coin is revealed.

The Byzantine agreement algorithm from Mostéfaoui et al. [42] is shown in Figure 11. Its expected running time is O(1), and in fact it completes within O(k) rounds with probability $1 - 2^{-k}$. When instantiated with the common coin defined below, the total communication complexity is $O(\lambda N^2)$, since it uses a constant number of common coins.
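The role of the coin can be illustrated with a small sketch of the end-of-round rule only. This is a simplification under stated assumptions: the message exchange that produces the set `vals` of candidate bits is omitted, the coin is replaced by a local random bit, and the names `end_of_round` and `rounds_until_decision` are hypothetical. The rule used here is that a node holding a single candidate b keeps b and decides it when the coin equals b, while a node holding both bits adopts the coin as its next estimate.

```python
import random
from typing import Optional, Set, Tuple

def end_of_round(vals: Set[int], coin_bit: int) -> Tuple[int, Optional[int]]:
    """Return (next_estimate, decided_value or None) for one node."""
    if len(vals) == 1:
        (b,) = vals
        # A single candidate survives: keep it, decide only if the coin agrees.
        return b, (b if b == coin_bit else None)
    # Both bits are candidates: fall back to the coin, no decision this round.
    return coin_bit, None

def rounds_until_decision(b: int, rng: random.Random, max_rounds: int = 64) -> Optional[int]:
    """Unanimous case: every node's candidate set is {b}; count rounds until the coin matches."""
    est = b
    for r in range(1, max_rounds + 1):
        est, decided = end_of_round({est}, rng.getrandbits(1))
        if decided is not None:
            return r
    return None

if __name__ == "__main__":
    rng = random.Random(2024)
    samples = [rounds_until_decision(1, rng) for _ in range(10_000)]
    within_3 = sum(1 for r in samples if r is not None and r <= 3) / len(samples)
    print(f"fraction deciding within 3 rounds: {within_3:.3f}")
```

In this unanimous case each round succeeds independently with probability 1/2, so the fraction deciding within k rounds should be close to $1 - 2^{-k}$.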

从门限签名方案实现公共硬币。公共硬币是满足以下性质的协议:

• 如果f + 1方调用GetCoin()，那么各方最终都会收到一个公共值s。

• 值s在$\{0,1\}^\lambda$范围内均匀采样，并且不会受到对手的影响。

• 在至少一方调用GetCoin()之前，不会向对手透露有关s的任何信息。

按照Cachin等人[16]的思路，公共硬币可以由一个唯一的门限签名方案实现。(N，f)-门限签名方案将签名密钥的份额$sk_i$分发给N方中的每一方。持有密钥份额$sk_i$的一方可以计算任意消息m上的签名份额。给定消息m的f + 1个这样的签名份额，任何人都可以组合这些份额以产生有效的签名，该签名可在公钥pk下验证。对于少于f + 1的份额(即，除非至少一个诚实方故意计算并披露份额)，对手什么也学不到。我们依赖于一个额外的唯一性属性，它保证对于一个给定的公钥pk，在每个消息m上正好存在一个有效的签名。

Cachin等人[16]的想法是简单地使用门限签名作为随机位的来源，通过签署一个作为硬币“名称”的字符串。这自然允许该协议用于生成硬币序列(或随机访问表)，并使其便于在模块化子协议中使用。

Realizing a common coin from a threshold signature scheme. A common coin is a protocol that satisfies the following properties:

• If f + 1 parties call GetCoin(), then all parties eventually receive a common value, s.

• The value s is uniformly sampled in the range $\{0,1\}^\lambda$, and cannot be influenced by the adversary.

• Until at least one party calls GetCoin(), no information about s is revealed to the adversary.

Following Cachin et al. [16], a common coin can be realized from a unique threshold signature scheme. An (N, f)-threshold signature scheme involves distributing a share $sk_i$ of a signing key to each of N parties. A party holding share $sk_i$ can compute a signature share on an arbitrary message m. Given f + 1 such signature shares for message m, anyone can combine the shares to produce a valid signature, which verifies under the public key pk. With fewer than f + 1 shares (i.e., unless at least one honest party deliberately computes and reveals a share), the adversary learns nothing. We rely on an additional uniqueness property, which guarantees that for a given public key pk, there exists exactly one valid signature on each message m.

The idea of Cachin et al. [16] is simply to use the threshold signature as a source of random bits, by signing a string that serves as the “name” of the coin. This naturally allows the protocol to be used to generate a sequence (or random-access table) of coins, and makes it convenient to use in modular subprotocols.
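The structure of this construction can be sketched with a deliberately insecure toy. Everything below is an assumption made for illustration: the "signature" is just the shared key times a hash of the coin name in a prime field, so there is no public verification and no security, and the names `deal`, `sign_share`, `combine`, and `coin_bit` are hypothetical. What the sketch does show is the uniqueness property in action: any f + 1 shares combine (by Lagrange interpolation at zero) to the same value, so every party derives the same coin.

```python
import hashlib
import random
from typing import Dict, Tuple

P = 2**127 - 1  # prime modulus of the toy field (illustrative choice)

def h_to_field(name: bytes) -> int:
    """Hash the coin's name to a field element."""
    return int.from_bytes(hashlib.sha256(name).digest(), "big") % P

def deal(n: int, f: int, rng: random.Random) -> Tuple[int, Dict[int, int]]:
    """Shamir-share a random key with a degree-f polynomial; returns (sk, {i: sk_i})."""
    coeffs = [rng.randrange(1, P) for _ in range(f + 1)]  # coeffs[0] is the key
    def poly(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation, highest degree first
            acc = (acc * x + c) % P
        return acc
    return coeffs[0], {i: poly(i) for i in range(1, n + 1)}

def sign_share(sk_i: int, name: bytes) -> int:
    """Toy signature share: sk_i * H(name) mod P (NOT a real signature)."""
    return sk_i * h_to_field(name) % P

def combine(shares: Dict[int, int]) -> int:
    """Lagrange-interpolate at x = 0 from f + 1 (index -> share) pairs."""
    total = 0
    for i, s_i in shares.items():
        num, den = 1, 1
        for j in shares:
            if j != i:
                num = num * (-j) % P
                den = den * (i - j) % P
        total = (total + s_i * num * pow(den, -1, P)) % P
    return total

def coin_bit(sig: int) -> int:
    """Derive one coin bit from the combined value."""
    return hashlib.sha256(sig.to_bytes(16, "big")).digest()[0] & 1

if __name__ == "__main__":
    rng = random.Random(1)
    sk, sk_shares = deal(n=4, f=1, rng=rng)
    name = b"coin-round-1"
    from_12 = combine({i: sign_share(sk_shares[i], name) for i in (1, 2)})
    from_34 = combine({i: sign_share(sk_shares[i], name) for i in (3, 4)})
    print(from_12 == from_34, coin_bit(from_12))
```

A real deployment would replace the toy share and combine steps with a pairing-based unique threshold signature, where each share and the combined signature can be verified against public keys.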

图12:基于门限签名的公共硬币[48](图片缺失)

Figure 12: A common coin based on threshold signatures [48] (image missing)

我们假设ThresholdCombine是健壮的，也就是说，如果它用一组多于f + 1个签名份额运行，它会拒绝任何无效的份额。特别地，如果提供了2f + 1个份额，则其中必定包含f + 1个有效份额的子集。在实践中，以这种方式检测到的任何不正确的份额都可以用作指控节点的证据。

具体地，我们使用基于双线性群和Gap Diffie-Hellman假设的高效门限签名方案[11]。我们用TSIG来指代这个方案。公共硬币只需要一轮异步通信即可完成，每个节点的通信开销为O(Nλ)。

We assume that ThresholdCombine is robust, in the sense that if it is run with a set of more than f + 1 signature shares, it rejects any invalid ones. In particular, if 2f + 1 shares are provided, a valid subset of f + 1 is certainly among them, since at most f shares come from faulty nodes. In practice, any incorrect shares detected this way can be used as evidence to incriminate a node.

Concretely, we use an efficient threshold scheme [11] based on bilinear groups and the Gap Diffie-Hellman assumption. We use TSIG to refer to this scheme. The common coin requires only one asynchronous round to complete, and the communication cost is O(Nλ) per node.