The Hyperledger Fabric architecture delivers the following advantages:
- Chaincode trust flexibility. The architecture separates trust assumptions for chaincodes (blockchain applications) from trust assumptions for ordering. In other words, the ordering service may be provided by one set of nodes (orderers) and tolerate some of them failing or misbehaving, while the endorsers may be different for each chaincode.
- Scalability. As the endorser nodes responsible for a particular chaincode are orthogonal to the orderers, the system may scale better than if these functions were done by the same nodes. In particular, this results when different chaincodes specify disjoint endorsers, which introduces a partitioning of chaincodes between endorsers and allows parallel chaincode execution (endorsement). Besides, chaincode execution, which can potentially be costly, is removed from the critical path of the ordering service.
- Confidentiality. The architecture facilitates deployment of chaincodes that have confidentiality requirements with respect to the content and state updates of their transactions.
- Consensus modularity. The architecture is modular and allows pluggable consensus (i.e., ordering service) implementations.
Part I: Elements of the architecture relevant to Hyperledger Fabric v1
System architecture
Basic workflow of transaction endorsement
Endorsement policies
Part II: Post-v1 elements of the architecture
Ledger checkpointing (pruning)
1. System architecture
The blockchain is a distributed system consisting of many nodes that communicate with each other. The blockchain runs programs called chaincode, holds state and ledger data, and executes transactions. The chaincode is the central element as transactions are operations invoked on the chaincode. Transactions have to be “endorsed” and only endorsed transactions may be committed and have an effect on the state. There may exist one or more special chaincodes for management functions and parameters, collectively called system chaincodes.
1.1. Transactions
Transactions may be of two types:
- Deploy transactions create new chaincode and take a program as parameter. When a deploy transaction executes successfully, the chaincode has been installed “on” the blockchain.
- Invoke transactions perform an operation in the context of previously deployed chaincode. An invoke transaction refers to a chaincode and to one of its provided functions. When successful, the chaincode executes the specified function, which may involve modifying the corresponding state and returning an output.
As described later, deploy transactions are special cases of invoke transactions: a deploy transaction that creates new chaincode corresponds to an invoke transaction on a system chaincode.
Remark: This document currently assumes that a transaction either creates new chaincode or invokes an operation provided by one already deployed chaincode. This document does not yet describe: a) optimizations for query (read-only) transactions (included in v1), b) support for cross-chaincode transactions (post-v1 feature).
1.2. Blockchain datastructures
1.2.1. State
The latest state of the blockchain (or, simply, state) is modeled as a versioned key-value store (KVS), where keys are names and values are arbitrary blobs. These entries are manipulated by the chaincodes (applications) running on the blockchain through put and get KVS operations. The state is stored persistently and updates to the state are logged. Note that the versioned KVS is adopted as the state model; an implementation may use actual KVSs, but also RDBMSs or any other solution.
More formally, state s is modeled as an element of a mapping K -> (V X N), where:
- K is a set of keys
- V is a set of values
- N is an infinite ordered set of version numbers. Injective function next: N -> N takes an element of N and returns the next version number.
Both V and N contain a special element ⊥ (empty type), which is, in the case of N, the lowest element. Initially all keys are mapped to (⊥, ⊥). For s(k)=(v,ver) we denote v by s(k).value, and ver by s(k).version.
KVS operations are modeled as follows:
- put(k,v) for k ∈ K and v ∈ V, takes the blockchain state s and changes it to s' such that s'(k)=(v,next(s(k).version)) with s'(k')=s(k') for all k'!=k.
- get(k) returns s(k).
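The state model above can be illustrated with a minimal Python sketch. It is not taken from the Fabric code base: the class name VersionedKVS, the use of None for ⊥, and the choice of integers with next(⊥)=0 and next(n)=n+1 as version numbers are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

BOTTOM = None  # stands in for the special element ⊥ (assumption)


def next_version(ver: Optional[int]) -> int:
    """Injective next: N -> N; here ⊥ maps to 0 and n maps to n + 1."""
    return 0 if ver is BOTTOM else ver + 1


class VersionedKVS:
    """State s: K -> (V x N); keys never written map to (⊥, ⊥)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Optional[bytes], Optional[int]]] = {}

    def get(self, k: str) -> Tuple[Optional[bytes], Optional[int]]:
        # get(k) returns s(k) = (value, version)
        return self._entries.get(k, (BOTTOM, BOTTOM))

    def put(self, k: str, v: bytes) -> None:
        # s'(k) = (v, next(s(k).version)); every other key is left unchanged
        _, ver = self.get(k)
        self._entries[k] = (v, next_version(ver))


kvs = VersionedKVS()
kvs.put("a", b"1")
kvs.put("a", b"2")
```

After the two put calls, key "a" holds the second value and its version has advanced twice from ⊥, while any key never written still maps to (⊥, ⊥).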
State is maintained by peers, but not by orderers and clients.
State partitioning. Keys in the KVS can be recognized from their name to belong to a particular chaincode, in the sense that only transactions of a certain chaincode may modify the keys belonging to this chaincode. In principle, any chaincode can read the keys belonging to other chaincodes. Support for cross-chaincode transactions that modify the state belonging to two or more chaincodes is a post-v1 feature.
1.2.2. Ledger
The ledger provides a verifiable history of all successful state changes (we talk about valid transactions) and unsuccessful attempts to change state (we talk about invalid transactions), occurring during the operation of the system.
The ledger is constructed by the ordering service (see Sec 1.3.3) as a totally ordered hashchain of blocks of (valid or invalid) transactions. The hashchain imposes the total order of blocks in a ledger and each block contains an array of totally ordered transactions. This imposes total order across all transactions.
The ledger is kept at all peers and, optionally, at a subset of orderers. In the context of an orderer we refer to the ledger as the OrdererLedger, whereas in the context of a peer we refer to the ledger as the PeerLedger. PeerLedger differs from the OrdererLedger in that peers locally maintain a bitmask that tells apart valid transactions from invalid ones (see Section XX for more details).
Peers may prune PeerLedger as described in Section XX (post-v1 feature). Orderers maintain OrdererLedger for fault-tolerance and availability (of the PeerLedger) and may decide to prune it at any time, provided that properties of the ordering service (see Sec. 1.3.3) are maintained.
The ledger allows peers to replay the history of all transactions and to reconstruct the state. Therefore, state as described in Sec 1.2.1 is an optional datastructure.
1.3. Nodes
Nodes are the communication entities of the blockchain. A “node” is only a logical function in the sense that multiple nodes of different types can run on the same physical server. What counts is how nodes are grouped in “trust domains” and associated to logical entities that control them.
There are three types of nodes:
- Client or submitting-client: a client that submits an actual transaction-invocation to the endorsers, and broadcasts transaction-proposals to the ordering service.
- Peer: a node that commits transactions and maintains the state and a copy of the ledger (see Sec. 1.2). Besides, peers can have a special endorser role.
- Ordering-service-node or orderer: a node running the communication service that implements a delivery guarantee, such as atomic or total order broadcast.
The types of nodes are explained next in more detail.
1.3.1. Client
The client represents the entity that acts on behalf of an end-user. It must connect to a peer for communicating with the blockchain. The client may connect to any peer of its choice. Clients create and thereby invoke transactions.
As detailed in Section 2, clients communicate with both peers and the ordering service.
1.3.2. Peer
A peer receives ordered state updates in the form of blocks from the ordering service and maintains the state and the ledger.
Peers can additionally take up a special role of an endorsing peer, or an endorser. The special function of an endorsing peer occurs with respect to a particular chaincode and consists in endorsing a transaction before it is committed. Every chaincode may specify an endorsement policy that may refer to a set of endorsing peers. The policy defines the necessary and sufficient conditions for a valid transaction endorsement (typically a set of endorsers’ signatures), as described later in Sections 2 and 3. In the special case of deploy transactions that install new chaincode, the (deployment) endorsement policy is specified as an endorsement policy of the system chaincode.
1.3.3. Ordering service nodes (Orderers)
The orderers form the ordering service, i.e., a communication fabric that provides delivery guarantees. The ordering service can be implemented in different ways, ranging from a centralized service (used, e.g., in development and testing) to distributed protocols that target different network and node fault models.
The ordering service provides a shared communication channel to clients and peers, offering a broadcast service for messages containing transactions. Clients connect to the channel and may broadcast messages on the channel which are then delivered to all peers. The channel supports atomic delivery of all messages, that is, message communication with total-order delivery and (implementation specific) reliability. In other words, the channel outputs the same messages to all connected peers and outputs them to all peers in the same logical order. This atomic communication guarantee is also called total-order broadcast, atomic broadcast, or consensus in the context of distributed systems. The communicated messages are the candidate transactions for inclusion in the blockchain state.
Partitioning (ordering service channels). The ordering service may support multiple channels similar to the topics of a publish/subscribe (pub/sub) messaging system. Clients can connect to a given channel and can then send messages and obtain the messages that arrive. Channels can be thought of as partitions: clients connecting to one channel are unaware of the existence of other channels, but clients may connect to multiple channels. Even though some ordering service implementations included with Hyperledger Fabric support multiple channels, for simplicity of presentation, in the rest of this document we assume the ordering service consists of a single channel/topic.
Ordering service API. Peers connect to the channel provided by the ordering service, via the interface provided by the ordering service. The ordering service API consists of two basic operations (more generally asynchronous events):
TODO add the part of the API for fetching particular blocks under client/peer specified sequence numbers.
- broadcast(blob): a client calls this to broadcast an arbitrary message blob for dissemination over the channel. This is also called request(blob) in the BFT context, when sending a request to a service.
- deliver(seqno, prevhash, blob): the ordering service calls this on the peer to deliver the message blob with the specified non-negative integer sequence number (seqno) and hash of the most recently delivered blob (prevhash). In other words, it is an output event from the ordering service. deliver() is also sometimes called notify() in pub-sub systems or commit() in BFT systems.
Ledger and block formation. The ledger (see also Sec. 1.2.2) contains all data output by the ordering service. In a nutshell, it is a sequence of deliver(seqno, prevhash, blob) events, which form a hash chain according to the computation of prevhash described before.
Most of the time, for efficiency reasons, instead of outputting individual transactions (blobs), the ordering service will group (batch) the blobs and output blocks within a single deliver event. In this case, the ordering service must impose and convey a deterministic ordering of the blobs within each block. The number of blobs in a block may be chosen dynamically by an ordering service implementation.
In the following, for ease of presentation, we define ordering service properties (rest of this subsection) and explain the workflow of transaction endorsement (Section 2) assuming one blob per deliver event. These are easily extended to blocks, assuming that a deliver event for a block corresponds to a sequence of individual deliver events for each blob within a block, according to the above mentioned deterministic ordering of blobs within a block.
Ordering service properties
The guarantees of the ordering service (or atomic-broadcast channel) stipulate what happens to a broadcasted message and what relations exist among delivered messages. These guarantees are as follows:
- Safety (consistency guarantees): As long as peers are connected for sufficiently long periods of time to the channel (they can disconnect or crash, but will restart and reconnect), they will see an identical series of delivered (seqno, prevhash, blob) messages. This means the outputs (deliver() events) occur in the same order on all peers and according to sequence number and carry identical content (blob and prevhash) for the same sequence number. Note this is only a logical order, and a deliver(seqno, prevhash, blob) on one peer is not required to occur in any real-time relation to deliver(seqno, prevhash, blob) that outputs the same message at another peer. Put differently, given a particular seqno, no two correct peers deliver different prevhash or blob values. Moreover, no value blob is delivered unless some client (peer) actually called broadcast(blob) and, preferably, every broadcasted blob is only delivered once.
  Furthermore, the deliver() event contains the cryptographic hash of the data in the previous deliver() event (prevhash). When the ordering service implements atomic broadcast guarantees, prevhash is the cryptographic hash of the parameters from the deliver() event with sequence number seqno-1. This establishes a hash chain across deliver() events, which is used to help verify the integrity of the ordering service output, as discussed in Sections 4 and 5 later. In the special case of the first deliver() event, prevhash has a default value.
- Liveness (delivery guarantee): Liveness guarantees of the ordering service are specified by an ordering service implementation. The exact guarantees may depend on the network and node fault model.
  In principle, if the submitting client does not fail, the ordering service should guarantee that every correct peer that connects to the ordering service eventually delivers every submitted transaction.
To summarize, the ordering service ensures the following properties:
- Agreement. For any two events at correct peers deliver(seqno, prevhash0, blob0) and deliver(seqno, prevhash1, blob1) with the same seqno, prevhash0==prevhash1 and blob0==blob1.
- Hashchain integrity. For any two events at correct peers deliver(seqno-1, prevhash0, blob0) and deliver(seqno, prevhash, blob), prevhash = HASH(seqno-1||prevhash0||blob0).
- No skipping. If an ordering service outputs deliver(seqno, prevhash, blob) at a correct peer p, such that seqno>0, then p already delivered an event deliver(seqno-1, prevhash0, blob0).
- No creation. Any event deliver(seqno, prevhash, blob) at a correct peer must be preceded by a broadcast(blob) event at some (possibly distinct) peer.
- No duplication (optional, yet desirable). For any two events broadcast(blob) and broadcast(blob'), when two events deliver(seqno0, prevhash0, blob) and deliver(seqno1, prevhash1, blob') occur at correct peers and blob == blob', then seqno0==seqno1 and prevhash0==prevhash1.
- Liveness. If a correct client invokes an event broadcast(blob) then every correct peer “eventually” issues an event deliver(*, *, blob), where * denotes an arbitrary value.
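The hashchain integrity and no-skipping properties lend themselves to a small local check that a peer could run over its own stream of deliver events. The following Python sketch assumes SHA-256 as HASH, an 8-byte big-endian encoding of seqno, and an all-zero default prevhash for the first event; none of these choices are specified by this document, and the toy order function merely stands in for a real ordering service.

```python
import hashlib
from typing import List, Tuple

Deliver = Tuple[int, bytes, bytes]  # (seqno, prevhash, blob)
GENESIS_PREVHASH = b"\x00" * 32  # assumed default prevhash of the first deliver()


def chain_hash(seqno: int, prevhash: bytes, blob: bytes) -> bytes:
    # HASH(seqno || prevhash || blob) with an assumed fixed-width seqno encoding
    return hashlib.sha256(seqno.to_bytes(8, "big") + prevhash + blob).digest()


def order(blobs: List[bytes]) -> List[Deliver]:
    """Toy single-node orderer: turn broadcast blobs into deliver events."""
    events: List[Deliver] = []
    prevhash = GENESIS_PREVHASH
    for seqno, blob in enumerate(blobs):
        events.append((seqno, prevhash, blob))
        prevhash = chain_hash(seqno, prevhash, blob)
    return events


def check_stream(events: List[Deliver]) -> bool:
    """Check no-skipping and hashchain integrity on one peer's event stream."""
    expected_prevhash = GENESIS_PREVHASH
    for expected_seqno, (seqno, prevhash, blob) in enumerate(events):
        if seqno != expected_seqno or prevhash != expected_prevhash:
            return False
        expected_prevhash = chain_hash(seqno, prevhash, blob)
    return True


stream = order([b"tx0", b"tx1", b"tx2"])
tampered = list(stream)
tampered[1] = (1, tampered[1][1], b"evil")  # replace the blob of event 1
```

Replacing a blob in the middle of the stream is detected at the following event, whose prevhash no longer matches. Agreement, no creation and liveness involve several nodes and cannot be checked from a single stream in this way.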
2. Basic workflow of transaction endorsement
In the following we outline the high-level request flow for a transaction.
Remark: Notice that the following protocol does not assume that all transactions are deterministic, i.e., it allows for non-deterministic transactions.
2.1. The client creates a transaction and sends it to endorsing peers of its choice
To invoke a transaction, the client sends a PROPOSE message to a set of endorsing peers of its choice (possibly not at the same time - see Sections 2.1.2. and 2.3.). The set of endorsing peers for a given chaincodeID is made available to the client via a peer, which in turn knows the set of endorsing peers from the endorsement policy (see Section 3). For example, the transaction could be sent to all endorsers of a given chaincodeID. That said, some endorsers could be offline, and others may object and choose not to endorse the transaction. The submitting client tries to satisfy the policy expression with the endorsers available.
In the following, we first detail the PROPOSE message format and then discuss possible patterns of interaction between submitting client and endorsers.
2.1.1. PROPOSE message format
The format of a PROPOSE message is <PROPOSE,tx,[anchor]>, where tx is a mandatory and anchor an optional argument, explained in the following.
- tx=<clientID,chaincodeID,txPayload,timestamp,clientSig>, where
  - clientID is an ID of the submitting client,
  - chaincodeID refers to the chaincode to which the transaction pertains,
  - txPayload is the payload containing the submitted transaction itself,
  - timestamp is a monotonically increasing (for every new transaction) integer maintained by the client,
  - clientSig is the signature of the client on the other fields of tx.
  The details of txPayload will differ between invoke transactions and deploy transactions (i.e., invoke transactions referring to a deploy-specific system chaincode). For an invoke transaction, txPayload would consist of two fields txPayload = <operation, metadata>, where
  - operation denotes the chaincode operation (function) and arguments,
  - metadata denotes attributes related to the invocation.
  For a deploy transaction, txPayload would consist of three fields txPayload = <source, metadata, policies>, where
  - source denotes the source code of the chaincode,
  - metadata denotes attributes related to the chaincode and application,
  - policies contains policies related to the chaincode that are accessible to all peers, such as the endorsement policy. Note that endorsement policies are not supplied with txPayload in a deploy transaction, but txPayload of a deploy contains the endorsement policy ID and its parameters (see Section 3).
- anchor contains read version dependencies, or more specifically, key-version pairs (i.e., anchor is a subset of KxN), that bind or “anchor” the PROPOSE request to specified versions of keys in a KVS (see Section 1.2.). If the client specifies the anchor argument, an endorser endorses a transaction only if the read version numbers of the corresponding keys in its local KVS match anchor (see Section 2.2. for more details).
The cryptographic hash of tx is used by all nodes as a unique transaction identifier tid (i.e., tid=HASH(tx)). The client stores tid in memory and waits for responses from endorsing peers.
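As an illustration of tid=HASH(tx), the Python sketch below derives an identifier from the five fields of tx. The serialization (JSON with sorted keys), the use of SHA-256, and the placeholder string in clientSig are assumptions; this document does not fix a wire format or a hash function.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Tx:
    clientID: str
    chaincodeID: str
    txPayload: str
    timestamp: int   # monotonically increasing per client
    clientSig: str   # placeholder; a real client signs the other fields


def tid(tx: Tx) -> str:
    # A canonical encoding (sorted keys, fixed separators) ensures that
    # every node hashes exactly the same bytes for the same tx.
    encoded = json.dumps(asdict(tx), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


tx = Tx("client1", "cc1", "transfer(a,b,10)", 1, "sig-placeholder")
retry = Tx("client1", "cc1", "transfer(a,b,10)", 2, "sig-placeholder")
```

Because timestamp is part of tx, resubmitting the same payload with a new timestamp produces a different tid, so a retried transaction is not confused with the original.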
2.1.2. Message patterns
The client decides on the sequence of interaction with endorsers. For example, a client would typically send <PROPOSE, tx> (i.e., without the anchor argument) to a single endorser, which would then produce the version dependencies (anchor) which the client can later use as an argument of its PROPOSE message to other endorsers. As another example, the client could directly send <PROPOSE, tx> (without anchor) to all endorsers of its choice. Different patterns of communication are possible and the client is free to decide on those (see also Section 2.3.).
2.2. The endorsing peer simulates a transaction and produces an endorsement signature
On reception of a <PROPOSE,tx,[anchor]> message from a client, the endorsing peer epID first verifies the client’s signature clientSig and then simulates a transaction. If the client specifies anchor then the endorsing peer simulates the transaction only if the read version numbers (i.e., readset as defined below) of the corresponding keys in its local KVS match those version numbers specified by anchor.
Simulating a transaction involves the endorsing peer tentatively executing a transaction (txPayload), by invoking the chaincode to which the transaction refers (chaincodeID) against the copy of the state that the endorsing peer locally holds.
As a result of the execution, the endorsing peer computes read version dependencies (readset) and state updates (writeset), also called MVCC+postimage info in DB language.
Recall that the state consists of key-value pairs. All key-value entries are versioned; that is, every entry contains ordered version information, which is incremented each time the value stored under a key is updated. The peer that interprets the transaction records all key-value pairs accessed by the chaincode, either for reading or for writing, but the peer does not yet update its state. More specifically:
- Given state s before an endorsing peer executes a transaction, for every key k read by the transaction, pair (k,s(k).version) is added to readset.
- Additionally, for every key k modified by the transaction to the new value v', pair (k,v') is added to writeset. Alternatively, v' could be the delta of the new value to the previous value (s(k).value).
If a client specifies anchor in the PROPOSE message then the client-specified anchor must equal the readset produced by the endorsing peer when simulating the transaction.
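Simulation can be pictured as running the chaincode against a thin wrapper that records reads and writes instead of applying them. In the hypothetical Python sketch below the chaincode is modeled as a plain callable, and the names SimulationContext, simulate and transfer are illustrative only. For brevity the anchor is compared with the readset after execution, and reads always go to the committed state rather than to the transaction's own pending writes; both are simplifying assumptions.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, Tuple[bytes, int]]  # key -> (value, version)


class SimulationContext:
    """Records reads and writes without touching the underlying state."""

    def __init__(self, state: State) -> None:
        self._state = state
        self.readset: Dict[str, Optional[int]] = {}
        self.writeset: Dict[str, bytes] = {}

    def get(self, k: str) -> Optional[bytes]:
        value, version = self._state.get(k, (None, None))
        self.readset[k] = version  # record (k, s(k).version)
        return value

    def put(self, k: str, v: bytes) -> None:
        self.writeset[k] = v  # record (k, v'); the state is not modified


def simulate(state: State,
             chaincode: Callable[[SimulationContext], None],
             anchor: Optional[Dict[str, Optional[int]]] = None):
    ctx = SimulationContext(state)
    chaincode(ctx)
    if anchor is not None and anchor != ctx.readset:
        return None  # anchor does not match local versions: no endorsement
    return ctx.readset, ctx.writeset


def transfer(ctx: SimulationContext) -> None:
    # Example chaincode operation: move 10 units from "a" to "b"
    a = int(ctx.get("a"))
    b = int(ctx.get("b"))
    ctx.put("a", str(a - 10).encode())
    ctx.put("b", str(b + 10).encode())


state = {"a": (b"100", 3), "b": (b"5", 1)}
result = simulate(state, transfer)
stale = simulate(state, transfer, anchor={"a": 2, "b": 1})
```

Here result carries the readset and writeset that would go into tran-proposal, stale is None because the anchor names an outdated version of "a", and state is left untouched in both cases.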
Then, the peer forwards internally tran-proposal (and possibly tx) to the part of its (peer’s) logic that endorses a transaction, referred to as endorsing logic. By default, endorsing logic at a peer accepts the tran-proposal and simply signs the tran-proposal. However, endorsing logic may interpret arbitrary functionality, to, e.g., interact with legacy systems with tran-proposal and tx as inputs to reach the decision whether to endorse a transaction or not.
If endorsing logic decides to endorse a transaction, it sends a <TRANSACTION-ENDORSED, tid, tran-proposal,epSig> message to the submitting client (tx.clientID), where:
- tran-proposal := (epID,tid,chaincodeID,txContentBlob,readset,writeset), where txContentBlob is chaincode/transaction specific information. The intention is to have txContentBlob used as some representation of tx (e.g., txContentBlob=tx.txPayload).
- epSig is the endorsing peer’s signature on tran-proposal.
Else, in case the endorsing logic refuses to endorse the transaction, an endorser may send a message (TRANSACTION-INVALID, tid, REJECTED) to the submitting client.
Notice that an endorser does not change its state in this step: the updates produced by transaction simulation in the context of endorsement do not affect the state!
2.3. The submitting client collects an endorsement for a transaction and broadcasts it through ordering service
The submitting client waits until it receives “enough” messages and signatures on (TRANSACTION-ENDORSED, tid, *, *) statements to conclude that the transaction proposal is endorsed. As discussed in Section 2.1.2., this may involve one or more round-trips of interaction with endorsers.
The exact number of “enough” depends on the chaincode endorsement policy (see also Section 3). If the endorsement policy is satisfied, the transaction has been endorsed; note that it is not yet committed. The collection of signed TRANSACTION-ENDORSED messages from endorsing peers which establish that a transaction is endorsed is called an endorsement and denoted by endorsement.
If the submitting client does not manage to collect an endorsement for a transaction proposal, it abandons this transaction with an option to retry later.
For a transaction with a valid endorsement, we now start using the ordering service. The submitting client invokes the ordering service using broadcast(blob), where blob=endorsement. If the client does not have the capability of invoking the ordering service directly, it may proxy its broadcast through some peer of its choice. Such a peer must be trusted by the client not to remove any message from the endorsement, as otherwise the transaction may be deemed invalid. Notice, however, that a proxy peer may not fabricate a valid endorsement.
2.4. The ordering service delivers a transaction to the peers
When an event deliver(seqno, prevhash, blob) occurs and a peer has applied all state updates for blobs with sequence number lower than seqno, a peer does the following:
- It checks that the blob.endorsement is valid according to the policy of the chaincode (blob.tran-proposal.chaincodeID) to which it refers.
- In a typical case, it also verifies that the dependencies (blob.endorsement.tran-proposal.readset) have not been violated meanwhile. In more complex use cases, tran-proposal fields in endorsement may differ and in this case endorsement policy (Section 3) specifies how the state evolves.
Verification of dependencies can be implemented in different ways, according to a consistency property or “isolation guarantee” that is chosen for the state updates. Serializability is a default isolation guarantee, unless chaincode endorsement policy specifies a different one. Serializability can be provided by requiring the version associated with every key in the readset to be equal to that key’s version in the state, and rejecting transactions that do not satisfy this requirement.
- If all these checks pass, the transaction is deemed valid or committed. In this case, the peer marks the transaction with 1 in the bitmask of the PeerLedger, applies blob.endorsement.tran-proposal.writeset to blockchain state (if tran-proposals are the same, otherwise endorsement policy logic defines the function that takes blob.endorsement).
- If the endorsement policy verification of blob.endorsement or the version dependency check fails, the transaction is invalid and the peer marks the transaction with 0 in the bitmask of the PeerLedger. It is important to note that invalid transactions do not change the state.
Note that this is sufficient to have all (correct) peers have the same state after processing a deliver event (block) with a given sequence number. Namely, by the guarantees of the ordering service, all correct peers will receive an identical sequence of deliver(seqno, prevhash, blob) events. As the evaluation of the endorsement policy and evaluation of version dependencies in readset are deterministic, all correct peers will also come to the same conclusion whether a transaction contained in a blob is valid. Hence, all peers commit and apply the same sequence of transactions and update their state in the same way.
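The commit-time checks can be sketched as a single deterministic function that every peer applies to each delivered transaction. The Python below is a simplified illustration under stated assumptions: the endorsement policy result is passed in as a boolean, all tran-proposals in the endorsement are taken to be identical, versions are integers that start at 0, and the function name validate_and_commit is hypothetical.

```python
from typing import Dict, Optional, Tuple

State = Dict[str, Tuple[bytes, int]]  # key -> (value, version)


def validate_and_commit(state: State,
                        readset: Dict[str, Optional[int]],
                        writeset: Dict[str, bytes],
                        policy_ok: bool) -> int:
    """Return the PeerLedger bitmask bit: 1 if committed, 0 if invalid."""
    if not policy_ok:
        return 0  # endorsement policy not satisfied
    for k, ver in readset.items():
        current = state[k][1] if k in state else None
        if current != ver:
            return 0  # a read key changed since simulation
    for k, v in writeset.items():
        old_ver = state[k][1] if k in state else -1
        state[k] = (v, old_ver + 1)  # apply the write and bump the version
    return 1


state = {"a": (b"100", 0)}
tx1 = ({"a": 0}, {"a": b"90"})
tx2 = ({"a": 0}, {"a": b"80"})  # simulated against the same version as tx1
bitmask = [validate_and_commit(state, rs, ws, True) for rs, ws in (tx1, tx2)]
```

Both transactions were simulated against version 0 of "a". The first commits and bumps the version, so the second fails the readset check and is marked invalid without changing the state. Any peer processing the same two transactions in the same order reaches the same bitmask and state.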
Figure 1. Illustration of one possible transaction flow (common-case path).
3. Endorsement policies
3.1. Endorsement policy specification
An endorsement policy is a condition on what endorses a transaction. Blockchain peers have a pre-specified set of endorsement policies, which are referenced by a deploy transaction that installs specific chaincode. Endorsement policies can be parametrized, and these parameters can be specified by a deploy transaction.
To guarantee blockchain and security properties, the set of endorsement policies should be a set of proven policies with a limited set of functions in order to ensure bounded execution time (termination), determinism, performance and security guarantees.
Dynamic addition of endorsement policies (e.g., by a deploy transaction at chaincode deploy time) is very sensitive in terms of bounded policy evaluation time (termination), determinism, performance and security guarantees. Therefore, dynamic addition of endorsement policies is not allowed, but can be supported in future.
3.2. Transaction evaluation against endorsement policy
A transaction is declared valid only if it has been endorsed according to the policy. An invoke transaction for a chaincode will first have to obtain an endorsement that satisfies the chaincode’s policy or it will not be committed. This takes place through the interaction between the submitting client and endorsing peers as explained in Section 2.
Formally the endorsement policy is a predicate on the endorsement, and potentially further state that evaluates to TRUE or FALSE. For deploy transactions the endorsement is obtained according to a system-wide policy (for example, from the system chaincode).
An endorsement policy predicate refers to certain variables. Potentially it may refer to:
- keys or identities relating to the chaincode (found in the metadata of the chaincode), for example, a set of endorsers;
- further metadata of the chaincode;
- elements of the endorsement and endorsement.tran-proposal;
- and potentially more.
The above list is ordered by increasing expressiveness and complexity, that is, it will be relatively simple to support policies that only refer to keys and identities of nodes.
The evaluation of an endorsement policy predicate must be deterministic. An endorsement shall be evaluated locally by every peer such that a peer does not need to interact with other peers, yet all correct peers evaluate the endorsement policy in the same way.
3.3. Example endorsement policies
The predicate may contain logical expressions and evaluates to TRUE or FALSE. Typically the condition will use digital signatures on the transaction invocation issued by endorsing peers for the chaincode.
Suppose the chaincode specifies the endorser set E = {Alice, Bob, Charlie, Dave, Eve, Frank, George}. Some example policies:
- A valid signature on the same tran-proposal from all members of E.
- A valid signature from any single member of E.
- Valid signatures on the same tran-proposal from endorsing peers according to the condition (Alice OR Bob) AND (any two of: Charlie, Dave, Eve, Frank, George).
- Valid signatures on the same tran-proposal by any 5 out of the 7 endorsers. (More generally, for chaincode with n > 3f endorsers, valid signatures by any 2f+1 out of the n endorsers, or by any group of more than (n+f)/2 endorsers.)
- Suppose there is an assignment of “stake” or “weights” to the endorsers, like {Alice=49, Bob=15, Charlie=15, Dave=10, Eve=7, Frank=3, George=1}, where the total stake is 100: The policy requires valid signatures from a set that has a majority of the stake (i.e., a group with combined stake strictly more than 50), such as {Alice, X} with any X different from George, or {everyone together except Alice}. And so on.
- The assignment of stake in the previous example condition could be static (fixed in the metadata of the chaincode) or dynamic (e.g., dependent on the state of the chaincode and be modified during the execution).
- Valid signatures from (Alice OR Bob) on tran-proposal1 and valid signatures from (any two of: Charlie, Dave, Eve, Frank, George) on tran-proposal2, where tran-proposal1 and tran-proposal2 differ only in their endorsing peers and state updates.
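Several of the example policies above can be written as deterministic predicates over the set of endorsers whose signatures on the same tran-proposal have been verified. The Python sketch below leaves signature verification out entirely and works on plain names; the helper names k_of, all_of and stake_majority are illustrative and not part of any Fabric API.

```python
from typing import Callable, Dict, Set

Policy = Callable[[Set[str]], bool]  # predicate over verified signers

E = {"Alice", "Bob", "Charlie", "Dave", "Eve", "Frank", "George"}
STAKE = {"Alice": 49, "Bob": 15, "Charlie": 15, "Dave": 10,
         "Eve": 7, "Frank": 3, "George": 1}


def k_of(k: int, members: Set[str]) -> Policy:
    """At least k distinct signers out of the given members."""
    return lambda signers: len(signers & members) >= k


def all_of(*policies: Policy) -> Policy:
    """Logical AND of several policies."""
    return lambda signers: all(p(signers) for p in policies)


def stake_majority(stake: Dict[str, int]) -> Policy:
    """Signers must hold strictly more than half of the total stake."""
    total = sum(stake.values())
    return lambda signers: 2 * sum(stake.get(s, 0) for s in signers) > total


all_members = k_of(len(E), E)
any_member = k_of(1, E)
mixed = all_of(k_of(1, {"Alice", "Bob"}),
               k_of(2, {"Charlie", "Dave", "Eve", "Frank", "George"}))
five_of_seven = k_of(5, E)
weighted = stake_majority(STAKE)
```

With these definitions, {Alice, Frank} satisfies the stake policy (52 out of 100), {Alice, George} does not (exactly 50), and everyone except Alice does (51). Each predicate depends only on its input set, so all correct peers evaluate it identically.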
How useful these policies are will depend on the application, on the desired resilience of the solution against failures or misbehavior of endorsers, and on various other properties.
4 (post-v1). Validated ledger and PeerLedger checkpointing (pruning)
4.1. Validated ledger (VLedger)
To maintain the abstraction of a ledger that contains only valid and committed transactions (that appears in Bitcoin, for example), peers may, in addition to state and ledger, maintain the Validated Ledger (or VLedger). This is a hash chain derived from the ledger by filtering out invalid transactions.
The construction of the VLedger blocks (called here vBlocks) proceeds as follows. As the PeerLedger blocks may contain invalid transactions (i.e., transactions with invalid endorsement or with invalid version dependencies), such transactions are filtered out by peers before a transaction from a block is added to a vBlock. Every peer does this by itself (e.g., by using the bitmask associated with PeerLedger). A vBlock is defined as a block without the invalid transactions, which have been filtered out. Such vBlocks are inherently dynamic in size and may be empty. An illustration of vBlock construction is given in the figure below.
Figure 2. Illustration of validated ledger block (vBlock) formation from ledger (PeerLedger) blocks.
vBlocks are chained together to a hash chain by every peer. More specifically, every block of a validated ledger contains:
- The hash of the previous vBlock.
- vBlock number.
- An ordered list of all valid transactions committed by the peers since the last vBlock was computed (i.e., list of valid transactions in a corresponding block).
- The hash of the corresponding block (in PeerLedger) from which the current vBlock is derived.
All this information is concatenated and hashed by a peer, producing the hash of the vBlock in the validated ledger.
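The vBlock construction can be sketched in a few lines of Python. The encoding details below (SHA-256, an 8-byte block number, length-prefixed transactions, an all-zero previous hash for the first vBlock) are assumptions made so the example is concrete; the document only states which four items are concatenated and hashed.

```python
import hashlib
from typing import List, Tuple

GENESIS_VHASH = b"\x00" * 32  # assumed previous hash for the first vBlock


def make_vblock(prev_vhash: bytes, number: int, txs: List[bytes],
                bitmask: List[int], block_hash: bytes) -> Tuple[List[bytes], bytes]:
    """Filter a PeerLedger block by its bitmask and hash the resulting vBlock."""
    valid = [tx for tx, bit in zip(txs, bitmask) if bit == 1]
    h = hashlib.sha256()
    h.update(prev_vhash)                   # hash of the previous vBlock
    h.update(number.to_bytes(8, "big"))    # vBlock number
    for tx in valid:                       # ordered list of valid transactions
        h.update(len(tx).to_bytes(4, "big") + tx)  # length prefix avoids ambiguity
    h.update(block_hash)                   # hash of the corresponding block
    return valid, h.digest()


block_txs = [b"tx0", b"tx1", b"tx2"]
# Stand-in for the real PeerLedger block hash
block_hash = hashlib.sha256(b"".join(block_txs)).digest()
valid_txs, vhash = make_vblock(GENESIS_VHASH, 0, block_txs, [1, 0, 1], block_hash)
empty_txs, empty_hash = make_vblock(vhash, 1, [b"tx3"], [0], block_hash)
```

The second call shows that a vBlock may be empty when every transaction in the underlying block is invalid; it still receives a hash and extends the chain.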
4.2. PeerLedger Checkpointing
The ledger contains invalid transactions, which may not necessarily be recorded forever. However, peers cannot simply discard PeerLedger blocks and thereby prune PeerLedger once they establish the corresponding vBlocks. Namely, in this case, if a new peer joins the network, other peers could not transfer the discarded blocks (pertaining to PeerLedger) to the joining peer, nor convince the joining peer of the validity of their vBlocks.
To facilitate pruning of the PeerLedger, this document describes a checkpointing mechanism. This mechanism establishes the validity of the vBlocks across the peer network and allows checkpointed vBlocks to replace the discarded PeerLedger blocks. This, in turn, reduces storage space, as there is no need to store invalid transactions. It also reduces the work to reconstruct the state for new peers that join the network (as they do not need to establish validity of individual transactions when reconstructing the state by replaying PeerLedger, but may simply replay the state updates contained in the validated ledger).
4.2.1. Checkpointing protocol
Checkpointing is performed periodically by the peers every CHK blocks, where CHK is a configurable parameter. To initiate a checkpoint, the peers broadcast (e.g., gossip) to other peers the message <CHECKPOINT,blocknohash,blockno,stateHash,peerSig>, where blockno is the current block number and blocknohash is its respective hash, stateHash is the hash of the latest state (produced by, e.g., a Merkle hash) upon validation of block blockno, and peerSig is the peer’s signature on (CHECKPOINT,blocknohash,blockno,stateHash), referring to the validated ledger.
A peer collects CHECKPOINT messages until it obtains enough correctly signed messages with matching blockno, blocknohash and stateHash to establish a valid checkpoint (see Section 4.2.2.).
Upon establishing a valid checkpoint for block number blockno with blocknohash, a peer:
- if blockno>latestValidCheckpoint.blockno, then a peer assigns latestValidCheckpoint=(blocknohash,blockno),
- stores the set of respective peer signatures that constitute a valid checkpoint into the set latestValidCheckpointProof,
- stores the state corresponding to stateHash to latestValidCheckpointedState,
- (optionally) prunes its PeerLedger up to block number blockno (inclusive).
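The collection step can be sketched as a small tally keyed by (blockno, blocknohash, stateHash). In the hypothetical Python below, a simple count of distinct peers stands in for the checkpoint validity policy, peer identifiers stand in for verified signatures, and storing the checkpointed state and pruning are omitted; the class name CheckpointCollector is illustrative.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple

Key = Tuple[int, str, str]  # (blockno, blocknohash, stateHash)


class CheckpointCollector:
    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # assumed policy: number of distinct peers
        self.votes: Dict[Key, Set[str]] = defaultdict(set)
        self.latest: Optional[Tuple[str, int]] = None  # latestValidCheckpoint
        self.proof: Set[str] = set()  # latestValidCheckpointProof

    def on_checkpoint(self, peer: str, blocknohash: str,
                      blockno: int, state_hash: str) -> bool:
        """Record one CHECKPOINT message; return True if a newer checkpoint is established."""
        key = (blockno, blocknohash, state_hash)
        self.votes[key].add(peer)
        if len(self.votes[key]) < self.threshold:
            return False
        if self.latest is None or blockno > self.latest[1]:
            self.latest = (blocknohash, blockno)
            self.proof = set(self.votes[key])
            return True
        return False


c = CheckpointCollector(threshold=2)
r1 = c.on_checkpoint("p1", "h10", 10, "s10")
r2 = c.on_checkpoint("p2", "hX", 10, "s10")   # mismatching block hash
r3 = c.on_checkpoint("p3", "h10", 10, "s10")
```

Only messages that agree on all three values count towards the same checkpoint, so the message from p2 with a different block hash does not help establish the checkpoint for block 10.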
4.2.2. Valid checkpoints
Clearly, the checkpointing protocol raises the following questions: When can a peer prune its PeerLedger? How many CHECKPOINT messages are “sufficiently many”? This is defined by a checkpoint validity policy, with (at least) two possible approaches, which may also be combined:
- Local (peer-specific) checkpoint validity policy (LCVP). A local policy at a given peer p may specify a set of peers which peer p trusts and whose CHECKPOINT messages are sufficient to establish a valid checkpoint. For example, LCVP at peer Alice may define that Alice needs to receive a CHECKPOINT message from Bob, or from both Charlie and Dave.
- Global checkpoint validity policy (GCVP). A checkpoint validity policy may be specified globally. This is similar to a local peer policy, except that it is stipulated at the system (blockchain) granularity, rather than peer granularity. For instance, GCVP may specify that:
  - each peer may trust a checkpoint if confirmed by 11 different peers.
  - in a specific deployment in which every orderer is collocated with a peer in the same machine (i.e., trust domain) and where up to f orderers may be (Byzantine) faulty, each peer may trust a checkpoint if confirmed by f+1 different peers collocated with orderers.
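The example validity policies above translate directly into predicates over the set of peers that sent matching, correctly signed CHECKPOINT messages. The Python sketch below uses hypothetical function names and treats signature verification as already done.

```python
from typing import Set


def lcvp_alice(confirmed: Set[str]) -> bool:
    """Alice's local policy: Bob alone, or both Charlie and Dave."""
    return "Bob" in confirmed or {"Charlie", "Dave"} <= confirmed


def gcvp_count(confirmed: Set[str], required: int = 11) -> bool:
    """Global policy: confirmation by a fixed number of different peers."""
    return len(confirmed) >= required


def gcvp_orderer_peers(confirmed: Set[str], orderer_peers: Set[str], f: int) -> bool:
    """Global policy: f + 1 confirmations from peers collocated with orderers.

    With at most f faulty trust domains, any f + 1 such peers include
    at least one that is not faulty.
    """
    return len(confirmed & orderer_peers) >= f + 1


orderer_peers = {"p1", "p2", "p3", "p4"}
```

The policies can also be combined, for example by requiring both the local and a global predicate to hold before pruning.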