1.基本概念
SHA-3哈希加密
Ethereum 中用到的哈希函数全部采用SHA-3(Secure Hash Algorithm 3,wikipedia)。SHA-3在2015年8月由美国标准技术协会(NIST)正式发布,作为Secure Hash Algorithm家族的最新一代标准,它相比于SHA-2和SHA-1,采用了完全不同的设计思路,性能也比较好。需要注意的是,SHA-2目前并没有出现被成功攻克的案例,SHA-3也没有要立即取代SHA-2的趋势,NIST只是考虑到SHA-1有过被攻克的案例,未雨绸缪的征选了采用全新结构和思路的SHA-3来作为一种最新的SHA方案。
RLP编码
RLP(Recursive Length Prefix)编码,其定义可见wiki,它可以将一个任意嵌套的字节数组([]byte),编码成一个“展平”无嵌套的[]byte。1 byte取值范围0x00 ~ 0xff,可以表示任意字符,所以[]byte可以线性的表示任意的数据。最简单比如一个字符串,如果每个字符用ASCII码的二进制表示,整个字符串就变成一个[]byte。 RLP 编码其实提供了一种序列化的编码方法,无论输入是何种嵌套形式的元素或数组,编码输出形式都是[]byte。RLP是可逆的,它提供了互逆的编码、解码方法。
Ethereum 中具体使用的哈希算法,就是对某个类型对象的RLP编码值做了SHA3哈希运算,可称为RLP Hash。 Ethereum 在底层存储中特意选择了专门存储和读取[k, v] 键值对的第三方数据库,[k, v] 中的v 就是某个结构体对象的RLP编码值([]byte),k大多数情况就是v的RLP编码后的SHA-3哈希值。
2. Block和Header
Block(区块)是Ethereum的核心数据结构之一。所有账户的相关活动,以交易(Transaction)的格式存储,每个Block有一个交易对象的列表;每个交易的执行结果,由一个Receipt对象与其包含的一组Log对象记录;所有交易执行完后生成的Receipt列表,存储在Block中(经过压缩加密)。不同Block之间,通过前向指针ParentHash一个一个串联起来成为一个单向链表,BlockChain 结构体管理着这个链表。
以太坊的主要数据结构
2.1BlockHeader
BlockHeader是Block的核心,Header的成员变量全都很重要,值得细细理解:
- m_parentHash:指向父区块(parentBlock)的指针。除了创世块(Genesis Block)外,每个区块有且只有一个父区块。
- m_author:挖掘出这个区块的作者地址。在每次执行交易时系统会给与一定补偿的Ether,这笔金额就是发给这个地址的。
- m_sha3Uncles:Block结构体的成员uncles的RLP哈希值。uncles是一个Header数组,它的存在,颇具匠心。
- m_stateRoot:StateDB中的“state Trie”的根节点的RLP哈希值。Block中,每个账户以stateObject对象表示,账户以Address为唯一标示,其信息在相关交易(Transaction)的执行中被修改。所有账户对象可以逐个插入一个Merkle-PatricaTrie(MPT)结构里,形成“state Trie”。
- m_transactionsRoot: Block中 “tx Trie”的根节点的RLP哈希值。Block的成员变量transactions中所有的tx对象,被逐个插入一个MPT结构,形成“tx Trie”。
- m_receiptsRoot:Block中的 "Receipt Trie”的根节点的RLP哈希值。Block的所有Transaction执行完后会生成一个Receipt数组,这个数组中的所有Receipt被逐个插入一个MPT结构中,形成"Receipt Trie"。
- m_logBloom:Bloom过滤器(Filter),用来快速判断一个参数Log对象是否存在于一组已知的Log集合中。
- m_difficulty:区块的难度。Block的Difficulty由共识算法基于parentBlock的Time和Difficulty计算得出,它会应用在区块的‘挖掘’阶段。
- m_number:区块的序号。Block的Number等于其父区块Number +1。
- m_timestamp:区块“应该”被创建的时间。由共识算法确定,一般来说,要么等于parentBlock.Time + 10s,要么等于当前系统时间。
- m_gasLimit:区块内所有Gas消耗的理论上限。该数值在区块创建时设置,与父区块有关。具体来说,根据父区块的GasUsed同GasLimit * 2/3的大小关系来计算得出。
- m_gasUsed:区块内所有Transaction执行时所实际消耗的Gas总和。
- Nonce:一个64bit的哈希数,它被应用在区块的"挖掘"阶段,并且在使用中会被修改。在C++版本无nonce
代码中写道:To determine the header hash without the nonce (for sealing), the method hash(WithoutNonce) is
* provided.
class BlockHeader
{
friend class BlockChain;
public:
static const unsigned BasicFields = 13;
BlockHeader();
explicit BlockHeader(bytesConstRef _data, BlockDataType _bdt = BlockData, h256 const& _hashWith = h256());
explicit BlockHeader(bytes const& _data, BlockDataType _bdt = BlockData, h256 const& _hashWith = h256()): BlockHeader(&_data, _bdt, _hashWith) {}
BlockHeader(BlockHeader const& _other);
BlockHeader& operator=(BlockHeader const& _other);
static h256 headerHashFromBlock(bytes const& _block) { return headerHashFromBlock(&_block); }
static h256 headerHashFromBlock(bytesConstRef _block);
static RLP extractHeader(bytesConstRef _block);
explicit operator bool() const { return m_timestamp != Invalid256; }
bool operator==(BlockHeader const& _cmp) const
{
return m_parentHash == _cmp.parentHash() &&
m_sha3Uncles == _cmp.sha3Uncles() &&
m_author == _cmp.author() &&
m_stateRoot == _cmp.stateRoot() &&
m_transactionsRoot == _cmp.transactionsRoot() &&
m_receiptsRoot == _cmp.receiptsRoot() &&
m_logBloom == _cmp.logBloom() &&
m_difficulty == _cmp.difficulty() &&
m_number == _cmp.number() &&
m_gasLimit == _cmp.gasLimit() &&
m_gasUsed == _cmp.gasUsed() &&
m_timestamp == _cmp.timestamp() &&
m_extraData == _cmp.extraData();
}
bool operator!=(BlockHeader const& _cmp) const { return !operator==(_cmp); }
void clear();
void noteDirty() const { Guard l(m_hashLock); m_hashWithout = m_hash = h256(); }
void populateFromParent(BlockHeader const& parent);
// TODO: pull out into abstract class Verifier.
void verify(Strictness _s = CheckEverything, BlockHeader const& _parent = BlockHeader(), bytesConstRef _block = bytesConstRef()) const;
void verify(Strictness _s, bytesConstRef _block) const { verify(_s, BlockHeader(), _block); }
h256 hash(IncludeSeal _i = WithSeal) const;
void streamRLP(RLPStream& _s, IncludeSeal _i = WithSeal) const;
void setParentHash(h256 const& _v) { m_parentHash = _v; noteDirty(); }
void setSha3Uncles(h256 const& _v) { m_sha3Uncles = _v; noteDirty(); }
void setTimestamp(int64_t _v) { m_timestamp = _v; noteDirty(); }
void setAuthor(Address const& _v) { m_author = _v; noteDirty(); }
void setRoots(h256 const& _t, h256 const& _r, h256 const& _u, h256 const& _s) { m_transactionsRoot = _t; m_receiptsRoot = _r; m_stateRoot = _s; m_sha3Uncles = _u; noteDirty(); }
void setGasUsed(u256 const& _v) { m_gasUsed = _v; noteDirty(); }
void setNumber(int64_t _v) { m_number = _v; noteDirty(); }
void setGasLimit(u256 const& _v) { m_gasLimit = _v; noteDirty(); }
void setExtraData(bytes const& _v) { m_extraData = _v; noteDirty(); }
void setLogBloom(LogBloom const& _v) { m_logBloom = _v; noteDirty(); }
void setDifficulty(u256 const& _v) { m_difficulty = _v; noteDirty(); }
template <class T> void setSeal(unsigned _offset, T const& _value) { Guard l(m_sealLock); if (m_seal.size() <= _offset) m_seal.resize(_offset + 1); m_seal[_offset] = rlp(_value); noteDirty(); }
template <class T> void setSeal(T const& _value) { setSeal(0, _value); }
h256 const& parentHash() const { return m_parentHash; }
h256 const& sha3Uncles() const { return m_sha3Uncles; }
bool hasUncles() const { return m_sha3Uncles != EmptyListSHA3; }
int64_t timestamp() const { return m_timestamp; }
Address const& author() const { return m_author; }
h256 const& stateRoot() const { return m_stateRoot; }
h256 const& transactionsRoot() const { return m_transactionsRoot; }
h256 const& receiptsRoot() const { return m_receiptsRoot; }
u256 const& gasUsed() const { return m_gasUsed; }
int64_t number() const { return m_number; }
u256 const& gasLimit() const { return m_gasLimit; }
bytes const& extraData() const { return m_extraData; }
LogBloom const& logBloom() const { return m_logBloom; }
u256 const& difficulty() const { return m_difficulty; }
template <class T> T seal(unsigned _offset = 0) const { T ret; Guard l(m_sealLock); if (_offset < m_seal.size()) ret = RLP(m_seal[_offset]).convert<T>(RLP::VeryStrict); return ret; }
private:
void populate(RLP const& _header);
void streamRLPFields(RLPStream& _s) const;
std::vector<bytes> seal() const
{
Guard l(m_sealLock);
return m_seal;
}
h256 hashRawRead() const
{
Guard l(m_hashLock);
return m_hash;
}
h256 hashWithoutRawRead() const
{
Guard l(m_hashLock);
return m_hashWithout;
}
h256 m_parentHash;
h256 m_sha3Uncles;
h256 m_stateRoot;
h256 m_transactionsRoot;
h256 m_receiptsRoot;
LogBloom m_logBloom;
int64_t m_number = 0;
u256 m_gasLimit;
u256 m_gasUsed;
bytes m_extraData;
int64_t m_timestamp = -1;
Address m_author;
u256 m_difficulty;
std::vector<bytes> m_seal; ///< Additional (RLP-encoded) header fields.
mutable Mutex m_sealLock;
mutable h256 m_hash; ///< (Memoised) SHA3 hash of the block header with seal.
mutable h256 m_hashWithout; ///< (Memoised) SHA3 hash of the block header without seal.
mutable Mutex m_hashLock; ///< A lock for both m_hash and m_hashWithout.
};
h256类型
与hash相关的基本都是h256类型,在以太坊中,h256类型定义如下:
using h256 = FixedHash<32>;
class FixedHash
{
public:
/// The corresponding arithmetic type.
using Arith = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<N * 8, N * 8, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;
/// The size of the container.
enum { size = N };
/// A dummy flag to avoid accidental construction from pointer.
enum ConstructFromPointerType { ConstructFromPointer };
/// Method to convert from a string.
enum ConstructFromStringType { FromHex, FromBinary };
/// Method to convert from a string.
enum ConstructFromHashType { AlignLeft, AlignRight, FailIfDifferent };
/// Construct an empty hash.
FixedHash() { m_data.fill(0); }
/// Construct from another hash, filling with zeroes or cropping as necessary.
template <unsigned M> explicit FixedHash(FixedHash<M> const& _h, ConstructFromHashType _t = AlignLeft) { m_data.fill(0); unsigned c = std::min(M, N); for (unsigned i = 0; i < c; ++i) m_data[_t == AlignRight ? N - 1 - i : i] = _h[_t == AlignRight ? M - 1 - i : i]; }
/// Convert from the corresponding arithmetic type.
FixedHash(Arith const& _arith) { toBigEndian(_arith, m_data); }
/// Convert from unsigned
explicit FixedHash(unsigned _u) { toBigEndian(_u, m_data); }
/// Explicitly construct, copying from a byte array.
explicit FixedHash(bytes const& _b, ConstructFromHashType _t = FailIfDifferent) { if (_b.size() == N) memcpy(m_data.data(), _b.data(), std::min<unsigned>(_b.size(), N)); else { m_data.fill(0); if (_t != FailIfDifferent) { auto c = std::min<unsigned>(_b.size(), N); for (unsigned i = 0; i < c; ++i) m_data[_t == AlignRight ? N - 1 - i : i] = _b[_t == AlignRight ? _b.size() - 1 - i : i]; } } }
/// Explicitly construct, copying from a byte array.
explicit FixedHash(bytesConstRef _b, ConstructFromHashType _t = FailIfDifferent) { if (_b.size() == N) memcpy(m_data.data(), _b.data(), std::min<unsigned>(_b.size(), N)); else { m_data.fill(0); if (_t != FailIfDifferent) { auto c = std::min<unsigned>(_b.size(), N); for (unsigned i = 0; i < c; ++i) m_data[_t == AlignRight ? N - 1 - i : i] = _b[_t == AlignRight ? _b.size() - 1 - i : i]; } } }
/// Explicitly construct, copying from a bytes in memory with given pointer.
explicit FixedHash(byte const* _bs, ConstructFromPointerType) { memcpy(m_data.data(), _bs, N); }
/// Explicitly construct, copying from a string.
explicit FixedHash(std::string const& _s, ConstructFromStringType _t = FromHex, ConstructFromHashType _ht = FailIfDifferent): FixedHash(_t == FromHex ? fromHex(_s, WhenError::Throw) : dev::asBytes(_s), _ht) {}
/// Convert to arithmetic type.
operator Arith() const { return fromBigEndian<Arith>(m_data); }
/// @returns true iff this is the empty hash.
explicit operator bool() const { return std::any_of(m_data.begin(), m_data.end(), [](byte _b) { return _b != 0; }); }
// The obvious comparison operators.
bool operator==(FixedHash const& _c) const { return m_data == _c.m_data; }
bool operator!=(FixedHash const& _c) const { return m_data != _c.m_data; }
bool operator<(FixedHash const& _c) const { for (unsigned i = 0; i < N; ++i) if (m_data[i] < _c.m_data[i]) return true; else if (m_data[i] > _c.m_data[i]) return false; return false; }
bool operator>=(FixedHash const& _c) const { return !operator<(_c); }
bool operator<=(FixedHash const& _c) const { return operator==(_c) || operator<(_c); }
bool operator>(FixedHash const& _c) const { return !operator<=(_c); }
// The obvious binary operators.
FixedHash& operator^=(FixedHash const& _c) { for (unsigned i = 0; i < N; ++i) m_data[i] ^= _c.m_data[i]; return *this; }
FixedHash operator^(FixedHash const& _c) const { return FixedHash(*this) ^= _c; }
FixedHash& operator|=(FixedHash const& _c) { for (unsigned i = 0; i < N; ++i) m_data[i] |= _c.m_data[i]; return *this; }
FixedHash operator|(FixedHash const& _c) const { return FixedHash(*this) |= _c; }
FixedHash& operator&=(FixedHash const& _c) { for (unsigned i = 0; i < N; ++i) m_data[i] &= _c.m_data[i]; return *this; }
FixedHash operator&(FixedHash const& _c) const { return FixedHash(*this) &= _c; }
FixedHash operator~() const { FixedHash ret; for (unsigned i = 0; i < N; ++i) ret[i] = ~m_data[i]; return ret; }
// Big-endian increment.
FixedHash& operator++() { for (unsigned i = size; i > 0 && !++m_data[--i]; ) {} return *this; }
/// @returns true if all one-bits in @a _c are set in this object.
bool contains(FixedHash const& _c) const { return (*this & _c) == _c; }
/// @returns a particular byte from the hash.
byte& operator[](unsigned _i) { return m_data[_i]; }
/// @returns a particular byte from the hash.
byte operator[](unsigned _i) const { return m_data[_i]; }
/// @returns an abridged version of the hash as a user-readable hex string.
std::string abridged() const { return toHex(ref().cropped(0, 4)) + "\342\200\246"; }
/// @returns a version of the hash as a user-readable hex string that leaves out the middle part.
std::string abridgedMiddle() const { return toHex(ref().cropped(0, 4)) + "\342\200\246" + toHex(ref().cropped(N - 4)); }
/// @returns the hash as a user-readable hex string.
std::string hex() const { return toHex(ref()); }
/// @returns a mutable byte vector_ref to the object's data.
bytesRef ref() { return bytesRef(m_data.data(), N); }
/// @returns a constant byte vector_ref to the object's data.
bytesConstRef ref() const { return bytesConstRef(m_data.data(), N); }
/// @returns a mutable byte pointer to the object's data.
byte* data() { return m_data.data(); }
/// @returns a constant byte pointer to the object's data.
byte const* data() const { return m_data.data(); }
/// @returns begin iterator.
auto begin() const -> typename std::array<byte, N>::const_iterator { return m_data.begin(); }
/// @returns end iterator.
auto end() const -> typename std::array<byte, N>::const_iterator { return m_data.end(); }
/// @returns a copy of the object's data as a byte vector.
bytes asBytes() const { return bytes(data(), data() + N); }
/// @returns a mutable reference to the object's data as an STL array.
std::array<byte, N>& asArray() { return m_data; }
/// @returns a constant reference to the object's data as an STL array.
std::array<byte, N> const& asArray() const { return m_data; }
/// Populate with random data.
template <class Engine>
void randomize(Engine& _eng)
{
for (auto& i: m_data)
i = (uint8_t)std::uniform_int_distribution<uint16_t>(0, 255)(_eng);
}
/// @returns a random valued object.
static FixedHash random() { FixedHash ret; ret.randomize(s_fixedHashEngine); return ret; }
struct hash
{
/// Make a hash of the object's data.
size_t operator()(FixedHash const& _value) const { return boost::hash_range(_value.m_data.cbegin(), _value.m_data.cend()); }
};
template <unsigned P, unsigned M> inline FixedHash& shiftBloom(FixedHash<M> const& _h)
{
return (*this |= _h.template bloomPart<P, N>());
}
template <unsigned P, unsigned M> inline bool containsBloom(FixedHash<M> const& _h)
{
return contains(_h.template bloomPart<P, N>());
}
template <unsigned P, unsigned M> inline FixedHash<M> bloomPart() const
{
unsigned const c_bloomBits = M * 8;
unsigned const c_mask = c_bloomBits - 1;
unsigned const c_bloomBytes = (StaticLog2<c_bloomBits>::result + 7) / 8;
static_assert((M & (M - 1)) == 0, "M must be power-of-two");
static_assert(P * c_bloomBytes <= N, "out of range");
FixedHash<M> ret;
byte const* p = data();
for (unsigned i = 0; i < P; ++i)
{
unsigned index = 0;
for (unsigned j = 0; j < c_bloomBytes; ++j, ++p)
index = (index << 8) | *p;
index &= c_mask;
ret[M - 1 - index / 8] |= (1 << (index % 8));
}
return ret;
}
/// Returns the index of the first bit set to one, or size() * 8 if no bits are set.
inline unsigned firstBitSet() const
{
unsigned ret = 0;
for (auto d: m_data)
if (d)
for (;; ++ret, d <<= 1)
if (d & 0x80)
return ret;
else {}
else
ret += 8;
return ret;
}
void clear() { m_data.fill(0); }
private:
std::array<byte, N> m_data; ///< The binary data.
};
可以看到h256类型,本质是一个字节数组,长度为32个字节
std::array<byte, N> m_data; ///< The binary data.
重载了运算符,封装了转换为16进制等相关的的函数
std::string hex() const { return toHex(ref()); }
Address 类型
Address m_author; 定义了挖掘出这个区块的作者地址。Address定义如下:
using Address = h160;
using h160 = FixedHash<20>;
和hash256类似,长度为20个字节,以太坊钱包公钥的地址刚好就是160位
比如 0xf3bb1c682c48a99f81e62b9a200eb4ccd6b22403,这个钱包地址,40位16进制串,40*4=160位
2.2 Block
区块(Block)是Ethereum的核心结构体之一。在整个区块链(BlockChain)中,一个个Block是以单向链表的形式相互关联起来的。Block中带有一个Header(指针), Header结构体带有Block的所有属性信息,其中的ParentHash 表示该区块的父区块哈希值, 亦即Block之间关联起来的前向指针.
Block的成员如下:
State m_state; ///< Our state tree, as an OverlayDB DB.
Transactions m_transactions; ///< The current list of transactions that we've included in the state.
Transaction(简称tx),是Ethereum里标示一次交易的结构体
using Transactions = std::vector<Transaction>;
TransactionReceipts m_receipts; ///< The corresponding list of transaction receipts.
using TransactionReceipts = std::vector<TransactionReceipt>;
State m_precommit; ///< State at the point immediately prior to rewards.
BlockHeader m_previousBlock; ///< The previous block's information.
BlockHeader m_currentBlock; ///< The current block's information.
bytes m_currentBytes; ///< The current block's bytes.
bool m_committedToSeal = false; ///< Have we committed to mine on the present m_currentBlock?
bytes m_currentTxs; ///< The RLP-encoded block of transactions.
bytes m_currentUncles; ///< The RLP-encoded block of uncles.
Address m_author; ///< Our address (i.e. the address to which fees go).
SealEngineFace* m_sealEngine = nullptr; ///< The chain's seal engine.