1. Introduction
- tick: a ledger entry that estimates wallclock duration.
- tick height: the Nth tick in the ledger.
- transactions entry: a set of transactions that can be executed in parallel.
- entry: an entry on the ledger, either a tick or a transactions entry.
/// Typed entry to distinguish between transaction and tick entries
pub enum EntryType<'a> {
Transactions(Vec<SanitizedTransaction<'a>>),
Tick(Hash),
}
// Entries without transactions are used to track real-time passing in the ledger and
// cannot be generated by `record()`
- entry id: the hash over the final contents of an entry; it is the globally unique identifier of that entry. The entry id hash can be used to prove:
1) that the entry was generated after a duration of time;
2) that the specified transactions are the ones included in the entry;
3) the entry's position relative to the other entries in the ledger.
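A minimal sketch of that hash chain, assuming the sha2 crate (hash_step, next_entry_id and Hash32 are illustrative names, not the Solana API):
use sha2::{Digest, Sha256};

type Hash32 = [u8; 32];

// One SHA-256 step: hash the previous hash, optionally mixing in extra data.
fn hash_step(prev: &Hash32, mixin: Option<&[u8]>) -> Hash32 {
    let mut hasher = Sha256::new();
    hasher.update(prev);
    if let Some(m) = mixin {
        hasher.update(m);
    }
    hasher.finalize().into()
}

// Advance the chain num_hashes steps from `start`; a transactions entry folds the hash of
// its transaction batch (the mixin) into the final step, a tick entry is a plain step.
fn next_entry_id(start: &Hash32, num_hashes: u64, mixin: Option<&[u8]>) -> Hash32 {
    let mut cur = *start;
    for _ in 0..num_hashes.saturating_sub(1) {
        cur = hash_step(&cur, None);
    }
    hash_step(&cur, mixin)
}
Because each entry id depends on the previous id and on how many hashes were spent, it simultaneously orders entries and bounds the time between them.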
In sdk/program/src/clock.rs, the network clock that ticks, slots, etc. are built from is defined:
// The default tick rate that the cluster attempts to achieve. Note that the actual tick
// rate at any given time should be expected to drift
pub const DEFAULT_TICKS_PER_SECOND: u64 = 160; // 160 ticks per second.
// At 160 ticks/s, 64 ticks per slot implies that leader rotation and voting will happen
// every 400 ms. A fast voting cadence ensures faster finality and convergence
pub const DEFAULT_TICKS_PER_SLOT: u64 = 64; // 64 ticks per slot.
// GCP n1-standard hardware and also a xeon e5-2520 v4 are about this rate of hashes/s
pub const DEFAULT_HASHES_PER_SECOND: u64 = 2_000_000; // the hardware can do roughly 2 million hashes per second.
#[cfg(test)]
static_assertions::const_assert_eq!(DEFAULT_HASHES_PER_TICK, 12_500);
pub const DEFAULT_HASHES_PER_TICK: u64 = DEFAULT_HASHES_PER_SECOND / DEFAULT_TICKS_PER_SECOND;
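A quick sanity check of the timing those constants imply (standalone sketch reusing the same values):
fn main() {
    const TICKS_PER_SECOND: u64 = 160;
    const TICKS_PER_SLOT: u64 = 64;
    const HASHES_PER_SECOND: u64 = 2_000_000;

    // 64 ticks * (1000 ms / 160 ticks per second) = 400 ms per slot
    assert_eq!(TICKS_PER_SLOT * 1000 / TICKS_PER_SECOND, 400);
    // 2_000_000 hashes/s / 160 ticks/s = 12_500 hashes per tick
    assert_eq!(HASHES_PER_SECOND / TICKS_PER_SECOND, 12_500);
}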
entry/src/entry.rs defines the fundamental building block of PoH. In the PoH sequence, num_hashes corresponds to the Index, hash to the Output Hash, and transactions is an unordered list of transactions.
pub struct PohEntry {
pub num_hashes: u64,
pub hash: Hash,
}
/// Each Entry contains three pieces of data. The `num_hashes` field is the number
/// of hashes performed since the previous entry. The `hash` field is the result
/// of hashing `hash` from the previous entry `num_hashes` times. The `transactions`
/// field points to Transactions that took place shortly before `hash` was generated.
///
/// If you divide `num_hashes` by the amount of time it takes to generate a new hash, you
/// get a duration estimate since the last Entry. Since processing power increases
/// over time, one should expect the duration `num_hashes` represents to decrease proportionally.
/// An upper bound on Duration can be estimated by assuming each hash was generated by the
/// world's fastest processor at the time the entry was recorded. Or said another way, it
/// is physically not possible for a shorter duration to have occurred if one assumes the
/// hash was computed by the world's fastest processor at that time. The hash chain is both
/// a Verifiable Delay Function (VDF) and a Proof of Work (not to be confused with Proof of
/// Work consensus!)
#[derive(Serialize, Deserialize, Debug, Default, PartialEq, Eq, Clone)]
pub struct Entry {
/// The number of hashes since the previous Entry ID.
pub num_hashes: u64,
/// The SHA-256 hash `num_hashes` after the previous Entry ID.
pub hash: Hash,
/// An unordered list of transactions that were observed before the Entry ID was
/// generated. They may have been observed before a previous Entry ID but were
/// pushed back into this list to ensure deterministic interpretation of the ledger.
pub transactions: Vec<Transaction>,
}
PoH uses target_poh_time to map tick_number and num_hashes to real (nanosecond-level) time. If the target time has not been reached yet, the caller keeps waiting (see the pacing sketch after the function):
pub fn target_poh_time(&self, target_ns_per_tick: u64) -> Instant {
assert!(self.hashes_per_tick > 0);
let offset_tick_ns = target_ns_per_tick * self.tick_number;
let offset_ns = target_ns_per_tick * self.num_hashes / self.hashes_per_tick;
self.slot_start_time + Duration::from_nanos(offset_ns + offset_tick_ns)
}
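At the default rates this works out to roughly 1 s / 160 ≈ 6.25 ms per tick. A sketch of how a caller might pace itself against the returned Instant (the busy-wait shape is an assumption for illustration, not the exact poh_service code):
use std::{hint, time::Instant};

// Busy-wait until the wall clock reaches the PoH-derived target time.
// The real service interleaves this with hash batches; this only shows the pacing idea.
fn wait_until(target_time: Instant) {
    while Instant::now() < target_time {
        hint::spin_loop();
    }
}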
// Number of hashes to batch together.
// * If this number is too small, PoH hash rate will suffer.
// * The larger this number is from 1, the speed of recording transactions will suffer due to lock
// contention with the PoH hashing within `tick_producer()`.
//
// Can use test_poh_service to calibrate this
pub const DEFAULT_HASHES_PER_BATCH: u64 = 64;
pub const DEFAULT_PINNED_CPU_CORE: usize = 0;
const TARGET_SLOT_ADJUSTMENT_NS: u64 = 50_000_000;
poh/src/poh_service.rs implements a service that records the passing of "ticks", a measure of time in the PoH stream.
struct PohTiming {
num_ticks: u64,
num_hashes: u64,
total_sleep_us: u64,
total_lock_time_ns: u64,
total_hash_time_ns: u64,
total_tick_time_ns: u64,
last_metric: Instant,
total_record_time_us: u64,
}
- PoH in Solana is defined by the following structures:
pub struct TransactionRecorder {
// shared by all users of PohRecorder
pub record_sender: CrossbeamSender<Record>,
pub is_exited: Arc<AtomicBool>,
}
// Some callers hold the recorder handle directly:
poh: &TransactionRecorder,
// while others hold the shared recorder and obtain a TransactionRecorder from it via recorder():
poh: &Arc<Mutex<PohRecorder>>,
pub fn recorder(&self) -> TransactionRecorder {
TransactionRecorder::new(self.record_sender.clone(), self.is_exited.clone())
}
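A simplified, standalone sketch of how these handles cooperate (std::sync::mpsc stands in for the crossbeam channel, and Record is only a placeholder shape): many TransactionRecorder clones send records to the single thread that owns the hash chain, mirroring the record_sender / is_exited pair shown above.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

// Placeholder for the real Record: the hash of a transaction batch (plus the batch itself).
struct Record {
    mixin: [u8; 32],
}

fn main() {
    let (record_sender, record_receiver) = mpsc::channel::<Record>();
    let is_exited = Arc::new(AtomicBool::new(false));

    // The "PohService" side: owns the chain, drains pending records between hash batches.
    let exit = is_exited.clone();
    let service = thread::spawn(move || {
        while !exit.load(Ordering::Relaxed) {
            // ...hash a batch here, then fold any pending records into the chain...
            while let Ok(record) = record_receiver.try_recv() {
                let _ = record.mixin;
            }
        }
    });

    // The "TransactionRecorder" side: a cheap handle that banking threads clone and send on.
    record_sender.send(Record { mixin: [0u8; 32] }).unwrap();

    is_exited.store(true, Ordering::Relaxed);
    service.join().unwrap();
}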
pub struct PohRecorder {
pub poh: Arc<Mutex<Poh>>,
tick_height: u64,
clear_bank_signal: Option<SyncSender<bool>>,
start_slot: Slot, // parent slot
start_tick_height: u64, // first tick_height this recorder will observe
tick_cache: Vec<(Entry, u64)>, // cache of entry and its tick_height
working_bank: Option<WorkingBank>,
sender: Sender<WorkingBankEntry>,
leader_first_tick_height: Option<u64>,
leader_last_tick_height: u64, // zero if none
grace_ticks: u64,
id: Pubkey,
blockstore: Arc<Blockstore>,
leader_schedule_cache: Arc<LeaderScheduleCache>,
poh_config: Arc<PohConfig>,
ticks_per_slot: u64,
target_ns_per_tick: u64,
record_lock_contention_us: u64,
flush_cache_no_tick_us: u64,
flush_cache_tick_us: u64,
prepare_send_us: u64,
send_us: u64,
tick_lock_contention_us: u64,
tick_overhead_us: u64,
total_sleep_us: u64,
record_us: u64,
ticks_from_record: u64,
last_metric: Instant,
record_sender: CrossbeamSender<Record>,
pub is_exited: Arc<AtomicBool>,
}
pub struct Poh {
    pub hash: Hash,           // current hash in the PoH chain
    num_hashes: u64,          // hashes performed since the last tick or record
    hashes_per_tick: u64,     // how many hashes make up one tick
    remaining_hashes: u64,    // hashes left before the next tick must be produced
    ticks_per_slot: u64,
    tick_number: u64,         // ticks produced since the slot started
    slot_start_time: Instant, // wall-clock start of the slot, used by target_poh_time
}
2. PoH generator (leader) election
See ledger/src/leader_schedule_utils.rs and ledger/src/leader_schedule.rs: the random seed is the epoch number, and the weights in the leader schedule come from the stake amounts. For the same seed and the same stakes, the generated slot_leaders are identical (a standalone determinism sketch follows the code below).
/// Return the leader schedule for the given epoch.
pub fn leader_schedule(epoch: Epoch, bank: &Bank) -> Option<LeaderSchedule> {
bank.epoch_staked_nodes(epoch).map(|stakes| {
let mut seed = [0u8; 32];
seed[0..8].copy_from_slice(&epoch.to_le_bytes());
let mut stakes: Vec<_> = stakes.into_iter().collect();
sort_stakes(&mut stakes);
LeaderSchedule::new(
&stakes,
seed,
bank.get_slots_in_epoch(epoch),
NUM_CONSECUTIVE_LEADER_SLOTS,
)
})
}
/// Stake-weighted leader schedule for one epoch.
#[derive(Debug, Default, PartialEq)]
pub struct LeaderSchedule {
slot_leaders: Vec<Pubkey>,
// Inverted index from pubkeys to indices where they are the leader.
index: HashMap<Pubkey, Arc<Vec<usize>>>,
}
// Note: passing in zero stakers will cause a panic.
pub fn new(ids_and_stakes: &[(Pubkey, u64)], seed: [u8; 32], len: u64, repeat: u64) -> Self {
let (ids, stakes): (Vec<_>, Vec<_>) = ids_and_stakes.iter().cloned().unzip();
let rng = &mut ChaChaRng::from_seed(seed);
let weighted_index = WeightedIndex::new(stakes).unwrap();
let mut current_node = Pubkey::default();
let slot_leaders = (0..len)
.map(|i| {
if i % repeat == 0 {
current_node = ids[weighted_index.sample(rng)];
}
current_node
})
.collect();
Self::new_from_schedule(slot_leaders)
}
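A standalone sketch of the determinism noted above, assuming the rand 0.8 and rand_chacha 0.3 crates (sample_leaders mirrors the weighted sampling in LeaderSchedule::new but returns indices instead of Pubkeys):
use rand::distributions::{Distribution, WeightedIndex};
use rand::SeedableRng;
use rand_chacha::ChaChaRng;

// Stake-weighted, repeat-aware sampling of leader indices, seeded by the epoch number.
fn sample_leaders(stakes: &[u64], epoch: u64, len: u64, repeat: u64) -> Vec<usize> {
    let mut seed = [0u8; 32];
    seed[0..8].copy_from_slice(&epoch.to_le_bytes());
    let rng = &mut ChaChaRng::from_seed(seed);
    let weighted = WeightedIndex::new(stakes.to_vec()).unwrap();
    let mut current = 0usize;
    (0..len)
        .map(|i| {
            if i % repeat == 0 {
                // pick the next leader with probability proportional to stake
                current = weighted.sample(rng);
            }
            current
        })
        .collect()
}

fn main() {
    let stakes = [100u64, 50, 25];
    assert_eq!(
        sample_leaders(&stakes, 7, 32, 4),
        sample_leaders(&stakes, 7, 32, 4)
    ); // same epoch seed + same stakes => identical slot leaders
}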
3. PohRecorder
poh/src/poh_recorder.rs implements the poh_recorder module:
- it provides an object for synchronizing with Proof of History
- it synchronizes PoH
- it synchronizes the bank's register_tick
- it synchronizes the ledger
If the current range of ticks falls within the range of the specified WorkingBank, PohRecorder sends ticks or entries to that WorkingBank (see the range-check sketch at the end of this section). Using PoH, PohRecorder keeps two data structures in sync:
- bank - the LastId's queue is updated on tick and record events
- sender - the Entry channel that outputs to the ledger
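A tiny sketch of that range check (min_tick_height and max_tick_height mirror the fields of the real WorkingBank; the exact boundary handling here is illustrative):
struct WorkingBank {
    min_tick_height: u64,
    max_tick_height: u64,
}

// Only ticks/entries whose tick height falls inside the bank's range are flushed to it.
fn in_working_bank_range(tick_height: u64, wb: &WorkingBank) -> bool {
    tick_height > wb.min_tick_height && tick_height <= wb.max_tick_height
}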
4. Separation of time and state
This separation of time and state shows up as follows:
- transaction_status_sender: an unbounded crossbeam_channel sender. The corresponding receiving path is write_transaction_status_batch() -> write_transaction_status(). From startup, the validator keeps receiving transaction status information:
let transaction_status_service = Some(TransactionStatusService::new(
transaction_status_receiver,
max_complete_transaction_status_slot.clone(),
blockstore.clone(),
exit,
));
The corresponding sending side is:
transaction_status_sender.send_transaction_status_batch(
bank.clone(),
txs,
tx_results.execution_results,
TransactionBalancesSet::new(pre_balances, post_balances),
TransactionTokenBalancesSet::new(pre_token_balances, post_token_balances),
inner_instructions,
transaction_logs,
tx_results.rent_debits,
);
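A minimal sketch of that channel wiring, assuming the crossbeam-channel crate (StatusBatch is a placeholder for the real batch type):
use crossbeam_channel::unbounded;

// Placeholder for the batch built by send_transaction_status_batch().
struct StatusBatch;

fn main() {
    // unbounded: the banking/replay side never blocks on a full queue
    let (transaction_status_sender, transaction_status_receiver) = unbounded::<StatusBatch>();
    transaction_status_sender.send(StatusBatch).unwrap();
    assert!(transaction_status_receiver.recv().is_ok());
}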
In the TVU's replay_stage, replay_blockstore_into_bank -> confirm_slot runs:
- verify_ticks: verify that a segment of entries has the correct number of ticks and hashes
- start_verify: verifies the hashes and counts of a slice of transactions are all consistent, i.e. it verifies the PoH sequence (see the sketch after this list)
- verify_and_hash_transactions: check transaction validity and hash the transactions
- process_entries_with_callback: execute the entries in batches
- finish_verify: complete the verification started by start_verify
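The core of that PoH check can be sketched by replaying the chain with the next_entry_id helper from section 1 (verify_chain and SketchEntry are illustrative names): every entry's hash must be reproducible from the previous entry id, its num_hashes, and, for transactions entries, the hash of its transactions.
// (hash, num_hashes, Some(tx_mixin) for a transactions entry, None for a tick)
type SketchEntry = ([u8; 32], u64, Option<[u8; 32]>);

fn verify_chain(mut prev_id: [u8; 32], entries: &[SketchEntry]) -> bool {
    for (hash, num_hashes, mixin) in entries {
        let expected = next_entry_id(&prev_id, *num_hashes, mixin.as_ref().map(|m| &m[..]));
        if expected != *hash {
            return false; // chain broken: this entry does not follow from the previous id
        }
        prev_id = *hash;
    }
    true
}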
runtime/src/bank_forks.rs implements BankForks, a DAG (directed acyclic graph) of checkpointed Banks.
/// After setting a new root, prune the banks that are no longer on rooted paths
///
/// Given the following banks and slots...
///
/// ```text
/// slot 6 * (G)
/// /
/// slot 5 (F) * /
/// | /
/// slot 4 (E) * | /
/// | |/
/// slot 3 | * (D) <-- root, from set_root()
/// | |
/// slot 2 (C) * |
/// \ |
/// slot 1 \ * (B)
/// \ |
/// slot 0 * (A) <-- highest confirmed root [1]
/// ```
///
/// ...where (D) is set as root, clean up (C) and (E), since they are not rooted.
///
/// (A) is kept because it is greater-than-or-equal-to the highest confirmed root, and (D) is
/// one of its descendants
/// (B) is kept for the same reason as (A)
/// (C) is pruned since it is a lower slot than (D), but (D) is _not_ one of its descendants
/// (D) is kept since it is the root
/// (E) is pruned since it is not a descendant of (D)
/// (F) is kept since it is a descendant of (D)
/// (G) is kept for the same reason as (F)
///
/// and in table form...
///
/// ```text
/// | | is root a | is a descendant ||
/// slot | is root? | descendant? | of root? || keep?
/// ------+----------+-------------+-----------------++-------
/// (A) | N | Y | N || Y
/// (B) | N | Y | N || Y
/// (C) | N | N | N || N
/// (D) | Y | N | N || Y
/// (E) | N | N | N || N
/// (F) | N | N | Y || Y
/// (G) | N | N | Y || Y
/// ```
///
/// [1] RPC has the concept of commitment level, which is based on the highest confirmed root,
/// i.e. the cluster-confirmed root. This commitment is stronger than the local node's root.
/// So (A) and (B) are kept to facilitate RPC at different commitment levels. Everything below
/// the highest confirmed root can be pruned.
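The keep/prune rule in the table above can be written as a small predicate (a sketch; the real code derives these relationships by walking the bank DAG rather than taking booleans):
// "is root a descendant?" == this slot is an ancestor of the root
// "is a descendant of root?" == this slot descends from the root
fn keep_bank(
    is_root: bool,
    is_ancestor_of_root: bool,
    is_descendant_of_root: bool,
    slot: u64,
    highest_confirmed_root: u64,
) -> bool {
    is_root
        || is_descendant_of_root
        || (is_ancestor_of_root && slot >= highest_confirmed_root)
}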
Vote lockouts grow exponentially: each extra confirmation stacked on a vote doubles the number of slots for which that vote locks the validator out of other forks:
pub const INITIAL_LOCKOUT: usize = 2;
// The number of slots for which this vote is locked
pub fn lockout(&self) -> u64 {
(INITIAL_LOCKOUT as u64).pow(self.confirmation_count)
}
// The last slot at which a vote is still locked out. Validators should not
// vote on a slot in another fork which is less than or equal to this slot
// to avoid having their stake slashed.
pub fn last_locked_out_slot(&self) -> Slot {
self.slot + self.lockout()
}
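A worked example of the doubling (same formula as lockout() and last_locked_out_slot() above):
fn main() {
    const INITIAL_LOCKOUT: u64 = 2;
    let (slot, confirmation_count) = (100u64, 3u32);
    let lockout = INITIAL_LOCKOUT.pow(confirmation_count); // 2^3 = 8 slots
    assert_eq!(lockout, 8);
    assert_eq!(slot + lockout, 108); // last_locked_out_slot: locked out through slot 108
}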
pub const VOTE_THRESHOLD_DEPTH: usize = 8;
pub const SWITCH_FORK_THRESHOLD: f64 = 0.38;
pub const VOTE_THRESHOLD_SIZE: f64 = 2f64 / 3f64;
pub fn calculate_highest_confirmed_slot(&self) -> Slot {
self.highest_slot_with_confirmation_count(1)
}
pub fn get_confirmation_count(&self, slot: Slot) -> Option<usize> {
self.get_lockout_count(slot, VOTE_THRESHOLD_SIZE)
}
fn highest_slot_with_confirmation_count(&self, confirmation_count: usize) -> Slot {
assert!(confirmation_count > 0 && confirmation_count <= MAX_LOCKOUT_HISTORY);
for slot in (self.root()..self.slot()).rev() {
if let Some(count) = self.get_confirmation_count(slot) {
if count >= confirmation_count {
return slot;
}
}
}
self.commitment_slots.root
}
For example, querying rent for an account holding 15000 bytes of data:
$ solana rent 15000
Rent per byte-year: 0.00000348 SOL
Rent per epoch: 0.000288276 SOL
Rent-exempt minimum: 0.10529088 SOL
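Those numbers can be reproduced from the default rent parameters (a sketch assuming 3,480 lamports per byte-year, 128 bytes of account storage overhead, and a 2-year exemption threshold):
fn main() {
    let data_len: u64 = 15_000;
    let lamports_per_byte_year: u64 = 3_480; // == 0.00000348 SOL per byte-year
    let storage_overhead: u64 = 128;         // account metadata bytes that are also charged

    let lamports_per_year = (data_len + storage_overhead) * lamports_per_byte_year;
    let rent_exempt_minimum = lamports_per_year * 2; // two years of rent, held up front

    assert_eq!(lamports_per_year, 52_645_440);    // 0.05264544 SOL per year
    assert_eq!(rent_exempt_minimum, 105_290_880); // 0.10529088 SOL, matching the CLI output
}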