Reading Ingestion-Spanner: Google’s Globally-Distributed Database

pourtheworld

Category: Paper summaries. Tags: distributed systems, Spanner, distributed transactions, concurrency control

Article link: https://blog.csdn.net/pourtheworld/article/details/103150190

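Google Spanner is a globally distributed database that manages data replicated across datacenters and automatically handles data sharding and load balancing. Spanner provides externally consistent reads and writes and globally consistent reads at a timestamp, and it supports transactions and SQL queries. Its core techniques include the TrueTime API, Paxos-based replica management, and lock-table-based concurrency control. The article covers Spanner's implementation, data model, the TrueTime time system, concurrency-control strategy, and the details of read-write transactions.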

0. Essence

(1) Spanserver software stack (2.1 Spanserver Software Stack): consists of the file system (Colossus), the storage data structure (tablet), replication (Paxos state machine), concurrency control (lock table), and distributed transactions (transaction manager).
(2) Data Model (2.3 Data Model): INTERLEAVE IN declares the table hierarchy, and ON DELETE CASCADE deletes a parent row's child rows along with it. Interleaving tables to form directories is significant because it allows clients to describe the locality relationships that exist between multiple tables.

(3) TrueTime (3 TrueTime): The underlying time references used by TrueTime are GPS and atomic clocks. TrueTime explicitly represents time as a TTinterval, an interval with bounded time uncertainty.
(4) Leader Leases (4.1.1 Paxos Leader Leases): the proof that each Paxos leader's lease interval is disjoint from every other leader's.
(5) RW transactions: the monotonicity invariant, the external-consistency invariant, and two-phase locking (4.1.2 Assigning Timestamps to RW Transactions, Read-Write Transactions).

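(6) t(safe), t(Paxos,safe), t(TM,safe) (4.1.3 Serving Reads at a Timestamp):
Every replica tracks a value called safe time, t(safe), which is the maximum timestamp at which the replica is up-to-date. A replica can serve a read at timestamp t if t <= t(safe), where t(safe) = min(t(Paxos,safe), t(TM,safe)).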

(7) Choosing s(read) for a read-only transaction: LastTS() if the read is served by a single Paxos group, TT.now().latest if it spans multiple groups
(4.1.4 Assigning Timestamps to RO Transactions, 4.2.2 Read-Only Transactions).
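
To make item (6) concrete, here is a minimal sketch of the safe-time bookkeeping as I understand it from Section 4.1.3 of the paper: t(Paxos,safe) is the timestamp of the highest applied Paxos write, and t(TM,safe) is infinity when no transaction is prepared but uncommitted, otherwise one less than the smallest prepare timestamp. The class name, field names, and integer timestamps are my own assumptions for illustration, not Spanner's actual code.

```python
import math


class ReplicaSafeTime:
    """Toy model of one replica's safe-time state (all names are hypothetical)."""

    def __init__(self):
        # Timestamp of the highest Paxos write applied at this replica.
        self.last_applied_paxos_ts = 0
        # Transactions that are prepared but not yet committed: txn id -> prepare timestamp.
        self.prepared = {}

    def t_paxos_safe(self):
        return self.last_applied_paxos_ts

    def t_tm_safe(self):
        # With no prepared transactions, the transaction manager imposes no bound.
        if not self.prepared:
            return math.inf
        # A prepared transaction may still commit at or after its prepare timestamp.
        return min(self.prepared.values()) - 1

    def t_safe(self):
        return min(self.t_paxos_safe(), self.t_tm_safe())

    def can_serve_read(self, t):
        # A read at timestamp t is allowed only if the replica is up-to-date at t.
        return t <= self.t_safe()


replica = ReplicaSafeTime()
replica.last_applied_paxos_ts = 100
safe_without_prepared = replica.t_safe()      # bounded only by Paxos

replica.prepared["txn-7"] = 90                # a 2PC participant prepared at 90
safe_with_prepared = replica.t_safe()         # now bounded by the prepared txn
```

With a transaction prepared at timestamp 90, the replica can no longer serve reads at 95 even though Paxos writes up to 100 have been applied, because the prepared transaction might still commit with a timestamp in that range.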

1. Introduction

Highest level of abstraction

Spanner is a database that shards data across many sets of Paxos state machines in datacenters spread all over the world.

Spanner’s main focus
Managing cross-datacenter replicated data. Spanner automatically reshards data across machines as the amount of data or the number of servers changes, and it automatically migrates data across machines (even across datacenters) to balance load and in response to failures.

Spanner’s database features on top of its distributed-systems infrastructure

Spanner has evolved from a Bigtable-like versioned key-value store into a temporal multi-version database;
Data is stored in schematized semi-relational tables;
Data is versioned, and each version is automatically timestamped with its commit time; old versions of data are subject to configurable garbage-collection policies; and applications can read data at old timestamps;
Spanner supports general-purpose transactions, and provides a SQL-based query language.
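
The "temporal multi-version" idea in the list above can be illustrated with a toy in-memory store: every write is kept under its commit timestamp, reads can target an old timestamp, and a garbage-collection pass drops versions that are too old. This is only a sketch under my own assumptions (the `VersionedStore` class, integer timestamps, and the GC rule are hypothetical); it says nothing about how Spanner actually lays out tablets.

```python
import bisect


class VersionedStore:
    """Toy multi-version key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._ts = {}    # key -> sorted list of commit timestamps
        self._vals = {}  # key -> list of values, parallel to self._ts[key]

    def write(self, key, value, commit_ts):
        ts = self._ts.setdefault(key, [])
        vals = self._vals.setdefault(key, [])
        i = bisect.bisect_right(ts, commit_ts)
        ts.insert(i, commit_ts)
        vals.insert(i, value)

    def read(self, key, at_ts):
        """Return the newest value committed at or before at_ts, or None."""
        ts = self._ts.get(key, [])
        i = bisect.bisect_right(ts, at_ts)
        return self._vals[key][i - 1] if i > 0 else None

    def garbage_collect(self, horizon_ts):
        """Drop versions no read at or after horizon_ts can see.

        The newest version at or before horizon_ts is kept so that a read
        exactly at the horizon still returns a value.
        """
        for key, ts in self._ts.items():
            i = bisect.bisect_right(ts, horizon_ts)
            drop = max(i - 1, 0)
            del ts[:drop]
            del self._vals[key][:drop]


store = VersionedStore()
store.write("user:1", "alice@old.example", commit_ts=10)
store.write("user:1", "alice@new.example", commit_ts=20)

old_value = store.read("user:1", at_ts=15)   # sees the version committed at 10
new_value = store.read("user:1", at_ts=25)   # sees the version committed at 20
```

After `store.garbage_collect(20)`, a read at timestamp 15 finds nothing, which mirrors the point that old versions are only readable until the garbage-collection policy removes them.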

Spanner’s features as a globally-distributed database

(1) First, the replication configurations for data can be dynamically controlled at a fine grain by applications. Applications can specify constraints to control which datacenters contain which data, how far data is from its users (to control read latency), how far replicas are from each other (to control write latency), and how many replicas are maintained (to control durability, availability, and read performance). Data can also be dynamically and transparently moved between datacenters by the system to balance resource usage across datacenters.

(2) Second, Spanner has two features that are difficult to implement in a distributed database: it provides externally consistent reads and writes, and globally-consistent reads across the database at a timestamp. These features enable Spanner to support consistent backups, consistent MapReduce executions, and atomic schema updates, all at global scale, and even in the presence of ongoing transactions.

(2)’ External Consistency, as defined in [16], Information Storage in a Decentralized Computer System:
External consistency guarantees that a transaction will always receive current information. Using the concepts we have just introduced, we can provide a formal definition of external consistency. The actual time order in which transactions complete defines a unique serial schedule. This serial schedule is called the external schedule. A system is said to provide external consistency if it guarantees that the schedule it will use to process a set of transactions is equivalent to its external schedule.

(3) Spanner’s Globally-meaningful commit timestamps

The two features above are enabled by the fact that Spanner assigns globally-meaningful commit timestamps to transactions, even though transactions may be distributed;
The timestamps reflect serialization order;
In addition, the serialization order satisfies external consistency (or equivalently, linearizability): if a transaction T1 commits before another transaction T2 starts, then T1’s commit timestamp is smaller than T2’s. Spanner is the first system to provide such guarantees at global scale.
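
The external-consistency condition can be written as a small check over a transaction history: for every pair where T1's commit finished (in real time) before T2 started, T1's timestamp must be smaller. The `Txn` record and the `violations` helper below are hypothetical names of mine; the sketch only tests the property and is not how Spanner enforces it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    name: str
    start: float     # real time at which the transaction started
    commit: float    # real time at which its commit completed
    timestamp: int   # commit timestamp assigned by the database


def violations(history):
    """Return (t1, t2) name pairs where t1 committed before t2 started
    but t1's commit timestamp is not smaller than t2's."""
    return [
        (a.name, b.name)
        for a in history
        for b in history
        if a.commit < b.start and not a.timestamp < b.timestamp
    ]


# T1 finishes before T2 starts, and the timestamps agree with that order.
good = [Txn("T1", start=0.0, commit=1.0, timestamp=5),
        Txn("T2", start=2.0, commit=3.0, timestamp=9)]

# Same real-time order, but T2 was given an earlier timestamp than T1.
bad = [Txn("T1", start=0.0, commit=1.0, timestamp=9),
       Txn("T2", start=2.0, commit=3.0, timestamp=5)]

good_result = violations(good)
bad_result = violations(bad)
```

Transactions that overlap in real time are never flagged, since the condition places no constraint on the relative order of their timestamps.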

(3)’ Spanner’s new TrueTime API

The API directly exposes clock uncertainty, and the guarantees on Spanner’s timestamps depend on the bounds that the implementation provides;
If the uncertainty is large, Spanner slows down to wait out that uncertainty. This implementation keeps uncertainty small (generally less than 10ms) by using multiple modern clock references (GPS and atomic clocks).
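
"Waiting out the uncertainty" is easier to see in code. The sketch below simulates the interface the paper describes in Section 3 (TT.now() returns an interval, TT.after(t) and TT.before(t) test against it) and the commit-wait idea from Section 4: pick a timestamp no earlier than TT.now().latest, then block until TT.after(s) holds. The fixed `epsilon`, the use of `time.monotonic` as the clock, and the polling loop are all simplifying assumptions of mine; the real system derives its error bound from GPS and atomic-clock time masters.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class TrueTime:
    """Simulated TrueTime: a local clock plus an assumed fixed error bound."""

    def __init__(self, epsilon=0.005, clock=time.monotonic):
        self.epsilon = epsilon   # assumed uncertainty in seconds
        self._clock = clock

    def now(self):
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        """True if t has definitely passed."""
        return self.now().earliest > t

    def before(self, t):
        """True if t has definitely not arrived yet."""
        return self.now().latest < t


def commit_wait(tt, sleep=time.sleep):
    """Choose a commit timestamp and wait until it is definitely in the past."""
    s = tt.now().latest
    while not tt.after(s):
        sleep(tt.epsilon / 4)
    return s


tt = TrueTime(epsilon=0.002)
began = time.monotonic()
s = commit_wait(tt)
waited = time.monotonic() - began   # roughly 2 * epsilon or a little more
```

Because `s` sits at the upper edge of the interval and the wait ends only once the lower edge has passed it, the wait is about twice the uncertainty, which is why a large uncertainty directly slows commits down.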

Paper outline

(1) Section 2: describes the structure of Spanner’s implementation, its feature set, and the engineering decisions that went into their design.

(2) Section 3: describes our new TrueTime API and sketches its implementation.

(3) Section 4: describes how Spanner uses TrueTime to implement externally-consistent distributed transactions, lock-free read-only transactions, and atomic schema updates.

(4) Section 5: provides some benchmarks on Spanner’s performance and TrueTime behavior, and discusses the experiences of F1.

(5) Sections 6, 7, and 8: describe related and future work, and summarize our conclusions.

2. Implementation

This section describes the structure of and rationale underlying Spanner’s implementation;
It then describes the directory abstraction, which is used to manage replication and locality, and is the unit of data movement;
Finally, it describes our data model, why Spanner looks like a relational database instead of a key-value store, and how applications can control data locality.

Universe

A Spanner deployment is called a universe.
We currently run a test/playground universe, a development/production universe, and a production-only universe.

Zone

(1) Spanner is organized as a set of zones;
(2) Zones are the unit of administrative deployment;
(3) The set of zones is also the set of locations across which data can be replicated, i.e. replicas of a piece of data are placed in zones;
(4) Zones can be added to or removed from a running system as new datacenters are brought into service and old ones are turned off, respectively;
(5) Zones are also the unit of physical isolation: there may be one or more zones in a datacenter, for example, if different applications’ data must be partitioned across different sets