Postgres-XL Research Notes

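Origins

Behind this project is a company called StormDB. The whole codebase is based on Postgres-XC. The open-source version is presumably a branch of StormDB's product.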

In 2010, NTT's Open Source Software Center approached EnterpriseDB to
build off of NTT OSSC's experience with a project called RitaDB and
EnterpriseDB's experience with a project called GridSQL, and the
result was a new project, Postgres-XC.

In 2012, a company called StormDB was formed with some of the original
key Postgres-XC developers. StormDB added enhancements, including MPP
parallelism for performance and multi-tenant security.

In 2013, TransLattice acquired StormDB, and in 2014, open sourced it
as Postgres-XL.

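Personal impressions

These notes are purely my own understanding and may not be correct. Apologies for any misreadings.

  • The overall code quality is good. Most changes are commented and the comments read well. A few comments are out of date, but that does not get in the way of understanding the code. Every change made inside the PostgreSQL code is cleanly isolated with #ifdef, so in theory keeping up with PostgreSQL upgrades should not be a big problem.
  • Postgres-XC modified some PostgreSQL code. Postgres-XL changed some of that back and then added a lot of code of its own. Be careful to distinguish #ifdef from #ifndef.
  • Postgres-XC's principle is to push down to the datanodes whatever can be pushed down, and to gather all the data on the coordinator and process it there when a query cannot be pushed down. Postgres-XL, in contrast, does MPP.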

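Distributed architecture

The official Postgres-XL home page pulls in some resources from Google APIs, so it is sometimes slow to load. Note that the site lists OLAP ahead of OLTP.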

Features

Fully ACID
Open Source
Cluster-wide Consistency
Multi-tenant Security
PostgreSQL-based

Workloads:

OLAP with MPP Parallelism
Online Transaction Processing
Mixed
Operational Data Store
Key-value including JSON

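First, read the official overview carefully. It outlines the overall state of the system. Note that in this architecture both datanodes and coordinators can be deployed in multiples, while there is only one GTM (Global Transaction Manager). The figure shows two because one of them is a standby.
Figure: Overview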

Relationship to Postgres-XC

The official answer to this question is:

Q. How does Postgres-XL relate to Postgres-XC and Stado?
The project includes architects and developers who previously worked
on both Postgres-XC and Stado, and Postgres-XL contains code from
Postgres-XC. The Postgres-XL project has its own philosophy and
approach. Postgres-XL values stability, correctness and performance
over new functionality. The Postgres-XL project ultimately strives to
track and merge in code from PostgreSQL. Postgres-XL adds some
significant performance improvements like MPP parallelism and replan
avoidance on the data nodes that are not part of Postgres-XC.
Postgres-XC currently focuses on OLTP workloads. Postgres-XL is more
flexible in terms of the types of workloads it can handle including
Big Data processing thanks to its parallelism. Additionally,
Postgres-XL is more secure for multi-tenant environments. The
Postgres-XL community is also very open and welcoming to those who
wish to become more involved and contribute, whether on the mailing
lists, participating in developer meetings, or meeting in person.
Users will help drive development priorities and the project roadmap.

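In fact, the Postgres-XL src tree contains a directory named pgxc. Because the code is based on Postgres-XC, a large share of the comments and code come from Postgres-XC.

The biggest difference between XL and XC is this: XC's logic is to push a SQL statement down to the datanodes if it can be executed there, and otherwise to read all the data up to the coordinator and process it in one place. XL, in contrast, is MPP in the true sense.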
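
To make the contrast concrete, here is a toy sketch in C of the two strategies for a join whose key is not the distribution key. It is not taken from either code base. It assumes that "MPP" here means datanodes can redistribute rows among themselves and join locally, and every name in it (Fragment, join_on_coordinator, join_mpp, the hash rule key % NODES) is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define NODES 3      /* number of toy datanodes (assumption) */
#define MAX_ROWS 16  /* capacity of one fragment (assumption) */

/* The join keys of the rows one datanode holds for one table. */
typedef struct {
    int keys[MAX_ROWS];
    int n;
} Fragment;

void add_row(Fragment *f, int key)
{
    assert(key >= 0 && f->n < MAX_ROWS);
    f->keys[f->n++] = key;
}

/* Nested-loop join that only counts the matching pairs. */
int join_count(const Fragment *a, const Fragment *b)
{
    int count = 0;
    for (int i = 0; i < a->n; i++)
        for (int j = 0; j < b->n; j++)
            if (a->keys[i] == b->keys[j])
                count++;
    return count;
}

/* Centralized plan: ship every row to the coordinator and join there. */
int join_on_coordinator(const Fragment a[NODES], const Fragment b[NODES])
{
    Fragment all_a, all_b;
    memset(&all_a, 0, sizeof all_a);
    memset(&all_b, 0, sizeof all_b);
    for (int node = 0; node < NODES; node++) {
        for (int i = 0; i < a[node].n; i++)
            add_row(&all_a, a[node].keys[i]);
        for (int i = 0; i < b[node].n; i++)
            add_row(&all_b, b[node].keys[i]);
    }
    return join_count(&all_a, &all_b);
}

/* MPP-style plan: redistribute rows by join key so that matching rows meet
 * on the same datanode, join locally, then sum the partial counts. */
int join_mpp(const Fragment a[NODES], const Fragment b[NODES])
{
    Fragment ra[NODES], rb[NODES];
    memset(ra, 0, sizeof ra);
    memset(rb, 0, sizeof rb);
    for (int node = 0; node < NODES; node++) {
        for (int i = 0; i < a[node].n; i++)
            add_row(&ra[a[node].keys[i] % NODES], a[node].keys[i]);
        for (int i = 0; i < b[node].n; i++)
            add_row(&rb[b[node].keys[i] % NODES], b[node].keys[i]);
    }
    int total = 0;
    for (int node = 0; node < NODES; node++)
        total += join_count(&ra[node], &rb[node]);
    return total;
}
```

Both functions return the same count for the same input. The difference is where the work happens: in the first, the coordinator sees every row, while in the second it only adds up one number per datanode.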

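How the code changes are made

Relative to PostgreSQL, the basic approach in Postgres-XL is to modify as little code as possible. Some core components had to be adjusted, but most of the code stays the same, and newly added files are placed in new locations.
One thing they did well is that every changed spot is wrapped in #ifdef.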

#ifdef PGXC        /* changes made by Postgres-XC */
#ifndef XCP        /* Postgres-XL's changes on top of XC */
....
#endif
#endif
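
The nesting above can be tried out in a small, compilable sketch. It is only an illustration of the preprocessor pattern, not code from the Postgres-XL tree. It assumes that an XL build defines both PGXC and XCP, and that the #ifndef XCP branch holds XC code that XL switches off. The enum and function names are hypothetical.

```c
#include <assert.h>

/* Assumption: an XL build defines both macros. Remove XCP to select the
 * XC-only branch, or remove both to select the stock PostgreSQL branch. */
#define PGXC
#define XCP

enum code_variant { VARIANT_STOCK_PG, VARIANT_XC, VARIANT_XL };

/* Report which branch the preprocessor kept for this build. */
enum code_variant active_variant(void)
{
#ifdef PGXC
#ifndef XCP
    return VARIANT_XC;        /* XC code that an XL build compiles out */
#else
    return VARIANT_XL;        /* the XL replacement */
#endif
#else
    return VARIANT_STOCK_PG;  /* untouched PostgreSQL code */
#endif
}

const char *variant_name(enum code_variant v)
{
    static const char *const names[] = {
        "stock PostgreSQL", "Postgres-XC", "Postgres-XL"
    };
    assert((int) v >= 0 && (int) v <= (int) VARIANT_XL);
    return names[v];
}
```

Because the inner test is #ifndef rather than #ifdef, it is easy to misread which branch is live, which is why the two directives are worth telling apart when reading the source.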

GTM

GTM stands for Global Transaction Manager. It provides global
transaction ID and snapshot to each transaction in Postgres-XL
database cluster. It also provide several global value such as
sequence and global timestamp.

GTM itself can be configured as a backup of other GTM as GTM-Standby
so that GTM can continue to run even if main GTM fails. You may want
to install GTM-Standby to separate server.

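Judging from the code (src/gtm), the main job of this part is global transaction management: handing out global_txn_id, timestamps, and so on. Because GTM is a single point of failure, the standby-related code also lives in this part.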
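
As a rough mental model of that role, the sketch below shows a single allocator that every node would consult, so that transaction ids and sequence values are unique across the cluster. It is a toy under stated assumptions, not the GTM API: the ToyGtm type and the toy_gtm_* functions are hypothetical, and networking, persistence, xid wraparound, and the standby are all left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyGxid;  /* hypothetical global transaction id type */

typedef struct {
    ToyGxid next_gxid;     /* next global transaction id to hand out */
    int64_t next_seq_val;  /* next value of one global sequence */
} ToyGtm;

void toy_gtm_init(ToyGtm *gtm, ToyGxid first_gxid, int64_t first_seq_val)
{
    assert(gtm != NULL);
    gtm->next_gxid = first_gxid;
    gtm->next_seq_val = first_seq_val;
}

/* Every coordinator and datanode asks the same allocator, so no two
 * transactions in the cluster receive the same id. */
ToyGxid toy_gtm_begin(ToyGtm *gtm)
{
    assert(gtm != NULL);
    return gtm->next_gxid++;
}

/* Global sequences work the same way: one counter, one owner. */
int64_t toy_gtm_nextval(ToyGtm *gtm)
{
    assert(gtm != NULL);
    return gtm->next_seq_val++;
}
```

The single counter is what makes the ids globally ordered, and it is also why the component is a single point of failure that needs a standby.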

snapshot
/*
 * Get snapshot for the given transactions. If this is the first call in the
 * transaction, a fresh snapshot is taken and returned back. For a serializable
 * transaction, repeated calls to the function will return the same snapshot.
 * For a read-committed transaction, fresh snapshot is taken every time and
 * returned to the caller.
 *
 * The returned snapshot includes xmin (lowest still-running xact ID),
 * xmax (highest completed xact ID + 1), and a list of running xact IDs
 * in the range xmin <= xid < xmax.  It is used as follows:
 *		All xact IDs < xmin are considered finished.
 *		All xact IDs >= xmax are considered still running.
 *		For an xact ID xmin <= xid < xmax, consult list to see whether
 *		it is considered running or not.
 * This ensures that the set of transactions seen as "running" by the
 * current xact will not change after it takes the snapshot.
 *
 * All running top-level XIDs are included in the snapshot.
 *
 * We also update the following global variables:
 *		RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
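
The xmin/xmax rule described in the comment above can be sketched as a small C function. This is a simplified illustration, not the Postgres-XL implementation: ToySnapshot and toy_xid_is_running are hypothetical names, and the plain integer comparisons ignore transaction id wraparound, which real PostgreSQL has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;  /* hypothetical transaction id type */

typedef struct {
    ToyXid xmin;            /* lowest still-running xact ID */
    ToyXid xmax;            /* highest completed xact ID + 1 */
    const ToyXid *running;  /* running xact IDs with xmin <= xid < xmax */
    size_t n_running;
} ToySnapshot;

/* Return true if the snapshot considers xid to be still running. */
bool toy_xid_is_running(const ToySnapshot *snap, ToyXid xid)
{
    assert(snap != NULL && snap->xmin <= snap->xmax);

    if (xid < snap->xmin)
        return false;   /* all xact IDs < xmin are considered finished */
    if (xid >= snap->xmax)
        return true;    /* all xact IDs >= xmax are considered running */

    /* In between, consult the list captured when the snapshot was taken. */
    for (size_t i = 0; i < snap->n_running; i++)
        if (snap->running[i] == xid)
            return true;
    return false;
}
```

Because the answer depends only on the values stored in the snapshot, later commits cannot change what the transaction sees as running, which is the guarantee the comment describes.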
 *			