PostgreSQL source code analysis, part 3: two-phase commit in practice and in the source

  Start a transaction, then trace its backend with GDB, with a breakpoint on PrepareTransaction.

postgres=# begin;
BEGIN
postgres=*# select pg_backend_pid();
 pg_backend_pid 
----------------
         226252
(1 row)

postgres=*# insert into wp_shy values(11,'redacancy');
INSERT 0 1
postgres=*# prepare transaction 'prepare_12';

PrepareTransaction() executes as follows:
1 Get the transaction ID from the current transaction; here xid = 779.

(gdb) p xid
$21 = 779

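2 Check the current transaction state. If it is not TRANS_INPROGRESS, a warning is emitted, because a transaction in any other state cannot be prepared.
3 Fire all deferred triggers and close open portals.
4 Close large objects and run the serialization-failure check for serializable transactions.
5 Set the transaction state to TRANS_PREPARE.
6 Tell the buffer manager and the storage manager to get ready for commit.
7 Take a gxact slot from the free list TwoPhaseState->freeGXacts and fill in its key identifiers such as xid and gid. TwoPhaseStateLock is held in LW_EXCLUSIVE mode while this happens.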

(gdb) p *gxact 
$20 = {next = 0x7f7c7f8f71e0, pgprocno = 1034, dummyBackendId = 1030, prepared_at = 706697115158785, 
  prepare_start_lsn = 0, prepare_end_lsn = 0, xid = 779, owner = 10, locking_backend = 3, valid = false, ondisk = false, 
  inredo = false, gid = "prepare_12", '\000' <repeats 189 times>}
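
Step 7 can be pictured with a small standalone sketch. It is not the PostgreSQL code: DemoGXact, DemoTwoPhaseState, demo_init and demo_mark_as_preparing are hypothetical names, the buffer size of 200 is inferred from the gdb output above, and locking is left out. It models only popping a slot off the free list and filling in xid and gid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_GIDSIZE 200		/* assumed GID buffer size for this sketch */
#define DEMO_MAX_PREPARED 2		/* assumed number of slots */

typedef struct DemoGXact
{
	struct DemoGXact *next;		/* link in the free list */
	unsigned int xid;			/* transaction ID being prepared */
	bool		valid;			/* stays false until the prepare completes */
	char		gid[DEMO_GIDSIZE];	/* user-supplied identifier */
} DemoGXact;

typedef struct DemoTwoPhaseState
{
	DemoGXact  *freeGXacts;		/* head of the free list */
	DemoGXact	slots[DEMO_MAX_PREPARED];
} DemoTwoPhaseState;

/* Chain every slot into the free list. */
void
demo_init(DemoTwoPhaseState *state)
{
	state->freeGXacts = NULL;
	for (int i = 0; i < DEMO_MAX_PREPARED; i++)
	{
		state->slots[i].next = state->freeGXacts;
		state->freeGXacts = &state->slots[i];
	}
}

/*
 * Pop one slot and fill in xid and gid.
 * Returns NULL if no slot is free or the gid does not fit.
 */
DemoGXact *
demo_mark_as_preparing(DemoTwoPhaseState *state, unsigned int xid,
					   const char *gid)
{
	DemoGXact  *gxact = state->freeGXacts;

	if (gxact == NULL || strlen(gid) >= DEMO_GIDSIZE)
		return NULL;

	state->freeGXacts = gxact->next;
	gxact->next = NULL;
	gxact->xid = xid;
	gxact->valid = false;
	memset(gxact->gid, 0, sizeof(gxact->gid));
	strcpy(gxact->gid, gid);
	return gxact;
}
```

With two slots, a third call returns NULL, which stands in for the case where every slot is already taken by a prepared transaction.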

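8 Call StartPrepare to initialize the two-phase commit data structures and build the 2PC header record:
 1) Initialize variables, and use the pgprocno field of the gxact from step 7 to find the corresponding PGPROC struct in the ProcGlobal->allProcs array.
 2) Create a TwoPhaseFileHeader and fill it in from the information gathered so far: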

(gdb) p hdr
$22 = {magic = 1475953972, total_len = 0, xid = 779, database = 13835, prepared_at = 706697115158785, owner = 10, 
  nsubxacts = 0, ncommitrels = 0, nabortrels = 0, ninvalmsgs = 0, initfileinval = false, gidlen = 11, 
  origin_lsn = 59423552209248, origin_timestamp = 706697115158785}
(gdb) 

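 database = 13835 is the OID of the database the transaction ran in;
 prepared_at is the timestamp taken when the prepare operation started;
 gidlen is the length of the identifier given to the PREPARE TRANSACTION command, including the terminating NUL: strlen("prepare_12") + 1 == 11.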
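
The gidlen value of 11 can be checked with a tiny sketch. demo_gidlen is a hypothetical helper written for this article, not a PostgreSQL function; it only restates the arithmetic above.

```c
#include <stdint.h>
#include <string.h>

/* Length stored in the header: the GID's characters plus the trailing NUL. */
uint16_t
demo_gidlen(const char *gid)
{
	return (uint16_t) (strlen(gid) + 1);
}
```

Calling demo_gidlen("prepare_12") gives 10 characters plus one NUL byte, which matches the value gdb printed.
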
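 3) Prepare the environment for the rest of the prepare: record the two-phase state for locks, predicate locks, and the relation map.
 4) Call EndPrepare to finish the 2PC prepare state data and write it to WAL (WAL itself is covered in detail in a later article):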

  1. Register the end-of-data marker in the list of 2PC records.
  2. Fill in the header record, as follows:

/* Values seen in this session are noted in the field comments. */
typedef struct xl_xact_prepare
{
	uint32		magic;			/* format identifier; TWOPHASE_MAGIC */
	uint32		total_len;		/* actual file length */
	TransactionId xid;			/* original transaction XID; 779 */
	Oid			database;		/* OID of database it was in; 13835 */
	TimestampTz prepared_at;	/* time of preparation */
	Oid			owner;			/* user running the transaction */
	int32		nsubxacts;		/* number of following subxact XIDs; 0 */
	int32		ncommitrels;	/* number of delete-on-commit rels; 0 */
	int32		nabortrels;		/* number of delete-on-abort rels; 0 */
	int32		ninvalmsgs;		/* number of cache invalidation messages; 0 */
	bool		initfileinval;	/* does relcache init file need invalidation? */
	uint16		gidlen;			/* length of the GID - GID follows the header; 11 */
	XLogRecPtr	origin_lsn;		/* lsn of this record at origin node; InvalidXLogRecPtr */
	TimestampTz origin_timestamp;	/* time of prepare at origin node; 0 */
} xl_xact_prepare;
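  3. Insert the 2PC record into WAL: the registered rdata chunks are assembled into one complete record, XLogInsert places it in the WAL buffers, and the record is then flushed to disk.
  4. Update the related fields of the XLogCtl struct, mark the transaction as prepared, and add the transaction's GXACT to the global ProcArray.
  5. Remove the transaction's own PGPROC from the global ProcArray. [The transaction still counts as running: a scan of the ProcArray finds the xid through the GXACT entry, and if that entry is valid the transaction is treated as in progress.]
  6. Release what the transaction still holds: resource-owner resources and pinned buffers, RelationCache state, lock resources, and so on.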

After a successful prepare, the state of the prepared transaction can be seen under the pg_twophase directory:

[root@node199 pg_twophase]# ll
-rw------- 1 postgres postgres 236 May 25 20:37 00000312
[root@node199 pg_twophase]# pwd 
/home/postgres/data/pg_twophase

(gdb) p path
$8 = "pg_twophase/00000312...
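
The file name reads as a transaction ID printed as eight uppercase hexadecimal digits. A minimal sketch of that naming, assuming the "%08X" convention that produces names like the one above; demo_twophase_path is a hypothetical helper, not a PostgreSQL function.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format "pg_twophase/<xid as 8 uppercase hex digits>" into buf.
 * Returns whatever snprintf returns (the length it wanted to write).
 */
int
demo_twophase_path(char *buf, size_t buflen, unsigned int xid)
{
	return snprintf(buf, buflen, "pg_twophase/%08X", xid);
}
```

Under this assumption the name 00000312 corresponds to xid 786 (0x312), while xid 779 from the gdb trace would map to 0000030B, so the listing presumably comes from a different prepared transaction than the traced one.
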
/*
 * Header for each record in a state file
 *
 * NOTE: len counts only the rmgr data, not the TwoPhaseRecordOnDisk header.
 * The rmgr data will be stored starting on a MAXALIGN boundary.
 */
typedef struct TwoPhaseRecordOnDisk
{
	uint32		len;			/* length of rmgr data */
	TwoPhaseRmgrId rmid;		/* resource manager for this record */
	uint16		info;			/* flag bits for use by rmgr */
} TwoPhaseRecordOnDisk;

/*
 * During prepare, the state file is assembled in memory before writing it
 * to WAL and the actual state file.  We use a chain of StateFileChunk blocks
 * for that.
 */
typedef struct StateFileChunk
{
	char	   *data;
	uint32		len;
	struct StateFileChunk *next;
} StateFileChunk;

static struct xllist
{
	StateFileChunk *head;		/* first data block in the chain */
	StateFileChunk *tail;		/* last block in chain */
	uint32		num_chunks;
	uint32		bytes_free;		/* free bytes left in tail block */
	uint32		total_len;		/* total data bytes in chain */
}			records;
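
How such a chunk chain grows can be sketched in standalone C. This is a simplified model, not the PostgreSQL routine: DemoChunk, DemoChain and demo_chain_append are hypothetical names, malloc-family calls replace the server's memory contexts, and the 8-byte alignment and 512-byte minimum chunk size are assumptions made for the sketch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ALIGN 8			/* assumed maximum alignment */
#define DEMO_MIN_CHUNK 512		/* assumed minimum chunk size */
#define DEMO_MAXALIGN(len) \
	(((uint32_t) (len) + DEMO_ALIGN - 1) & ~((uint32_t) (DEMO_ALIGN - 1)))

typedef struct DemoChunk
{
	char	   *data;
	uint32_t	len;			/* bytes used, including padding */
	struct DemoChunk *next;
} DemoChunk;

typedef struct DemoChain
{
	DemoChunk  *head;			/* first block in the chain */
	DemoChunk  *tail;			/* last block in the chain */
	uint32_t	num_chunks;
	uint32_t	bytes_free;		/* free bytes left in the tail block */
	uint32_t	total_len;		/* total padded bytes in the chain */
} DemoChain;

/*
 * Append len bytes, padded up to DEMO_ALIGN. A new block is started when
 * the padded data does not fit into the tail block.
 * Returns 0 on success, -1 if an allocation fails.
 */
int
demo_chain_append(DemoChain *chain, const void *data, uint32_t len)
{
	uint32_t	padlen = DEMO_MAXALIGN(len);

	if (chain->tail == NULL || padlen > chain->bytes_free)
	{
		uint32_t	size = padlen > DEMO_MIN_CHUNK ? padlen : DEMO_MIN_CHUNK;
		DemoChunk  *chunk = calloc(1, sizeof(DemoChunk));

		if (chunk == NULL)
			return -1;
		chunk->data = calloc(1, size);	/* zeroed, so padding bytes are 0 */
		if (chunk->data == NULL)
		{
			free(chunk);
			return -1;
		}
		if (chain->tail == NULL)
			chain->head = chunk;
		else
			chain->tail->next = chunk;
		chain->tail = chunk;
		chain->num_chunks++;
		chain->bytes_free = size;
	}

	memcpy(chain->tail->data + chain->tail->len, data, len);
	chain->tail->len += padlen;
	chain->bytes_free -= padlen;
	chain->total_len += padlen;
	return 0;
}
```

Starting from a zero-initialized DemoChain, appending the 11-byte GID uses 16 bytes of the first block, because each piece begins on an alignment boundary as the struct comment above describes.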

FinishPreparedTransaction

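1 Initialize variables and enter the critical section.
2 Validate the gid, making sure that two backends cannot try to finish the transaction with the same gid at the same time, and obtain the transaction's xid.
3 If the prepare phase succeeded and its state has already been persisted to disk, call ReadTwoPhaseFile to read the 2PC state data; otherwise read it back from WAL.
4 Call RecordTransactionCommitPrepared to build the XLOG_XACT_COMMIT_PREPARED record and write it to WAL, then update CLOG to mark the transaction as committed.
5 Remove the transaction's PGPROC entry from the global ProcArray and clean up the files of the relations that are scheduled for deletion.
6 Acquire TwoPhaseStateLock in exclusive mode and run the callback for each 2PC record while holding it, so that another backend using the same gid cannot conflict.
7 Release the predicate locks held for the prepared transaction and remove its GXACT slot from the shared two-phase state.
8 Release TwoPhaseStateLock, remove the persisted 2PC state file, and free the allocated memory.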
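
The gid check in step 2 can be sketched as follows. This is a simplified model under stated assumptions: DemoPrepared and demo_lock_gxact are hypothetical names, the lock around the scan is omitted, and the sketch returns NULL where a real server would raise an error.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_INVALID_BACKEND (-1)	/* assumed "nobody holds it" marker */

typedef struct DemoPrepared
{
	bool		valid;			/* has the prepare fully completed? */
	int			locking_backend;	/* backend finishing it, or DEMO_INVALID_BACKEND */
	unsigned int xid;
	char		gid[200];
} DemoPrepared;

/*
 * Find the valid entry with the given gid and claim it for my_backend.
 * Returns NULL if no such entry exists or another backend already claimed it.
 */
DemoPrepared *
demo_lock_gxact(DemoPrepared *entries, size_t n, const char *gid,
				int my_backend)
{
	for (size_t i = 0; i < n; i++)
	{
		DemoPrepared *p = &entries[i];

		if (!p->valid || strcmp(p->gid, gid) != 0)
			continue;
		if (p->locking_backend != DEMO_INVALID_BACKEND)
			return NULL;		/* busy: someone else is finishing it */
		p->locking_backend = my_backend;
		return p;
	}
	return NULL;				/* no prepared transaction with this gid */
}
```

The first backend that asks for "prepare_12" gets the entry and its xid; a second backend asking for the same gid is turned away until the first one is done.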

[root@node199 pg_twophase]# ll
