PostgreSQL Source Code Analysis 5: TRUNCATE Relations

Introduction

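  When we want to remove all the data in a table while keeping the table definition itself, the TRUNCATE command does the job. This article walks through how PostgreSQL implements TRUNCATE.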

1 ExecuteTruncate

Overview (src/backend/commands/tablecmds.c):

/*
 * This is a multi-relation truncate.  We first open and grab exclusive
 * lock on all relations involved, checking permissions and otherwise
 * verifying that the relation is OK for truncation.  Note that if relations
 * are foreign tables, at this stage, we have not yet checked that their
 * foreign data in external data sources are OK for truncation.  These are
 * checked when foreign data are actually truncated later.  In CASCADE mode,
 * relations having FK references to the targeted relations are automatically
 * added to the group; in RESTRICT mode, we check that all FK references are
 * internal to the group that's being truncated.  Finally all the relations
 * are truncated and reindexed.
 */

In other words, the command truncates several relations at once. It first opens every relation involved and takes an exclusive lock on each, then checks permissions and verifies that the relation can be truncated. In CASCADE mode, relations that hold FK references to the target relations are added to the truncate group automatically. In RESTRICT mode, every FK reference must come from inside the group being truncated. Finally the relations are truncated and their indexes are rebuilt. If neither keyword is given, RESTRICT is the default.
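The difference between the two modes can be sketched with a toy model. This is not PostgreSQL code: ToyFK, ToyGroup and the toy_* functions are hypothetical names, tables are plain integers, and the real command raises an error where this sketch returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RELS 16

/* Hypothetical model of one foreign key: "referencing" points at "referenced". */
typedef struct ToyFK
{
	int			referencing;	/* table that owns the FK column */
	int			referenced;		/* table the FK points at */
} ToyFK;

/* Hypothetical model of the group of tables being truncated. */
typedef struct ToyGroup
{
	int			rels[TOY_MAX_RELS];
	size_t		nrels;
} ToyGroup;

static bool
toy_group_contains(const ToyGroup *g, int rel)
{
	for (size_t i = 0; i < g->nrels; i++)
		if (g->rels[i] == rel)
			return true;
	return false;
}

static void
toy_group_add(ToyGroup *g, int rel)
{
	assert(g->nrels < TOY_MAX_RELS);
	g->rels[g->nrels++] = rel;
}

/*
 * CASCADE: keep pulling in tables that reference a group member until a
 * full pass adds nothing new (indirect references need several passes).
 */
static void
toy_cascade_expand(ToyGroup *g, const ToyFK *fks, size_t nfks)
{
	bool		added;

	do
	{
		added = false;
		for (size_t i = 0; i < nfks; i++)
		{
			if (toy_group_contains(g, fks[i].referenced) &&
				!toy_group_contains(g, fks[i].referencing))
			{
				toy_group_add(g, fks[i].referencing);
				added = true;
			}
		}
	} while (added);
}

/*
 * RESTRICT: true only if every table that references a group member is
 * itself in the group.
 */
static bool
toy_restrict_ok(const ToyGroup *g, const ToyFK *fks, size_t nfks)
{
	for (size_t i = 0; i < nfks; i++)
		if (toy_group_contains(g, fks[i].referenced) &&
			!toy_group_contains(g, fks[i].referencing))
			return false;
	return true;
}
```

With FKs 2 -> 1 and 3 -> 2, truncating only table 1 fails the RESTRICT check, while the CASCADE expansion grows the group to {1, 2, 3}.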

  First, the key data structures involved in this flow:
  TruncateStmt

typedef struct TruncateStmt
{
	NodeTag		type;			/* node type */
	List	   *relations;		/* relations (RangeVars) to be truncated */
	bool		restart_seqs;	/* restart owned sequences? */
	DropBehavior behavior;		/* RESTRICT or CASCADE behavior */
} TruncateStmt;
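To make the fields concrete, here is a minimal sketch of how the options of a TRUNCATE statement map onto them. The Toy* types and toy_make_truncate_stmt are hypothetical stand-ins rather than the real parser structures; relation names replace the RangeVar list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for DropBehavior; not the real definition. */
typedef enum ToyDropBehavior
{
	TOY_DROP_RESTRICT,			/* default when neither keyword is given */
	TOY_DROP_CASCADE
} ToyDropBehavior;

/* Simplified stand-in for TruncateStmt. */
typedef struct ToyTruncateStmt
{
	const char *const *relations;	/* names stand in for the RangeVar list */
	size_t		nrelations;
	bool		restart_seqs;	/* RESTART IDENTITY was specified */
	ToyDropBehavior behavior;	/* RESTRICT or CASCADE */
} ToyTruncateStmt;

/*
 * Fill in the statement the way a parser might for
 *   TRUNCATE <names> [RESTART IDENTITY] [CASCADE];
 */
static ToyTruncateStmt
toy_make_truncate_stmt(const char *const *names, size_t nnames,
					   bool restart_identity, bool cascade)
{
	ToyTruncateStmt stmt;

	assert(names != NULL || nnames == 0);
	stmt.relations = names;
	stmt.nrelations = nnames;
	stmt.restart_seqs = restart_identity;
	stmt.behavior = cascade ? TOY_DROP_CASCADE : TOY_DROP_RESTRICT;
	return stmt;
}
```

Under this model, TRUNCATE orders, order_items RESTART IDENTITY CASCADE yields two relations, restart_seqs set to true, and the CASCADE behavior, while a bare TRUNCATE orders yields restart_seqs set to false and the RESTRICT behavior.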

RelationData


typedef struct RelationData
{
	RelFileNode rd_node;		/* relation physical identifier */
	/* use "struct" here to avoid needing to include smgr.h: */
	struct SMgrRelationData *rd_smgr;	/* cached file handle, or NULL */
	int			rd_refcnt;		/* reference count */
	BackendId	rd_backend;		/* owning backend id, if temporary relation */
	bool		rd_islocaltemp; /* rel is a temp rel of this session */
	bool		rd_isnailed;	/* rel is nailed in cache */
	bool		rd_isvalid;		/* relcache entry is valid */
	bool		rd_indexvalid;	/* is rd_indexlist valid? (also rd_pkindex and
								 * rd_replidindex) */
	bool		rd_statvalid;	/* is rd_statlist valid? */

	/*----------
	 * rd_createSubid is the ID of the highest subtransaction the rel has
	 * survived into or zero if the rel or its rd_node was created before the
	 * current top transaction.  (IndexStmt.oldNode leads to the case of a new
	 * rel with an old rd_node.)  rd_firstRelfilenodeSubid is the ID of the
	 * highest subtransaction an rd_node change has survived into or zero if
	 * rd_node matches the value it had at the start of the current top
	 * transaction.  (Rolling back the subtransaction that
	 * rd_firstRelfilenodeSubid denotes would restore rd_node to the value it
	 * had at the start of the current top transaction.  Rolling back any
	 * lower subtransaction would not.)  Their accuracy is critical to
	 * RelationNeedsWAL().
	 *
	 * rd_newRelfilenodeSubid is the ID of the highest subtransaction the
	 * most-recent relfilenode change has survived into or zero if not changed
	 * in the current transaction (or we have forgotten changing it).  This
	 * field is accurate when non-zero, but it can be zero when a relation has
	 * multiple new relfilenodes within a single transaction, with one of them
	 * occurring in a subsequently aborted subtransaction, e.g.
	 *		BEGIN;
	 *		TRUNCATE t;
	 *		SAVEPOINT save;
	 *		TRUNCATE t;
	 *		ROLLBACK TO save;
	 *		-- rd_newRelfilenodeSubid is now forgotten
	 *
	 * If every rd_*Subid field is zero, they are read-only outside
	 * relcache.c.  Files that trigger rd_node changes by updating
	 * pg_class.reltablespace and/or pg_class.relfilenode call
	 * RelationAssumeNewRelfilenode() to update rd_*Subid.
	 *
	 * rd_droppedSubid is the ID of the highest subtransaction that a drop of
	 * the rel has survived into.  In entries visible outside relcache.c, this
	 * is always zero.
	 */
	SubTransactionId rd_createSubid;	/* rel was created in current xact */
	SubTransactionId rd_newRelfilenodeSubid;	/* highest subxact changing
												 * rd_node to current value */
	SubTransactionId rd_firstRelfilenodeSubid;	/* highest subxact changing
												 * rd_node to any value */
	SubTransactionId rd_droppedSubid;	/* dropped with another Subid set */

	Form_pg_class rd_rel;		/* RELATION tuple */
	TupleDesc	rd_att;			/* tuple descriptor */
	Oid			rd_id;			/* relation's object id */
	LockInfoData rd_lockInfo;	/* lock mgr's info for locking relation */
	RuleLock   *rd_rules;		/* rewrite rules */
	MemoryContext rd_rulescxt;	/* private memory cxt for rd_rules, if any */
	TriggerDesc *trigdesc;		/* Trigger info, or NULL if rel has none */
	/* use "struct" here to avoid needing to include rowsecurity.h: */
	struct RowSecurityDesc *rd_rsdesc;	/* row security policies, or NULL */
	... 
	/*
	 * pg_inherits.xmin of the partition that was excluded in
	 * rd_partdesc_nodetached.  This informs a future user of that partdesc:
	 * if this value is not in progress for the active snapshot, then the
	 * partdesc can be used, otherwise they have to build a new one.  (This
	 * matches what find_inheritance_children_extended would do).
	 */
	TransactionId rd_partdesc_nodetached_xmin;

	/* data managed by RelationGetPartitionQual: */
	List	   *rd_partcheck;	/* partition CHECK quals */
	bool		rd_partcheckvalid;	/* true if list has been computed */
	MemoryContext rd_partcheckcxt;	/* private cxt for rd_partcheck, if any */

	/* data managed by RelationGetIndexList: */
	List	   *rd_indexlist;	/* list of OIDs of indexes on relation */
	Oid			rd_pkindex;		/* OID of primary key, if any */
	Oid			rd_replidindex; /* OID of replica identity index, if any */

	/* data managed by RelationGetStatExtList: */
	List	   *rd_statlist;	/* list of OIDs of extended stats */

	/* data managed by RelationGetIndexAttrBitmap: */
	Bitmapset  *rd_indexattr;	/* identifies columns used in indexes */
	Bitmapset  *rd_keyattr;		/* cols that can be ref'd by foreign keys */
	Bitmapset  *rd_pkattr;		/* cols included in primary key */
	Bitmapset  *rd_idattr;		/* included in replica identity index */

	PublicationActions *rd_pubactions;	/* publication actions */

	/*
	 * rd_options is set whenever rd_rel is loaded into the relcache entry.
	 * Note that you can NOT look into rd_rel for this data.  NULL means "use
	 * defaults".
	 */
	bytea	   *rd_options;		/* parsed pg_class.reloptions */

	/*
	 * Oid of the handler for this relation. For an index this is a function
	 * returning IndexAmRoutine, for table like relations a function returning
	 * TableAmRoutine.  This is stored separately from rd_indam, rd_tableam as
	 * its lookup requires syscache access, but during relcache bootstrap we
	 * need to be able to initialize rd_tableam without syscache lookups.
	 */
	Oid			rd_amhandler;	/* OID of index AM's handler function */

	/*
	 * Table access method.
	 */
	const struct TableAmRoutine *rd_tableam;

	/* These are non-NULL only for an index relation: */
	Form_pg_index rd_index;		/* pg_index tuple describing this index */
	/* use "struct" here to avoid needing to include htup.h: */
	struct HeapTupleData *rd_indextuple;	/* all of pg_index tuple */
     
	/*
	 * rd_amcache is available for index and table AMs to cache private data
	 * about the relation.  This must be just a cache since it may get reset
	 * at any time (in particular, it will get reset by a relcache inval
	 * message for the relation).  If used, it must point to a single memory
	 * chunk palloc'd in CacheMemoryContext, or in rd_indexcxt for an index
	 * relation.  A relcache reset will include freeing that chunk and setting
	 * rd_amcache = NULL.
	 */
	void	   *rd_amcache;		/* available for use by index/table AM */

	/*
	 * foreign-table support
	 *
	 * rd_fdwroutine must point to a single memory chunk palloc'd in
	 * CacheMemoryContext.  It will be freed and reset to NULL on a relcache
	 * reset.
	 */

	/* use "struct" here to avoid needing to include fdwapi.h: */
	struct FdwRoutine *rd_fdwroutine;	/* cached function pointers, or NULL */

	/*
	 * Hack for CLUSTER, rewriting ALTER TABLE, etc: when writing a new
	 * version of a table, we need to make any toast pointers inserted into it
	 * have the existing toast table's OID, not the OID of the transient toast
	 * table.  If rd_toastoid isn't InvalidOid, it is the OID to place in
	 * toast pointers inserted into this rel.  (Note it's set on the new
	 * version of the main heap, not the toast table itself.)  This also
	 * causes toast_save_datum() to try to preserve toast value OIDs.
	 */
	Oid			rd_toastoid;	/* Real TOAST table's OID, or InvalidOid */

	/* use "struct" here to avoid needing to include pgstat.h: */
	struct PgStat_TableStatus *pgstat_info; /* statistics collection area */
} RelationData;

1 ExecuteTruncate

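The execution flow is:
1 Walk the relations list in TruncateStmt and open each relation with AccessExclusiveLock;
2 Run the truncate checks (ordinary vs. partitioned table, permissions, and so on) and append each qualifying relation to the truncate list. Unless ONLY is specified, child tables are added as well. A partitioned table cannot be truncated with ONLY: TRUNCATE ONLY on a partitioned table raises an error;
3 Once the list is complete, call ExecuteTruncateGuts to perform the actual truncation;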
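The list-building step can be sketched as follows. This is a toy model under stated assumptions: ToyRel, ToyRelList and the toy_* functions are hypothetical, children are followed recursively instead of through the catalog, and returning false stands in for the error raised on TRUNCATE ONLY against a partitioned table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RELS 32

/* Hypothetical relation: an OID, a kind flag, and its direct children. */
typedef struct ToyRel
{
	int			oid;
	bool		is_partitioned;
	const struct ToyRel *const *children;	/* partitions or inheritors */
	size_t		nchildren;
} ToyRel;

/* Hypothetical truncate list, holding OIDs without duplicates. */
typedef struct ToyRelList
{
	int			oids[TOY_MAX_RELS];
	size_t		n;
} ToyRelList;

static bool
toy_list_has(const ToyRelList *list, int oid)
{
	for (size_t i = 0; i < list->n; i++)
		if (list->oids[i] == oid)
			return true;
	return false;
}

/* Add a relation and, recursively, all of its children exactly once. */
static void
toy_add_with_children(ToyRelList *list, const ToyRel *rel)
{
	if (toy_list_has(list, rel->oid))
		return;
	assert(list->n < TOY_MAX_RELS);
	list->oids[list->n++] = rel->oid;
	for (size_t i = 0; i < rel->nchildren; i++)
		toy_add_with_children(list, rel->children[i]);
}

/*
 * Process one target of the statement.  Returns false for
 * "TRUNCATE ONLY <partitioned table>", which is rejected.
 */
static bool
toy_collect_target(ToyRelList *list, const ToyRel *rel, bool only)
{
	if (!only)
	{
		toy_add_with_children(list, rel);
		return true;
	}
	if (rel->is_partitioned)
		return false;
	if (!toy_list_has(list, rel->oid))
	{
		assert(list->n < TOY_MAX_RELS);
		list->oids[list->n++] = rel->oid;
	}
	return true;
}
```

The duplicate check matters because the same table can be reached twice, for example once by name and once as the child of another target.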

2 ExecuteTruncateGuts

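  1) Initialize local variables and check the explicitly specified relations.
  2) CASCADE mode: look up the relations whose foreign keys reference any relation already in the truncate list. Each newly found relation is opened with AccessExclusiveLock, goes through the same truncate checks as above, and is appended to the end of the list. The search repeats until nothing new is found, so indirect references are covered too.
  3) RESTRICT mode: call heap_truncate_check_FKs, which internally calls heap_truncate_find_FKs to find the relations that reference members of the truncate list through foreign keys. If such a relation is not itself in the list, the command fails with an error; nothing is added to the list in this mode.
  4) If restart_seqs == true, look up the sequences owned by the relations in the list, lock each one with AccessExclusiveLock, and check permissions on it.
  5) Prepare to catch AFTER triggers; this increments the query stack depth (afterTriggers.query_depth++).
  6) Create an EState, plus a ResultRelInfo for every relation in the truncate list, so that triggers can be fired.
  7) Fire all BEFORE STATEMENT TRUNCATE triggers before any truncation starts.
  8) The environment is now ready; process each relation in the truncate list:
    1. A partitioned table is skipped, since there is nothing to do for it.
    2. A foreign table is not truncated yet. Instead, a list of foreign tables is built for each foreign server, so that each list can later be passed to the foreign data wrapper's callback and every server can truncate all of its foreign tables in one batch. Each list is stored as a single entry in a hash table keyed by server OID, and the relation is appended to the list for its server.
    3. If the table was created in the current (sub)transaction, or was already given a new relfilenode in the current (sub)transaction, it can be truncated in place with heap_truncate_one_rel.
    4. Otherwise a transaction-safe truncation is needed:
      1) Call CheckTableForSerializableConflictIn to check whether the truncation creates a rw-conflict with other serializable transactions, and flag the conflict if it does.
      2) Create a new storage file for the relation. This is why a table's relfilenode differs before and after TRUNCATE.
      3) If the relation has a TOAST table, assign a new relfilenode to the TOAST relation as well.
      4) Call reindex_relation to rebuild the indexes.
  9) If sub-step 2 of step 8 collected any foreign tables, truncate them now; otherwise skip this step.
  10) If the seq_relids list is not empty, reset each sequence in it.
  11) For relations that take part in logical decoding, write the corresponding TRUNCATE WAL record.
  12) Fire all AFTER STATEMENT TRUNCATE triggers, then handle the queued AFTER triggers.
  13) Call FreeExecutorState to release the EState.
  14) Close the relations, releasing their relcache references.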
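The per-relation decision in step 8 can be condensed into a small sketch. It is a toy model, not the PostgreSQL code: the Toy* names are hypothetical, and it assumes the in-place test compares the fields modeled on rd_createSubid and rd_newRelfilenodeSubid against the current subtransaction ID.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int ToySubXid;	/* stand-in for SubTransactionId; 0 = invalid */

typedef enum ToyRelKind
{
	TOY_PLAIN_TABLE,
	TOY_PARTITIONED_TABLE,
	TOY_FOREIGN_TABLE
} ToyRelKind;

/* Hypothetical slice of RelationData with only the fields used here. */
typedef struct ToyRel
{
	ToyRelKind	kind;
	ToySubXid	create_subid;			/* models rd_createSubid */
	ToySubXid	new_relfilenode_subid;	/* models rd_newRelfilenodeSubid */
} ToyRel;

typedef enum ToyAction
{
	TOY_SKIP,					/* partitioned table: nothing to do */
	TOY_DEFER_TO_FDW,			/* foreign table: batch per server, later */
	TOY_TRUNCATE_IN_PLACE,		/* rollback discards the file anyway */
	TOY_NEW_RELFILENODE			/* transaction-safe path */
} ToyAction;

/* Pick the action for one relation, given the current subtransaction ID. */
static ToyAction
toy_choose_action(const ToyRel *rel, ToySubXid my_subid)
{
	assert(my_subid != 0);

	if (rel->kind == TOY_PARTITIONED_TABLE)
		return TOY_SKIP;
	if (rel->kind == TOY_FOREIGN_TABLE)
		return TOY_DEFER_TO_FDW;
	if (rel->create_subid == my_subid ||
		rel->new_relfilenode_subid == my_subid)
		return TOY_TRUNCATE_IN_PLACE;
	return TOY_NEW_RELFILENODE;
}
```

The in-place shortcut is safe in this model because a rollback of the current (sub)transaction would discard the table or its current file regardless of what was written to it.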

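RelationSetNewRelfilenode is the key function behind TRUNCATE. It does two things:
1 It creates a new, empty physical file (storage file) for the given relation and assigns it a new relfilenode;
2 It adds the old physical file to the pending-deletion list. When the transaction commits, smgrDoPendingDeletes is called to remove every file queued for deletion.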
/*
* Need the full transaction-safe pushups.
*
* Create a new empty storage file for the relation, and assign it
* as the relfilenode value. The old storage file is scheduled for
* deletion at commit.
*/
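Why this is transaction-safe can be shown with a toy model of the pending-deletion list. The Toy* types and toy_* functions are hypothetical, relfilenodes are plain integers, and no files are touched. The sketch assumes each queued entry records whether it applies at commit or at abort: the new file is queued for removal on abort, the old file for removal on commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PENDING 8

/* Hypothetical pending-deletion entry. */
typedef struct ToyPendingDelete
{
	unsigned int relfilenode;	/* file to remove */
	bool		at_commit;		/* remove on commit (true) or abort (false) */
} ToyPendingDelete;

/* Hypothetical per-transaction state. */
typedef struct ToyXact
{
	ToyPendingDelete pending[TOY_MAX_PENDING];
	size_t		npending;
	unsigned int next_relfilenode;	/* source of fresh relfilenode values */
} ToyXact;

typedef struct ToyRel
{
	unsigned int relfilenode;
} ToyRel;

static void
toy_schedule_delete(ToyXact *xact, unsigned int relfilenode, bool at_commit)
{
	assert(xact->npending < TOY_MAX_PENDING);
	xact->pending[xact->npending].relfilenode = relfilenode;
	xact->pending[xact->npending].at_commit = at_commit;
	xact->npending++;
}

/* Give the relation a fresh, empty file and queue both outcomes. */
static void
toy_set_new_relfilenode(ToyXact *xact, ToyRel *rel)
{
	unsigned int old_node = rel->relfilenode;
	unsigned int new_node = xact->next_relfilenode++;

	toy_schedule_delete(xact, new_node, false); /* abort: drop new file */
	toy_schedule_delete(xact, old_node, true);	/* commit: drop old file */
	rel->relfilenode = new_node;
}

/*
 * End the transaction: collect the files whose entry matches the outcome
 * into removed[] and return how many there are.
 */
static size_t
toy_finish_xact(ToyXact *xact, bool is_commit, unsigned int *removed)
{
	size_t		nremoved = 0;

	for (size_t i = 0; i < xact->npending; i++)
		if (xact->pending[i].at_commit == is_commit)
			removed[nremoved++] = xact->pending[i].relfilenode;
	xact->npending = 0;
	return nremoved;
}
```

Either way exactly one file survives: the new empty one after a commit, or the old one with all its data after a rollback. In this model, that is what allows TRUNCATE to be rolled back.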

This article is my original work; please credit the source when reposting.
