PostgreSQL Source Code (144): Analysis of the LockRelease Regular Lock Release Flow

高铭杰

Published 2025-04-30 15:18:35

Article link: https://blog.csdn.net/jackgo73/article/details/147626537

Related:
"PostgreSQL Source Code (69): A Detailed Look at Regular Locks" (《Postgresql源码（69）常规锁细节分析》)

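I recently ran into a problem where shared memory corruption caused the release of a regular lock to emit the warning "you don't own a lock of type".

This post reviews the concepts behind regular locks and, along the way, walks through the lock release flow.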

SpinLock: ❎
LWLock: ❎
RegularLock: ✅

Review of basic concepts

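LockMethodLockHash: the primary lock table, a global hash table in shared memory that is visible to all processes.
- Accessed when a transaction requests a strong lock, checks for lock conflicts, or otherwise needs to coordinate with other transactions.
FastPathStrongRelationLocks: the strong-lock marker table, an array of counters in shared memory used to tell quickly whether a strong lock may exist.
- Lets a weak-lock request decide quickly whether it must go through the primary lock table, which reduces shared-memory contention.
LockMethodLocalHash: the local lock table, kept in process-local memory and maintained independently by each backend process.
- Provides fast acquire and release for high-frequency weak-lock operations (such as DML).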

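About fastpath

First check eligibility with EligibleForRelationFastPath: for example, the lock mode must be below 4 and the lock must be a relation lock.
Then consult FastPathStrongRelationLocks in shared memory to see whether a strong lock has already been taken on this tag; if so, the fast path cannot be used.
If both checks pass, call FastPathGrantRelationLock to take the fast-path lock. The lock information is recorded in PGPROC.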
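
The first two checks can be sketched as a standalone toy model. This is not PostgreSQL's code: every toy_* name and TOY_* constant is hypothetical. The conditions follow the EligibleForRelationFastPath macro as I understand it (default lock method, relation lock, relation in the current database, mode below ShareUpdateExclusiveLock), and the 1024-entry counter array is an assumption about how FastPathStrongRelationLocks is partitioned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PostgreSQL constants (assumed values). */
#define TOY_DEFAULT_LOCKMETHOD      1
#define TOY_LOCKTAG_RELATION        0
#define TOY_SHARE_UPDATE_EXCLUSIVE  4     /* first mode that is not "weak" */
#define TOY_STRONG_PARTITIONS       1024  /* assumed size of the counter array */

typedef struct ToyLockTag
{
    uint32_t field1;        /* database OID, 0 for shared relations */
    uint32_t field2;        /* relation OID */
    uint8_t  type;          /* kind of locked object */
    uint8_t  lockmethodid;  /* lock method */
} ToyLockTag;

/* Toy version of the shared strong-lock counters. */
static uint32_t toy_strong_counts[TOY_STRONG_PARTITIONS];

/* Check 1: is this request allowed to use the fast path at all? */
static bool
toy_eligible(const ToyLockTag *tag, int lockmode, uint32_t my_database_id)
{
    assert(tag != NULL);
    return tag->lockmethodid == TOY_DEFAULT_LOCKMETHOD &&
           tag->type == TOY_LOCKTAG_RELATION &&
           my_database_id != 0 &&
           tag->field1 == my_database_id &&
           lockmode < TOY_SHARE_UPDATE_EXCLUSIVE;
}

/* Check 2: has anyone taken a strong lock in this tag's hash partition? */
static bool
toy_can_use_fast_path(const ToyLockTag *tag, int lockmode,
                      uint32_t my_database_id, uint32_t hashcode)
{
    if (!toy_eligible(tag, lockmode, my_database_id))
        return false;
    return toy_strong_counts[hashcode % TOY_STRONG_PARTITIONS] == 0;
}
```

In this model a tag whose field1 is 0 is never eligible, whatever the lock mode.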

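The FastPathGrantRelationLock function:

It uses the PGPROC array Oid fpRelId[16]; to store the relation OIDs.
It uses the PGPROC field uint64 fpLockBits as a bitmap that records the lock modes.

FastPathGrantRelationLock logic:

Scan the bitmap slot by slot, 3 bits per slot. If a slot is empty, remember its position. If it is not empty, compare the OID stored for that slot with the requested one; on a match, OR the requested lock mode into the slot and return.
If the scan finds no slot with a matching OID, take an empty slot, record the lock mode in the bitmap and the OID in the array, then return.
If the scan finds no empty slot either, return false.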
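
The three cases above can be sketched with a small standalone model. ToyProc, toy_slot_bits, toy_mode_bit and toy_fast_path_grant are hypothetical names for illustration; the sketch only assumes what the text states (16 OID slots, 3 bits per slot in a 64-bit bitmap, one bit for each of lock modes 1 to 3) and leaves out the locking and bookkeeping a real backend needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SLOTS         16  /* slots per backend, as in fpRelId[16] */
#define TOY_BITS_PER_SLOT 3   /* one bit for each of lock modes 1..3 */

typedef struct ToyProc
{
    uint32_t fpRelId[TOY_SLOTS];  /* relation OID stored in each slot */
    uint64_t fpLockBits;          /* 3 bits per slot, 0 means "slot empty" */
} ToyProc;

/* Return the 3 mode bits of one slot. */
static uint64_t
toy_slot_bits(const ToyProc *proc, int slot)
{
    uint64_t mask = (UINT64_C(1) << TOY_BITS_PER_SLOT) - 1;

    return (proc->fpLockBits >> (TOY_BITS_PER_SLOT * slot)) & mask;
}

/* Bit that represents lock mode 1..3 inside the given slot. */
static uint64_t
toy_mode_bit(int slot, int lockmode)
{
    return UINT64_C(1) << (TOY_BITS_PER_SLOT * slot + (lockmode - 1));
}

static bool
toy_fast_path_grant(ToyProc *proc, uint32_t relid, int lockmode)
{
    int unused_slot = TOY_SLOTS;

    assert(lockmode >= 1 && lockmode <= 3);

    /* Scan all slots: remember an empty one, reuse a matching one. */
    for (int slot = 0; slot < TOY_SLOTS; slot++)
    {
        if (toy_slot_bits(proc, slot) == 0)
            unused_slot = slot;
        else if (proc->fpRelId[slot] == relid)
        {
            proc->fpLockBits |= toy_mode_bit(slot, lockmode);
            return true;
        }
    }

    /* No matching slot: take an empty one if the scan found any. */
    if (unused_slot < TOY_SLOTS)
    {
        proc->fpRelId[unused_slot] = relid;
        proc->fpLockBits |= toy_mode_bit(unused_slot, lockmode);
        return true;
    }

    /* No matching slot and no empty slot. */
    return false;
}
```

Because the scan keeps overwriting unused_slot, this model hands out the highest-numbered empty slot first; a caller that gets false would have to fall back to the primary lock table.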

Detailed analysis of the LockRelease flow

bool
LockRelease(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock)
{
	LOCKMETHODID lockmethodid = locktag->locktag_lockmethodid;
	LockMethod	lockMethodTable;
	LOCALLOCKTAG localtag;
	LOCALLOCK  *locallock;
	LOCK	   *lock;
	PROCLOCK   *proclock;
	LWLock	   *partitionLock;
	bool		wakeupNeeded;

	...

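Step 1: look up the local lock table.
Every lock this backend has acquired must have an entry in the local lock table, so the lookup there comes first. If no entry is found, the function emits the warning you don't own a lock of type and returns false. This check rarely fails, because the local lock table LockMethodLocalHash is not in shared memory and is therefore much less likely to be corrupted.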

	MemSet(&localtag, 0, sizeof(localtag)); /* must clear padding */
	localtag.lock = *locktag;
	localtag.mode = lockmode;
	locallock = (LOCALLOCK *) hash_search(LockMethodLocalHash,
										  &localtag,
										  HASH_FIND, NULL);
	if (!locallock || locallock->nLocks <= 0)
	{
		elog(WARNING, "you don't own a lock of type %s",
			 lockMethodTable->lockModeNames[lockmode]);
		return false;
	}

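Step 2: decrement the local lock reference count.

A lock may be acquired several times within one transaction, but only the first acquisition goes through the fast path or the primary lock table; later acquisitions only touch the local lock table.
So on release, if the lock was acquired more than once locally, decrementing the count in the local lock table is enough.
Resource owners are also involved here. Savepoints can create several levels of subtransactions inside one transaction, in which case the lock is recorded under several resource owners. The loop therefore walks the lockOwners array to find the entry for the owner doing the release (CurrentResourceOwner, or NULL for a session lock), and calls ResourceOwnerForgetLock once that owner's count drops to zero.
After an entry is removed, the last array element is moved into the hole so that the array always stays compact.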

	{
		LOCALLOCKOWNER *lockOwners = locallock->lockOwners;
		ResourceOwner owner;
		int			i;

		/* Identify owner for lock */
		if (sessionLock)
			owner = NULL;
		else
			owner = CurrentResourceOwner;

		for (i = locallock->numLockOwners - 1; i >= 0; i--)
		{
			if (lockOwners[i].owner == owner)
			{
				Assert(lockOwners[i].nLocks > 0);
				if (--lockOwners[i].nLocks == 0)
				{
					if (owner != NULL)
						ResourceOwnerForgetLock(owner, locallock);
					/* compact out unused slot */
					locallock->numLockOwners--;
					if (i < locallock->numLockOwners)
						lockOwners[i] = lockOwners[locallock->numLockOwners];
				}
				break;
			}
		}
		if (i < 0)
		{
			/* don't release a lock belonging to another owner */
			elog(WARNING, "you don't own a lock of type %s",
				 lockMethodTable->lockModeNames[lockmode]);
			return false;
		}
	}
	locallock->nLocks--;

	if (locallock->nLocks > 0)
		return true;
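
The counting and the compaction trick in step 2 can be tried in isolation with a toy model. ToyLocalLock, ToyOwnerEntry and toy_local_release are hypothetical names, the array size is arbitrary, and opaque pointers stand in for resource owners; the ResourceOwnerForgetLock call is left out.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_OWNERS 8   /* arbitrary capacity for the sketch */

typedef struct ToyOwnerEntry
{
    const void *owner;     /* stand-in for a ResourceOwner, NULL = session */
    int         nLocks;    /* times this owner holds the lock */
} ToyOwnerEntry;

typedef struct ToyLocalLock
{
    int           nLocks;          /* total holds across all owners */
    int           numLockOwners;   /* used entries in lockOwners[] */
    ToyOwnerEntry lockOwners[TOY_MAX_OWNERS];
} ToyLocalLock;

typedef enum
{
    TOY_NOT_OWNER,      /* would produce the "you don't own a lock" warning */
    TOY_STILL_HELD,     /* only the local count changed */
    TOY_LAST_RELEASE    /* caller must now release the shared lock */
} ToyReleaseResult;

static ToyReleaseResult
toy_local_release(ToyLocalLock *ll, const void *owner)
{
    int i;

    for (i = ll->numLockOwners - 1; i >= 0; i--)
    {
        if (ll->lockOwners[i].owner == owner)
        {
            assert(ll->lockOwners[i].nLocks > 0);
            if (--ll->lockOwners[i].nLocks == 0)
            {
                /* Move the last entry into the hole to stay compact. */
                ll->numLockOwners--;
                if (i < ll->numLockOwners)
                    ll->lockOwners[i] = ll->lockOwners[ll->numLockOwners];
            }
            break;
        }
    }
    if (i < 0)
        return TOY_NOT_OWNER;

    ll->nLocks--;
    return ll->nLocks > 0 ? TOY_STILL_HELD : TOY_LAST_RELEASE;
}
```

Only TOY_LAST_RELEASE corresponds to the case where the real function continues to step 3; the other two results return early.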

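Step 3: this is the last release, so the lock really has to be released.

Try a fast-path release first. A fast-path lock is recorded only in this backend's own PGPROC, so if it can be released there, the primary lock table does not need to be touched.

MyProc->fpInfoLock has to be taken as an LWLock because, when another process acquires a strong lock, it scans the fast-path weak locks recorded in every process's PGPROC and transfers the conflicting ones to the primary lock table. The fast-path data can therefore be updated concurrently by other processes, which is why the LWLock is required.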

	...
	if (EligibleForRelationFastPath(locktag, lockmode) &&
		FastPathLocalUseCount > 0)
	{
		bool		released;

		LWLockAcquire(&MyProc->fpInfoLock, LW_EXCLUSIVE);
		released = FastPathUnGrantRelationLock(locktag->locktag_field2,
											   lockmode);
		LWLockRelease(&MyProc->fpInfoLock);
		if (released)
		{
			RemoveLocalLock(locallock);
			return true;
		}
	}
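
What the fast-path release does to the per-backend slots can be sketched with the same kind of toy bitmap. DemoProc, demo_mode_bit and demo_fast_path_ungrant are hypothetical names; the use-count recomputation follows my reading of FastPathUnGrantRelationLock and should be taken as an assumption, and the fpInfoLock locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SLOTS         16  /* slots per backend */
#define DEMO_BITS_PER_SLOT 3   /* one bit for each of lock modes 1..3 */

typedef struct DemoProc
{
    uint32_t fpRelId[DEMO_SLOTS];  /* relation OID stored in each slot */
    uint64_t fpLockBits;           /* 3 bits per slot */
    int      useCount;             /* number of non-empty slots */
} DemoProc;

/* Bit that represents lock mode 1..3 inside the given slot. */
static uint64_t
demo_mode_bit(int slot, int lockmode)
{
    return UINT64_C(1) << (DEMO_BITS_PER_SLOT * slot + (lockmode - 1));
}

/*
 * Clear the bit for (relid, lockmode) if this backend holds it via the
 * fast path. Returns false when the lock is not found here, in which
 * case the caller has to look in the primary lock table instead.
 */
static bool
demo_fast_path_ungrant(DemoProc *proc, uint32_t relid, int lockmode)
{
    bool     released = false;
    uint64_t slot_mask = (UINT64_C(1) << DEMO_BITS_PER_SLOT) - 1;

    assert(lockmode >= 1 && lockmode <= 3);

    proc->useCount = 0;
    for (int slot = 0; slot < DEMO_SLOTS; slot++)
    {
        uint64_t bit = demo_mode_bit(slot, lockmode);

        if (proc->fpRelId[slot] == relid && (proc->fpLockBits & bit) != 0)
        {
            proc->fpLockBits &= ~bit;
            released = true;
            /* keep scanning so that useCount is recomputed fully */
        }
        if (((proc->fpLockBits >> (DEMO_BITS_PER_SLOT * slot)) & slot_mask) != 0)
            proc->useCount++;
    }
    return released;
}
```

A false return is the situation in which the real code falls through to the primary lock table in step 4.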

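Step 4: operate on the primary lock table.

The fast-path release did not succeed, which means the lock lives in the primary lock table.

[Shared memory] Primary lock table: LockMethodLockHash (stores all LOCK objects)
[Shared memory] Lock/process relation table: LockMethodProcLockHash (used to find which processes the current process is blocking, and for deadlock detection)

Now operate on the primary lock table. To increase concurrency it is partitioned by hashcode; locks that fall into different partitions can be processed concurrently.
Locate the LOCK object in the primary lock table.
Locate the PROCLOCK in LockMethodProcLockHash, which is used to find which processes the current process is blocking.
The two hash lookups are only needed when locallock->lock is NULL, that is, when the lock was originally taken through the fast path and was later moved to the primary lock table by another backend; otherwise the pointers cached in locallock are used directly.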

	partitionLock = LockHashPartitionLock(locallock->hashcode);

	LWLockAcquire(partitionLock, LW_EXCLUSIVE);

	lock = locallock->lock;
	if (!lock)
	{
		PROCLOCKTAG proclocktag;

		Assert(EligibleForRelationFastPath(locktag, lockmode));
		lock = (LOCK *) hash_search_with_hash_value(LockMethodLockHash,
													locktag,
													locallock->hashcode,
													HASH_FIND,
													NULL);
		if (!lock)
			elog(ERROR, "failed to re-find shared lock object");
		locallock->lock = lock;

		proclocktag.myLock = lock;
		proclocktag.myProc = MyProc;
		locallock->proclock = (PROCLOCK *) hash_search(LockMethodProcLockHash,
													   &proclocktag,
													   HASH_FIND,
													   NULL);
		if (!locallock->proclock)
			elog(ERROR, "failed to re-find shared proclock object");
	}
	LOCK_PRINT("LockRelease: found", lock, lockmode);
	proclock = locallock->proclock;
	PROCLOCK_PRINT("LockRelease: found", proclock);
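
The partitioning idea can be illustrated with a few lines of standalone C. The sketch_* names are hypothetical, and the choice of 16 partitions selected by hashcode modulo 16 is an assumption about how LockHashPartitionLock maps a hashcode to its LWLock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed partition count: 2^4 = 16 partitions. */
#define SKETCH_LOG2_NUM_LOCK_PARTITIONS 4
#define SKETCH_NUM_LOCK_PARTITIONS (1 << SKETCH_LOG2_NUM_LOCK_PARTITIONS)

/* Stand-in for the LWLock that protects one partition. */
typedef struct SketchPartitionLock
{
    int id;
} SketchPartitionLock;

/* Map a lock tag's hashcode to its partition number. */
static uint32_t
sketch_lock_hash_partition(uint32_t hashcode)
{
    return hashcode % SKETCH_NUM_LOCK_PARTITIONS;
}

/* Pick the partition lock that must be held for this hashcode. */
static SketchPartitionLock *
sketch_partition_lock(SketchPartitionLock *locks, uint32_t hashcode)
{
    assert(locks != NULL);
    return &locks[sketch_lock_hash_partition(hashcode)];
}
```

Two releases whose hashcodes map to different partitions take different locks in this model, so they do not block each other; two that map to the same partition are serialized.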

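If the check below fails, the lookups in both the primary lock table and LockMethodProcLockHash succeeded, but the lock modes recorded in proclock->holdMask do not include the mode being released. A warning is emitted and the local lock entry is removed, but the information in the primary lock table is left untouched.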

	/*
	 * Double-check that we are actually holding a lock of the type we want to
	 * release.
	 */
	if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
	{
		PROCLOCK_PRINT("LockRelease: WRONGTYPE", proclock);
		LWLockRelease(partitionLock);
		elog(WARNING, "you don't own a lock of type %s",
			 lockMethodTable->lockModeNames[lockmode]);
		RemoveLocalLock(locallock);
		return false;
	}

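UnGrantLock decrements the lock's requested and granted counts, clears the mode from grantMask once no holder of that mode remains, clears it from proclock->holdMask, and reports whether waiters may need to be woken (the released mode conflicts with a mode recorded in waitMask).
CleanUpLock then tidies up the shared state; if wakeupNeeded is set and the lock still has outstanding requests, it calls ProcLockWakeup to wake the waiting processes.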

	/*
	 * Do the releasing.  CleanUpLock will waken any now-wakable waiters.
	 */
	wakeupNeeded = UnGrantLock(lock, lockmode, proclock, lockMethodTable);

	CleanUpLock(lock, proclock,
				lockMethodTable, locallock->hashcode,
				wakeupNeeded);

	LWLockRelease(partitionLock);

	RemoveLocalLock(locallock);
	return true;
}
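
The bookkeeping done by UnGrantLock can be sketched as follows. SkLock, SkProcLock and sk_ungrant_lock are hypothetical names that keep only the fields needed here; the conflict mask is passed in by the caller instead of being read from a lock method table. It follows my reading of the function and is a simplified model, not the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_MAX_LOCKMODES 10
#define SK_LOCKBIT_ON(mode) (1u << (mode))

typedef struct SkLock
{
    uint32_t grantMask;                    /* modes currently granted */
    uint32_t waitMask;                     /* modes currently awaited */
    int      requested[SK_MAX_LOCKMODES];
    int      nRequested;
    int      granted[SK_MAX_LOCKMODES];
    int      nGranted;
} SkLock;

typedef struct SkProcLock
{
    uint32_t holdMask;                     /* modes held by this backend */
} SkProcLock;

/*
 * Give back one granted lock of the given mode. Returns true when some
 * awaited mode conflicts with the released one, i.e. a wakeup pass over
 * the wait queue is worthwhile.
 */
static bool
sk_ungrant_lock(SkLock *lock, int lockmode, SkProcLock *proclock,
                uint32_t conflicts_with)
{
    bool wakeupNeeded = false;

    assert(lock->requested[lockmode] > 0 && lock->granted[lockmode] > 0);
    assert(proclock->holdMask & SK_LOCKBIT_ON(lockmode));

    /* Counts are per backend, so each drops by exactly one. */
    lock->nRequested--;
    lock->requested[lockmode]--;
    lock->nGranted--;
    lock->granted[lockmode]--;

    /* The mode leaves grantMask only when its last holder is gone. */
    if (lock->granted[lockmode] == 0)
        lock->grantMask &= ~SK_LOCKBIT_ON(lockmode);

    /* waitMask is read, not modified, to decide about a wakeup. */
    if (conflicts_with & lock->waitMask)
        wakeupNeeded = true;

    proclock->holdMask &= ~SK_LOCKBIT_ON(lockmode);
    return wakeupNeeded;
}
```

After the call, a cleanup step in the spirit of CleanUpLock would discard the PROCLOCK when holdMask is 0 and the LOCK when nRequested is 0.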

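LockRelease example

Here is an intermediate state captured during a drop table, as a case where a lock is released and the release goes all the way to the primary lock table.

At one moment the query returned: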

postgres=# select * from pg_locks where relation=1214;
-[ RECORD 1 ]------+-----------------
locktype           | relation
database           | 0
relation           | 1214
page               |
tuple              |
virtualxid         |
transactionid      |
classid            |
objid              |
objsubid           |
virtualtransaction | 3/7
pid                | 1131328
mode               | RowExclusiveLock
granted            | t
fastpath           | f
waitstart          |

Releasing a mode-3 lock (RowExclusiveLock); the fast path is not used.

LockRelease (locktag=0x7ffd581db190, lockmode=3, sessionLock=false)

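Why was the fast path not used? Because locktag_field1 = 0, which means the relation does not belong to any particular database, so it does not pass EligibleForRelationFastPath. Such relations are normally shared system catalogs (1214 is pg_shdepend, which is indeed a shared system catalog).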

(gdb) p	*locktag
$17 = {
locktag_field1 = 0, 
locktag_field2 = 1214, 
locktag_field3 = 0, 
locktag_field4 = 0, 
locktag_type = 0 '\000', 
locktag_lockmethodid = 1 '\001'}

The values of the two main data structures at release time:

(gdb) p	*lock
$19 = {
  tag = {locktag_field1 = 0, locktag_field2 = 1214, locktag_field3 = 0, locktag_field4 = 0, locktag_type = 0 '\000', locktag_lockmethodid = 1 '\001'}, 
  grantMask = 8,
  waitMask = 0, 
  procLocks = {head = {prev = 0x7f487fd34340, next = 0x7f487fd34340}}, 
  waitProcs = {dlist = {head = {prev = 0x7f487e715988, next = 0x7f487e715988}}, count = 0},
  requested = {0, 0, 0, 1, 0, 0, 0, 0, 0, 0}, 
  nRequested = 1, 
  granted = {0, 0, 0, 1, 0, 0, 0, 0, 0, 0}, 
  nGranted = 1}
(gdb) p	*proclock
$20 = {tag = {myLock = 0x7f487e715960, myProc = 0x7f4884dc4590}, 
  groupLeader = 0x7f4884dc4590, 
  holdMask = 8, 
  releaseMask = 0, 
  lockLink = {prev = 0x7f487e715978, next = 0x7f487e715978}, 
  procLink = {prev = 0x7f4884dc4668, next = 0x7f4884dc4668}}

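Lock

grantMask = 8: 8 is binary 1000, i.e. bit 3 is set (LOCKBIT_ON(3)), which corresponds to RowExclusiveLock.
waitMask = 0: no lock mode is being waited for on this object.
procLocks: prev and next point to the same node, so exactly one PROCLOCK is linked to this lock.
requested[3] = 1: there is one RowExclusiveLock request on this lock object.
nRequested = 1: the total number of requests is 1, consistent with requested[3].
granted[3] = 1: that request has been granted.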

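PROCLOCK

tag: identifies the associated LOCK object (myLock) and the owning backend (myProc).
holdMask = 8: matches the LOCK's grantMask; this process holds RowExclusiveLock on the lock object.
releaseMask = 0: releaseMask is workspace for LockReleaseAll; 0 means nothing has been marked for release through that path.
lockLink: links into the LOCK's procLocks list; prev and next both point at the list head, so this PROCLOCK is the only node in that list.
procLink: links into the owning PGPROC's list of PROCLOCKs; prev and next are equal here as well, so it is the only entry in that list.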
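
To read masks such as grantMask = 8 or holdMask = 8 without counting bits by hand, a small decoder helps. mask_demo_decode and mask_demo_mode_names are hypothetical helpers written for this post; they assume the usual numbering of the eight table-level lock modes (AccessShareLock = 1 up to AccessExclusiveLock = 8) and that bit n of a mask stands for mode n.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Lock mode names indexed by mode number (index 0 is unused). */
static const char *const mask_demo_mode_names[] = {
    "INVALID",
    "AccessShareLock",
    "RowShareLock",
    "RowExclusiveLock",
    "ShareUpdateExclusiveLock",
    "ShareLock",
    "ShareRowExclusiveLock",
    "ExclusiveLock",
    "AccessExclusiveLock"
};

/*
 * Write the comma-separated names of all modes set in mask into buf
 * and return how many modes were found. Output is truncated if buf is
 * too small.
 */
static int
mask_demo_decode(uint32_t mask, char *buf, size_t buflen)
{
    int count = 0;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    for (int mode = 1; mode <= 8; mode++)
    {
        if (mask & (1u << mode))
        {
            size_t used = strlen(buf);

            snprintf(buf + used, buflen - used, "%s%s",
                     count > 0 ? "," : "", mask_demo_mode_names[mode]);
            count++;
        }
    }
    return count;
}
```

Calling mask_demo_decode(8, buf, sizeof buf) fills buf with RowExclusiveLock, which matches the mode column of the pg_locks row shown earlier.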

/*
 * Per-locked-object lock information:
 *
 * tag -- uniquely identifies the object being locked
 * grantMask -- bitmask for all lock types currently granted on this object.
 * waitMask -- bitmask for all lock types currently awaited on this object.
 * procLocks -- list of PROCLOCK objects for this lock.
 * waitProcs -- queue of processes waiting for this lock.
 * requested -- count of each lock type currently requested on the lock
 *		(includes requests already granted!!).
 * nRequested -- total requested locks of all types.
 * granted -- count of each lock type currently granted on the lock.
 * nGranted -- total granted locks of all types.
 *
 * Note: these counts count 1 for each backend.  Internally to a backend,
 * there may be multiple grabs on a particular lock, but this is not reflected
 * into shared memory.
 */
typedef struct LOCK
{
	/* hash key */
	LOCKTAG		tag;			/* unique identifier of lockable object */

	/* data */
	LOCKMASK	grantMask;		/* bitmask for lock types already granted */
	LOCKMASK	waitMask;		/* bitmask for lock types awaited */
	dlist_head	procLocks;		/* list of PROCLOCK objects assoc. with lock */
	dclist_head waitProcs;		/* list of PGPROC objects waiting on lock */
	int			requested[MAX_LOCKMODES];	/* counts of requested locks */
	int			nRequested;		/* total of requested[] array */
	int			granted[MAX_LOCKMODES]; /* counts of granted locks */
	int			nGranted;		/* total of granted[] array */
} LOCK;


/*
 * We may have several different backends holding or awaiting locks
 * on the same lockable object.  We need to store some per-holder/waiter
 * information for each such holder (or would-be holder).  This is kept in
 * a PROCLOCK struct.
 *
 * PROCLOCKTAG is the key information needed to look up a PROCLOCK item in the
 * proclock hashtable.  A PROCLOCKTAG value uniquely identifies the combination
 * of a lockable object and a holder/waiter for that object.  (We can use
 * pointers here because the PROCLOCKTAG need only be unique for the lifespan
 * of the PROCLOCK, and it will never outlive the lock or the proc.)
 *
 * Internally to a backend, it is possible for the same lock to be held
 * for different purposes: the backend tracks transaction locks separately
 * from session locks.  However, this is not reflected in the shared-memory
 * state: we only track which backend(s) hold the lock.  This is OK since a
 * backend can never block itself.
 *
 * The holdMask field shows the already-granted locks represented by this
 * proclock.  Note that there will be a proclock object, possibly with
 * zero holdMask, for any lock that the process is currently waiting on.
 * Otherwise, proclock objects whose holdMasks are zero are recycled
 * as soon as convenient.
 *
 * releaseMask is workspace for LockReleaseAll(): it shows the locks due
 * to be released during the current call.  This must only be examined or
 * set by the backend owning the PROCLOCK.
 *
 * Each PROCLOCK object is linked into lists for both the associated LOCK
 * object and the owning PGPROC object.  Note that the PROCLOCK is entered
 * into these lists as soon as it is created, even if no lock has yet been
 * granted.  A PGPROC that is waiting for a lock to be granted will also be
 * linked into the lock's waitProcs queue.
 */
typedef struct PROCLOCKTAG
{
	/* NB: we assume this struct contains no padding! */
	LOCK	   *myLock;			/* link to per-lockable-object information */
	PGPROC	   *myProc;			/* link to PGPROC of owning backend */
} PROCLOCKTAG;

typedef struct PROCLOCK
{
	/* tag */
	PROCLOCKTAG tag;			/* unique identifier of proclock object */

	/* data */
	PGPROC	   *groupLeader;	/* proc's lock group leader, or proc itself */
	LOCKMASK	holdMask;		/* bitmask for lock types currently held */
	LOCKMASK	releaseMask;	/* bitmask for lock types to be released */
	dlist_node	lockLink;		/* list link in LOCK's list of proclocks */
	dlist_node	procLink;		/* list link in PGPROC's list of proclocks */
} PROCLOCK;