PostgreSQL Source Code (144): Analysis of the LockRelease Regular Lock Release Flow

Related
"PostgreSQL Source Code (69): Regular Lock Details"

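I recently ran into a problem where corrupted shared memory caused the release of a regular lock to fail with the warning "you don't own a lock of type".

This post briefly reviews the concepts behind regular locks and then walks through how a lock is released. Of the three lock kinds below, only regular locks are covered.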

  • SpinLock:❎
  • LWLock:❎
  • RegularLock:✅

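Review of basic concepts

See "PostgreSQL Source Code (69): Regular Lock Details".

  • LockMethodLockHash: the main lock table, a global hash table in shared memory that is visible to all processes.
    • Accessed when a transaction requests a strong lock, checks for lock conflicts, or needs to coordinate with other transactions.
  • FastPathStrongRelationLocks: the strong-lock marker table, an array of counters in shared memory used to tell quickly whether a strong lock may exist.
    • Quickly filters whether a weak lock has to go through the main lock table, which reduces shared-memory contention.
  • LockMethodLocalHash: the local lock table, kept in process-local memory and maintained independently by each backend.
    • Handles fast acquisition and release for frequent weak-lock operations such as DML.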

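About fastpath

  1. First check eligibility with EligibleForRelationFastPath: for example, the lock mode must be lower than 4 (ShareUpdateExclusiveLock), the lock must be a relation lock, and the relation must belong to the current database.
  2. Check FastPathStrongRelationLocks in shared memory to see whether a strong lock has already been counted for this tag (the counter is indexed by a hash of the tag). If so, fastpath cannot be used.
  3. If both conditions hold, call FastPathGrantRelationLock to take the fast-path lock. The lock information is recorded in PGPROC.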
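The two checks in steps 1 and 2 can be sketched as a small standalone model. This is only an illustration, not the PostgreSQL code: `DemoTag`, `demo_eligible`, `demo_can_use_fastpath`, `demo_strong_count` and the partition count of 1024 are names and values assumed for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lock mode numbers for the weak modes and the first non-weak mode. */
enum { ACCESS_SHARE = 1, ROW_SHARE = 2, ROW_EXCLUSIVE = 3, SHARE_UPDATE_EXCLUSIVE = 4 };

/* Hypothetical size of the strong-lock counter array (an assumption). */
#define DEMO_STRONG_PARTITIONS 1024

/* Hypothetical, reduced stand-in for LOCKTAG. */
typedef struct DemoTag {
    uint32_t db_oid;       /* plays the role of locktag_field1 */
    uint32_t rel_oid;      /* plays the role of locktag_field2 */
    bool     is_relation;  /* true when the tag describes a relation lock */
} DemoTag;

/* Stand-in for the shared counters in FastPathStrongRelationLocks. */
uint32_t demo_strong_count[DEMO_STRONG_PARTITIONS];

/* Step 1: a weak relation lock on a table of the current database. */
bool demo_eligible(const DemoTag *tag, int mode, uint32_t my_db_oid)
{
    return tag->is_relation &&
           my_db_oid != 0 &&
           tag->db_oid == my_db_oid &&
           mode < SHARE_UPDATE_EXCLUSIVE;
}

/* Step 2: no strong lock counted in the tag's hash partition. */
bool demo_can_use_fastpath(const DemoTag *tag, int mode,
                           uint32_t my_db_oid, uint32_t hashcode)
{
    if (!demo_eligible(tag, mode, my_db_oid))
        return false;
    return demo_strong_count[hashcode % DEMO_STRONG_PARTITIONS] == 0;
}
```

In this model a tag whose `db_oid` is 0, like the shared catalog in the example at the end of this post, never qualifies for fastpath regardless of the lock mode.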

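See "PostgreSQL Source Code (69): Regular Lock Details".

The FastPathGrantRelationLock function:

  • The PGPROC array Oid fpRelId[16]; stores the relation OIDs.
  • The PGPROC variable uint64 fpLockBits serves as a bitmap that records the lock modes.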
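FastPathGrantRelationLock logic:
  1. Scan the bitmap in groups of 3 bits, one group per slot. If a group is empty, remember its position. If it is not empty, check whether the OID stored for that slot is the requested one; if it is, OR the requested lock mode into the group and return.
  2. If the scan finishes without finding the requested OID, take a remembered empty slot, record the lock mode in the bitmap and the OID in the array, then return.
  3. If the scan finds neither the OID nor an empty slot, return false.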
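The three steps above can be modelled in a few lines of standalone C. This is a simplified sketch rather than the real function: `DemoProc`, `demo_slot_mask`, `demo_mode_bit` and `demo_fp_grant` are hypothetical names, and the layout (16 slots, 3 bits per slot, mode 1 in the lowest bit of a group) is assumed from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SLOTS         16  /* size of fpRelId[] quoted above */
#define DEMO_BITS_PER_SLOT 3   /* one bit for each weak lock mode 1..3 */

/* Hypothetical stand-in for the fast-path fields of PGPROC. */
typedef struct DemoProc {
    uint64_t fpLockBits;
    uint32_t fpRelId[DEMO_SLOTS];
} DemoProc;

/* All three bits belonging to one slot. */
uint64_t demo_slot_mask(int slot)
{
    return (uint64_t) 7 << (DEMO_BITS_PER_SLOT * slot);
}

/* The single bit for a given mode (1..3) inside a slot. */
uint64_t demo_mode_bit(int slot, int mode)
{
    return (uint64_t) 1 << (DEMO_BITS_PER_SLOT * slot + (mode - 1));
}

bool demo_fp_grant(DemoProc *proc, uint32_t relid, int mode)
{
    int unused = DEMO_SLOTS;            /* DEMO_SLOTS means "none found" */

    for (int f = 0; f < DEMO_SLOTS; f++) {
        if ((proc->fpLockBits & demo_slot_mask(f)) == 0) {
            unused = f;                 /* step 1: remember an empty slot */
        } else if (proc->fpRelId[f] == relid) {
            proc->fpLockBits |= demo_mode_bit(f, mode);
            return true;                /* step 1: OID already present */
        }
    }
    if (unused < DEMO_SLOTS) {          /* step 2: claim the empty slot */
        proc->fpRelId[unused] = relid;
        proc->fpLockBits |= demo_mode_bit(unused, mode);
        return true;
    }
    return false;                       /* step 3: no room left */
}
```

Because the scan keeps overwriting `unused`, the slot that gets claimed is the last empty one seen, so a fresh `DemoProc` fills from slot 15 downwards in this model.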

Detailed walkthrough of LockRelease

bool
LockRelease(const LOCKTAG *locktag, LOCKMODE lockmode, bool sessionLock)
{
	LOCKMETHODID lockmethodid = locktag->locktag_lockmethodid;
	LockMethod	lockMethodTable;
	LOCALLOCKTAG localtag;
	LOCALLOCK  *locallock;
	LOCK	   *lock;
	PROCLOCK   *proclock;
	LWLock	   *partitionLock;
	bool		wakeupNeeded;

	...

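Step 1: look up the local lock table.
The local lock table is consulted first. Every lock this backend has acquired is recorded there, so a missing entry produces the warning "you don't own a lock of type". This check rarely fires: the local lock table LockMethodLocalHash does not live in shared memory, so it is unlikely to be corrupted.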

	MemSet(&localtag, 0, sizeof(localtag)); /* must clear padding */
	localtag.lock = *locktag;
	localtag.mode = lockmode;
	locallock = (LOCALLOCK *) hash_search(LockMethodLocalHash,
										  &localtag,
										  HASH_FIND, NULL);
	if (!locallock || locallock->nLocks <= 0)
	{
		elog(WARNING, "you don't own a lock of type %s",
			 lockMethodTable->lockModeNames[lockmode]);
		return false;
	}

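Step 2: decrement the local lock reference count.

  1. A lock may be acquired several times within a transaction, but only the first acquisition goes through fastpath or the main lock table; later acquisitions only touch the local lock table.
  2. So on release, if the lock was acquired several times locally, decrementing the local count is all that is needed.
  3. Multiple resource owners are also handled here. Savepoints can create nested subtransactions within a transaction, in which case the lock is recorded under several resowners. The loop therefore walks the resowners to find the one used at acquisition time and, once that owner's count drops to zero, calls ResourceOwnerForgetLock.
  4. After an entry is removed, the last array element is moved into the hole so that the array stays compact.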
	{
		LOCALLOCKOWNER *lockOwners = locallock->lockOwners;
		ResourceOwner owner;
		int			i;

		/* Identify owner for lock */
		if (sessionLock)
			owner = NULL;
		else
			owner = CurrentResourceOwner;

		for (i = locallock->numLockOwners - 1; i >= 0; i--)
		{
			if (lockOwners[i].owner == owner)
			{
				Assert(lockOwners[i].nLocks > 0);
				if (--lockOwners[i].nLocks == 0)
				{
					if (owner != NULL)
						ResourceOwnerForgetLock(owner, locallock);
					/* compact out unused slot */
					locallock->numLockOwners--;
					if (i < locallock->numLockOwners)
						lockOwners[i] = lockOwners[locallock->numLockOwners];
				}
				break;
			}
		}
		if (i < 0)
		{
			/* don't release a lock belonging to another owner */
			elog(WARNING, "you don't own a lock of type %s",
				 lockMethodTable->lockModeNames[lockmode]);
			return false;
		}
	}
	locallock->nLocks--;

	if (locallock->nLocks > 0)
		return true;
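The per-owner bookkeeping of step 2 can be tried out in isolation with a small model. This is a sketch under stated assumptions, not the PostgreSQL code: `DemoOwnerEntry`, `DemoLocalLock`, `demo_local_release` and the fixed array size are hypothetical, and the ResourceOwnerForgetLock call is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_OWNERS 8   /* hypothetical fixed capacity */

typedef struct DemoOwnerEntry {
    const void *owner;      /* NULL stands for a session-level lock */
    long        nLocks;     /* acquisitions made under this owner */
} DemoOwnerEntry;

typedef struct DemoLocalLock {
    long           nLocks;          /* total acquisitions by this backend */
    int            numLockOwners;
    DemoOwnerEntry lockOwners[DEMO_MAX_OWNERS];
} DemoLocalLock;

/*
 * Release one reference held by `owner`.  Returns false when that owner
 * holds nothing (the "you don't own a lock" case).  *last is set to true
 * when the total count reaches zero, i.e. the shared state must be released.
 */
bool demo_local_release(DemoLocalLock *ll, const void *owner, bool *last)
{
    int i;

    for (i = ll->numLockOwners - 1; i >= 0; i--) {
        if (ll->lockOwners[i].owner == owner) {
            if (--ll->lockOwners[i].nLocks == 0) {
                /* move the last entry into the hole to keep the array dense */
                ll->numLockOwners--;
                if (i < ll->numLockOwners)
                    ll->lockOwners[i] = ll->lockOwners[ll->numLockOwners];
            }
            break;
        }
    }
    if (i < 0)
        return false;

    *last = (--ll->nLocks == 0);
    return true;
}
```

Only when `*last` comes back true does the caller need to go on to the fastpath or main lock table steps described next.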

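Step 3: this is the last release, so the lock must really be released.

Try a fastpath release first. A fast-path lock is recorded only in this backend's own PGPROC, so if the release succeeds there, the main lock table does not need to be touched.

MyProc->fpInfoLock must be taken as an LWLock because another process that acquires a strong lock scans the weak locks recorded in every process's PGPROC fastpath slots and transfers the conflicting ones to the main lock table. These fields can therefore be updated concurrently by other processes, which is why the LWLock is required.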

	...
	if (EligibleForRelationFastPath(locktag, lockmode) &&
		FastPathLocalUseCount > 0)
	{
		bool		released;

		LWLockAcquire(&MyProc->fpInfoLock, LW_EXCLUSIVE);
		released = FastPathUnGrantRelationLock(locktag->locktag_field2,
											   lockmode);
		LWLockRelease(&MyProc->fpInfoLock);
		if (released)
		{
			RemoveLocalLock(locallock);
			return true;
		}
	}
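What the fastpath release does to the slot bitmap can be sketched with a standalone model. The names `DemoFastPath` and `demo_fp_ungrant` are hypothetical, the 16-slot, 3-bits-per-slot layout is assumed from the fastpath description earlier, and the LWLock is omitted because the model is single-threaded.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SLOTS         16
#define DEMO_BITS_PER_SLOT 3

/* Hypothetical stand-in for the fast-path state of one backend. */
typedef struct DemoFastPath {
    uint64_t lock_bits;            /* stands in for PGPROC.fpLockBits */
    uint32_t rel_id[DEMO_SLOTS];   /* stands in for PGPROC.fpRelId */
    int      local_use_count;      /* stands in for FastPathLocalUseCount */
} DemoFastPath;

/*
 * Clear the bit for (relid, mode) if it is present.  The scan always visits
 * every slot so that local_use_count can be recomputed along the way.
 */
bool demo_fp_ungrant(DemoFastPath *fp, uint32_t relid, int mode)
{
    bool released = false;

    fp->local_use_count = 0;
    for (int f = 0; f < DEMO_SLOTS; f++) {
        uint64_t bit  = (uint64_t) 1 << (DEMO_BITS_PER_SLOT * f + (mode - 1));
        uint64_t slot = (uint64_t) 7 << (DEMO_BITS_PER_SLOT * f);

        if (!released && fp->rel_id[f] == relid && (fp->lock_bits & bit)) {
            fp->lock_bits &= ~bit;
            released = true;
        }
        if (fp->lock_bits & slot)
            fp->local_use_count++;   /* slot still holds some mode */
    }
    return released;
}
```

A false return corresponds to the case where the lock was never taken via fastpath, or was already transferred to the main lock table by a backend acquiring a strong lock.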

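Step 4: operate on the main lock table.

The fastpath release did not succeed, which means the lock is in the main lock table.

[Shared memory] Main lock table: LockMethodLockHash (stores all lock objects)
[Shared memory] Lock-to-process table: LockMethodProcLockHash (one PROCLOCK per lock and backend pair; records which modes each backend holds and is used for conflict checks and deadlock detection)

  1. Start working on the main lock table. To increase concurrency it is partitioned by hashcode, so locks that fall into different partitions can be handled concurrently.
  2. Locate the lock object in the main lock table.
  3. Locate the PROCLOCK in LockMethodProcLockHash; it records which modes of this lock the current backend holds.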
	partitionLock = LockHashPartitionLock(locallock->hashcode);

	LWLockAcquire(partitionLock, LW_EXCLUSIVE);

	lock = locallock->lock;
	if (!lock)
	{
		PROCLOCKTAG proclocktag;

		Assert(EligibleForRelationFastPath(locktag, lockmode));
		lock = (LOCK *) hash_search_with_hash_value(LockMethodLockHash,
													locktag,
													locallock->hashcode,
													HASH_FIND,
													NULL);
		if (!lock)
			elog(ERROR, "failed to re-find shared lock object");
		locallock->lock = lock;

		proclocktag.myLock = lock;
		proclocktag.myProc = MyProc;
		locallock->proclock = (PROCLOCK *) hash_search(LockMethodProcLockHash,
													   &proclocktag,
													   HASH_FIND,
													   NULL);
		if (!locallock->proclock)
			elog(ERROR, "failed to re-find shared proclock object");
	}
	LOCK_PRINT("LockRelease: found", lock, lockmode);
	proclock = locallock->proclock;
	PROCLOCK_PRINT("LockRelease: found", proclock);
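The partitioning mentioned in item 1 boils down to mapping the lock's hashcode to one of a fixed number of partition locks. A minimal sketch, assuming 16 partitions and a plain modulo mapping; `DEMO_NUM_LOCK_PARTITIONS`, `demo_lock_partition` and `demo_same_partition` are hypothetical names for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed partition count for this sketch. */
#define DEMO_NUM_LOCK_PARTITIONS 16

/* Which partition (and therefore which partition LWLock) covers a hashcode. */
uint32_t demo_lock_partition(uint32_t hashcode)
{
    return hashcode % DEMO_NUM_LOCK_PARTITIONS;
}

/* Two locks contend on the same partition lock only when this is true. */
bool demo_same_partition(uint32_t h1, uint32_t h2)
{
    return demo_lock_partition(h1) == demo_lock_partition(h2);
}
```

Since locallock->hashcode is stored in the local lock entry, the release path can pick the partition lock without rehashing the tag.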

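If the check below fires, the entry was found in both the main lock table and LockMethodProcLockHash, but proclock->holdMask does not contain the lockmode being released. A warning is emitted and the local lock entry is removed, while the information in the main lock table is left untouched.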

	/*
	 * Double-check that we are actually holding a lock of the type we want to
	 * release.
	 */
	if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
	{
		PROCLOCK_PRINT("LockRelease: WRONGTYPE", proclock);
		LWLockRelease(partitionLock);
		elog(WARNING, "you don't own a lock of type %s",
			 lockMethodTable->lockModeNames[lockmode]);
		RemoveLocalLock(locallock);
		return false;
	}
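  • UnGrantLock decrements the lock's request and grant counters, clears the mode's bit in grantMask once no holder of that mode remains, clears the bit in proclock->holdMask, and reports whether waiters need to be woken (that is, whether the released mode conflicts with a mode in waitMask).

  • CleanUpLock then tidies up the lock state: it removes the PROCLOCK if it no longer holds anything, removes the LOCK if nothing requests it any more, and otherwise calls ProcLockWakeup when a wakeup is needed.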

	/*
	 * Do the releasing.  CleanUpLock will waken any now-wakable waiters.
	 */
	wakeupNeeded = UnGrantLock(lock, lockmode, proclock, lockMethodTable);

	CleanUpLock(lock, proclock,
				lockMethodTable, locallock->hashcode,
				wakeupNeeded);

	LWLockRelease(partitionLock);

	RemoveLocalLock(locallock);
	return true;
}
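The effect of UnGrantLock on the shared counters and masks can be illustrated with a reduced model. This is a sketch, not the real function: `DemoLock`, `DemoProcLock` and `demo_ungrant` are hypothetical, and the conflict mask of the released mode is passed in by the caller instead of being read from the lock method's conflict table.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_LOCKMODES 10
#define DEMO_LOCKBIT_ON(mode) (1u << (mode))

/* Reduced stand-ins for LOCK and PROCLOCK (only the fields touched here). */
typedef struct DemoLock {
    uint32_t grantMask;
    uint32_t waitMask;
    int      requested[DEMO_MAX_LOCKMODES];
    int      nRequested;
    int      granted[DEMO_MAX_LOCKMODES];
    int      nGranted;
} DemoLock;

typedef struct DemoProcLock {
    uint32_t holdMask;
} DemoProcLock;

/* Returns true when some waiter asks for a mode that conflicts with `mode`. */
bool demo_ungrant(DemoLock *lock, int mode, DemoProcLock *proclock,
                  uint32_t conflicts_with_mode)
{
    lock->nRequested--;
    lock->requested[mode]--;
    lock->nGranted--;
    lock->granted[mode]--;

    /* the mode disappears from grantMask only when its last holder leaves */
    if (lock->granted[mode] == 0)
        lock->grantMask &= ~DEMO_LOCKBIT_ON(mode);

    proclock->holdMask &= ~DEMO_LOCKBIT_ON(mode);

    return (conflicts_with_mode & lock->waitMask) != 0;
}
```

Fed with the values from the gdb session further down (grantMask = 8, granted[3] = 1, waitMask = 0), the model ends with every counter and mask at zero and no wakeup requested, which is the state in which the lock object can be garbage-collected.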

LockRelease example

This captures an intermediate state of drop table as an example of a lock release that reaches the main lock table.

At one point the query shows:

postgres=# select * from pg_locks where relation=1214;
-[ RECORD 1 ]------+-----------------
locktype           | relation
database           | 0
relation           | 1214
page               |
tuple              |
virtualxid         |
transactionid      |
classid            |
objid              |
objsubid           |
virtualtransaction | 3/7
pid                | 1131328
mode               | RowExclusiveLock
granted            | t
fastpath           | f
waitstart          |

A mode-3 lock (RowExclusiveLock) is being released, and it did not take the fastpath.

LockRelease (locktag=0x7ffd581db190, lockmode=3, sessionLock=false)

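Why was fastpath not used? Because locktag_field1 = 0, which means the table does not belong to any particular database, while EligibleForRelationFastPath requires locktag_field1 to equal the current database's OID. Such tables are usually shared system catalogs (1214 is pg_shdepend, which is indeed a shared catalog).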

(gdb) p	*locktag
$17 = {
locktag_field1 = 0, 
locktag_field2 = 1214, 
locktag_field3 = 0, 
locktag_field4 = 0, 
locktag_type = 0 '\000', 
locktag_lockmethodid = 1 '\001'}

The values of the two main data structures at release time:

(gdb) p	*lock
$19 = {
  tag = {locktag_field1 = 0, locktag_field2 = 1214, locktag_field3 = 0, locktag_field4 = 0, locktag_type = 0 '\000', locktag_lockmethodid = 1 '\001'}, 
  grantMask = 8,
  waitMask = 0, 
  procLocks = {head = {prev = 0x7f487fd34340, next = 0x7f487fd34340}}, 
  waitProcs = {dlist = {head = {prev = 0x7f487e715988, next = 0x7f487e715988}}, count = 0},
  requested = {0, 0, 0, 1, 0, 0, 0, 0, 0, 0}, 
  nRequested = 1, 
  granted = {0, 0, 0, 1, 0, 0, 0, 0, 0, 0}, 
  nGranted = 1}
(gdb) p	*proclock
$20 = {tag = {myLock = 0x7f487e715960, myProc = 0x7f4884dc4590}, 
  groupLeader = 0x7f4884dc4590, 
  holdMask = 8, 
  releaseMask = 0, 
  lockLink = {prev = 0x7f487e715978, next = 0x7f487e715978}, 
  procLink = {prev = 0x7f4884dc4668, next = 0x7f4884dc4668}}

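LOCK

  • grantMask = 8: 8 is binary 1000, i.e. bit 3 (1 << 3), which corresponds to RowExclusiveLock.
  • waitMask = 0: nobody is waiting for a conflicting mode on this lock.
  • procLocks: head.prev and head.next point to the same node, so exactly one PROCLOCK is attached to this lock.
  • requested[3] = 1: there is one RowExclusiveLock request on this lock object.
  • nRequested = 1: the total number of requests is 1, consistent with requested[3].
  • granted[3] = 1: that request has been granted.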
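Reading masks such as grantMask = 8 by hand gets tedious, so a small helper that decodes a mask into mode names can be handy while debugging. A minimal sketch, assuming bit N of the mask stands for lock mode N as in the bullet on grantMask; `demo_mode_names` and `demo_decode_mask` are hypothetical names for this example.

```c
#include <stdio.h>
#include <string.h>

/* Lock mode names indexed by mode number; index 0 is not a real mode. */
static const char *const demo_mode_names[] = {
    "INVALID",
    "AccessShareLock",
    "RowShareLock",
    "RowExclusiveLock",
    "ShareUpdateExclusiveLock",
    "ShareLock",
    "ShareRowExclusiveLock",
    "ExclusiveLock",
    "AccessExclusiveLock"
};

/*
 * Write the names of all modes set in `mask` into buf, comma-separated.
 * Returns the number of names written; stops early if buf is too small.
 */
int demo_decode_mask(unsigned mask, char *buf, size_t buflen)
{
    int    count = 0;
    size_t used = 0;

    if (buflen == 0)
        return 0;
    buf[0] = '\0';

    for (int mode = 1; mode <= 8; mode++) {
        if ((mask & (1u << mode)) == 0)
            continue;
        int n = snprintf(buf + used, buflen - used, "%s%s",
                         count ? "," : "", demo_mode_names[mode]);
        if (n < 0 || (size_t) n >= buflen - used)
            break;                      /* out of room */
        used += (size_t) n;
        count++;
    }
    return count;
}
```

Calling `demo_decode_mask(8, buf, sizeof buf)` on the grantMask and holdMask values from the gdb output leaves "RowExclusiveLock" in `buf`.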

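PROCLOCK

  • tag: identifies the associated LOCK object (myLock) and the owning PGPROC (myProc).
  • holdMask = 8: matches the LOCK's grantMask; this process holds RowExclusiveLock on the lock object.
  • releaseMask = 0: this field is workspace for LockReleaseAll, and no such release is in progress.
  • lockLink: prev and next both point at the LOCK's procLocks list head (0x7f487e715978), so this PROCLOCK is the only node in that list.
  • procLink: links the PROCLOCK into the owning PGPROC's list of PROCLOCKs, which lives in shared memory and is not the local lock table; prev and next are equal, so it is the only node there as well.

The corresponding definitions from lock.h: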
/*
 * Per-locked-object lock information:
 *
 * tag -- uniquely identifies the object being locked
 * grantMask -- bitmask for all lock types currently granted on this object.
 * waitMask -- bitmask for all lock types currently awaited on this object.
 * procLocks -- list of PROCLOCK objects for this lock.
 * waitProcs -- queue of processes waiting for this lock.
 * requested -- count of each lock type currently requested on the lock
 *		(includes requests already granted!!).
 * nRequested -- total requested locks of all types.
 * granted -- count of each lock type currently granted on the lock.
 * nGranted -- total granted locks of all types.
 *
 * Note: these counts count 1 for each backend.  Internally to a backend,
 * there may be multiple grabs on a particular lock, but this is not reflected
 * into shared memory.
 */
typedef struct LOCK
{
	/* hash key */
	LOCKTAG		tag;			/* unique identifier of lockable object */

	/* data */
	LOCKMASK	grantMask;		/* bitmask for lock types already granted */
	LOCKMASK	waitMask;		/* bitmask for lock types awaited */
	dlist_head	procLocks;		/* list of PROCLOCK objects assoc. with lock */
	dclist_head waitProcs;		/* list of PGPROC objects waiting on lock */
	int			requested[MAX_LOCKMODES];	/* counts of requested locks */
	int			nRequested;		/* total of requested[] array */
	int			granted[MAX_LOCKMODES]; /* counts of granted locks */
	int			nGranted;		/* total of granted[] array */
} LOCK;


/*
 * We may have several different backends holding or awaiting locks
 * on the same lockable object.  We need to store some per-holder/waiter
 * information for each such holder (or would-be holder).  This is kept in
 * a PROCLOCK struct.
 *
 * PROCLOCKTAG is the key information needed to look up a PROCLOCK item in the
 * proclock hashtable.  A PROCLOCKTAG value uniquely identifies the combination
 * of a lockable object and a holder/waiter for that object.  (We can use
 * pointers here because the PROCLOCKTAG need only be unique for the lifespan
 * of the PROCLOCK, and it will never outlive the lock or the proc.)
 *
 * Internally to a backend, it is possible for the same lock to be held
 * for different purposes: the backend tracks transaction locks separately
 * from session locks.  However, this is not reflected in the shared-memory
 * state: we only track which backend(s) hold the lock.  This is OK since a
 * backend can never block itself.
 *
 * The holdMask field shows the already-granted locks represented by this
 * proclock.  Note that there will be a proclock object, possibly with
 * zero holdMask, for any lock that the process is currently waiting on.
 * Otherwise, proclock objects whose holdMasks are zero are recycled
 * as soon as convenient.
 *
 * releaseMask is workspace for LockReleaseAll(): it shows the locks due
 * to be released during the current call.  This must only be examined or
 * set by the backend owning the PROCLOCK.
 *
 * Each PROCLOCK object is linked into lists for both the associated LOCK
 * object and the owning PGPROC object.  Note that the PROCLOCK is entered
 * into these lists as soon as it is created, even if no lock has yet been
 * granted.  A PGPROC that is waiting for a lock to be granted will also be
 * linked into the lock's waitProcs queue.
 */
typedef struct PROCLOCKTAG
{
	/* NB: we assume this struct contains no padding! */
	LOCK	   *myLock;			/* link to per-lockable-object information */
	PGPROC	   *myProc;			/* link to PGPROC of owning backend */
} PROCLOCKTAG;

typedef struct PROCLOCK
{
	/* tag */
	PROCLOCKTAG tag;			/* unique identifier of proclock object */

	/* data */
	PGPROC	   *groupLeader;	/* proc's lock group leader, or proc itself */
	LOCKMASK	holdMask;		/* bitmask for lock types currently held */
	LOCKMASK	releaseMask;	/* bitmask for lock types to be released */
	dlist_node	lockLink;		/* list link in LOCK's list of proclocks */
	dlist_node	procLink;		/* list link in PGPROC's list of proclocks */
} PROCLOCK;