PG Server Process (Postgres): Latch Mechanism Functions

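This post describes how the Unix implementation uses the self-pipe trick to close the race between poll() and flags set in signal handlers, and how the Windows implementation uses Windows events for inter-process communication instead. It covers setting up the self-pipe and latch support, then creating latches, transferring their ownership, and waiting on them. Waiting goes through WaitLatchOrSocket, which cooperates with a signal handler to wake the waiter. SetLatch and ResetLatch set and clear a latch's state. Together these give processes a reliable and efficient way to synchronize and notify each other.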

inter-process latches
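The Unix implementation uses the so-called self-pipe trick to overcome the race condition between poll() (or epoll_wait() on Linux) and setting a global flag in a signal handler. When a latch is set and the current process is waiting for it, the signal handler wakes up the poll() in WaitLatch by writing a byte to a pipe. A signal by itself does not interrupt poll() on all platforms, and even on platforms where it does, a signal that arrives just before the poll() call does not prevent poll() from going to sleep. An incoming byte on a pipe, however, reliably interrupts the sleep and makes poll() return immediately, even if the signal arrived before poll() began.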

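When SetLatch is called from the same process that owns the latch, SetLatch writes the byte directly to the pipe. If the latch is owned by another process, SetLatch sends SIGUSR1, and the signal handler in the waiting process writes the byte to the pipe on behalf of the signaling process.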

The Windows implementation uses Windows events that are inherited by all postmaster child processes. There is no need for the self-pipe trick there.

Initializing the process-local latch facility

InitializeLatchSupport

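InitializeLatchSupport initializes the process-local latch facility. It must be called during process startup, before any call to InitLatch or OwnLatch.
In a postmaster child process, selfpipe_owner_pid records which process owns the self-pipe (it holds that process's PID). If it is nonzero, the child has inherited the file descriptors of a self-pipe created by the postmaster. The child closes those inherited descriptors first and only then creates a self-pipe of its own.
The function then sets up the self-pipe that lets a signal handler wake up the poll()/epoll_wait() in WaitLatch.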

void InitializeLatchSupport(void) {
	int			pipefd[2];
	if (IsUnderPostmaster) { /* in a postmaster child process */
		/* We might have inherited connections to a self-pipe created by the
		 * postmaster.  It's critical that child processes create their own
		 * self-pipes, of course, and we really want them to close the
		 * inherited FDs for safety's sake. */
		if (selfpipe_owner_pid != 0) {
			/* Assert we go through here but once in a child process */
			Assert(selfpipe_owner_pid != MyProcPid);
			/* Release postmaster's pipe FDs; ignore any error */
			(void) close(selfpipe_readfd);
			(void) close(selfpipe_writefd);
			/* Clean up, just for safety's sake; we'll set these below */ 
			selfpipe_readfd = selfpipe_writefd = -1;
			selfpipe_owner_pid = 0;
		}
		else
		{
			/* Postmaster didn't create a self-pipe ... or else we're in an
			 * EXEC_BACKEND build, in which case it doesn't matter since the
			 * postmaster's pipe FDs were closed by the action of FD_CLOEXEC. */
			Assert(selfpipe_readfd == -1);
		}
	} else {
		/* In postmaster or standalone backend, assert we do this but once */
		Assert(selfpipe_readfd == -1);
		Assert(selfpipe_owner_pid == 0);
	}

	/* Set up the self-pipe that allows a signal handler to wake up the
	 * poll()/epoll_wait() in WaitLatch. Make the write-end non-blocking, so
	 * that SetLatch won't block if the event has already been set many times
	 * filling the kernel buffer. Make the read-end non-blocking too, so that
	 * we can easily clear the pipe by reading until EAGAIN or EWOULDBLOCK.
	 * Also, make both FDs close-on-exec, since we surely do not want any
	 * child processes messing with them. */
	if (pipe(pipefd) < 0) elog(FATAL, "pipe() failed: %m");
	if (fcntl(pipefd[0], F_SETFL, O_NONBLOCK) == -1) elog(FATAL, "fcntl(F_SETFL) failed on read-end of self-pipe: %m");
	if (fcntl(pipefd[1], F_SETFL, O_NONBLOCK) == -1) elog(FATAL, "fcntl(F_SETFL) failed on write-end of self-pipe: %m");
	if (fcntl(pipefd[0], F_SETFD, FD_CLOEXEC) == -1) elog(FATAL, "fcntl(F_SETFD) failed on read-end of self-pipe: %m");
	if (fcntl(pipefd[1], F_SETFD, FD_CLOEXEC) == -1) elog(FATAL, "fcntl(F_SETFD) failed on write-end of self-pipe: %m");

	selfpipe_readfd = pipefd[0];
	selfpipe_writefd = pipefd[1];
	selfpipe_owner_pid = MyProcPid;

}
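
To check the descriptor flags that the function sets, a small stand-alone sketch can create a pipe the same way and read the flags back with fcntl(). The helper names demo_make_selfpipe, demo_fd_is_nonblocking and demo_fd_is_cloexec are hypothetical; they only mirror the pipe setup shown above and use return codes where the original uses elog(FATAL).

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: create a pipe whose two ends are non-blocking and
 * close-on-exec, like the self-pipe above. Returns 0 on success, -1 on error. */
static int demo_make_selfpipe(int fds[2])
{
	if (pipe(fds) < 0)
		return -1;
	for (int i = 0; i < 2; i++) {
		if (fcntl(fds[i], F_SETFL, O_NONBLOCK) == -1 ||
			fcntl(fds[i], F_SETFD, FD_CLOEXEC) == -1) {
			close(fds[0]);
			close(fds[1]);
			return -1;
		}
	}
	return 0;
}

/* True if the file status flags of fd include O_NONBLOCK. */
static bool demo_fd_is_nonblocking(int fd)
{
	int			flags = fcntl(fd, F_GETFL);

	return flags != -1 && (flags & O_NONBLOCK) != 0;
}

/* True if the descriptor flags of fd include FD_CLOEXEC. */
static bool demo_fd_is_cloexec(int fd)
{
	int			flags = fcntl(fd, F_GETFD);

	return flags != -1 && (flags & FD_CLOEXEC) != 0;
}
```

With the read end non-blocking, a read() on the empty pipe fails with EAGAIN or EWOULDBLOCK instead of hanging, which is what lets the pipe be drained by reading until that error appears.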

Initializing an individual latch

InitLatch

InitLatch initializes a process-local latch, that is, one the calling process sets up for its own use. The Latch struct has three members: is_set, is_shared, and owner_pid.

typedef struct Latch {
 sig_atomic_t is_set;
 bool  is_shared;
 int   owner_pid;
} Latch;

InitLatch fills in the struct: is_set is set to false, owner_pid to the current process's PID, and is_shared to false.

void InitLatch(Latch *latch){
	latch->is_set = false;
	latch->owner_pid = MyProcPid;
	latch->is_shared = false;
	/* Assert InitializeLatchSupport has been called in this process */
	Assert(selfpipe_readfd >= 0 && selfpipe_owner_pid == MyProcPid);
}

InitSharedLatch

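InitSharedLatch initializes a shared latch, one that can be set from other processes. A freshly initialized shared latch belongs to no process; call OwnLatch to associate it with the current process. InitSharedLatch must be called by the postmaster before it forks child processes, usually right after ShmemInitStruct has allocated the shared memory block that contains the latch.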
Note that other handles created in this module are never marked as inheritable. Thus we do not need to worry about cleaning up child process references to postmaster-private latches or WaitEventSets.

void InitSharedLatch(Latch *latch) {
	latch->is_set = false;
	latch->owner_pid = 0;
	latch->is_shared = true;
}

OwnLatch

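OwnLatch associates a shared latch with the current process, allowing that process to wait on it. Although the function checks whether the latch is already owned, it takes no lock, so it cannot detect the error when two processes try to take ownership of the same latch at about the same time. Where that can happen, the caller must provide an interlock that protects latch ownership. Any process that calls OwnLatch must also make sure that latch_sigusr1_handler is called from its SIGUSR1 signal handler, because shared latches use SIGUSR1 for inter-process communication.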

void OwnLatch(Latch *latch) {
	/* Sanity checks */
	Assert(latch->is_shared);
	/* Assert InitializeLatchSupport has been called in this process */
	Assert(selfpipe_readfd >= 0 && selfpipe_owner_pid == MyProcPid);
	if (latch->owner_pid != 0) elog(ERROR, "latch already owned");
	latch->owner_pid = MyProcPid;
}

DisownLatch

DisownLatch dissociates a shared latch from the current process.

void DisownLatch(Latch *latch) {
	Assert(latch->is_shared);
	Assert(latch->owner_pid == MyProcPid);
	latch->owner_pid = 0;
}
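
The ownership life cycle of a shared latch (initialized unowned, then owned, then disowned) can be modeled with a toy struct. This is only a sketch: OwnDemoLatch and the own_demo_* functions are hypothetical stand-ins that return false where the real functions would raise an error or trip an assertion, and like the original they perform a plain check-then-set with no locking.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Toy stand-in for the Latch struct; not PostgreSQL's definition. */
typedef struct OwnDemoLatch {
	volatile sig_atomic_t is_set;
	bool		is_shared;
	pid_t		owner_pid;		/* 0 means nobody owns the latch */
} OwnDemoLatch;

/* Like InitSharedLatch: the latch starts out unowned. */
static void own_demo_init_shared(OwnDemoLatch *latch)
{
	latch->is_set = 0;
	latch->owner_pid = 0;
	latch->is_shared = true;
}

/* Like OwnLatch, but reports "already owned" through the return value.
 * The check and the assignment are not atomic, so two processes racing
 * here would need an external interlock. */
static bool own_demo_own(OwnDemoLatch *latch)
{
	if (!latch->is_shared || latch->owner_pid != 0)
		return false;
	latch->owner_pid = getpid();
	return true;
}

/* Like DisownLatch: only the current owner may give the latch up. */
static bool own_demo_disown(OwnDemoLatch *latch)
{
	if (!latch->is_shared || latch->owner_pid != getpid())
		return false;
	latch->owner_pid = 0;
	return true;
}
```

A second own_demo_own() call on an owned latch fails, and after own_demo_disown() the latch can be owned again, possibly by a different process.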

Waiting on a latch

WaitLatch

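WaitLatch waits for the given latch to be set, for postmaster death, or for a timeout. wakeEvents is a bitmask that specifies which of those events to wait for. If the latch is already set (and WL_LATCH_SET is given), the function returns immediately.
The timeout is given in milliseconds and must be >= 0 if the WL_TIMEOUT flag is given. Although it is declared as long, timeouts longer than INT_MAX milliseconds are not supported.
The latch must be owned by the current process in order to wait on it; that is, it must be a process-local latch initialized with InitLatch, or a shared latch associated with the current process by calling OwnLatch.
The return value is a bitmask indicating which condition(s) caused the wake-up. If several wake-up conditions hold, there is no guarantee that all of them are returned in one call, but at least one will be.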

int WaitLatch(Latch *latch, int wakeEvents, long timeout, uint32 wait_event_info) {
	return WaitLatchOrSocket(latch, wakeEvents, PGINVALID_SOCKET, timeout, wait_event_info);
}
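
The timeout and return-mask rules can be written down as two tiny helpers. This is a sketch under stated assumptions: the DEMO_WL_* constants only imitate the WL_* flags and their numeric values are chosen for illustration, and demo_effective_timeout and demo_woken_by are hypothetical helpers, not PostgreSQL functions.

```c
#include <limits.h>
#include <stdbool.h>

/* Illustrative flag values for this sketch only. */
enum {
	DEMO_WL_LATCH_SET = 1 << 0,
	DEMO_WL_TIMEOUT = 1 << 3,
	DEMO_WL_POSTMASTER_DEATH = 1 << 4
};

#define DEMO_TIMEOUT_INVALID (-2)

/* Compute the timeout actually used for the wait:
 *   -1 (wait forever) when the timeout event was not requested,
 *   DEMO_TIMEOUT_INVALID when it was requested with an unsupported value,
 *   otherwise the timeout in milliseconds. */
static int demo_effective_timeout(int wake_events, long timeout_ms)
{
	if (!(wake_events & DEMO_WL_TIMEOUT))
		return -1;
	if (timeout_ms < 0 || timeout_ms > INT_MAX)
		return DEMO_TIMEOUT_INVALID;
	return (int) timeout_ms;
}

/* Test one bit of the result mask. Several bits may be set at once, so
 * callers should test bits rather than compare the mask with ==. */
static bool demo_woken_by(int result_mask, int event)
{
	return (result_mask & event) != 0;
}
```

A caller would typically check the latch bit first, handle the work, and check the timeout bit separately, since one wake-up may carry more than one reason.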

WaitLatchOrSocket

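WaitLatch simply calls WaitLatchOrSocket, which takes one extra parameter, a pgsocket, used with the WL_SOCKET_* events. WaitLatch passes PGINVALID_SOCKET, meaning there is no socket to wait on, only the latch.
When waiting on a socket, EOF and error conditions are always reported by returning the socket as readable/writable/connected, so the caller must be prepared to handle them.
In a process running under the postmaster, wakeEvents must include either WL_EXIT_ON_PM_DEATH or WL_POSTMASTER_DEATH. WL_EXIT_ON_PM_DEATH makes the process exit automatically when the postmaster dies.
WL_POSTMASTER_DEATH instead makes WaitLatchOrSocket set the WL_POSTMASTER_DEATH bit in its return mask, so the caller can tell that the postmaster has died.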

int WaitLatchOrSocket(Latch *latch, int wakeEvents, pgsocket sock, long timeout, uint32 wait_event_info) {
	int			ret = 0;
	int			rc;
	WaitEvent	event;
	WaitEventSet *set = CreateWaitEventSet(CurrentMemoryContext, 3);
	
	if (wakeEvents & WL_TIMEOUT) Assert(timeout >= 0);
	else timeout = -1;

	if (wakeEvents & WL_LATCH_SET)
		AddWaitEventToSet(set, WL_LATCH_SET, PGINVALID_SOCKET, latch, NULL);

	/* Postmaster-managed callers must handle postmaster death somehow. */
	Assert(!IsUnderPostmaster || (wakeEvents & WL_EXIT_ON_PM_DEATH) || (wakeEvents & WL_POSTMASTER_DEATH));

	if ((wakeEvents & WL_POSTMASTER_DEATH) && IsUnderPostmaster)
		AddWaitEventToSet(set, WL_POSTMASTER_DEATH, PGINVALID_SOCKET, NULL, NULL);

	if ((wakeEvents & WL_EXIT_ON_PM_DEATH) && IsUnderPostmaster)
		AddWaitEventToSet(set, WL_EXIT_ON_PM_DEATH, PGINVALID_SOCKET, NULL, NULL);

	if (wakeEvents & WL_SOCKET_MASK) {
		int			ev;
		ev = wakeEvents & WL_SOCKET_MASK;
		AddWaitEventToSet(set, ev, sock, NULL, NULL);
	}

	rc = WaitEventSetWait(set, timeout, &event, 1, wait_event_info);

	if (rc == 0) ret |= WL_TIMEOUT;
	else ret |= event.events & (WL_LATCH_SET | WL_POSTMASTER_DEATH |  WL_SOCKET_MASK);
	
	FreeWaitEventSet(set);

	return ret;
}

CreateWaitEventSet

FreeWaitEventSet

AddWaitEventToSet

ModifyWaitEvent

WaitEventSetWait

Setting a latch

SetLatch

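SetLatch sets a latch and wakes up anyone waiting on it. If it is called from a signal handler, make sure errno is saved before the call and restored after it. The function's main job is to set is_set in the latch and then wake up the waiting process, if there is one.
If the waiter is the current process, we must be running in a signal handler, and the self-pipe is used to wake up poll() or epoll_wait(). If the waiter is another process, SetLatch sends it SIGUSR1.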

void SetLatch(Latch *latch) {
	pid_t		owner_pid;
	/* The memory barrier has to be placed here to ensure that any flag variables possibly changed by this process have been flushed to main memory, before we check/set is_set. */
	pg_memory_barrier();
	/* Quick exit if already set */
	if (latch->is_set) return;
	latch->is_set = true;

	/* See if anyone's waiting for the latch. It can be the current process if
	 * we're in a signal handler. We use the self-pipe to wake up the
	 * poll()/epoll_wait() in that case. If it's another process, send a
	 * signal.
	 *
	 * Fetch owner_pid only once, in case the latch is concurrently getting
	 * owned or disowned. XXX: This assumes that pid_t is atomic, which isn't
	 * guaranteed to be true! In practice, the effective range of pid_t fits
	 * in a 32 bit integer, and so should be atomic. In the worst case, we
	 * might end up signaling the wrong process. Even then, you're very
	 * unlucky if a process with that bogus pid exists and belongs to
	 * Postgres; and PG database processes should handle excess SIGUSR1
	 * interrupts without a problem anyhow.
	 *
	 * Another sort of race condition that's possible here is for a new
	 * process to own the latch immediately after we look, so we don't signal
	 * it. This is okay so long as all callers of ResetLatch/WaitLatch follow
	 * the standard coding convention of waiting at the bottom of their loops,
	 * not the top, so that they'll correctly process latch-setting events
	 * that happen before they enter the loop.
	 */
	owner_pid = latch->owner_pid;
	if (owner_pid == 0) return;
	else if (owner_pid == MyProcPid) {
		if (waiting) sendSelfPipeByte();
	}
	else kill(owner_pid, SIGUSR1);	
}

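owner_pid is fetched only once, in case the latch is concurrently being owned or disowned. XXX: this assumes that pid_t is atomic, which is not guaranteed. In practice the effective range of pid_t fits in a 32-bit integer, so it should be atomic. In the worst case we might end up signaling the wrong process. Even then, you would be very unlucky if a process with that bogus PID exists and belongs to Postgres, and PG database processes should handle excess SIGUSR1 interrupts without a problem anyhow.

Another possible race is for a new process to own the latch immediately after we look, so that we do not signal it. This is okay as long as all callers of ResetLatch/WaitLatch follow the standard coding convention of waiting at the bottom of their loops, not the top, so that they correctly process latch-setting events that happened before they entered the loop.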
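
The "wait at the bottom of the loop" convention can be shown with a single-threaded simulation. Everything here is a toy: LoopDemoLatch, the loop_demo_* functions and the two work counters are hypothetical, and loop_demo_wait_latch() returns false at the point where a real WaitLatch would go to sleep.

```c
#include <stdbool.h>

/* Toy latch for this sketch; not PostgreSQL's Latch. */
typedef struct LoopDemoLatch {
	bool		is_set;
} LoopDemoLatch;

static int	loop_demo_pending_work = 0;	/* posted by a producer */
static int	loop_demo_done_work = 0;	/* completed by the worker */

static void loop_demo_set_latch(LoopDemoLatch *latch)
{
	latch->is_set = true;
}

static void loop_demo_reset_latch(LoopDemoLatch *latch)
{
	latch->is_set = false;
}

/* Stand-in for WaitLatch: true if the latch is set, false where the real
 * function would block. */
static bool loop_demo_wait_latch(LoopDemoLatch *latch)
{
	return latch->is_set;
}

/* Worker loop in the standard shape: reset, do the work, wait last.
 * Returns the number of iterations run before it would have gone to sleep. */
static int loop_demo_worker(LoopDemoLatch *latch)
{
	int			iterations = 0;

	for (;;) {
		iterations++;
		loop_demo_reset_latch(latch);
		while (loop_demo_pending_work > 0) {	/* check for work AFTER the reset */
			loop_demo_pending_work--;
			loop_demo_done_work++;
		}
		if (!loop_demo_wait_latch(latch))	/* waiting comes last */
			break;
	}
	return iterations;
}
```

Work that was posted (and the latch set) before the loop is entered is still processed in the first iteration, because the work check comes after the reset and before the wait. A loop that waited first would depend on the wake-up that the race described above can lose.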

sendSelfPipeByte

latch_sigusr1_handler

SetLatch uses SIGUSR1 to wake up the process waiting on a latch. The handler wakes up WaitLatch if we are currently waiting.

void latch_sigusr1_handler(void) {
	if (waiting) sendSelfPipeByte();
}


ResetLatch

ResetLatch clears the latch's is_set flag. A WaitLatch call made after it will sleep, unless the latch is set again before WaitLatch is called.

void ResetLatch(Latch *latch) {	
	Assert(latch->owner_pid == MyProcPid); /* Only the owner should reset the latch */
	latch->is_set = false;
	/* Ensure that the write to is_set gets flushed to main memory before we
	 * examine any flag variables.  Otherwise a concurrent SetLatch might
	 * falsely conclude that it needn't signal us, even though we have missed
	 * seeing some flag updates that SetLatch was supposed to inform us of. */
	pg_memory_barrier();
}
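
The placement of the barriers in SetLatch and ResetLatch can be imitated in portable C11. This is a sketch under an assumption: it treats atomic_thread_fence(memory_order_seq_cst) as the closest standard counterpart of pg_memory_barrier(), and FenceDemoLatch, fence_demo_work_flag and the fence_demo_* functions are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy latch for this sketch; not PostgreSQL's Latch. */
typedef struct FenceDemoLatch {
	atomic_int	is_set;
} FenceDemoLatch;

/* Hypothetical "there is work" flag that a setter raises before setting
 * the latch and that the owner examines after resetting it. */
static atomic_int fence_demo_work_flag;

/* Setter side: the barrier comes BEFORE is_set is checked or written, so
 * earlier flag updates are visible first. Returns true if a wake-up would
 * have to be sent, false on the "already set" quick exit. */
static bool fence_demo_set_latch(FenceDemoLatch *latch)
{
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&latch->is_set, memory_order_relaxed))
		return false;
	atomic_store_explicit(&latch->is_set, 1, memory_order_relaxed);
	return true;
}

/* Owner side: the barrier comes AFTER is_set is cleared, so the clear is
 * visible before the owner goes on to examine its flags. */
static void fence_demo_reset_latch(FenceDemoLatch *latch)
{
	atomic_store_explicit(&latch->is_set, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}
```

The pairing is the point: the setter writes its flag, fences, then looks at is_set, while the owner clears is_set, fences, then looks at the flag. With both fences in place, at least one side should observe the other's write, so a setter cannot take the quick exit while the owner also misses the flag update. A single-threaded run only exercises the logic, not the ordering.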

drainSelfPipe

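drainSelfPipe reads all data available on the self-pipe. It is called only while waiting is true, which is why its error paths clear that flag before reporting the error.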

static void drainSelfPipe(void) {
	/* There shouldn't normally be more than one byte in the pipe, or maybe a few bytes if multiple processes run SetLatch at the same instant. */
	char		buf[16];
	int			rc;
	for (;;) {
		rc = read(selfpipe_readfd, buf, sizeof(buf));
		if (rc < 0) {
			if (errno == EAGAIN || errno == EWOULDBLOCK) break;			/* the pipe is empty */
			else if (errno == EINTR) continue;		/* retry */
			else {
				waiting = false;
				elog(ERROR, "read() on self-pipe failed: %m");
			}
		}else if (rc == 0){
			waiting = false;
			elog(ERROR, "unexpected EOF on self-pipe");
		}else if (rc < sizeof(buf)) {			
			break; /* we successfully drained the pipe; no need to read() again */
		}
		/* else buffer wasn't big enough, so read again */
	}
}
