本节简单介绍了PostgreSQL缓存管理(Buffer Manager)中的Clock Sweep算法,其主要逻辑在StrategyGetBuffer函数中实现。
一、数据结构
BufferDesc
共享缓冲区的共享描述符(状态)数据
/*
* Flags for buffer descriptors
* buffer描述器标记
*
* Note: TAG_VALID essentially means that there is a buffer hashtable
* entry associated with the buffer's tag.
* 注意:TAG_VALID本质上意味着有一个与缓冲区的标记相关联的缓冲区散列表条目。
*/
//buffer header锁定
#define BM_LOCKED (1U << 22) /* buffer header is locked */
//数据需要写入(标记为DIRTY)
#define BM_DIRTY (1U << 23) /* data needs writing */
//数据是有效的
#define BM_VALID (1U << 24) /* data is valid */
//已分配buffer tag
#define BM_TAG_VALID (1U << 25) /* tag is assigned */
//正在R/W
#define BM_IO_IN_PROGRESS (1U << 26) /* read or write in progress */
//上一个I/O出现错误
#define BM_IO_ERROR (1U << 27) /* previous I/O failed */
//开始写则变DIRTY
#define BM_JUST_DIRTIED (1U << 28) /* dirtied since write started */
//存在等待sole pin的其他进程
#define BM_PIN_COUNT_WAITER (1U << 29) /* have waiter for sole pin */
//checkpoint发生,必须刷到磁盘上
#define BM_CHECKPOINT_NEEDED (1U << 30) /* must write for checkpoint */
//持久化buffer(不是unlogged或者初始化fork)
#define BM_PERMANENT (1U << 31) /* permanent buffer (not unlogged,
* or init fork) */
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
* BufferDesc -- 共享缓冲区的共享描述符(状态)数据
*
* Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
* the tag, state or wait_backend_pid fields. In general, buffer header lock
* is a spinlock which is combined with flags, refcount and usagecount into
* single atomic variable. This layout allow us to do some operations in a
* single atomic operation, without actually acquiring and releasing spinlock;
* for instance, increase or decrease refcount. buf_id field never changes
* after initialization, so does not need locking. freeNext is protected by
* the buffer_strategy_lock not buffer header lock. The LWLock can take care
* of itself. The buffer header lock is *not* used to control access to the
* data in the buffer!
* 注意:必须持有Buffer header锁(BM_LOCKED标记)才能检查或修改tag/state/wait_backend_pid字段.
* 通常来说,buffer header lock是spinlock,它与标记位/参考计数/使用计数组合到单个原子变量中.
* 这个布局设计允许我们执行原子操作,而不需要实际获得或者释放spinlock(比如,增加或者减少参考计数).
* buf_id字段在初始化后不会出现变化,因此不需要锁定.
* freeNext通过buffer_strategy_lock锁而不是buffer header lock保护.
* LWLock可以很好的处理自己的状态.
* 务请注意的是:buffer header lock不用于控制buffer中的数据访问!
*
* It's assumed that nobody changes the state field while buffer header lock
* is held. Thus buffer header lock holder can do complex updates of the
* state variable in single write, simultaneously with lock release (cleaning
* BM_LOCKED flag). On the other hand, updating of state without holding
* buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
* is not set. Atomic increment/decrement, OR/AND etc. are not allowed.
* 假定在持有buffer header lock的情况下,没有人改变状态字段.
* 持有buffer header lock的