Thoughts prompted by the Percona 5.5 parameter innodb_adaptive_flushing_method

What follows are my own scattered notes. They are disorganized and rough around the edges.

-------------------

Parameter: innodb_adaptive_flushing_method.

It controls how dirty pages are flushed and can be changed dynamically.

 

It accepts the following three values:

0(native)

This setting causes checkpointing to operate exactly as it does in native InnoDB.

1(estimate)

If the oldest modified age exceeds 1/2 of the maximum age capacity, InnoDB starts flushing blocks every second. The number of blocks flushed is determined by [number of modified blocks], [LSN progress speed] and [average age of all modified blocks]. So, this behavior is independent of the innodb_io_capacity variable.

 2(keep_average)

This method attempts to keep the I/O rate constant by using a much shorter loop cycle (0.1 second) than that of the other methods (1.0 second). It is designed for use with SSD cards.

 

Percona 5.5.18 does not have the reflex option.

Depending on the value of this option, the thread function srv_master_thread takes a different branch.

srv0srv.c:3286
if (UNIV_UNLIKELY(buf_get_modified_ratio_pct()
                > srv_max_buf_pool_modified_pct)) {
    /* The dirty page ratio exceeds srv_max_buf_pool_modified_pct
       (75% by default): flush dirty pages with
       n_pages_flushed = buf_flush_list(PCT_IO(100), IB_ULONGLONG_MAX) */
} else if (srv_adaptive_flushing && srv_adaptive_flushing_method == 0) {
    /* native (stock InnoDB behavior): first compute the desired flush
       rate from how fast dirty pages are being generated
       (buf_flush_get_desired_flush_rate), then flush */
} else if (srv_adaptive_flushing && srv_adaptive_flushing_method == 1) {
    /* estimate */
} else if (srv_adaptive_flushing && srv_adaptive_flushing_method == 2) {
    /* keep_average */
}

The code shows that many of the ratios are hard-coded.

What each branch does is determine the LSN range of dirty pages to flush and then call buf_flush_list.
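The order in which the branches are tried can be summarized in a small standalone C sketch. This is not the InnoDB source: the enum, the function name choose_flush_branch, and passing the settings as parameters instead of reading the srv_* globals are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical labels for the outcomes discussed above. */
typedef enum {
    FLUSH_MAX_DIRTY,     /* dirty ratio above the limit */
    FLUSH_NATIVE,        /* method 0 */
    FLUSH_ESTIMATE,      /* method 1 */
    FLUSH_KEEP_AVERAGE,  /* method 2 */
    FLUSH_NONE           /* no branch applies in this pass */
} flush_branch_t;

/* Follows the order of the if / else-if chain: the dirty-ratio check
   is tested first, and only then is the adaptive method consulted. */
flush_branch_t
choose_flush_branch(double dirty_pct, double max_dirty_pct,
                    bool adaptive_flushing, unsigned method)
{
    if (dirty_pct > max_dirty_pct) {
        return FLUSH_MAX_DIRTY;
    }
    if (!adaptive_flushing) {
        return FLUSH_NONE;
    }
    switch (method) {
    case 0:  return FLUSH_NATIVE;
    case 1:  return FLUSH_ESTIMATE;
    case 2:  return FLUSH_KEEP_AVERAGE;
    default: return FLUSH_NONE;
    }
}
```

For example, choose_flush_branch(80.0, 75.0, true, 1) selects the dirty-ratio branch even though the method is estimate, because that check comes first.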

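An interesting macro is PCT_IO. Here is its definition:

#define PCT_IO(p) ((ulong) (srv_io_capacity * ((double) p / 100.0)))

srv_io_capacity defaults to 200 and represents the IOPS capability of the server's disks. If your storage is fast, go ahead and raise innodb_io_capacity.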
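To make the arithmetic concrete, here is a minimal sketch of the same calculation as a plain function. The name pct_io and the explicit io_capacity parameter are assumptions for illustration. The real macro reads the global srv_io_capacity.

```c
/* Same arithmetic as PCT_IO, with the capacity passed in explicitly
   instead of read from the global srv_io_capacity. */
unsigned long
pct_io(unsigned long io_capacity, double pct)
{
    return (unsigned long) (io_capacity * (pct / 100.0));
}
```

With the default capacity of 200, pct_io(200, 100.0) asks for 200 blocks per batch. Raising the capacity to 1000 scales the same call to 1000.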

 

Another question is how checkpoints work in InnoDB.

The following is excerpted from the web, with my own rough paraphrase:

/*----------------------------------------------------------------begin--------------------------------------------------------------------------------------*/

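As we know, there are two types of checkpoint: sharp and fuzzy. A sharp checkpoint flushes to disk all pages modified by committed transactions and records the LSN of the most recently committed transaction. Pages modified by uncommitted transactions are not flushed. Crash recovery can then start from the LSN recorded by the checkpoint.

A sharp checkpoint is called “sharp” because everything that is flushed to disk for the checkpoint is consistent as of a single point in time: the checkpoint LSN.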

 

A fuzzy checkpoint is more complex than a sharp checkpoint. It records two LSNs: the LSN at which the checkpoint started and the LSN at which it ended. It is described as follows:

A fuzzy checkpoint is more complex. It flushes pages as time passes, until it has flushed all pages that a sharp checkpoint would have done. It completes by writing down two LSNs: when the checkpoint started and when it ended. But the pages it flushed might not all be consistent with each other as of a single point in time, which is why it’s called “fuzzy.” A page that got flushed early might have been modified since then, and a page that got flushed late might have a newer LSN than the starting LSN. A fuzzy checkpoint can conceptually be converted into a sharp checkpoint by performing REDO from the starting LSN to the ending LSN. Upon recovery, then, REDO can begin from the LSN at which the checkpoint started.

 

InnoDB performs a sharp checkpoint at shutdown and fuzzy checkpoints during normal operation, and its fuzzy checkpointing also differs from the theoretical description.

 

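InnoDB keeps file pages in a large buffer pool. A modified page is not written to disk immediately. It stays in memory as a dirty page, in the hope that several modifications can be merged into a single write. InnoDB tracks the pages in the buffer pool with several lists:

the free list notes which pages are available to be used;

the LRU list notes which pages have been used least recently;

the flush list contains all of the dirty pages in LSN order, least-recently-modified first.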

When a page must be read from disk and the buffer pool has no free slot, dirty pages have to be flushed to make room, which is slow. To avoid this:

InnoDB flushes the oldest-modified pages from the flush list on a regular basis, trying to keep from hitting certain high-water marks. It chooses the pages based on their physical locations on disk and their LSN (which is their modification time).

 

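Besides staying clear of the high-water marks, InnoDB must also avoid hitting a low-water mark, which would incur even higher I/O cost. InnoDB writes its log circularly into fixed-size log files, so log space cannot be reused until the dirty pages it covers have been flushed.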

 

When InnoDB flushes dirty pages to disk, it finds the oldest LSN among the remaining dirty pages and uses it as the checkpoint's low-water mark. It then writes that LSN to the transaction log header (in log_checkpoint_margin() or log_checkpoint()).

Therefore, every time InnoDB flushes dirty pages from the head of the flush list, it is actually making a checkpoint by advancing the oldest LSN in the system. And that is how continual fuzzy checkpointing is implemented without ever “doing a checkpoint” as a separate event. If there is a server crash, then recovery simply proceeds from the oldest LSN onwards.

 

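When InnoDB shuts down, it first stops all updates, then flushes all dirty pages to disk, and then writes the current LSN to the transaction log header. It additionally writes the LSN to the header of every data file.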

/*----------------------------------------------------------------------------------------end--------------------------------------------------------------------------------------*/
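The idea in the excerpt above, that flushing from the old end of the flush list is itself what advances the checkpoint, can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the flush list is a plain ascending array of LSNs, and the names toy_flush_list_t, toy_flush_oldest and toy_checkpoint_lsn are hypothetical, not InnoDB functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy flush list: oldest_modification LSNs of the dirty pages, kept in
   ascending order so index 0 is the least recently modified page. */
typedef struct {
    const uint64_t *oldest_mod;  /* ascending LSNs */
    size_t          n_pages;     /* dirty pages still on the list */
} toy_flush_list_t;

/* "Flush" up to n pages from the old end of the list. */
void
toy_flush_oldest(toy_flush_list_t *fl, size_t n)
{
    if (n > fl->n_pages) {
        n = fl->n_pages;
    }
    fl->oldest_mod += n;
    fl->n_pages -= n;
}

/* LSN from which recovery would have to start: the oldest remaining
   modification, or the current LSN when nothing is dirty. */
uint64_t
toy_checkpoint_lsn(const toy_flush_list_t *fl, uint64_t current_lsn)
{
    return fl->n_pages > 0 ? fl->oldest_mod[0] : current_lsn;
}
```

With dirty pages at LSNs 100, 120, 150 and 180, the checkpoint LSN is 100. After flushing the two oldest pages it becomes 150, without any separate checkpoint step.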

 

srv_master_thread calls log_free_check to check whether the log buffer needs flushing or a new checkpoint is needed. Its comment reads:

/*Checks if there is need for a log buffer flush or a new checkpoint, and does this if yes. Any database operation should call this when it has modified more than about 4 pages. NOTE that this function may only be called when the OS thread owns no synchronization objects except the dictionary mutex.*/
UNIV_INLINE
void
log_free_check(void)
/*================*/
{
 
    if (log_sys->check_flush_or_checkpoint) {
        log_check_margins();
    }  
}

log_sys->check_flush_or_checkpoint must be TRUE for this to trigger.

log_sys is a global structure (log_struct).

The comment on check_flush_or_checkpoint reads:

this is set to TRUE when there may be need to flush the log buffer, or preflush buffer pool pages, or make a checkpoint; this MUST be TRUE when lsn - last_checkpoint_lsn > max_checkpoint_age; this flag is peeked at by log_free_check(), which does not reserve the log mutex

check_flush_or_checkpoint may be set to TRUE in the following functions:

log_init(void)

log_close(void)

log_checkpoint_margin

It is set to FALSE in log_checkpoint_margin.

 

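log_check_margins() does two things:

-------- Flush the log:

log_flush_margin();

    If a flush is already in progress, do nothing. Otherwise, perform the flush:

    lsn = log->lsn;

    log_write_up_to(lsn, LOG_NO_WAIT, FALSE);

 

------- Set the checkpoint:

log_checkpoint_margin(); it mainly does two things:

Flush dirty data pages

Write the checkpoint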

 

oldest_lsn = log_buf_pool_get_oldest_modification();

It first finds the oldest LSN in the buffer pool. The function actually called is buf_pool_get_oldest_modification:

for (i = 0; i < srv_buf_pool_instances; i++) {
         buf_pool_t*   buf_pool;
         buf_pool = buf_pool_from_array(i);
         buf_flush_list_mutex_enter(buf_pool);
         bpage = UT_LIST_GET_LAST(buf_pool->flush_list);
         if (bpage != NULL) {
              ut_ad(bpage->in_flush_list);
              lsn = bpage->oldest_modification;
         }
         buf_flush_list_mutex_exit(buf_pool);
         if (!oldest_lsn || oldest_lsn > lsn) {
              oldest_lsn = lsn;
         }
     }

The logic here is simple: find the oldest LSN across all buffer pool instances. Let's return to log_checkpoint_margin and continue:
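As a standalone illustration of that scan, the sketch below takes the per-instance oldest LSNs as a plain array. The function name and the convention that 0 marks an empty flush list are assumptions of this sketch. Unlike the quoted loop, it skips empty instances explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/* per_instance[i] is the oldest_modification at the tail of instance
   i's flush list, or 0 when that flush list is empty (an assumption
   of this sketch). Returns 0 when no instance has dirty pages. */
uint64_t
oldest_lsn_across_instances(const uint64_t *per_instance, size_t n)
{
    uint64_t oldest = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t lsn = per_instance[i];

        if (lsn == 0) {
            continue;  /* nothing dirty in this instance */
        }
        if (oldest == 0 || lsn < oldest) {
            oldest = lsn;
        }
    }
    return oldest;
}
```

For instances reporting 0, 500, 300, 0 and 400, the result is 300.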

    if (age > log->max_modified_age_sync) {
        /* A flush is urgent: we have to do a synchronous preflush */
        sync = TRUE;
        advance = 2 * (age - log->max_modified_age_sync);
        /* Reached when log->lsn - oldest_lsn exceeds about 15/16 of the
           log space: dirty pages covering 2 * (age - max_modified_age_sync)
           of LSN are force-flushed, and transactions stall meanwhile. */
    } else if (age > log_max_modified_age_async()) {
        /* A flush is not urgent: we do an asynchronous preflush */
        advance = age - log_max_modified_age_async();
        /* Reached when age exceeds 7/8 of min(log space,
           srv_checkpoint_age_target): the flush is asynchronous and
           does not block transactions. */
    } else {
        advance = 0;
    }

This first step computes the LSN range that needs to be flushed (advance).
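The three cases can be tried out with a small sketch that takes the thresholds as parameters. The struct preflush_plan_t and the function plan_preflush are hypothetical names. In InnoDB the thresholds come from log->max_modified_age_sync and log_max_modified_age_async().

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of the decision: how far to advance and whether to block. */
typedef struct {
    uint64_t advance;  /* LSN distance to flush, 0 if none */
    bool     sync;     /* true when the preflush must be synchronous */
} preflush_plan_t;

preflush_plan_t
plan_preflush(uint64_t age, uint64_t max_age_sync, uint64_t max_age_async)
{
    preflush_plan_t plan = {0, false};

    if (age > max_age_sync) {
        /* urgent: flush twice the overshoot, synchronously */
        plan.sync = true;
        plan.advance = 2 * (age - max_age_sync);
    } else if (age > max_age_async) {
        /* not urgent: flush just the overshoot, asynchronously */
        plan.advance = age - max_age_async;
    }
    return plan;
}
```

Taking a log space of 1600 as an example, the thresholds would be 1500 (15/16) and 1400 (7/8). An age of 1550 then yields a synchronous advance of 100, an age of 1450 an asynchronous advance of 50, and an age of 1000 no flush at all.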

 

    if (checkpoint_age > log->max_checkpoint_age) {
        /* A checkpoint is urgent: we do it synchronously */
        checkpoint_sync = TRUE;
        do_checkpoint = TRUE;

    } else if (checkpoint_age > log_max_checkpoint_age_async()) {
        /* A checkpoint is not urgent: do it asynchronously */

        do_checkpoint = TRUE;

        log->check_flush_or_checkpoint = FALSE;
    } else {
        log->check_flush_or_checkpoint = FALSE;
    }

Similarly, it decides whether a checkpoint is needed.
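The checkpoint decision, including what happens to the check_flush_or_checkpoint flag, can be sketched the same way. The enum ckpt_action_t and the function plan_checkpoint are hypothetical names, and the flag is passed as a pointer rather than stored in log_sys.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { CKPT_NONE, CKPT_ASYNC, CKPT_SYNC } ckpt_action_t;

/* Follows the quoted branches: the flag is cleared in the asynchronous
   and no-op cases, and left untouched in the urgent case. */
ckpt_action_t
plan_checkpoint(uint64_t checkpoint_age, uint64_t max_age,
                uint64_t max_age_async, bool *check_flag)
{
    if (checkpoint_age > max_age) {
        return CKPT_SYNC;  /* flag stays as it was */
    }
    *check_flag = false;
    return checkpoint_age > max_age_async ? CKPT_ASYNC : CKPT_NONE;
}
```

Because the flag is not cleared in the urgent case, a later call to log_free_check() would still see it set and re-enter log_check_margins().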

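Then it performs the actual work:

    ib_uint64_t   new_oldest = oldest_lsn + advance;
    success = log_preflush_pool_modified_pages(new_oldest, sync);
    /* flush dirty buffer pool pages up to new_oldest */

    if (do_checkpoint) {
        log_checkpoint(checkpoint_sync, FALSE);
        /* write the checkpoint */
    }

log_preflush_pool_modified_pages calls buf_flush_list -> buf_flush_batch -> buf_flush_buffered_writes to advance the current oldest LSN to the new position. If sync is TRUE, it blocks until the flush completes (buf_flush_wait_batch_end).

log_checkpoint() records the checkpoint. It looks up the LSN of the oldest modification in the buffer pool and writes that LSN to the log file.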


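As an aside, the following is taken from the web:

--------------------------------

Starting with MySQL 5.5.4, a new variable, innodb_buffer_pool_instances, specifies the number of independent buffer pools.

With this parameter set, InnoDB splits the buffer pool into several smaller pools, each with its own LRU list, free list, and flush list, which reduces contention on buffer pool resources.

The parameter reduces contention by hashing pages across the instances. Sometimes, however, we know that most of the contention is concentrated on a single table, and innodb_buffer_pool_instances cannot help there. InnoDB independent buffer pool lets you place specific tables into a dedicated buffer pool of a given size, which reduces contention on those particular tables. At present, using independent buffer pool requires innodb_file_per_table to be ON.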


