MogDB 学习笔记系列之 — 认识checkpoint

原文链接:http://www.killdb.com/2021/12/01/mogdb-%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B0%E7%B3%BB%E5%88%97%E4%B9%8B-%E8%AE%A4%E8%AF%86checkpoint/

我们继续学习了解MogDB的核心线程。对于关系数据库而言,检查点是一个尤为重要的机制,检查点机制的优劣可以说决定了数据库性能的高低和稳定性。
这里我们继续来学习了解在MogDB数据库中,checkpoint机制是怎么样的。

7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
[omm@mogdb ~]$ gsql -d enmotech -p 26000
gsql ((MogDB 2.0.0 build b75b585a) compiled at 2021-05-28 17:20:47 commit 0 last mr  )
NOTICE : The password has been expired, please change the password. 
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.
 
enmotech=# \copyright
MogDB Kernel Database Management System
Copyright (c) Yunhe Enmo (Beijing) Information Technology Co., Ltd. Copyright © 2009-2020 , All rights reserved.
 
enmotech=# select name,setting,category,context from pg_settings where name like '%checkpoint%'; 
              name              | setting |              category               |  context   
--------------------------------+---------+-------------------------------------+------------
 checkpoint_completion_target   | 0.5     | Write-Ahead Log / Checkpoints       | sighup
 checkpoint_flush_after         | 32      | Write-Ahead Log / Checkpoints       | sighup
 checkpoint_segments            | 64      | Write-Ahead Log / Checkpoints       | sighup
 checkpoint_timeout             | 900     | Write-Ahead Log / Checkpoints       | sighup
 checkpoint_wait_timeout        | 60      | Write-Ahead Log / Checkpoints       | sighup
 checkpoint_warning             | 300     | Write-Ahead Log / Checkpoints       | sighup
 enable_incremental_checkpoint  | on      | Write-Ahead Log / Checkpoints       | postmaster
 incremental_checkpoint_timeout | 60      | Write-Ahead Log / Checkpoints       | sighup
 log_checkpoints                | off     | Reporting and Logging / What to Log | sighup
(9 rows)
 
enmotech=#

首先我们从参数来看,涉及到检查点相关的参数一共有9个。这里针对上述几个参数进行简单解释;

checkpoint_completion_target 检查点完成的目标,默认参数为0.5,表示每个checkpoint需要在checkpoints间隔时间的50%内完成

checkpoint_flush_after 设置checkpointer线程在连续写多少个磁盘页后会进行异步刷盘操作,默认值为32,表示checkpointer线程连续写32个磁盘页,即32*8=256KB磁盘空间后会进行异步刷盘。

checkpoint_segments 设置checkpoint_timeout周期内所保留的最少WAL日志段文件数量

checkpoint_timeout 自动WAL检查点之间的最长时间,默认为900s

checkpoint_wait_timeout 请求检查点等待checkpointer线程启动的最长时间,默认为60s

checkpoint_warning 填充检查点段文件导致检查点发生的时间间隔接近这个参数表示的秒数,就向服务器日志发送一个建议增加checkpoint_segments值的消息;有大量warnning说明checkpoint_segments设置偏小

enable_incremental_checkpoint 表示是否启用增量检查点功能

incremental_checkpoint_timeout 增量检查点开关打开之后,设置自动WAL检查点之间的最长时间

对于增量检查点技术而言,我们Oracle DBA们并不陌生,因为在Oracle 8i就已经有了。通过增量检查点技术,可以让数据库相对均匀的进行脏数据刷新;
同时可以减少数据库实例崩溃后的恢复时间。实际上就现在而言,很多数据库都还没有实现增量检查点功能,比如PostgreSQL或者国产的达梦数据库。

就拿postgreSQL来讲,每次检查点出发后,脏页需要在checkpoint_timeout*checkpoint_completion_target的时间范围内必须写完,如果脏数据较多,而IO又不是太强劲的话,那么可能有一些问题。
因为每次进行大量脏页写出,势必会影响整个数据库。可能这也是为什么之前在测试PostgreSQL发现性能抖动比较厉害,峰刺较多。

我们通过创建测试表并跟踪checkpoint 线程来观察一下:

++++创建测试表
enmotech=# insert into test1123 select * from test1123;
INSERT 0 153856
enmotech=# insert into test1123 select * from test1123;
INSERT 0 307712
enmotech=# checkpoint;
CHECKPOINT
enmotech=#                 
enmotech=# SELECT OID,relname FROM pg_class where OID=16453;
  oid  | relname  
-------+----------
 16453 | test1123
(1 row)
 
 
 
++++获取checkpoint线程信息
 
[omm@mogdb ~]$ ps -ef|grep mogdb |grep -v grep
avahi      9129      1  0 01:02 ?        00:00:03 avahi-daemon: running [mogdb.local]
omm       16305      1 99 07:01 pts/1    00:15:26 /data/mogdb/bin/mogdb -D /data/mogdb_b75b585a/data/db1
[omm@mogdb ~]$ ps -T -p 16305
   PID   SPID TTY          TIME CMD
 16305  16305 pts/1    00:00:01 mogdb
 16305  16306 pts/1    00:00:00 jemalloc_bg_thd
 16305  16309 pts/1    00:00:00 mogdb
 16305  16310 pts/1    00:00:00 syslogger
 16305  16311 pts/1    00:00:00 alarm
 16305  16312 pts/1    00:00:00 jemalloc_bg_thd
 16305  16313 pts/1    00:00:00 jemalloc_bg_thd
 16305  16314 pts/1    00:00:00 reaper
 16305  16315 pts/1    00:00:00 jemalloc_bg_thd
 16305  16341 pts/1    00:00:00 checkpointer
 16305  16343 pts/1    00:00:02 pagewriter
 16305  16344 pts/1    00:00:00 pagewriter
 16305  16345 pts/1    00:00:00 bgwriter
 16305  16346 pts/1    00:00:00 bgwriter
 16305  16347 pts/1    00:00:00 CBMwriter
 16305  16348 pts/1    00:11:33 WALwriter
 16305  16349 pts/1    00:00:00 WALwriteraux
 16305  16350 pts/1    00:00:00 AVClauncher
 16305  16351 pts/1    00:00:00 Jobscheduler
 16305  16352 pts/1    00:00:00 statscollector
 16305  16353 pts/1    00:00:01 snapshotworker
 16305  16354 pts/1    00:03:45 percentworker
 16305  16355 pts/1    00:00:06 ashworker
 16305  16356 pts/1    00:00:02 TrackStmtWorker
 16305  16357 pts/1    00:00:00 auditor
 16305  16358 pts/1    00:00:00 2pccleaner
 16305  16359 pts/1    00:00:00 faultmonitor
[omm@mogdb ~]$ 
 
++++跟踪checkpoint线程
[root@mogdb ~]# strace -fr -o /tmp/16341.log -p 16341
strace: Process 16341 attached with 28 threads
strace: Process 16600 attached
strace: Process 16601 attached
strace: Process 16602 attached
 ^Cstrace: Process 16341 detached
strace: Process 16305 detached
strace: Process 16306 detached
strace: Process 16309 detached
strace: Process 16310 detached
strace: Process 16311 detached
strace: Process 16312 detached
strace: Process 16313 detached
strace: Process 16314 detached
strace: Process 16315 detached
strace: Process 16343 detached
strace: Process 16344 detached
strace: Process 16345 detached
strace: Process 16346 detached
strace: Process 16347 detached
strace: Process 16348 detached
strace: Process 16349 detached
strace: Process 16350 detached
strace: Process 16351 detached
strace: Process 16352 detached
strace: Process 16353 detached
strace: Process 16354 detached
strace: Process 16355 detached
strace: Process 16356 detached
strace: Process 16357 detached
strace: Process 16358 detached
strace: Process 16359 detached
strace: Process 16586 detached

通过查看跟踪文件,我们发现checkpoint线程即16341仅仅写了26号文件:

[root@mogdb tmp]# cat /tmp/16341.log |grep "pwrite"|grep 16341
16341      0.000004 pwrite64(26, "\0\0\0\0\256e\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 0 <unfinished ...>
16341      0.000016 <... pwrite64 resumed> ) = 8192
16586      0.000047 pwrite64(112, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 51634176 <unfinished ...>
16346      0.000007 pwrite64(112, "\0\0\0\0p\225\32\35RU@\0\240\0\340\0\0 \6 \0\0\0\0\215\25\1\0\0\0\0\0"..., 8192, 51634176 <unfinished ...>
16346      0.000010 pwrite64(112, "\0\0\0\0p\225\32\35\255\31@\0\240\0\340\0\0 \6 \0\0\0\0\215\25\1\0\0\0\0\0"..., 8192, 51634176 <unfinished ...>
16341      0.000011 pwrite64(26, "\0\0\1\0\r&\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 0 <unfinished ...>
16341      0.000003 <... pwrite64 resumed> ) = 8192
[root@mogdb tmp]#    
 
+++获取进程操作句柄信息 
 
[omm@mogdb ~]$  
[omm@mogdb ~]$ ls -ltr /proc/16305/fd
total 0
l-wx------. 1 omm dbgrp 64 Nov 23 07:13 2 -> pipe:[156914]
lrwx------. 1 omm dbgrp 64 Nov 23 07:15 99 -> /data/mogdb_b75b585a/data/db1/base/16423/14619
lrwx------. 1 omm dbgrp 64 Nov 23 07:15 98 -> /data/mogdb_b75b585a/data/db1/base/16423/14743
......
lr-x------. 1 omm dbgrp 64 Nov 23 07:15 31 -> pipe:[156210]
l-wx------. 1 omm dbgrp 64 Nov 23 07:15 30 -> pipe:[156920]
l-wx------. 1 omm dbgrp 64 Nov 23 07:15 3 -> /data/mogdb_b75b585a/data/db1/pg_ctl.lock
lr-x------. 1 omm dbgrp 64 Nov 23 07:15 29 -> pipe:[156920]
lrwx------. 1 omm dbgrp 64 Nov 23 07:15 28 -> /data/mogdb_b75b585a/data/db1/pg_cbm/pg_xlog_1_0000000009000258_0000000000000000.cbm
lrwx------. 1 omm dbgrp 64 Nov 23 07:15 27 -> /data/mogdb_b75b585a/data/db1/global/pg_dw_single
lrwx------. 1 omm dbgrp 64 Nov 23 07:15 26 -> /data/mogdb_b75b585a/data/db1/global/pg_dw
lr-x------. 1 omm dbgrp 64 Nov 23 07:15 25 -> pipe:[156209]
......

我们可以看到26号文件即double write文件。对于double write双写的情况,mogdb也提供了一个如下的性能试图,便于进行查看读写情况:

postgres=# select * from dbe_perf.global_double_write_status;
 node_name | curr_dwn | curr_start_page | file_trunc_num | file_reset_num | total_writes | low_threshold_writes | high_threshold_writes | total_pages | low_threshold_pages | high_threshold_pages 
-----------+----------+-----------------+----------------+----------------+--------------+----------------------+-----------------------+-------------+---------------------+----------------------
 dn_6001   |        1 |            9759 |             30 |              1 |           26 |                    1 |                    18 |       16342 |                   6 |                14411
(1 row)
 
postgres=# checkpoint;
CHECKPOINT
postgres=# select * from dbe_perf.global_double_write_status;
 node_name | curr_dwn | curr_start_page | file_trunc_num | file_reset_num | total_writes | low_threshold_writes | high_threshold_writes | total_pages | low_threshold_pages | high_threshold_pages 
-----------+----------+-----------------+----------------+----------------+--------------+----------------------+-----------------------+-------------+---------------------+----------------------
 dn_6001   |        1 |            9759 |             31 |              1 |           26 |                    1 |                    18 |       16342 |                   6 |                14411
(1 row)
 
postgres=#

最后简单总结一下检查点线程:

1、检查点线程是单线程,从功能上来讲跟其他关系型数据库类型,可以理解为一个触发事件;通知bgwriter和pagewriter等线程写脏页。
2、MogDB使用的是增量检查点,通过增量检查点技术,可以极大的降低实例崩溃后的恢复时间,同时由于让数据库写脏更加均匀,因此性能相对稳定(实际上目前很多国产数据库还没有增量检查点技术)。
3、MogDB提供了多种存储引擎,其中MOT 内存表和增量检查点技术是不兼容的。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值