Scenarios for Using HugePages with AntDB

Background

HugePages replace the traditional 4 kB memory pages with much larger pages. This reduces the number of virtual-address mappings to manage, speeds up virtual-to-physical address translation, and improves overall memory performance by avoiding page swap-in/swap-out. AntDB v3.1 is based on the PG 9.6 kernel; when shared_buffers is large, the virtual-to-physical mapping tables become a worthwhile optimization target. Huge pages are pinned in memory and never swapped out, which suits memory-heavy applications; the larger the page, the smaller the mapping table, so using huge pages shrinks the page table.
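To make the "larger page, smaller mapping table" point concrete, here is a back-of-the-envelope sketch (illustration only, not AntDB code) for a 10 GB buffer:

```shell
# Illustration: page-table entries needed to map a 10 GB buffer
# with 4 kB pages vs. 2 MB huge pages.
buf_kb=$((10 * 1024 * 1024))               # 10 GB expressed in kB
echo "4 kB pages: $((buf_kb / 4)) entries"     # 2621440 entries
echo "2 MB pages: $((buf_kb / 2048)) entries"  # 5120 entries
```

With 2 MB pages the mapping shrinks by a factor of 512, which is why a large shared_buffers benefits most.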

HugePage terminology

Page Table: the kernel's memory-management structure that maps virtual addresses to physical addresses. Every memory access first consults the page table, and the mapping it holds transparently redirects the access to the physical address where the data is stored.
TLB: the Translation Lookaside Buffer, a fixed-size cache inside the CPU that holds part of the page table's mappings and makes virtual-to-physical address translation fast.
hugetlb: a TLB entry that points to a huge page (larger than the 4 kB or other predefined page size). HugePages are implemented through hugetlb entries; a HugePage can be thought of as a handle to a hugetlb page entry.

hugetlbfs: an in-memory filesystem similar to tmpfs, introduced in the 2.6 kernel.

Configuring HugePages in AntDB v3.1

1. Check the operating system's current HugePage configuration:
[gd@intel175 ~]$ grep Huge /proc/meminfo
AnonHugePages:   1495040 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
### All the HugePages counters are 0, so huge pages are not in use yet; Hugepagesize is 2 MB

2. Calculate how many pages to configure

Among the AntDB v3.1 modules, the datanode is the one that stores the data, so huge pages mainly need to be configured on the datanode hosts.

[gd@intel175 ~]$ adbmgr
psql (3.1.0 fa23c8c914 based on PG 9.6.2)
Type "help" for help.

postgres=# show coord1 shared_buffers;
           type            | status |       message        
---------------------------+--------+----------------------
 coordinator master coord1 | t      | shared_buffers = 2GB
(1 row)

postgres=# show gtm shared_buffers;
      type      | status |       message        
----------------+--------+----------------------
 gtm master gtm | t      | shared_buffers = 5GB
(1 row)

postgres=# list node;
  name  |    host    |        type        | mastername | port  | sync_state |           path            | initialized | incluster 
--------+------------+--------------------+------------+-------+------------+---------------------------+-------------+-----------
 gtm    | localhost1 | gtm master         |            |  7693 |            | /data/gd/data/gtm         | t           | t
 coord1 | localhost1 | coordinator master |            |  6604 |            | /data/gd/data/coord1      | t           | t
 coord2 | localhost2 | coordinator master |            |  6604 |            | /data/gd/pgxc_data/coord2 | t           | t
 db1    | localhost1 | datanode master    |            | 16323 |            | /data/gd/data/db1         | t           | t
 db2    | localhost2 | datanode master    |            | 16324 |            | /data/gd/pgxc_data/db2    | t           | t
(5 rows)

Calculate how many huge pages the datanode needs (shared_buffers on the datanode is 10GB):

10GB/2MB=5120

Note the huge_pages setting of the agtm and coordinator modules as well: PG 9.6 defaults to huge_pages=try, and localhost1 hosts one gtm, one coordinator, and one datanode. So this server actually needs at least:

(10GB+2GB+5GB)/2MB=8704
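The arithmetic above can be scripted; this sketch assumes 2 MB huge pages (the Hugepagesize shown earlier) and the shared_buffers sizes of this cluster:

```shell
# Total huge pages this host needs for the nodes it runs:
# datanode 10 GB + coordinator 2 GB + gtm 5 GB, with 2 MB huge pages.
page_kb=2048                                # Hugepagesize from /proc/meminfo
need_kb=$(( (10 + 2 + 5) * 1024 * 1024 ))   # sum of shared_buffers, in kB
echo "vm.nr_hugepages >= $(( need_kb / page_kb ))"   # prints 8704
```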

3. Set the number of huge pages in Linux

[root@intel175 gd]# sysctl -w vm.nr_hugepages=10000 ### allocate comfortably more pages than the nodes require
vm.nr_hugepages = 10000 
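Note that sysctl -w only changes the running kernel, so the allocation is lost on reboot. A common way to make it permanent (a sketch assuming root access and the stock /etc/sysctl.conf) is:

```shell
# Persist the huge page allocation across reboots (run as root).
echo 'vm.nr_hugepages = 10000' >> /etc/sysctl.conf
sysctl -p                                    # reload the setting now
grep HugePages_Total /proc/meminfo           # verify the pages were allocated
```

If memory is fragmented, the kernel may allocate fewer pages than requested, so always verify against /proc/meminfo.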

4. Enable huge pages on the datanodes

[gd@intel175 ~]$ adbmgr
psql (3.1.0 fa23c8c914 based on PG 9.6.2)
Type "help" for help.
postgres=# set datanode all (huge_pages='on');
NOTICE:  parameter "huge_pages" cannot be changed without restarting the server
SET PARAM
postgres=# show db1 huge_pages;
        type         | status |     message     
---------------------+--------+-----------------
 datanode master db1 | t      | huge_pages = on
(1 row)
postgres=# stop datanode all ;
NOTICE:  10.21.20.175, pg_ctl  stop -D /data/gd/data/db1 -Z datanode -m smart -o -i -w -c -W
NOTICE:  10.21.20.176, pg_ctl  stop -D /data/gd/pgxc_data/db2 -Z datanode -m smart -o -i -w -c -W
NOTICE:  waiting max 90 seconds for datanode master to stop ...

    operation type    | nodename | status | description 
----------------------+----------+--------+-------------
 stop datanode master | db1      | t      | success
 stop datanode master | db2      | t      | success
(2 rows)

postgres=# start datanode all ;
NOTICE:  10.21.20.175, pg_ctl  start -D /data/gd/data/db1 -Z datanode -o -i -w -c -W -l /data/gd/data/db1/logfile
NOTICE:  10.21.20.176, pg_ctl  start -D /data/gd/pgxc_data/db2 -Z datanode -o -i -w -c -W -l /data/gd/pgxc_data/db2/logfile
NOTICE:  waiting max 90 seconds for datanode master to start ...

    operation type     | nodename | status | description 
-----------------------+----------+--------+-------------
 start datanode master | db1      | t      | success
 start datanode master | db2      | t      | success
(2 rows)

The huge_pages parameter requires a server restart to take effect. Also note that with huge_pages=on, if shared_buffers is greater than or equal to the operating system's total huge page memory (Hugepagesize × nr_hugepages), the datanode fails to start with the following error:

FATAL:  could not map anonymous shared memory: Cannot allocate memory
HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 11080974336 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
LOG:  database system is shut down

Workaround: set shared_buffers smaller than the operating system's total huge page memory (Hugepagesize × nr_hugepages).
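A quick pre-flight check before starting a node with huge_pages=on can catch this shortfall early. This is a sketch; the 10 GB figure is this cluster's datanode shared_buffers, so adjust it for your node:

```shell
# Warn if the free huge pages cannot back a 10 GB shared_buffers request.
need_kb=$((10 * 1024 * 1024))
free_pages=$(awk '/^HugePages_Free/ {print $2}' /proc/meminfo)
page_kb=$(awk '/^Hugepagesize/ {print $2}' /proc/meminfo)
if [ $(( ${free_pages:-0} * ${page_kb:-2048} )) -lt "$need_kb" ]; then
    echo "not enough free huge pages for shared_buffers = ${need_kb} kB" >&2
fi
```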

In addition, to isolate the performance effect of enabling huge pages, this experiment turns the huge page parameter off on the gtm and coordinators and restarts the cluster.

[gd@intel175 ~]$ adbmgr
psql (3.1.0 fa23c8c914 based on PG 9.6.2)
Type "help" for help.
postgres=# set gtm all (huge_pages='off');                        
NOTICE:  parameter "huge_pages" cannot be changed without restarting the server
SET PARAM
postgres=# set coordinator all (huge_pages='off');       
NOTICE:  parameter "huge_pages" cannot be changed without restarting the server
SET PARAM
postgres=# stop all mode f;
NOTICE:  10.21.20.175, pg_ctl  stop -D /data/gd/data/db1 -Z datanode -m fast -o -i -w -c -W
NOTICE:  10.21.20.176, pg_ctl  stop -D /data/gd/pgxc_data/db2 -Z datanode -m fast -o -i -w -c -W
NOTICE:  waiting max 90 seconds for datanode master to stop ...

NOTICE:  10.21.20.175, pg_ctl  stop -D /data/gd/data/coord1 -Z coordinator -m fast -o -i -w -c -W
NOTICE:  10.21.20.176, pg_ctl  stop -D /data/gd/pgxc_data/coord2 -Z coordinator -m fast -o -i -w -c -W
NOTICE:  waiting max 90 seconds for coordinator master to stop ...

NOTICE:  10.21.20.175, agtm_ctl  stop -D /data/gd/data/gtm -m fast -o -i -w -c -W
NOTICE:  waiting max 90 seconds for gtm master to stop ...

     operation type      | nodename | status | description 
-------------------------+----------+--------+-------------
 stop datanode master    | db1      | t      | success
 stop datanode master    | db2      | t      | success
 stop coordinator master | coord1   | t      | success
 stop coordinator master | coord2   | t      | success
 stop gtm master         | gtm      | t      | success
(5 rows)

postgres=# start all ;
NOTICE:  10.21.20.175, agtm_ctl  start -D /data/gd/data/gtm -o -i -w -c -W -l /data/gd/data/gtm/logfile
NOTICE:  waiting max 90 seconds for gtm master to start ...

NOTICE:  10.21.20.175, pg_ctl  start -D /data/gd/data/coord1 -Z coordinator -o -i -w -c -W -l /data/gd/data/coord1/logfile
NOTICE:  10.21.20.176, pg_ctl  start -D /data/gd/pgxc_data/coord2 -Z coordinator -o -i -w -c -W -l /data/gd/pgxc_data/coord2/logfile
NOTICE:  waiting max 90 seconds for coordinator master to start ...

NOTICE:  10.21.20.175, pg_ctl  start -D /data/gd/data/db1 -Z datanode -o -i -w -c -W -l /data/gd/data/db1/logfile
NOTICE:  10.21.20.176, pg_ctl  start -D /data/gd/pgxc_data/db2 -Z datanode -o -i -w -c -W -l /data/gd/pgxc_data/db2/logfile
NOTICE:  waiting max 90 seconds for datanode master to start ...

      operation type      | nodename | status | description 
--------------------------+----------+--------+-------------
 start gtm master         | gtm      | t      | success
 start coordinator master | coord1   | t      | success
 start coordinator master | coord2   | t      | success
 start datanode master    | db1      | t      | success
 start datanode master    | db2      | t      | success
(5 rows)

5. Check the current HugePage usage on localhost1

[gd@intel175 ~]$ cat /proc/meminfo | grep Huge
AnonHugePages:   2488320 kB
HugePages_Total:   10000
HugePages_Free:     9857
HugePages_Rsvd:     5141
HugePages_Surp:        0
Hugepagesize:       2048 kB
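Reading these counters: HugePages_Free still includes pages that are reserved for a mapping but not yet touched, so the pages actually committed are Total − Free + Rsvd. A small sketch using the values above:

```shell
# Pages committed to huge page mappings = Total - Free + Rsvd
# (Free still counts reserved-but-unfaulted pages).
total=10000; free=9857; rsvd=5141    # values from /proc/meminfo above
echo "committed: $(( total - free + rsvd )) pages"   # prints 5284
```

5284 pages × 2 MB ≈ 10.3 GB, which matches the datanode's 10 GB shared_buffers plus its other shared memory.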

Testing and verification

1. Write workload test

[gd@intel175 ~]$ coord1
psql (3.1.0 fa23c8c914 based on PG 9.6.2)
Type "help" for help.

postgres=# create table test(id serial8, c1 int8 default 0, c2 int8 default 0, c3 int8 default 0, c4 int8 default 0, c5 int8 default 0, c6 int8 default 0, c7 int8 default 0, c8 int8 default 0, c9 int8 default 0, crt_time timestamptz);             
CREATE TABLE
postgres=# \d+ test
                                                      Table "public.test"
  Column  |           Type           |                     Modifiers                     | Storage | Stats target | Description 
----------+--------------------------+---------------------------------------------------+---------+--------------+-------------
 id       | bigint                   | not null default nextval('test_id_seq'::regclass) | plain   |              | 
 c1       | bigint                   | default 0                                         | plain   |              | 
 c2       | bigint                   | default 0                                         | plain   |              | 
 c3       | bigint                   | default 0                                         | plain   |              | 
 c4       | bigint                   | default 0                                         | plain   |              | 
 c5       | bigint                   | default 0                                         | plain   |              | 
 c6       | bigint                   | default 0                                         | plain   |              | 
 c7       | bigint                   | default 0                                         | plain   |              | 
 c8       | bigint                   | default 0                                         | plain   |              | 
 c9       | bigint                   | default 0                                         | plain   |              | 
 crt_time | timestamp with time zone |                                                   | plain   |              | 
Distribute By: HASH(id)
Location Nodes: ALL DATANODES

The pgbench test script is as follows:

vi test.sql
    insert into test(crt_time) values(now());
vi exec.sh
#!/bin/bash
pgbench  -d postgres -U gd -p 6604  -M prepared -n -r -f ./test.sql -c 64 -j 64 -T 120 >>insert.log
sleep 10
pgbench  -d postgres -U gd -p 6604  -M prepared -n -r -f ./test.sql -c 128 -j 128 -T 120 >>insert.log
sleep 10
pgbench  -d postgres -U gd -p 6604 -M prepared -n -r -f ./test.sql -c 256 -j 256 -T 120 >>insert.log
sleep 10
pgbench  -d postgres -U gd -p 6604  -M prepared -n -r -f ./test.sql -c 512 -j 512 -T 120 >>insert.log
sleep 10
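Each pgbench run appends a "number of clients" line and a "tps = … (excluding connections establishing)" line to the log, so a small helper like this sketch can tabulate the huge_pages on/off runs side by side:

```shell
# Summarize client count and TPS from the pgbench runs appended to insert.log.
if [ -f insert.log ]; then
    grep -E 'number of clients|excluding connections' insert.log
fi
```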

The pgbench comparison results with huge_pages on and off are shown in the chart below:



2. Read workload test

Still using the test table above: after loading 100,000 rows, run a vacuum.

[gd@intel175 bench]$ coord1
psql (3.1.0 fa23c8c914 based on PG 9.6.2)
Type "help" for help.
postgres=# truncate test ;
TRUNCATE TABLE
postgres=# insert into test(id,crt_time)  select generate_series(1,100000) as id ,now() as crt_time;
INSERT 0 100000
postgres=# vacuum FULL test ;
VACUUM
postgres=# select count(*) from test;
 count  
--------
 100000
(1 row)
The pgbench test script is as follows:
[gd@intel175 bench]$ vim select.sql 
\set id random(1,100000)
 select * from test where id = :id;
[gd@intel175 bench]$ vim select.sh 
#!/bin/bash
pgbench  -d postgres -U gd -p 6604  -M prepared -n -r -f ./select.sql -c 64 -j 64 -T 120 >>select.log
sleep 10
pgbench  -d postgres -U gd -p 6604  -M prepared -n -r -f ./select.sql -c 128 -j 128 -T 120 >>select.log
sleep 10
pgbench  -d postgres -U gd -p 6604  -M prepared -n -r -f ./select.sql -c 256 -j 256 -T 120 >>select.log
sleep 10
pgbench  -d postgres -U gd -p 6604  -M prepared -n -r -f ./select.sql -c 512 -j 512 -T 120 >>select.log
sleep 10

The pgbench comparison results with huge_pages on and off are shown in the chart below:



The tests above show that at modest connection counts, overall read and write TPS does not improve noticeably with huge pages; it can even be slightly worse than without them. If a connection pool is available, use one to keep the number of connections to the database down.

In summary, for AntDB v3.1, huge_pages is not recommended in low-concurrency scenarios. If a connection pool cannot be used and there are very many long-lived connections, enabling huge pages is worth considering.

Reference: https://blog.csdn.net/leshami/article/details/8777639

AntDB:
Open source URL: https://github.com/ADBSQL/AntDB
QQ group: 496464280

