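As GBase 8a MPP Cluster deployments grow in cluster size, data volume, and SQL complexity, measures such as better data distribution, better scheduling of cluster query plans, and hardware expansion relieve some resource bottlenecks (CPU, network, memory, disk I/O, and so on). The following challenges remain:
1. When system resources are not controlled, every SQL statement competes for them, which makes the system unstable;
2. A single low-priority SQL statement can occupy a large share of system resources, so high-priority SQL cannot finish on time;
3. A complex SQL statement is usually executed in several steps across the cluster. Under concurrency, the tasks of one SQL statement are subject to resource limits, so there is no guarantee that they finish in step on all nodes.
Configuration principles
Resource management has to solve the following problems:
1. System resources can be allocated and used according to policy;
2. Task execution is managed by priority;
3. Complex (multi-step) tasks are managed under a unified cluster-wide policy (covering resource allocation, priority, execution order, and so on).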
Cluster commands
1. Setting the user priority
Syntax
grant usage on *.* to user_name with task_priority priority_value
priority_value takes the values 0, 1, 2, and 3, which correspond to minimum, low, medium, and high priority. The default is 2 (medium priority).
Required privileges
A user with the grant privilege; root is recommended.
Example:
gbase> create user uer1 ;
Query OK, 0 rows affected
gbase> grant usage on *.* to uer1 with task_priority 1;
Query OK, 0 rows affected
gbase> select task_priority from gbase.user where user='uer1';
+---------------+
| task_priority |
+---------------+
|             1 |
+---------------+
1 row in set
2. Setting the user resource group
Syntax
grant usage on *.* to user_name with resource_group group_value
group_value ranges from 0 to 15; group 0 is the default group.
Required privileges
A user with the grant privilege; root is recommended.
Example:
gbase> create user user0 identified by 'user0';
Query OK, 0 rows affected
gbase> grant usage on *.* to user0 with resource_group 0;
Query OK, 0 rows affected
gbase> select resource_group from gbase.user where user='user0';
+----------------+
| resource_group |
+----------------+
|              0 |
+----------------+
1 row in set
3. Setting the query parallelism
Syntax
grant usage on *.* to user_name with max_cpus_used max_cpus_used_value
max_cpus_used_value is an integer greater than 0. It determines the degree of parallelism of a query; the recommended value is the number of CPUs available to the user's resource group.
Required privileges
A user with the grant privilege; root is recommended.
Example:
gbase> use gbase;
Query OK, 0 rows affected
gbase> create user user1;
Query OK, 0 rows affected
gbase> grant usage on *.* to user1 with max_cpus_used 4;
Query OK, 0 rows affected
gbase> select max_cpus from user where user='user1';
+----------+
| max_cpus |
+----------+
|        4 |
+----------+
1 row in set
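The three grant forms above differ only in the option name and its value range, so a deployment script could build them with one helper. The following is a minimal sketch, not part of GBase: the function name grant_statement is hypothetical, and the restriction of user names to plain identifiers is an assumption made here to keep the generated text safe.

```python
def grant_statement(user_name: str, option: str, value: int) -> str:
    """Build one 'grant usage on *.* ... with <option> <value>' statement.

    Value ranges follow the text above:
    task_priority 0-3, resource_group 0-15, max_cpus_used greater than 0.
    """
    checks = {
        "task_priority": lambda v: 0 <= v <= 3,
        "resource_group": lambda v: 0 <= v <= 15,
        "max_cpus_used": lambda v: v > 0,
    }
    if option not in checks:
        raise ValueError(f"unknown option: {option}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        raise ValueError(f"{option} must be an integer")
    if not checks[option](value):
        raise ValueError(f"{option} value out of range: {value}")
    # Assumption: only simple identifier-like user names are accepted.
    if not user_name.isidentifier():
        raise ValueError(f"unsupported user name: {user_name!r}")
    return f"grant usage on *.* to {user_name} with {option} {value};"


if __name__ == "__main__":
    print(grant_statement("user1", "max_cpus_used", 4))
```

The returned text still has to be run in a gbase client session by a user with the grant privilege.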
4. Setting scheduling weights for user priorities
Syntax
Set gcluster global gbase_high_priority_weight = weight_value
Set gcluster global gbase_mid_priority_weight = weight_value
Set gcluster global gbase_low_priority_weight = weight_value
Set gcluster global gbase_min_priority_weight = weight_value
The allowed range of weight_value depends on the priority level (high, medium, low, minimum):
High: 80 - 100
Medium: 60 - 80
Low: 40 - 60
Minimum: 20 - 40
Required privileges
A user with the set privilege.
Example
gbase> Set gcluster global gbase_min_priority_weight = 20;
Query OK, 0 rows affected, 32 warnings (Elapsed: 00:00:00.01)
gbase> show variables like '%gbase_min_priority_weight%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| gbase_min_priority_weight | 20    |
+---------------------------+-------+
1 row in set (Elapsed: 00:00:00.00)
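Notes
A weight set with the statements above is not persisted; it is lost when gnode restarts. To persist it, run the following before the set statement:
set gbase_global_variable_persistent = 1
and run the following after it:
set gbase_global_variable_persistent = 0
The weight parameters control the cgroup parameters cpu.shares and blkio.weight, as shown in the following table:
Parameter    | CGroup min | CGroup max | CGroup default | Cluster min | Cluster max | Cluster weight formula
cpu.shares   | 1          | none       | 1024           | 1           | 2560        | (2560 * weight) / 100
blkio.weight | 1          | 1000       | 300            | 1           | 1000        | (1000 * weight) / 100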
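As a worked example of the two formulas in the table, the sketch below converts a priority weight into the corresponding cpu.shares and blkio.weight values. It is illustrative only: the function name cgroup_values is hypothetical, the use of integer division is an assumption (the table does not say how fractions are handled), and the 20-100 check simply reflects the weight ranges listed for the four priority levels.

```python
CPU_SHARES_CLUSTER_MAX = 2560    # cluster maximum for cpu.shares (from the table)
BLKIO_WEIGHT_CLUSTER_MAX = 1000  # cluster maximum for blkio.weight (from the table)


def cgroup_values(weight: int) -> dict:
    """Apply (max * weight) / 100 for both cgroup parameters."""
    if not 20 <= weight <= 100:
        raise ValueError("weight must be between 20 and 100")
    # Assumption: fractions are truncated; the cluster minimum of 1 is kept as a floor.
    cpu_shares = max(1, CPU_SHARES_CLUSTER_MAX * weight // 100)
    blkio_weight = max(1, BLKIO_WEIGHT_CLUSTER_MAX * weight // 100)
    return {"cpu.shares": cpu_shares, "blkio.weight": blkio_weight}


if __name__ == "__main__":
    for w in (20, 40, 60, 80, 100):
        print(w, cgroup_values(w))
```

With a weight of 20 the formulas give cpu.shares 512 and blkio.weight 200; with a weight of 80 they give 2048 and 800.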
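5. Showing priority status
Syntax
show priorities [where conditions]
Output columns
node_name: cluster node name.
group: resource group number.
priority: priority number.
priority_weight: priority weight.
status: whether the priority is enabled (ON/OFF).
description: description of the priority control parameters.
Required privileges
A user with the show privilege.
Examples
Step 1: Configure the cgconfig.conf file, with settings for resource groups 0 and 1.
Step 2: Start the cgconfig service.
Step 3: Restart gcware:
# gcluster_service gcware restart
After these steps, run the commands in the examples below; the priorities of control groups 0 and 1 will be ON.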
Example 1: Show the priority status of all cluster nodes.
gbase> show priorities;
+-----------+-------+----------+-----------------+--------+-------------+
| node_name | group | priority | priority-weight | status | description |
+-----------+-------+----------+-----------------+--------+-------------+
| node1     |     0 |        0 |              20 | ON     | ......      |
| node1     |     0 |        1 |              40 | ON     | ......      |
| node1     |     0 |        2 |              60 | ON     | ......      |
| node1     |     0 |        3 |              80 | ON     | ......      |
| node1     |     1 |        0 |              20 | ON     | ......      |
| node1     |     1 |        1 |              40 | ON     | ......      |
| node1     |     1 |        2 |              60 | ON     | ......      |
| node1     |     1 |        3 |              80 | ON     | ......      |
| node1     |     2 |        0 |              20 | OFF    | ......      |
| node1     |     2 |        1 |              40 | OFF    | ......      |
| node1     |     2 |        2 |              60 | OFF    | ......      |
| node1     |     2 |        3 |              80 | OFF    | ......      |
......
| node2     |    15 |        0 |              20 | OFF    | ......      |
| node2     |    15 |        1 |              40 | OFF    | ......      |
| node2     |    15 |        2 |              60 | OFF    | ......      |
| node2     |    15 |        3 |              80 | OFF    | ......      |
+-----------+-------+----------+-----------------+--------+-------------+
128 rows in set
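The row count of the output above follows from the ranges given earlier: 16 resource groups (0-15) and 4 priority levels (0-3) per node. A small sketch of that arithmetic, with a hypothetical function name and assuming every node reports every group and priority as in the example:

```python
RESOURCE_GROUPS = 16   # resource groups 0-15
PRIORITY_LEVELS = 4    # priorities 0-3


def expected_priority_rows(node_count: int) -> int:
    """Rows returned by an unfiltered 'show priorities' for node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_count * RESOURCE_GROUPS * PRIORITY_LEVELS


if __name__ == "__main__":
    print(expected_priority_rows(2))  # two-node cluster, as in the example above
```

For the two-node cluster in the example this gives 128 rows, and a filter on a single node gives 64.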
Example 2: Show the priority status of node1.
gbase> show priorities where node_name = 'node1';
+-----------+-------+----------+-----------------+--------+-------------+
| node_name | group | priority | priority-weight | status | description |
+-----------+-------+----------+-----------------+--------+-------------+
| node1     |     0 |        0 |              20 | ON     | ......      |
| node1     |     0 |        1 |              40 | ON     | ......      |
| node1     |     0 |        2 |              60 | ON     | ......      |
| node1     |     0 |        3 |              80 | ON     | ......      |
| node1     |     1 |        0 |              20 | ON     | ......      |
| node1     |     1 |        1 |              40 | ON     | ......      |
| node1     |     1 |        2 |              60 | ON     | ......      |
| node1     |     1 |        3 |              80 | ON     | ......      |
| node1     |     2 |        0 |              20 | OFF    | ......      |
| node1     |     2 |        1 |              40 | OFF    | ......      |
| node1     |     2 |        2 |              60 | OFF    | ......      |
| node1     |     2 |        3 |              80 | OFF    | ......      |
......
| node1     |    15 |        0 |              20 | OFF    | ......      |
| node1     |    15 |        1 |              40 | OFF    | ......      |
| node1     |    15 |        2 |              60 | OFF    | ......      |
| node1     |    15 |        3 |              80 | OFF    | ......      |
+-----------+-------+----------+-----------------+--------+-------------+
64 rows in set
Example 3: Show the priorities whose status is ON.
gbase> show priorities where status ='ON';
+-----------+-------+----------+-----------------+--------+-------------+
| node_name | group | priority | priority-weight | status | description |
+-----------+-------+----------+-----------------+--------+-------------+
| node1     |     0 |        0 |              20 | ON     | ......      |
| node1     |     0 |        1 |              40 | ON     | ......      |
| node1     |     0 |        2 |              60 | ON     | ......      |
| node1     |     0 |        3 |              80 | ON     | ......      |
| node1     |     1 |        0 |              20 | ON     | ......      |
| node1     |     1 |        1 |              40 | ON     | ......      |
| node1     |     1 |        2 |              60 | ON     | ......      |
| node1     |     1 |        3 |              80 | ON     | ......      |
+-----------+-------+----------+-----------------+--------+-------------+
8 rows in set
Example 4: Stop the cgroup configuration service on node1 (service cgconfig stop) and restart gcware. All priorities on node1 then show OFF.
# service cgconfig stop
Stopping cgconfig service: [ OK ]
# gcluster_service gcware restart
Stopping GCMonit success!
Signaling GCRECOVER (gcrecover) to terminate: [ OK ]
Waiting for gcrecover services to unload:...[ OK ]
Signaling GCSYNC (gc_sync_server) to terminate: [ OK ]
Waiting for gc_sync_server services to unload:[ OK ]
Signaling GCLUSTERD to terminate: [ OK ]
Waiting for gclusterd services to unload:...[ OK ]
Signaling GBASED to terminate: [ OK ]
Waiting for gbased services to unload:[ OK ]
Signaling GCWARE (gcware) to terminate: [ OK ]
Waiting for gcware services to unload:.[ OK ]
Starting GCWARE (gcwexec): [ OK ]
Starting GCMonit success!
Starting GBASED : [ OK ]
Starting GCLUSTERD : [ OK ]
Starting GCSYNC : [ OK ]
Starting GCRECOVER : [ OK ]
# su - gbase
$ gccli -uroot
GBase client 9.5.2.8.111533. Copyright (c) 2004-2019, GBase. All Rights Reserved.
gbase> show priorities where node_name = 'node1';
+-----------+-------+----------+-----------------+--------+-------------+
| node_name | group | priority | priority-weight | status | description |
+-----------+-------+----------+-----------------+--------+-------------+
| node1     |     0 |        0 |              20 | OFF    | ......      |
| node1     |     0 |        1 |              40 | OFF    | ......      |
| node1     |     0 |        2 |              60 | OFF    | ......      |
| node1     |     0 |        3 |              80 | OFF    | ......      |
| node1     |     1 |        0 |              20 | OFF    | ......      |
| node1     |     1 |        1 |              40 | OFF    | ......      |
| node1     |     1 |        2 |              60 | OFF    | ......      |
| node1     |     1 |        3 |              80 | OFF    | ......      |
| node1     |     2 |        0 |              20 | OFF    | ......      |
| node1     |     2 |        1 |              40 | OFF    | ......      |
| node1     |     2 |        2 |              60 | OFF    | ......      |
| node1     |     2 |        3 |              80 | OFF    | ......      |
......
| node1     |    15 |        0 |              20 | OFF    | ......      |
| node1     |    15 |        1 |              40 | OFF    | ......      |
| node1     |    15 |        2 |              60 | OFF    | ......      |
| node1     |    15 |        3 |              80 | OFF    | ......      |
+-----------+-------+----------+-----------------+--------+-------------+
64 rows in set
Example 5: Start the cgroup configuration service on node1 again (service cgconfig start) and restart gcware. Groups 0 and 1 on node1 show ON again.
# service cgconfig start
Starting cgconfig service: [ OK ]
# gcluster_service gcware restart
Stopping GCMonit success!
Signaling GCRECOVER (gcrecover) to terminate: [ OK ]
Waiting for gcrecover services to unload:...[ OK ]
Signaling GCSYNC (gc_sync_server) to terminate: [ OK ]
Waiting for gc_sync_server services to unload:[ OK ]
Signaling GCLUSTERD to terminate: [ OK ]
Waiting for gclusterd services to unload:...[ OK ]
Signaling GBASED to terminate: [ OK ]
Waiting for gbased services to unload:[ OK ]
Signaling GCWARE (gcware) to terminate: [ OK ]
Waiting for gcware services to unload:.[ OK ]
Starting GCWARE (gcwexec): [ OK ]
Starting GCMonit success!
Starting GBASED : [ OK ]
Starting GCLUSTERD : [ OK ]
Starting GCSYNC : [ OK ]
Starting GCRECOVER : [ OK ]
# su - gbase
$ gccli -uroot
GBase client 9.5.2.8.111533. Copyright (c) 2004-2019, GBase. All Rights Reserved.
gbase> show priorities where node_name = 'node1';
+-----------+-------+----------+-----------------+--------+-------------+
| node_name | group | priority | priority-weight | status | description |
+-----------+-------+----------+-----------------+--------+-------------+
| node1     |     0 |        0 |              20 | ON     | ......      |
| node1     |     0 |        1 |              40 | ON     | ......      |
| node1     |     0 |        2 |              60 | ON     | ......      |
| node1     |     0 |        3 |              80 | ON     | ......      |
| node1     |     1 |        0 |              20 | ON     | ......      |
| node1     |     1 |        1 |              40 | ON     | ......      |
| node1     |     1 |        2 |              60 | ON     | ......      |
| node1     |     1 |        3 |              80 | ON     | ......      |
| node1     |     2 |        0 |              20 | OFF    | ......      |
| node1     |     2 |        1 |              40 | OFF    | ......      |
| node1     |     2 |        2 |              60 | OFF    | ......      |
| node1     |     2 |        3 |              80 | OFF    | ......      |
......
| node1     |    15 |        0 |              20 | OFF    | ......      |
| node1     |    15 |        1 |              40 | OFF    | ......      |
| node1     |    15 |        2 |              60 | OFF    | ......      |
| node1     |    15 |        3 |              80 | OFF    | ......      |
+-----------+-------+----------+-----------------+--------+-------------+
64 rows in set
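6. Configuring priority queue parameters
The priority queue is configured by changing the following parameters in the configuration files of gcluster (/opt/gcluster/config/gbase_8a_gcluster.cnf) and gnode (/opt/gnode/config/gbase_8a_gbase.cnf):
- gbase_use_priority_queue: 0 disables the priority queue; 1 enables it;
- _gbase_priority_total_tasks: the maximum number of query tasks that run in parallel, including the query part of DML statements. The value cannot exceed 128; the default is twice the number of local CPU cores;
- _gbase_priority_tasks: the maximum number of tasks each priority queue can hold (that is, the number that can take part in scheduling). Tasks that cannot enter the queue block and wait. The value cannot exceed 64; the default is the number of local CPU cores;
- gbase_use_res_ctrl_group: whether tasks are attached to resource control groups. 0 means not attached, which is the default; any other value enables attachment.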
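To see what the documented defaults mean on a given machine, the sketch below derives both queue sizes from the CPU core count. The function name is hypothetical, and capping the defaults at the documented maximums (128 and 64) is an assumption; the text only states the default rule and the upper limit separately.

```python
import os

TOTAL_TASKS_MAX = 128  # upper limit of _gbase_priority_total_tasks
QUEUE_TASKS_MAX = 64   # upper limit of _gbase_priority_tasks


def default_queue_settings(cpu_cores=None) -> dict:
    """Return the documented default queue sizes for a node.

    cpu_cores defaults to the local core count reported by the OS.
    """
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    if cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return {
        # Default: twice the local CPU core count (capped, by assumption).
        "_gbase_priority_total_tasks": min(2 * cores, TOTAL_TASKS_MAX),
        # Default: the local CPU core count (capped, by assumption).
        "_gbase_priority_tasks": min(cores, QUEUE_TASKS_MAX),
    }


if __name__ == "__main__":
    print(default_queue_settings())
```

On a 16-core node this yields 32 parallel query tasks in total and 16 tasks per priority queue.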
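7. Specifying the priority of a query
A session of a resource group user can use a hint, in the form /*+PRIORITY('priority_value')*/, to set the level (that is, the priority) at which an SQL statement runs. The hint applies only to query statements.
Syntax
select /*+PRIORITY('priority_value')*/ …
Required privileges
A user with the create, insert, drop, and select privileges.
Notes
The priority in the hint must be less than or equal to the user's own priority. An invalid setting falls back to the user's priority and raises the warning "can not upgrade to priority X", where X is the priority of the executing user.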
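The fallback rule in the note can be expressed as a small function. This is a sketch of the rule as described, not GBase code: the function name is hypothetical, and rejecting values outside 0-3 with an exception is an assumption, since the text does not say how such values are handled.

```python
def effective_priority(user_priority: int, hint_priority: int):
    """Return (priority actually used, warning text or None).

    A hint may only keep or lower the user's priority; a higher value
    falls back to the user's priority with a warning.
    """
    for name, value in (("user_priority", user_priority),
                        ("hint_priority", hint_priority)):
        # Assumption: values outside the documented 0-3 range are rejected here.
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be 0, 1, 2, or 3")
    if hint_priority <= user_priority:
        return hint_priority, None
    return user_priority, f"can not upgrade to priority {user_priority}"


if __name__ == "__main__":
    print(effective_priority(2, 0))
    print(effective_priority(1, 3))
```

A user with priority 2 who sends a hint of 0 runs at priority 0; a user with priority 1 who sends a hint of 3 stays at priority 1 and receives the warning.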
Example
gbase> create table t1(a int);
Query OK, 0 rows affected
gbase> insert into t1 values (1),(1),(2),(3),(5);
Query OK, 5 rows affected
Records: 5 Duplicates: 0 Warnings: 0
gbase> select /*+PRIORITY('0')*/ * from t1 group by a;
+------+
| a    |
+------+
|    1 |
|    2 |
|    3 |
|    5 |
+------+
4 rows in set