Enabling and Querying the Slow Log in GBase 8c

Latest recommended article published on 2025-07-15 16:36:51

License: CC 4.0 BY-SA

Tags:

#database #GBASE南大通用 #sql #GBase

Original link:

https://www.gbase.cn/community/post/3985

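GBase 8c can use the slow log to locate problems and diagnose their root causes. The slow log is configured and used as follows:

1. Slow log configuration

(1) Related GUC parameters

The main configuration parameters for the GBase 8c slow log are: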

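enable_stmt_track

on: default; enables capture of Full/Slow SQL

off: disables capture of Full/Slow SQL

track_stmt_stat_level

A composite parameter of the form 'full sql stat level, slow sql stat level'

The first part is the tracking level for full SQL; valid values are OFF, L0, L1, L2

The second part is the tracking level for slow SQL; valid values are OFF, L0, L1, L2

Recommended value: OFF,L0

log_min_duration_statement

Duration threshold. When a statement runs longer than this value, its execution information is recorded. Range: -1 to 2147483647, in milliseconds. Default: 30min

instr_unique_sql_count

When the number of unique SQL entries generated in the system exceeds instr_unique_sql_count, entries are either cleaned up automatically or no longer recorded

Recommended value: 200000

track_stmt_details_size

Maximum size of the SQL collected for a single statement. Default: 4096 bytes.

track_stmt_retention_time

A composite parameter that controls how long full/slow SQL records are retained. The parameter is read every 60 seconds, and records older than the retention time are cleaned up. It has two parts, in the form 'full sql retention time, slow sql retention time'

full sql retention time is the retention time for full SQL, in seconds; range 0 to 86400

slow sql retention time is the retention time for slow SQL, in seconds; range 0 to 604800

Default: 3600,604800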

(2) Configuring the slow log

First, switch to the database installation user:

su - gbase

Then run:

gs_guc reload -Z coordinator -N all -I all -c "enable_stmt_track = ON"
gs_guc reload -Z coordinator -N all -I all -c "track_stmt_stat_level = 'OFF,L0'"
gs_guc reload -Z coordinator -N all -I all -c "log_min_duration_statement = 1000"
gs_guc reload -Z coordinator -N all -I all -c "instr_unique_sql_count = 200000"
gs_guc reload -Z coordinator -N all -I all -c "track_stmt_retention_time = '3600,10800'"
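
After the reload, one way to confirm that the settings took effect is to check them from a SQL session on a coordinator. This is a minimal sketch, assuming the standard SHOW command is available in your GBase 8c version and that the session is connected to a node that was reloaded:

```sql
-- Check the slow-log related parameters set by the gs_guc commands above.
SHOW enable_stmt_track;
SHOW track_stmt_stat_level;
SHOW log_min_duration_statement;
SHOW instr_unique_sql_count;
SHOW track_stmt_retention_time;
```

If a value does not match what was configured, check that the gs_guc command targeted the right node type and instance.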

2. Querying the slow log

Log in to the database to query the slow log and the full log from within GBase 8c. Note that the queries must be run in the postgres system database.

Slow log query function:

dbe_perf.get_global_slow_sql_by_timestamp(start_timestamp timestamp with time zone, end_timestamp timestamp with time zone);

Slow log query:

select * from dbe_perf.get_global_slow_sql_by_timestamp(start_timestamp, end_timestamp);

Full log query function:

dbe_perf.get_global_full_sql_by_timestamp(start_timestamp timestamp with time zone, end_timestamp timestamp with time zone);

Full log query:

select * from dbe_perf.get_global_full_sql_by_timestamp(start_timestamp, end_timestamp);
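
Both functions take concrete timestamp with time zone values in place of start_timestamp and end_timestamp. The sketch below shows two ways to supply them; the one-hour window and the literal timestamps are arbitrary placeholders, not values from the original article:

```sql
-- Slow SQL recorded during the last hour (the window length is only an example).
select * from dbe_perf.get_global_slow_sql_by_timestamp(now() - interval '1 hour', now());

-- Full SQL for an explicit window; the literal timestamps are placeholders.
select * from dbe_perf.get_global_full_sql_by_timestamp(
    '2025-07-15 09:00:00+08'::timestamptz,
    '2025-07-15 10:00:00+08'::timestamptz
);
```

With track_stmt_stat_level set to 'OFF,L0' as configured earlier, full SQL tracking is off, so the second query should only return rows once the first part of that parameter is raised above OFF.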

3. Common SQL tuning techniques in GBase 8c

GUC parameter tuning

Set GUC parameters to steer the optimizer toward a better execution plan.
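
As an illustration of steering the planner through a GUC parameter, the sketch below disables nested-loop joins for the current session and inspects the resulting plan. It assumes the PostgreSQL-style planner switch enable_nestloop is available in your GBase 8c version, and it reuses the td1/td2 tables from the distribution-key example later in this article. Treat it as an experiment, not a recommended setting.

```sql
-- Session-level only; this does not change the server configuration.
SET enable_nestloop = off;

-- Check whether the planner now picks a different join method.
EXPLAIN SELECT * FROM td1, td2 WHERE td1.a = td2.c ORDER BY a;

-- Restore the default planner behaviour for this session.
RESET enable_nestloop;
```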

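Table definition optimization

Choose table definitions that suit the workload, including the storage model, distribution key, partitioning, appropriate data types, and appropriate indexes.

Table definition optimization means designing the table structure sensibly, including:

Use a suitable storage model (GBase 8c supports row storage, column storage, and in-memory storage), chosen according to the workload;
Choose a suitable distribution key. In a distributed deployment, a table's rows are split across nodes by the specified distribution key (column). If a join or other operation touches data on different nodes, the tables may have to be redistributed, which is slow. If the data involved is all on the same node, less data is exchanged during execution and the SQL can run much faster. So, based on the business model, try to place related data on the same node to avoid data exchange during SQL execution.

Design sensible column types and lengths; create sensible indexes; use partitioned tables where appropriate.

For example, different distribution keys lead to different SQL efficiency: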


-- When two tables are joined on their distribution keys, the join can be pushed down to the DN nodes, so the SQL runs efficiently.
postgres=# EXPLAIN SELECT * FROM td1,td2 WHERE td1.a=td2.c ORDER BY a;
                                 QUERY PLAN
------------------------------------------------------------------------------
Remote Subquery Scan on all (dn1,dn2,dn3)  (cost=2.04..2.05 rows=1 width=16)
  ->  Sort  (cost=2.04..2.05 rows=1 width=16)
        Sort Key: td1.a
        ->  Nested Loop  (cost=0.00..2.03 rows=1 width=16)
              Join Filter: (td1.a = td2.c)
              ->  Seq Scan on td1  (cost=0.00..1.01 rows=1 width=8)
              ->  Seq Scan on td2  (cost=0.00..1.01 rows=1 width=8)
(7 rows)

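-- When two tables are joined on non-distribution-key columns, the rows must first be redistributed by the join column (the "Distribute results by H: b" steps below), so the SQL runs less efficiently.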
postgres=# EXPLAIN SELECT * FROM td1,td2 WHERE td1.b=td2.b ORDER BY a;
                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------
Remote Subquery Scan on all (dn1,dn2,dn3)  (cost=2.04..2.05 rows=1 width=16)
  ->  Sort  (cost=2.04..2.05 rows=1 width=16)
        Sort Key: td1.a
        ->  Nested Loop  (cost=0.00..2.03 rows=1 width=16)
              Join Filter: (td1.b = td2.b)
              ->  Remote Subquery Scan on all (dn1,dn2,dn3)  (cost=100.00..101.02 rows=1 width=8)
                    Distribute results by H: b
                    ->  Seq Scan on td1  (cost=0.00..1.01 rows=1 width=8)
              ->  Materialize  (cost=100.00..101.03 rows=1 width=8)
                    ->  Remote Subquery Scan on all (dn1,dn2,dn3)  (cost=100.00..101.02 rows=1 width=8)
                          Distribute results by H: b
                          ->  Seq Scan on td2  (cost=0.00..1.01 rows=1 width=8)
(12 rows)
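
The two plans above are consistent with td1 being distributed on column a and td2 on column c. The original table definitions are not shown, so the following is only a guess at what they might look like, assuming integer columns and the DISTRIBUTE BY HASH clause of distributed GBase 8c:

```sql
-- Hypothetical definitions reconstructed from the plans; column types are assumed.
CREATE TABLE td1 (a int, b int) DISTRIBUTE BY HASH (a);
CREATE TABLE td2 (c int, b int) DISTRIBUTE BY HASH (c);
```

Under these assumed definitions, a join on td1.a = td2.c matches rows that already sit on the same DN, while a join on column b does not, which is why the second plan contains the extra redistribution steps.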

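Statistics tuning

GBase 8c generates its execution plans by cost estimation. The optimizer estimates row counts and costs from the statistics collected by ANALYZE, so up-to-date statistics are critical to both estimates. Beyond that, plan-level tuning means analyzing the execution plan, finding the operators that are not optimal, and then either optimizing those operators or forcing a different plan. GBase 8c normally picks the best plan on its own, so do not change a plan by hand unless you are sure the change improves execution efficiency.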
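
A minimal sketch of refreshing statistics before re-checking a plan, using the td1/td2 tables from the distribution-key example as stand-ins for your own tables:

```sql
-- Refresh optimizer statistics for the tables involved in the query.
ANALYZE td1;
ANALYZE td2;

-- Re-check the estimated row counts and costs in the plan.
EXPLAIN SELECT * FROM td1, td2 WHERE td1.a = td2.c ORDER BY a;
```

This is most useful after bulk loads or large deletes, when the stored statistics no longer reflect the data.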

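Operator-level tuning

A query passes through multiple operator steps before producing its final result. It is common for a few slow operators to drag down overall query performance; these are the bottleneck operators of the query. The usual approach is to run EXPLAIN ANALYZE/PERFORMANCE to find the bottleneck operators in the execution and then optimize them specifically.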
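
A sketch of how this might look in practice, using the stu/stu_info query from the rewrite example later in this article. Unlike plain EXPLAIN, EXPLAIN ANALYZE actually executes the statement, so run it with care on statements that modify data:

```sql
-- Executes the query and reports actual time and row counts per operator,
-- which makes it possible to see where the time is actually spent.
EXPLAIN ANALYZE
select st.id, st.name from stu st where st.age = 5 and not exists (
    select 1 from stu_info si where si.stu_id = st.id and si.code in ('a','b','c','d')
);
```

Comparing the actual rows of each operator with its estimated rows is a quick way to spot operators whose cost the optimizer misjudged.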

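SQL statement rewriting

Adjusting SQL statements according to certain rules can improve execution efficiency while keeping the results correct. Following these rules often improves query performance substantially.

Example: use NOT EXISTS instead of NOT IN (the two return the same rows only when the compared columns contain no NULLs)

-- With NOT IN, the SQL runs inefficiently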
select st.id,st.name from stu st where st.age=5 and st.id not in (
select stu_id from stu_info where code in ('a','b','c','d')
);
test=# explain select st.id,st.name from stu st where st.age=5 and st.id not in (
test(# select stu_id from stu_info where code in ('a','b','c','d')
test(# );
                                           QUERY PLAN
--------------------------------------------------------------------------------------------------
LightProxy  (cost=0.00..0.00 rows=1000 width=222)
  Node/s: All datanodes
  ->  Nested Loop Anti Join  (cost=0.00..2.22 rows=1 width=10)
        Join Filter: ((st.id = stu_info.stu_id) OR (st.id IS NULL) OR (stu_info.stu_id IS NULL))
        ->  Seq Scan on stu st  (cost=0.00..1.06 rows=1 width=10)
              Filter: (age = 5)
        ->  Materialize  (cost=0.00..1.09 rows=4 width=4)
              ->  Seq Scan on stu_info  (cost=0.00..1.07 rows=4 width=4)
                    Filter: ((code)::text = ANY ('{a,b,c,d}'::text[]))
(9 rows)


-- Preferred form: after replacing NOT IN with NOT EXISTS, the SQL runs more efficiently
select st.id,st.name from stu st where st.age=5 and not exists (
select 1 from stu_info si where si.stu_id=st.id and si.code in ('a','b','c','d')
);
test=# explain select st.id,st.name from stu st where st.age=5 and not exists (
test(# select 1 from stu_info si where si.stu_id=st.id and si.code in ('a','b','c','d')
test(# );
                              QUERY PLAN
-------------------------------------------------------------------------
LightProxy  (cost=0.00..0.00 rows=1000 width=222)
  Node/s: All datanodes
  ->  Hash Right Anti Join  (cost=1.07..2.15 rows=1 width=10)
        Hash Cond: (si.stu_id = st.id)
        ->  Seq Scan on stu_info si  (cost=0.00..1.07 rows=4 width=4)
              Filter: ((code)::text = ANY ('{a,b,c,d}'::text[]))
        ->  Hash  (cost=1.06..1.06 rows=1 width=10)
              ->  Seq Scan on stu st  (cost=0.00..1.06 rows=1 width=10)
                    Filter: (age = 5)
(9 rows)