一、前言
- 慢查询SQL可以帮助我们找出响应时间较长的SQL语句,以便进行优化。我们可以通过分析慢查询日志来找出哪些SQL语句需要优化,以及如何进行优化 。本篇文章将讲解介绍如何在StarRocks当中查看历史SQL慢查询,使用grafana展示图表以便我们分析查看。
二、效果
- StarRocks 分桶、副本
- SQL慢查询、查询排行
二、配置StarRocks
1、日志插件部署
- 配置StarRocks审计日志插件
- 问题1:刚开始部署完,没什么问题,随着数据量写入的增多,报错 Too many versions.,调整了以下参数后正常。
# The max size of a batch, default is 50MB
max_batch_size=83886080
# The max interval of batch loaded, default is 60 seconds
max_batch_interval_sec=300
2、Tablet巡检工具部署
- 配置StarRocks tablet巡检工具
- 编写脚本,将生成的csv结果stream_load方式导入StarRocks
(1) 在StarRocks创建表结构
-- 创建buckets tablet表结构
CREATE TABLE starrocks_audit_db__.monitor_table_tablet (
`insert_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT "入库时间",
`db` varchar(65533) NOT NULL COMMENT "数据库",
`table_name` varchar(65533) NOT NULL COMMENT "表名",
`replica_counts` int(11) NULL COMMENT "分桶数 * 副本数",
`data_size` double NULL COMMENT "数据大小",
`tablet_size_max` double NULL COMMENT "最大桶大小",
`tablet_size_min` double NULL COMMENT "最小桶大小",
`tablet_size_avg` double NULL COMMENT "平均桶大小",
`table_standard_deviation` double NULL COMMENT "标准差"
) ENGINE=OLAP
DUPLICATE KEY(`insert_time`, `db`, `table_name`)
partition by range(insert_time) ()
DISTRIBUTED BY HASH(`insert_time`, `db`, `table_name`) BUCKETS 1
PROPERTIES
(
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.start" = "-30",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "1",
"dynamic_partition.enable" = "true",
"replication_num" = "3"
);
(2) 运行shell脚本,将工具生成的csv文件导入到StarRocks
cat /app/shell/monitor_tablet.sh
#!/bin/bash
# 指定配置文本和目标文件路径
FILE_PATH="config.properties"
# 使用 cat 命令将配置文本写入文件
cat > /app/plugin/starrocks_tablet_tool/config.properties <<EOF
host=192.168.10.100:9030
username=root
password=mXZo7nYEkKAmBmr
database=
tablename=
EOF
ls
java -jar /app/plugin/starrocks_tablet_tool/starrocks-tablet-tool.jar /app/plugin/starrocks_tablet_tool/config.properties result.csv
sed '1d' result.csv > result_new.csv
milliseconds=$(date +%s%3N)
uuid=uuidgen | tr '[:upper:]' '[:lower:]'
label=$(echo 'starrocks_audit_db___monitor_table_tablet_')$milliseconds
curl --location-trusted -u root:mXZo7nYEkKAmBmr -H "label:$label" \
-H "column_separator:," \
-H "columns: db, table_name, data_size, replica_counts, tablet_size_max, tablet_size_min, tablet_size_avg, table_standard_deviation" \
-T result_new.csv -XPUT \
http://192.168.10.100:8030/api/starrocks_audit_db__/monitor_table_tablet/_stream_load
(3) 配置定时任务每天凌晨执行脚本,导入StarRocks
01 0 * * * /bin/bash /app/shell/monitor_tablet.sh > /dev/null 2>&1 &
三、配置Grafana
- 配置Grafana变量
-- st_time
select DISTINCT(DATE_FORMAT(timestamp, '%Y-%m-%d')) as "日期" from starrocks_audit_db__.starrocks_audit_tbl__ order by DATE_FORMAT(timestamp, '%Y-%m-%d') desc limit 7;
-- ed_time
select DISTINCT(DATE_FORMAT(timestamp, '%Y-%m-%d')) as "日期" from starrocks_audit_db__.starrocks_audit_tbl__ order by DATE_FORMAT(timestamp, '%Y-%m-%d') desc limit 7;
-- Databases
select DISTINCT(db) as '库名' from starrocks_audit_db__.monitor_table_tablet;
2、配置Table图表
- (1). 慢查询SQL-top10
--慢查询SQL-top10
select
timestamp as "查询开始时间",
(queryTime * 0.001) as "查询时间/s",
user as "查询用户名",
db as "查询所在数据库",
returnRows as "查询返回的结果行数",
stmt as "SQL原始语句"
from
starrocks_audit_db__.starrocks_audit_tbl__
where
timestamp >= '$st_time'
and timestamp < date_add(curdate(), interval 1 day)
and isQuery = 1
and queryTime * 0.001 >= 3
order by
queryTime * 0.001 desc
limit 10;
- (2). 用户查询数
--用户查询数
select
DATE_FORMAT(timestamp, '%Y-%m-%d') as "日期",
user as "用户",
count(1) as "总查询数"
from
starrocks_audit_db__.starrocks_audit_tbl__
where
timestamp >= '$st_time'
and timestamp < date_add(curdate(), interval 1 day)
group by
user,
DATE_FORMAT(timestamp, '%Y-%m-%d')
order by
DATE_FORMAT(timestamp, '%Y-%m-%d') desc ,count(1) desc;
- (3). Tablet Buckets
--Tablet Buckets
select
DATE_FORMAT(insert_time, '%Y-%m-%d') as 时间,
replica_counts as '副本数',
data_size as '数据大小/MB',
tablet_size_max as '最大桶大小/MB',
tablet_size_min as '最小桶大小/MB',
tablet_size_avg as '平均桶大小/MB',
table_standard_deviation as '标准差',
db as '库名',
table_name as '表名'
from
starrocks_audit_db__.monitor_table_tablet
where
DATE_FORMAT(insert_time, '%Y-%m-%d') >= '$st_time'
and db in ($Databases)
order by
table_standard_deviation desc;
- (4). 查询用户排行
SELECT
user,
DATE_FORMAT(timestamp, '%Y-%m-%d') AS "时间",
count(1) as "查询总数"
FROM starrocks_audit_db__.starrocks_audit_tbl__
where
timestamp >= '$st_time'
group by
user, DATE_FORMAT(timestamp, '%Y-%m-%d')
order by
DATE_FORMAT(timestamp, '%Y-%m-%d') desc ,count(1) desc;
- (5). top10大表
SELECT
DATE_FORMAT(insert_time, '%Y-%m-%d') AS 时间,
CONCAT(
IF(data_size >= 1024, ROUND(data_size / 1024, 2), data_size),
IF(data_size >= 1024, 'GB', 'MB')
) AS '数据大小',
db AS '库名',
table_name AS '表名'
FROM
starrocks_audit_db__.monitor_table_tablet
where
DATE_FORMAT(insert_time, '%Y-%m-%d') >= '$st_time'
and db in ($Databases)
ORDER BY
data_size DESC;
四、总结
-
StarRocks是一种新一代的极速全场景MPP数据库,它采用新一代的弹性MPP架构,可以高效支持大数据量级的多维分析、实时分析、高并发分析等多种数据分析场景。在StarRocks中,分区和分桶的选择是非常关键的。在建表时选择好的分区分桶列,可以有效提高集群整体性能, 分桶键选择高基数的列(去重后数据量最大的列),分桶的大小建议在100M-1G之间。
-
标准差越高,说明数据越分散,这样会导致每个分桶中数据量不均衡,从而影响查询性能。此外,还需关注tablet数据平均值是否合理,一般建议该值在100MB-1GB之间,对于表总体数据量比较小时可以容忍小一点,数据量大的表建议在1G左右。如果该值与建议值差异较大,可以适当调整建表语句的bucket数量大小。