一、 实现功能
由于pgsentinel插件存在严重的内存占用问题,本篇改为自行实现,但其语句仍可以参考pgsentinel插件。PostgreSQL ash —— pgsentinel插件 学习与踩坑记录_CSDN博客
v1.0 根据pg 14版本设计及测试,仅支持收集主库信息。默认每10秒收集一次 active与idle in transaction 状态会话信息,保留两个月。
二、 历史会话与阻塞信息
参考 pgsentinel插件的pg_active_session_history视图及pg pg_stat_activity视图,根据不同版本,其中部分字段的值可能为空。
pg_ash表
列名 | 数据类型 | 字段含义 |
---|---|---|
ash_time | timestamp with time zone | 采样时间 |
datid | oid | 会话连接的dbid |
datname | name | 会话连接的DB名 |
pid | integer | 会话进程ID |
leader_pid | integer | 并行进程leader id,pg 13新增 |
usesysid | oid | user id |
usename | name | 用户名 |
application_name | text | 应用程序名 |
client_addr | inet | 客户端ip |
client_hostname | text | 客户端主机名 |
client_port | integer | 客户端端口 |
backend_start | timestamp with time zone | 会话连接到服务器的时间 |
xact_start | timestamp with time zone | 当前事务开始的时间,若无活跃事务则为 NULL |
query_start | timestamp with time zone | 当前活跃查询的开始时间。如果state不为active,则表示上个查询的开始时间 |
state_change | timestamp with time zone | state上次更改的时间 |
wait_event_type | text | 正在等待的事件类型(如果有) |
wait_event | text | 正在等待的事件名(如果有) |
state | text | 当前会话状态 |
backend_xid | xid | 该会话的顶层事务id(如果有) |
backend_xmin | xid | 该会话的xmin horizon |
query | text | active状态下,为当前正在执行的查询;其他状态下,表示最后执行的查询。 默认情况下,查询文本被截断为 1024 字节(由参数track_activity_query_size控制) |
query_id | bigint | 查询id,类似Oracle的sql_id,pg 14新增 |
backend_type | text | 当前会话类型,例如client backend, checkpointer, startup, walreceiver... pg 10新增 |
blockers | integer | 阻塞者数量 |
blockerpid | integer | 阻塞者进程id |
blocker_state | text | 阻塞者状态 |
三、 表结构创建
1. 按月进行分区
CREATE TABLE public.pg_ash (
ash_time timestamp with time zone,
datid oid,
datname name,
pid integer,
leader_pid integer,
usesysid oid,
usename name,
application_name text,
client_addr inet,
client_hostname text,
client_port integer,
backend_start timestamp with time zone,
xact_start timestamp with time zone,
query_start timestamp with time zone,
state_change timestamp with time zone,
wait_event_type text,
wait_event text,
state text,
backend_xid xid,
backend_xmin xid,
query text,
query_id bigint,
backend_type text,
blockers integer,
blockerpid integer,
blocker_state text
) PARTITION BY RANGE(ash_time);
-- 索引创建
CREATE INDEX idx_pg_ash_n1 ON pg_ash(ash_time);
-- 分区创建,超出最大范围的值会落入默认的final分区
CREATE TABLE pg_ash_history PARTITION OF pg_ash DEFAULT;
CREATE TABLE pg_ash_202310 PARTITION OF pg_ash FOR VALUES FROM ('2023-10-01') TO ('2023-11-01');
CREATE TABLE pg_ash_202311 PARTITION OF pg_ash FOR VALUES FROM ('2023-11-01') TO ('2023-12-01');
CREATE TABLE pg_ash_202312 PARTITION OF pg_ash FOR VALUES FROM ('2023-12-01') TO ('2024-01-01');
2. 定期自动新增分区
vi auto_create_partiton.sh
#!/bin/bash
source ~/.bash_profile
# 分区表主表名
MAIN_TABLE_NAME="pg_ash"
# 待新增分区表名
NEXT_MONTH=`date -d '+1 months' +%Y%m`
TABLE_NAME=${MAIN_TABLE_NAME}_${NEXT_MONTH}
# 分区范围
NEXT_MONTH_FIRST_DAY=`date -d '+1 months' +%Y%m01`
NEXT_2_MONTH_FIRST_DAY=`date -d '+2 months' +%Y%m01`
SQL="CREATE TABLE IF NOT EXISTS $TABLE_NAME (LIKE $MAIN_TABLE_NAME INCLUDING INDEXES); ALTER TABLE $MAIN_TABLE_NAME ATTACH PARTITION $TABLE_NAME FOR VALUES FROM ('$NEXT_MONTH_FIRST_DAY') TO ('$NEXT_2_MONTH_FIRST_DAY');"
#echo $SQL;
psql -c "$SQL"
chmod +x auto_create_partiton.sh
每月1号00:00 自动新建下月分区
crontab -e
00 00 01 * *
/data/postgres/home/postgres/auto_create_partiton
.sh
3. 自动清理旧分区
以保留两个月数据为例
vi auto_drop_old_partiton.sh
#!/bin/bash
source ~/.bash_profile
# 分区表主表名
MAIN_TABLE_NAME="pg_ash"
# 待新增分区表名
NEXT_MONTH=`date -d '+1 months' +%Y%m`
TABLE_NAME=${MAIN_TABLE_NAME}_${NEXT_MONTH}
# 分区范围
NEXT_MONTH_FIRST_DAY=`date -d '+1 months' +%Y%m01`
NEXT_2_MONTH_FIRST_DAY=`date -d '+2 months' +%Y%m01`
SQL="CREATE TABLE IF NOT EXISTS $TABLE_NAME (LIKE $MAIN_TABLE_NAME INCLUDING INDEXES); ALTER TABLE $MAIN_TABLE_NAME ATTACH PARTITION $TABLE_NAME FOR VALUES FROM ('$NEXT_MONTH_FIRST_DAY') TO ('$NEXT_2_MONTH_FIRST_DAY');"
#echo $SQL;
psql -c "$SQL"
chmod +x auto_drop_old_partiton.sh
每月30号23:00 自动删除旧分区
crontab -e
00 23 30 * * /data/postgres/home/postgres/auto_drop_old_partiton.sh
四、 数据插入
vi pg_ash.sh
#!/bin/bash
. ~/profile << EOF
5432
EOF
psql << EOF
insert into pg_ash
select now(), act.datid, act.datname, act.pid, act.leader_pid, act.usesysid, act.usename,
act.application_name, act.client_addr, act.client_hostname,
act.client_port, act.backend_start, act.xact_start, act.query_start,
act.state_change, act.wait_event_type, act.wait_event, act.state, act.backend_xid,
act.backend_xmin,act.query,act.query_id,act.backend_type,cardinality(pg_blocking_pids(act.pid))
as blockers,(pg_blocking_pids(act.pid))[1] as blockerpid ,blk.state as blocker_state
from pg_stat_activity act left join pg_stat_activity blk
on (pg_blocking_pids(act.pid))[1] = blk.pid
where act.state in ('active', 'idle in transaction') and act.pid != pg_backend_pid();
EOF
五、 设置定时执行
由于crontab最小只能按分钟执行,这里利用while true+sleep实现每十秒执行。
vi run.sh 与pg_ash.sh放在相同目录
#!/bin/bash
source .bash_profile
while [ true ]
do
sh ./pg_ash.sh
sleep 10
done
后台运行run.sh
nohup ./run.sh &
六、 测试运行效果
1. pgbench压测
- 初始化数据
-bash-4.2$ createdb pgbench
-bash-4.2$ pgbench -i pgbench
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
100000 of 100000 tuples (100%) done (elapsed 0.23 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 0.49 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 0.27 s, vacuum 0.11 s, primary keys 0.10 s).
-bash-4.2$
-bash-4.2$ psql
psql (14.0)
Type "help" for help.
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+------------+------------+-----------------------
pgbench | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
postgres | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
template0 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.utf8 | en_US.utf8 | =c/postgres +
| | | | | postgres=CTc/postgres
testdb | postgres | UTF8 | en_US.utf8 | en_US.utf8 |
(5 rows)
postgres=# \c pgbench
You are now connected to database "pgbench" as user "postgres".
pgbench=# \d
List of relations
Schema | Name | Type | Owner
--------+------------------+-------+----------
public | pgbench_accounts | table | postgres
public | pgbench_branches | table | postgres
public | pgbench_history | table | postgres
public | pgbench_tellers | table | postgres
(4 rows)
- 压测脚本
vi test.sql
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
- 执行压测
pgbench -c 4 -t 30000 pgbench -r -f test.sql
2. 运行ash脚本
nohup ./run.sh &
3. 查询ash数据
postgres=# select * from pg_ash;
-[ RECORD 1 ]----+-----------------------------------------------------------------------
ash_time | 2023-10-12 05:44:07.152751+08
datid | 41585
datname | pgbench
pid | 1530
leader_pid |
usesysid | 10
usename | postgres
application_name | pgbench
client_addr |
client_hostname |
client_port | -1
backend_start | 2023-10-12 05:44:04.461672+08
xact_start | 2023-10-12 05:44:07.144351+08
query_start | 2023-10-12 05:44:07.145214+08
state_change | 2023-10-12 05:44:07.145215+08
wait_event_type | Lock
wait_event | transactionid
state | active
backend_xid | 677819
backend_xmin | 677814
query | UPDATE pgbench_branches SET bbalance = bbalance + 3177 WHERE bid = 1;
query_id | -6995838559535145041
backend_type | client backend
blockers | 1
blockerpid | 1533
blocker_state | active
-[ RECORD 2 ]----+-----------------------------------------------------------------------
ash_time | 2023-10-12 05:44:07.152751+08
datid | 41585
datname | pgbench
pid | 1531
leader_pid |
usesysid | 10
usename | postgres
application_name | pgbench
client_addr |
client_hostname |
client_port | -1
backend_start | 2023-10-12 05:44:04.463697+08
xact_start | 2023-10-12 05:44:07.151628+08
query_start | 2023-10-12 05:44:07.152311+08
state_change | 2023-10-12 05:44:07.152312+08
wait_event_type | Lock
wait_event | transactionid
state | active
backend_xid | 677821
backend_xmin | 677817
query | UPDATE pgbench_tellers SET tbalance = tbalance + 1637 WHERE tid = 8;
query_id | -9151069917332221911
backend_type | client backend
blockers | 1
blockerpid | 1530
blocker_state | active
...
参考:
GitHub - pgsentinel/pgsentinel: postgresql extension providing Active session history