3 共通的监控工具
前面介绍的PostgreSQL监控工具都偏向于性能分析,没有告警功能。而且它们只是针对PostgreSQL的监视,有时需要监控整个业务相关的系统,这时候就要考虑通用的监控工具了。Linux下比较适合监控数据库的常用的工具有Nagios和Zabbix。Zabbix更容易使用,现在看上去也更被多数人看好,所以本文只介绍Zabbix监控PostgreSQL的方法。3.1 Zabbix简介
Zabbix是一个all in one高度集成的企业级监控解决方案。由一个中心的Zabbix Server和若干可能安装有Zabbix Agent被监控设备构成,主要特性可概括为以下几点
数据采集
支持agent和agent less(SNMP, IPMI, HTTP,FTP...)
支持基于JMX对java应用的监视
可灵活定制agent
数据存储
数据库为PostgreSQL,Mysql,Oracle,SQLite或DB2
可配置历史和趋势数据的保存时间
内建旧清理程序防止数据膨胀
报警
可定制报警阈值
灵活设置报警方式,邮件,SMS,脚本
支持报警升级
报警消息可使用宏变量定制
可视化
可定制的数据图形
仪表盘
地图
所有配置都通过GUI编辑
大规模部署
支持模板
自动发现主机和监控项目
通过Zabbix Proxy实现分布式部署
其他
Zabbix API
认证和访问控制
IT资产收集
zabbix要想监视PostgreSQL这种应用型的对象,一般使用zabbix agent。zabbix agent有2种工作方式。
方式1:被动代理
由Zabbix Server(或Proxy)主动查询数据(如CPU负载),作为响应Zabbix Agent返回查询结果。这也是最简单最常用的方式。
方式2:主动代理
Zabbix Agent先从Zabbix Server获取需要主动报告的监控项目一览,然后定期发送新值到Zabbix Server。主动代理可以用于处理时间比较长的监控项,比如log 。
也可以使用Zabbix Trapper
方式3:Trapper
由Zabbix Agent主动报告数据。被监控端可调用zabbix_send命令或直接利用Socket发送数据到Zabbix Server。通过Trapper可以只在状态变更时进行报告。
另外还有把PostgreSQL状态通过SNMP代理发布的方案(http://pgsnmpd.projects.pgfoundry.org/),估计用的不多,本文不涉及。
详细参考:
https://www.zabbix.com/documentation/2.4/manual/concepts/agent
https://www.zabbix.com/documentation/2.4/manual/appendix/items/activepassive
https://www.zabbix.com/documentation/2.4/manual/config/items/itemtypes/trapper
Zabbix没有内置对PostgreSQL的监控项,所以如果要监控PostgreSQL需要做一些监控项的配置或定制,下面会介绍几种方法。
3.2 方法1:直接扩充UserParameter
自己修改zabbix_agentd.conf的配置文件,在Zabbix agent上增加PostgreSQL相关的监控项,使用psql发SQL的方式获取PostgreSQL的性能数据。例
zabbix_agentd.conf
- #Get the PostgreSQL version
- UserParameter=psql.version,psql --version|head -n1
- #Get the total number of Server Processes that are active
- UserParameter=psql.server_processes,psql -t -c "select sum(numbackends) from pg_stat_database"
- #Get the total number of commited transactions
- UserParameter=psql.tx_commited,psql -t -c "select sum(xact_commit) from pg_stat_database"
- #Get the total number of rolled back transactions
- UserParameter=psql.tx_rolledback,psql -t -c "select sum(xact_rollback) from pg_stat_database"
参照
https://www.zabbix.com/wiki/howto/monitor/db/postgresql
3.3 方法2:扩充UserParameter调用PostgreSQL监视脚本
和方法1类似,但方法1中定义的UserParameter太多,为方便起见,可以把监控PostgreSQL的SQL命令做成一个单独的脚本。
比如参照下面的例子https://www.zabbix.com/forum/showthread.php?t=8009
zabbix_agentd.conf:
- UserParameter=postgresql[*],/opt/zabbix/bin/zapost $1 $2
zapost:
- #
- # Name: zapost
- #
- # Checks PostgreSQL activity.
- #
- # Author: bashman
- #
- # Version: 1.0
- #
-
- zapostver="1.0"
- rval=0
- sql=""
-
- case $1 in
-
- #'summary')
- # sql="select a.datname, pg_size_pretty(pg_database_size(a.datid)) as size, cast(blks_hit/(blks_read+blks_hit+0.000001)*100.0 as numeric(5,2)) as cache, cast(xact_commit/(xact_rollback+xact_commit+0.000001)*100.0 as numeric(5,2)) as success from pg_stat_database a order by a.datname"
- # ;;
-
- #'size')
- #comprobar aqui los parametros
- # shift
- # sql="select pg_database_size('$1') as size"
- # ;;
-
- #'version')
- # sql='select version()'
- # ;;
-
- 'totalsize')
- sql="select sum(pg_database_size(datid)) as total_size from pg_stat_database"
- ;;
-
- 'db_cache')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select cast(blks_hit/(blks_read+blks_hit+0.000001)*100.0 as numeric(5,2)) as cache from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'db_success')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select cast(xact_commit/(xact_rollback+xact_commit+0.000001)*100.0 as numeric(5,2)) as success from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'server_processes')
- sql="select sum(numbackends) from pg_stat_database"
- ;;
-
- 'tx_commited')
- sql="select sum(xact_commit) from pg_stat_database"
- ;;
-
- 'tx_rolledback')
- sql="select sum(xact_rollback) from pg_stat_database"
- ;;
-
- 'db_size')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select pg_database_size('$1')" #as size"
- fi
- ;;
-
- 'db_connections')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select numbackends from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'db_returned')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select tup_returned from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'db_fetched')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select tup_fetched from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'db_inserted')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select tup_inserted from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'db_updated')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select tup_updated from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'db_deleted')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select tup_deleted from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'db_commited')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select xact_commit from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'db_rolled')
- # comprueba los parametros
- if [ ! -z $2 ]; then
- shift
- sql="select xact_rollback from pg_stat_database where datname = '$1'"
- fi
- ;;
-
- 'version')
- sql="version"
- ;;
-
- 'zapostver')
- echo "$zapostver"
- exit $rval
- ;;
-
- *)
- echo "zapost version: $zapostver"
- echo "usage:"
- echo " $0 totalsize -- Check the total databases size."
- echo " $0 db_cache <dbname> -- Check the database cache hit ratio (percentage)."
- echo " $0 db_success <dbname> -- Check the database success rate (percentage)."
- echo " $0 server_processes -- Check the total number of Server Processes that are active."
- echo " $0 tx_commited -- Check the total number of commited transactions."
- echo " $0 tx_rolledback -- Check the total number of rolled back transactions."
- echo " $0 db_size <dbname> -- Check the size of a Database (in bytes)."
- echo " $0 db_connections <dbname> -- Check the number of active connections for a specified database."
- echo " $0 db_returned <dbname> -- Check the number of tuples returned for a specified database."
- echo " $0 db_fetched <dbname> -- Check the number of tuples fetched for a specified database."
- echo " $0 db_inserted <dbname> -- Check the number of tuples inserted for a specified database."
- echo " $0 db_updated <dbname> -- Check the number of tuples updated for a specified database."
- echo " $0 db_deleted <dbname> -- Check the number of tuples deleted for a specified database."
- echo " $0 db_commited <dbname> -- Check the number of commited back transactions for a specified database."
- echo " $0 db_rolled <dbname> -- Check the number of rolled back transactions for a specified database."
- echo " $0 version -- The PostgreSQL version."
- echo " $0 zapostver -- Version of this script."
- exit $rval
- ;;
- esac
-
- if [ "$sql" != "" ]; then
- if [ "$sql" == "version" ]; then
- psql --version|head -n1
- rval=$?
- else
- psql -t -c "$sql"
- rval=$?
- fi
- fi
-
- if [ "$rval" -ne 0 ]; then
- echo "ZBX_NOTSUPPORTED
3.4 方法3:使用Postbix或DBforBIX插件
和前面提到的2种方法相比,Postbix插件的功能更全面,它包含了PostgreSQL相关的监控项和图形的Zabbix模板。Postbix以一个的后台java deamon运行,这个deamon通过jdbc查询远端被监控数据库的状态然后以trap agent的方式发送到Zabbix Server。Postbix来自http://www.smartmarmot.com/,除了Postbix该公司还有Orabbix,MySQLBix,这些工具的内部架构和使用方法基本相同,唯一的区别就是支持的被监控数据库不同。所以smartmarmot又推出了整合这几种数据库监控能力的DBforBIX。DBforBIX的内部结构和使用方法和Postbix基本相同,下面介绍一下DBforBIX的简单的使用例子。
1)下载dbforbix
http://www.smartmarmot.com/product/dbforbix/dbforbix-download/
2)在Zabbix Server上安装dbforbix
- [root@zabbix ~]# mkdir /opt/dbforbix
- [root@zabbix ~]# cd /opt/dbforbix
- [root@zabbix dbforbix]# unzip /root/dbforbix-0.6.1.zip
- [root@zabbix dbforbix]# cp /opt/dbforbix/init.d/dbforbix /etc/init.d/dbforbix
- [root@zabbix dbforbix]# chmod +x /etc/init.d/dbforbix
- [root@zabbix dbforbix]# chmod +x /opt/dbforbix/run.sh
- [root@zabbix dbforbix]# chkconfig dbforbix on
3)导入dbforix的模板到Zabbix服务器
点击Zabbix GUI画面的"Configuration->Templates->Import"把下面的模板文件导入Zabbix服务器。
/opt/dbforbix/template/template_postgresql.xml
4)在被监控PostgreSQL实例上创建DBforBIX使用的账号并赋予权限
5)修改config.props
通过拷贝config.props.sample生成config.props, 然后修改config.props设置Zabbix Server的IP和端口号,设置被监控PostgreSQL数据库的访问账号
注意:/opt/dbforbix/init.d/dbforbix有个Bug,本来应该从config.props中读取pid文件名的,结果写死了是dbforbix.pid,config.props.sample中的默认值又是orabix.pid。
6)修改pgsqlquery.props
通过拷贝pgsqlquery.props.sample生成pgsqlquery.props。pgsqlquery.props中定义了监控项目及对应的查询SQL,可以编辑QueryList对监控项做筛选。
看一下 pgsqlquery.props.sample包含的内容
pgsqlquery.props.sample:
7)启动dbforbix deamon
/opt/dbforbix/template/template_postgresql.xml
4)在被监控PostgreSQL实例上创建DBforBIX使用的账号并赋予权限
- CREATE USER zabbix WITH PASSWORD 'passw0rd';
- GRANT SELECT ON pg_stat_activity to zabbix;
- GRANT SELECT ON pg_stat_activity to zabbix;
- GRANT SELECT ON pg_database to zabbix;
- GRANT SELECT ON pg_authid to zabbix;
- GRANT SELECT ON pg_stat_bgwriter to zabbix;
- GRANT SELECT ON pg_locks to zabbix;
- GRANT SELECT ON pg_stat_database to zabbix
通过拷贝config.props.sample生成config.props, 然后修改config.props设置Zabbix Server的IP和端口号,设置被监控PostgreSQL数据库的访问账号
- [root@zabbix dbforbix]# cp /opt/dbforbix/conf/config.props.sample /opt/dbforbix/conf/config.props
- [root@zabbix dbforbix]# vi /opt/dbforbix/conf/config.props
- ZabbixServerList=ZabbixServer
- ZabbixServer.Address=IP_ADDRESS_OF_ZABBIX_SERVER
- ZabbixServer.Port=PORT_OF_ZABBIX_SERVER
- ...
- DBforBIX.PidFile=./logs/dbforbix.pid
- ...
- DatabaseList=PGSQLDB2
- PGSQLDB2.Url=jdbc:postgresql://host:port/database
- PGSQLDB2.User=zabbix
- PGSQLDB2.Password=passw0rd
注意:/opt/dbforbix/init.d/dbforbix有个Bug,本来应该从config.props中读取pid文件名的,结果写死了是dbforbix.pid,config.props.sample中的默认值又是orabix.pid。
6)修改pgsqlquery.props
通过拷贝pgsqlquery.props.sample生成pgsqlquery.props。pgsqlquery.props中定义了监控项目及对应的查询SQL,可以编辑QueryList对监控项做筛选。
- [root@zabbix dbforbix]# cp /opt/dbforbix/conf/pgsqlquery.props.sample /opt/dbforbix/conf/pgsqlquery.props
看一下 pgsqlquery.props.sample包含的内容
pgsqlquery.props.sample:
- QueryList=activeconn,tupfetched,tupinserted,tupupdated,tupdeleted,xactcommit,xactrollback,exclusivelock,accessexclusivelock,accesssharelock,rowsharelock,rowexclusivelock,shareupdateexclusivelock,sharerowexclusivelock,checkpoints_timed,checkpoints_req,buffers_checkpoint,buffers_clean,maxwritten_clean,buffers_backend,buffers_alloc
-
- #statistic of database
- activeconn.Query=select sum(numbackends) from pg_stat_database
- tupreturned.Query=select sum(tup_returned) from pg_stat_database
- tupfetched.Query=select sum(tup_fetched) from pg_stat_database
- tupinserted.Query=select sum(tup_inserted) from pg_stat_database
- tupupdated.Query=select sum(tup_updated) from pg_stat_database
- tupdeleted.Query=select sum(tup_deleted) from pg_stat_database
- xactcommit.Query=SELECT sum(xact_commit) FROM pg_stat_database
- xactrollback.Query=SELECT sum(xact_rollback) FROM pg_stat_database
-
- #locks
- exclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ExclusiveLock'
- accessexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='AccessExclusiveLock'
- accesssharelock.Query=SELECT count(*) FROM pg_locks where mode='AccessShareLock'
- rowsharelock.Query=SELECT count(*) FROM pg_locks where mode='RowShareLock'
- rowexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='RowExclusiveLock'
- shareupdateexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ShareUpdateExclusiveLock'
- sharerowexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ShareRowExclusiveLock'
-
- checkpoints_timed.Query=select checkpoints_timed from pg_stat_bgwriter
- checkpoints_req.Query=select checkpoints_req from pg_stat_bgwriter
- buffers_checkpoint.Query=select buffers_checkpoint from pg_stat_bgwriter
- buffers_clean.Query=select buffers_clean from pg_stat_bgwriter
- maxwritten_clean.Query=select maxwritten_clean from pg_stat_bgwriter
- buffers_backend.Query=select buffers_backend from pg_stat_bgwriter
- buffers_alloc.Query=select buffers_alloc from pg_stat_bgwriter
7)启动dbforbix deamon
- [root@zabbix dbforbix]# /etc/init.d/dbforbix start
8) 在Zabbix Server上创建Host
点击Zabbix GUI画面的"Configuration->Hosts->Create Host"为被监控数据库创建一个专门的Host。"Host name"设置为“PGSQLDB2”( 必须和config.props的DatabaseList中的名称一致,这里是“PGSQLDB2”)。并且把Host“PGSQLDB2”链接到前面导入的模板"Template_PostgeSQL"。
9) 检查数据是否已被收集
点击Zabbix GUI画面的"Monitoring->Last data"检查数据是否已被收集。
参考
http://www.smartmarmot.com/wiki/index.php/DBforBIX
3.5 方法4:使用pg_monz模板
pg_monz是一套可以监控PostgreSQL的zabbix模板,通过定制的agent U serParameter监控PostgreSQL数据库 ,并且利用Zabbix的发现机制可以自动发现和监视数据库和表。pg_monz由下面几个文件组成
文件 | 说明 |
pg_monz_template.xml | 模版定义文件 |
userparameter_pgsql.conf | 提供PostgreSQL监控项目的用户参数定义 |
find_dbname.sh | PG数据库的自动发现脚本 |
find_dbname_table.sh | PG数据表的自动发现脚本 |
要了解pg_monz支持哪些监控项目,看一下 userparameter_pgsql.conf 就可以了
userparameter_pgsql.conf:
点击(此处)折叠或打开
- # PostgreSQL user parameter
-
- #
- # Server specific examples
- #
- # Get the total number of commited transactions
- UserParameter=psql.tx_commited[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select sum(xact_commit) from pg_stat_database"
- # Get the total number of rolled back transactions
- UserParameter=psql.tx_rolledback[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select sum(xact_rollback) from pg_stat_database"
- # Max Connections
- UserParameter=psql.server_maxcon[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "show max_connections"
- # PostgreSQL is running
- UserParameter=psql.running[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select 1" > /dev/null 2>&1 ; echo $?
-
- # Added by SRA OSS
- # Get number of checkpoint count (by checkpoint_timeout)
- UserParameter=psql.checkpoints_timed[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select checkpoints_timed from pg_stat_bgwriter"
- # Get number of checkpoint count (by checkpoint_segments)
- UserParameter=psql.checkpoints_req[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select checkpoints_req from pg_stat_bgwriter"
- # Get the total number of connections
- UserParameter=psql.server_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity;"
- # Get the total number of active (on processing SQL) connections
- UserParameter=psql.active_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(state) from pg_stat_activity where state = 'active'"
- # Get the total number of idle connections
- UserParameter=psql.idle_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(state) from pg_stat_activity where state = 'idle'"
- # Get the total number of idle in transaction connections
- UserParameter=psql.idle_tx_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(state) from pg_stat_activity where state = 'idle in transaction'"
- # Get the total number of lock-waiting connections
- UserParameter=psql.locks_waiting[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity where waiting = 't'"
-
- # Get buffer information
- UserParameter=psql.buffers_checkpoint[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_checkpoint from pg_stat_bgwriter"
- UserParameter=psql.buffers_clean[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_clean from pg_stat_bgwriter"
- UserParameter=psql.maxwritten_clean[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select maxwritten_clean from pg_stat_bgwriter"
- UserParameter=psql.buffers_backend[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_backend from pg_stat_bgwriter"
- UserParameter=psql.buffers_backend_fsync[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_backend_fsync from pg_stat_bgwriter"
- UserParameter=psql.buffers_alloc[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_alloc from pg_stat_bgwriter"
-
- # Get number of slow queries
- UserParameter=psql.slow_queries[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval"
- UserParameter=psql.slow_select_queries[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval and query ilike 'select%'"
- UserParameter=psql.slow_dml_queries[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval and query ~* '^(insert|update|delete)'"
-
- #
- # Database specific examples
- #
- # Get the size of a Database (in bytes)
- UserParameter=psql.db_size[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select pg_database_size('$5')"
- # Get number of active connections for a specified database
- UserParameter=psql.db_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select numbackends from pg_stat_database where datname = '$5'"
- # Get number of tuples returned for a specified database
- UserParameter=psql.db_returned[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_returned from pg_stat_database where datname = '$5'"
- # Get number of tuples fetched for a specified database
- UserParameter=psql.db_fetched[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_fetched from pg_stat_database where datname = '$5'"
- # Get number of tuples inserted for a specified database
- UserParameter=psql.db_inserted[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_inserted from pg_stat_database where datname = '$5'"
- # Get number of tuples updated for a specified database
- UserParameter=psql.db_updated[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_updated from pg_stat_database where datname = '$5'"
- # Get number of tuples deleted for a specified database
- UserParameter=psql.db_deleted[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_deleted from pg_stat_database where datname = '$5'"
- # Get number of commited/rolled back transactions for a specified database
- UserParameter=psql.db_tx_commited[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select xact_commit from pg_stat_database where datname = '$5'"
- UserParameter=psql.db_tx_rolledback[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select xact_rollback from pg_stat_database where datname = '$5'"
-
- # Cache Hit Ratio
- UserParameter=psql.cachehit_ratio[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "SELECT round(blks_hit*100/(blks_hit+blks_read), 2) AS cache_hit_ratio FROM pg_stat_database WHERE datname = '$5' and blks_read > 0 union all select 0.00 AS cache_hit_ratio order by cache_hit_ratio desc limit 1"
-
- # Added by SRA OSS
- # Get number of temp files
- UserParameter=psql.db_temp_files[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select temp_files from pg_stat_database where datname = '$5'"
- # Get temp file size (in bytes)
- UserParameter=psql.db_temp_bytes[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select temp_bytes from pg_stat_database where datname = '$5'"
- # Get percentage of dead tuples of all tables for a specified database
- UserParameter=psql.db_dead_tup_ratio[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select round(sum(n_dead_tup)*100/sum(n_live_tup+n_dead_tup), 2) as dead_tup_ratio from pg_stat_all_tables where n_live_tup > 0"
- # Get number of deadlocks for a specified database (9.2 or later)
- UserParameter=psql.db_deadlocks[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select deadlocks from pg_stat_database where datname = '$5'"
-
-
- #
- # Table specific examples
- #
- # Get table cache hit ratio of a specific table
- UserParameter=psql.table_cachehit_ratio[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select round(heap_blks_hit*100/(heap_blks_hit+heap_blks_read), 2) as cache_hit_ratio from pg_statio_user_tables where schemaname = '$5' and relname = '$6' and heap_blks_read > 0 union all select 0.00 as cache_hit_ratio order by cache_hit_ratio desc limit 1"
- # Get number of sequencial scan of a specific table
- UserParameter=psql.table_seq_scan[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select seq_scan from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- # Get number of index scan of a specific table
- UserParameter=psql.table_idx_scan[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select coalesce(idx_scan,0) from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- # Get number of vacuum count of a specific table
- UserParameter=psql.table_vacuum_count[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select vacuum_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- # Get number of analyze count of a specific table
- UserParameter=psql.table_analyze_count[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select analyze_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- # Get number of autovacuum count of a specific table
- UserParameter=psql.table_autovacuum_count[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select autovacuum_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- # Get number of autoanalyze count of a specific table
- UserParameter=psql.table_autoanalyze_count[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select autoanalyze_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
-
- # Get number of tuples of a specific table
- UserParameter=psql.table_n_tup_ins[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_tup_ins from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- UserParameter=psql.table_n_tup_upd[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_tup_upd from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- UserParameter=psql.table_n_tup_del[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_tup_del from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- UserParameter=psql.table_seq_tup_read[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select seq_tup_read from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- UserParameter=psql.table_idx_tup_fetch[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select coalesce(idx_tup_fetch,0) from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- UserParameter=psql.table_n_tup_hot_upd[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_tup_hot_upd from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- UserParameter=psql.table_n_live_tup[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_live_tup from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
- UserParameter=psql.table_n_dead_tup[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_dead_tup from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"
-
-
- #
- # Discovery Rule
- #
- # Database Discovery
- UserParameter=db.list.discovery[*],$5/find_dbname.sh $1 $2 $3 $4
- UserParameter=db_table.list.discovery[*],$5/find_dbname_table.sh $1 $2 $3 $4
使用例:
1)下载pg_monz
https://github.com/pg-monz/pg_monz/releases
2)安装pg_monz
- [root@zabbix ~]# tar xfz pg_monz-1.0.tar.gz
- [root@zabbix ~]# cd pg_monz-1.0/pg_monz
- [root@zabbix pg_monz]# cp find_dbname.sh find_dbname_table.sh /usr/local/bin/
- [root@zabbix pg_monz]# cp userparameter_pgsql.conf /etc/zabbix/zabbix_agentd.d/
- [root@zabbix pg_monz]# chmod +x /usr/local/bin/find_dbname.sh
- [root@zabbix pg_monz]# chmod +x /usr/local/bin/find_dbname_table.sh
- [root@zabbix pg_monz]# /etc/init.d/zabbix-agent restart
3)导入模板到Zabbix服务器
点击Zabbix GUI画面的"Configuration->Templates->Import"把模板文件pg_monz_template.xml导入Zabbix服务器。
点击Zabbix GUI画面的"Configuration->Templates",再点击其中的"PostgreSQL Check"模板,然后点击"Marcos" Tab设置必要宏参数(尤其是连接相关的参数)。
5)在Zabbix Server上创建Host
点击Zabbix GUI画面的"Configuration->Hosts->Create Host"为被监控数据库所在主机创建一个Host,如果该主机的Host已存在也可使用已有Host。这个Host要设置Zabbix Agent,并且把该Host链接到前面导入的模板"PostgeSQL Check"。
6) 检查数据是否已被收集
点击Zabbix GUI画面的"Monitoring->Last data"检查数据是否已被收集。
参考
http://pg-monz.github.io/pg_monz/index-en.html
3.6 小结
以上的方法1和方法2都需要自己再进行定制,而DBforBIX和pg_monz已经比较成熟了。pg_monz和DBforBIX相比更简单,可监控的PostgreSQL项目也更多,还可以自动发现库和表;DBforBIX的优势则在于支持监控多种常用的数据库以及可以使用jdbc连接池。综合而言如果不需要监控多种数据库个人倾向于pg_monz。
转载: http://blog.chinaunix.net/uid-20726500-id-4513716.html