【最佳实践】分布式数据库对接开源监控软件

 Grafana+InfluxDB+Telegraf+SequoiaDB

Grafana作为一款强大的开源的监控软件,可以进行灵活的报表定制与性能监控。用户可以通过时序数据库InfluxDB作为数据源为Grafana提供性能监控数据,同时使用Telegraf作为性能数据采集工具从SequoiaDB中定时采集性能指标,已达到准实时性能监控的目的。

 

整个框架的搭建非常简单,需要编写代码的仅4行,基本上可以通过合理的配置完成环境的搭建。


本文使用1台阿里云RHEL7环境,并详细标明了每一步用户需要执行的命令。

 

1)安装Telegraf

直接通过wget下载安装

[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://dl.influxdata.com/telegraf/releases/telegraf-1.5.0-1.x86_64.rpm
[root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall telegraf-1.5.0-1.x86_64.rpm

2)安装InfluxDB

直接通过wget下载安装

[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://dl.influxdata.com/influxdb/releases/influxdb-1.4.2.x86_64.rpm
[root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall influxdb-1.4.2.x86_64.rpm

3)安装Grafana

直接通过wget下载安装

[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-4.6.3-1.x86_64.rpm 
 [root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall grafana-4.6.3-1.x86_64.rpm

4)安装SequoiaDB

在官网注册下载SequoiaDB后解压,所有参数使用默认即可。

[root@iZ2ze06q07wqluc8htj4pdZ ~]# ./sequoiadb-2.8.4-linux_x86_64-enterprise-installer.run –SMS true

5)连接SequoiaDB SAC进行图形化安装

参考官方文档http://doc.sequoiadb.com/cn/SequoiaDB-cat_id-1483944500-edition_id-208


5.1)admin/admin用户名密码登录


5.2)选择右下方一键部署

 

5.3)添加本机为主机

 

5.4)配置单节点单副本最小集群 


5.5)确认配置点击下一步

 

5.6)启动成功

 

5.7)如果需要SQL接口可以下载并安装SequoiaSQL插件

所有安装参数均使用默认值即可。

[root@iZ2ze06q07wqluc8htj4pdZ ~]# ./sequoiasql-oltp-2.8.4-x86_64-enterprise-installer.run     

5.7.1)启动SequoiaSQL服务

# 从root用户切换到sdbadmin用户
[root@iZ2ze06q07wqluc8htj4pdZ sequoiasqloltp]# su - sdbadmin
上一次登录:五 1月  5 13:44:47 CST 2018pts/1 上
# 创建一个叫做testinst的实例
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl addinst testinst -D /home/sdbadmin/sdb_data
Adding instance testinst ...
Ok
# 启动testinst实例
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl start testinst
Starting instance testinst ...
ok (PID: 24206)
# 创建一个叫做testdb的数据库
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl createdb testdb testinst
Creating database testinst ...
ok
# 开始创建SDB集合空间与集合
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db=new Sdb()"
localhost:11810
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.createCS ('testcs')"
localhost:11810.testcs
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.testcs.createCL('testcl')"
localhost:11810.testcs.testcl
# 创建PGSQL映射表
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/psql -p 5432 testdb
psql (9.3.4)
Type "help" for help.
# 创建Foreign Data Wrapper
testdb=# create extension sdb_fdw;
CREATE EXTENSION
# 创建SequoiaDB服务
testdb=# create server sdb_server foreign data wrapper sdb_fdw options(address '127.0.0.1', service '11810');
CREATE SERVER
# 创建SequoiaDB映射表,真实表存在于SequoiaDB之内
testdb=# create foreign table test ( name text, id numeric) server sdb_server options ( collectionspace 'testcs', collection 'testcl', decimal 'on');
CREATE FOREIGN TABLE
# 收集一下统计信息
testdb=# analyze test;
ANALYZE
# 确认表是空的
testdb=# select * from test ;
 name | id 
------+----
(0 rows)
# 写一条数据进去
testdb=# insert into test values('one', 1);
INSERT 0 1
# 试着更新一下
testdb=# update test set id=9 where name='one';
UPDATE 1
# 然后查看一下
testdb=# select * from test ;
 name | id 
------+----
 one  |  9
(1 row)

5.7.2)向SAC中添加SequoiaSQL服务

点击发现业务


选择SequoiaSQL引擎 


填入内网IP地址与端口


在数据中可以看到相应的表与数据 


6)验证SequoiaDB性能监控快照

# 连接数据库,验证snapshot与REST接口都是通的
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db=new Sdb()"
localhost:11810
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.snapshot(SDB_SNAP_DATABASE)"
{
  "TotalNumConnects": 1,
  "TotalDataRead": 16187,
  "TotalIndexRead": 238,
  "TotalDataWrite": 15,
  "TotalIndexWrite": 16,
  "TotalUpdate": 8,
  "TotalDelete": 0,
  "TotalInsert": 7,
  "ReplUpdate": 0,
  "ReplDelete": 0,
  "ReplInsert": 0,
  "TotalSelect": 9706,
  "TotalRead": 9738,
  "TotalReadTime": 0,
  "TotalWriteTime": 0,
  "freeLogSpace": 1644167168,
  "vsize": 4315893760,
  "rss": 71836,
  "fault": 2,
  "TotalMapped": 1607335936,
  "svcNetIn": 761,
  "svcNetOut": 1396,
  "shardNetIn": 1564857,
  "shardNetOut": 1296727,
  "replNetIn": 0,
  "replNetOut": 0,
  "ErrNodes": []
}
Return 1 row(s).
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ curl -d "cmd=snapshot database" "172.17.230.225:11814" 
{ "errno": 0 }{ "TotalNumConnects": 1.0, "TotalDataRead": 19679.0, "TotalIndexRead": 242.0, "TotalDataWrite": 15.0, "TotalIndexWrite": 16.0, "TotalUpdate": 8.0, "TotalDelete": 0.0, "TotalInsert": 7.0, "ReplUpdate": 0.0, "ReplDelete": 0.0, "ReplInsert": 0.0, "TotalSelect": 11804.0, "TotalRead": 11836.0, "TotalReadTime": 0.0, "TotalWriteTime": 0.0, "freeLogSpace": 1644167168.0, "vsize": 4315893760.0, "rss": 71842.0, "fault": 2.0, "TotalMapped": 1607335936.0, "svcNetIn": 761.0, "svcNetOut": 1396.0, "shardNetIn": 1920028.0, "shardNetOut": 1672847.0, "replNetIn": 0.0, "replNetOut": 0.0, "ErrNodes": [] }

7)启动influxdb与telegraf

# 启动influxdb服务
[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart influxdb
[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep influxdb
4 S influxdb 22288     1  1  80   0 - 55908 futex_ 14:09 ?        00:00:00 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
0 S root     22299 18679  0  80   0 - 28169 pipe_w 14:09 pts/1    00:00:00 grep --color=auto influxdb
# 进入influxdb并创建数据库
[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx
Connected to http://localhost:8086 version 1.4.2
InfluxDB shell version: 1.4.2
> create database "telegraf"
> show databases
name: databases
name
----
telegraf
_internal
[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart telegraf
[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep telegraf
4 S telegraf 22195     1  1  80   0 - 23463 futex_ 14:08 ?        00:00:00 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegra/telegraf.d
0 S root     22206 18679  0  80   0 - 28169 pipe_w 14:08 pts/1    00:00:00 grep --color=auto telegraf
# telegraf启动以后确认influxdb可以被正常写入监控数据
[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx
Connected to http://localhost:8086 version 1.4.2
InfluxDB shell version: 1.4.2
> use telegraf
Using database telegraf
> select * from cpu limit 10
name: cpu
time                cpu       host                    usage_guest usage_guest_nice usage_idle        usage_iowait        usage_irq usage_nice usage_softirq usage_steal usage_system        usage_user
----                ---       ----                    ----------- ---------------- ----------        ------------        --------- ---------- ------------- ----------- ------------        ----------
1515132510000000000 cpu-total iZ2ze06q07wqluc8htj4pdZ 0           0                98.89834752131219 0.10015022533794657 0         0          0             0           0.35052578868309764 0.650976464697222
1515132510000000000 cpu0      iZ2ze06q07wqluc8htj4pdZ 0           0                98.69608826480209 0.10030090270805804 0         0          0             0           0.3009027081244592  0.9027081243733776
1515132510000000000 cpu1      iZ2ze06q07wqluc8htj4pdZ 0           0                99.09999999999854 0.09999999999990905 0         0          0             0           0.3999999999999915  0.40000000000006253

8)使用exec插件收集SequoiaDB性能指标

这里是本框架中唯一需要编写代码的地方,总共四行代码,分别为test.sh与test.js。

# 创建test.sh与test.js程序用于定期收集SequoiaDB性能指标
[root@iZ2ze06q07wqluc8htj4pdZ ~]# cat /tmp/test.sh
#!/bin/sh
/opt/sequoiadb/bin/sdb -f /tmp/test.js | sed '/Return 1/d'
[root@iZ2ze06q07wqluc8htj4pdZ ~]# cat /tmp/test.js
var db = new Sdb();
db.snapshot(SDB_SNAP_DATABASE);
# telegraf使用自己的用户执行,因此在root下创建的脚本需要给全局执行权限
[root@iZ2ze06q07wqluc8htj4pdZ ~]# chmod 777 /tmp/test.sh
[root@iZ2ze06q07wqluc8htj4pdZ ~]# chmod 777 /tmp/test.js
 
# 在telegraf配置中配置telegraf.conf,反注释inputs.exec并编辑相关内容
[root@iZ2ze06q07wqluc8htj4pdZ ~]# vi /etc/telegraf/telegraf.conf
……
[[inputs.exec]]
commands = ["/tmp/test.sh"]
timeout = "5s"
data_format = "json"
 
# 验证配置文件,看到inputs.exec类型的数据确实收集了
[root@iZ2ze06q07wqluc8htj4pdZ ~]# telegraf --test
2018/01/05 16:02:27 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.mem, Collection 1
> mem,host=iZ2ze06q07wqluc8htj4pdZ available=3092996096i,used=882089984i,buffered=66174976i,used_percent=22.190462451570358,available_percent=77.80953754842965,total=3975086080i,cached=3042557952i,active=2068545536i,inactive=1559457792i,slab=164098048i,free=133742592i 1515139347000000000
……
* Plugin: inputs.exec, Collection 1
> exec,host=iZ2ze06q07wqluc8htj4pdZ TotalUpdate=8,TotalRead=12064,TotalDataWrite=15,shardNetOut=1795359,svcNetOut=1396,replNetIn=0,TotalWriteTime=0,vsize=4525711360,replNetOut=0,TotalIndexRead=242,TotalSelect=12032,svcNetIn=761,TotalIndexWrite=16,ReplInsert=0,TotalNumConnects=1,fault=2,ReplUpdate=0,ReplDelete=0,TotalReadTime=0,TotalInsert=7,rss=72156,TotalDelete=0,TotalDataRead=20059,freeLogSpace=1644167168,shardNetIn=2034612,TotalMapped=1607335936 1515139348000000000
* Plugin: inputs.kernel, Collection 1
> kernel,host=iZ2ze06q07wqluc8htj4pdZ boot_time=1515116065i,processes_forked=67543i,interrupts=11615115i,context_switches=20009921i 1515139348000000000
……
 
# 重启telegraf服务
[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart telegraf
 
# 确认数据每十秒钟录入influxdb
[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx
Connected to http://localhost:8086 version 1.4.2
InfluxDB shell version: 1.4.2
> use telegraf
Using database telegraf
> select * from exec
name: exec
time                ReplDelete ReplInsert ReplUpdate TotalDataRead TotalDataWrite TotalDelete TotalIndexRead TotalIndexWrite TotalInsert TotalMapped TotalNumConnects TotalRead TotalReadTime TotalSelect TotalUpdate TotalWriteTime fault freeLogSpace host                    replNetIn replNetOut rss   shardNetIn shardNetOut svcNetIn svcNetOut vsize
----                ---------- ---------- ---------- ------------- -------------- ----------- -------------- --------------- ----------- ----------- ---------------- --------- ------------- ----------- ----------- -------------- ----- ------------ ----                    --------- ---------- ---   ---------- ----------- -------- --------- -----
1515139522000000000 0          0          0          20064         15             0           242            16              7           1607335936  1                12067     0             12035       8           0              2     1644167168   iZ2ze06q07wqluc8htj4pdZ 0         0          72156 2036130    1796971     761      1396      4525711360
1515140020000000000 0          0          0          20069         15             0           242            16              7           1607335936  1                12070     0             12038       8           0              2     1644167168   iZ2ze06q07wqluc8htj4pdZ 0         0          72156 2037648    1798583     761      1396      4525711360

9)启动grafana

[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl start grafana-server
[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep grafana
4 S grafana   5708     1  5  80   0 - 74671 futex_ 16:57 ?        00:00:00 /usr/sbin/grafan-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins
0 S root      5725  1702  0  80   0 - 28169 pipe_w 16:57 pts/1    00:00:00 grep --color=auto grafana

连接3000端口可以看到

 

 

使用admin/admin登录后添加influxdb数据源 

点击Save&Test通过测试

 

10)制定报表

在Dashboard中创建一个新的Panel,数据源选择刚刚添加的SequoiaDB,然后在指标中配置好4个指标TotalDataRead、TotalDataWrite、TotalIndexRead、TotalIndexWrite。

 

 

分别各做一个table和一个折线图

 

11)小结

至此为止,我们通过使用InfluxDB与Telegraf,仅编写4行代码就将Grafana监控软件与SequoiaDB成功对接。感兴趣的读者甚至可以编写并提交针对SequoiaDB的Telegraf inputs插件,使得其他用户不需要编写脚本即可直接获得SequoiaDB的性能指标。


产品特性
解决方案与案例 
数据库下载 
技术文档 

微信客服:
sequoiadb111




已标记关键词 清除标记