Python Ops Monitoring: The Open-Falcon Operations Monitoring System

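This article walks through setting up the Open-Falcon monitoring system: preparing the system and software environment (installing and configuring Redis and MySQL, installing Go), then starting and configuring the Open-Falcon backend and frontend. It also covers accessing the monitoring website, configuring clients, and links to reference documents.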

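The Open-Falcon Operations Monitoring System

I. Introduction to Open-Falcon

Open-Falcon is a monitoring system written in Go and Python; the project was started by Xiaomi.

1. A monitoring system can monitor and alert on servers, operating systems, middleware, and applications, both at the operations level (basic configuration is enough) and at the application level (custom development that reports logs through a port). This is essential to keeping our systems running normally.

2. Basic monitoring

CPU, load, memory, disk, IO, network metrics, kernel parameters, ss statistics, port checks, liveness of core service processes, resource usage of key business processes, NTP offset, and DNS resolution: all of these metrics are supported directly by Open-Falcon's agent component.

Once you fully understand every one of these basic monitoring items, you have also deepened your understanding of how Linux works and of its commands.

3. Third-party monitoring

Every field has its specialists. Many applications run on top of the OS, and the Open-Falcon development team cannot cover monitoring for every third-party application, so the open-source community needs to contribute more plugins. Plugins already exist for many commonly used third-party applications.

4. JVM monitoring

For the many companies that use Java as their main development language, JVM monitoring is indispensable.

The parameters of each JVM application, such as GC, class loading, JVM memory, processes, and threads, can all be reported to Falcon, and all of them can be obtained through MXBeans.

5. Business application monitoring

Interfaces that the business needs to watch, for example their response times, can report the relevant data to Falcon as needed, and the results can then be viewed in Falcon.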
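As an illustration of reporting business data, the sketch below builds one metric item and POSTs it to a local falcon-agent. It is a minimal Python 3 sketch, not code from this article: it assumes the agent accepts pushed metrics at `/v1/push` on port 1988, and the endpoint name `open-falcon-server` and metric name `api.latency.ms` in the usage line are hypothetical.

```python
import json
import time
import urllib.request

# Assumption: the local falcon-agent listens on port 1988 and accepts
# pushed metrics at /v1/push. Adjust the URL to your deployment.
AGENT_PUSH_URL = "http://127.0.0.1:1988/v1/push"


def build_metric(endpoint, metric, value, step=60, tags="",
                 counter_type="GAUGE", timestamp=None):
    """Build one metric item as a plain dict ready for JSON encoding."""
    return {
        "endpoint": endpoint,
        "metric": metric,
        "timestamp": int(time.time()) if timestamp is None else int(timestamp),
        "step": step,
        "value": value,
        "counterType": counter_type,
        "tags": tags,
    }


def push_metrics(items, url=AGENT_PUSH_URL, timeout=5):
    """POST a list of metric items to the agent and return the response body."""
    body = json.dumps(items).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")

# Hypothetical usage, run on a host with a running agent:
# push_metrics([build_metric("open-falcon-server", "api.latency.ms", 12.5, tags="api=login")])
```

The `step` value should match how often the script runs, so a job scheduled once a minute would keep the default of 60 seconds.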

II. The Whole Story of How Open-Falcon Was Conceived and Written

III. Environment Preparation

1. System environment

[root@open-falcon-server ~]# cat /etc/redhat-release

CentOS Linux release 7.2.1511 (Core)

2. System tuning

# Install the download tool

yum install wget -y

# Switch to the Aliyun mirror

mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup

wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

# Install the EPEL repository

yum install epel-release.noarch -y

rpm -Uvh http://mirrors.aliyun.com/epel/epel-release-latest-7.noarch.rpm

yum clean all

yum makecache

# Install commonly used tools

yum install git telnet net-tools tree nmap sysstat lrzsz dos2unix tcpdump ntpdate -y

# Synchronize the system time

ntpdate cn.pool.ntp.org

# Change the hostname

hostnamectl set-hostname open-falcon-server

hostname open-falcon-server

# Enable the yum package cache

sed -i 's#keepcache=0#keepcache=1#g' /etc/yum.conf

grep keepcache /etc/yum.conf

# Disable SELinux

sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config

setenforce 0

# Disable the firewall

systemctl stop firewalld.service

systemctl disable firewalld.service

3. Software environment preparation

(1) Redis

# Install Redis

yum install redis -y

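# Common Redis commands

redis-server        Redis server

redis-cli           Redis command-line client

redis-benchmark     Redis benchmarking tool

redis-check-aof     AOF file repair tool

redis-check-dump    RDB file check tool (named redis-check-rdb as of Redis 3.2)

redis-sentinel      Sentinel server

# Start Redis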

[root@open-falcon-server ~]# redis-server &

[1] 1662

[root@open-falcon-server ~]# 1662:C 27 Jul 14:44:56.463 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf

1662:M 27 Jul 14:44:56.464 * Increased maximum number of open files to 10032 (it was originally set to 1024).

[Redis ASCII-art startup banner, reporting:]

Redis 3.2.10 (00000000/0) 64 bit

Running in standalone mode

Port: 6379

PID: 1662

http://redis.io

1662:M 27 Jul 14:44:56.464 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.

1662:M 27 Jul 14:44:56.464 # Server started, Redis version 3.2.10

1662:M 27 Jul 14:44:56.464 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

1662:M 27 Jul 14:44:56.464 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.

1662:M 27 Jul 14:44:56.464 * The server is now ready to accept connections on port 6379

(2) MySQL

# Install MySQL (MariaDB)

yum install mariadb mariadb-server -y

# Start MySQL

systemctl start mariadb

systemctl enable mariadb

# Log in to the database to test it

[root@open-falcon-server ~]# mysql -uroot -p

Enter password:

Welcome to the MariaDB monitor. Commands end with ; or \g.

Your MariaDB connection id is 4

Server version: 5.5.56-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> exit

Bye

# Check the services

[root@open-falcon-server ~]# netstat -lntp|egrep "3306|6379"

tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 1978/mysqld

tcp 0 0 0.0.0.0:6379 0.0.0.0:* LISTEN 1662/redis-server *

tcp6 0 0 :::6379 :::* LISTEN 1662/redis-server *
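The same check can be scripted, for example before starting the Open-Falcon backend. The following is a minimal Python 3 sketch that only tests whether a TCP connection succeeds; the function names are illustrative, and the ports are the ones shown in the netstat output above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services, host="127.0.0.1"):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}

# Example, using the ports from the netstat output:
# check_services({"mysql": 3306, "redis": 6379})
```

A successful connection only shows that something is listening on the port; it does not confirm that the service behind it is healthy.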

# Initialize the MySQL schema

cd /tmp/ && git clone https://github.com/open-falcon/falcon-plus.git

cd /tmp/falcon-plus/scripts/mysql/db_schema/

mysql -h 127.0.0.1 -u root -p < 1_uic-db-schema.sql

mysql -h 127.0.0.1 -u root -p < 2_portal-db-schema.sql

mysql -h 127.0.0.1 -u root -p < 3_dashboard-db-schema.sql

mysql -h 127.0.0.1 -u root -p < 4_graph-db-schema.sql

mysql -h 127.0.0.1 -u root -p < 5_alarms-db-schema.sql

rm -rf /tmp/falcon-plus/

# Set the database password

mysqladmin -uroot password "123456"

# Check the imported databases

[root@open-falcon-server ~]# mysql -uroot -p

Enter password:

Welcome to the MariaDB monitor. Commands end with ; or \g.

Your MariaDB connection id is 11

Server version: 5.5.56-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show databases;

+--------------------+

| Database |

+--------------------+

| information_schema |

| alarms |

| dashboard |

| falcon_portal |

| graph |

| mysql |

| performance_schema |

| test |

| uic |

+--------------------+

9 rows in set (0.00 sec)

MariaDB [(none)]> exit

Bye
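To confirm programmatically that all five schema files were imported, the table printed by `show databases;` can be parsed. This is a rough Python 3 sketch under the assumption that the output keeps the boxed `| name |` layout shown above; the required set is taken from the databases listed there (uic, falcon_portal, dashboard, graph, alarms).

```python
# Databases created by the five schema files imported above.
REQUIRED_DATABASES = {"uic", "falcon_portal", "dashboard", "graph", "alarms"}


def parse_show_databases(output):
    """Extract database names from the boxed table printed by the mysql client."""
    names = set()
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts, and the row-count line
        name = line.strip("|").strip()
        if name and name != "Database":
            names.add(name)
    return names


def missing_databases(output, required=REQUIRED_DATABASES):
    """Return, sorted, the required databases that do not appear in the output."""
    return sorted(set(required) - parse_show_databases(output))
```

An empty list from `missing_databases` means every expected database is present; any name it returns points at a schema file that should be imported again.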

(3) Installing Go

# Install the Go development package

yum install golang -y

# Check the version

[root@open-falcon-server ~]# go version

go version go1.9.4 linux/amd64

# Find the Go installation path

[root@open-falcon-server ~]# find / -name go

/etc/alternatives/go

/var/lib/alternatives/go

/usr/bin/go

/usr/lib/golang/src/cmd/go # this is the path we need

/usr/lib/golang/src/go

/usr/lib/golang/bin/go

/usr/lib/golang/pkg/linux_amd64/cmd/go

/usr/lib/golang/pkg/linux_amd64/go

IV. Open-Falcon Backend

# Create the working directory

export FALCON_HOME=/home/work

export WORKSPACE=$FALCON_HOME/open-falcon

mkdir -p $WORKSPACE

# Download and unpack the binary release

wget https://github.com/open-falcon/falcon-plus/releases/download/v0.2.1/open-falcon-v0.2.1.tar.gz

tar xf open-falcon-v0.2.1.tar.gz -C $WORKSPACE

# Check the unpacked files

[root@open-falcon-server ~]# cd $WORKSPACE

[root@open-falcon-server open-falcon]# ll

total 3896

drwxrwxr-x 7 501 501 67 Aug 15 2017 agent

drwxrwxr-x 5 501 501 40 Aug 15 2017 aggregator

drwxrwxr-x 5 501 501 40 Aug 15 2017 alarm

drwxrwxr-x 6 501 501 51 Aug 15 2017 api

drwxrwxr-x 5 501 501 40 Aug 15 2017 gateway

drwxrwxr-x 6 501 501 51 Aug 15 2017 graph

drwxrwxr-x 5 501 501 40 Aug 15 2017 hbs

drwxrwxr-x 5 501 501 40 Aug 15 2017 judge

drwxrwxr-x 5 501 501 40 Aug 15 2017 nodata

-rwxrwxr-x 1 501 501 3987469 Aug 15 2017 open-falcon

lrwxrwxrwx 1 501 501 16 Aug 15 2017 plugins -> ./agent/plugins/

lrwxrwxrwx 1 501 501 15 Aug 15 2017 public -> ./agent/public/

drwxrwxr-x 5 501 501 40 Aug 15 2017 transfer

Module        Configuration file path
aggregator    /home/work/open-falcon/aggregator/config/cfg.json
graph         /home/work/open-falcon/graph/config/cfg.json
hbs           /home/work/open-falcon/hbs/config/cfg.json
nodata        /home/work/open-falcon/nodata/config/cfg.json
api           /home/work/open-falcon/api/config/cfg.json
alarm         /home/work/open-falcon/alarm/config/cfg.json

# Edit the configuration files

sed -i 's#root:@tcp(127.0.0.1:3306)#root:123456@tcp(127.0.0.1:3306)#g' `find ./ -type f -name "cfg.json"|egrep "alarm|api|nodata|hbs|graph|aggregator"`

cat `find ./ -type f -name "cfg.json"|egrep "alarm|api|nodata|hbs|graph|aggregator"` |grep 'root:123456@tcp(127.0.0.1:3306)'
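The `sed` command above rewrites the MySQL connection string in six cfg.json files at once. For readers who prefer a script, here is a minimal Python 3 sketch of the same substitution. The function names are illustrative, and it assumes each file sits at `<workspace>/<module>/config/cfg.json`, as in the directory listing above.

```python
import os

# Modules whose cfg.json contains a MySQL connection string.
DB_MODULES = ("aggregator", "graph", "hbs", "nodata", "api", "alarm")


def set_root_password(cfg_text, password, host="127.0.0.1", port=3306):
    """Replace the empty root password in every matching connection string."""
    old = "root:@tcp({}:{})".format(host, port)
    new = "root:{}@tcp({}:{})".format(password, host, port)
    return cfg_text.replace(old, new)


def patch_workspace(workspace, password):
    """Rewrite each module's cfg.json in place and return the paths that changed."""
    changed = []
    for module in DB_MODULES:
        path = os.path.join(workspace, module, "config", "cfg.json")
        if not os.path.isfile(path):
            continue  # module not present in this workspace
        with open(path, encoding="utf-8") as handle:
            original = handle.read()
        updated = set_root_password(original, password)
        if updated != original:
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(updated)
            changed.append(path)
    return changed

# Example: patch_workspace("/home/work/open-falcon", "123456")
```

Running it a second time changes nothing, because the empty-password pattern no longer matches once a password has been inserted.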

# Start the backend modules

[root@open-falcon-server open-falcon]# cd /home/work/open-falcon

[root@open-falcon-server open-falcon]# ./open-falcon start

[falcon-graph] 5583

[falcon-hbs] 5592

[falcon-judge] 5600

[falcon-transfer] 5606

[falcon-nodata] 5613

[falcon-aggregator] 5620

[falcon-agent] 5628

[falcon-gateway] 5635

[falcon-api] 5641

[falcon-alarm] 5653

# Check that the services started

[root@open-falcon-server open-falcon]# ./open-falcon check

falcon-graph UP 5583

falcon-hbs UP 5592

falcon-judge UP 5600

falcon-transfer UP 5606

falcon-nodata UP 5613

falcon-aggregator UP 5620

falcon-agent UP 5628

falcon-gateway UP 5635

falcon-api UP 5641

falcon-alarm UP 5653
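The output of `./open-falcon check` is easy to scan by eye, but a small parser helps when the check runs from a script or cron job. The sketch below is Python 3 and assumes each line has the form `<module> <STATE> <pid>` as shown above. It treats any state other than UP as a problem, and the `DOWN` text in the example is a hypothetical sample rather than confirmed tool output.

```python
def parse_check_output(output):
    """Parse lines of the form '<module> <STATE> <pid>' into a dict."""
    status = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("falcon-"):
            status[parts[0]] = parts[1]
    return status


def modules_not_up(output):
    """Return, sorted, the modules whose state is anything other than UP."""
    return sorted(name for name, state in parse_check_output(output).items()
                  if state != "UP")

# Example with a hypothetical failing module:
# modules_not_up("falcon-graph UP 5583\nfalcon-judge DOWN -")
```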

# More command-line usage

# ./open-falcon [start|stop|restart|check|monitor|reload] module

./open-falcon start agent

./open-falcon check

falcon-graph UP 53007

falcon-hbs UP 53014

falcon-judge UP 53020

falcon-transfer UP 53026

falcon-nodata UP 53032

falcon-aggregator UP 53038

falcon-agent UP 53044

falcon-gateway UP 53050

falcon-api UP 53056

falcon-alarm UP 53063

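# For debugging, you can check $WorkDir/$moduleName/log/logs/xxx.log

The backend deployment is now complete.

# Other usage

Reloading the configuration (note: after editing a cfg.json file, for example with vi, the command below reloads it; port 1988 is the agent's HTTP port, so this reloads the agent's configuration)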

curl 127.0.0.1:1988/config/reload

V. Open-Falcon Frontend

# Create the working directory

export FALCON_HOME=/home/work

export WORKSPACE=$FALCON_HOME/open-falcon

mkdir -p $WORKSPACE

cd $WORKSPACE

# Clone the frontend (dashboard) code

git clone https://github.com/open-falcon/dashboard.git

# Install dependencies

yum install -y python-virtualenv

yum install -y python-devel

yum install -y openldap-devel

yum install -y mysql-devel

yum groupinstall "Development tools" -y

# Download ez_setup.py

cd ~

wget --no-check-certificate https://bootstrap.pypa.io/ez_setup.py

python ez_setup.py --insecure

# Download and install pip

wget https://pypi.python.org/packages/11/b6/abcb525026a4be042b486df43905d6893fb04f05aac21c32c638e939e447/pip-9.0.1.tar.gz#md5=35f01da33009719497f01a4ba69d63c9

tar xf pip-9.0.1.tar.gz

cd pip-9.0.1

python setup.py install

# Speed up pip installs by using a mirror

mkdir -p ~/.pip

echo '[global]' >>~/.pip/pip.conf

echo 'index-url = https://pypi.tuna.tsinghua.edu.cn/simple' >>~/.pip/pip.conf

# Test that pip works

[root@open-falcon-server ~]# cd /home/work/open-falcon/dashboard

[root@open-falcon-server dashboard]# pip -V

pip 9.0.1 from /usr/lib/python2.7/site-packages/pip-9.0.1-py2.7.egg (python 2.7)

[root@open-falcon-server dashboard]# pip

Usage:

pip [options]

Commands:

install Install packages.

download Download packages.

uninstall Uninstall packages.

freeze Output installed packages in requirements format.

list List installed packages.

show Show information about installed packages.

check Verify installed packages have compatible dependencies.

search Search PyPI for packages.

wheel Build wheels from your requirements.

hash Compute hashes of package archives.

completion A helper command used for command completion.

help Show help for commands.

General Options:

-h, --help Show help.

--isolated Run pip in an isolated mode, ignoring environment variables and user configuration.

-v, --verbose Give more output. Option is additive, and can be used up to 3 times.

-V, --version Show version and exit.

-q, --quiet Give less output. Option is additive, and can be used up to 3 times (corresponding to WARNING, ERROR, and CRITICAL logging levels).

--log Path to a verbose appending log.

--proxy Specify a proxy in the form [user:passwd@]proxy.server:port.

--retries Maximum number of retries each connection should attempt (default 5 times).

--timeout Set the socket timeout (default 15 seconds).

--exists-action Default action when a path already exists: (s)witch, (i)gnore, (w)ipe, (b)ackup, (a)bort.

--trusted-host Mark this host as trusted, even though it does not have valid or any HTTPS.

--cert Path to alternate CA bundle.

--client-cert Path to SSL client certificate, a single file containing the private key and the certificate in PEM format.

--cache-dir

--no-cache-dir Disable the cache.

--disable-pip-version-check Don't periodically check PyPI to determine whether a new version of pip is available for download. Implied with --no-index.

# List the modules to install

[root@open-falcon-server dashboard]# cat pip_requirements.txt

Flask==0.10.1

Flask-Babel==0.9

Jinja2==2.7.2

Werkzeug==0.9.4

gunicorn==19.5.0

python-dateutil==2.2

requests==2.3.0

mysql-python

python-ldap

# Install the modules

pip install -r pip_requirements.txt

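# Edit the configuration file

Configuration notes:

The dashboard configuration file is 'rrd/config.py'. Adjust it to your environment:

# API_ADDR is the address of the backend api component

API_ADDR = "http://127.0.0.1:8080/api/v1"

# Set PORTAL_DB_* to match your environment; the default user is root and the default password is ""

# Set ALARM_DB_* to match your environment; the default user is root and the default password is ""

Making the change:

cp rrd/config.py{,.bak}

vim rrd/config.py

Modified content: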

# Falcon+ API

API_ADDR = os.environ.get("API_ADDR","http://10.0.0.100:8080/api/v1")

# portal database

# TODO: read from api instead of db

PORTAL_DB_HOST = os.environ.get("PORTAL_DB_HOST","10.0.0.100")

PORTAL_DB_PORT = int(os.environ.get("PORTAL_DB_PORT",3306))

PORTAL_DB_USER = os.environ.get("PORTAL_DB_USER","root")

PORTAL_DB_PASS = os.environ.get("PORTAL_DB_PASS","123456")

PORTAL_DB_NAME = os.environ.get("PORTAL_DB_NAME","falcon_portal")

# alarm database

# TODO: read from api instead of db

ALARM_DB_HOST = os.environ.get("ALARM_DB_HOST","10.0.0.100")

ALARM_DB_PORT = int(os.environ.get("ALARM_DB_PORT",3306))

ALARM_DB_USER = os.environ.get("ALARM_DB_USER","root")

ALARM_DB_PASS = os.environ.get("ALARM_DB_PASS","123456")

ALARM_DB_NAME = os.environ.get("ALARM_DB_NAME","alarms")
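Each setting above is read from an environment variable first and falls back to the value written in the file, so exporting a variable before starting the dashboard should override the default without another edit. The helper below is a minimal Python 3 sketch of that lookup pattern. The function `load_db_settings` is illustrative and is not part of the dashboard code.

```python
import os


def load_db_settings(prefix, environ=None, default_host="127.0.0.1", default_name=""):
    """Read <prefix>_DB_* settings, using defaults for any unset variable."""
    env = os.environ if environ is None else environ
    return {
        "host": env.get(prefix + "_DB_HOST", default_host),
        # Environment values are strings, so the port is converted to int.
        "port": int(env.get(prefix + "_DB_PORT", 3306)),
        "user": env.get(prefix + "_DB_USER", "root"),
        "password": env.get(prefix + "_DB_PASS", ""),
        "name": env.get(prefix + "_DB_NAME", default_name),
    }

# Example: load_db_settings("PORTAL", default_name="falcon_portal")
```

Passing a plain dict as `environ` makes the lookup easy to try out without touching the real process environment.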

# Start the service

[root@open-falcon-server dashboard]# virtualenv ./env

New python executable in /home/work/open-falcon/dashboard/env/bin/python

Installing setuptools, pip, wheel...done.

[root@open-falcon-server dashboard]# source env/bin/activate

(env) [root@open-falcon-server dashboard]# ./control start

falcon-dashboard started..., pid=20814

(env) [root@open-falcon-server dashboard]# ./control tail

[2018-07-27 16:37:02 +0000] [20814] [INFO] Starting gunicorn 19.5.0

[2018-07-27 16:37:02 +0000] [20814] [INFO] Listening at: http://0.0.0.0:8081 (20814)

[2018-07-27 16:37:02 +0000] [20814] [INFO] Using worker: sync

[2018-07-27 16:37:02 +0000] [20819] [INFO] Booting worker with pid: 20819

[2018-07-27 16:37:02 +0000] [20820] [INFO] Booting worker with pid: 20820

[2018-07-27 16:37:02 +0000] [20821] [INFO] Booting worker with pid: 20821

[2018-07-27 16:37:02 +0000] [20826] [INFO] Booting worker with pid: 20826

^C

(env) [root@open-falcon-server dashboard]# deactivate

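VI. Accessing the Website

http://10.0.0.100:8081

[Screenshot: accessing the website]

# Dashboard user management

The dashboard does not create any accounts by default, not even an administrator account; you need to register accounts through the web page.

To get a super administrator account with global control, manually register an account with the username root (the first user named root is automatically made the super administrator).

The super administrator can assign management permissions to ordinary users.

Tip: anyone who can open the dashboard page can register an account, so once the relevant people have their accounts, turn registration off. To do so, edit the api component's configuration file cfg.json, set signup_disable to true, and restart api. When you need to create an account for someone, set the option back, then turn it off again afterwards.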
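Because registration may be switched on and off repeatedly, the change can be scripted. The following Python 3 sketch assumes that `signup_disable` is a top-level key in the api module's cfg.json and that the file is plain JSON; the function name is illustrative.

```python
import json


def set_signup_disabled(cfg_text, disabled=True):
    """Return cfg.json text with the top-level signup_disable flag set."""
    cfg = json.loads(cfg_text)
    cfg["signup_disable"] = bool(disabled)
    return json.dumps(cfg, indent=4) + "\n"

# Example (path assumed from the backend layout in this article):
# path = "/home/work/open-falcon/api/config/cfg.json"
# with open(path) as f:
#     text = f.read()
# with open(path, "w") as f:
#     f.write(set_signup_disabled(text, True))
```

The api module still has to be restarted afterwards, for example with `./open-falcon restart api`, for the new value to take effect.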

[Screenshot: home page]

VII. Open-Falcon Client

# On the server

[root@open-falcon-server ~]# cd /home/work/open-falcon

[root@open-falcon-server open-falcon]# scp -r agent root@10.0.0.101:/home/

[root@open-falcon-server open-falcon]# scp -r open-falcon root@10.0.0.101:/home/

# On the client

[root@open-falcon-client ~]# mkdir -p /home/work/open-falcon

[root@open-falcon-client ~]# mv /home/open-falcon /home/agent /home/work/open-falcon

[root@open-falcon-client ~]# cd /home/work/open-falcon

[root@open-falcon-client open-falcon]# vim agent/config/cfg.json

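Modified content (the # annotations below are explanatory only; JSON has no comment syntax, so leave them out of the real file):

{
    "debug": true, # controls some debug output; usually set to false in production
    "hostname": "", # the agent sends collected data to transfer with the endpoint set to the hostname; by default it is obtained via `hostname`, and a hostname set here takes precedence
    "ip": "", # during heartbeats the agent sends its IP address to hbs; the agent detects the local IP automatically, and you can set it here if you do not want auto-detection
    "plugin": {
        "enabled": false, # the plugin mechanism is disabled by default
        "dir": "./plugin", # the git repo holding the plugin scripts is cloned into this directory
        "git": "https://github.com/open-falcon/plugin.git", # address of the git repo holding the plugin scripts
        "logs": "./logs" # plugin execution logs; check this directory if a plugin misbehaves
    },
    "heartbeat": {
        "enabled": true, # must be set to true here
        "addr": "10.0.0.100:6030", # address of hbs; the port is the hbs rpc port
        "interval": 60, # heartbeat interval, in seconds
        "timeout": 1000 # timeout for connecting to hbs, in milliseconds
    },
    "transfer": {
        "enabled": true,
        "addrs": [
            "10.0.0.100:18433"
        ], # transfer addresses; the port is the transfer rpc port. Several transfer addresses can be listed, and the agent handles HA
        "interval": 60, # collection interval in seconds; the agent collects data once a minute and sends it to transfer
        "timeout": 1000 # timeout for connecting to transfer, in milliseconds
    },
    "http": {
        "enabled": true, # whether to listen on the http port
        "listen": ":1988",
        "backdoor": false
    },
    "collector": {
        "ifacePrefix": ["eth", "em"], # by default only traffic of NICs whose names start with eth or em is collected; an empty list collects all NICs, including lo. Per-NIC traffic can be seen in /proc/net/dev
        "mountPoint": []
    },
    "default_tags": {
    },
    "ignore": { # more than 200 metrics are collected by default; metrics listed under ignore are not collected
        "cpu.busy": true,
        "df.bytes.free": true,
        "df.bytes.total": true,
        "df.bytes.used": true,
        "df.bytes.used.percent": true,
        "df.inodes.total": true,
        "df.inodes.free": true,
        "df.inodes.used": true,
        "df.inodes.used.percent": true,
        "mem.memtotal": true,
        "mem.memused": true,
        "mem.memused.percent": true,
        "mem.memfree": true,
        "mem.swaptotal": true,
        "mem.swapused": true,
        "mem.swapfree": true
    }
}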
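When many clients need the same change, the two server addresses can be set with a script instead of an editor. This is a minimal Python 3 sketch with an illustrative function name. It assumes the cfg.json on disk is plain JSON without `#` annotations, and its port defaults simply mirror the values used in the configuration above.

```python
import json


def point_agent_at_server(cfg_text, server_ip, hbs_port=6030, transfer_port=18433):
    """Return agent cfg.json text with heartbeat and transfer aimed at server_ip."""
    cfg = json.loads(cfg_text)
    cfg["heartbeat"]["enabled"] = True
    cfg["heartbeat"]["addr"] = "{}:{}".format(server_ip, hbs_port)
    cfg["transfer"]["enabled"] = True
    cfg["transfer"]["addrs"] = ["{}:{}".format(server_ip, transfer_port)]
    return json.dumps(cfg, indent=4) + "\n"

# Example (path taken from the client layout in this article):
# path = "/home/work/open-falcon/agent/config/cfg.json"
# with open(path) as f:
#     text = f.read()
# with open(path, "w") as f:
#     f.write(point_agent_at_server(text, "10.0.0.100"))
```

All other keys in the file are written back unchanged, so the sketch can be applied to a stock configuration without losing the collector or ignore settings.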

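# Start the service

./open-falcon start agent      start the process

./open-falcon stop agent       stop the process

./open-falcon monitor agent    view the log

Check whether the logs under the var directory look normal, or open the agent's port 1988 in a browser. The agent also provides a --check option that verifies whether it can run properly on the current machine.

cd /home/work/open-falcon/agent/bin/

./falcon-agent --check

Open the monitoring UI to check:

[Screenshot: monitoring UI]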

VIII. References

## Open-Falcon

# The Open-Falcon Operations Monitoring System

https://www.cnblogs.com/nulige/p/7741580.html

# Installing Open-Falcon and using it to monitor a Raspberry Pi

https://yq.aliyun.com/articles/437196

# Xiaomi's ops architecture: service monitoring with Open-Falcon

https://blog.csdn.net/qq_27384769/article/details/79234270

# An architect's growth path: blog and mind maps

https://github.com/csy512889371/learnDoc

# The whole story of how Open-Falcon was conceived and written

http://mp.weixin.qq.com/s?__biz=MjM5OTcxMzE0MQ==&mid=400225178&idx=1&sn=c98609a9b66f84549e41cd421b4df74d
