Overview
EFK is an open-source suite for log collection, storage, analysis, and visualization, consisting of Elasticsearch, Filebeat/Fluentd, and Kibana.
Elasticsearch is a real-time, distributed, scalable search engine that supports full-text and structured search. It is commonly used to analyze, search, and store large volumes of log data, and can also index many other kinds of documents.
Filebeat is a lightweight shipper that collects local log data.
Kibana is a visualization platform that presents the log data in a variety of charts.
Installation
Configure the package repository
# Create the repo file: vim /etc/yum.repos.d/elasticstack.repo
[elasticstack]
name=elasticstack
baseurl=https://mirrors.tuna.tsinghua.edu.cn/elasticstack/7.x/yum/
gpgcheck=0
enabled=1
Install Filebeat
Install and configure Filebeat on the machines that produce the logs; it collects them and ships them to Elasticsearch.
# Install filebeat
yum install filebeat
# Edit the config file: vim /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /etc/nginx/logs/access.log
  pipeline: filebeat-nginx-access
output.elasticsearch:
  hosts: ["http://172.31.11.178:9200", "http://172.31.11.179:9200", "http://172.31.11.180:9200"]
  index: "nginx-access-oa-%{+yyyy.MM.dd}"
setup.template.enabled: false
setup.template.name: "filebeat-nginx"
setup.template.pattern: "nginx-access-oa*"
setup.ilm.enabled: false
logging.level: error
# Start
systemctl start filebeat
systemctl enable filebeat
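The `%{+yyyy.MM.dd}` in the `index` setting is a Java-style date pattern that Filebeat expands from the event timestamp, producing one index per day. A minimal Python sketch of the resulting name (the date is just an example):

```python
from datetime import date

# Filebeat's %{+yyyy.MM.dd} expands to a dotted date; the Python
# strftime equivalent of that Java pattern is %Y.%m.%d.
index_name = "nginx-access-oa-%s" % date(2022, 7, 27).strftime("%Y.%m.%d")
print(index_name)  # nginx-access-oa-2022.07.27
```

Daily indices keep each index small and make it easy to expire old data by deleting whole indices.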
Install Elasticsearch
Create an Elasticsearch cluster; at least three machines are required.
# Install Elasticsearch on all three machines
yum install elasticsearch
# Edit the config file: vim /etc/elasticsearch/elasticsearch.yml
cluster.name: efk
node.name: es01
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["es01", "es02", "es03"]
cluster.initial_master_nodes: ["es01", "es02", "es03"]
http.cors.enabled: true
http.cors.allow-origin: "*"
# Edit /etc/hosts; if the node.name values cannot be resolved, the cluster will not start
172.31.11.178 es01
172.31.11.179 es02
172.31.11.180 es03
# Start
systemctl start elasticsearch
systemctl enable elasticsearch
Install Kibana
# Install
yum install kibana
# Edit the config file: vim /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
server.publicBaseUrl: "http://0.0.0.0:5601"
elasticsearch.hosts: ["http://172.31.11.178:9200", "http://172.31.11.179:9200", "http://172.31.11.180:9200"]
i18n.locale: "zh-CN"
# Start
systemctl start kibana
systemctl enable kibana
Install elasticsearch-head
elasticsearch-head is a web client for Elasticsearch that lets you manage indices from a browser.
# Project page
https://github.com/mobz/elasticsearch-head
# Clone the project
git clone https://github.com/mobz/elasticsearch-head.git
# Upgrade Node.js; building elasticsearch-head requires a recent version
wget https://nodejs.org/dist/v16.16.0/node-v16.16.0-linux-x64.tar.xz
tar -xf node-v16.16.0-linux-x64.tar.xz
cp -r node-v16.16.0-linux-x64 /usr/local/node-v16.16.0
# Update the Node.js environment variables: vim /etc/profile and append two lines
export NODE_HOME=/usr/local/node-v16.16.0
export PATH=${NODE_HOME}/bin:$PATH
source /etc/profile
# Install elasticsearch-head
cd elasticsearch-head
npm install phantomjs-prebuilt@2.1.14 --ignore-scripts
npm install
npm run start
# Once started, the tool is reachable in a browser (port 9100 by default)
Visualization configuration
Ingest pipeline
Filebeat can use an ingest pipeline configured through Kibana to preprocess the collected data before it is written to Elasticsearch; the main step here is splitting the log line into fields with grok.
# Sample nginx log line
10.12.34.10 - - [27/Jul/2022:09:41:47 +0800] "GET //oasdfauidf/aksdhfnkerhsdjkfh/validateAuthority?caseType=21 HTTP/1.1" 200 97 "http://sdkfah.sdjfh.com/html/cm_basic_details_pabs.html?id=1d31284b872asdfasd84d7e9ad971b01a1f6f62&casetasdfadype=21&reaadfadonly=false&viewfddlag=false&sign=2a61f0d994c690c37cdd56a7805bcf8d" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.79 Safari/537.36 Browser/61.2.x" 0.006 0.007 200
# grok pattern; it can be tested with the Grok Debugger in Kibana's Dev Tools (backslashes are doubled for embedding in a JSON pipeline body)
%{IP:client} - - \\[%{HTTPDATE:datetime}\\] \"%{WORD:method} %{URIPATHPARAM:uri} HTTP/%{NUMBER:http_protocol}\" %{NUMBER:status} %{NUMBER:body_byte} \"%{URI:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:response_time} %{NUMBER:upstream_response_time} %{NUMBER:upstream_status}
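Grok patterns like %{IP:client} compile down to named regex groups. As an illustration only (not what Elasticsearch runs internally), the same field extraction can be sketched in Python with simplified equivalents of each pattern; the sample line below is a shortened, hypothetical version of the nginx log above:

```python
import re

# Simplified Python stand-ins for the grok patterns used in the pipeline:
# %{IP:client} -> (?P<client>...), %{HTTPDATE:datetime} -> (?P<datetime>...), etc.
LOG_RE = re.compile(
    r'(?P<client>\d+\.\d+\.\d+\.\d+) - - '
    r'\[(?P<datetime>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<uri>\S+) HTTP/(?P<http_protocol>[\d.]+)" '
    r'(?P<status>\d+) (?P<body_byte>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<response_time>[\d.]+) (?P<upstream_response_time>[\d.]+) '
    r'(?P<upstream_status>\d+)'
)

line = ('10.12.34.10 - - [27/Jul/2022:09:41:47 +0800] '
        '"GET /demo/validateAuthority?caseType=21 HTTP/1.1" 200 97 '
        '"http://example.com/page.html" "Mozilla/5.0" 0.006 0.007 200')

fields = LOG_RE.match(line).groupdict()
print(fields['client'], fields['status'], fields['response_time'])
```

Every captured group becomes a document field, which is exactly what the grok processor does with its named patterns.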
# Replace the timestamp. The index's default @timestamp is the time Filebeat collected the event, which may differ from when the log line was actually written, so replace it with the time parsed from the log
{
  "date": {
    "field": "datetime",
    "formats": [
      "dd/MMM/yyyy:HH:mm:ss Z"
    ],
    "target_field": "@timestamp",
    "timezone": "Asia/Shanghai"
  }
}
# Make sure the format string matches the log; the one above corresponds to HTTPDATE (27/Jul/2022:09:41:47 +0800)
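The date processor's format string is Java-style. As a quick sanity check, the same HTTPDATE value can be parsed in Python, where `%d/%b/%Y:%H:%M:%S %z` is the equivalent of `dd/MMM/yyyy:HH:mm:ss Z`:

```python
from datetime import datetime, timezone

# Parse the HTTPDATE sample from the nginx log; %z consumes the +0800 offset.
raw = "27/Jul/2022:09:41:47 +0800"
parsed = datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")
print(parsed.isoformat())                           # 2022-07-27T09:41:47+08:00
print(parsed.astimezone(timezone.utc).isoformat())  # 2022-07-27T01:41:47+00:00
```

Elasticsearch stores @timestamp in UTC, so the offset in the log is what keeps the stored time correct regardless of the node's local timezone.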
Index template
Once the Filebeat data is stored in Elasticsearch, Kibana can visualize it, but an index template should be created first to configure the index: shard/replica counts, mappings (field types), and so on.
# Shard/replica settings
{
"index": {
"number_of_shards": "3",
"number_of_replicas": "1"
}
}
# Mappings
# Add field-type mappings one by one; any field that will be indexed and used in visualizations should be given the keyword type
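As an illustration, a minimal mappings fragment for this nginx index might look like the following; the field names come from the grok pattern above, and the exact types are a suggested starting point, not a fixed requirement:

```json
{
  "properties": {
    "client":        { "type": "keyword" },
    "method":        { "type": "keyword" },
    "uri":           { "type": "keyword" },
    "status":        { "type": "keyword" },
    "body_byte":     { "type": "long" },
    "response_time": { "type": "float" },
    "@timestamp":    { "type": "date" }
  }
}
```

keyword fields are stored unanalyzed, which is what allows exact-match filtering and terms aggregations in visualizations; numeric fields such as response_time should be numeric types so that averages and percentiles work.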
Index pattern
To browse index data in Kibana Discover, an index pattern must be created first.
GeoIP
The geoip processor resolves an IP address to its country, city, and other location attributes. A script to build a custom GeoIP database file:
## cat build_mmdb.py
#!/usr/bin/python3
# coding=utf-8
# Requires: pip install py-mmdb-encoder
import mmdbencoder

enc = mmdbencoder.Encoder(
    6,                        # IP version
    32,                       # Size of the pointers
    'GeoLite2-City',          # Name of the table
    ['en'],                   # Languages
    {'en': 'GeoLite2-City'},  # Description
    compat=True)              # Map IPv4 in IPv6 (::abcd instead of ::ffff:abcd) to be read by official libraries

def build_geoip_data(prov: str, city: str, lat: float, lon: float):
    return {
        "city": {"names": {"en": city}},
        "continent": {"code": "AS", "names": {"en": "AS"}},
        "country": {"iso_code": "CN", "names": {"en": "China"}},
        "location": {
            "accuracy_radius": 2,
            "latitude": lat,
            "longitude": lon,
            "metro_code": 623,
            "time_zone": "Asia/Shanghai"
        },
        "registered_country": {"iso_code": "CN", "names": {"en": "China"}},
        "subdivisions": [
            {"iso_code": prov, "names": {"en": prov}}
        ]
    }

subnet_info = {
    # location: [lat, lon]
    "LiaoLin_ShenYang": {"location": [22.765297, 108.37512], "subnet": ["10.189.90.0/24"]},
    "JiLIn_ChangChun": {"location": [43.888988, 125.272324], "subnet": ["10.189.110.0/24"]},
    "HeiLongJiang_HaErBin": {"location": [45.699308, 126.57951], "subnet": ["10.178.113.0/24"]},
    "NeiMengGu_HuHeHaoTe": {"location": [40.802448, 111.665459], "subnet": ["10.148.165.0/24"]},
    "GuangXi_NanNing": {"location": [22.83862, 108.26534], "subnet": ["10.148.169.0/24"]},
    "GuiZhou_GuiYang": {"location": [26.618567, 106.64389], "subnet": ["10.182.231.0/24"]},
    "HuBei_WuHan": {"location": [30.574642, 114.234604], "subnet": ["10.128.237.0/24"]},
}

for name in subnet_info:
    prov, city = name.split('_')
    lat, lon = subnet_info[name]['location']
    data = enc.insert_data(build_geoip_data(prov, city, lat, lon))
    for subnet in subnet_info[name]['subnet']:
        enc.insert_network(subnet, data)

enc.write_file('GeoLite2-City.mmdb')
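Conceptually, the mmdb file maps each network in subnet_info to the location record built for it. That lookup can be illustrated with the standard library's ipaddress module (this only mirrors the mapping; it does not read the mmdb file):

```python
import ipaddress

# A couple of entries reusing the subnet_info table above.
subnet_info = {
    "GuangXi_NanNing": {"subnet": ["10.148.169.0/24"]},
    "HuBei_WuHan": {"subnet": ["10.128.237.0/24"]},
}

def lookup(ip: str):
    """Return the subnet_info entry whose network contains the given IP."""
    addr = ipaddress.ip_address(ip)
    for name, info in subnet_info.items():
        for net in info["subnet"]:
            if addr in ipaddress.ip_network(net):
                return name
    return None

print(lookup("10.148.169.37"))  # GuangXi_NanNing
print(lookup("192.0.2.1"))      # None
```

The real mmdb format stores this as a binary search tree over network prefixes, so lookups are much faster than this linear scan, but the input/output relationship is the same.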
After generating the mmdb database file, copy it into the Elasticsearch ingest-geoip module directory; this must be done on every node in the cluster.
mv GeoLite2-City.mmdb /usr/share/elasticsearch/modules/ingest-geoip
Then add a GeoIP processor to the ingest pipeline and select the IP field as its source.