Overview
EFK is an open-source suite for log collection, storage, analysis, and visualization, consisting of Elasticsearch, Filebeat/Fluentd, and Kibana.
Elasticsearch is a real-time, distributed, scalable search engine that supports full-text and structured search. It is commonly used to analyze, search, and store large volumes of log data, and can also index many other kinds of documents.
Filebeat is a lightweight shipper that collects local log data.
Kibana is a visualization platform that presents the log data in a variety of charts.
Installation
Configure the package repository
# Create the repo file: vim /etc/yum.repos.d/elasticstack.repo
[elasticstack]
name=elasticstack
baseurl=https://mirrors.tuna.tsinghua.edu.cn/elasticstack/7.x/yum/
gpgcheck=0
enabled=1
Install Filebeat
Install and configure Filebeat on the machines that produce the logs; it collects them and ships them to Elasticsearch.
# Install filebeat
yum install filebeat
# Edit the config file: vim /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /etc/nginx/logs/access.log
  pipeline: filebeat-nginx-access
output.elasticsearch:
  hosts: ["http://172.31.11.178:9200", "http://172.31.11.179:9200", "http://172.31.11.180:9200"]
  index: "nginx-access-oa-%{+yyyy.MM.dd}"
setup.template.enabled: false
setup.template.name: "filebeat-nginx"
setup.template.pattern: "nginx-access-oa*"
setup.ilm.enabled: false
logging.level: error
# Start
systemctl start filebeat
systemctl enable filebeat
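The `%{+yyyy.MM.dd}` in the `index` setting is a Java-style date pattern that Filebeat expands from the event timestamp, producing one index per day. A minimal Python sketch of the resulting name (the date is just an example):

```python
from datetime import date

# Filebeat's %{+yyyy.MM.dd} expands to a dotted date; the Python
# strftime equivalent of that Java pattern is %Y.%m.%d.
index_name = "nginx-access-oa-%s" % date(2022, 7, 27).strftime("%Y.%m.%d")
print(index_name)  # nginx-access-oa-2022.07.27
```

Daily indices keep each index small and make it easy to expire old data by deleting whole indices.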
Install Elasticsearch
Create an Elasticsearch cluster; at least three machines are required.
# Install Elasticsearch on all three machines
yum install elasticsearch
# Edit the config file: vim /etc/elasticsearch/elasticsearch.yml
cluster.name: efk
node.name: es01
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["es01", "es02", "es03"]
cluster.initial_master_nodes: ["es01", "es02", "es03"]
http.cors.enabled: true
http.cors.allow-origin: "*"
# Edit /etc/hosts; if the node.name values cannot be resolved, the cluster will not start
172.31.11.178 es01
172.31.11.179 es02
172.31.11.180 es03
# Start
systemctl start elasticsearch
systemctl enable elasticsearch
Install Kibana
# Install
yum install kibana
# Edit the config file: vim /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
server.publicBaseUrl: "http://0.0.0.0:5601"
elasticsearch.hosts: ["http://172.31.11.178:9200", "http://172.31.11.179:9200", "http://172.31.11.180:9200"]
i18n.locale: "zh-CN"
# Start
systemctl start kibana
systemctl enable kibana
Install elasticsearch-head
elasticsearch-head is a web client for Elasticsearch that lets you manage indices from a browser.
# Project page
https://github.com/mobz/elasticsearch-head
# Clone the project
git clone https://github.com/mobz/elasticsearch-head.git
# Upgrade Node.js; building elasticsearch-head requires a recent version
wget https://nodejs.org/dist/v16.16.0/node-v16.16.0-linux-x64.tar.xz
tar -xf node-v16.16.0-linux-x64.tar.xz
cp -r node-v16.16.0-linux-x64 /usr/local/node-v16.16.0
# Update the Node.js environment variables: vim /etc/profile and append two lines
export NODE_HOME=/usr/local/node-v16.16.0
export PATH=${NODE_HOME}/bin:$PATH
source /etc/profile
# Install elasticsearch-head
cd elasticsearch-head
npm install phantomjs-prebuilt@2.1.14 --ignore-scripts
npm install
npm run start
# Once started, the tool is reachable in a browser (port 9100 by default)
Visualization configuration
Ingest pipeline
Filebeat can use an ingest pipeline configured through Kibana to preprocess the collected data before it is written to Elasticsearch; the main step here is splitting the log line into fields with grok.
# Sample nginx log line
10.12.34.10 - - [27/Jul/2022:09:41:47 +0800] "GET //oasdfauidf/aksdhfnkerhsdjkfh/validateAuthority?caseType=21 HTTP/1.1" 200 97 "http://sdkfah.sdjfh.com/html/cm_basic_details_pabs.html?id=1d31284b872asdfasd84d7e9ad971b01a1f6f62&casetasdfadype=21&reaadfadonly=false&viewfddlag=false&sign=2a61f0d994c690c37cdd56a7805bcf8d" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.79 Safari/537.36 Browser/61.2.x" 0.006 0.007 200
# grok pattern; it can be tested with the Grok Debugger in Kibana's Dev Tools (backslashes are doubled for embedding in a JSON pipeline body)
%{IP:client} - - \\[%{HTTPDATE:datetime}\\] \"%{WORD:method} %{URIPATHPARAM:uri} HTTP/%{NUMBER:http_protocol}\" %{NUMBER:status} %{NUMBER:body_byte} \"%{URI:referer}\" \"%{GREEDYDATA:agent}\" %{NUMBER:response_time} %{NUMBER:upstream_response_time} %{NUMBER:upstream_status}
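Grok patterns like %{IP:client} compile down to named regex groups. As an illustration only (not what Elasticsearch runs internally), the same field extraction can be sketched in Python with simplified equivalents of each pattern; the sample line below is a shortened, hypothetical version of the nginx log above:

```python
import re

# Simplified Python stand-ins for the grok patterns used in the pipeline:
# %{IP:client} -> (?P<client>...), %{HTTPDATE:datetime} -> (?P<datetime>...), etc.
LOG_RE = re.compile(
    r'(?P<client>\d+\.\d+\.\d+\.\d+) - - '
    r'\[(?P<datetime>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<uri>\S+) HTTP/(?P<http_protocol>[\d.]+)" '
    r'(?P<status>\d+) (?P<body_byte>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" '
    r'(?P<response_time>[\d.]+) (?P<upstream_response_time>[\d.]+) '
    r'(?P<upstream_status>\d+)'
)

line = ('10.12.34.10 - - [27/Jul/2022:09:41:47 +0800] '
        '"GET /demo/validateAuthority?caseType=21 HTTP/1.1" 200 97 '
        '"http://example.com/page.html" "Mozilla/5.0" 0.006 0.007 200')

fields = LOG_RE.match(line).groupdict()
print(fields['client'], fields['status'], fields['response_time'])
```

Every captured group becomes a document field, which is exactly what the grok processor does with its named patterns.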
# Replace the timestamp. The index's default @timestamp is the time Filebeat collected the event, which may differ from when the log line was actually written, so replace it with the time parsed from the log
{
  "date": {
    "field": "datetime",
    "formats": [
      "dd/MMM/yyyy:HH:mm:ss Z"
    ],
    "target_field": "@timestamp",
    "timezone": "Asia/Shanghai"
  }
}
# Make sure the format string matches the log; the one above corresponds to HTTPDATE (27/Jul/2022:09:41:47 +0800)
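The date processor's format string is Java-style. As a quick sanity check, the same HTTPDATE value can be parsed in Python, where `%d/%b/%Y:%H:%M:%S %z` is the equivalent of `dd/MMM/yyyy:HH:mm:ss Z`:

```python
from datetime import datetime, timezone

# Parse the HTTPDATE sample from the nginx log; %z consumes the +0800 offset.
raw = "27/Jul/2022:09:41:47 +0800"
parsed = datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")
print(parsed.isoformat())                           # 2022-07-27T09:41:47+08:00
print(parsed.astimezone(timezone.utc).isoformat())  # 2022-07-27T01:41:47+00:00
```

Elasticsearch stores @timestamp in UTC, so the offset in the log is what keeps the stored time correct regardless of the node's local timezone.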
Index template
Once the Filebeat data is stored in Elasticsearch, Kibana can visualize it, but an index template should be created first to configure the index: shard/replica counts, mappings (field types), and so on.
# Shard/replica settings
{
"index": {
"number_of_shards": "3",
"number_of_replicas": "1"
}
}
# Mappings
# Add field-type mappings one by one; any field that will be indexed and used in visualizations should be given the keyword type
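As an illustration, a minimal mappings fragment for this nginx index might look like the following; the field names come from the grok pattern above, and the exact types are a suggested starting point, not a fixed requirement:

```json
{
  "properties": {
    "client":        { "type": "keyword" },
    "method":        { "type": "keyword" },
    "uri":           { "type": "keyword" },
    "status":        { "type": "keyword" },
    "body_byte":     { "type": "long" },
    "response_time": { "type": "float" },
    "@timestamp":    { "type": "date" }
  }
}
```

keyword fields are stored unanalyzed, which is what allows exact-match filtering and terms aggregations in visualizations; numeric fields such as response_time should be numeric types so that averages and percentiles work.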
Index pattern
To browse index data in Kibana Discover, an index pattern must be created first.
GeoIP
The geoip processor resolves an IP address to its country, city, and other location attributes. A script to build a custom GeoIP database file:
## cat build_mmdb.py
#!/usr/bin/python3
# coding=utf-8
# Requires: pip install py-mmdb-encoder
import mmdbencoder

enc = mmdbencoder.Encoder(
    6,                        # IP version
    32,                       # Size of the pointers
    'GeoLite2-City',          # Name of the table
    ['en'],                   # Languages
    {'en': 'GeoLite2-City'},  # Description
    compat=True)              # Map IPv4 in IPv6 (::abcd instead of ::ffff:abcd) to be read by official libraries

def build_geoip_data(prov: str, city: str, lat: float, lon: float):
    return {
        "city": {"names": {"en": city}},
        "continent": {"code": "AS", "names": {"en": "AS"}},
        "country": {"iso_code": "CN", "names": {"en": "China"}},
        "location": {
            "accuracy_radius": 2,
            "latitude": lat,
            "longitude": lon,
            "metro_code": 623,
            "time_zone": "Asia/Shanghai"
        },
        "registered_country": {"iso_code": "CN", "names": {"en": "China"}},
        "subdivisions": [
            {"iso_code": prov, "names": {"en": prov}}
        ]
    }

subnet_info = {
    # location: [lat, lon]
    "LiaoLin_ShenYang": {"location": [22.765297, 108.37512], "subnet": ["10.189.90.0/24"]},
    "JiLIn_ChangChun": {"location": [43.888988, 125.272324], "subnet": ["10.189.110.0/24"]},
    "HeiLongJiang_HaErBin": {"location": [45.699308, 126.57951], "subnet": ["10.178.113.0/24"]},
    "NeiMengGu_HuHeHaoTe": {"location": [40.802448, 111.665459], "subnet": ["10.148.165.0/24"]},
    "GuangXi_NanNing": {"location": [22.83862, 108.26534], "subnet": ["10.148.169.0/24"]},
    "GuiZhou_GuiYang": {"location": [26.618567, 106.64389], "subnet": ["10.182.231.0/24"]},
    "HuBei_WuHan": {"location": [30.574642, 114.234604], "subnet": ["10.128.237.0/24"]},
}

for name in subnet_info:
    prov, city = name.split('_')
    lat, lon = subnet_info[name]['location']
    data = enc.insert_data(build_geoip_data(prov, city, lat, lon))
    for subnet in subnet_info[name]['subnet']:
        enc.insert_network(subnet, data)

enc.write_file('GeoLite2-City.mmdb')
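Conceptually, the mmdb file maps each network in subnet_info to the location record built for it. That lookup can be illustrated with the standard library's ipaddress module (this only mirrors the mapping; it does not read the mmdb file):

```python
import ipaddress

# A couple of entries reusing the subnet_info table above.
subnet_info = {
    "GuangXi_NanNing": {"subnet": ["10.148.169.0/24"]},
    "HuBei_WuHan": {"subnet": ["10.128.237.0/24"]},
}

def lookup(ip: str):
    """Return the subnet_info entry whose network contains the given IP."""
    addr = ipaddress.ip_address(ip)
    for name, info in subnet_info.items():
        for net in info["subnet"]:
            if addr in ipaddress.ip_network(net):
                return name
    return None

print(lookup("10.148.169.37"))  # GuangXi_NanNing
print(lookup("192.0.2.1"))      # None
```

The real mmdb format stores this as a binary search tree over network prefixes, so lookups are much faster than this linear scan, but the input/output relationship is the same.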
After generating the mmdb database file, copy it into the Elasticsearch ingest-geoip module directory; this must be done on every node in the cluster.
mv GeoLite2-City.mmdb /usr/share/elasticsearch/modules/ingest-geoip
Then add a GeoIP processor to the ingest pipeline and select the IP field as its source.