Elasticsearch cluster deployment
node1 192.168.0.161
node2 192.168.0.162
node3 192.168.0.163
All nodes run the same JDK and Elasticsearch versions
[root@es-node1 ~]#java -version #JDK already installed
java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)
[root@es-node1 ~]#cd /usr/local/src/
[root@es-node1 src]#ll #packages downloaded in advance
-rw-r--r-- 1 root root 296519136 Jun 8 2020 elasticsearch-7.6.1-x86_64.rpm
[root@es-node1 src]#yum install elasticsearch-7.6.1-x86_64.rpm
Edit the configuration file
[root@es-node1 ~]#grep ^[a-Z] /etc/elasticsearch/elasticsearch.yml
cluster.name: alibaba #cluster name, must be identical on every node
node.name: node1 #node name, must be unique within the cluster
path.data: /elk/data #es data directory
path.logs: /elk/logs #es log directory
#bootstrap.memory_lock: true #lock memory at startup to keep es out of swap; leave disabled on small test VMs
network.host: 192.168.0.161 #listen IP, different on each node (its own IP)
http.port: 9200 #listen port
discovery.seed_hosts: ["192.168.0.161","192.168.0.162","192.168.0.163"] #cluster discovery list
cluster.initial_master_nodes: ["192.168.0.161","192.168.0.162","192.168.0.163"] #nodes eligible for the initial master election
gateway.recover_after_nodes: 2 #start data recovery only after N nodes have joined the cluster
action.destructive_requires_name: true #indices can only be deleted or closed by explicit name, not via wildcards or _all
http.cors.enabled: true #enable cross-origin access
http.cors.allow-origin: "*" #allowed origins
Raise resource limits (run on all nodes)
[root@es-node1 ~]#echo "* soft nofile 65536" >> /etc/security/limits.conf
[root@es-node1 ~]#echo "* hard nofile 65536" >> /etc/security/limits.conf
[root@es-node1 ~]#vim /usr/lib/systemd/system/elasticsearch.service
LimitMEMLOCK=infinity #no limit on locked memory
[root@es-node1 ~]#vim /etc/elasticsearch/jvm.options
-Xms1g
-Xmx1g #minimum and maximum heap; set them equal, to about half of physical RAM
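A quick way to compute "half of physical RAM" for the heap settings (a sketch; assumes a Linux host with /proc/meminfo):

```shell
# Read total RAM from /proc/meminfo (in kB), halve it, and express it in MB
half_mem_mb=$(awk '/MemTotal/ {printf "%d", $2/2/1024}' /proc/meminfo)
echo "-Xms${half_mem_mb}m -Xmx${half_mem_mb}m"
```

On a 2 GB VM this prints roughly `-Xms1000m -Xmx1000m`, matching the 1g values used above.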
[root@es-node1 ~]#echo "vm.max_map_count = 262144" >> /etc/sysctl.conf #kernel mmap limit required by es
[root@es-node1 ~]#sysctl -p
vm.max_map_count = 262144
Prepare the data directory and its owner (run on all nodes)
[root@es-node1 ~]#mkdir /elk/ #subdirectories set in the config file are created automatically
[root@es-node1 ~]#chown elasticsearch.elasticsearch /elk/ -R
Start the service (run on all nodes)
[root@es-node1 ~]#systemctl restart elasticsearch
[root@es-node1 ~]#tail -f /elk/logs/alibaba.log #watch the log from another terminal
...
...
...
[2021-01-03T06:20:11,302][INFO ][o.e.n.Node ] [node1] started
[root@es-node1 ~]#ss -lntp #9200 serves client requests, 9300 is the inter-node transport port
LISTEN 0 128 :::9200 :::* users:(("java",pid=13196,fd=249))
LISTEN 0 128 :::9300 :::* users:(("java",pid=13196,fd=224))
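Once all three nodes are up, any node can be asked for the overall cluster state (an illustrative check against node1; a healthy three-node cluster reports "status" green and "number_of_nodes" 3):

```
[root@es-node1 ~]#curl -s http://192.168.0.161:9200/_cluster/health?pretty
```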
Install the head management plugin (installing on one node is enough)
[root@es-node1 ~]#yum install docker
[root@es-node1 ~]#docker run -p 9100:9100 mobz/elasticsearch-head:5
Logstash deployment
Written in Ruby (JRuby) and dependent on a Java runtime; on machines without Java, another component, Filebeat, can be used instead.
https://github.com/elastic/logstash
https://baike.baidu.com/item/Ruby/11419
https://www.elastic.co/guide/en/logstash (options and parameters reference)
JDK (omitted)
Install via yum (omitted)
Change the owner of the data directory to logstash; logstash will not start otherwise. Chown the whole data directory:
[root@logtash]#chown logstash.logstash /usr/share/logstash/data/queue -R
Run a test in the foreground. Startup is very slow; on a low-spec machine it can feel frozen, and even a successful start is easy to miss.
A standard input/output example:
input: the input section
output: the output section
stdin: standard input
stdout: standard output
[root@logtash]#/usr/share/logstash/bin/logstash -e 'input { stdin{} } output { stdout { codec => rubydebug } }' #rubydebug is the default codec
[INFO ] 2020-12-07 01:31:30.791
[INFO ]
[INFO ]
[INFO ] 2020-12-07 01:31:32.639 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
Type something as test input
hello
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
"@version" => "1", #event schema version
"message" => "hello", #the content
"host" => "logtash", #where the event occurred
"@timestamp" => 2020-12-06T17:33:29.169Z #event timestamp
}
Output to a file
Use the file plugin with a path:
# /usr/share/logstash/bin/logstash -e 'input { stdin{} } output { file { path => "/tmp/a.txt" } }'
Output to elasticsearch
Specify the elasticsearch output and its hosts; the date in the index name is taken from the event @timestamp (the default, nothing extra needed)
# /usr/share/logstash/bin/logstash -e 'input { stdin{} } output { elasticsearch { hosts => ["192.168.0.160:9200"] index => "linux38-%{+YYYY.MM.dd}" } }'
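The `%{+YYYY.MM.dd}` in the index option is a sprintf pattern expanded from the event's @timestamp, producing one index per day; the resulting name has the same shape as:

```shell
# Same naming shape as logstash's index => "linux38-%{+YYYY.MM.dd}"
date "+linux38-%Y.%m.%d"
```

Daily indices make it easy to expire old data by dropping whole indices.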
Ship a file to elasticsearch
start_position: where to begin reading; "beginning" starts from the top of the file, "end" (the default) from the tail
stat_interval: interval in seconds between checks of the file for changes
[root@logtash ~]#/usr/share/logstash/bin/logstash -e 'input { file { path => "/var/log/dmesg" start_position => "beginning" stat_interval => "3" } } output { elasticsearch { hosts => ["192.168.0.160:9200"] index => "alibaba-%{+YYYY.MM.dd}" } }'
...
...
[INFO ] 2020-12-07 02:20:39.948 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
An index appears, but because my cluster here has only one node, its health shows yellow.
Check the shards on elasticsearch, but never delete an index by removing its directory on disk; that corrupts the cluster.
[root@elasticsearch src]#ll /var/lib/elasticsearch/nodes/0/indices/
total 0
drwxr-sr-x 4 elasticsearch elasticsearch 29 Dec 6 18:20 cQlX69wdSNqekziUUJo3Pg
In production, this is normally written as a config file and run in the background
[root@logtash ~]#vim /etc/logstash/conf.d/syslog_elastic.conf
input {
file {
path => "/var/log/dmesg"
start_position => "beginning"
stat_interval => "3"
}
}
output {
elasticsearch {
hosts => ["192.168.0.160:9200"]
index => "alibaba-%{+YYYY.MM.dd}"
}
}
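Before restarting the service, the file can be syntax-checked with logstash's own flag (startup is as slow as everything else here):

```
[root@logtash ~]#/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/syslog_elastic.conf --config.test_and_exit
```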
[root@logtash ~]#systemctl restart logstash
[root@logtash ~]#ss -lntp
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 50 ::ffff:127.0.0.1:9600 :::* users:(("java",pid=35703,fd=90))
How logstash knows where collection left off: sincedb files record markers, called cursors, which logstash maintains automatically, so it knows the position it last collected up to
[root@tomcat src]#ls -i /var/lib/logstash/plugins/inputs/file/.sincedb_452905a167cf4509fd08acb964fdb20c
68073675 /var/lib/logstash/plugins/inputs/file/.sincedb_452905a167cf4509fd08acb964fdb20c
[root@tomcat src]#cat /var/lib/logstash/plugins/inputs/file/.sincedb_452905a167cf4509fd08acb964fdb20c
67558096 0 2050 1485722 1609661497.936932
68073686 0 2050 452987 1610026802.2512848 /var/log/messages
Kibana deployment
https://github.com/elastic/kibana
https://typescript.bootcss.com/
Install via yum (omitted)
Configuration file
[root@kibana ~]#grep ^[a-Z] /etc/kibana/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://192.168.0.160:9200"]
i18n.locale: "zh-CN"
Start the service; it is just as slow, so be patient on a low-spec machine
[root@kibana ~]#systemctl start kibana
Initial UI
Step 1: create an index pattern; give it a name, ideally a wildcard pattern
Step 2: pick the time filter field; the default is fine
On the logstash side, simulate writing some data
[root@logtash ~]#echo "hahaha" >>/var/log/dmesg
In the kibana UI you can pick fields, and the log entries appear below
Collecting haproxy logs through rsyslog
Suitable for devices that cannot run logstash: network gear such as switches, routers, and firewalls.
Here haproxy stands in for such a device: local rsyslog collects its logs and ships them to a remote logstash, which forwards them on to the es cluster.
Flow: haproxy writes logs locally -> local rsyslog -> logstash -> the haproxy VIP port -> es cluster
A front-end haproxy provides a VIP in front of the back-end es cluster (also common in production)
[root@haproxy ~]#vim /etc/keepalived/keepalived.conf
global_defs {
    vrrp_iptables    #do not generate iptables rules, otherwise the VIP cannot be reached
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    garp_master_delay 10
    smtp_alert
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.0.200 dev eth0 label eth0:1    #the VIP
    }
}
[root@haproxy ~]#vim /etc/haproxy/haproxy.cfg
listen stats #status page (optional)
mode http
bind 0.0.0.0:9999
stats enable
log global
stats uri /haproxy-status
stats auth admin:123123
listen elasticsearch-9200 #reverse proxy for the cluster
bind 192.168.0.200:9200 #listen address and port (the VIP)
mode tcp
server 192.168.0.161 192.168.0.161:9200 check inter 2s fall 3 rise 5
server 192.168.0.162 192.168.0.162:9200 check inter 2s fall 3 rise 5
server 192.168.0.163 192.168.0.163:9200 check inter 2s fall 3 rise 5
[root@haproxy01 ~]#systemctl restart keepalived
[root@haproxy01 ~]#systemctl restart haproxy
[root@haproxy01 ~]#ss -lntp
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:5000 *:* users:(("haproxy",pid=8296,fd=5))
LISTEN 0 128 *:9999 *:* users:(("haproxy",pid=8296,fd=7))
LISTEN 0 128 192.168.0.200:9200 *:* users:(("haproxy",pid=8296,fd=8))
Enable the rsyslog service on the same host
[root@haproxy ~]#vim /etc/rsyslog.conf
$ModLoad imudp
$UDPServerRun 514
local2.* @@192.168.0.63:514 #facility/severity to forward, and the remote address and port (@@ is TCP, a single @ is UDP)
[root@haproxy01 ~]#systemctl restart rsyslog
Note: rsyslog listens on port 514 here, but if logstash runs as a non-root user it cannot bind ports below 1024
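If logstash does run unprivileged, one common workaround (a sketch; the port number 1514 is an arbitrary choice) is to forward to an unprivileged port instead:

```
# /etc/rsyslog.conf on the haproxy node
local2.* @@192.168.0.63:1514

# matching logstash syslog input
input { syslog { port => 1514 host => "0.0.0.0" type => "rsyslog" } }
```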
Logstash server configuration file
[root@tomcat conf.d]#vim rsyslog-to-es.conf
input {
syslog {
port => 514
host => "0.0.0.0"
type => "rsyslog"
}}
output {
if [type] == "rsyslog" {
elasticsearch {
hosts => ["192.168.0.200:9200"]
index => "rsyslog-0.63-%{+YYYY.MM.dd}"
}}}
[root@tomcat conf.d]#systemctl restart logstash
Log in to the haproxy status page (or trigger any other activity) so that haproxy produces logs, just as a network device would; then check in the es head plugin and create the index in kibana (omitted)
Collecting nginx access and error logs
Install nginx (omitted), then change the log format in its configuration file
[root@tomcat src]#vim /apps/nginx/conf/nginx.conf
log_format access_json '{"@timestamp":"$time_iso8601",'
'"host":"$server_addr",'
'"clientip":"$remote_addr",'
'"size":$body_bytes_sent,'
'"responsetime":$request_time,'
'"upstreamtime":"$upstream_response_time",'
'"upstreamhost":"$upstream_addr",'
'"http_host":"$host",'
'"url":"$uri",'
'"domain":"$host",'
'"xff":"$http_x_forwarded_for",'
'"referer":"$http_referer",'
'"tcp_xff":"$proxy_protocol_addr",'
'"http_use_agent":"$http_user_agent",'
'"status":"$status"}';
access_log /var/log/nginx/access.log access_json;
Create a test page and start nginx
[root@tomcat ~]#cat /apps/nginx/html/index.html
<h1>192.168.0.63</h1>
[root@tomcat ~]#/apps/nginx/sbin/nginx
Check the log format; you can also validate the JSON
[root@tomcat ~]#tail -f /var/log/nginx/access.log
{"@timestamp":"2021-01-04T20:55:48+08:00","host":"192.168.0.63","clientip":"192.168.0.104","size":22,"responsetime":0.000,"upstreamtime":"-","upstreamhost":"-","http_host":"192.168.0.63","url":"/index.html","domain":"192.168.0.63","xff":"-","referer":"-","tcp_xff":"","http_use_agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:83.0) Gecko/20100101 Firefox/83.0","status":"200"}
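Each line should be one self-contained JSON object. A quick sanity check (assumes python3 is available on the host; shown here against a shortened sample rather than the live log):

```shell
# Validate a (shortened) access-log line as JSON; prints "valid JSON" on success
echo '{"@timestamp":"2021-01-04T20:55:48+08:00","status":"200"}' \
  | python3 -m json.tool >/dev/null && echo "valid JSON"
# -> valid JSON
```

Against the real file, replace the echo with `tail -n1 /var/log/nginx/access.log`.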
Collect both files with logstash
[root@tomcat ~]#vim /etc/logstash/conf.d/log-to-es.conf
input {
file {
path => "/var/log/nginx/access.log"
start_position => "beginning"
stat_interval => "3"
type => "nginx-accesslog"
codec => json
}
file {
path => "/var/log/nginx/error.log"
start_position => "beginning"
stat_interval => "3"
type => "nginx-errorlog"
}
}
output {
if [type] == "nginx-accesslog" {
elasticsearch {
hosts => ["192.168.0.161:9200"]
index => "nginx-accesslog-0.63-%{+YYYY.MM.dd}"
}
}
if [type] == "nginx-errorlog" {
elasticsearch {
hosts => ["192.168.0.162:9200"]
index => "nginx-errorlog-0.63-%{+YYYY.MM.dd}"
}
}
}
The es head check and the kibana index creation are the same as before. Among the nginx log fields, the main ones to watch are upstreamtime (time spent on the back-end server) and responsetime (total response time), plus url, clientip, and so on.