Installing Logstash and handling different kinds of input data, with a detailed look at parsing nested JSON

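Introduction and installation:

What Logstash does: it collects logs, optionally transforms them, and ships them to a destination.

How it works:

  • Input (input): required, e.g. stdin, file, http, exec...
  • Filter (filter): optional, e.g. grok, mutate...
  • Output (output): required, e.g. stdout, elasticsearch, file...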
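
As a rough mental model only (this is not how Logstash is implemented), the three stages can be pictured as functions chained together. The names read_input, apply_filter, write_output and run_pipeline are hypothetical, and the event is modelled as a plain dictionary whose message field holds the raw input line.

```python
import json


def read_input(lines):
    """Input stage: wrap every raw line in an event with a "message" field."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def apply_filter(event):
    """Filter stage (optional): enrich or reshape the event."""
    event["length"] = len(event["message"])  # illustrative enrichment only
    return event


def write_output(event):
    """Output stage: serialize the event, here as one JSON string."""
    return json.dumps(event, ensure_ascii=False)


def run_pipeline(lines):
    """Chain the stages: input -> filter -> output."""
    return [write_output(apply_filter(event)) for event in read_input(lines)]


if __name__ == "__main__":
    for rendered in run_pipeline(["hello logstash\n"]):
        print(rendered)
```

Dropping apply_filter from the chain still leaves a working pipeline, which mirrors why the filter section is optional while input and output are not.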

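Two ways to run Logstash:

logstash -e 'input { stdin { } } output { stdout { } }'   # use the given configuration string; the syntax is the same as in a configuration file
logstash -f <config file>                                 # read the configuration from a file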

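Installation:

Requirements: a virtual machine and a JDK

Download (help yourself):

https://pan.baidu.com/s/1A8-3ranhF8bHf6McvsCKCg
Extraction code: otjs

# Extract the archive
tar -zxf logstash-6.2.2.tar.gz
# Move it into the soft folder (a personal habit; create the folder first if it does not exist)
mv logstash-6.2.2 /opt/soft/logstash622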

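Simple input and output:

First change into the bin directory of your Logstash installation, then run Logstash with the -e configuration string shown above (stdin as input, stdout as output).

Changing the output format to JSON:

Set codec => json inside the stdout output of the same configuration string.

Running Logstash from a configuration file:

1. Write a custom configuration file

# Contents of the configuration file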
input {
    stdin {}
}
output {
    stdout {
        codec => rubydebug
    }
}

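2. Load the configuration file and run Logstash (logstash -f followed by the path to the file)

Reading file contents with Logstash:

1. Create a text file (the configuration below expects /opt/a.txt)

2. Write the configuration file: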

input {
    file {
        path => "/opt/a.txt"
        start_position => "beginning"
        sincedb_path => "/dev/null"            # do not persist the read position, so every run starts from the beginning of the file
    }
}
output {
    stdout {
        codec => rubydebug
    }
}

3. Test result:

Reading a JSON file:

Change the configuration file to the following:

input {
    file {
        path => "/opt/a.txt"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        codec => json
    }
}
output {
    stdout {
        codec => rubydebug
    }
}

Test data:

{"browse":"chrome:true version:81.1","custid":"10000","jzy":{"eventCategory":"btn_click","position":"128 65","time":"2000-01-01 12:00:00","pageName":"index.html","msg":"body"}}
{"browse":"chrome:true version:81.1","custid":"29070","jzy":{"eventCategory":"btn_click","position":"128 65","time":"2000-01-01 12:01:00","pageName":"index.html","msg":"every"}}
{"browse":"chrome:true version:81.2","custid":"26775","jzy":{"eventCategory":"btn_click","position":"128 65","time":"2000-01-01 12:02:00","pageName":"index.html","msg":"body"}}
{"browse":"chrome:true version:81.1","custid":"43694","jzy":{"eventCategory":"href_click","position":"128 65","time":"2000-01-01 12:03:00","pageName":"index.html","msg":"hello"}}

Test result:

Reading nested JSON (the key part!):

{"browser":"chrome:true version:63.0.3315.0","custid":"10000",
"clz":{"eventCategory":"txt_input","position":"128 64",
"time":"2000-02-01 12:00:00","pageName":"http://localhost:8080list.html","msg":"ddd"}}
 
 
46825|event_login|949395933986|192.168.56.202


{
	"name":"zhangsan",
	"friends":
	{
		"friend1":"lisi",
		"friend2":"wangwu",
		"msg":["haha","yaya"]
	}
}

Expected format after parsing:

{
	"name":"zhangsan",
	"friend1":"lisi",
	"friend2":"wangwu",
	"msg":["haha","yaya"]
}
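
To make the target shape concrete before looking at the Logstash configuration, here is a minimal Python sketch of the same flattening. It is only an illustration of the expected result, not what Logstash runs internally, and the helper name flatten_field is hypothetical.

```python
import json


def flatten_field(event, field):
    """Return a copy of event with the keys of event[field] moved to the top level."""
    nested = event.get(field)
    if not isinstance(nested, dict):
        return dict(event)  # nothing to flatten, return an unchanged copy
    flat = {key: value for key, value in event.items() if key != field}
    flat.update(nested)  # nested keys overwrite top-level keys of the same name
    return flat


if __name__ == "__main__":
    raw = (
        '{"name":"zhangsan","friends":'
        '{"friend1":"lisi","friend2":"wangwu","msg":["haha","yaya"]}}'
    )
    print(json.dumps(flatten_field(json.loads(raw), "friends"), indent=2))
```

Note the design choice on name collisions: in this sketch a nested key replaces a top-level key with the same name, so it is worth checking the data for such overlaps first.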

Logstash configuration file:

input {
    file {
        path => "/opt/system/sys.log"
        start_position => "beginning"
        sincedb_path => "/dev/null"      # always read from the beginning of the file
        type => "system"
    }
    file {
        path => "/opt/action/user.log"
        start_position => "beginning"
        sincedb_path => "/dev/null"      # always read from the beginning of the file
        type => "action"
        codec => json                    # each input line is a JSON document
    }
}

filter {
    if [type] == "system" {
        grok {
            match => {"message" => "(?<userid>[0-9]+)\|(?<event_name>[a-zA-Z_]+)\|(?<times>[0-9]+)\|(?<client_ip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})"}
            remove_field => ["message"]
        }
    } else {
        mutate {
            add_field => { "@abc" => "%{clz}" }    # first create a new field and copy the nested clz object into it
        }
        json {
            source => "@abc"                       # then parse it, so its keys land at the top level of the event
            remove_field => [ "@abc","clz" ]       # drop the fields that are no longer needed (optional)
        }
    }
}

output {
    elasticsearch {                      # write to Elasticsearch
        hosts => "http://192.168.56.101:9200"
        index => "system"                # index name (comparable to a database name)
        document_type => "sys"           # document type (comparable to a table name)
    }
}

grok syntax and regular expressions:

The following non-JSON lines need to be split into fields and output in JSON format:

68738|event_login|947700000426|192.168.56.1
50553|event_login|948042000426|192.168.56.2
76286|event_login|951224460426|192.168.56.3

Method 1:

Regular-expression matching:

input {
    file {
        path => "/opt/b.txt"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}
filter {
    grok{
        match => {"message" => "(?<userid>[0-9]+)\|(?<event_name>[a-zA-Z_]+)\|(?<times>[0-9]+)\|(?<client_ip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})"}
        remove_field => ["message"]
    }
}
output {
    stdout {
        codec => rubydebug
    }
}
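
Before starting Logstash it can help to check the regular expression against a sample line. The sketch below does that in Python under two assumptions: Python spells named groups as (?P<name>...) where grok uses (?<name>...), and the helper name parse_line is hypothetical. Like grok, it searches for the pattern anywhere in the line rather than anchoring it.

```python
import re

# Same expression as in the grok filter, with Python's named-group syntax.
LINE_PATTERN = re.compile(
    r"(?P<userid>[0-9]+)\|(?P<event_name>[a-zA-Z_]+)\|(?P<times>[0-9]+)\|"
    r"(?P<client_ip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})"
)


def parse_line(line):
    """Return the named fields as a dict, or None when the line does not match."""
    match = LINE_PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for sample in (
        "68738|event_login|947700000426|192.168.56.1",
        "50553|event_login|948042000426|192.168.56.2",
    ):
        print(parse_line(sample))
```

All captured values are strings, so a numeric field such as times still needs an explicit conversion if it is to be treated as a number.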

Method 2:

grok patterns:

input {
    file {
        path => "/opt/b.txt"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}
filter {
    grok {
        match => { "message" => "%{NUMBER:usid}\|%{WORD:uname}\|%{NUMBER:times}\|%{IP:client_ip}" }
        remove_field => [ "message" ]
    }
}
output {
    stdout {
        codec => rubydebug
    }
}

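Test result:

Reading two files in parallel:

Two files are read at the same time; here a.txt (JSON format) and b.txt (plain text) serve as the example:

File contents:

system: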

46825|event_login|949395933986|192.168.56.202
65668|event_login|949413933986|192.168.56.128
28504|event_login|949431933986|192.168.56.24
84867|event_login|949449993986|192.168.56.99
60249|event_login|949467993986|192.168.56.52
24364|event_login|949485993986|192.168.56.128
38244|event_login|949503993986|192.168.56.66
37634|event_login|949521993986|192.168.56.67
66553|event_login|949539993986|192.168.56.27
44840|event_login|949557993986|192.168.56.240

action:

{"browser":"chrome:true version:63.0.3315.0","custid":"10000","clz":{"eventCategory":"txt_input","position":"128 64","time":"2000-02-01 12:00:00","pageName":"http://localhost:8080list.html","msg":"ddd"}}
{"browser":"chrome:true version:63.0.3315.0","custid":"82251","clz":{"eventCategory":"txt_input","position":"128 64","time":"2000-02-01 12:01:00","pageName":"http://localhost:8080list.html","msg":"ccc"}}
{"browser":"chrome:true version:65.0.3314.0","custid":"15991","clz":{"eventCategory":"txt_input","position":"128 64","time":"2000-02-01 12:02:00","pageName":"http://localhost:8080index.html","msg":"ddd"}}
{"browser":"chrome:true version:65.0.3314.0","custid":"44813","clz":{"eventCategory":"href_click","position":"128 64","time":"2000-02-01 12:03:00","pageName":"http://localhost:8080list.html","msg":"ccc"}}
{"browser":"chrome:true version:63.0.3315.0","custid":"35950","clz":{"eventCategory":"href_click","position":"128 64","time":"2000-02-01 12:04:00","pageName":"http://localhost:8080index.html","msg":"ddd"}}
{"browser":"chrome:true version:63.0.3315.0","custid":"68266","clz":{"eventCategory":"btn_click","position":"128 64","time":"2000-02-01 12:05:00","pageName":"http://localhost:8080list.html","msg":"ccc"}}
{"browser":"chrome:true version:63.0.3315.0","custid":"40362","clz":{"eventCategory":"txt_input","position":"128 64","time":"2000-02-01 12:06:00","pageName":"http://localhost:8080list.html","msg":"hello"}}
{"browser":"chrome:true version:63.0.3315.0","custid":"40355","clz":{"eventCategory":"txt_input","position":"128 64","time":"2000-02-01 12:07:00","pageName":"http://localhost:8080list.html","msg":"ccc"}}
{"browser":"chrome:true version:63.0.3315.0","custid":"39131","clz":{"eventCategory":"href_click","position":"128 64","time":"2000-02-01 12:08:00","pageName":"http://localhost:8080list.html","msg":"ddd"}}
{"browser":"chrome:true version:63.0.3315.0","custid":"31252","clz":{"eventCategory":"txt_input","position":"128 64","time":"2000-02-01 12:09:00","pageName":"http://localhost:8080index.html","msg":"ddd"}}
{"browser":"chrome:true version:65.0.3314.0","custid":"28516","clz":{"eventCategory":"btn_click","position":"128 64","time":"2000-02-01 12:10:00","pageName":"http://localhost:8080list.html","msg":"hello"}}
{"browser":"chrome:true version:65.0.3314.0","custid":"55582","clz":{"eventCategory":"href_click","position":"128 64","time":"2000-02-01 12:11:00","pageName":"http://localhost:8080list.html","msg":"ccc"}}
{"browser":"chrome:true version:63.0.3315.0","custid":"62551","clz":{"eventCategory":"href_click","position":"128 64","time":"2000-02-01 12:12:00","pageName":"http://localhost:8080index.html","msg":"hello"}}

The configuration file is written as follows:

input {
    file {
        path => "/opt/b.txt"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        type => "system"
    }
    file {
        path => "/opt/a.txt"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        codec => json
        type => "action"
    }
}
filter {
    if [type] == "system" {
        grok {
            match => { "message" => "(?<userid>[0-9]+)\|(?<event_name>[a-zA-Z_]+)\|(?<times>[0-9]+)\|(?<clientip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})" }
            remove_field => [ "message" ]
        }
    }else {
         mutate {
             add_field => { "@adv" => "%{clz}" }
         }
         json {
            source => "@adv"
            remove_field => [ "@adv","clz" ]
         }
    }
}
output {
    stdout {
        codec => rubydebug
    }
}
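
The branching on the type field is the part that makes the two inputs coexist. A small Python sketch of that routing, under the assumption that events are plain dictionaries, could look like this; the function name process is hypothetical, and the sketch only mimics the visible effect of the filter block.

```python
import json
import re

SYSTEM_LINE = re.compile(
    r"(?P<userid>[0-9]+)\|(?P<event_name>[a-zA-Z_]+)\|(?P<times>[0-9]+)\|"
    r"(?P<clientip>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})"
)


def process(event):
    """Apply the branch that matches event["type"], as the filter block does."""
    result = dict(event)
    if result.get("type") == "system":
        match = SYSTEM_LINE.search(result.get("message", ""))
        if match:
            result.update(match.groupdict())
            result.pop("message")  # message is removed only after a successful match
    else:
        nested = result.pop("clz", None)
        if isinstance(nested, dict):
            result.update(nested)  # lift the nested keys to the top level
        elif nested is not None:
            result["clz"] = nested  # leave non-object values untouched
    return result


if __name__ == "__main__":
    system_event = {"type": "system", "message": "46825|event_login|949395933986|192.168.56.202"}
    action_event = {
        "type": "action",
        "custid": "10000",
        "clz": {"eventCategory": "txt_input", "msg": "ddd"},
    }
    print(json.dumps(process(system_event)))
    print(json.dumps(process(action_event)))
```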

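Test result:

Importing the data read by Logstash into Elasticsearch:

Replace the stdout output with an elasticsearch output such as the one in the nested-JSON configuration above (hosts, index, document_type).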
