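In this post:
A couple of days ago I was working with Flume's regex filter and, because I needed to cut out part of each log line, ended up writing a custom one. Today I'll cover the official regex filter (the regex_filter interceptor).
The official regex filter keeps or drops log events depending on whether they match a regular expression.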
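1. The excludeEvents property
When excludeEvents is true, events that match the regex are dropped. Only the events that do not match are passed to the channel and written out by the sink.
When excludeEvents is false, events that do not match the regex are dropped. The matching events are passed to the channel and written out by the sink.
The default value of excludeEvents is false.
Note: Flume's regex filter keeps or drops the whole event (the entire log line). It does not cut the matched part out of the line.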
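The keep/drop rule described above can be sketched in plain Java. This is only an illustration, not Flume's own code: the class RegexFilterSketch and its methods are hypothetical names, the two log lines are abbreviated from the sample log in section 2, and the sketch assumes the regex is searched anywhere in the event body (Matcher.find()) rather than having to match the whole line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the interceptor's keep/drop decision (not Flume's own class).
final class RegexFilterSketch {
    private final Pattern pattern;
    private final boolean excludeEvents;

    RegexFilterSketch(String regex, boolean excludeEvents) {
        this.pattern = Pattern.compile(regex);
        this.excludeEvents = excludeEvents;
    }

    // True if the event would be passed on to the channel.
    // Assumption: the regex is searched anywhere in the body (find), not anchored.
    boolean keep(String body) {
        boolean matched = pattern.matcher(body).find();
        return excludeEvents ? !matched : matched;
    }

    // Kept events are returned unchanged: whole lines, never just the matched part.
    List<String> filter(List<String> bodies) {
        List<String> kept = new ArrayList<>();
        for (String body : bodies) {
            if (keep(body)) {
                kept.add(body);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2017-01-06T11:32:48: Debug: D-UNK-000-000: 1 buffered alerts",
            "2017-01-06T11:33:18: Debug: D-JPR-000-000: Parsing events: Omegamon_Base;END");
        // excludeEvents = false: only the matching line survives.
        System.out.println(new RegexFilterSketch("Parsing events.*END", false).filter(lines));
        // excludeEvents = true: only the non-matching line survives.
        System.out.println(new RegexFilterSketch("Parsing events.*END", true).filter(lines));
    }
}
```

In both modes the surviving lines come out exactly as they went in, which is the point of the note above: the filter selects events, it does not rewrite them.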
2. Example
Goal: filter the log messages.
Log before filtering:
2017-01-06T11:32:48: Debug: D-UNK-000-000: Rules file processing took 332 usec.
2017-01-06T11:32:48: Debug: D-UNK-000-000: Flushing events to object servers
2017-01-06T11:32:48: Debug: D-UNK-000-000: 1 buffered alerts
2017-01-06T11:33:18: Debug: D-JPR-000-000: Parsing events: Omegamon_Base;cms_hostname='itmserver';cms_port='37076';integration_type='U';master_reset_flag='';appl_label='';situation_name='disk';situation_type='S';situation_origin='itmserver:LZ';situation_time='01/06/2017 11:33:23.000';situation_status='N';situation_thrunode='TEMS_TEST';situation_fullname='home_disk_error';situation_displayitem='';source='ITM';sub_source='itmserver:LZ';hostname='itmserver';origin='192.168.100.50';adapter_host='itmserver';date='01/06/2017';severity='CRITICAL';msg='itm server home directory > 80%';situation_eventdata='~';END
2017-01-06T11:33:18: Debug: D-UNK-000-000: [Event Processor] EventString: Omegamon_Base;
adapter_host='itmserver';
cms_hostname='itmserver';
situation_type='S';
situation_eventdata='~';
integration_type='U';
situation_displayitem='';
msg='itm server home directory > 80%';
sub_source='itmserver:LZ';
situation_time='01/06/2017 11:33:23.000';
situation_thrunode='TEMS_TEST';
master_reset_flag='';
appl_label='';
hostname='itmserver';
situation_fullname='home_disk_error';
cms_port='37076';
situation_status='N';
source='ITM';
severity='CRITICAL';
situation_origin='itmserver:LZ';
date='01/06/2017';
situation_name='disk';
origin='192.168.100.50';
END
2017-01-06T11:33:18: Debug: D-UNK-000-000: [Event Processor] ClassName: Omegamon_Base
2017-01-06T11:33:18: Debug: D-UNK-000-000: [Event Processor] adapter_host: itmserver
2017-01-06T11:33:18: Debug: D-UNK-000-000: [Event Processor] cms_hostname: itmserver
Log after filtering:
2017-01-06T11:33:18: Debug: D-JPR-000-000: Parsing events: Omegamon_Base;cms_hostname='itmserver';cms_port='37076';integration_type='U';master_reset_flag='';appl_label='';situation_name='disk';situation_type='S';situation_origin='itmserver:LZ';situation_time='01/06/2017 11:33:23.000';situation_status='N';situation_thrunode='TEMS_TEST';situation_fullname='home_disk_error';situation_displayitem='';source='ITM';sub_source='itmserver:LZ';hostname='itmserver';origin='192.168.100.50';adapter_host='itmserver';date='01/06/2017';severity='CRITICAL';msg='itm server home directory > 80%';situation_eventdata='~';END
2.1 Flume configuration file
(The source is exec, the channel is memory, the sink is file_roll, and the interceptor is regex_filter. excludeEvents defaults to false, so it is not set here.)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.shell = /bin/bash -c
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /opt/apps/logs/tail4.log
#filter
a1.sources.r1.interceptors=i1
a1.sources.r1.interceptors.i1.type=regex_filter
a1.sources.r1.interceptors.i1.regex=(Parsing events)(.*)(END)
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.channel = c1
a1.sinks.k1.sink.directory = /opt/apps/tmp
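Before deploying a regex it can help to try it against a few sample lines. The sketch below is a hypothetical helper (RegexCheck and matchingIndexes are not part of Flume) that runs a pattern over lines taken from the sample log, with the long "Parsing events" line abbreviated. It again assumes substring-search semantics (Matcher.find()). It also suggests that the capture groups in the configured regex are optional for filtering, since the ungrouped form selects the same line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for trying a filter regex offline; not part of Flume.
final class RegexCheck {
    // Lines from the sample log; the long "Parsing events" line is abbreviated.
    static final String[] SAMPLE = {
        "2017-01-06T11:32:48: Debug: D-UNK-000-000: Flushing events to object servers",
        "2017-01-06T11:33:18: Debug: D-JPR-000-000: Parsing events: Omegamon_Base;cms_hostname='itmserver';situation_eventdata='~';END",
        "2017-01-06T11:33:18: Debug: D-UNK-000-000: [Event Processor] EventString: Omegamon_Base;",
        "situation_eventdata='~';",
        "END"
    };

    // Returns the indexes of the sample lines in which the regex is found.
    static List<Integer> matchingIndexes(String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < SAMPLE.length; i++) {
            if (pattern.matcher(SAMPLE[i]).find()) {
                indexes.add(i);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        // The regex as configured, with capture groups.
        System.out.println(matchingIndexes("(Parsing events)(.*)(END)"));
        // The same regex without groups.
        System.out.println(matchingIndexes("Parsing events.*END"));
    }
}
```

The bare END line and the multi-line EventString dump are not selected, because each line arrives as its own event and none of them contains "Parsing events".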
2.2 Example 2: using an avro source
(The configuration is almost the same as for exec; only the source differs.)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type= avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
#filter
a1.sources.r1.interceptors=i1
a1.sources.r1.interceptors.i1.type=regex_filter
a1.sources.r1.interceptors.i1.regex=(Parsing events)(.*)(END)
# Use a channel which buffers events in memory
a1.channels.c1.type= memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.channel = c1
a1.sinks.k1.sink.directory = /opt/apps/tmp
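Note
Note: in my tests the interceptor did not take effect with a spooldir source, while it worked with the exec and avro sources. Interceptors are configured per source and are not documented as being limited to particular source types, so the spooldir result may have come from my setup rather than from a real restriction.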