Using Interceptors in Flume (Interceptor)


A Flume interceptor sits between the Source and the Channel: as the Source reads events and forwards them toward the Sink, the interceptor can add useful information to each event's headers, or filter event bodies to perform a first pass of data cleaning. This is very useful in real-world business scenarios.

Java Implementation

The Java code below implements a simple routing rule: events whose body starts with "hello" are intercepted and sent to one destination, and events whose body starts with "hi" are sent to another.

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Tags each event received by the source.
 * An event consists of headers and a body; if the body starts with
 * "hello", a "type=hello" header is added to that event (and
 * similarly for "hi" and everything else).
 */
public class InterceptorDemo implements Interceptor {
    private List<Event> addHeaderEvents = null;
    @Override
    public void initialize() {
        addHeaderEvents = new ArrayList<>();
    }

    @Override
    public Event intercept(Event event) {
        Map<String, String> headers = event.getHeaders();
        byte[] body = event.getBody();
        String bodystr = new String(body);
        if (bodystr.startsWith("hello")){
            headers.put("type","hello");
        }else if (bodystr.startsWith("hi")){
            headers.put("type","hi");
        }else {
            headers.put("type","other");
        }
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> list) {
        addHeaderEvents.clear();
        for (Event event : list) {
            Event opEvent = intercept(event);
            addHeaderEvents.add(opEvent);
        }
        return addHeaderEvents;
    }

    @Override
    public void close() {
        addHeaderEvents.clear();
        addHeaderEvents = null;
    }

    public static class Builder implements Interceptor.Builder{

        @Override
        public Interceptor build() {
            return new InterceptorDemo();
        }

        @Override
        public void configure(Context context) {

        }
    }
}
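The branching inside intercept(Event) can be checked in isolation, without any Flume classes, by sketching it as a plain function (ClassifyDemo and classify are hypothetical names for this sketch, not part of the interceptor):

```java
public class ClassifyDemo {
    // Mirrors the interceptor's branching: map an event body to a "type" header value.
    static String classify(String body) {
        if (body.startsWith("hello")) {
            return "hello";
        } else if (body.startsWith("hi")) {
            return "hi";
        } else {
            return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("hello java"));   // hello
        System.out.println(classify("hi lilei"));     // hi
        System.out.println(classify("good morning")); // other
    }
}
```

Note that "hello" is tested first, so the order of the branches only matters for bodies that could match more than one prefix.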

Building the Jar

In the IDEA Maven panel, double-click clean and then package; the jar file appears under the target directory. Then copy the jar into the lib directory of your Flume installation so it is available to the agent.
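The same build-and-deploy step can be done from the command line; a rough sketch (the artifact name and FLUME_HOME path are assumptions, adjust them to your project):

```shell
# Build the jar (equivalent to running clean and package in the IDEA Maven panel)
mvn clean package

# Copy it into Flume's lib directory so the agent can load the interceptor class
cp target/interceptor-demo-1.0.jar "$FLUME_HOME/lib/"
```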

Creating the conf File

Here, events tagged hello are written to HDFS,
hi events are written to Kafka,
and other events go to the logger sink.

interceptordemo.sources=interceptorDemoSource
interceptordemo.channels=interceptorDemoChannelhello  interceptorDemoChannelhi interceptorDemoChannelother
interceptordemo.sinks=interceptorDemoSinkhello interceptorDemoSinkhi interceptorDemoSinkother

interceptordemo.sources.interceptorDemoSource.type=netcat
interceptordemo.sources.interceptorDemoSource.bind=localhost
interceptordemo.sources.interceptorDemoSource.port=44444
interceptordemo.sources.interceptorDemoSource.interceptors=interceptor1
interceptordemo.sources.interceptorDemoSource.interceptors.interceptor1.type=nj.zb.kb11.InterceptorDemo$Builder
interceptordemo.sources.interceptorDemoSource.selector.type=multiplexing
interceptordemo.sources.interceptorDemoSource.selector.mapping.hello=interceptorDemoChannelhello
interceptordemo.sources.interceptorDemoSource.selector.mapping.hi=interceptorDemoChannelhi
interceptordemo.sources.interceptorDemoSource.selector.mapping.other=interceptorDemoChannelother
interceptordemo.sources.interceptorDemoSource.selector.header=type

interceptordemo.channels.interceptorDemoChannelhello.type=memory
interceptordemo.channels.interceptorDemoChannelhello.capacity=1000
interceptordemo.channels.interceptorDemoChannelhello.transactionCapacity=100

interceptordemo.channels.interceptorDemoChannelhi.type=memory
interceptordemo.channels.interceptorDemoChannelhi.capacity=1000
interceptordemo.channels.interceptorDemoChannelhi.transactionCapacity=100

interceptordemo.channels.interceptorDemoChannelother.type=memory
interceptordemo.channels.interceptorDemoChannelother.capacity=1000
interceptordemo.channels.interceptorDemoChannelother.transactionCapacity=100

interceptordemo.sinks.interceptorDemoSinkhello.type=hdfs
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.fileType=DataStream
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.filePrefix=hello
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.fileSuffix=.csv
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.path=hdfs://192.168.146.222:9000/kb11/hello/%Y-%m-%d
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.useLocalTimeStamp=true
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.batchSize=640
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.rollCount=0
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.rollSize=6400000
interceptordemo.sinks.interceptorDemoSinkhello.hdfs.rollInterval=3

interceptordemo.sinks.interceptorDemoSinkhi.type=org.apache.flume.sink.kafka.KafkaSink
interceptordemo.sinks.interceptorDemoSinkhi.batchSize=640
interceptordemo.sinks.interceptorDemoSinkhi.brokerList=192.168.146.222:9092
interceptordemo.sinks.interceptorDemoSinkhi.topic=hi

interceptordemo.sinks.interceptorDemoSinkother.type=logger

interceptordemo.sources.interceptorDemoSource.channels=interceptorDemoChannelhello  interceptorDemoChannelhi interceptorDemoChannelother
interceptordemo.sinks.interceptorDemoSinkhello.channel=interceptorDemoChannelhello
interceptordemo.sinks.interceptorDemoSinkhi.channel=interceptorDemoChannelhi
interceptordemo.sinks.interceptorDemoSinkother.channel=interceptorDemoChannelother

Before starting, create the corresponding hello directory on HDFS,
and create a Kafka topic named hi.
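These preparation steps might look like the following; the NameNode address and path are taken from the conf above, and the kafka-topics.sh flags are an assumption for Kafka 2.2+ (older versions use --zookeeper instead of --bootstrap-server):

```shell
# Create the target directory on HDFS for the hello sink
hdfs dfs -mkdir -p hdfs://192.168.146.222:9000/kb11/hello

# Create the Kafka topic for the hi sink
kafka-topics.sh --create --topic hi \
  --bootstrap-server 192.168.146.222:9092 \
  --partitions 1 --replication-factor 1
```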
Then run the agent with the conf file:

./bin/flume-ng agent --name interceptordemo --conf ./conf/ --conf-file ./conf/kb11job/netcat-flume-interceptor.conf -Dflume.root.logger=INFO,console

Then, on another terminal, start a Kafka console consumer:

kafka-console-consumer.sh --topic hi --bootstrap-server 192.168.146.222:9092 --from-beginning

Then, on a third terminal, connect to the netcat source:

telnet localhost 44444

Verification

Type `hello java` in the telnet session;
a corresponding file is generated on HDFS.

Then type `hi lilei`;
the message appears in the Kafka topic hi.

Finally, type any other content;
it is printed by the logger sink on the agent console.
