Collecting raw text data with Flume HttpSource

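This article explains how, when using Flume's HttpSource, to handle raw text data from an HTTP interface rather than only JSON arrays in a fixed format. The author provides a custom HttpSourceRawHandler that accepts data without requiring it to be wrapped in a JSON array and sends it to the channel directly as a string. The article also includes a configuration example and a test case showing how to configure Flume to use this handler and how to send test data with Postman.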

Preface

While using Flume's HttpSource, we found that it can only collect data sent as a JSON array in the specified format. Sometimes, however, we need to collect the raw text sent to the HTTP interface, and the default handler cannot do that. Searching online turned up only code examples of custom handlers for parsing XML, so this article is published for reference.

Code (build it into a jar and put the jar in Flume's lib directory):

package com.examples;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.source.http.HTTPBadRequestException;
import org.apache.flume.source.http.HTTPSourceHandler;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.servlet.http.HttpServletRequest;
import java.io.BufferedReader;
import java.nio.charset.UnsupportedCharsetException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/**
 * @Description: Flume's default JSONHandler only accepts JSON that is wrapped in an array
 *     of events. This handler removes that restriction: the request body is taken as a
 *     plain string and forwarded as a single event, so ordinary text can be collected too.
 * @Author: Cheung Francis Yung
 * @Date: 2023-03-06
 *
 */
public class HttpSourceRawHandler implements HTTPSourceHandler {
    private static final Logger LOG = LoggerFactory.getLogger(HttpSourceRawHandler.class);

    @Override
    public List<Event> getEvents(HttpServletRequest request) throws HTTPBadRequestException, Exception {
        BufferedReader reader = request.getReader();
        String charset = request.getCharacterEncoding();

        // Exactly one event is produced per request.
        final List<Event> events = new ArrayList<>(1);
        try {
            // UTF-8 is default for JSON or Text. If no charset is specified, UTF-8 is to be assumed.
            if (charset == null) {
                LOG.debug("Charset is null, default charset of UTF-8 will be used.");
                charset = "UTF-8";
            } else if (!(charset.equalsIgnoreCase("utf-8")
                    || charset.equalsIgnoreCase("utf-16")
                    || charset.equalsIgnoreCase("utf-32"))) {
                LOG.error("Unsupported character set in request {}. "
                        + "This handler supports UTF-8, "
                        + "UTF-16 and UTF-32 only.", charset);
                throw new UnsupportedCharsetException("This handler supports UTF-8, "
                        + "UTF-16 and UTF-32 only.");
            }
            // Read the whole request body. Note that lines() strips line terminators, so a
            // multi-line body is concatenated into a single line.
            String collectData = reader.lines().collect(Collectors.joining());
            // Send the body to the channel as a single event without further processing.
            // Encode with the request charset instead of the platform default.
            events.add(EventBuilder.withBody(collectData.getBytes(charset)));

        } catch (Exception ex) {
            throw new HTTPBadRequestException("Request is not in expected format. " +
                    "Please refer documentation for expected format.", ex);
        }
        return events;
    }
    @Override
    public void configure(Context context) {
        // This handler has no configurable properties.
    }
}
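
The body-reading step in the handler can be tried outside Flume. The following is a minimal sketch, not part of the original handler: the class name RawBodyDemo and both method names are made up for illustration. It shows that joining the lines of a BufferedReader without a separator flattens a multi-line body into one line, and that a hypothetical variant with a "\n" separator keeps the line breaks between lines.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical demo class, used only to illustrate how the request body is read.
public class RawBodyDemo {

    // Same approach as the handler: lines are concatenated with no separator.
    static String joinWithoutBreaks(BufferedReader reader) {
        return reader.lines().collect(Collectors.joining());
    }

    // Assumed alternative: keep a "\n" between lines.
    static String joinWithBreaks(BufferedReader reader) {
        return reader.lines().collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // A StringReader stands in for the reader returned by request.getReader().
        String body = "{\n  \"id\": 1\n}";

        String flat = joinWithoutBreaks(new BufferedReader(new StringReader(body)));
        String kept = joinWithBreaks(new BufferedReader(new StringReader(body)));

        System.out.println(flat);              // the body on a single line
        System.out.println(kept.equals(body)); // whether the line breaks survived

        // Encode with an explicit charset, as an event body would be.
        byte[] eventBody = flat.getBytes(StandardCharsets.UTF_8);
        System.out.println(eventBody.length);
    }
}
```

Flattening is convenient when each event should occupy one line in the sink output; if the original line breaks matter, the separator variant may be the better fit.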

Test Case

Here is the configuration file for Flume (minimalist version):

agent.sources=r1
agent.sinks=k1
agent.channels=c1

agent.sources.r1.type=http
agent.sources.r1.bind=xx.xx.xx.xx
agent.sources.r1.port=50000
agent.sources.r1.channels=c1
agent.sources.r1.handler=com.examples.HttpSourceRawHandler

agent.channels.c1.type=memory
agent.channels.c1.capacity=1000
agent.channels.c1.transactionCapacity=100

agent.sinks.k1.type=file_roll
agent.sinks.k1.channel=c1
agent.sinks.k1.sink.directory=/software/apache-flume-1.9.0/data_logs/test_raw

Launch Flume:

/software/apache-flume-1.9.0/bin/flume-ng agent -c /software/apache-flume-1.9.0/conf -f /software/apache-flume-1.9.0/conf/flume-test-raw.properties -n agent -Dflume.root.logger=INFO,console

Test the HTTP interface with Postman by sending the raw body to the source's address and port.
The data then shows up in the target directory, which meets the need to send JSON strings without following a fixed format.
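
The same test can also be run from code instead of Postman. The following is a rough sketch using only the JDK's HttpURLConnection; the class name RawPostClient, the method postRaw, the sample body, and the default address http://localhost:50000/ are assumptions for illustration. Replace the address with the bind address and port from your own configuration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical test client: posts a raw string to an HTTP endpoint.
public class RawPostClient {

    // Sends the body as UTF-8 text and returns the HTTP status code.
    static int postRaw(String endpoint, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            // Declaring the charset lets the handler see "UTF-8" as the request encoding.
            conn.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
            byte[] payload = body.getBytes(StandardCharsets.UTF_8);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed address; pass the real one as the first argument.
        String endpoint = args.length > 0 ? args[0] : "http://localhost:50000/";
        // A plain JSON object, not wrapped in an array.
        int status = postRaw(endpoint, "{\"name\": \"flume\", \"type\": \"raw\"}");
        System.out.println("HTTP status: " + status);
    }
}
```

If the request is accepted, the body should appear as one line in a file under the sink directory configured above.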
