Preface
While using Flume's HTTP source, we found that its default handler only accepts data posted as a JSON array in a fixed format. Sometimes we need to collect the original text sent to the HTTP interface, and the default handler cannot do that. Searching online turned up only code examples of custom XML-parsing handlers, so this article is published for reference.
Code (package it as a jar and put the jar into Flume's lib directory):
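For context, Flume's default JSONHandler expects the request body to be a JSON array of event objects, each with a `headers` map and a `body` string. The sketch below shows the wrapping a client would otherwise have to do to send one plain string through the default handler. The class and method names (`JsonHandlerPayload`, `escape`, `wrap`) are made up for illustration, and the escaping covers only the basic JSON cases.

```java
public class JsonHandlerPayload {

    // Escape the characters that JSON does not allow unescaped inside a string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Wrap one raw string as a single-event array with empty headers.
    static String wrap(String rawBody) {
        return "[{\"headers\":{},\"body\":\"" + escape(rawBody) + "\"}]";
    }

    public static void main(String[] args) {
        // prints [{"headers":{},"body":"plain text line"}]
        System.out.println(wrap("plain text line"));
    }
}
```

With a raw handler, the client can post `plain text line` directly and skip this wrapping step.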
package com.examples;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.source.http.HTTPBadRequestException;
import org.apache.flume.source.http.HTTPSourceHandler;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import javax.servlet.http.HttpServletRequest;
import java.io.BufferedReader;
import java.nio.charset.UnsupportedCharsetException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
/**
* @Description: With Flume's default JSONHandler, a JSON string sent to the HTTP interface has
* to be wrapped in an array in a fixed format. This handler removes that requirement, so
* a plain string is also received as an event.
* @Author: Cheung Francis Yung
* @Date: 2023-03-06
*
*/
public class HttpSourceRawHandler implements HTTPSourceHandler {
private static final Logger LOG = LoggerFactory.getLogger(HttpSourceRawHandler.class);
@Override
public List<Event> getEvents(HttpServletRequest request) throws HTTPBadRequestException, Exception {
BufferedReader reader = request.getReader();
String charset = request.getCharacterEncoding();
final List<Event> events = new ArrayList<Event>(0);
try {
// UTF-8 is default for JSON or Text. If no charset is specified, UTF-8 is to be assumed.
if (charset == null) {
LOG.debug("Charset is null, default charset of UTF-8 will be used.");
charset = "UTF-8";
} else if (!(charset.equalsIgnoreCase("utf-8")
|| charset.equalsIgnoreCase("utf-16")
|| charset.equalsIgnoreCase("utf-32"))) {
LOG.error("Unsupported character set in request {}. "
+ "This handler supports UTF-8, "
+ "UTF-16 and UTF-32 only.", charset);
throw new UnsupportedCharsetException("This handler supports UTF-8, "
+ "UTF-16 and UTF-32 only.");
}
// Read the request body into a String. lines() strips line terminators, so a
// multi-line body is concatenated into a single line.
String collectData = reader.lines().collect(Collectors.joining());
// The body is put into a single event as-is and handed to the channel without
// any parsing. Encode it with the request charset, not the platform default.
events.add(EventBuilder.withBody(collectData.getBytes(charset)));
} catch (Exception ex) {
throw new HTTPBadRequestException("Request is not in expected format. " +
"Please refer to the documentation for the expected format.", ex);
}
return events;
}
@Override
public void configure(Context context) {
// This handler takes no configuration parameters.
}
}
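The two core steps of the handler, resolving the charset and reading the body with `reader.lines().collect(Collectors.joining())`, can be tried outside Flume. The following is a minimal sketch using only the JDK; `RawBodyReadDemo`, `resolveCharset` and `readBody` are hypothetical names, not part of Flume. It shows that a missing charset falls back to UTF-8 and that line breaks inside the body are dropped by the read step.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.nio.charset.UnsupportedCharsetException;
import java.util.stream.Collectors;

public class RawBodyReadDemo {

    // Same rule as the handler: null means UTF-8, otherwise only UTF-8/16/32 pass.
    static String resolveCharset(String requested) {
        if (requested == null) {
            return "UTF-8";
        }
        if (requested.equalsIgnoreCase("utf-8")
                || requested.equalsIgnoreCase("utf-16")
                || requested.equalsIgnoreCase("utf-32")) {
            return requested;
        }
        throw new UnsupportedCharsetException(requested);
    }

    // Same read step as the handler: lines() strips the line terminators
    // and joining() concatenates the lines with no separator.
    static String readBody(BufferedReader reader) {
        return reader.lines().collect(Collectors.joining());
    }

    public static void main(String[] args) throws Exception {
        String body = readBody(new BufferedReader(new StringReader("line1\nline2")));
        byte[] bytes = body.getBytes(resolveCharset(null));
        // prints line1line2 (10 bytes)
        System.out.println(body + " (" + bytes.length + " bytes)");
    }
}
```

If line breaks in the posted text must survive, one option is to join with a separator, for example `Collectors.joining("\n")`, at the cost of normalizing the original line endings.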
Test Case
Here is the configuration file for Flume (minimal version):
agent.sources=r1
agent.sinks=k1
agent.channels=c1
agent.sources.r1.type=http
agent.sources.r1.bind=xx.xx.xx.xx
agent.sources.r1.port=50000
agent.sources.r1.channels=c1
agent.sources.r1.handler=com.examples.HttpSourceRawHandler
agent.channels.c1.type=memory
agent.channels.c1.capacity=1000
agent.channels.c1.transactionCapacity=100
agent.sinks.k1.type = file_roll
agent.sinks.k1.channel = c1
agent.sinks.k1.sink.directory = /software/apache-flume-1.9.0/data_logs/test_raw
Start Flume:
/software/apache-flume-1.9.0/bin/flume-ng agent -c /software/apache-flume-1.9.0/conf -f /software/apache-flume-1.9.0/conf/flume-test-raw.properties -n agent -Dflume.root.logger=INFO,console
Test the HTTP interface with Postman:
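The same request can also be sent without Postman. Below is a minimal sketch of a client built on the JDK's `HttpURLConnection`; the class name `RawPostClient` and its command-line arguments are assumptions for illustration. It posts the text exactly as given, with no JSON array wrapper, and declares UTF-8 in the `Content-Type` header so the handler does not need to fall back to its default charset.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class RawPostClient {

    // POST the text as-is and return the HTTP status code of the response.
    static int post(String url, String text) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            // The charset declared here is what request.getCharacterEncoding() reports.
            conn.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: RawPostClient <url> <text>");
            return;
        }
        System.out.println("HTTP status: " + post(args[0], args[1]));
    }
}
```

For the configuration above, the URL would be the bind address and port of source r1, for example `http://xx.xx.xx.xx:50000/`. A 200 status should indicate that the source accepted the event.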
The data arrives in the target directory, which meets the need to send JSON strings without following a fixed format.