The previous article walked through the common Flume + Kafka + Storm pipeline. But what if you also need to carry header information along with the data, for example shipping a file while carrying the file's metadata in the Flume Event header? This article covers how to handle that.
Previous article: http://blog.csdn.net/chenguangchun1993/article/details/79474350
There are two key points.
Point one:
In the Flume configuration file, set a1.sinks.k1.useFlumeEventFormat = true. useFlumeEventFormat defaults to false, which sends only the Event body to Kafka; when set to true, the entire Event (header + body) is sent to Kafka.
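For reference, here is a minimal sketch of the corresponding Kafka sink configuration. The agent name (a1), sink name (k1), channel name (c1), broker address, and topic are placeholders; adapt them to your setup:

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
a1.sinks.k1.kafka.topic = flume-topic
a1.sinks.k1.useFlumeEventFormat = true
a1.sinks.k1.channel = c1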
Point two:
When Storm parses the data from Kafka, it receives the message as a byte array (raw bytes by default; a String if the spout is configured with a string scheme). Instead of treating it as plain text, we need to deserialize it back into an AvroFlumeEvent.
The Storm code is as follows.
KafkaTopology.java is essentially the same as in the previous article, so it is not repeated here.
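For reference, a minimal sketch of such a topology using the storm-kafka spout; the ZooKeeper address, topic name, and component ids are assumptions, so adapt them to your cluster. The spout's default RawMultiScheme is kept so that ParseBolt receives the raw Avro bytes:

package com.cgc.kafka;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.kafka.BrokerHosts;
import org.apache.storm.kafka.KafkaSpout;
import org.apache.storm.kafka.SpoutConfig;
import org.apache.storm.kafka.ZkHosts;
import org.apache.storm.topology.TopologyBuilder;

import java.util.UUID;

public class KafkaTopology {
    public static void main(String[] args) throws Exception {
        // Assumed ZooKeeper address and Kafka topic
        BrokerHosts hosts = new ZkHosts("192.168.1.22:2181");
        SpoutConfig spoutConfig =
                new SpoutConfig(hosts, "flume-topic", "/flume-topic", UUID.randomUUID().toString());
        // Keep the default RawMultiScheme: the spout then emits the raw byte[]

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig));
        builder.setBolt("parse-bolt", new ParseBolt()).shuffleGrouping("kafka-spout");
        new LocalCluster().submitTopology("kafka-topology", new Config(), builder.createTopology());
    }
}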
ParseBolt.java
package com.cgc.kafka;

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.flume.source.avro.AvroFlumeEvent;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/**
 * Created by chenguangchun on 2018/2/28
 */
public class ParseBolt extends BaseRichBolt {
    private OutputCollector collector;

    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    public void execute(Tuple input) {
        // Depending on the spout's scheme, the tuple holds either the raw
        // byte[] (RawMultiScheme, storm-kafka's default) or a String; prefer
        // the raw bytes, since round-tripping Avro binary through a String
        // can corrupt it.
        Object raw = input.getValue(0);
        byte[] bytes = raw instanceof byte[]
                ? (byte[]) raw
                : raw.toString().getBytes(StandardCharsets.UTF_8);

        // Decode the bytes back into a Flume event (header + body)
        SpecificDatumReader<AvroFlumeEvent> reader =
                new SpecificDatumReader<AvroFlumeEvent>(AvroFlumeEvent.class);
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        try {
            AvroFlumeEvent result = reader.read(null, decoder);

            // Print the header entries
            System.out.println("header: ");
            for (Map.Entry<CharSequence, CharSequence> entry : result.getHeaders().entrySet()) {
                System.out.println(entry.getKey() + " : " + entry.getValue());
            }

            // Copy only the remaining bytes: ByteBuffer.array() may contain
            // bytes outside the event body
            ByteBuffer data = result.getBody();
            byte[] body = new byte[data.remaining()];
            data.get(body);
            System.out.println("body: " + new String(body, StandardCharsets.UTF_8));
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Ack so the KafkaSpout records the read offset; without the ack the
        // spout does not track reading progress
        this.collector.ack(input);
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: nothing is emitted downstream
    }
}
That completes the Storm side.
Next, attach the header when sending data to Flume.
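The sender below assumes the Flume agent exposes an HTTPSource on port 50000, whose default JSONHandler accepts a JSON array of events, which is exactly what the code posts. A minimal sketch of that source configuration (agent, source, and channel names are placeholders):

a1.sources.r1.type = http
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 50000
a1.sources.r1.channels = c1

ImageToJson.java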
import com.google.gson.Gson;
import org.apache.flume.event.JSONEvent;

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Created by chenguangchun on 2018/3/6
 */
public class ImageToJson {
    public static void main(String[] args) {
        // Build a Flume JSON event: headers carry the metadata, body the payload
        JSONEvent jse = new JSONEvent();
        Map<String, String> map = new HashMap<String, String>();
        map.put("time", "2018");
        map.put("user", "yk");
        jse.setBody("this is a message, hahhahhahhahhahah".getBytes());
        jse.setHeaders(map);

        // Flume's HTTPSource JSONHandler expects a JSON array of events
        Gson gson = new Gson();
        List<Object> events = new ArrayList<Object>();
        events.add(jse);
        String jsonstr = gson.toJson(events);
        post("http://192.168.1.22:50000", jsonstr);
    }

    // Must be static so it can be called from main()
    public static void post(String urlstr, String json) {
        try {
            // Open the connection
            URL url = new URL(urlstr);
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setDoOutput(true);
            connection.setDoInput(true);
            connection.setRequestMethod("POST");
            connection.setUseCaches(false);
            connection.setInstanceFollowRedirects(true);
            // application/json matches the payload being sent
            connection.setRequestProperty("Content-Type", "application/json");
            connection.connect();

            // Write the POST body; send raw UTF-8 bytes rather than using
            // DataOutputStream.writeBytes(), which truncates chars to 8 bits
            DataOutputStream out = new DataOutputStream(connection.getOutputStream());
            out.write(json.getBytes(StandardCharsets.UTF_8));
            out.flush();
            out.close();

            // Read the response
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8));
            String line;
            StringBuilder sb = new StringBuilder();
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
            System.out.println(sb);
            reader.close();

            // Disconnect
            connection.disconnect();
        } catch (IOException e) {
            // MalformedURLException and UnsupportedEncodingException are
            // subclasses of IOException, so one catch block covers them all
            e.printStackTrace();
        }
    }
}
Start the program.
The result then appears in Storm's console output.
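Given the headers and body set in ImageToJson, the bolt's printout should look roughly like the following (the header order may differ, since HashMap does not preserve insertion order):

header: 
time : 2018
user : yk
body: this is a message, hahhahhahhahhahah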
As you can see, the header information is received in full. Done!
Reference: http://blog.csdn.net/a95473004/article/details/53896791