Reading guide
1. What is the project background?
2. What does nextTuple() do?
3. What does declareOutputFields(OutputFieldsDeclarer declarer) do?
4. What do execute() and declareOutputFields(OutputFieldsDeclarer declarer) do?
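Project background: a real-time GPS passenger-flow analysis system. The project comes from the Shenzhen Transport Commission, and the data comes from the on-board GPS units of roughly 50,000 taxis and buses in Shenzhen. The goal is to study travelers' trip characteristics, real-time road conditions, and passenger-flow patterns.
Development environment: see the Storm development environment deployment guide.
Function details:
I. GPSReceiverSpout
The two most important functions are nextTuple() and declareOutputFields(OutputFieldsDeclarer declarer).
nextTuple() tells Storm what the next tuple contains. It first receives real-time GPS data from the network over a socket, then uses lineSplit() to split each GPS record on commas into a string array (in the listing this is gpsString.split(TupleInfo.getDelimiter())), and sends the result to the next processing unit, DistrictMatchingBolt.
For example, a raw GPS record: 粤BXXXXX,114.121765,22.569218,2013-02-08 17:29:58,1065382,28,101,0,蓝色
The statement _collector.emit(new Values(GPSRecord[0],GPSRecord[3],GPSRecord[7],GPSRecord[5],GPSRecord[6] , GPSRecord[2],GPSRecord[1])) extracts columns 0, 3, 7, 5, 6, 2 and 1 of the GPS record, in that order, and sends them to the next processing unit.
declareOutputFields() tells the next processing unit, DistrictMatchingBolt, how many columns the spout's output (that is, DistrictMatchingBolt's input) has and what they are: "vehicle_number","date_time","occupied","speed","bearing","lantitude","longitude", seven columns in total.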
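To make the column reordering concrete, here is a minimal sketch that applies the same index order to the sample record and pairs each value with the field name listed above. It runs without Storm; the class and method names (GpsFieldMappingSketch, toNamedFields) are made up for illustration, and the comma delimiter is the one the article describes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GpsFieldMappingSketch {
    // Field names as listed in the article for the spout's output (spelling kept as-is).
    static final String[] FIELD_NAMES = {
        "vehicle_number", "date_time", "occupied", "speed", "bearing", "lantitude", "longitude"
    };
    // Column indexes of the raw record, in the order used by the emit() call.
    static final int[] COLUMN_ORDER = {0, 3, 7, 5, 6, 2, 1};

    /** Splits a raw comma-separated GPS record and pairs each selected column with its field name. */
    static Map<String, String> toNamedFields(String rawRecord) {
        String[] columns = rawRecord.split(",");
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < FIELD_NAMES.length; i++) {
            fields.put(FIELD_NAMES[i], columns[COLUMN_ORDER[i]]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample record from the article; non-ASCII characters are written as Unicode escapes.
        String raw = "\u7ca4BXXXXX,114.121765,22.569218,2013-02-08 17:29:58,1065382,28,101,0,\u84dd\u8272";
        toNamedFields(raw).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Note that columns 4 and 8 of the raw record are simply dropped, and that column 2 (22.569218) ends up under "lantitude" while column 1 (114.121765) ends up under "longitude".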
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.net.Socket;
import java.util.Map;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.IRichSpout;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
// TupleInfo and SocketJava are project classes; their imports are not shown in the original.

public class GPSReceiverSpout implements IRichSpout {
    private static final long serialVersionUID = 1L;
    private SpoutOutputCollector _collector;
    private BufferedReader fileReader;
    //private TopologyContext context;
    //private String file="/home/ghchen/2013-01-05.1/2013-01-05--11_05_48.txt";
    private TupleInfo tupleInfo = new TupleInfo();

    static Socket sock = null;

    @Override
    public void close() {
    }

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        _collector = collector;
        System.out.println("This is open function in FieldSpout !");
    }

    @SuppressWarnings("unused")
    @Override
    public void nextTuple() {
        int count = 0;
        int ch = 0;
        int err = 0;
        try {
            if (sock == null) {
                sock = new Socket("172.20.14.XXX", portXXXX);
            }
            while (true) {
                // 3-byte header: byte 0 is passed on to DissectOneMessage,
                // bytes 1-2 are decoded as the body length.
                byte[] b3 = new byte[3];
                if (sock != null) {
                    try {
                        sock.getInputStream().read(b3, 0, 3);
                        ch = b3[0];
                    } catch (Exception e) {
                        System.out.println("connection reset, reconnecting ...");
                        sock.close();
                        Thread.sleep(100);
                        sock = new Socket("172.20.14.XXX", portXXXX);
                        continue; // the header was not read; start over on the new connection
                    }
                } else {
                    sock = new Socket("172.20.14.XXX", portXXXX);
                    break;
                }
                int len = SocketJava.bytesToShort(b3, 1);
                if (len < 0) break;
                byte[] bytelen = new byte[len];
                // read() returns -1 at end of stream; the array itself is never null.
                int bytesRead = sock.getInputStream().read(bytelen);
                if (bytesRead < 0) {
                    System.out.println("read the second part from byte from socket failed ! ");
                    break;
                }
                String gpsString = SocketJava.DissectOneMessage(ch, bytelen);
                String[] GPSRecord = null;
                if (gpsString != null) {
                    GPSRecord = gpsString.split(TupleInfo.getDelimiter());
                    // Emit columns 0, 3, 7, 5, 6, 2, 1 in the order declared by declareOutputFields().
                    _collector.emit(new Values(GPSRecord[0], GPSRecord[3], GPSRecord[7], GPSRecord[5],
                            GPSRecord[6], GPSRecord[2], GPSRecord[1]));
                } else {
                    break;
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public void ack(Object id) {
        System.out.println("OK:" + id);
    }

    @Override
    public void fail(Object id) {
        System.out.println("Fail:" + id);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        TupleInfo tuple = new TupleInfo();
        Fields fieldsArr;
        try {
            fieldsArr = tuple.getFieldList();
            declarer.declare(fieldsArr);
        } catch (Exception e) {
            throw new RuntimeException("error:fail to new Tuple object in declareOutputFields, tuple is null", e);
        }
    }

    @Override
    public void activate() {
    }

    @Override
    public void deactivate() {
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }

    static int count = 0;

    // Appends obj.toString() to the given file (debugging helper).
    public static void writeToFile(String fileName, Object obj) {
        try {
            FileWriter fwriter;
            fwriter = new FileWriter(fileName, true);
            BufferedWriter writer = new BufferedWriter(fwriter);
            writer.write(obj.toString());
            writer.close();
        } catch (IOException e1) {
            e1.printStackTrace();
        }
    }
}
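One detail worth spelling out about the socket loop: a single InputStream.read() call may return fewer bytes than requested, so a length-prefixed message is safer to read with DataInputStream.readFully(). The sketch below shows that pattern against an in-memory stream. The layout it assumes (one leading byte, then a 2-byte big-endian length, then the body) is only a guess at the wire format; the real format is defined by SocketJava.bytesToShort() and SocketJava.DissectOneMessage(), which the article does not show. FrameReaderSketch, Frame and readFrame are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class FrameReaderSketch {
    /** One decoded frame: the leading tag byte plus the message body. */
    static final class Frame {
        final int tag;
        final byte[] body;

        Frame(int tag, byte[] body) {
            this.tag = tag;
            this.body = body;
        }
    }

    /**
     * Reads one frame laid out as [1 tag byte][2-byte big-endian length][body].
     * Returns null if the stream ends before a new frame starts; throws
     * EOFException (an IOException) if it ends in the middle of a frame.
     */
    static Frame readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int tag = data.read();
        if (tag < 0) {
            return null;
        }
        int length = data.readUnsignedShort();
        byte[] body = new byte[length];
        data.readFully(body); // blocks until the whole body has arrived
        return new Frame(tag, body);
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "BXXXXX,114.121765,22.569218".getBytes(StandardCharsets.UTF_8);
        byte[] wire = new byte[3 + body.length];
        wire[0] = 1;                          // hypothetical tag value
        wire[1] = (byte) (body.length >> 8);  // length, high byte
        wire[2] = (byte) body.length;         // length, low byte
        System.arraycopy(body, 0, wire, 3, body.length);

        Frame frame = readFrame(new ByteArrayInputStream(wire));
        System.out.println(frame.tag + ": " + new String(frame.body, StandardCharsets.UTF_8));
    }
}
```

In a spout, the same readFrame() call could be made once per nextTuple() invocation on sock.getInputStream(), which would also avoid looping forever inside nextTuple().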
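II. DistrictMatchingBolt
The two most important functions are execute() and declareOutputFields(OutputFieldsDeclarer declarer).
execute() extracts the longitude and latitude from the data received from GPSReceiverSpout and calls the fetchSect(record) method of the Sects class, which queries the local geographic information database and returns districtID, the ID of the district that contains the GPS record. It then appends this field to the end of the GPS record and emits the result to the next bolt, countBolt.
declareOutputFields() tells the next bolt, countBolt, that the output format of DistrictMatchingBolt is: "viechleID", "dateTime", "occupied", "speed","bearing", "latitude", "longitude", "districtID".
Note that the Sects class uses GeoTools, an open-source geographic information system toolkit. If you are interested, download the package from http://www.geotools.org/ and add all the relevant jar files to the Eclipse build path; you can then call GeoTools to query the local geographic information database.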
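The Sects class itself is not shown in the article, so the following is only a rough sketch of what a "which district contains this point" lookup can look like. It deliberately swaps GeoTools for the JDK's java.awt.geom.Path2D so that it runs without extra jars; the class name DistrictLookupSketch, its methods, and the rectangle used as a district are all invented for illustration.

```java
import java.awt.geom.Path2D;
import java.util.LinkedHashMap;
import java.util.Map;

class DistrictLookupSketch {
    // District ID -> polygon in (longitude, latitude) coordinates.
    private final Map<Integer, Path2D> districts = new LinkedHashMap<>();

    /** Registers a district polygon given as {lon, lat} vertex pairs. */
    void addDistrict(int id, double[][] vertices) {
        Path2D.Double polygon = new Path2D.Double();
        polygon.moveTo(vertices[0][0], vertices[0][1]);
        for (int i = 1; i < vertices.length; i++) {
            polygon.lineTo(vertices[i][0], vertices[i][1]);
        }
        polygon.closePath();
        districts.put(id, polygon);
    }

    /** Returns the ID of the first district containing the point, or -1 if none does. */
    int fetchDistrict(double lon, double lat) {
        for (Map.Entry<Integer, Path2D> entry : districts.entrySet()) {
            if (entry.getValue().contains(lon, lat)) {
                return entry.getKey();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        DistrictLookupSketch lookup = new DistrictLookupSketch();
        // A made-up rectangular "district" around the sample GPS point.
        lookup.addDistrict(1, new double[][] {
            {114.0, 22.5}, {114.2, 22.5}, {114.2, 22.6}, {114.0, 22.6}
        });
        System.out.println(lookup.fetchDistrict(114.121765, 22.569218));
        System.out.println(lookup.fetchDistrict(113.9, 22.5));
    }
}
```

The -1 return value mirrors the districtID != -1 check in the bolt. A real implementation backed by a shapefile would load the polygons from disk and would probably use a spatial index rather than a linear scan.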
package main.java.realODMatrix.bolt;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.IRichBolt;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import backtype.storm.tuple.Tuple;
import main.java.realODMatrix.spout.FieldListenerSpout;
import main.java.realODMatrix.struct.*;

public class DistrictMatchingBolt implements IRichBolt {
    private static final long serialVersionUID = -433427751113113358L;
    private OutputCollector _collector;
    Integer districtID;
    GPSRcrd record;
    Map<GPSRcrd, Integer> gpsMatch; // map; never populated in this listing
    Integer taskID;
    String taskname;
    List<Object> inputLine;
    Fields matchBoltDeclare = null;
    static String path = "/home/ghchen/sects/sects.shp";
    static Sects sects = null;
    int count = 0;

    @Override
    public void prepare(Map stormConf, TopologyContext context,
            OutputCollector collector) {
        this._collector = collector;
        this.taskID = context.getThisTaskId();
        this.taskname = context.getThisComponentId();
    }

    @Override
    public void execute(Tuple input) {
        try {
            // Load the district data on first use (static, so shared within this JVM).
            if (sects == null) {
                sects = new Sects(path);
            }
            // Copy the values so the incoming tuple itself is not modified.
            List<Object> inputLine = new ArrayList<Object>(input.getValues());
            double longitude = Double.parseDouble((String) inputLine.get(6));
            double latitude = Double.parseDouble((String) inputLine.get(5));
            record = new GPSRcrd(longitude, latitude,
                    Integer.parseInt((String) inputLine.get(3)),
                    Integer.parseInt((String) inputLine.get(4)));
            // Points outside this bounding box are skipped, but still acked below.
            boolean outOfRange = longitude > 114.5692938 || longitude < 113.740000
                    || latitude > 22.839945 || latitude < 22.44;
            if (!outOfRange) {
                districtID = sects.fetchSect(record);
                if (districtID != -1) {
                    System.out.println(count++ + ": GPS Point falls into Sect No. :" + districtID);
                    inputLine.add(Integer.toString(districtID));
                    //input.getFields().toList().add("districtID");
                    List<String> fieldList = input.getFields().toList();
                    fieldList.add("districtID");
                    matchBoltDeclare = new Fields(fieldList);
                    //FieldListenerSpout.writeToFile("/home/ghchen/output","matchBoltDeclare="+matchBoltDeclare);
                    String[] obToStrings = new String[inputLine.size()];
                    obToStrings = inputLine.toArray(obToStrings);
                    // The String[] is spread across the varargs, giving one value per field.
                    _collector.emit(new Values(obToStrings));
                    //_collector.emit(new Values(inputLine));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        _collector.ack(input);
    }

    @Override
    public void cleanup() {
        System.out.println("-- District Matcher [" + taskname + "-" + districtID + "] --");
        // gpsMatch is never initialized in this listing, so guard against null.
        if (gpsMatch != null) {
            for (Map.Entry<GPSRcrd, Integer> entry : gpsMatch.entrySet()) {
                System.out.println(entry.getKey() + ": " + entry.getValue());
            }
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("viechleID", "dateTime", "occupied", "speed",
                "bearing", "latitude", "longitude", "districtID"));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}
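The two data-handling steps inside execute(), the range check and appending districtID to the record, can be tried out without Storm. The sketch below reuses the bounding-box constants from the listing; DistrictTaggingSketch, insideBounds and withDistrict are hypothetical names, and the district ID 12 is an arbitrary example value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DistrictTaggingSketch {
    // Bounding box taken from the range check in execute().
    static final double MIN_LON = 113.740000;
    static final double MAX_LON = 114.5692938;
    static final double MIN_LAT = 22.44;
    static final double MAX_LAT = 22.839945;

    /** True if the point lies inside (or on the edge of) the bounding box. */
    static boolean insideBounds(double lon, double lat) {
        return lon >= MIN_LON && lon <= MAX_LON && lat >= MIN_LAT && lat <= MAX_LAT;
    }

    /** Returns a copy of the values with the district ID appended as a string; the input list is left untouched. */
    static List<Object> withDistrict(List<Object> values, int districtId) {
        List<Object> tagged = new ArrayList<>(values);
        tagged.add(Integer.toString(districtId));
        return tagged;
    }

    public static void main(String[] args) {
        // Values in the spout's output order: vehicle, time, occupied, speed, bearing, latitude, longitude.
        List<Object> values = Arrays.<Object>asList(
                "BXXXXX", "2013-02-08 17:29:58", "0", "28", "101", "22.569218", "114.121765");
        double lon = Double.parseDouble((String) values.get(6));
        double lat = Double.parseDouble((String) values.get(5));
        if (insideBounds(lon, lat)) {
            System.out.println(withDistrict(values, 12));
        }
    }
}
```

Working on a copy matters because the list returned by a tuple's getValues() belongs to the tuple; appending to the copy keeps the incoming tuple intact while the emitted record grows from seven to eight values, matching the eight fields declared in declareOutputFields().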