Preface
As we all know, in the big data world Hive greatly lowers the cost of learning to write Hadoop programs: users can write query statements in a SQL-like syntax and pull the target data out of Hive tables. Most importantly, these SQL statements are ultimately translated into MapReduce jobs, which is where Hive's real power lies; you can loosely think of Hive as a shell sitting on top of Hadoop. There is one small subtlety, though: a Hive query does not map one-to-one onto the job it generates. A large SQL statement, once compiled, spawns many small jobs that run independently, each distinguished by its own job name while still sharing a common name. So a question arises: for an extremely long Hive SQL statement, how do I find out which sub-SQL consumed most of the execution time? JobHistory only shows each sub-job's run time, not the SQL behind it. Once we have that mapping, we can pinpoint problems much faster.
The SQL Inside Hive Sub-Jobs
OK, with that goal in mind: to work out which sub-SQL spawns the longest-running jobs, we first need to know where those SQL statements actually live. As mentioned above, Hive rides on Hadoop, so the job information for Hive-submitted jobs is naturally stored on Hadoop's HDFS. Now recall the file types kept by JobHistory; you should find files with suffixes like the following.
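Concretely, under the date-partitioned subdirectories of the JobHistory done directory you should see pairs of files of roughly this shape (the job id, timestamps, and job name below are invented for illustration; only the two suffixes matter):

    job_1449214648643_0001_conf.xml
    job_1449214648643_0001-1449214669559-hadoop-xxx-1449214707975-1-1-SUCCEEDED-default.jhist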
Among them are the .jhist files we analyzed before, plus .xml files whose names carry the string "conf"; judging by the file name, these hold the configuration the job was submitted with. Open one of the *_conf.xml files with vim and check whether it contains the property we are after, hive.query.string.
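If the property is there, the relevant stretch of the *_conf.xml file looks something like the excerpt below. This is the standard Hadoop configuration XML layout; the query text is a made-up placeholder, and in real files the whole property often sits on one very long line (which is why the parser below accumulates lines until it sees the closing tag):

    <property>
        <name>hive.query.string</name>
        <value>select count(1) from sample_table where dt = '20151013'</value>
    </property>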
OK, target located: this is exactly the property we want, which confirms the information really does exist. All that remains is working out how to parse this useful information out.
Analyzing Hive SQL Jobs with a Program
Knowing the data source, the simplest, fastest approach we can think of is to parse the files line by line, do some text matching, and filter out the key information. Anyone can write that code. The tool takes an HDFS directory address as input, namely the JobHistory storage directory plus a concrete date subdirectory; the full parsing program is attached at the end of this article, and a quick usage sketch follows. After the sketch, I list some problems hit while debugging the analysis program; they turned out to be quite instructive.
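For orientation, here is a minimal sketch of how the tool below is driven. The main class and the concrete date path are hypothetical; the constructor signature and the "dateTimeDir" type string match the tool code listed at the end.

    // Hypothetical entry point; only the HiveSqlAnalyseTool API is real.
    // Assumes it sits in the same package as the tool class.
    public class HiveSqlAnalyseMain {
        public static void main(String[] args) {
            // JobHistory done dir plus a concrete date directory (made-up path)
            String jobHistoryPath = "/mr-history/done/2015/10/13";
            // "dateTimeDir" enables the MySQL write in printStatDatas();
            // 10 is the parser thread count
            HiveSqlAnalyseTool tool =
                    new HiveSqlAnalyseTool("dateTimeDir", jobHistoryPath, 10);
            tool.readJobInfoFiles();
        }
    }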
1. Chinese characters in the Hive SQL come out garbled
This is the perennially annoying Java mojibake problem. The SQL may contain Chinese comments, and Hadoop stores Chinese text in UTF-8, so after reading the file data we decode it explicitly as UTF-8, as in the code below.
    ...
    fileSystem = path.getFileSystem(new Configuration());
    in = fileSystem.open(path);
    InputStreamReader isr;
    BufferedReader br;

    isr = new InputStreamReader(in, "UTF-8");
    br = new BufferedReader(isr);
    while ((str = br.readLine()) != null) {
    ...
2. Single-threaded file parsing is too slow
In the test environment the parsing finished instantly, so this never showed up; in the real environment, with tens of thousands of job files, the program immediately choked. Parsing the files and writing the results into MySQL took over 60 minutes in total. I later switched to a multi-threaded design, running with 10 threads, and only then did it get dramatically faster; the shape of the change is sketched below.
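This is an excerpt of the fan-out used in readJobInfoFiles() further down (sketch only; the full classes follow at the end of the article): start threadNum parser threads that drain a shared file list through a synchronized method, then join them all.

    // Inside HiveSqlAnalyseTool: fan out, then wait for every parser.
    ParseThread[] threads = new ParseThread[threadNum];
    for (int i = 0; i < threadNum; i++) {
        threads[i] = new ParseThread(this, fileStatusList, dataInfos);
        threads[i].start();
    }
    for (int i = 0; i < threadNum; i++) {
        try {
            threads[i].join();   // block until this parser finishes
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }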
3. Writing the result data to MySQL is too slow
With parsing speed fixed, the SQL writes became the bottleneck. In one test, 10 parser threads chewed through tens of thousands of files in about 8 minutes, yet inserting the results into the database took around 20 minutes, for only tens of thousands of rows. I switched to batch inserts, but it made no big difference, so this slowness was never really solved; my suspicion is that some rows carry extremely long Hive SQL statements. A sketch of the batch write path follows.
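For reference, "batch" here means the standard JDBC addBatch/executeBatch pattern sketched below. DbClient itself is not listed in this article, so the table name, column layout, and array indices are illustrative assumptions, not its real schema.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.Map;

    public class BatchInsertSketch {
        // Hypothetical sketch of a DbClient-style batch write; schema is made up.
        static void insertDataBatch(Connection conn,
                HashMap<String, String[]> dataInfos) throws SQLException {
            String sql = "INSERT INTO hive_sql_stat "
                    + "(job_id, job_name, username, hive_sql) VALUES (?, ?, ?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (Map.Entry<String, String[]> e : dataInfos.entrySet()) {
                    String[] v = e.getValue();
                    ps.setString(1, v[0]);   // jobId
                    ps.setString(2, v[1]);   // jobName
                    ps.setString(3, v[2]);   // username
                    ps.setString(4, v[3]);   // hive sql, possibly very long
                    ps.addBatch();
                }
                ps.executeBatch();           // one round trip for all rows
            }
        }
    }

Even with batching, a handful of rows whose hive.query.string values run to megabytes can dominate the transfer time, which would fit the symptom described above.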
Below is the main analysis code, split into the tool class and the parsing thread class. All of the code is available here: https://github.com/linyiqun/yarn-jobhistory-crawler/tree/master/jobHiveSqlAnalyse
Main tool code:
package org.apache.hadoop.mapreduce.v2.hs.tool.sqlanalyse;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map.Entry;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.fs.UnsupportedFileSystemException;
import org.apache.hadoop.io.IOUtils;
public class HiveSqlAnalyseTool {
    private int threadNum;
    private String dirType;
    private String jobHistoryPath;

    private FileContext doneDirFc;
    private Path doneDirPrefixPath;

    private LinkedList<FileStatus> fileStatusList;
    private HashMap<String, String[]> dataInfos;
    private DbClient dbClient;

    public HiveSqlAnalyseTool(String dirType, String jobHistoryPath,
            int threadNum) {
        this.threadNum = threadNum;
        this.dirType = dirType;
        this.jobHistoryPath = jobHistoryPath;

        this.dataInfos = new HashMap<String, String[]>();
        this.fileStatusList = new LinkedList<FileStatus>();
        this.dbClient = new DbClient(BaseValues.DB_URL,
                BaseValues.DB_USER_NAME, BaseValues.DB_PASSWORD,
                BaseValues.DB_HIVE_SQL_STAT_TABLE_NAME);

        try {
            doneDirPrefixPath = FileContext.getFileContext(new Configuration())
                    .makeQualified(new Path(this.jobHistoryPath));
            doneDirFc = FileContext.getFileContext(doneDirPrefixPath.toUri());
        } catch (UnsupportedFileSystemException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IllegalArgumentException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
    public void readJobInfoFiles() {
        List<FileStatus> files;

        files = new ArrayList<FileStatus>();
        try {
            files = scanDirectory(doneDirPrefixPath, doneDirFc, files);
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        if (files != null) {
            for (FileStatus fs : files) {
                // parseFileInfo(fs);
            }
            System.out.println("files num is " + files.size());
            System.out
                    .println("fileStatusList size is " + fileStatusList.size());

            ParseThread[] threads;
            threads = new ParseThread[threadNum];
            for (int i = 0; i < threadNum; i++) {
                System.out.println("thread " + i + " start run");
                threads[i] = new ParseThread(this, fileStatusList, dataInfos);
                threads[i].start();
            }

            for (int i = 0; i < threadNum; i++) {
                System.out.println("thread " + i + " join run");
                try {
                    if (threads[i] != null) {
                        threads[i].join();
                    }
                } catch (InterruptedException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
        } else {
            System.out.println("files is null");
        }

        printStatDatas();
    }
    protected List<FileStatus> scanDirectory(Path path, FileContext fc,
            List<FileStatus> jhStatusList) throws IOException {
        path = fc.makeQualified(path);
        System.out.println("dir path is " + path.getName());
        try {
            RemoteIterator<FileStatus> fileStatusIter = fc.listStatus(path);
            while (fileStatusIter.hasNext()) {
                FileStatus fileStatus = fileStatusIter.next();
                Path filePath = fileStatus.getPath();

                if (fileStatus.isFile()) {
                    jhStatusList.add(fileStatus);
                    fileStatusList.add(fileStatus);
                } else if (fileStatus.isDirectory()) {
                    scanDirectory(filePath, fc, jhStatusList);
                }
            }
        } catch (FileNotFoundException fe) {
            System.out.println("Error while scanning directory " + path);
        }

        return jhStatusList;
    }
    private void parseFileInfo(FileStatus fs) {
        String resultStr;
        String str;
        String username;
        String fileType;
        String jobId;
        String jobName;
        String hiveSql;

        int startPos;
        int endPos;
        int hiveSqlFlag;
        long launchTime;
        long finishTime;
        int mapTaskNum;
        int reduceTaskNum;
        String xmlNameFlag;
        String launchTimeFlag;
        String finishTimeFlag;
        String launchMapFlag;
        String launchReduceFlag;

        Path path;
        FileSystem fileSystem;
        InputStream in;

        resultStr = "";
        fileType = "";
        hiveSql = "";
        jobId = "";
        jobName = "";
        username = "";
        hiveSqlFlag = 0;
        launchTime = 0;
        finishTime = 0;
        mapTaskNum = 0;
        reduceTaskNum = 0;
        xmlNameFlag = "<value>";
        launchTimeFlag = "\"launchTime\":";
        finishTimeFlag = "\"finishTime\":";
        launchMapFlag = "\"Launched map tasks\"";
        launchReduceFlag = "\"Launched reduce tasks\"";

        // the file suffix tells us which kind of record this is
        path = fs.getPath();
        str = path.getName();
        if (str.endsWith(".xml")) {
            fileType = "config";
            endPos = str.lastIndexOf("_");
            jobId = str.substring(0, endPos);
        } else if (str.endsWith(".jhist")) {
            fileType = "info";
            endPos = str.indexOf("-");
            jobId = str.substring(0, endPos);
        } else {
            return;
        }

        try {
            fileSystem = path.getFileSystem(new Configuration());
            in = fileSystem.open(path);
            InputStreamReader isr;
            BufferedReader br;

            // read as UTF-8 so Chinese comments in the sql are not garbled
            isr = new InputStreamReader(in, "UTF-8");
            br = new BufferedReader(isr);

            while ((str = br.readLine()) != null) {
                if (str.contains("mapreduce.job.user.name")) {
                    startPos = str.indexOf(xmlNameFlag);
                    endPos = str.indexOf("</value>");

                    username = str.substring(startPos + xmlNameFlag.length(),
                            endPos);
                } else if (str.contains("mapreduce.job.name")) {
                    startPos = str.indexOf(xmlNameFlag);
                    endPos = str.indexOf("</value>");

                    jobName = str.substring(startPos + xmlNameFlag.length(),
                            endPos);
                } else if (str.contains("hive.query.string")) {
                    // the sql value may span several lines; start accumulating
                    hiveSqlFlag = 1;
                    hiveSql = str;
                } else if (hiveSqlFlag == 1) {
                    hiveSql += str;

                    if (str.contains("</value>")) {
                        // closing tag reached, cut the sql text out of the value
                        startPos = hiveSql.indexOf(xmlNameFlag);
                        endPos = hiveSql.indexOf("</value>");
                        hiveSql = hiveSql.substring(
                                startPos + xmlNameFlag.length(), endPos);

                        hiveSqlFlag = 0;
                    }
                } else if (str.startsWith("{\"type\":\"JOB_INITED\"")) {
                    startPos = str.indexOf(launchTimeFlag);
                    str = str.substring(startPos + launchTimeFlag.length());
                    endPos = str.indexOf(",");
                    launchTime = Long.parseLong(str.substring(0, endPos));
                } else if (str.startsWith("{\"type\":\"JOB_FINISHED\"")) {
                    mapTaskNum = parseTaskNum(launchMapFlag, str);
                    reduceTaskNum = parseTaskNum(launchReduceFlag, str);

                    startPos = str.indexOf(finishTimeFlag);
                    str = str.substring(startPos + finishTimeFlag.length());
                    endPos = str.indexOf(",");
                    finishTime = Long.parseLong(str.substring(0, endPos));
                }
            }

            System.out.println("jobId is " + jobId);
            System.out.println("jobName is " + jobName);
            System.out.println("username is " + username);
            System.out.println("map task num is " + mapTaskNum);
            System.out.println("reduce task num is " + reduceTaskNum);
            System.out.println("launchTime is " + launchTime);
            System.out.println("finishTime is " + finishTime);
            System.out.println("hive query sql is " + hiveSql);
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        if (fileType.equals("config")) {
            insertConfParseData(jobId, jobName, username, hiveSql);
        } else if (fileType.equals("info")) {
            insertJobInfoParseData(jobId, launchTime, finishTime, mapTaskNum,
                    reduceTaskNum);
        }
    }
    private void insertConfParseData(String jobId, String jobName,
            String username, String sql) {
        String[] array;
        if (dataInfos.containsKey(jobId)) {
            array = dataInfos.get(jobId);
        } else {
            array = new String[BaseValues.DB_COLUMN_HIVE_SQL_LEN];
        }

        array[BaseValues.DB_COLUMN_HIVE_SQL_JOBID] = jobId;
        array[BaseValues.DB_COLUMN_HIVE_SQL_JOBNAME] = jobName;
        array[BaseValues.DB_COLUMN_HIVE_SQL_USERNAME] = username;
        array[BaseValues.DB_COLUMN_HIVE_SQL_HIVE_SQL] = sql;

        dataInfos.put(jobId, array);
    }
    private void insertJobInfoParseData(String jobId, long launchTime,
            long finishedTime, int mapTaskNum, int reduceTaskNum) {
        String[] array;
        if (dataInfos.containsKey(jobId)) {
            array = dataInfos.get(jobId);
        } else {
            array = new String[BaseValues.DB_COLUMN_HIVE_SQL_LEN];
        }

        array[BaseValues.DB_COLUMN_HIVE_SQL_JOBID] = jobId;
        array[BaseValues.DB_COLUMN_HIVE_SQL_START_TIME] = String
                .valueOf(launchTime);
        array[BaseValues.DB_COLUMN_HIVE_SQL_FINISH_TIME] = String
                .valueOf(finishedTime);
        array[BaseValues.DB_COLUMN_HIVE_SQL_MAP_TASK_NUM] = String
                .valueOf(mapTaskNum);
        array[BaseValues.DB_COLUMN_HIVE_SQL_REDUCE_TASK_NUM] = String
                .valueOf(reduceTaskNum);

        dataInfos.put(jobId, array);
    }
    private int parseTaskNum(String flag, String jobStr) {
        int taskNum;
        int startPos;
        int endPos;
        String tmpStr;

        taskNum = 0;
        tmpStr = jobStr;
        startPos = tmpStr.indexOf(flag);

        if (startPos == -1) {
            return 0;
        }

        tmpStr = tmpStr.substring(startPos + flag.length());
        endPos = tmpStr.indexOf("}");
        tmpStr = tmpStr.substring(0, endPos);
        taskNum = Integer.parseInt(tmpStr.split(":")[1]);

        return taskNum;
    }
    private void printStatDatas() {
        String jobId;
        String jobInfo;
        String[] infos;

        if (dbClient != null) {
            dbClient.createConnection();
        }

        if (dataInfos != null) {
            System.out.println("map data size is " + dataInfos.size());

            if (dbClient != null && dirType.equals("dateTimeDir")) {
                dbClient.insertDataBatch(dataInfos);
            }
        }

        /*for (Entry<String, String[]> entry : this.dataInfos.entrySet()) {
            jobId = entry.getKey();
            infos = entry.getValue();

            jobInfo = String
                    .format("jobId is %s, jobName:%s, usrname:%s, launchTime:%s, finishTime:%s, mapTaskNum:%s, reduceTaskNum:%s, querySql:%s",
                            jobId, infos[1], infos[2], infos[3], infos[4],
                            infos[5], infos[6], infos[7]);
            // System.out.println("job detail info " + jobInfo);
            if (dbClient != null && dirType.equals("dateTimeDir")) {
                dbClient.insertHiveSqlStatData(infos);
            }
        }*/

        if (dbClient != null) {
            dbClient.closeConnection();
        }
    }
    public synchronized FileStatus getOneFile() {
        FileStatus fs;

        fs = null;
        if (fileStatusList != null && fileStatusList.size() > 0) {
            fs = fileStatusList.poll();
        }

        return fs;
    }
    public synchronized void addDataToMap(String jobId, String[] values) {
        if (dataInfos != null) {
            dataInfos.put(jobId, values);
        }
    }
}
Parsing thread code, ParseThread.java:
package org.apache.hadoop.mapreduce.v2.hs.tool.sqlanalyse;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.LinkedList;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class ParseThread extends Thread {
    private HiveSqlAnalyseTool tool;
    private LinkedList<FileStatus> fileStatus;
    private HashMap<String, String[]> dataInfos;

    public ParseThread(HiveSqlAnalyseTool tool,
            LinkedList<FileStatus> fileStatus,
            HashMap<String, String[]> dataInfos) {
        this.tool = tool;
        this.fileStatus = fileStatus;
        this.dataInfos = dataInfos;
    }
    @Override
    public void run() {
        FileStatus fs;

        while (fileStatus != null && !fileStatus.isEmpty()) {
            // pull one file at a time from the shared list
            fs = tool.getOneFile();
            if (fs == null) {
                // another thread drained the list between the check and the poll
                break;
            }
            parseFileInfo(fs);
        }

        super.run();
    }
    private void parseFileInfo(FileStatus fs) {
        String str;
        String username;
        String fileType;
        String jobId;
        String jobName;
        String hiveSql;

        int startPos;
        int endPos;
        int hiveSqlFlag;
        long launchTime;
        long finishTime;
        int mapTaskNum;
        int reduceTaskNum;
        String xmlNameFlag;
        String launchTimeFlag;
        String finishTimeFlag;
        String launchMapFlag;
        String launchReduceFlag;

        Path path;
        FileSystem fileSystem;
        InputStream in;

        fileType = "";
        hiveSql = "";
        jobId = "";
        jobName = "";
        username = "";
        hiveSqlFlag = 0;
        launchTime = 0;
        finishTime = 0;
        mapTaskNum = 0;
        reduceTaskNum = 0;
        xmlNameFlag = "<value>";
        launchTimeFlag = "\"launchTime\":";
        finishTimeFlag = "\"finishTime\":";
        launchMapFlag = "\"Launched map tasks\"";
        launchReduceFlag = "\"Launched reduce tasks\"";

        path = fs.getPath();
        str = path.getName();
        if (str.endsWith(".xml")) {
            fileType = "config";
            endPos = str.lastIndexOf("_");
            jobId = str.substring(0, endPos);
        } else if (str.endsWith(".jhist")) {
            fileType = "info";
            endPos = str.indexOf("-");
            jobId = str.substring(0, endPos);
        } else {
            return;
        }

        try {
            fileSystem = path.getFileSystem(new Configuration());
            in = fileSystem.open(path);
            InputStreamReader isr;
            BufferedReader br;

            isr = new InputStreamReader(in, "UTF-8");
            br = new BufferedReader(isr);

            while ((str = br.readLine()) != null) {
                if (str.contains("mapreduce.job.user.name")) {
                    startPos = str.indexOf(xmlNameFlag);
                    endPos = str.indexOf("</value>");

                    username = str.substring(startPos + xmlNameFlag.length(),
                            endPos);
                } else if (str.contains("mapreduce.job.name")) {
                    startPos = str.indexOf(xmlNameFlag);
                    endPos = str.indexOf("</value>");

                    jobName = str.substring(startPos + xmlNameFlag.length(),
                            endPos);
                } else if (str.contains("hive.query.string")) {
                    hiveSqlFlag = 1;
                    hiveSql = str;
                } else if (hiveSqlFlag == 1) {
                    hiveSql += str;

                    if (str.contains("</value>")) {
                        startPos = hiveSql.indexOf(xmlNameFlag);
                        endPos = hiveSql.indexOf("</value>");
                        hiveSql = hiveSql.substring(
                                startPos + xmlNameFlag.length(), endPos);

                        hiveSqlFlag = 0;
                    }
                } else if (str.startsWith("{\"type\":\"JOB_INITED\"")) {
                    startPos = str.indexOf(launchTimeFlag);
                    str = str.substring(startPos + launchTimeFlag.length());
                    endPos = str.indexOf(",");
                    launchTime = Long.parseLong(str.substring(0, endPos));
                } else if (str.startsWith("{\"type\":\"JOB_FINISHED\"")) {
                    mapTaskNum = parseTaskNum(launchMapFlag, str);
                    reduceTaskNum = parseTaskNum(launchReduceFlag, str);

                    startPos = str.indexOf(finishTimeFlag);
                    str = str.substring(startPos + finishTimeFlag.length());
                    endPos = str.indexOf(",");
                    finishTime = Long.parseLong(str.substring(0, endPos));
                }
            }

            /*System.out.println("jobId is " + jobId);
            System.out.println("jobName is " + jobName);
            System.out.println("username is " + username);
            System.out.println("map task num is " + mapTaskNum);
            System.out.println("reduce task num is " + reduceTaskNum);
            System.out.println("launchTime is " + launchTime);
            System.out.println("finishTime is " + finishTime);
            System.out.println("hive query sql is " + hiveSql);*/
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        if (fileType.equals("config")) {
            insertConfParseData(jobId, jobName, username, hiveSql);
        } else if (fileType.equals("info")) {
            insertJobInfoParseData(jobId, launchTime, finishTime, mapTaskNum,
                    reduceTaskNum);
        }
    }
    private void insertConfParseData(String jobId, String jobName,
            String username, String sql) {
        String[] array;
        if (dataInfos.containsKey(jobId)) {
            array = dataInfos.get(jobId);
        } else {
            array = new String[BaseValues.DB_COLUMN_HIVE_SQL_LEN];
        }

        array[BaseValues.DB_COLUMN_HIVE_SQL_JOBID] = jobId;
        array[BaseValues.DB_COLUMN_HIVE_SQL_JOBNAME] = jobName;
        array[BaseValues.DB_COLUMN_HIVE_SQL_USERNAME] = username;
        array[BaseValues.DB_COLUMN_HIVE_SQL_HIVE_SQL] = sql;

        tool.addDataToMap(jobId, array);
    }
    private void insertJobInfoParseData(String jobId, long launchTime,
            long finishedTime, int mapTaskNum, int reduceTaskNum) {
        String[] array;
        if (dataInfos.containsKey(jobId)) {
            array = dataInfos.get(jobId);
        } else {
            array = new String[BaseValues.DB_COLUMN_HIVE_SQL_LEN];
        }

        array[BaseValues.DB_COLUMN_HIVE_SQL_JOBID] = jobId;
        array[BaseValues.DB_COLUMN_HIVE_SQL_START_TIME] = String
                .valueOf(launchTime);
        array[BaseValues.DB_COLUMN_HIVE_SQL_FINISH_TIME] = String
                .valueOf(finishedTime);
        array[BaseValues.DB_COLUMN_HIVE_SQL_MAP_TASK_NUM] = String
                .valueOf(mapTaskNum);
        array[BaseValues.DB_COLUMN_HIVE_SQL_REDUCE_TASK_NUM] = String
                .valueOf(reduceTaskNum);

        tool.addDataToMap(jobId, array);
    }
    private int parseTaskNum(String flag, String jobStr) {
        int taskNum;
        int startPos;
        int endPos;
        String tmpStr;

        taskNum = 0;
        tmpStr = jobStr;
        startPos = tmpStr.indexOf(flag);

        if (startPos == -1) {
            return 0;
        }

        tmpStr = tmpStr.substring(startPos + flag.length());
        endPos = tmpStr.indexOf("}");
        tmpStr = tmpStr.substring(0, endPos);
        taskNum = Integer.parseInt(tmpStr.split(":")[1]);

        return taskNum;
    }
}