Handling a Linux "cannot allocate memory" error and a Hive job "Invalid sync!" failure

Recently, colleagues on the analytics team reported that Hive jobs processing the raw data kept failing to complete. The error is below; the root cause turned out to be incomplete Avro files:

Diagnostic Messages for this Task:

Error: java.io.IOException: java.io.IOException: org.apache.avro.AvroRuntimeException:java.io.IOException: Invalid sync!
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:273)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:183)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:352)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:115)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:271)
... 11 more
Caused by: org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:210)
at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.next(AvroGenericRecordReader.java:149)
at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.next(AvroGenericRecordReader.java:52)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:347)
... 15 more
Caused by: java.io.IOException: Invalid sync!
at org.apache.avro.file.DataFileStream.nextRawBlock(DataFileStream.java:293)
at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:198)
... 18 more

Checking the logs of the recently failed jobs showed the server running out of memory:

Log Type: syslog
Log Length: 18946

2015-12-27 13:30:44,516 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-12-27 13:30:44,540 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: Sink ganglia started
2015-12-27 13:30:44,601 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-12-27 13:30:44,601 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2015-12-27 13:30:44,609 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2015-12-27 13:30:44,609 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1451036614992_0057, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@afb3f4c)
2015-12-27 13:30:44,670 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2015-12-27 13:30:44,907 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /diskb/hadoop/yarn/local/usercache/hdfs/appcache/application_1451036614992_0057,/diskc/hadoop/yarn/local/usercache/hdfs/appcache/application_1451036614992_0057
2015-12-27 13:30:45,345 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2015-12-27 13:30:45,669 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
2015-12-27 13:30:46,003 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: hdfs://BeiJing/data/raw/click/2015122710/http-topic.avro.192.168.2.12.avro:1342177280+47143758
2015-12-27 13:30:46,223 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000007efc80000, 272105472, 0) failed;error='无法分配内存' (errno=12)

The diagnosis: the Hadoop nodes run both a YARN NodeManager and a Storm supervisor, and the supervisor hosts a large number of workers. When jobs run, the combined memory use exhausts the node, so the JVM's os::commit_memory call fails ('无法分配内存' is the Chinese-locale text for "Cannot allocate memory", errno 12).
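Before blaming any one service for an errno 12, it is worth confirming the memory picture on the suspect node. A minimal check with standard Linux tools (nothing cluster-specific assumed):

```shell
# Overall memory usage on the node, in MB
free -m

# The counters that matter for commit failures: when Committed_AS
# approaches CommitLimit, new JVM heap reservations start failing
grep -E 'MemTotal|MemAvailable|CommitLimit|Committed_AS' /proc/meminfo
```

Running this while the Hive maps and the Storm workers are both active shows whether the node is genuinely oversubscribed.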

The fix: reduce the worker count from 16 to 10 and restart the Storm service. Sometimes after a restart the worker count stays at the old value, so go to the node, delete all the workers directly, and then start the supervisor again; after that it is fine.
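In Storm, the number of workers a supervisor can host is capped by the slot list in storm.yaml: one port, one worker slot. A sketch of the change, trimming 16 slots to 10 (the port numbers are illustrative; match them to your existing config):

```yaml
# storm.yaml on each supervisor node: each entry is one worker slot.
# Trimming the list from 16 ports to 10 caps worker memory use.
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
    - 6704
    - 6705
    - 6706
    - 6707
    - 6708
    - 6709
```

After editing, stop the supervisor, clear any stale worker state under storm.local.dir on the node if old workers survive the restart, and start the supervisor again.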

After observing for a while, the Hive jobs no longer fail. My read of the chain of events: while the raw-data job was writing Avro data, memory pressure on the node left some of the Avro files written to HDFS incomplete, so Hive then failed on them with "Invalid sync!".
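To find which files are actually damaged before rerunning a job, you can walk an Avro object container file and compare every block's 16-byte sync marker against the one declared in the header; a mismatch or a truncated tail is exactly what DataFileStream reports as "Invalid sync!". A stdlib-only sketch (the function name is mine; for HDFS files you would first `hdfs dfs -get` them locally or adapt this to a stream):

```python
import io
import os

MAGIC = b"Obj\x01"  # Avro object container file magic

def read_long(f):
    """Decode one Avro zig-zag varint long from the stream."""
    n, shift = 0, 0
    while True:
        b = f.read(1)
        if not b:
            raise EOFError("unexpected end of file")
        n |= (b[0] & 0x7F) << shift
        if not (b[0] & 0x80):
            break
        shift += 7
    return (n >> 1) ^ -(n & 1)

def skip_bytes_field(f):
    """Skip one length-prefixed Avro 'bytes'/'string' value."""
    f.seek(read_long(f), io.SEEK_CUR)

def check_avro_sync(path):
    """Return True if every block's sync marker matches the header's;
    False on mismatch or truncation (the 'Invalid sync!' condition)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            return False
        # Header metadata: a map of string -> bytes, written in
        # count-prefixed chunks and terminated by a zero count.
        try:
            count = read_long(f)
            while count != 0:
                if count < 0:          # negative count: a byte-size long follows
                    read_long(f)
                    count = -count
                for _ in range(count):
                    skip_bytes_field(f)  # key
                    skip_bytes_field(f)  # value
                count = read_long(f)
            sync = f.read(16)
            # Each data block: record count, byte length, payload, sync.
            while f.tell() < size:
                read_long(f)                     # records in block (unused)
                f.seek(read_long(f), io.SEEK_CUR)  # skip serialized payload
                if f.read(16) != sync:
                    return False
        except EOFError:
            return False
    return True
```

A file that fails this check can be quarantined out of the Hive table's input path so the remaining partitions still process cleanly.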
