2020-11-20

Spark log analysis

Error messages

  1. org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
  2. org.apache.spark.SparkException: Failed to get broadcast_657_piece0 of broadcast_657

I. Full details of the first error

20/11/19 09:03:31 INFO RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over yp-tyhj-apollo4200-7227/10.11.122.227:8020. Trying to failover immediately.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1956)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1376)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2954)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1106)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:876)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:455)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:852)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:795)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1961)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2494)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1495)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:235)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:796)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
at com.sun.proxy.$Proxy18.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1704)
at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1454)
at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1451)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1466)
at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:385)
at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:371)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:252)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:99)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:85)
at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:246)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:245)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:203)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:94)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:105)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
20/11/19 09:03:31 INFO RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over yp-tyhj-apollo4200-7227/10.11.122.227:8020. Trying to failover immediately.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1956)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1376)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2954)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1106)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:876)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:455)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:852)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:795)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1961)

Error analysis: the job was issuing reads against the standby NameNode on node 227 (10.11.122.227). In a properly configured HA client such a read should simply fail over to the active NameNode, as the configuration sketch below illustrates.
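For context, an HDFS HA client never pins a single NameNode address; it resolves the active node through a failover proxy provider. The following is a minimal Scala sketch of that client-side setup, assuming a hypothetical nameservice id mycluster and a placeholder second hostname (only node 227's address comes from the log above):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    object HaClientSketch {
      def main(args: Array[String]): Unit = {
        val conf = new Configuration()
        // Hypothetical nameservice id and second hostname; replace with real values.
        conf.set("fs.defaultFS", "hdfs://mycluster")
        conf.set("dfs.nameservices", "mycluster")
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2")
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "yp-tyhj-apollo4200-7227:8020")
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "other-namenode-host:8020")
        // The proxy provider is what retries against the other NameNode when one
        // answers with StandbyException, instead of failing the read outright.
        conf.set("dfs.client.failover.proxy.provider.mycluster",
          "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider")

        val fs = FileSystem.get(conf)
        // Resolves against whichever NameNode is currently active.
        println(fs.getFileStatus(new Path("/")))
      }
    }

With this in place, the retries seen in the log only turn into job failures when neither NameNode becomes active, which is exactly what happened here.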
So why did the job keep ending up on node 227? The ZKFC log shows the following errors:

2020-06-11 03:20:35,649 WARN ha.HealthMonitor (HealthMonitor.java:doHealthChecks(211)) - Transport-level exception trying to monitor health of NameNode at YP-TYHJ-APOLLO4200-7227/10.11.122.227:8070: java.net.ConnectException: Connection refused Call From YP-TYHJ-APOLLO4200-7227/10.11.122.227 to YP-TYHJ-APOLLO4200-7227:8070 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2020-06-11 03:20:37,650 INFO ipc.Client (Client.java:handleConnectionFailure(937)) - Retrying connect to server: YP-TYHJ-APOLLO4200-7227/10.11.122.227:8070. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
2020-06-11 03:20:37,650 WARN ipc.Client (Client.java:handleConnectionFailure(919)) - Failed to connect to server: YP-TYHJ-APOLLO4200-7227/10.11.122.227:8070: retries get failed due to exceeded maximum allowed retries number: 1
java.net.ConnectException: Connection refused

at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:687)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:783)
at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:415)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1556)
at org.apache.hadoop.ipc.Client.call(Client.java:1387)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:235)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy9.getServiceStatus(Unknown Source)
at org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.getServiceStatus(HAServiceProtocolClientSideTranslatorPB.java:122)
at org.apache.hadoop.ha.HealthMonitor.doHealthChecks(HealthMonitor.java:202)
at org.apache.hadoop.ha.HealthMonitor.access$600(HealthMonitor.java:49)
at org.apache.hadoop.ha.HealthMonitor$MonitorDaemon.run(HealthMonitor.java:297)
2020-06-11 03:20:37,651 WARN ha.HealthMonitor (HealthMonitor.java:doHealthChecks(211)) - Transport-level exception trying to monitor health of NameNode at YP-TYHJ-APOLLO4200-7227/10.11.122.227:8070: java.net.ConnectException: Connection refused Call From YP-TYHJ-APOLLO4200-7227/10.11.122.227 to YP-TYHJ-APOLLO4200-7227:8070 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2020-06-11 03:20:39,652 INFO ipc.Client (Client.java:handleConnectionFailure(937)) - Retrying connect to server: YP-TYHJ-APOLLO4200-7227/10.11.122.227:8070. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)

The key lines above (bold in the original log excerpt) mean that node 227 could not be reached at all; which NameNode is actually active can be double-checked with hdfs haadmin -getServiceState <serviceId>.
Root-cause analysis: the NameNode host was probably running low on memory, which left the ZKFC process hung so that it could not elect an active NameNode properly.
Why would the NameNode run short of memory?
The NameNode alerts on the hc cluster showed an excessive number of small files. [Screenshots of the small-file alerts omitted.]
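Small files hurt because the NameNode keeps every file, directory, and block as an in-memory object (commonly estimated at roughly 150 bytes each), so millions of tiny files translate directly into heap pressure. Below is a hedged sketch of the usual mitigation, compacting a small-file directory with Spark; the paths, nameservice, and target file count are hypothetical:

    import org.apache.spark.sql.SparkSession

    object CompactSmallFiles {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("compact-small-files")
          .getOrCreate()

        // Hypothetical input/output paths; adjust to the real table location.
        val input  = "hdfs://mycluster/warehouse/events/dt=2020-11-19"
        val output = "hdfs://mycluster/warehouse/events_compacted/dt=2020-11-19"

        // coalesce() merges many small input splits into a handful of output
        // files, cutting the number of inodes and blocks the NameNode must
        // keep in heap.
        spark.read.parquet(input)
          .coalesce(16) // pick the count so each output file is ~128-256 MB
          .write.mode("overwrite").parquet(output)

        spark.stop()
      }
    }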

II. Full details of the second error

20/11/19 07:03:44 INFO MapOutputTrackerWorker: Don't have map outputs for shuffle 325, fetching them
20/11/19 07:03:44 INFO MapOutputTrackerWorker: Don't have map outputs for shuffle 325, fetching them
20/11/19 07:03:45 INFO TorrentBroadcast: Started reading broadcast variable 657
20/11/19 07:03:45 ERROR Utils: Exception encountered
org.apache.spark.SparkException: Failed to get broadcast_657_piece0 of broadcast_657

at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:178)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:150)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:150)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:150)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:222)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.MapOutputTracker$$anonfun$deserializeMapStatuses$1.apply(MapOutputTracker.scala:663)
at org.apache.spark.MapOutputTracker$$anonfun$deserializeMapStatuses$1.apply(MapOutputTracker.scala:663)
at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
at org.apache.spark.MapOutputTracker$.logInfo(MapOutputTracker.scala:600)
at org.apache.spark.MapOutputTracker$.deserializeMapStatuses(MapOutputTracker.scala:662)
at org.apache.spark.MapOutputTracker.getStatuses(MapOutputTracker.scala:205)
at org.apache.spark.MapOutputTracker.getMapSizesByExecutorId(MapOutputTracker.scala:144)
at org.apache.spark.shuffle.BlockStoreShuffleReader.read(BlockStoreShuffleReader.scala:49)
at org.apache.spark.sql.execution.ShuffledRowRDD.compute(ShuffledRowRDD.scala:165)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
20/11/19 07:03:45 INFO MapOutputTrackerWorker: Doing the fetch; tracker endpoint = NettyRpcEndpointRef(spark://MapOutputTracker@10.11.123.112:55603)
20/11/19 07:03:45 ERROR Executor: Exception in task 51.0 in stage 395.0 (TID 210587)
java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_657_piece0 of broadcast_657
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1310)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.MapOutputTracker$$anonfun$deserializeMapStatuses$1.apply(MapOutputTracker.scala:663)
at org.apache.spark.MapOutputTracker$$anonfun$deserializeMapStatuses$1.apply(MapOutputTracker.scala:663)
at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
at org.apache.spark.MapOutputTracker$.logInfo(MapOutputTracker.scala:600)
at org.apache.spark.MapOutputTracker$.deserializeMapStatuses(MapOutputTracker.scala:662)

This error is puzzling. It is usually attributed to one SparkContext being stopped while another is started, which deletes the old context's broadcast blocks while running tasks still need them. But this code creates only a single SparkContext object. The problem is still unsolved; hints from fellow developers are very welcome, thanks!
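If a second context is even a possibility (for example, in shared utility code that builds its own session), one defensive pattern is to always obtain the session through getOrCreate rather than constructing contexts directly. A minimal sketch, not a confirmed fix for this particular job:

    import org.apache.spark.sql.SparkSession

    object SingleContextSketch {
      def main(args: Array[String]): Unit = {
        // getOrCreate() returns the already-running session instead of starting
        // a second SparkContext. Stopping an old context deletes its broadcast
        // blocks (e.g. broadcast_657_piece0) out from under tasks that still
        // reference them, which is the failure mode this error is usually
        // attributed to.
        val spark = SparkSession.builder()
          .appName("spark-log-analysis")
          .getOrCreate()

        val sc = spark.sparkContext // the single shared SparkContext
        println(sc.applicationId)
        spark.stop()
      }
    }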
