namenode gc导致的故障一例

最新推荐文章于 2024-01-16 09:28:38 发布

mnasd

最新推荐文章于 2024-01-16 09:28:38 发布

阅读量3.1k

点赞数

分类专栏： hadoop gc故障

hadoop 同时被 2 个专栏收录

54 篇文章 3 订阅

订阅专栏

gc故障

7 篇文章 0 订阅

订阅专栏

故障现象：

6点1 分左右开始， Hadoop集群异常，所有的hdfs操作都出现问题。

几十个 job报以下错

 
        FAILED: 
       
        RuntimeException  
       
        org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
       
        Operation category WRITE is not supported in state standby 
       
        或者： FAILED: 
       
        RuntimeException  
       
        org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
       
        Operation category READ is not supported in state standby

故障原因分析:

1)看nn的日志，可以发现是写jn导致的，怀疑是网络问题导致，由于是5台jn，如果是网络问题的话，根据major的算法，需要3台写入失败才会导致出现问题，虽然jn和nn有些机器不在同一个交换机下，但是，从监控上来看，网络没有异常，排除网络问题

2)仔细查看nn的日志，和对源码的分析，发现是由于 namenode这个时间点进行了一次时间比较长的 full gc （96s ），导致写 journalnode 超时（默认是20s）， namenode进程退出。同时从jn的日志可以看出，后面是没有nn的请求过来的（因为nn已经挂掉了）。

Gc日志：

 
   
        2014 
        - 
        05 
        -25T05: 
        59 
        : 
        53.578  
        + 
        0800 
        : 
       
 
          
        2478553.316 
        : [Full GC [PSYoungGen: 2784K->0K(4179328K)] [PSOldGen:  
       
 
        28566944K->9208608K(28573696K)] 28569728K->9208608K(32753024K) 
       
 
        [PSPermGen: 41405K->41405K(524288K)],  
        95.6706740  
        secs] [Times: user= 
        95.62  
        sys= 
        0.00 
        , real=  
        95.67  
        secs] 
       
 
        Heap 
       
 
        PSYoungGen      total 4179328K, used 87037K [ 
        0x00007f70e5000000 
        ,  
        0x00007f71e5000000 
        ,  
        0x00007f71e5000000 
        ) 
       
 
           
        eden space 4164800K,  
        2 
        % used [ 
        0x00007f70e5000000 
        , 
        0x00007f70ea4ff590 
        , 
        0x00007f71e3330000 
        ) 
       
 
           
        from space 14528K,  
        0 
        % used [ 
        0x00007f71e3330000 
        , 
        0x00007f71e3330000 
        , 
        0x00007f71e4160000 
        ) 
       
 
           
        to   space 14336K,  
        0 
        % used [ 
        0x00007f71e4200000 
        , 
        0x00007f71e4200000 
        , 
        0x00007f71e5000000 
        ) 
       
 
        PSOldGen        total 28573696K, used 9208608K [ 
        0x00007f6a15000000 
        ,  
        0x00007f70e5000000 
        ,  
        0x00007f70e5000000 
        ) 
       
 
           
        object space 28573696K,  
        32 
        % used [ 
        0x00007f6a15000000 
        , 
        0x00007f6c470c8078 
        , 
        0x00007f70e5000000 
        ) 
       
 
        PSPermGen       total 524288K, used 41466K [ 
        0x00007f69f5000000 
        ,  
        0x00007f6a15000000 
        ,  
        0x00007f6a15000000 
        ) 
       
 
           
        object space 524288K,  
        7 
        % used [ 
        0x00007f69f5000000 
        , 
        0x00007f69f787e8f8 
        , 
        0x00007f6a15000000 
        ) 
       
 
 

jn日志：

 
   
        2014 
        - 
        05 
        - 
        25  
        05 
        : 
        58 
        : 
        45 
        , 
        432  
        INFO  
       
 
        org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing  
       
 
        edits file  
       
 
        /home/vipshop/hard_disk/nn_shared_disk/bipcluster/current/edits_inprogress_0000000001665741687 
       
 
          
        ->  
       
 
        /home/vipshop/hard_disk/nn_shared_disk/bipcluster/current/edits_0000000001665741687- 
        0000000001665777062 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        44 
        : 
        54 
        , 
        299  
        ERROR org.apache.hadoop.hdfs.qjournal.server.JournalNode: RECEIVED SIGNAL  
        15 
        : SIGTERM 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        44 
        : 
        54 
        , 
        302  
        INFO org.apache.hadoop.hdfs.qjournal.server.JournalNode: SHUTDOWN_MSG: 
       
 
 

nn错误日志：

 
   
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        258  
        WARN  
       
 
        org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited  
       
 
        96275  
        ms (timeout= 
        20000  
        ms)  
        for  
        a response  
        for  
        sendEdits. No responses  
       
 
        yet. 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        259  
        WARN  
       
 
        org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host  
       
 
        machine (eg GC): pause of approximately 96145ms 
       
 
        GC pool  
        'PS MarkSweep'  
        had collection(s): count= 
        1  
        time=95670ms 
       
 
        GC pool  
        'PS Scavenge'  
        had collection(s): count= 
        1  
        time=592ms 
       
 
        2014 
        - 
        05 
        - 
        25 
       
 
          
        06 
        : 
        01 
        : 
        29 
        , 
        259  
        WARN  
       
 
        org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took  
       
 
        96281ms to send a batch of  
        2  
        edits ( 
        358  
        bytes) to remote journal  
       
 
        xxx: 
        8485 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        259  
        WARN  
       
 
        org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took  
       
 
        96283ms to send a batch of  
        2  
        edits ( 
        358  
        bytes) to remote journal  
       
 
        xxx: 
        8485 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        259  
        INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
       
 
        Number of transactions:  
        20019 
       
 
        Total time  
        for  
        transactions(ms):  
        436 
       
 
        Number of transactions batched in Syncs:  
        9441 
       
 
        Number of syncs:  
        5160 
       
 
        SyncTimes(ms):  
        41623  
        9305 
       
 
        2014 
        - 
        05 
        - 
        25 
       
 
          
        06 
        : 
        01 
        : 
        29 
        , 
        259  
        WARN  
       
 
        org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took  
       
 
        96275ms to send a batch of  
        13  
        edits ( 
        1638  
        bytes) to remote journal  
       
 
        xxx: 
        8485 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        259  
        FATAL  
       
 
        org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed  
       
 
        for  
        required journal (JournalAndStream(mgr=QJM to [xxx: 
        8485 
        , xxx: 
        8485 
        ,  
       
 
        xxx: 
        8485 
        , xxx: 
        8485 
        , xxx: 
        8485 
        ], stream=QuorumOutputStream starting at  
       
 
        txid  
        1665777063 
        )) 
       
 
        java.io.IOException: Timed out waiting 20000ms  
        for  
        a quorum of nodes to respond. 
       
 
                 
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java: 
        137 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java: 
        107 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java: 
        113 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java: 
        107 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$ 
        8 
        .apply(JournalSet.java: 
        490 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java: 
        350 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$ 
        100 
        (JournalSet.java: 
        53 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java: 
        486 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java: 
        581 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java: 
        1879 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java: 
        1845 
        ) 
       
 
                 
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java: 
        439 
        ) 
       
 
                 
       
 
          
        at  
       
 
        org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java: 
        207 
        ) 
       
 
                 
       
 
          
        at  
       
 
        org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$ 
        2 
        .callBlockingMethod(ClientNamenodeProtocolProtos.java: 
        44942 
        ) 
       
 
                 
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java: 
        453 
        ) 
       
 
                 
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java: 
        1002 
        ) 
       
 
                 
        at org.apache.hadoop.ipc.Server$Handler$ 
        1 
        .run(Server.java: 
        1752 
        ) 
       
 
                 
        at org.apache.hadoop.ipc.Server$Handler$ 
        1 
        .run(Server.java: 
        1748 
        ) 
       
 
                 
        at java.security.AccessController.doPrivileged(Native Method) 
       
 
                 
        at javax.security.auth.Subject.doAs(Subject.java: 
        396 
        ) 
       
 
                 
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java: 
        1438 
        ) 
       
 
                 
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java: 
        1746 
        ) 
       
 
        2014 
        - 
        05 
        - 
        25 
       
 
          
        06 
        : 
        01 
        : 
        29 
        , 
        259  
        WARN  
       
 
        org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took  
       
 
        96276ms to send a batch of  
        13  
        edits ( 
        1638  
        bytes) to remote journal  
       
 
        10.201 
        . 
        202.21 
        : 
        8485 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        263  
        INFO  
       
 
        org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock:  
       
 
        /tmp/hive-hdfs/hive_2014- 
        05 
        -25_05- 
        59 
        -16_296_5866222407565920996/_task_tmp.-ext- 
        10002 
        /_tmp.001379_0. 
       
 
          
        BP- 
        180494678 
        -xxx- 
        1366627257763  
       
 
        blk_-3717787322078480343_437088977{blockUCState=UNDER_CONSTRUCTION,  
       
 
        primaryNodeIndex=- 
        1 
        , replicas=[ReplicaUnderConstruction[xxx: 
        50010 
        |RBW],  
       
 
        ReplicaUnderConstruction[xxx: 
        50010 
        |RBW],  
       
 
        ReplicaUnderConstruction[xxx: 
        50010 
        |RBW]]} 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        263  
       
 
        WARN  
       
 
        org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault: 
       
 
          
        Not able to place enough replicas, still in need of  
        2  
        to reach  
        3 
       
 
        For more information, please enable DEBUG log level on org.apache.commons.logging.impl.Log4JLogger 
       
 
        2014 
        - 
        05 
        - 
        25 
       
 
          
        06 
        : 
        01 
        : 
        29 
        , 
        263  
        WARN  
       
 
        org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting  
       
 
        QuorumOutputStream starting at txid  
        1665777063 
       
 
        ...... 
       
 
        2014 
        - 
        05 
        - 
        25  
       
 
        06 
        : 
        01 
        : 
        29 
        , 
        268  
        WARN  
       
 
        org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took  
       
 
        96285ms to send a batch of  
        13  
        edits ( 
        1638  
        bytes) to remote journal  
       
 
        xxx: 
        8485 
       
 
        ...... 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        324  
        INFO org.apache.hadoop.util.ExitUtil: Exiting with status  
        1 
       
 
        2014 
        - 
        05 
        - 
        25  
        06 
        : 
        01 
        : 
        29 
        , 
        364  
        INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 
       
 
 

在写journalnode超时时，触发了 ExitUtil类的terminate 方法，终止当前的进程：

JournalSet类中：

 
        for  
        (JournalAndStream jas :  journals) { 
       
        try  
        { 
       
        closure.apply(jas); 
       
        }   
        catch  
        (Throwable t) { 
       
        if  
        (jas.isRequired()) { 
       
        final  
        String msg =  
        "Error: "   
        + status +   
        " failed for required journal (" 
       
        + jas +   
        ")"  
        ; 
       
        LOG .fatal(msg, t); 
       
        // If we fail on *any* of the required journals, then we must not 
       
        // continue on any of the other journals. Abort them to ensure that 
       
        // retry behavior doesn't allow them to keep going in any way. 
       
        abortAllJournals(); 
       
        // the current policy is to shutdown the NN on errors to shared edits 
       
        // dir  . There are many code paths to shared edits failures - syncs , 
       
        // roll of edits etc. All of them go through this common function 
       
        // where the isRequired() check is made. Applying exit policy here 
       
        // to catch all code paths. 
       
        terminate( 
        1 
        , msg); 
       
        }   
        else  
        { 
       
        LOG .error( 
        "Error: "   
        + status +   
        " failed for (journal "  
        + jas +  
        ")"   
        , t); 
       
        badJAS.add(jas);          
       
        } 
       
        } 
       
        }

ExitUtil类的terminate方法，调用了System.exit方法：

 
        /** 
       
        * Terminate the current process. Note that terminate is the *only* method 
       
        * that should be used to terminate the daemon processes. 
       
        * @param status exit code 
       
        * @param msg message used to create the ExitException 
       
        * @throws ExitException if System.exit is disabled for test purposes 
       
        */ 
       
        public  
        static  
        void  
        terminate( 
        int  
        status, String msg)  
        throws  
        ExitException { 
       
        LOG.info(  
        "Exiting with status "  
        + status); 
       
        if  
        (systemExitDisabled) { 
       
        ExitException ee =  
        new  
        ExitException(status, msg); 
       
        LOG.fatal(  
        "Terminate called" 
        , ee); 
       
        if  
        ( 
        null  
        == firstExitException) { 
       
        firstExitException = ee; 
       
        } 
       
        throw  
        ee; 
       
        } 
       
        System.exit(status); 
       
        }

处理方法：

重启集群，在 6:50左右恢复正常

后续解决方法：

1）调节journalnode 的写入超时时间

 
        dfs.qjournal.write-txns.timeout.ms

2）调整namenode 的java参数，提前触发 full gc，这样full gc 的时间就会小一些。

3）默认namenode的fullgc方式是parallel gc，是stw模式的，更改为cms的格式。调整namenode的启动参数：

 
        -XX:+UseCompressedOops  
       
        -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled  
       
        -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction= 
        0  
       
        -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC  
       
        -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction= 
        75 
       
        -XX:SoftRefLRUPolicyMSPerMB= 
        0

另外一个比较坑的是flume不会自动恢复，需要重启。。

否则会一直报java.lang.InterruptedException: sleep interrupted 错误。

mnasd

关注

0
点赞
踩
2

收藏

觉得还不错? 一键收藏
0
评论
namenode gc导致的故障一例

故障现象：6点1 分左右开始， Hadoop集群异常，所有的hdfs操作都出现问题。几十个 job报以下错12345678FAILED: RuntimeException org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not sup...
复制链接

扫一扫

专栏目录