Error when shutting down the Hadoop cluster

1. The error is as follows:
[root@server4 sbin]# ./stop-yarn.sh 
stopping yarn daemons
no resourcemanager to stop
server5: no nodemanager to stop
server6: no nodemanager to stop
server4: no nodemanager to stop
no proxyserver to stop
[root@server4 sbin]# ./stop-dfs.sh 
Stopping namenodes on [server4]
server4: no namenode to stop
server5: no datanode to stop
server6: no datanode to stop
server4: no datanode to stop
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: no secondarynamenode to stop
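
The "no ... to stop" messages only mean that the stop scripts could not find a pid file for each daemon; the daemons themselves are usually still running. A minimal way to confirm this on each node, assuming the JDK's jps tool is on the PATH:

# list the Java daemons still running on this node; you would expect to still see
# NameNode/DataNode/SecondaryNameNode and ResourceManager/NodeManager listed
# even after running the "stop" commands above
jps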

Take a look at the stop script:

[root@server4 sbin]# cat -n yarn-daemon.sh 
     1	#!/usr/bin/env bash
     2	
    ··· #
    17	
    18	
    19	# Runs a yarn command as a daemon.
    20	#
    21	# Environment Variables
    22	#
    23	#   YARN_CONF_DIR  Alternate conf dir. Default is ${HADOOP_YARN_HOME}/conf.
    24	#   YARN_LOG_DIR   Where log files are stored.  PWD by default.
    25	#   YARN_MASTER    host:path where hadoop code should be rsync'd from
    26	#   YARN_PID_DIR   The pid files are stored. /tmp by default.
    27	#   YARN_IDENT_STRING   A string representing this instance of hadoop. $USER by default
    28	#   YARN_NICENESS The scheduling priority for daemons. Defaults to 0.
    29	##
    30	
    31	usage="Usage: yarn-daemon.sh [--config <conf-dir>] [--hosts hostlistfile] (start|stop) <yarn-command> "
    32	
    33	# if no args specified, show usage
    34	if [ $# -le 1 ]; then
    35	  echo $usage
    36	  exit 1
    37	fi
    38	
    39	bin=`dirname "${BASH_SOURCE-$0}"`
    40	bin=`cd "$bin"; pwd`
    41	
    42	DEFAULT_LIBEXEC_DIR="$bin"/../libexec
    43	HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
    44	. $HADOOP_LIBEXEC_DIR/yarn-config.sh
    45	
    46	# get arguments
    47	startStop=$1
    48	shift
    49	command=$1
    50	shift
    51	
    52	hadoop_rotate_log ()
    53	{
    54	    log=$1;
    55	    num=5;
    56	    if [ -n "$2" ]; then
    57		num=$2
    58	    fi
    59	    if [ -f "$log" ]; then # rotate logs
    60		while [ $num -gt 1 ]; do
    61		    prev=`expr $num - 1`
    62		    [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
    63		    num=$prev
    64		done
    65		mv "$log" "$log.$num";
    66	    fi
    67	}
    68	
    69	if [ -f "${YARN_CONF_DIR}/yarn-env.sh" ]; then
    70	  . "${YARN_CONF_DIR}/yarn-env.sh"
    71	fi
    72	
    73	if [ "$YARN_IDENT_STRING" = "" ]; then
    74	  export YARN_IDENT_STRING="$USER"
    75	fi
    76	
    77	# get log directory
    78	if [ "$YARN_LOG_DIR" = "" ]; then
    79	  export YARN_LOG_DIR="$HADOOP_YARN_HOME/logs"
    80	fi
    81	
    82	if [ ! -w "$YARN_LOG_DIR" ] ; then
    83	  mkdir -p "$YARN_LOG_DIR"
    84	  chown $YARN_IDENT_STRING $YARN_LOG_DIR 
    85	fi
    86	
    87	if [ "$YARN_PID_DIR" = "" ]; then
    88	  YARN_PID_DIR=/tmp
    89	fi
    90	
    91	# some variables
    92	export YARN_LOGFILE=yarn-$YARN_IDENT_STRING-$command-$HOSTNAME.log
    93	export YARN_ROOT_LOGGER=${YARN_ROOT_LOGGER:-INFO,RFA}
    94	log=$YARN_LOG_DIR/yarn-$YARN_IDENT_STRING-$command-$HOSTNAME.out
    95	pid=$YARN_PID_DIR/yarn-$YARN_IDENT_STRING-$command.pid
    96	YARN_STOP_TIMEOUT=${YARN_STOP_TIMEOUT:-5}
    97	
    98	# Set default scheduling priority
    99	if [ "$YARN_NICENESS" = "" ]; then
   100	    export YARN_NICENESS=0
   101	fi
   102	
   103	case $startStop in
   104	
   105	  (start)
   106	
   107	    [ -w "$YARN_PID_DIR" ] || mkdir -p "$YARN_PID_DIR"
   108	
   109	    if [ -f $pid ]; then
   110	      if kill -0 `cat $pid` > /dev/null 2>&1; then
   111	        echo $command running as process `cat $pid`.  Stop it first.
   112	        exit 1
   113	      fi
   114	    fi
   115	
   116	    if [ "$YARN_MASTER" != "" ]; then
   117	      echo rsync from $YARN_MASTER
   118	      rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $YARN_MASTER/ "$HADOOP_YARN_HOME"
   119	    fi
   120	
   121	    hadoop_rotate_log $log
   122	    echo starting $command, logging to $log
   123	    cd "$HADOOP_YARN_HOME"
   124	    nohup nice -n $YARN_NICENESS "$HADOOP_YARN_HOME"/bin/yarn --config $YARN_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
   125	    echo $! > $pid
   126	    sleep 1
   127	    head "$log"
   128	    # capture the ulimit output
   129	    echo "ulimit -a" >> $log
   130	    ulimit -a >> $log 2>&1
   131	    ;;
   132	          
   133	  (stop)
   134	
   135	    if [ -f $pid ]; then
   136	      TARGET_PID=`cat $pid`
   137	      if kill -0 $TARGET_PID > /dev/null 2>&1; then
   138	        echo stopping $command
   139	        kill $TARGET_PID
   140	        sleep $YARN_STOP_TIMEOUT
   141	        if kill -0 $TARGET_PID > /dev/null 2>&1; then
   142	          echo "$command did not stop gracefully after $YARN_STOP_TIMEOUT seconds: killing with kill -9"
   143	          kill -9 $TARGET_PID
   144	        fi
   145	      else
   146	        echo no $command to stop
   147	      fi
   148	      rm -f $pid
   149	    else
   150	      echo no $command to stop
   151	    fi
   152	    ;;
   153	
   154	  (*)
   155	    echo $usage
   156	    exit 1
   157	    ;;
   158	
   159	esac

Filter the script for the pid-related lines:

[root@server4 sbin]# cat -n yarn-daemon.sh | grep pid 
    26	#   YARN_PID_DIR   The pid files are stored. /tmp by default.
    95	pid=$YARN_PID_DIR/yarn-$YARN_IDENT_STRING-$command.pid
   109	    if [ -f $pid ]; then
   110	      if kill -0 `cat $pid` > /dev/null 2>&1; then
   111	        echo $command running as process `cat $pid`.  Stop it first.
   125	    echo $! > $pid
   135	    if [ -f $pid ]; then
   136	      TARGET_PID=`cat $pid`
   148	      rm -f $pid
2. Cause

From the stop script above we can see that it looks for the file yarn-$YARN_IDENT_STRING-$command.pid in the default /tmp directory. Because /tmp is cleaned out periodically, that .pid file is eventually gone; the script then has no pid to kill, the Hadoop daemons are never stopped, and shutting down the cluster fails with the messages shown above.
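
On systemd-based hosts like the ones in this post (note the systemd-private-* entries in the /tmp listings below), the periodic cleanup is typically done by systemd-tmpfiles. A minimal way to check the cleanup policy, using the usual default file locations rather than anything specific to this cluster:

# show the tmpfiles rule for /tmp; on CentOS 7 it is usually something like
#   v /tmp 1777 root root 10d
# i.e. entries not touched for 10 days get removed
cat /usr/lib/tmpfiles.d/tmp.conf

# show the timer that triggers the periodic cleanup
systemctl status systemd-tmpfiles-clean.timer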

3. Solutions
3.1 Method 1
  • Edit the hadoop-env.sh script
    Change the export HADOOP_PID_DIR=${HADOOP_PID_DIR} line so that it points to a fixed path other than /tmp (the comment in the file reads: "The directory where pid files are stored. /tmp by default."). After my change it looks like this:
[root@server4 hadoop]# tail -10 hadoop-env.sh 
# NOTE: this should be set to a directory that can only be written to by 
#       the user that will run the hadoop daemons.  Otherwise there is the
#       potential for a symlink attack.
#export HADOOP_PID_DIR=${HADOOP_PID_DIR}
#/usr/local/hadoop-2.6.4/pids
export HADOOP_PID_DIR=/usr/local/hadoop-2.6.4/pids
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

However, changing hadoop-env.sh alone is far from enough. If we start the Hadoop cluster now and look inside /usr/local/hadoop-2.6.4/pids, this is all we get:

[root@server4 pids]# ll
total 12
-rw-r--r--. 1 root root 6 Oct 23 16:53 hadoop-root-datanode.pid
-rw-r--r--. 1 root root 6 Oct 23 16:53 hadoop-root-namenode.pid
-rw-r--r--. 1 root root 6 Oct 23 16:54 hadoop-root-secondarynamenode.pid
[root@server4 pids]# pwd
/usr/local/hadoop-2.6.4/pids
[root@server4 pids]# 

Only the hadoop (HDFS) pid files are here; there is nothing for YARN, because we have not specified a pid directory for YARN yet, so after starting the cluster its pid files are still written to /tmp:

[root@server4 pids]# cd /tmp
[root@server4 tmp]# ll
total 8
drwxr-xr-x. 3 root root 19 Oct 20 11:47 hbase-root
drwxr-xr-x. 2 root root 71 Oct 23 16:55 hsperfdata_root
drwxr-xr-x. 4 root root 32 Oct 23 16:53 Jetty_0_0_0_0_50070_hdfs____w2cu08
drwxr-xr-x. 4 root root 32 Oct 23 16:54 Jetty_0_0_0_0_50075_datanode____hwtdwq
drwxr-xr-x. 4 root root 32 Oct 23 16:54 Jetty_0_0_0_0_50090_secondary____y6aanv
drwxr-xr-x. 5 root root 46 Oct 23 16:55 Jetty_0_0_0_0_8042_node____19tj0x
drwxr-xr-x. 5 root root 46 Oct 23 16:55 Jetty_server4_8088_cluster____y51xml
drwx------. 3 root root 17 Oct  5 11:03 systemd-private-10dd88eabf284681a53d4e9aa58ca6ca-chronyd.service-9PTHJi
drwx------. 3 root root 17 Oct  5 11:03 systemd-private-10dd88eabf284681a53d4e9aa58ca6ca-cups.service-tbyfMo
drwx------. 3 root root 17 Oct  8 20:55 systemd-private-10dd88eabf284681a53d4e9aa58ca6ca-httpd.service-tZE9aa
drwx------. 2 root root  6 Oct 15 10:54 vmware-root
-rw-r--r--. 1 root root  6 Oct 23 16:54 yarn-root-nodemanager.pid
-rw-r--r--. 1 root root  6 Oct 23 16:54 yarn-root-resourcemanager.pid

So we also need to add a YARN_PID_DIR setting to yarn-env.sh.

  • Edit yarn-env.sh as follows:
    Append the line export YARN_PID_DIR=/usr/local/hadoop-2.6.4/pids to the end of the file:
[root@server4 hadoop]# tail -10 yarn-env.sh 
YARN_OPTS="$YARN_OPTS -Dyarn.home.dir=$YARN_COMMON_HOME"
YARN_OPTS="$YARN_OPTS -Dyarn.id.str=$YARN_IDENT_STRING"
YARN_OPTS="$YARN_OPTS -Dhadoop.root.logger=${YARN_ROOT_LOGGER:-INFO,console}"
YARN_OPTS="$YARN_OPTS -Dyarn.root.logger=${YARN_ROOT_LOGGER:-INFO,console}"
if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
  YARN_OPTS="$YARN_OPTS -Djava.library.path=$JAVA_LIBRARY_PATH"
fi  
YARN_OPTS="$YARN_OPTS -Dyarn.policy.file=$YARN_POLICYFILE"

export YARN_PID_DIR=/usr/local/hadoop-2.6.4/pids

Run start-yarn.sh and check the corresponding directories:

[root@server4 shells]# cd /usr/local/hadoop-2.6.4/pids/
[root@server4 pids]# ll
total 8
-rw-r--r--. 1 root root 6 Oct 23 17:16 yarn-root-nodemanager.pid
-rw-r--r--. 1 root root 6 Oct 23 17:16 yarn-root-resourcemanager.pid
[root@server4 pids]# cd /tmp
[root@server4 tmp]# ll
total 0
drwxr-xr-x. 3 root root 19 Oct 20 11:47 hbase-root
drwxr-xr-x. 2 root root 32 Oct 23 17:16 hsperfdata_root
drwxr-xr-x. 4 root root 32 Oct 23 16:53 Jetty_0_0_0_0_50070_hdfs____w2cu08
drwxr-xr-x. 4 root root 32 Oct 23 16:54 Jetty_0_0_0_0_50075_datanode____hwtdwq
drwxr-xr-x. 4 root root 32 Oct 23 16:54 Jetty_0_0_0_0_50090_secondary____y6aanv
drwxr-xr-x. 5 root root 46 Oct 23 17:16 Jetty_0_0_0_0_8042_node____19tj0x
drwxr-xr-x. 5 root root 46 Oct 23 17:16 Jetty_server4_8088_cluster____y51xml
drwx------. 3 root root 17 Oct  5 11:03 systemd-private-10dd88eabf284681a53d4e9aa58ca6ca-chronyd.service-9PTHJi
drwx------. 3 root root 17 Oct  5 11:03 systemd-private-10dd88eabf284681a53d4e9aa58ca6ca-cups.service-tbyfMo
drwx------. 3 root root 17 Oct  8 20:55 systemd-private-10dd88eabf284681a53d4e9aa58ca6ca-httpd.service-tZE9aa
drwx------. 2 root root  6 Oct 15 10:54 vmware-root
[root@server4 tmp]# 
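
One caveat not shown in the listings above: the pid directory settings are per node, so the same hadoop-env.sh and yarn-env.sh changes also have to be in place on server5 and server6, otherwise those nodes keep writing their pid files to /tmp. A minimal sketch, assuming the standard config directory under /usr/local/hadoop-2.6.4:

# push the edited env scripts to the other nodes (hostnames/paths taken from this post's setup)
for host in server5 server6; do
  scp /usr/local/hadoop-2.6.4/etc/hadoop/hadoop-env.sh \
      /usr/local/hadoop-2.6.4/etc/hadoop/yarn-env.sh \
      "$host":/usr/local/hadoop-2.6.4/etc/hadoop/
done
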
3.2 Method 2

Modify the hadoop-daemon.sh and yarn-daemon.sh files so that they obtain the pid directly from jps instead of reading it from the pid file (a sketch of the idea follows below).
The same problem is of course not limited to HDFS and YARN; HBase behaves the same way, and I won't repeat the steps here.
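
A minimal sketch of that idea for the (stop) branch of yarn-daemon.sh. This is not the exact patch; the class names ResourceManager and NodeManager are simply the main classes that jps reports for the YARN daemons:

# fall back to jps when the pid file has been cleaned away
if [ -f "$pid" ]; then
  TARGET_PID=`cat "$pid"`
else
  # map the yarn command to its JVM main class
  case $command in
    resourcemanager) class=ResourceManager ;;
    nodemanager)     class=NodeManager ;;
    *)               class=$command ;;
  esac
  # jps prints "<pid> <MainClass>"; pick the pid whose class matches
  TARGET_PID=`jps | awk -v c="$class" '$2 == c {print $1}'`
fi

if [ -n "$TARGET_PID" ] && kill -0 "$TARGET_PID" > /dev/null 2>&1; then
  echo stopping $command
  kill "$TARGET_PID"
else
  echo no $command to stop
fi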
