Big Data Primer - 2 - Hadoop

1. Starting HDFS

[hadoop@hadoop001 ~]$ cd app/hadoop
[hadoop@hadoop001 hadoop]$ ll
total 84
drwxr-xr-x  2 hadoop hadoop  4096 Jul  9 17:34 bin
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 bin-mapreduce1
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 cloudera
drwxr-xr-x  6 hadoop hadoop  4096 Mar 24  2016 etc
drwxr-xr-x  5 hadoop hadoop  4096 Mar 24  2016 examples
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 examples-mapreduce1
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 include
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 lib
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 libexec
-rw-r--r--  1 hadoop hadoop 17087 Mar 24  2016 LICENSE.txt
drwxrwxr-x  2 hadoop hadoop  4096 Jul  9 17:38 logs
-rw-r--r--  1 hadoop hadoop   101 Mar 24  2016 NOTICE.txt
drwxrwxr-x  2 hadoop hadoop  4096 Jul  9 19:16 output
-rw-r--r--  1 hadoop hadoop  1366 Mar 24  2016 README.txt
drwxr-xr-x  3 hadoop hadoop  4096 Jul  9 02:45 sbin
drwxr-xr-x  4 hadoop hadoop  4096 Mar 24  2016 share
drwxr-xr-x 17 hadoop hadoop  4096 Mar 24  2016 src
[hadoop@hadoop001 hadoop]$ cd sbin
[hadoop@hadoop001 sbin]$ ll
total 96
-rwxr-xr-x 1 hadoop hadoop 2752 Mar 24  2016 distribute-exclude.sh
-rwxr-xr-x 1 hadoop hadoop 6452 Mar 24  2016 hadoop-daemon.sh
-rwxr-xr-x 1 hadoop hadoop 1360 Mar 24  2016 hadoop-daemons.sh
-rwxr-xr-x 1 hadoop hadoop 1427 Mar 24  2016 hdfs-config.sh
-rwxr-xr-x 1 hadoop hadoop 3539 Mar 24  2016 httpfs.sh
-rwxr-xr-x 1 hadoop hadoop 3373 Mar 24  2016 kms.sh
drwxr-xr-x 2 hadoop hadoop 4096 Mar 24  2016 Linux
-rwxr-xr-x 1 hadoop hadoop 4080 Mar 24  2016 mr-jobhistory-daemon.sh
-rwxr-xr-x 1 hadoop hadoop 1648 Mar 24  2016 refresh-namenodes.sh
-rwxr-xr-x 1 hadoop hadoop 2145 Mar 24  2016 slaves.sh
-rwxr-xr-x 1 hadoop hadoop 1471 Mar 24  2016 start-all.sh
-rwxr-xr-x 1 hadoop hadoop 1128 Mar 24  2016 start-balancer.sh
-rwxr-xr-x 1 hadoop hadoop 3734 Mar 24  2016 start-dfs.sh
-rwxr-xr-x 1 hadoop hadoop 1357 Mar 24  2016 start-secure-dns.sh
-rwxr-xr-x 1 hadoop hadoop 1347 Mar 24  2016 start-yarn.sh
-rwxr-xr-x 1 hadoop hadoop 1462 Mar 24  2016 stop-all.sh
-rwxr-xr-x 1 hadoop hadoop 1179 Mar 24  2016 stop-balancer.sh
-rwxr-xr-x 1 hadoop hadoop 3206 Mar 24  2016 stop-dfs.sh
-rwxr-xr-x 1 hadoop hadoop 1340 Mar 24  2016 stop-secure-dns.sh
-rwxr-xr-x 1 hadoop hadoop 1340 Mar 24  2016 stop-yarn.sh
-rwxr-xr-x 1 hadoop hadoop 4295 Mar 24  2016 yarn-daemon.sh
-rwxr-xr-x 1 hadoop hadoop 1353 Mar 24  2016 yarn-daemons.sh
[hadoop@hadoop001 sbin]$ ./start-dfs.sh
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 sbin]$ jps     # check the running JVM processes
22922 NameNode
23020 DataNode
23213 SecondaryNameNode
23342 Jps

[root@hadoop001 ~]# netstat -nlp|grep 22922   # as root, check which ports PID 22922 listens on
tcp        0      0 0.0.0.0:50070               0.0.0.0:*                   LISTEN      22922/java          
tcp        0      0 127.0.0.1:9000              0.0.0.0:*                   LISTEN      22922/java          
[root@hadoop001 ~]# 
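Port 50070 is the NameNode web UI and 9000 is the HDFS RPC port from fs.defaultFS (Hadoop 2.x defaults). A small sketch, assuming the daemons run as the hadoop user, to list every HDFS daemon's listening ports in one go (run as root so netstat can map PIDs; the PIDs come from ps rather than jps, since root's jps may show "process information unavailable"):

# As root: list listening ports for each java process owned by user hadoop.
for pid in $(ps -u hadoop -o pid=,comm= | awk '$2=="java" {print $1}'); do
  echo "== PID $pid =="
  netstat -nlp 2>/dev/null | awk -v p="$pid/java" '$NF==p'
done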
Master/slave architecture:
namenode                 nn    name node           the master: coordinates the whole cluster
datanode                 dn    data node           the workers: do the actual data reads and writes
secondary namenode       snn   second name node    checkpoints the namenode's metadata every hour (by default)
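To see these roles from the cluster's own point of view, the standard HDFS CLI has an admin report (nothing CDH-specific assumed here):

# Ask the NameNode for its view of the cluster: capacity plus the list of live DataNodes.
hdfs dfsadmin -report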

2. Configuration parameters: nn (NameNode)

[hadoop@hadoop001 hadoop]$ vi core-site.xml
[hadoop@hadoop001 hadoop]$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:9000</value>
        <!-- change the machine name here -->
    </property>
</configuration>
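A quick sanity check that the setting took effect, using the standard hdfs getconf tool:

# Print the NameNode address that clients will use.
hdfs getconf -confKey fs.defaultFS
# expected output: hdfs://hadoop001:9000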

dn (DataNode)

[hadoop@hadoop001 hadoop]$ vi slaves
# edit in the machine name(s) of the DataNodes, one per line
[hadoop@hadoop001 hadoop]$ cat slaves
hadoop001
[hadoop@hadoop001 hadoop]$ 

snn (SecondaryNameNode)

How to find these parameters (the original post illustrates this with two screenshots, omitted here):

[hadoop@hadoop001 hadoop]$ vi hdfs-site.xml
[hadoop@hadoop001 hadoop]$ cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

<property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop001:50090</value>
    </property>
<property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>hadoop001:50091</value>
    </property>
</configuration>
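As with core-site.xml, hdfs getconf can confirm these values took effect:

hdfs getconf -confKey dfs.replication    # expected: 1
hdfs getconf -secondarynamenodes         # expected: hadoop001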

[hadoop@hadoop001 sbin]$ ./stop-dfs.sh      # stop HDFS
19/07/11 00:17:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [hadoop001]
hadoop001: stopping namenode
hadoop001: stopping datanode
Stopping secondary namenodes [hadoop001]
hadoop001: stopping secondarynamenode
19/07/11 00:18:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 sbin]$ jps
28586 Jps
[hadoop@hadoop001 sbin]$ ps -ef|grep hadoop    # check processes
root     23497 22774  0 Jul10 pts/0    00:00:00 su - hadoop
hadoop   23498 23497  0 Jul10 pts/0    00:00:00 -bash
root     25262 25245  0 00:02 pts/1    00:00:00 su - hadoop
hadoop   25263 25262  0 00:02 pts/1    00:00:00 -bash
root     27468 27453  0 00:14 pts/2    00:00:00 su - hadoop
hadoop   27469 27468  0 00:14 pts/2    00:00:00 -bash
hadoop   28630 25263  0 00:20 pts/1    00:00:00 ps -ef
hadoop   28631 25263  0 00:20 pts/1    00:00:00 grep hadoop
[hadoop@hadoop001 sbin]$ ./start-dfs.sh    # restart; every role runs on hadoop001
19/07/11 00:24:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop001]
hadoop001: starting namenode, logging to /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop001.out
hadoop001: starting datanode, logging to /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop001.out
Starting secondary namenodes [hadoop001]
hadoop001: starting secondarynamenode, logging to /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop001.out
19/07/11 00:24:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 sbin]$ 

Where does jps live? At the same level as java:

[hadoop@hadoop001 hadoop]$ which jps
/usr/java/jdk1.8.0_45/bin/jps
[hadoop@hadoop001 hadoop]$ which java
/usr/java/jdk1.8.0_45/bin/java
[hadoop@hadoop001 hadoop]$ 

Inspecting processes with jps

[hadoop@hadoop001 hadoop]$ jps     # list processes
28752 NameNode
28851 DataNode
29046 SecondaryNameNode
29289 Jps
[hadoop@hadoop001 hadoop]$ 
[hadoop@hadoop001 hadoop]$ jps -l  # list processes with fully qualified class names
28752 org.apache.hadoop.hdfs.server.namenode.NameNode
29265 sun.tools.jps.Jps
28851 org.apache.hadoop.hdfs.server.datanode.DataNode
29046 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode

Where is this stored?

[hadoop@hadoop001 ~]$ cd /tmp
[hadoop@hadoop001 tmp]$ ll
total 1056052
-rw-r--r-- 1 root   root       10867 Jun 28 14:14 data.log
-rwxr--r-- 1 root   root          44 Jun 28 14:09 data.sh
-rwxr-xr-x 1 root   root      840944 Apr 28 20:25 disable
drwxrwxr-x 4 hadoop hadoop      4096 Jul  9 19:13 hadoop-hadoop
-rw-rw-r-- 1 hadoop hadoop         6 Jul 11 00:24 hadoop-hadoop-datanode.pid
-rw-rw-r-- 1 hadoop hadoop         6 Jul 11 00:24 hadoop-hadoop-namenode.pid
-rw-rw-r-- 1 hadoop hadoop         6 Jul 11 00:24 hadoop-hadoop-secondarynamenode.pid
drwxr-xr-x 2 hadoop hadoop      4096 Jul 11 00:33 hsperfdata_hadoop      # hsperfdata_<username>
[hadoop@hadoop001 tmp]$ cd hsperfdata_hadoop
[hadoop@hadoop001 hsperfdata_hadoop]$ ll       # the per-PID files that jps reads live here
total 96
-rw------- 1 hadoop hadoop 32768 Jul 11 00:37 28752
-rw------- 1 hadoop hadoop 32768 Jul 11 00:37 28851
-rw------- 1 hadoop hadoop 32768 Jul 11 00:37 29046
[hadoop@hadoop001 hsperfdata_hadoop]$ jps
28752 NameNode
29329 Jps
28851 DataNode
29046 SecondaryNameNode
[hadoop@hadoop001 hsperfdata_hadoop]$ cat 28752

jps shows a process's details only to the user who owns it;
other non-root users cannot see the process at all,
while root sees the entry but gets "process information unavailable":
[root@hadoop001 etc]# jps
29507 Jps
28851 -- process information unavailable
[root@hadoop001 etc]# kill -9 28851
[root@hadoop001 etc]# jps
28851 -- process information unavailable
29531 Jps
[root@hadoop001 etc]# ps -ef|grep 28851
root     29556 29177  0 00:47 pts/3    00:00:00 grep 28851
[root@hadoop001 etc]# 
[hadoop@hadoop001 hsperfdata_hadoop]$ jps
29570 Jps

True and false mix freely here.
When you run into "process information unavailable":
the process may still be alive, or may already be gone; the entry can lag behind reality. Just remember to check with ps -ef|grep xxx whether it actually exists.
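A minimal sketch of that check (28851 is just the PID from the session above):

# Double-check a suspicious jps entry against the real process table.
pid=28851
if ps -p "$pid" > /dev/null 2>&1; then
  echo "PID $pid is really running"
else
  echo "PID $pid is gone; the jps entry is a stale hsperfdata file"
fi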

The PID file

The file that both starting and stopping a daemon depend on.

[hadoop@hadoop001 tmp]$ ll
total 1056064
drwxrwxr-x 4 hadoop   hadoop        4096 Jul  9 19:13 hadoop-hadoop
-rw-rw-r-- 1 hadoop   hadoop           6 Jul 11 00:52 hadoop-hadoop-datanode.pid
-rw-rw-r-- 1 hadoop   hadoop           6 Jul 11 00:52 hadoop-hadoop-namenode.pid      # stores the PID; stop reads this file to know what to kill
-rw-rw-r-- 1 hadoop   hadoop           6 Jul 11 00:52 hadoop-hadoop-secondarynamenode.pid
[root@hadoop001 tmp]# cat hadoop-hadoop-namenode.pid      # contains the process ID
5842
[root@hadoop001 tmp]# 
[hadoop@hadoop001 sbin]$ jps
5842 NameNode
5939 DataNode
6243 Jps
6134 SecondaryNameNode
[hadoop@hadoop001 sbin]$ 

Dissecting the PID file, layer by layer

[hadoop@hadoop001 sbin]$ cat start-dfs.sh
#!/usr/bin/env bash

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


# Start hadoop dfs daemons.
# Optinally upgrade or rollback dfs state.
# Run this on master node.

usage="Usage: start-dfs.sh [-upgrade|-rollback] [other options such as -clusterId]"

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hdfs-config.sh

# get arguments
if [[ $# -ge 1 ]]; then
  startOpt="$1"
  shift
  case "$startOpt" in
    -upgrade)
      nameStartOpt="$startOpt"
    ;;
    -rollback)
      dataStartOpt="$startOpt"
    ;;
    *)
      echo $usage
      exit 1
    ;;
  esac
fi

#Add other possible options
nameStartOpt="$nameStartOpt $@"

#---------------------------------------------------------
# namenodes

NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -namenodes)

echo "Starting namenodes on [$NAMENODES]"

"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
  --config "$HADOOP_CONF_DIR" \
  --hostnames "$NAMENODES" \
  --script "$bin/hdfs" start namenode $nameStartOpt

#---------------------------------------------------------
# datanodes (using default slaves file)

if [ -n "$HADOOP_SECURE_DN_USER" ]; then
  echo \
    "Attempting to start secure cluster, skipping datanodes. " \
    "Run start-secure-dns.sh as root to complete startup."
else
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --script "$bin/hdfs" start datanode $dataStartOpt
fi

#---------------------------------------------------------
# secondary namenodes (if any)

SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)

if [ -n "$SECONDARY_NAMENODES" ]; then
  echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"

  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$SECONDARY_NAMENODES" \
      --script "$bin/hdfs" start secondarynamenode
fi

#---------------------------------------------------------
# quorumjournal nodes (if any)

SHARED_EDITS_DIR=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.namenode.shared.edits.dir 2>&-)

case "$SHARED_EDITS_DIR" in
qjournal://*)
  JOURNAL_NODES=$(echo "$SHARED_EDITS_DIR" | sed 's,qjournal://\([^/]*\)/.*,\1,g; s/;/ /g; s/:[0-9]*//g')
  echo "Starting journal nodes [$JOURNAL_NODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$JOURNAL_NODES" \
      --script "$bin/hdfs" start journalnode ;;
esac

#---------------------------------------------------------
# ZK Failover controllers, if auto-HA is enabled
AUTOHA_ENABLED=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.ha.automatic-failover.enabled)
if [ "$(echo "$AUTOHA_ENABLED" | tr A-Z a-z)" = "true" ]; then
  echo "Starting ZK Failover Controllers on NN hosts [$NAMENODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$NAMENODES" \
    --script "$bin/hdfs" start zkfc
fi
# eof

============================================================================================================
[hadoop@hadoop001 sbin]$ cat hadoop-daemons.sh
#!/usr/bin/env bash

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


# Run a Hadoop command on all slave hosts.

usage="Usage: hadoop-daemons.sh [--config confdir] [--hosts hostlistfile] [start|stop] command args..."

# if no args specified, show usage
if [ $# -le 1 ]; then
  echo $usage
  exit 1
fi

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"

==========================================================================================================================
[hadoop@hadoop001 sbin]$ cat hadoop-daemon.sh
#!/usr/bin/env bash

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


# Runs a Hadoop command as a daemon.
#
# Environment Variables
#
#   HADOOP_CONF_DIR  Alternate conf dir. Default is ${HADOOP_PREFIX}/conf.
#   HADOOP_LOG_DIR   Where log files are stored.  PWD by default.
#   HADOOP_MASTER    host:path where hadoop code should be rsync'd from
#   HADOOP_PID_DIR   The pid files are stored. /tmp by default.
#   HADOOP_IDENT_STRING   A string representing this instance of hadoop. $USER by default
#   HADOOP_NICENESS The scheduling priority for daemons. Defaults to 0.
##

usage="Usage: hadoop-daemon.sh [--config <conf-dir>] [--hosts hostlistfile] [--script script] (start|stop) <hadoop-command> <args...>"

# if no args specified, show usage
if [ $# -le 1 ]; then
  echo $usage
  exit 1
fi

bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`

DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

# get arguments

#default value
hadoopScript="$HADOOP_PREFIX"/bin/hadoop
if [ "--script" = "$1" ]
  then
    shift
    hadoopScript=$1
    shift
fi
startStop=$1
shift
command=$1
shift

hadoop_rotate_log ()
{
    log=$1;
    num=5;
    if [ -n "$2" ]; then
	num=$2
    fi
    if [ -f "$log" ]; then # rotate logs
	while [ $num -gt 1 ]; do
	    prev=`expr $num - 1`
	    [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
	    num=$prev
	done
	mv "$log" "$log.$num";
    fi
}

if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
  . "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi

# Determine if we're starting a secure datanode, and if so, redefine appropriate variables
if [ "$command" == "datanode" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_SECURE_DN_USER" ]; then
  export HADOOP_PID_DIR=$HADOOP_SECURE_DN_PID_DIR
  export HADOOP_LOG_DIR=$HADOOP_SECURE_DN_LOG_DIR
  export HADOOP_IDENT_STRING=$HADOOP_SECURE_DN_USER
  starting_secure_dn="true"
fi

#Determine if we're starting a privileged NFS, if so, redefine the appropriate variables
if [ "$command" == "nfs3" ] && [ "$EUID" -eq 0 ] && [ -n "$HADOOP_PRIVILEGED_NFS_USER" ]; then
    export HADOOP_PID_DIR=$HADOOP_PRIVILEGED_NFS_PID_DIR
    export HADOOP_LOG_DIR=$HADOOP_PRIVILEGED_NFS_LOG_DIR
    export HADOOP_IDENT_STRING=$HADOOP_PRIVILEGED_NFS_USER
    starting_privileged_nfs="true"
fi

if [ "$HADOOP_IDENT_STRING" = "" ]; then
  export HADOOP_IDENT_STRING="$USER"
fi


# get log directory
if [ "$HADOOP_LOG_DIR" = "" ]; then
  export HADOOP_LOG_DIR="$HADOOP_PREFIX/logs"
fi

if [ ! -w "$HADOOP_LOG_DIR" ] ; then
  mkdir -p "$HADOOP_LOG_DIR"
  chown $HADOOP_IDENT_STRING $HADOOP_LOG_DIR
fi

if [ "$HADOOP_PID_DIR" = "" ]; then
  HADOOP_PID_DIR=/tmp
fi

# some variables
export HADOOP_LOGFILE=hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log
export HADOOP_ROOT_LOGGER=${HADOOP_ROOT_LOGGER:-"INFO,RFA"}
export HADOOP_SECURITY_LOGGER=${HADOOP_SECURITY_LOGGER:-"INFO,RFAS"}
export HDFS_AUDIT_LOGGER=${HDFS_AUDIT_LOGGER:-"INFO,NullAppender"}
log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out
pid=$HADOOP_PID_DIR/hadoop-$HADOOP_IDENT_STRING-$command.pid
HADOOP_STOP_TIMEOUT=${HADOOP_STOP_TIMEOUT:-5}

# Set default scheduling priority
if [ "$HADOOP_NICENESS" = "" ]; then
    export HADOOP_NICENESS=0
fi

case $startStop in

  (start)   # the start branch

    [ -w "$HADOOP_PID_DIR" ] ||  mkdir -p "$HADOOP_PID_DIR"

    if [ -f $pid ]; then
      if kill -0 `cat $pid` > /dev/null 2>&1; then
        echo $command running as process `cat $pid`.  Stop it first.
        exit 1
      fi
    fi

    if [ "$HADOOP_MASTER" != "" ]; then
      echo rsync from $HADOOP_MASTER
      rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' $HADOOP_MASTER/ "$HADOOP_PREFIX"
    fi

    hadoop_rotate_log $log
    echo starting $command, logging to $log
    cd "$HADOOP_PREFIX"
    case $command in
      namenode|secondarynamenode|datanode|journalnode|dfs|dfsadmin|fsck|balancer|zkfc)
        if [ -z "$HADOOP_HDFS_HOME" ]; then
          hdfsScript="$HADOOP_PREFIX"/bin/hdfs
        else
          hdfsScript="$HADOOP_HDFS_HOME"/bin/hdfs
        fi
        nohup nice -n $HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
      ;;
      (*)
        nohup nice -n $HADOOP_NICENESS $hadoopScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
      ;;
    esac
    echo $! > $pid
    sleep 1
    head "$log"
    # capture the ulimit output
    if [ "true" = "$starting_secure_dn" ]; then
      echo "ulimit -a for secure datanode user $HADOOP_SECURE_DN_USER" >> $log
      # capture the ulimit info for the appropriate user
      su --shell=/bin/bash $HADOOP_SECURE_DN_USER -c 'ulimit -a' >> $log 2>&1
    elif [ "true" = "$starting_privileged_nfs" ]; then
        echo "ulimit -a for privileged nfs user $HADOOP_PRIVILEGED_NFS_USER" >> $log
        su --shell=/bin/bash $HADOOP_PRIVILEGED_NFS_USER -c 'ulimit -a' >> $log 2>&1
    else
      echo "ulimit -a for user $USER" >> $log
      ulimit -a >> $log 2>&1
    fi
    sleep 3;
    if ! ps -p $! > /dev/null ; then
      exit 1
    fi
    ;;
          
  (stop)    # the stop branch

    if [ -f $pid ]; then
      TARGET_PID=`cat $pid`
      if kill -0 $TARGET_PID > /dev/null 2>&1; then
        echo stopping $command
        kill $TARGET_PID
        sleep $HADOOP_STOP_TIMEOUT
        if kill -0 $TARGET_PID > /dev/null 2>&1; then
          echo "$command did not stop gracefully after $HADOOP_STOP_TIMEOUT seconds: killing with kill -9"
          kill -9 $TARGET_PID
        fi
      else
        echo no $command to stop
      fi
      rm -f $pid
    else
      echo no $command to stop
    fi
    ;;

  (*)
    echo $usage
    exit 1
    ;;

esac

Test: with the PID file missing, both start and stop misbehave.

How it works: start writes the PID file; stop reads it.
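Distilled from the start/stop branches of hadoop-daemon.sh above, a minimal sketch of that same write-then-read pattern (demo only; sleep stands in for the real daemon):

#!/usr/bin/env bash
# Minimal PID-file daemon pattern, modeled on hadoop-daemon.sh.
pidfile=/tmp/demo.pid
case "$1" in
  (start)
    nohup sleep 1000 > /dev/null 2>&1 &   # stand-in for the real daemon
    echo $! > "$pidfile"                  # start WRITES the PID file
    ;;
  (stop)
    if [ -f "$pidfile" ]; then
      kill "$(cat "$pidfile")"            # stop READS the PID file
      rm -f "$pidfile"
    else
      echo "no demo to stop"              # the same failure seen below
    fi
    ;;
esac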

[root@hadoop001 tmp]# mv hadoop-hadoop-namenode.pid hadoop-hadoop-namenode.pid.back
[root@hadoop001 tmp]# ll
total 1056084
-rw-r--r-- 1 root       root         10867 Jun 28 14:14 data.log
-rwxr--r-- 1 root       root            44 Jun 28 14:09 data.sh
-rwxr-xr-x 1 root       root        840944 Apr 28 20:25 disable
drwxrwxr-x 4 hadoop     hadoop        4096 Jul 11 13:56 hadoop-hadoop
-rw-rw-r-- 1 hadoop     hadoop           5 Jul 11 14:04 hadoop-hadoop-datanode.pid
-rw-rw-r-- 1 hadoop     hadoop           5 Jul 11 14:04 hadoop-hadoop-namenode.pid.back
-rw-rw-r-- 1 hadoop     hadoop           5 Jul 11 14:05 hadoop-hadoop-secondarynamenode.pid

[hadoop@hadoop001 hadoop]$ sbin/stop-dfs.sh
19/07/11 14:07:26 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [hadoop001]
hadoop001: no namenode to stop
hadoop001: stopping datanode
Stopping secondary namenodes [hadoop001]
hadoop001: stopping secondarynamenode
19/07/11 14:07:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 hadoop]$ jps      # the NameNode was NOT stopped
9988 Jps
9276 NameNode
[hadoop@hadoop001 hadoop]$ 
[hadoop@hadoop001 hadoop]$ ps ef|grep 9276    # BSD-style "ps ef": the "e" flag dumps each process's environment, hence the long line below
10027 pts/1    S+     0:00  \_ grep 9276 HOSTNAME=hadoop001 SHELL=/bin/bash TERM=xterm HADOOP_HOME=/usr/home/hadoop HISTSIZE=1000 OLDPWD=/home/hadoop/app/hadoop/etc/hadoop USER=hadoop LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36: MAIL=/var/spool/mail/hadoop PATH=/usr/home/hadoop/bin:/bin:/usr/java/jdk1.8.0_45/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hadoop/bin PWD=/home/hadoop/app/hadoop JAVA_HOME=/usr/java/jdk1.8.0_45 LANG=en_US.UTF-8 HISTCONTROL=ignoredups SHLVL=1 HOME=/home/hadoop GREP_OPTIONS=--color=auto LOGNAME=hadoop LESSOPEN=||/usr/bin/lesspipe.sh %s MYSQL_HOME=/usr/local/mysql G_BROKEN_FILENAMES=1 _=/bin/grep 

In production, is /tmp really an appropriate place for PID files?
No. By default, files in /tmp that have not been touched for 30 days get cleaned up (tmpwatch on CentOS 6, for example), so a long-running daemon can silently lose its PID file. The fix is below.

Create a /data/tmp directory, grant it 777 permissions, and change the PID path.
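Those steps appear piecemeal in the transcripts below; consolidated, the fix looks like this (run as root; /data/tmp matches the session below):

# One-time setup: a PID directory the /tmp cleaner will not touch.
mkdir -p /data/tmp
chmod -R 777 /data/tmp
# Then in $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
#   export HADOOP_PID_DIR=/data/tmp
# Restart HDFS so the daemons write their PID files there.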

[hadoop@hadoop001 hadoop]$ pwd
/home/hadoop/app/hadoop/etc/hadoop
[hadoop@hadoop001 hadoop]$ cat hadoop-env.sh

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by 
#       the user that will run the hadoop daemons.  Otherwise there is the
#       potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}   # this is the line to override
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
[hadoop@hadoop001 hadoop]$ cat hadoop-env.sh
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_45

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol.  Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol.  This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored.  $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by 
#       the user that will run the hadoop daemons.  Otherwise there is the
#       potential for a symlink attack.
#export HADOOP_PID_DIR=${HADOOP_PID_DIR}     # original line, commented out
export HADOOP_PID_DIR=/data/tmp              # copied and modified to the new path

export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER
[hadoop@hadoop001 hadoop]$ ll
total 140
-rw-r--r-- 1 hadoop hadoop  4436 Mar 24  2016 capacity-scheduler.xml
-rw-r--r-- 1 hadoop hadoop  1335 Mar 24  2016 configuration.xsl
-rw-r--r-- 1 hadoop hadoop   318 Mar 24  2016 container-executor.cfg
-rw-r--r-- 1 hadoop hadoop   880 Jul 11 14:01 core-site.xml
-rw-r--r-- 1 hadoop hadoop  4267 Jul 11 14:43 hadoop-env.sh
-rw-r--r-- 1 hadoop hadoop  2598 Mar 24  2016 hadoop-metrics2.properties
-rw-r--r-- 1 hadoop hadoop  2490 Mar 24  2016 hadoop-metrics.properties
-rw-r--r-- 1 hadoop hadoop  9683 Mar 24  2016 hadoop-policy.xml
-rw-r--r-- 1 hadoop hadoop  1112 Jul 11 00:16 hdfs-site.xml
-rw-r--r-- 1 hadoop hadoop  1449 Mar 24  2016 httpfs-env.sh
-rw-r--r-- 1 hadoop hadoop  1657 Mar 24  2016 httpfs-log4j.properties
-rw-r--r-- 1 hadoop hadoop    21 Mar 24  2016 httpfs-signature.secret
-rw-r--r-- 1 hadoop hadoop   620 Mar 24  2016 httpfs-site.xml
-rw-r--r-- 1 hadoop hadoop  3523 Mar 24  2016 kms-acls.xml
-rw-r--r-- 1 hadoop hadoop  1611 Mar 24  2016 kms-env.sh
-rw-r--r-- 1 hadoop hadoop  1631 Mar 24  2016 kms-log4j.properties
-rw-r--r-- 1 hadoop hadoop  5511 Mar 24  2016 kms-site.xml
-rw-r--r-- 1 hadoop hadoop 11291 Mar 24  2016 log4j.properties
-rw-r--r-- 1 hadoop hadoop  1383 Mar 24  2016 mapred-env.sh
-rw-r--r-- 1 hadoop hadoop  4113 Mar 24  2016 mapred-queues.xml.template
-rw-r--r-- 1 hadoop hadoop   758 Mar 24  2016 mapred-site.xml.template
-rw-r--r-- 1 hadoop hadoop    10 Jul 10 19:19 slaves
-rw-r--r-- 1 hadoop hadoop  2316 Mar 24  2016 ssl-client.xml.example
-rw-r--r-- 1 hadoop hadoop  2268 Mar 24  2016 ssl-server.xml.example
-rw-r--r-- 1 hadoop hadoop  4567 Mar 24  2016 yarn-env.sh
-rw-r--r-- 1 hadoop hadoop   690 Mar 24  2016 yarn-site.xml
[hadoop@hadoop001 hadoop]$ cd ../
[hadoop@hadoop001 etc]$ cd ../
[hadoop@hadoop001 hadoop]$ ll
total 84
drwxr-xr-x  2 hadoop hadoop  4096 Jul  9 17:34 bin
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 bin-mapreduce1
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 cloudera
drwxr-xr-x  6 hadoop hadoop  4096 Mar 24  2016 etc
drwxr-xr-x  5 hadoop hadoop  4096 Mar 24  2016 examples
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 examples-mapreduce1
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 include
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 lib
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 libexec
-rw-r--r--  1 hadoop hadoop 17087 Mar 24  2016 LICENSE.txt
drwxrwxr-x  2 hadoop hadoop  4096 Jul 11 14:13 logs
-rw-r--r--  1 hadoop hadoop   101 Mar 24  2016 NOTICE.txt
drwxrwxr-x  3 hadoop hadoop  4096 Jul 11 13:57 output
-rw-r--r--  1 hadoop hadoop  1366 Mar 24  2016 README.txt
drwxr-xr-x  3 hadoop hadoop  4096 Jul  9 02:45 sbin
drwxr-xr-x  4 hadoop hadoop  4096 Mar 24  2016 share
drwxr-xr-x 17 hadoop hadoop  4096 Mar 24  2016 src
[hadoop@hadoop001 hadoop]$ sbin/start-dfs.sh
19/07/11 14:44:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop001]
hadoop001: starting namenode, logging to /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop001.out
hadoop001: starting datanode, logging to /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop001.out
Starting secondary namenodes [hadoop001]
hadoop001: starting secondarynamenode, logging to /home/hadoop/software/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop001.out
19/07/11 14:44:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop001 hadoop]$ 
[root@hadoop001 data]# mkdir tmp
[root@hadoop001 data]# chmod -R 777 /data/tmp
[root@hadoop001 data]# ll
[root@hadoop001 data]# cd tmp
[root@hadoop001 tmp]# ll
total 0
[root@hadoop001 tmp]# pwd
/data/tmp
[root@hadoop001 tmp]# ll
total 12
-rw-rw-r-- 1 hadoop hadoop 6 Jul 11 14:44 hadoop-hadoop-datanode.pid
-rw-rw-r-- 1 hadoop hadoop 6 Jul 11 14:44 hadoop-hadoop-namenode.pid
-rw-rw-r-- 1 hadoop hadoop 6 Jul 11 14:44 hadoop-hadoop-secondarynamenode.pid
[root@hadoop001 tmp]# 