Hadoop Installation

 

Hadoop has three run modes:

1. Standalone mode (non-distributed)

2. Pseudo-distributed mode (separate processes on a single machine simulate the components of a distributed cluster)

3. Fully distributed mode

 

1. Standalone Mode

[hadoop@ hadoop_home]$ cd hadoop-0.20.205.0

[hadoop@ hadoop-0.20.205.0]$ mkdir input

[hadoop@ hadoop-0.20.205.0]$ cp conf/*.xml input/

[hadoop@ hadoop-0.20.205.0]$ vim conf/hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.6.0_29

[hadoop@ hadoop-0.20.205.0]$ hadoop jar hadoop-examples-0.20.205.0.jar grep input output 'dfs[a-z.]+'

……

[hadoop@ hadoop-0.20.205.0]$ hadoop fs -cat output/part-*

1       dfsadmin

[hadoop@ hadoop-0.20.205.0]$
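In standalone mode there is no HDFS; the input and output directories above live on the local filesystem, so the result can also be read with plain cat (a quick check, run from the same directory as the job):

cat output/part-*
1       dfsadmin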

 

2. Pseudo-Distributed Mode

Reference: http://hadoop.apache.org/common/docs/current/single_node_setup.html

I. Install the JDK

1. Check the Java version:

[root@ ~]# java -version

java version "1.4.2"

gcj (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)

Copyright (C) 2004 Free Software Foundation, Inc.

This is free software; see the source for copying conditions.  There is NO

warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 

2. Download the JDK from http://www.oracle.com/technetwork/java/javase/downloads/jdk-6u29-download-513648.html, choosing the build that matches your machine's architecture. For a 64-bit Linux machine: jdk-6u29-linux-x64.bin (Linux x64, 81.45 MB).

 

3. As root, run:

chmod +x jdk-6u29-linux-x64.bin

./jdk-6u29-linux-x64.bin

mkdir -p /usr/java

mv ./jdk1.6.0_29 /usr/java/jdk1.6.0_29

4. Edit ~/.bash_profile and add:

JAVA_HOME=/usr/java/jdk1.6.0_29

JAVA_BIN=/usr/java/jdk1.6.0_29/bin

PATH=$JAVA_HOME/bin:$PATH

CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export JAVA_HOME JAVA_BIN PATH CLASSPATH

5. Reload ~/.bash_profile:

source ~/.bash_profile

6. Check the Java version again:

[root@ jdk]# java -version

java version "1.6.0_29"

Java(TM) SE Runtime Environment (build 1.6.0_29-b11)

Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode)

 

The JDK is now installed successfully.

 

II. Create the hadoop User

1. useradd hadoop

2. passwd hadoop

3. Log in as the hadoop user: su hadoop

 

III. Passwordless SSH

1. ssh-keygen -t rsa -P ''

2. cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

3. chmod 600 ~/.ssh/authorized_keys

4. ssh localhost

If step 4 logs you in without prompting for a password, passwordless SSH is working; a stricter check is sketched below.
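The check below (a sketch, assuming a standard OpenSSH client) forces batch mode, so ssh fails instead of falling back to a password prompt when key-based login is not actually in use:

ssh -o BatchMode=yes localhost 'echo passwordless ssh ok'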

 

IV. Install Hadoop

5. Download hadoop-0.20.205.0.tar.gz from http://labs.renren.com/apache-mirror//hadoop/common/

6. tar -zxvf hadoop-0.20.205.0.tar.gz

7. Add the following environment variables to ~/.bash_profile, then source the file:

HADOOP_HOME=/home/hadoop/hadoop_home/hadoop-0.20.205.0

export HADOOP_HOME

PATH=$HADOOP_HOME/bin:$PATH

export PATH
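After sourcing ~/.bash_profile, a quick sanity check (using only the variables set above) confirms that the hadoop launcher is now on the PATH:

echo $HADOOP_HOME
hadoop version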

8. Edit $HADOOP_HOME/conf/hadoop-env.sh:

# The java implementation to use.  Required.

export JAVA_HOME=/usr/java/jdk1.6.0_29

 

9. Edit $HADOOP_HOME/conf/hdfs-site.xml:

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

         <property>

                <name>dfs.replication</name>

                <value>1</value>

                <final>true</final>

         </property>

</configuration>

10. Edit $HADOOP_HOME/conf/core-site.xml:

 

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

        <property>

                <name>fs.default.name</name>

                <value>hdfs://localhost:9000</value>

                <final>true</final>

        </property>

</configuration>

11. Edit $HADOOP_HOME/conf/mapred-site.xml:

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

        <property>

                <name>mapred.job.tracker</name>

                <value>localhost:9001</value>

                <final>true</final>

        </property>

</configuration>
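With all three *-site.xml files edited, a quick well-formedness check helps catch malformed XML before the daemons are started (a sketch, assuming xmllint from libxml2 is installed):

xmllint --noout $HADOOP_HOME/conf/core-site.xml $HADOOP_HOME/conf/hdfs-site.xml $HADOOP_HOME/conf/mapred-site.xml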

12. List the current processes: ps x

13. Format the NameNode: hadoop namenode -format

14. Start everything: start-all.sh (equivalent to running start-dfs.sh followed by start-mapred.sh)

 

15. ps x

16. jps

17. hadoop fs -mkdir yangkai

18. hadoop fs -ls

19. stop-all.sh

20. ps x

21. jps
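For reference, while the daemons are up, jps should list roughly the following processes (typical output; PIDs omitted), and after stop-all.sh only Jps itself remains:

NameNode
DataNode
SecondaryNameNode
JobTracker
TaskTracker
Jps

The web UIs are served on the standard 0.20 ports: the NameNode at http://localhost:50070/ and the JobTracker at http://localhost:50030/ (the latter also appears in the streaming job log further below).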

 

22. Important paths

Directory tree:

/tmp/hadoop-hadoop
|--dfs                   // HDFS data
|  |--data               // DataNode
|  |  |--blocksBeingWritten
|  |  |--current
|  |  |  |--subdir0
|  |  |  |--subdir1
|  |  |  |--subdir10
|  |  |  |--...
|  |  |  |--subdir63
|  |  |  |--subdir7
|  |  |  |--subdir8
|  |  |  |--subdir9
|  |  |--detach
|  |  |--tmp
|  |--name               // NameNode
|  |  |--current
|  |  |--image
|  |  |--previous.checkpoint
|  |--namesecondary      // SecondaryNameNode
|  |  |--current
|  |  |--image
|--mapred                // MapReduce
|  |--local
|  |  |--localRunner
|  |  |  |--tmp
|  |  |--taskTracker     // TaskTracker
|  |  |--tt_log_tmp
|  |  |--ttprivate
|  |  |--userlogs
|  |--staging
|  |  |--hadoop1501661639
|  |  |  |--.staging
|  |  |--hadoop1997916211
|  |  |  |--.staging

 

Default NameNode storage directory: /tmp/hadoop-hadoop/dfs/name/

Hadoop daemon log path: ${HADOOP_HOME}/logs

Job log path: ${HADOOP_HOME}/logs/userlogs/job_201112141219_0001/attempt_201112141219_0001_m_000010_0 (each attempt directory contains three log files: stderr, stdout, and syslog)
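When a daemon fails to come up, its log under ${HADOOP_HOME}/logs is the first place to look (a sketch; the hadoop-<user>-<daemon>-<hostname>.log naming is the usual 0.20 default):

ls ${HADOOP_HOME}/logs/
tail -n 100 ${HADOOP_HOME}/logs/hadoop-hadoop-namenode-*.log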

 

V. Verify the Installation

1. hadoop jar ${HADOOP_HOME}/hadoop-test-0.20.205.0.jar TestDFSIO -write -nrFiles 10 -fileSize 10

[hadoop@ conf]$ hadoop jar ${HADOOP_HOME}/hadoop-test-0.20.205.0.jar TestDFSIO -write -nrFiles 10 -fileSize 10

Warning: $HADOOP_HOME is deprecated.

 

TestDFSIO.0.0.4

11/12/15 11:19:34 INFO fs.TestDFSIO: nrFiles = 10

11/12/15 11:19:34 INFO fs.TestDFSIO: fileSize (MB) = 10

11/12/15 11:19:34 INFO fs.TestDFSIO: bufferSize = 1000000

11/12/15 11:19:35 INFO fs.TestDFSIO: creating control file: 10 mega bytes, 10 files

11/12/15 11:19:36 INFO fs.TestDFSIO: created control files for: 10 files

11/12/15 11:19:36 INFO mapred.FileInputFormat: Total input paths to process : 10

11/12/15 11:19:36 INFO mapred.JobClient: Running job: job_201112151118_0001

11/12/15 11:19:37 INFO mapred.JobClient:  map 0% reduce 0%

11/12/15 11:19:57 INFO mapred.JobClient:  map 20% reduce 0%

11/12/15 11:20:03 INFO mapred.JobClient:  map 30% reduce 0%

11/12/15 11:20:06 INFO mapred.JobClient:  map 40% reduce 0%

11/12/15 11:20:09 INFO mapred.JobClient:  map 50% reduce 0%

11/12/15 11:20:15 INFO mapred.JobClient:  map 70% reduce 13%

11/12/15 11:20:21 INFO mapred.JobClient:  map 70% reduce 16%

11/12/15 11:20:24 INFO mapred.JobClient:  map 90% reduce 16%

11/12/15 11:20:27 INFO mapred.JobClient:  map 90% reduce 23%

11/12/15 11:20:30 INFO mapred.JobClient:  map 100% reduce 23%

11/12/15 11:20:39 INFO mapred.JobClient:  map 100% reduce 100%

11/12/15 11:20:44 INFO mapred.JobClient: Job complete: job_201112151118_0001

11/12/15 11:20:44 INFO mapred.JobClient: Counters: 30

11/12/15 11:20:44 INFO mapred.JobClient:   Job Counters

11/12/15 11:20:44 INFO mapred.JobClient:     Launched reduce tasks=1

11/12/15 11:20:44 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=68890

11/12/15 11:20:44 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

11/12/15 11:20:44 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0

11/12/15 11:20:44 INFO mapred.JobClient:     Launched map tasks=10

11/12/15 11:20:44 INFO mapred.JobClient:     Data-local map tasks=10

11/12/15 11:20:44 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=39780

11/12/15 11:20:44 INFO mapred.JobClient:   File Input Format Counters

11/12/15 11:20:44 INFO mapred.JobClient:     Bytes Read=1120

11/12/15 11:20:44 INFO mapred.JobClient:   File Output Format Counters

11/12/15 11:20:44 INFO mapred.JobClient:     Bytes Written=76

11/12/15 11:20:44 INFO mapred.JobClient:   FileSystemCounters

11/12/15 11:20:44 INFO mapred.JobClient:     FILE_BYTES_READ=833

11/12/15 11:20:44 INFO mapred.JobClient:     HDFS_BYTES_READ=2360

11/12/15 11:20:44 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=237551

11/12/15 11:20:44 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=104857676

11/12/15 11:20:44 INFO mapred.JobClient:   Map-Reduce Framework

11/12/15 11:20:44 INFO mapred.JobClient:     Map output materialized bytes=887

11/12/15 11:20:44 INFO mapred.JobClient:     Map input records=10

11/12/15 11:20:44 INFO mapred.JobClient:     Reduce shuffle bytes=798

11/12/15 11:20:44 INFO mapred.JobClient:     Spilled Records=100

11/12/15 11:20:44 INFO mapred.JobClient:     Map output bytes=727

11/12/15 11:20:44 INFO mapred.JobClient:     Total committed heap usage (bytes)=1929248768

11/12/15 11:20:44 INFO mapred.JobClient:     CPU time spent (ms)=10520

11/12/15 11:20:44 INFO mapred.JobClient:     Map input bytes=260

11/12/15 11:20:44 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1240

11/12/15 11:20:44 INFO mapred.JobClient:     Combine input records=0

11/12/15 11:20:44 INFO mapred.JobClient:     Reduce input records=50

11/12/15 11:20:44 INFO mapred.JobClient:     Reduce input groups=5

11/12/15 11:20:44 INFO mapred.JobClient:     Combine output records=0

11/12/15 11:20:44 INFO mapred.JobClient:     Physical memory (bytes) snapshot=2002497536

11/12/15 11:20:44 INFO mapred.JobClient:     Reduce output records=5

11/12/15 11:20:44 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=5408600064

11/12/15 11:20:44 INFO mapred.JobClient:     Map output records=50

11/12/15 11:20:44 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write

11/12/15 11:20:44 INFO fs.TestDFSIO:            Date & time: Thu Dec 15 11:20:44 CST 2011

11/12/15 11:20:44 INFO fs.TestDFSIO:        Number of files: 10

11/12/15 11:20:44 INFO fs.TestDFSIO: Total MBytes processed: 100

11/12/15 11:20:44 INFO fs.TestDFSIO:      Throughput mb/sec: 24.48579823702253

11/12/15 11:20:44 INFO fs.TestDFSIO: Average IO rate mb/sec: 28.83795738220215

11/12/15 11:20:44 INFO fs.TestDFSIO:  IO rate std deviation: 8.542554732984893

11/12/15 11:20:44 INFO fs.TestDFSIO:     Test exec time sec: 68.417

11/12/15 11:20:44 INFO fs.TestDFSIO:

[hadoop@ conf]$
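TestDFSIO leaves its test files in HDFS (under /benchmarks/TestDFSIO by default); once the benchmark is done they can be removed with the same jar's clean switch (a sketch):

hadoop jar ${HADOOP_HOME}/hadoop-test-0.20.205.0.jar TestDFSIO -clean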

 

2. A simple Hadoop Streaming test (the mapper and reducer below are identity scripts that pass every input line through unchanged, so the job simply re-emits the input after the shuffle)

Scripts:

[hadoop@ hadoopTest1]$ cat run.sh

#!/bin/sh

 

HADOOP_PATH="/user/hadoop/yangkai"

 

hadoop fs -test -d ${HADOOP_PATH}

if [ 0 -ne $? ]

then

        echo "${HADOOP_PATH} doesn't exist! We need to create it!"

        hadoop fs -mkdir ${HADOOP_PATH}

        if [ 0 -ne $? ]

        then

                echo "hadoop fs -mkdir ${HADOOP_PATH} faild!"

                exit 1

        fi

fi

 

PROGRAM_NAME="hadoopTest"

 

INPUT_FILES="${HADOOP_PATH}/data.txt"

hadoop fs -test -e ${INPUT_FILES}

if [ 0 -eq $? ]

then

        hadoop fs -rmr ${INPUT_FILES}

        if [ 0 -ne $? ]

        then

                echo "hadoop fs -rmr ${INPUT_FILES} faild!"

                exit 1

        fi

fi

 

hadoop fs -put data.txt ${INPUT_FILES}

if [ 0 -ne $? ]

then

        echo "hadoop fs -put data.txt ${INPUT_FILES} faild!"

        exit 1

fi

 

OUTPUT_DIR="${HADOOP_PATH}/test"

hadoop fs -test -d ${OUTPUT_DIR}

if [ 0 -eq $? ]

then

        echo "${OUTPUT_DIR} already exist! We need to remove it!"

        hadoop fs -rmr ${OUTPUT_DIR}

        if [ 0 -ne $? ]

        then

                echo "hadoop fs -rmr ${OUTPUT_DIR} faild!"

                exit 1

        fi

fi

 

hadoop jar ${HADOOP_HOME}/contrib/streaming/hadoop-streaming-0.20.205.0.jar \

-D mapred.job.name="${PROGRAM_NAME}" \

-input ${INPUT_FILES} \

-output ${OUTPUT_DIR} \

-mapper "mapper.sh" \

-reducer "reducer.sh" \

-file "mapper.sh" \

-file "reducer.sh"

 

if [ 0 -eq $? ]

then

        echo "success!"

else

        echo "failed!"

fi

 

[hadoop@ hadoopTest1]$ cat mapper.sh

#!/bin/sh

 

awk '

BEGIN{}

{

        print $0;

}

END{}

'

[hadoop@ hadoopTest1]$ cat reducer.sh

#!/bin/sh

 

awk '

BEGIN{}

{

        print $0;

}

END{}

'

 

Run output:

[hadoop@ hadoopTest1]$ sh run.sh

Warning: $HADOOP_HOME is deprecated.

 

Warning: $HADOOP_HOME is deprecated.

 

Warning: $HADOOP_HOME is deprecated.

 

Deleted hdfs://localhost:9002/user/hadoop/yangkai/data.txt

Warning: $HADOOP_HOME is deprecated.

 

Warning: $HADOOP_HOME is deprecated.

 

test: File does not exist: /user/hadoop/yangkai/test

Warning: $HADOOP_HOME is deprecated.

 

packageJobJar: [mapper.sh, reducer.sh] [/home/hadoop/hadoop_home/hadoop-0.20.205.0/contrib/streaming/hadoop-streaming-0.20.205.0.jar] /tmp/streamjob2707334478803608361.jar tmpDir=null

11/12/15 11:37:00 INFO mapred.FileInputFormat: Total input paths to process : 1

11/12/15 11:37:00 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-hadoop/mapred/local]

11/12/15 11:37:00 INFO streaming.StreamJob: Running job: job_201112151118_0003

11/12/15 11:37:00 INFO streaming.StreamJob: To kill this job, run:

11/12/15 11:37:00 INFO streaming.StreamJob: /home/hadoop/hadoop_home/hadoop-0.20.205.0/libexec/../bin/hadoop job  -Dmapred.job.tracker=localhost:9001 -kill job_201112151118_0003

11/12/15 11:37:00 INFO streaming.StreamJob: Tracking URL: http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201112151118_0003

11/12/15 11:37:01 INFO streaming.StreamJob:  map 0%  reduce 0%

11/12/15 11:37:13 INFO streaming.StreamJob:  map 100%  reduce 0%

11/12/15 11:37:22 INFO streaming.StreamJob:  map 100%  reduce 33%

11/12/15 11:37:28 INFO streaming.StreamJob:  map 100%  reduce 100%

11/12/15 11:37:34 INFO streaming.StreamJob: Job complete: job_201112151118_0003

11/12/15 11:37:34 INFO streaming.StreamJob: Output: /user/hadoop/yangkai/test

success!
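The job output lands in HDFS under the OUTPUT_DIR defined in run.sh; to inspect it locally (same paths as in the script), it can be printed or copied out:

hadoop fs -cat /user/hadoop/yangkai/test/part-*
hadoop fs -get /user/hadoop/yangkai/test/part-00000 .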

Input file:

[hadoop@ hadoopTest1]$ more data.txt

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 3 4 5 6 7 8 9

1 3 4 5 6 7 8 9

1 3 4 5 6 7 8 9

1 3 4 5 6 7 8 9

 

Result file (the lines match the input but are re-ordered: the streaming shuffle sorts map output by key, and with no tab in the line the whole line is the key):

[hadoop@ hadoopTest1]$ more part-00000

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 1 2 3 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

1 3 4 5 6 7 8 9

1 3 4 5 6 7 8 9

1 3 4 5 6 7 8 9

1 3 4 5 6 7 8 9

VI. Errors Encountered and Fixes

1. Error: unable to get address of epoll functions

[hadoop@ attempt_201112141219_0001_m_000010_0]$ cat ${HADOOP_HOME}/logs/userlogs/job_201112141219_0001/attempt_201112141219_0001_m_000010_0/stderr

Exception in thread "main" java.lang.InternalError: unable to get address of epoll functions, pre-2.6 kernel?

        at sun.nio.ch.EPollArrayWrapper.init(Native Method)

        at sun.nio.ch.EPollArrayWrapper.<clinit>(EPollArrayWrapper.java:272)

        at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:52)

        at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18)

        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:407)

        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:322)

        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:203)

        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:604)

        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)

        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)

        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)

        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)

        at org.apache.hadoop.ipc.Client.call(Client.java:1046)

        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)

        at $Proxy1.getProtocolVersion(Unknown Source)

        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)

        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:370)

        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:420)

        at org.apache.hadoop.mapred.Child$1.run(Child.java:113)

        at org.apache.hadoop.mapred.Child$1.run(Child.java:110)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)

        at org.apache.hadoop.mapred.Child.main(Child.java:109)

[hadoop@ attempt_201112141219_0001_m_000010_0]$

Likely cause: mismatched JDK build (the machine is 64-bit while the installed JDK is 32-bit).

 

2. Warning: $HADOOP_HOME is deprecated.

# The Hadoop command script

#

………….

bin=`dirname "$0"`

bin=`cd "$bin"; pwd`

 

if [ "$HADOOP_HOME_WARN_SUPPRESS" == "" ] && [ "$HADOOP_HOME" != "" ]; then

  echo "Warning: \$HADOOP_HOME is deprecated." 1>&2

  echo 1>&2

fi

The snippet above is taken from http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security/bin/hadoop. The warning is harmless and purely informational: Hadoop reads its configuration from $HADOOP_HOME/conf by default, so it is simply reminding you to make sure the configuration directory is the intended one.

Fix: comment out lines 53-56 of $HADOOP_HOME/bin/hadoop:

#if [ "$HADOOP_HOME_WARN_SUPPRESS" == "" ] && [ "$HADOOP_HOME" != "" ]; then

#  echo "Warning: \$HADOOP_HOME is deprecated." 1>&2

#  echo 1>&2

#fi
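Judging from the check shown above, an alternative that avoids editing the script is to set the environment variable it tests, e.g. in ~/.bash_profile:

export HADOOP_HOME_WARN_SUPPRESS=1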

 

3. The DataNode does not start

Delete the contents under /tmp (the default hadoop.tmp.dir location in this setup), then restart; see the reset sketch below.

4. The NameNode does not start

Reformat it: hadoop namenode -format (see the reset sketch below)
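For a throw-away pseudo-distributed install like this one, both (3) and (4) can usually be cleared with a blunt reset (a sketch using the default paths from this guide; note that it wipes all HDFS data):

stop-all.sh
rm -rf /tmp/hadoop-hadoop
hadoop namenode -format
start-all.sh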
