Installing standalone (single-node) Hadoop on Linux

Prerequisites:

1. The JDK is installed successfully (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html)

2. hadoop-1.2.1.tar.gz has been downloaded (https://dist.apache.org/repos/dist/release/hadoop/common/hadoop-1.2.1/)
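
A quick way to confirm the JDK prerequisite (a generic check, not from the original write-up; it assumes the JDK above is installed and on PATH):

    java -version       # should report a 1.8.0_xx version
    echo $JAVA_HOME     # ideally points at the JDK directory, e.g. /usr/local/jdk1.8.0_101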

 

 

Install Hadoop

First copy hadoop-1.2.1.tar.gz into /usr/local, then extract it; the resulting contents of /usr/local are shown below.
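
The copy and extract can be done with commands along these lines (a sketch; the source path /root is an assumption, adjust it to wherever you downloaded the tarball):

    cp /root/hadoop-1.2.1.tar.gz /usr/local/    # source path is an assumption
    cd /usr/local
    tar -xzf hadoop-1.2.1.tar.gz                # unpacks into /usr/local/hadoop-1.2.1
    ls -l                                       # produces the listing below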


drwxr-xr-x.  2 root root     4096 9月  23 2011 bin
drwxr-xr-x.  2 root root     4096 9月  23 2011 etc
drwxr-xr-x.  2 root root     4096 9月  23 2011 games
drwxr-xr-x. 16 root root     4096 9月  14 16:42 hadoop-1.2.1
-rw-r--r--.  1 root root 63851630 9月  14 21:20 hadoop-1.2.1.tar.gz
drwxr-xr-x.  2 root root     4096 9月  23 2011 include
drwxr-xr-x.  8 uucp  143     4096 6月  22 17:50 jdk1.8.0_101
drwxr-xr-x.  2 root root     4096 9月  23 2011 lib
drwxr-xr-x.  2 root root     4096 9月  23 2011 libexec
drwxr-xr-x.  2 root root     4096 9月  23 2011 sbin
drwxr-xr-x.  5 root root     4096 9月  14 2016 share
drwxr-xr-x.  2 root root     4096 9月  23 2011 src


Configure Hadoop

0. Take a look at what is inside the Hadoop directory, as shown below.


drwxr-xr-x.  2 root root    4096 9月  14 16:32 bin
-rw-rw-r--.  1 root root  121130 7月  23 2013 build.xml
drwxr-xr-x.  4 root root    4096 7月  23 2013 c++
-rw-rw-r--.  1 root root  493744 7月  23 2013 CHANGES.txt
drwxr-xr-x.  2 root root    4096 9月  14 21:30 conf
drwxr-xr-x. 10 root root    4096 7月  23 2013 contrib
drwxr-xr-x.  6 root root    4096 9月  14 16:31 docs
-rw-rw-r--.  1 root root    6842 7月  23 2013 hadoop-ant-1.2.1.jar
-rw-rw-r--.  1 root root     414 7月  23 2013 hadoop-client-1.2.1.jar
-rw-rw-r--.  1 root root 4203147 7月  23 2013 hadoop-core-1.2.1.jar
-rw-rw-r--.  1 root root  142726 7月  23 2013 hadoop-examples-1.2.1.jar
-rw-rw-r--.  1 root root     417 7月  23 2013 hadoop-minicluster-1.2.1.jar
-rw-rw-r--.  1 root root 3126576 7月  23 2013 hadoop-test-1.2.1.jar
-rw-rw-r--.  1 root root  385634 7月  23 2013 hadoop-tools-1.2.1.jar
drwxr-xr-x.  2 root root    4096 9月  14 16:31 ivy
-rw-rw-r--.  1 root root   10525 7月  23 2013 ivy.xml
drwxr-xr-x.  5 root root    4096 9月  14 16:31 lib
drwxr-xr-x.  2 root root    4096 9月  14 16:32 libexec
-rw-rw-r--.  1 root root   13366 7月  23 2013 LICENSE.txt
drwxr-xr-x.  4 root root    4096 9月  14 22:03 logs
-rw-rw-r--.  1 root root     101 7月  23 2013 NOTICE.txt
-rw-rw-r--.  1 root root    1366 7月  23 2013 README.txt
drwxr-xr-x.  2 root root    4096 9月  14 16:32 sbin
drwxr-xr-x.  3 root root    4096 7月  23 2013 share
drwxr-xr-x. 16 root root    4096 9月  14 16:32 src
drwxr-xr-x.  9 root root    4096 7月  23 2013 webapps


1. Open conf/hadoop-env.sh and add the following lines:

    vi /usr/local/hadoop-1.2.1/conf/hadoop-env.sh

export JAVA_HOME=/usr/local/jdk1.8.0_101
export HADOOP_HOME=/usr/local/hadoop-1.2.1
export PATH=$PATH:/usr/local/hadoop-1.2.1/bin

   The edited section of the file then looks like this:

---------------------------------------------------------------------------------------------

# remote nodes.

# The java implementation to use.  Required.
export JAVA_HOME=/usr/local/jdk1.8.0_101
export HADOOP_HOME=/usr/local/hadoop-1.2.1
export PATH=$PATH:/usr/local/hadoop-1.2.1/bin


-------------------------------------------------------------------------------------- 
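
Note that hadoop-env.sh is sourced only by the Hadoop scripts, so the PATH export above does not affect your login shell; to run the hadoop command from any directory (as the format step later does), you may also want the same exports in ~/.bashrc or /etc/profile. A quick sanity check that JAVA_HOME is picked up (not part of the original write-up):

    /usr/local/hadoop-1.2.1/bin/hadoop version    # first line should read: Hadoop 1.2.1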

2. Open conf/core-site.xml

   Configure it with the following content (dfs.replication is conventionally set in hdfs-site.xml, but it also takes effect here because core-site.xml is read by every Hadoop process):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
</configuration>
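
Since hadoop.tmp.dir points at /home/hadoop/tmp, you can create that directory ahead of time (the format step below will also create it, so this is optional):

    mkdir -p /home/hadoop/tmp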

3. Open mapred-site.xml in the conf directory

   Configure it with the following content:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

Run and test

1. Format the namenode; the output is shown below.

  hadoop namenode -format

 

[root@linux-01 hadoop-1.2.1]# hadoop namenode  -format
Warning: $HADOOP_HOME is deprecated.

16/09/14 22:10:18 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = linux-01/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.8.0_101
************************************************************/
Re-format filesystem in /home/hadoop/tmp/dfs/name ? (Y or N) Y
16/09/14 22:10:23 INFO util.GSet: Computing capacity for map BlocksMap
16/09/14 22:10:23 INFO util.GSet: VM type       = 32-bit
16/09/14 22:10:23 INFO util.GSet: 2.0% max memory = 1013645312
16/09/14 22:10:23 INFO util.GSet: capacity      = 2^22 = 4194304 entries
16/09/14 22:10:23 INFO util.GSet: recommended=4194304, actual=4194304
16/09/14 22:10:23 INFO namenode.FSNamesystem: fsOwner=root
16/09/14 22:10:23 INFO namenode.FSNamesystem: supergroup=supergroup
16/09/14 22:10:23 INFO namenode.FSNamesystem: isPermissionEnabled=true
16/09/14 22:10:23 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
16/09/14 22:10:23 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
16/09/14 22:10:23 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
16/09/14 22:10:23 INFO namenode.NameNode: Caching file names occuring more than 10 times
16/09/14 22:10:23 INFO common.Storage: Image file /home/hadoop/tmp/dfs/name/current/fsimage of size 110 bytes saved in 0 seconds.
16/09/14 22:10:23 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/home/hadoop/tmp/dfs/name/current/edits
16/09/14 22:10:23 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/home/hadoop/tmp/dfs/name/current/edits
16/09/14 22:10:23 INFO common.Storage: Storage directory /home/hadoop/tmp/dfs/name has been successfully formatted.
16/09/14 22:10:23 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at linux-01/127.0.0.1
************************************************************/



 

  You may hit an error at this point, particularly after formatting the namenode several times (the original post showed the error and its fix as figures 19 and 20, which are not reproduced here); after applying the fix, run the format command above again.

2. Start Hadoop; the output is shown below.

 ./bin/start-all.sh


[root@linux-01 hadoop-1.2.1]# ./bin/start-all.sh
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /usr/local/hadoop-1.2.1/logs/hadoop-root-namenode-linux-01.out
root@localhost's password:
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: starting datanode, logging to /usr/local/hadoop-1.2.1/logs/hadoop-root-datanode-linux-01.out
root@localhost's password:
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: starting secondarynamenode, logging to /usr/local/hadoop-1.2.1/logs/hadoop-root-secondarynamenode-linux-01.out
starting jobtracker, logging to /usr/local/hadoop-1.2.1/logs/hadoop-root-jobtracker-linux-01.out
root@localhost's password:
localhost: Warning: $HADOOP_HOME is deprecated.
localhost:
localhost: starting tasktracker, logging to /usr/local/hadoop-1.2.1/logs/hadoop-root-tasktracker-linux-01.out
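
start-all.sh logs in to localhost over SSH to start each daemon, which is why it asks for root@localhost's password several times. Setting up passwordless SSH to localhost avoids the repeated prompts; a typical setup with standard OpenSSH commands (not part of the original write-up) is:

    mkdir -p ~/.ssh && chmod 700 ~/.ssh
    ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa        # key pair with an empty passphrase
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
    ssh localhost                                   # should now log in without a password prompt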


3. Verify that Hadoop started successfully.

   Use jps, as shown below.


[root@linux-01 hadoop-1.2.1]# jps
25991 DataNode
26361 TaskTracker
24827 FsShell
26428 Jps
26204 JobTracker
26124 SecondaryNameNode
25855 NameNode
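
Besides jps, the daemons expose web interfaces on the standard Hadoop 1.x ports, which can be opened in a browser as a further check:

    http://localhost:50070    # NameNode (HDFS) status page
    http://localhost:50030    # JobTracker (MapReduce) status page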




Run the bundled wordcount example (exciting!)

1. Prepare the file to run wordcount on, as shown below (type any string into test.txt, then save and exit).

[root@linux-01 tmp]# touch test.txt

[root@linux-01 tmp]# vi test.txt

hello , welcome hadoop !!!

Save and quit with :wq
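
A quick check that the file contains what you typed (not in the original write-up):

    cat /tmp/test.txt         # should print: hello , welcome hadoop !!!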

-------------------------------------------------------------------------------------------

2. Upload the test file from the previous step to the firstTest directory in the DFS file system, as shown below (if there is no firstTest directory under DFS, one with that name is created automatically; use the command bin/hadoop dfs -ls to see which directories already exist in DFS).

 bin/hadoop dfs -copyFromLocal /tmp/test.txt firstTest
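
To confirm the upload, use the dfs -ls command mentioned above:

    bin/hadoop dfs -ls              # lists what is now in your DFS home directory
    bin/hadoop dfs -ls firstTest    # shows the uploaded test data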

 

3. Run wordcount, as shown below (this runs wordcount over all files under firstTest and writes the word counts to the result folder; if the result folder does not exist, it is created automatically).

 bin/hadoop jar hadoop-examples-1.2.1.jar wordcount firstTest result


[root@linux-01 hadoop-1.2.1]#  bin/hadoop jar hadoop-examples-1.2.1.jar wordcount firstTest result
Warning: $HADOOP_HOME is deprecated.

16/09/14 22:41:42 INFO input.FileInputFormat: Total input paths to process : 1
16/09/14 22:41:42 INFO util.NativeCodeLoader: Loaded the native-hadoop library
16/09/14 22:41:42 WARN snappy.LoadSnappy: Snappy native library not loaded
16/09/14 22:41:42 INFO mapred.JobClient: Running job: job_201609142236_0004
16/09/14 22:41:43 INFO mapred.JobClient:  map 0% reduce 0%
16/09/14 22:41:47 INFO mapred.JobClient:  map 100% reduce 0%
16/09/14 22:41:55 INFO mapred.JobClient:  map 100% reduce 33%
16/09/14 22:41:56 INFO mapred.JobClient:  map 100% reduce 100%
16/09/14 22:41:57 INFO mapred.JobClient: Job complete: job_201609142236_0004
16/09/14 22:41:57 INFO mapred.JobClient: Counters: 29
16/09/14 22:41:57 INFO mapred.JobClient:   Map-Reduce Framework
16/09/14 22:41:57 INFO mapred.JobClient:     Spilled Records=10
16/09/14 22:41:57 INFO mapred.JobClient:     Map output materialized bytes=63
16/09/14 22:41:57 INFO mapred.JobClient:     Reduce input records=5
16/09/14 22:41:57 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=589844480
16/09/14 22:41:57 INFO mapred.JobClient:     Map input records=1
16/09/14 22:41:57 INFO mapred.JobClient:     SPLIT_RAW_BYTES=106
16/09/14 22:41:57 INFO mapred.JobClient:     Map output bytes=47
16/09/14 22:41:57 INFO mapred.JobClient:     Reduce shuffle bytes=63
16/09/14 22:41:57 INFO mapred.JobClient:     Physical memory (bytes) snapshot=190750720
16/09/14 22:41:57 INFO mapred.JobClient:     Reduce input groups=5
16/09/14 22:41:57 INFO mapred.JobClient:     Combine output records=5
16/09/14 22:41:57 INFO mapred.JobClient:     Reduce output records=5
16/09/14 22:41:57 INFO mapred.JobClient:     Map output records=5
16/09/14 22:41:57 INFO mapred.JobClient:     Combine input records=5
16/09/14 22:41:57 INFO mapred.JobClient:     CPU time spent (ms)=760
16/09/14 22:41:57 INFO mapred.JobClient:     Total committed heap usage (bytes)=177016832
16/09/14 22:41:57 INFO mapred.JobClient:   File Input Format Counters
16/09/14 22:41:57 INFO mapred.JobClient:     Bytes Read=27
16/09/14 22:41:57 INFO mapred.JobClient:   FileSystemCounters
16/09/14 22:41:57 INFO mapred.JobClient:     HDFS_BYTES_READ=133
16/09/14 22:41:57 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=109333
16/09/14 22:41:57 INFO mapred.JobClient:     FILE_BYTES_READ=63
16/09/14 22:41:57 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=37
16/09/14 22:41:57 INFO mapred.JobClient:   Job Counters
16/09/14 22:41:57 INFO mapred.JobClient:     Launched map tasks=1
16/09/14 22:41:57 INFO mapred.JobClient:     Launched reduce tasks=1
16/09/14 22:41:57 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=8646
16/09/14 22:41:57 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
16/09/14 22:41:57 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=4553
16/09/14 22:41:57 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
16/09/14 22:41:57 INFO mapred.JobClient:     Data-local map tasks=1
16/09/14 22:41:57 INFO mapred.JobClient:   File Output Format Counters
16/09/14 22:41:57 INFO mapred.JobClient:     Bytes Written=37


4. View the result, as shown below.

 bin/hadoop dfs -cat result/part-r-00000


[root@linux-01 hadoop-1.2.1]#  bin/hadoop dfs -cat result/part-r-00000
Warning: $HADOOP_HOME is deprecated.

!!!    1
,    1
hadoop    1
hello    1
welcome    1
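
To run the example again, remove the old output first, since MapReduce refuses to write into an existing output directory; and when you are done, the daemons can be stopped with stop-all.sh:

    bin/hadoop dfs -rmr result    # delete the previous wordcount output
    ./bin/stop-all.sh             # stop all Hadoop daemons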

