Setting Up a Hadoop 2.7.1 Pseudo-Distributed Environment

System environment:
Ubuntu 15.10
Hadoop: 2.7.1
Java: 1.7.0_79

1. Install SSH and generate a key pair
sudo apt-get install ssh
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

2. Install the rsync synchronization tool:
sudo apt-get install rsync

3. Download JDK 1.7.0_79 and extract it to /usr/lib/java/.

4. Download Hadoop 2.7.1 and extract it to /hadoop:
donald_draper@rain:/hadoop$ tar -zxvf hadoop-2.7.1.tar.gz
5. Configure environment variables:
vim ~/.bashrc

Append the following at the end of the file:
export JAVA_HOME=/usr/lib/java/jdk1.7.0_79
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export HADOOP_HOME=/hadoop/hadoop-2.7.1
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${PATH}

Save and quit with :wq.
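After saving, reload the file with `source ~/.bashrc`. The PATH change can be sketched and spot-checked like this (same paths as configured above):

```shell
# Apply the new variables in the current shell
export JAVA_HOME=/usr/lib/java/jdk1.7.0_79
export HADOOP_HOME=/hadoop/hadoop-2.7.1
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${PATH}
# Both the Hadoop bin and sbin directories should now appear on PATH
echo "$PATH" | tr ':' '\n' | grep -c 'hadoop-2.7.1'
```

With the variables in place, `java -version` and `hadoop version` should both work from any directory.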
6. Configure Hadoop
All of Hadoop 2.7.1's configuration files live under /hadoop/hadoop-2.7.1/etc/hadoop.
cd /hadoop/hadoop-2.7.1/etc/hadoop
1) Edit hadoop-env.sh and set the JDK home directory:
export JAVA_HOME=/usr/lib/java/jdk1.7.0_79

2) Edit core-site.xml
donald_draper@rain:/hadoop/hadoop-2.7.1/etc/hadoop$ cat core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://rain:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
</property>
</configuration>
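Every *-site.xml file uses the same `<property>`/`<name>`/`<value>` layout, so a value can be pulled out with sed for a quick sanity check. A small sketch using a demo file with the same values as above (the /tmp path is just for illustration):

```shell
# Write a demo copy of the config, then extract one property value
cat > /tmp/core-site-demo.xml <<'EOF'
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://rain:9000</value></property>
  <property><name>hadoop.tmp.dir</name><value>/hadoop/tmp</value></property>
</configuration>
EOF
# Print the text between <value> tags on the fs.defaultFS line
fs_default=$(sed -n 's|.*<name>fs.defaultFS</name><value>\(.*\)</value>.*|\1|p' /tmp/core-site-demo.xml)
echo "$fs_default"
```

The same one-liner works against the real file under etc/hadoop when each property sits on one line.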

3) Edit hdfs-site.xml
donald_draper@rain:/hadoop/hadoop-2.7.1/etc/hadoop$ cat hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

4) Edit mapred-site.xml (the distribution ships only mapred-site.xml.template; copy it to mapred-site.xml first)
donald_draper@rain:/hadoop/hadoop-2.7.1/etc/hadoop$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- enable the job history server -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>rain:10020</value>
</property>

<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>rain:19888</value>
</property>
<!-- these directories live in HDFS; start DFS before starting the history server -->
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/history/indone</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/history/done</value>
</property>
</configuration>


5) Edit yarn-site.xml
donald_draper@rain:/hadoop/hadoop-2.7.1/etc/hadoop$ cat yarn-site.xml 
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

6) Edit slaves
The slaves file lists the worker nodes: when HDFS is started on the namenode host it reads slaves to find the datanodes, and when YARN is started on the resourcemanager host it reads slaves to find the nodemanagers. In this pseudo-distributed setup both roles run on the single host rain.
cd /hadoop/hadoop-2.7.1/etc/hadoop/
vim slaves
rain
7. Format HDFS:
hdfs namenode -format

8. Start HDFS:
cd /hadoop/hadoop-2.7.1/sbin/
donald_draper@rain:/hadoop/hadoop-2.7.1/sbin$ ./start-dfs.sh
Starting namenodes on [rain]
rain: starting namenode, logging to /hadoop/hadoop-2.7.1/logs/hadoop-donald_draper-namenode-rain.out
localhost: starting datanode, logging to /hadoop/hadoop-2.7.1/logs/hadoop-donald_draper-datanode-rain.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /hadoop/hadoop-2.7.1/logs/hadoop-donald_draper-secondarynamenode-rain.out

9. Start the job history server:
donald_draper@rain:/hadoop/hadoop-2.7.1/sbin$ ./mr-jobhistory-daemon.sh  start historyserver
starting historyserver, logging to /hadoop/hadoop-2.7.1/logs/mapred-donald_draper-historyserver-rain.out

10. Start YARN:
cd  /hadoop/hadoop-2.7.1/sbin/
donald_draper@rain:/hadoop/hadoop-2.7.1/sbin$ ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /hadoop/hadoop-2.7.1/logs/yarn-donald_draper-resourcemanager-rain.out
localhost: starting nodemanager, logging to /hadoop/hadoop-2.7.1/logs/yarn-donald_draper-nodemanager-rain.out

11. Check that HDFS and YARN are running:
donald_draper@rain:/hadoop/hadoop-2.7.1/logs$ jps
7114 DataNode
7743 NodeManager
8921 Jps
7607 ResourceManager
7319 SecondaryNameNode
8779 JobHistoryServer
6984 NameNode
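A quick way to confirm nothing is missing is to compare the jps output against the expected daemon list. A sketch, using the output above as a pasted sample (in practice, capture it with `jps_out=$(jps)`):

```shell
# All six daemons of a pseudo-distributed setup should be present
expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer"
jps_out="7114 DataNode
7743 NodeManager
7607 ResourceManager
7319 SecondaryNameNode
8779 JobHistoryServer
6984 NameNode"
missing=""
for d in $expected; do
  # -w matches whole words, so NameNode does not falsely match SecondaryNameNode
  printf '%s\n' "$jps_out" | grep -qw "$d" || missing="$missing $d"
done
if [ -z "$missing" ]; then echo "all daemons running"; else echo "missing:$missing"; fi
```

If a daemon is absent, its log file under $HADOOP_HOME/logs is the place to look.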

12. Run an example job
1)hdfs  dfs  -mkdir /test
2)hdfs dfs -mkdir /test/input
3)hdfs dfs -put etc/hadoop/*.xml /test/input
4)donald_draper@rain:/hadoop/hadoop-2.7.1$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar grep /test/input /test/output 'dfs[a-z.]+'

Job output:
16/08/15 11:37:50 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/08/15 11:37:52 INFO input.FileInputFormat: Total input paths to process : 9
16/08/15 11:37:52 INFO mapreduce.JobSubmitter: number of splits:9
16/08/15 11:37:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1471230621598_0001
16/08/15 11:37:53 INFO impl.YarnClientImpl: Submitted application application_1471230621598_0001
16/08/15 11:37:53 INFO mapreduce.Job: The url to track the job: http://rain:8088/proxy/application_1471230621598_0001/
16/08/15 11:37:53 INFO mapreduce.Job: Running job: job_1471230621598_0001
16/08/15 11:38:16 INFO mapreduce.Job: Job job_1471230621598_0001 running in uber mode : false
16/08/15 11:38:16 INFO mapreduce.Job: map 0% reduce 0%
16/08/15 11:45:11 INFO mapreduce.Job: map 67% reduce 0%
16/08/15 11:48:06 INFO mapreduce.Job: map 74% reduce 22%
16/08/15 11:48:22 INFO mapreduce.Job: map 89% reduce 22%
16/08/15 11:48:23 INFO mapreduce.Job: map 100% reduce 22%
16/08/15 11:48:49 INFO mapreduce.Job: map 100% reduce 30%
16/08/15 11:48:51 INFO mapreduce.Job: map 100% reduce 33%
16/08/15 11:48:54 INFO mapreduce.Job: map 100% reduce 67%
16/08/15 11:49:03 INFO mapreduce.Job: map 100% reduce 100%
16/08/15 11:49:25 INFO mapreduce.Job: Job job_1471230621598_0001 completed successfully
16/08/15 11:49:45 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=51
FILE: Number of bytes written=1156955
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=28205
HDFS: Number of bytes written=143
HDFS: Number of read operations=30
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Killed map tasks=2
Launched map tasks=11
Launched reduce tasks=1
Data-local map tasks=11
Total time spent by all maps in occupied slots (ms)=3308143
Total time spent by all reduces in occupied slots (ms)=227199
Total time spent by all map tasks (ms)=3308143
Total time spent by all reduce tasks (ms)=227199
Total vcore-seconds taken by all map tasks=3308143
Total vcore-seconds taken by all reduce tasks=227199
Total megabyte-seconds taken by all map tasks=3387538432
Total megabyte-seconds taken by all reduce tasks=232651776
Map-Reduce Framework
Map input records=781
Map output records=2
Map output bytes=41
Map output materialized bytes=99
Input split bytes=969
Combine input records=2
Combine output records=2
Reduce input groups=2
Reduce shuffle bytes=99
Reduce input records=2
Reduce output records=2
Spilled Records=4
Shuffled Maps =9
Failed Shuffles=0
Merged Map outputs=9
GC time elapsed (ms)=213752
CPU time spent (ms)=39770
Physical memory (bytes) snapshot=1636868096
Virtual memory (bytes) snapshot=7041122304
Total committed heap usage (bytes)=1388314624
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=27236
File Output Format Counters
Bytes Written=143
16/08/15 11:49:47 INFO ipc.Client: Retrying connect to server: rain/192.168.126.136:45795. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/08/15 11:49:48 INFO ipc.Client: Retrying connect to server: rain/192.168.126.136:45795. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/08/15 11:49:49 INFO ipc.Client: Retrying connect to server: rain/192.168.126.136:45795. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/08/15 11:49:50 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
16/08/15 11:50:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/08/15 11:50:51 INFO input.FileInputFormat: Total input paths to process : 1
16/08/15 11:50:51 INFO mapreduce.JobSubmitter: number of splits:1
16/08/15 11:50:53 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1471230621598_0002
16/08/15 11:50:53 INFO impl.YarnClientImpl: Submitted application application_1471230621598_0002
16/08/15 11:50:53 INFO mapreduce.Job: The url to track the job: http://rain:8088/proxy/application_1471230621598_0002/
16/08/15 11:50:53 INFO mapreduce.Job: Running job: job_1471230621598_0002
16/08/15 11:51:29 INFO mapreduce.Job: Job job_1471230621598_0002 running in uber mode : false
16/08/15 11:51:29 INFO mapreduce.Job: map 0% reduce 0%
16/08/15 11:51:39 INFO mapreduce.Job: map 100% reduce 0%
16/08/15 11:51:48 INFO mapreduce.Job: map 100% reduce 100%
16/08/15 11:51:51 INFO mapreduce.Job: Job job_1471230621598_0002 completed successfully
16/08/15 11:51:51 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=51
FILE: Number of bytes written=230397
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=276
HDFS: Number of bytes written=29
HDFS: Number of read operations=7
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=6533
Total time spent by all reduces in occupied slots (ms)=8187
Total time spent by all map tasks (ms)=6533
Total time spent by all reduce tasks (ms)=8187
Total vcore-seconds taken by all map tasks=6533
Total vcore-seconds taken by all reduce tasks=8187
Total megabyte-seconds taken by all map tasks=6689792
Total megabyte-seconds taken by all reduce tasks=8383488
Map-Reduce Framework
Map input records=2
Map output records=2
Map output bytes=41
Map output materialized bytes=51
Input split bytes=133
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=51
Reduce input records=2
Reduce output records=2
Spilled Records=4
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=59
CPU time spent (ms)=1660
Physical memory (bytes) snapshot=467501056
Virtual memory (bytes) snapshot=1429606400
Total committed heap usage (bytes)=276299776
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=143
File Output Format Counters
Bytes Written=29

View the results:
5)
donald_draper@rain:/hadoop/hadoop-2.7.1$  hdfs  dfs  -get /test/output   output
16/08/15 11:52:19 WARN hdfs.DFSClient: DFSInputStream has been closed already
16/08/15 11:52:19 WARN hdfs.DFSClient: DFSInputStream has been closed already

6)
donald_draper@rain:/hadoop/hadoop-2.7.1$ cat   output/* 
1 dfsadmin
1 dfs.replication


Note: the results can also be viewed directly in HDFS:
hdfs dfs -cat /test/output/*
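For reference, the example grep job computes roughly the same thing as this local pipeline: extract every string matching the pattern 'dfs[a-z.]+' and count occurrences. A sketch over a hypothetical sample file (not the real config files):

```shell
# Sample input lines; only the first two contain a match
printf '%s\n' '<name>dfs.replication</name>' 'run dfsadmin -report' 'no match here' > /tmp/grep-demo.txt
# -o prints each match on its own line; count duplicates like the reduce phase does
grep -oE 'dfs[a-z.]+' /tmp/grep-demo.txt | sort | uniq -c | sort -rn
```

This mirrors the two-record result above (dfsadmin and dfs.replication, once each).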

13. Stop Hadoop
stop-yarn.sh
mr-jobhistory-daemon.sh stop historyserver
stop-dfs.sh

Web UIs:
http://192.168.126.136:50070 (NameNode)
(screenshot: http://dl2.iteye.com/upload/attachment/0119/3595/d0459fd7-005a-3d31-b21d-2539f473dbac.png)

http://192.168.126.136:8088 (ResourceManager)
(screenshot: http://dl2.iteye.com/upload/attachment/0119/3597/1c9e74cc-aa15-30dd-8333-616e80c73b43.png)

http://192.168.126.136:19888 (JobHistoryServer)
(screenshot: http://dl2.iteye.com/upload/attachment/0119/3624/5be2679a-fa73-37ce-a151-c78b2d548a45.png)


Troubleshooting:
2016-08-15 11:28:50,625 FATAL org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: Error starting JobHistoryServer
org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:279)
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService.initializeWebApp(HistoryClientService.java:156)
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService.serviceStart(HistoryClientService.java:121)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceStart(JobHistoryServer.java:195)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.launchJobHistoryServer(JobHistoryServer.java:222)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:231)
Caused by: java.net.SocketException: Unresolved address
Fix:
Check the server address and web UI address configured in mapred-site.xml; the hostnames used there must resolve.
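The "Unresolved address" error usually means a hostname from the config (here, rain) has no DNS or /etc/hosts entry. A quick resolution check, shown with localhost as a stand-in hostname:

```shell
# Substitute the hostname from your mapred-site.xml, e.g. rain
host=localhost
if getent hosts "$host" > /dev/null; then
  echo "$host resolves"
else
  echo "$host does not resolve; map it in /etc/hosts"
fi
```

If the hostname does not resolve, add a line such as `192.168.126.136 rain` to /etc/hosts and restart the history server.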