Ubuntu配置hadoop单机+伪分布式环境+eclipse--配置hadoop伪分布式环境(三)

Ubuntu配置hadoop伪分布式环境


本篇文章介绍Ubuntu配置hadoop伪分布式环境。

快速入口链接:
Ubuntu安装hadoop的准备工作
配置hadoop单机环境
Ubuntu配置hadoop+eclipse


1)修改配置文件

需要修改的配置的文件为“mapred-site.xml”和“yarn-site.xml”。
在目录“HADOOP_HOME/etc/hadoop”中可以找到文件”mapred-site.xml.template”,进入目录“HADOOP_HOME/etc/hadoop”,将文件名称修改为“mapred-site.xml”,
$mv mapred-site.xml.template mapred-site.xml
打开“mapred-site.xml”,
$gedit mapred-site.xml
修改为如下:

<configuration>
        <property>
             <name>mapreduce.framework.name</name>
             <value>yarn</value>
        </property>
</configuration>

找到文件“yarn-site.xml”,并修改,
打开“yarn-site.xml”,
$gedit yarn-site.xml
修改为如下:

<configuration>
        <property>
             <name>yarn.nodemanager.aux-services</name>
             <value>mapreduce_shuffle</value>
            </property>
</configuration>

如果不想启动 yarn,需要把配置文件 “mapred-site.xml ”重命名为 “mapred-site.xml.template”,需要用时再改回来。否则在该配置文件存在,而未开启 yarn的情况下,运行程序会提示 “Retrying connect to server: 0.0.0.0/0.0.0.0:8032″ 的错误,这也是该配置文件初始文件名为 “mapred-site.xml.template”的原因。

2)启动hadoop和yarn

单机环境读取的是本地文件,伪分布式环境读取的是HDFS上的文件。
启动hadoop
$hdfs-start.sh
输入
$jps
启动hadoop
启动yarn
$yarn-start.sh
输入
$jps
启动jps
可以看到增加了”ResourceMAnager”和”NodeManager”。
首先创建用户目录使用的用户为hadoop,故用户名hadoop,
$ hdfs dfs -mkdir -p /user/hadoop
(-mkdir -p 目录,表示没有后面的目录才新建)
再创建“input”目录,
$ hdfs dfs -mkdir input
使用hadoop用户时,当前目录在“/user/hadoop”中,
$hdfs dfs -put *.xml input
查看hadoop文件目录,
$hdfs dfs -ls input
或者$hdfs dfs -ls user/hadoop/input(好像配置环境变量后这个不能用了,不知道什么原因)
查看input目录

运行一个hadoop任务
$hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep input output ‘dfs[a-z.]+’
运行hadoop任务

出现错误

16/10/23 01:38:48 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/10/23 01:38:57 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
16/10/23 01:38:59 INFO input.FileInputFormat: Total input paths to process : 9
16/10/23 01:39:04 INFO mapreduce.JobSubmitter: number of splits:9
16/10/23 01:39:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477209859612_0001
16/10/23 01:39:05 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
16/10/23 01:39:07 INFO impl.YarnClientImpl: Submitted application application_1477209859612_0001
16/10/23 01:39:08 INFO mapreduce.Job: The url to track the job: http://ubuntu:8088/proxy/application_1477209859612_0001/
16/10/23 01:39:08 INFO mapreduce.Job: Running job: job_1477209859612_0001
16/10/23 01:40:45 INFO mapreduce.Job: Job job_1477209859612_0001 running in uber mode : false
16/10/23 01:40:50 INFO mapreduce.Job:  map 0% reduce 0%
16/10/23 02:03:58 INFO mapreduce.Job:  map 67% reduce 0%

然后虚拟机卡死
可能是未配置环境变量,重新配置hadoop环境变量,

export HADOOP_MAPRED_HOME=\$HADOOP_HOME 
export HADOOP_COMMON_HOME=\$HADOOP_HOME 
export HADOOP_HDFS_HOME=\$HADOOP_HOME 
export YARN_HOME=\$HADOOP_HOME 
export HADOOP_COMMON_LIB_NATIVE_DIR=\$HADOOP_HOME/lib/native 
export PATH=\$PATH:\$HADOOP_HOME/sbin:\$HADOOP_HOME/bin 
export HADOOP_INSTALL=\$HADOOP_HOME

我也是醉了,现在要用
$start-dfs.sh
开启hadoop,再开启yarn,
$start-yarn.sh
启动以后查看了http://localhost:8088/http://localhost:50070/,一切正常,
运行一个hadoop任务,进入HAOOP_HOME,
$hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output ‘dfs[a-z.]+’
或者$hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output ‘dfs[a-z.]+’
运行出现错误

16/10/23 06:20:24 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
16/10/23 06:20:25 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:26 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:27 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:28 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:29 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:30 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:31 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:32 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:33 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:34 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:34 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
16/10/23 06:20:35 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:36 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:37 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:38 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:39 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:40 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:41 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:42 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:43 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:44 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:44 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
16/10/23 06:20:45 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:46 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:47 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:48 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:49 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:50 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:53 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:54 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
16/10/23 06:20:55 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
java.io.IOException: java.net.ConnectException: Call From ubuntu/127.0.1.1 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:337)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:422)
    at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:575)
    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:325)
    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:322)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:322)
    at org.apache.hadoop.mapreduce.Job.isSuccessful(Job.java:622)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1329)
    at org.apache.hadoop.examples.Grep.run(Grep.java:77)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.Grep.main(Grep.java:101)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.ConnectException: Call From ubuntu/127.0.1.1 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy14.getJobReport(Unknown Source)
    at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:323)
    ... 26 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 34 more

错误

后来又试,
$hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduceexamples-2.2.0.jar wordcount input ouput
运行成功,
查看结果,注意后面的*,
$hdfs dfs -cat output/*
这里只截取部分结果
查看结果
如果之前存在output文件,可能无法运行,故要先删除,
$hdfs dfs -rm -r output

先关闭yarn,
$stop-yarn.sh
再关闭hadoop,
$stop-dfs.sh


hadoop web控制台页面的端口:

50070
8088

这两个就够用了


下篇文章介绍Ubuntu配置hadoop+eclipse。
Ubuntu配置hadoop单机+伪分布式环境+eclipse(四)

  • 1
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值