A first Hadoop experiment on Linux: environment setup and pseudo-distributed mode

所有昵称都被用了呢

Category: hadoop. Tags: hadoop

Article link: https://blog.csdn.net/a33b33C33/article/details/41107839

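Just getting this far nearly used up all my patience; I ran into countless problems... People have started calling me the computer killer, ugh...
Enough talk.
First, install the system. I used Ubuntu 10.04 LTS - Lucid Lynx.
The Ruijie client is pretty brutal, and the servers in our dorm area don't seem to get along with me either, so I just bought wireless access... But right now the Software Center doesn't work at all, so everything has to be installed by hand...
Installing the JDK also gave me a real headache. Luckily there is a good site: none of the other methods worked for me, and the one on this page worked best!
http://www.linuxdiyf.com/viewarticle.php?id=216845
I also tried doing all this in a virtual machine, but it was really slow and a hassle, and it would probably only get more painful later on...
So I went back to a native install...
I think my biggest mistake was following the Chinese version of the official Hadoop Quick Start page! Only after reading the English version did I realize that the Chinese site was last updated ages ago!! I failed so many times... got lost so many times... and was going around in circles the whole time!!!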
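There is no need to be confused about downloading Hadoop! http://mirror.bit.edu.cn/apache//hadoop/common/hadoop-1.0.0/
Since I'm on Ubuntu, I only downloaded hadoop-1.0.0.tar.gz.
For other Linux systems, Red Hat apparently uses the file ending in rpm.
Unpack the tarball (hadoop-1.0.0.tar.gz here; the same step applies to hadoop-0.20.2.tar.gz) and configure the conf/hadoop-env.sh file. Add export JAVA_HOME=/usr/lib/jvm/java-6-sun, changing the path to wherever your JDK is installed.
PATH also needs to be extended in ~/.bashrc: append :/your-hadoop-install-directory/bin to the existing value (the leading : is required).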
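The PATH step could look roughly like the lines below in ~/.bashrc. This is only a sketch: the variable name HADOOP_DIR, the helper add_to_path, and the install location $HOME/hadoop-1.0.0 are my own assumptions, so adjust them to your setup. I avoided the name HADOOP_HOME because, as far as I know, Hadoop 1.x prints a deprecation warning when that variable is set.

```shell
# Hypothetical install location; change it to wherever you unpacked the tarball.
HADOOP_DIR="${HADOOP_DIR:-$HOME/hadoop-1.0.0}"

# Append a directory to PATH only if it is not already in there,
# so sourcing ~/.bashrc repeatedly does not keep growing PATH.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;       # note the ":" separator
    esac
}

add_to_path "$HADOOP_DIR/bin"
export HADOOP_DIR PATH
```

The case check is just there so that running source ~/.bashrc several times stays harmless.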

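To check that it works, run source ~/.bashrc and then:

hadoop version

If hadoop prints its version information, it worked~
Next comes setting up standalone mode and pseudo-distributed mode.
For this I relied entirely on the English version of the Hadoop site.
1 Standalone mode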
$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
$ cat output/*
If it prints just the one line, 1 dfsadmin, that is correct!
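As far as I understand it, the grep example job finds every match of the regular expression in the input files and counts how often each matched string occurs. A rough local equivalent with ordinary Unix tools, handy for sanity-checking the result, might look like this (count_matches is a name I made up, not part of Hadoop):

```shell
# Count every match of an extended regex across the given files,
# most frequent first. Prints lines of the form "<count> <match>".
count_matches() {
    pattern=$1
    shift
    # -o: print only the matched text, -h: no file names, -E: extended regex
    grep -ohE "$pattern" "$@" | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}
```

For the standalone run above you would call it as count_matches 'dfs[a-z.]+' input/*.xml and compare with cat output/*.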
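2 Pseudo-distributed mode: edit the configuration
For example, gedit conf/core-site.xml opens the file, and you can just copy and paste the code below (the formatting is a bit off and I was too lazy to fix it, so go to the official site!)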
conf/core-site.xml:

<configuration>
     <property>
         <name>fs.default.name</name>
         <value>hdfs://localhost:9000</value>
     </property>
</configuration>

conf/hdfs-site.xml:

<configuration>
     <property>
         <name>dfs.replication</name>
         <value>1</value>
     </property>
</configuration>

conf/mapred-site.xml:

<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
     </property>
</configuration>



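3 Configure ssh
For me, simply running

ssh localhost

worked right away, with no password needed.

If that doesn't work for you, try this:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

and then run ssh localhost again.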
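If ssh localhost still asks for a password after that, one common cause is file permissions: with its default StrictModes setting, sshd refuses key login when ~/.ssh or authorized_keys is writable by other users. I did not hit this myself, so treat the following as a sketch; fix_ssh_perms is a helper name I made up.

```shell
# Tighten permissions on an ssh directory (defaults to ~/.ssh).
# fix_ssh_perms is a hypothetical helper name, not a standard tool.
fix_ssh_perms() {
    dir="${1:-$HOME/.ssh}"
    chmod 700 "$dir" || return 1                       # only the owner may enter
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys" || return 1   # owner read/write only
    fi
}
```

Run fix_ssh_perms with no argument to fix your own ~/.ssh, then try ssh localhost once more.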


4 Format HDFS

$ bin/hadoop namenode -format



5 Start the Hadoop daemons:

          
$ bin/start-all.sh

If at this point you get messages like

This script is Deprecated. Instead use start-dfs.sh and start-mapred.sh
namenode running as process xxxx. Stop it first.
localhost: datanode running as process xxxx. Stop it first.
localhost: secondarynamenode running as process xxxx. Stop it first.
jobtracker running as process xxxx. Stop it first.
localhost: tasktracker running as process xxxx. Stop it first.

just run $ bin/stop-all.sh

and then $ bin/start-all.sh again, and you're good!
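To confirm that all five daemons named in those messages are really up, the JDK's jps tool lists the running Java processes. Here is a small sketch that reads jps output and reports whichever daemons are missing. The function name missing_daemons is my own, and the class names are the ones jps shows for Hadoop 1.x as far as I know.

```shell
# Read `jps` output on stdin and print the name of every expected
# Hadoop 1.x daemon that is NOT listed there.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <ClassName>"; compare the whole second field,
        # so that SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$daemon"; then
            echo "$daemon"
        fi
    done
}
```

Use it as jps | missing_daemons; if nothing is printed, all five are running.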

6 Copy the conf folder under the Hadoop install directory, with its contents, into HDFS:
$ bin/hadoop fs -put conf input

7 Run the provided example
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

To display the result: $ bin/hadoop fs -cat output/*
If the output is:
1 dfs.replication
1 dfs.server.namenode.
1 dfsadmin
then it worked!

All of the above is purely for my own records... my memory isn't great...

Appendix: if you run into this kind of error:

11/08/18 14:56:37 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 0 time(s).
11/08/18 14:56:39 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 1 time(s).
11/08/18 14:56:41 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 2 time(s).
11/08/18 14:56:43 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 3 time(s).
11/08/18 14:56:45 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 4 time(s).
11/08/18 14:56:47 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 5 time(s).
11/08/18 14:56:49 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 6 time(s).
11/08/18 14:56:51 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 7 time(s).
11/08/18 14:56:53 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 8 time(s).
11/08/18 14:56:55 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8888. Already tried 9 time(s).

Bad connection to FS. command aborted. exception: Call to localhost/127.0.0.1:8888 failed on connection exception: java.net.ConnectException: Connection refused: no further information

The error message is "Bad connection to FS. command aborted. exception: Call to localhost/127.0.0.1:8888 failed on connection exception: java.net.ConnectException: Connection refused: no further information".

Reformatting the namenode from scratch fixes it.

The format command is:

 bin/hadoop namenode -format

Once that succeeds, restart Hadoop and it works.
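One thing to note: the log is retrying port 8888, while the core-site.xml earlier in this post uses 9000, so that log presumably came from a setup with a different fs.default.name. Since reformatting wipes whatever is stored in HDFS, it may be worth first checking which address your own config points to. A minimal sketch, where namenode_address is a name I made up and the file is assumed to use the simple layout shown earlier:

```shell
# Print the host:port that fs.default.name points at in a core-site.xml.
# Assumes the plain <name>...</name><value>hdfs://host:port</value> layout.
namenode_address() {
    # Flatten the file to one line, then pull out the <value> that
    # follows the fs.default.name entry.
    tr -d '\n' < "$1" |
        sed -n 's|.*<name>fs\.default\.name</name>[[:space:]]*<value>hdfs://\([^<]*\)</value>.*|\1|p'
}
```

Calling namenode_address conf/core-site.xml should print localhost:9000 for the config in this post. If the port in the error differs from that, or jps shows no NameNode, the problem may not need a reformat at all.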

References:
1 http://hadoop.apache.org/common/docs/stable/single_node_setup.html
2 http://www.linuxdiyf.com/viewarticle.php?id=216845
3 http://www.bwkeji.com/a/wangzhanjichu/kaifa/20110819/1643.html