Hadoop Environment Configuration: Cluster Setup - 1

Steps 1-5 below are all performed as the root user on the Linux system inside the virtual machine.
1. As root, configure the static IP, netmask, gateway, and DNS resolution:
IPADDR=192.168.220.128  (the Linux system's IP address)
NETMASK=255.255.255.0
GATEWAY=192.168.220.2   (the Linux system's gateway)
DNS1=202.106.0.20       (this one works; any reachable DNS server will do)


2. As root, open /etc/sysconfig/network-scripts/ifcfg-ens33 with vi and append the following:
IPADDR=192.168.220.128
NETMASK=255.255.255.0
GATEWAY=192.168.220.2
DNS1=202.106.0.20
Also change the existing line to BOOTPROTO=static. See the sketch of the resulting file below.
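For reference, a minimal sketch of what ifcfg-ens33 might end up looking like (any keys not mentioned in step 2, such as TYPE, NAME, DEVICE, and UUID, stay whatever the installer generated; ONBOOT=yes is an assumption here so the interface comes up at boot):
TYPE=Ethernet
NAME=ens33
DEVICE=ens33
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.220.128
NETMASK=255.255.255.0
GATEWAY=192.168.220.2
DNS1=202.106.0.20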


3. Restart the network service: systemctl restart network


4. As root, change the hostname.
The default hostname is localhost; change it to python333:
vi /etc/hostname and leave only this content: python333
Then update the host mapping:
vi /etc/hosts and append: 192.168.220.128 python333


5. Reboot the Linux system: reboot
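After the reboot, a quick sanity check of steps 1-4 (a sketch; both are standard commands):
hostname              (should print python333)
ping -c 1 python333   (should get a reply from 192.168.220.128)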


6. Tell the Windows host how to reach the Linux VM: in C:\Windows\System32\drivers\etc, append 192.168.220.128 python333 to the hosts file.
You can then log in to the Linux system remotely with ssh python333.
-----------------------------------------------------------------------------------
All of the following steps are performed as the hadoop user.
1. As the hadoop user, extract the Hadoop archive into /home/hadoop/opt/: tar -zxvf hadoop-2.9.0.tar.gz -C opt


2. Configure the Hadoop environment variables. As the hadoop user, open .bashrc in the home directory with vi and append the following:
export HADOOP_HOME=/home/hadoop/opt/hadoop-2.9.0
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin


3. Then run source .bashrc to reload the settings, and type hadoop to verify the command is found.
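A quick sketch of the check (which and hadoop version are standard commands):
source .bashrc
which hadoop       (should print /home/hadoop/opt/hadoop-2.9.0/bin/hadoop)
hadoop version     (should report version 2.9.0)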


4. Edit the Hadoop configuration files: go to /home/hadoop/opt/hadoop-2.9.0/etc/hadoop and add the entries below.
4.1. core-site.xml: set the default filesystem to HDFS; this is also the address HDFS clients and browser requests use:
<property>
<name>fs.defaultFS</name>
<value>hdfs://python333:9000</value> 
</property>
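Note that every <property> block in 4.1-4.4 goes inside the <configuration> element that already exists in the corresponding file; core-site.xml, for example, ends up roughly like this:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://python333:9000</value>
  </property>
</configuration>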


4.2. hdfs-site.xml: change the default block replication factor from 3 to 1, put the NameNode and DataNode storage directories under /home/hadoop/opt/tmp (created in step 6.1 below), and set the NameNode web UI address:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/opt/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/opt/tmp/dfs/data</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>python333:50070</value>
</property>


4.3. mapred-site.xml: run MapReduce on the YARN resource-scheduling framework.
Note: this file must first be created from the template: cp mapred-site.xml.template mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>


4.4. yarn-site.xml: configure the YARN ResourceManager host and the MapReduce shuffle service:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>python333</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>


4.5. slaves: list the DataNode hostname(s):
python333
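In this single-machine setup python333 doubles as NameNode and DataNode. For a multi-node cluster, slaves would instead list each DataNode host on its own line, for example (hypothetical hostnames):
python334
python335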


5. As root, disable SELinux and the firewall.
5.1. Disable SELinux permanently (temporary disable: setenforce 0):
vi /etc/selinux/config and change the line to:
SELINUX=disabled
5.2. Disable the firewall (check its status with systemctl status firewalld):
Stop it for the current session: systemctl stop firewalld
Disable it permanently: systemctl disable firewalld
5.3. Reboot the machine: reboot


6. Log out of the root account and, as the hadoop user, go to /home/hadoop/opt/:
6.1. Create a tmp directory: mkdir tmp
6.2. Format the HDFS filesystem: hdfs namenode -format


7. Set up SSH keys (a public/private key pair; this can be done from the home directory).
7.1. Run ssh-keygen -t rsa and press Enter through every prompt to generate the key pair.
7.2. Copy the public key to each machine you want passwordless login to: ssh-copy-id python333, then type yes to confirm.
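A sketch of the whole exchange for this single host (all three commands are standard OpenSSH tools):
ssh-keygen -t rsa          (accept the defaults at every prompt)
ssh-copy-id python333      (enter the hadoop user's password when asked)
ssh python333              (should now log in without a password)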


8. Verify (as the hadoop user, from the home directory).
(Alternatively, start HDFS and YARN separately with start-dfs.sh and start-yarn.sh.)
8.1. Run start-all.sh
8.2. Run jps
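On this single-node setup, if everything started cleanly, jps should list roughly these daemons (PIDs will differ): NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, plus the Jps process itself. If one is missing, its log file under $HADOOP_HOME/logs is the first place to look.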


------------------------------------------------------------
1. In a browser, open http://python333:50070 (the NameNode web UI address set in 4.2).


2. Create the HDFS home directory: hadoop fs -mkdir -p /user/hadoop
Note: if this fails with an error such as 'Name node is in safe mode', go into the hadoop-2.9.0 directory and run: bin/hadoop dfsadmin -safemode leave


3. Upload a test file: hadoop fs -put data1.txt
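A minimal round trip to confirm HDFS works, assuming a small local file named data1.txt (the sample contents below are hypothetical):
echo "hello hadoop" > data1.txt
hadoop fs -put data1.txt /user/hadoop/
hadoop fs -ls /user/hadoop
hadoop fs -cat /user/hadoop/data1.txt
Without an explicit destination, hadoop fs -put data1.txt writes into the current user's HDFS home directory, /user/hadoop, created in step 2.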


http://rpmfind.net/linux/rpm2html/search.php?query=libmpfr.so.4%28%29%2864bit%29+&submit=Search+...&system=&arch=  #site for searching and downloading Linux RPM packages

