I'd had a bit of exposure to Hadoop before, but never dug in. Lately it has seemed more and more interesting, so I decided to study it properly. A couple of days ago I bought two Hadoop books and skimmed through them; today I'm getting hands-on and setting up the environment first.
Environment: CentOS 6.2, JDK 7u45, Hadoop 2.2.0
I'll skip the download and extraction steps and go straight to configuration (setting JAVA_HOME and the HADOOP_HOME environment variable is also omitted). Reference: http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/SingleCluster.html
1. Set JAVA_HOME in hadoop-env.sh
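The line in etc/hadoop/hadoop-env.sh would look something like the following; the path below is just an example for an RPM-installed JDK 7u45, so use your actual install path:
export JAVA_HOME=/usr/java/jdk1.7.0_45  # example path; adjust to your JDK location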
2. Edit the core-site.xml configuration file
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <final>true</final>
  </property>
</configuration>
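Note that hadoop.tmp.dir points at /data/hadoop/tmp, which must be writable by the user running Hadoop. If the path doesn't exist yet, it can save trouble to create it up front (adjust the path and ownership to your own setup):
mkdir -p /data/hadoop/tmp  # assumes you have write access under /data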
3. Edit the hdfs-site.xml configuration file
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/hadoop/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hadoop/dfs/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
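Similarly, you may want to pre-create the name and data directories so there are no permission surprises later; the format step in part 1) below can create the name directory itself, but only if the parent directory is writable:
mkdir -p /data/hadoop/dfs/name /data/hadoop/dfs/data  # adjust paths to match the values above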
4. Copy mapred-site.xml.template to mapred-site.xml, then edit mapred-site.xml
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!--
  <property>
    <name>mapreduce.cluster.temp.dir</name>
    <value></value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.cluster.local.dir</name>
    <value></value>
    <final>true</final>
  </property>
  -->
</configuration>
5. Edit the yarn-site.xml configuration file
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
    <description>hostname of the RM</description>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:5274</value>
    <description>host is the hostname of the resource manager and
    port is the port on which the NodeManagers contact the Resource Manager.
    </description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:5273</value>
    <description>host is the hostname of the resourcemanager and port is the port
    on which the Applications in the cluster talk to the Resource Manager.
    </description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    <description>In case you do not want to use the default scheduler</description>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:5271</value>
    <description>the host is the hostname of the ResourceManager and the port is the port on
    which the clients can talk to the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value></value>
    <description>the local directories used by the nodemanager</description>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>localhost:5272</value>
    <description>the nodemanagers bind to this port</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>10240</value>
    <description>the amount of memory on the NodeManager in MB</description>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
    <description>directory on hdfs where the application logs are moved to</description>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value></value>
    <description>the directories used by Nodemanagers as log directories</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>shuffle service that needs to be set for Map Reduce to run</description>
  </property>
</configuration>
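Before starting anything, it can save time to run a quick well-formedness check over the XML files you just edited (this assumes xmllint from libxml2 is installed, which it usually is on CentOS):
xmllint --noout core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml  # prints nothing if the XML is well-formed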
At this point, the single-machine Hadoop configuration is complete.
1) Next, format the namenode and then start it:
hadoop namenode -format
hadoop-daemon.sh start namenode
You can check the logs linked from http://localhost:50070/dfshealth.jsp (the file whose name contains namenode*.log) to confirm whether startup succeeded; if there are no errors in it, the namenode is up.
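You can also tail the log directly from the shell; with HADOOP_HOME set as mentioned at the start, hadoop-daemon.sh writes logs into the logs directory by default (the file name includes the Unix user and hostname):
tail -n 50 $HADOOP_HOME/logs/hadoop-*-namenode-*.log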
2) Then start the HDFS datanode:
hadoop-daemon.sh start datanode
Likewise, you can find the corresponding log file from that page (the one whose name contains datanode*.log); if it shows no errors and the datanode has connected to the namenode, it started successfully.
You can also type jps on the command line to check whether the processes are listed.
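At this point the jps output should look roughly like this (the PIDs will of course differ on your machine):
$ jps
2345 NameNode
2456 DataNode
2567 Jps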
3) Next, start YARN:
yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
Use the same method as above to verify that they started successfully.
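After these two commands, jps should additionally show the ResourceManager and NodeManager processes. And since the yarn-site.xml above does not override yarn.resourcemanager.webapp.address, the ResourceManager web UI should still be reachable on its default port:
http://localhost:8088/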
Finally, go into the hadoop-2.2.0/share/hadoop/mapreduce directory and run a test:
hadoop jar hadoop-mapreduce-examples-2.2.0.jar randomwriter out
and check whether the job runs successfully.
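If the job finishes, randomwriter will have written its data into HDFS under the out directory given on the command line above; a quick way to confirm:
hadoop fs -ls out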