Hadoop study notes - Day 5 - Rebuilding the full-cluster environment

Basic configuration notes

    Although the full cluster previously appeared to be configured correctly, later learning tests (for example with Pig) still produced various errors. Switching to pseudo-distributed mode worked fine, so I suspected the fully distributed configuration. Today I redid it from scratch.

    I abandoned the material collected earlier from various places on the web (the previous configuration files were really a "fusion" of several sources) and went back to the official documentation at http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ClusterSetup.html to revise the configuration. About 70% of the settings mentioned there are configured below; the rest seem unnecessary for now. After all, there is no point copying everything from http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml into the config files.

core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdpNameNode:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
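fs.defaultFS is an ordinary URI: the scheme selects the filesystem implementation and host:port locate the NameNode RPC endpoint. For illustration, a plain-Python parse of the value above (no Hadoop required):

```python
# fs.defaultFS is a URI; "hdfs" selects the HDFS client implementation,
# and hdpNameNode:9000 is the NameNode RPC endpoint clients connect to.
from urllib.parse import urlparse

uri = urlparse("hdfs://hdpNameNode:9000")
print(uri.scheme)  # hdfs
print(uri.netloc)  # hdpNameNode:9000
print(uri.port)    # 9000
```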

hdfs-site.xml

<configuration>
   <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/home/hdpuser/dfs/name</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>file:/home/hdpuser/dfs/data</value>
   </property>
   <property>
       <name>dfs.blocksize</name>
       <value>268435456</value>
   </property>
   <property>
       <name>dfs.namenode.handler.count</name>
       <value>100</value>
   </property>
</configuration>
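dfs.blocksize is specified in bytes, so the 268435456 above is 256 MiB (the value used in the official cluster-setup example for large file systems). A quick arithmetic check:

```python
# dfs.blocksize in hdfs-site.xml is given in bytes; 268435456 bytes = 256 MiB.
MiB = 1024 * 1024
blocksize = 256 * MiB
print(blocksize)         # 268435456
print(blocksize // MiB)  # 256
```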

mapred-site.xml

If the jobhistory properties below are not configured and the jobhistory service is not started, jobs emit warnings. No screenshot was kept, but roughly it is an error where the client keeps trying to connect to the history server and fails.

<configuration>
<property>
 <name>mapreduce.framework.name</name>
 <value>yarn</value>
</property>

<property>
 <name>mapreduce.jobhistory.address</name>
 <value>hdpNameNode:10020</value>
</property>

<property>
 <name>mapreduce.jobhistory.webapp.address</name>
 <value>hdpNameNode:19888</value>
</property>


</configuration>
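Both jobhistory settings are plain host:port pairs: MapReduce clients talk RPC to mapreduce.jobhistory.address, while the browser uses mapreduce.jobhistory.webapp.address. A small illustrative sketch (pure Python, no Hadoop needed) that splits the values and confirms the two services use distinct ports:

```python
# Split a Hadoop "host:port" address into its parts; the two jobhistory
# endpoints share a host here but must bind different ports.
def split_hostport(addr):
    host, _, port = addr.rpartition(":")
    return host, int(port)

rpc = split_hostport("hdpNameNode:10020")
web = split_hostport("hdpNameNode:19888")
print(rpc)  # ('hdpNameNode', 10020)
print(web)  # ('hdpNameNode', 19888)
assert rpc[1] != web[1]  # RPC and web UI cannot share a port
```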

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->

  <property>
    <name>yarn.acl.enable</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.admin.acl</name>
    <value>*</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>false</value>
  </property>

  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>hdpNameNode:18040</value>
  </property>

  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hdpNameNode:18030</value>
  </property>

  <property>
    <description>The address of the resource tracker interface.</description>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hdpNameNode:8025</value>
  </property>

  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hdpNameNode:8026</value>
  </property>

  <property> 
    <description>The address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hdpNameNode:18088</value>
  </property>

   <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>



   <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>

   <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>


   <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>


</configuration>
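Hadoop property names are case-sensitive, so a mistyped name such as a capitalized "Yarn." prefix is silently treated as an unknown property and the built-in default is used instead, with no error at startup. A minimal sketch of a sanity check over a *-site.xml file (the embedded sample XML is hypothetical):

```python
# Flag property names in a Hadoop *-site.xml fragment that contain
# uppercase characters; Hadoop would silently ignore such names.
import xml.etree.ElementTree as ET

SAMPLE = """<configuration>
  <property>
    <name>Yarn.resourcemanager.webapp.address</name>
    <value>hdpNameNode:18088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hdpNameNode:18040</value>
  </property>
</configuration>"""

def misnamed_properties(xml_text):
    root = ET.fromstring(xml_text)
    names = [p.findtext("name") for p in root.iter("property")]
    return [n for n in names if n != n.lower()]

print(misnamed_properties(SAMPLE))  # ['Yarn.resourcemanager.webapp.address']
```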

Testing and verification

1. Set the environment variable export HADOOP_CONF_DIR=$HADOOP_HOME/conf. Previously I always used the default $HADOOP_HOME/etc/hadoop; now that there are several environments, this variable selects which configuration directory to use.
2. Format HDFS with hadoop namenode -format, and create the /user/hdpuser directory.
3. start-all.sh
4. Start the jobhistory server: mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver
5. Copy a file to HDFS:
hadoop fs -copyFromLocal  /home/hdpuser/pig-0.12.0/tutorial/data/excite-small.log /user/hdpuser/excite-small.log
6. pig
7. grunt> log = LOAD '/user/hdpuser/excite-small.log' AS (user:chararray, time:long, query:chararray);
8. lmt = LIMIT log 4;
9. DUMP lmt;
10. The result looks correct:
2013-12-08 14:45:59,790 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-12-08 14:45:59,793 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2013-12-08 14:45:59,793 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2013-12-08 14:45:59,801 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-12-08 14:45:59,801 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(2A9EABFB35F5B954,970916105432,+md foods +proteins)
(BED75271605EBD0C,970916001949,yahoo chat)
(BED75271605EBD0C,970916001954,yahoo chat)
(BED75271605EBD0C,970916003523,yahoo chat)

In the same environment, the WordCount test also runs normally (first copy the file to be counted to HDFS):

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.2.0-sources.jar org.apache.hadoop.examples.WordCount /user/hdpuser/input /user/hdpuser/output
