Hadoop CDH4.4 Cluster Setup

After more than a year of PaaS development I have moved on to big data, so I am backing up some installation notes here.


Cluster layout


hadoop-001     10.168.204.55  NameNode,secondaryNameNode,ResourceManager
hadoop-002     10.168.204.56  DataNode,NodeManager
hadoop-003     10.168.204.57  DataNode,NodeManager
hadoop-004     10.168.204.58  DataNode,NodeManager

Hadoop version: CDH 4.4.0
CentOS version: 6.3

I. Preparation

   1. jdk 1.7

        http://download.oracle.com/otn-pub/java/jdk/7u45-b18/jdk-7u45-linux-x64.rpm

   sudo rpm -ivh jdk-7u45-linux-x64.rpm
   alternatives --install /usr/bin/java java /usr/java/jdk1.7.0_45/bin/java 300
   alternatives --install /usr/bin/javac javac /usr/java/jdk1.7.0_45/bin/javac 300
   alternatives --config java
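
   A quick check that the alternatives switch took effect:

   java -version     # should report java version "1.7.0_45"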


   2. Change the hostname

      

vim /etc/sysconfig/network  #change each server's hostname; takes effect after a reboot


      Configure /etc/hosts

  192.168.204.55 hadoop-001
  192.168.204.56 hadoop-002
  192.168.204.57 hadoop-003
  192.168.204.58 hadoop-004
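
  A quick sanity check on each node that the names resolve as listed above:

  hostname                # should print this node's own name, e.g. hadoop-001
  ping -c 1 hadoop-002    # should resolve to the address in /etc/hosts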

  3. Turn off the firewall

service iptables status
service iptables stop
chkconfig iptables off
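
To confirm the firewall stays off after a reboot:

chkconfig --list iptables    # every runlevel should show "off"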

  4. Disable SELinux

#set SELINUX=disabled in this file
vim /etc/selinux/config
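
A minimal sketch of the change; setenforce applies it immediately, the file edit makes it permanent:

sudo setenforce 0                                                    # Permissive until reboot
sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config    # Disabled after reboot
getenforce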

  5. Create the hadoop user and make it a sudoer

       

adduser hadoop
passwd  hadoop
   
sudo vim /etc/sudoers
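
For reference, the kind of line to add for the hadoop user (one common form; running visudo is safer than editing the file directly):

hadoop  ALL=(ALL)  ALL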

  6. ssh without passwd

#switch to the hadoop user first
ssh-keygen -t rsa
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys


Test that ssh hadoop-001 can log in to the local machine without a password prompt.
Then scp authorized_keys to the other slave servers.
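
One way to push the key to the slaves (assuming the hadoop user and ~/.ssh already exist on each of them; you will be asked for the password this one time):

for h in hadoop-002 hadoop-003 hadoop-004; do
  scp ~/.ssh/authorized_keys $h:~/.ssh/
  ssh $h 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'
done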

      

II. Installation

  1. Download the CDH4.4 tarball

     mkdir cdh4.4.0 && cd cdh4.4.0
     wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.4.0.tar.gz
     tar -xvzf hadoop-2.0.0-cdh4.4.0.tar.gz

  2. Set environment variables

   Edit /etc/profile or ~/.bashrc; I edited .bashrc here, either one works.

export JAVA_HOME=/usr/java/jdk1.7.0_45
export HADOOP_HOME=/home/hadoop/cdh4.4.0/hadoop-2.0.0-cdh4.4.0
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_LIB=$HADOOP_HOME/lib
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native

export PATH=$PATH:/etc/haproxy/sbin/:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$HADOOP_LIB/native/libhadoop.so

libhadoop.so is only needed later, when installing Impala.
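
After editing, reload the file and sanity-check that the variables point at the unpacked tree:

source ~/.bashrc
echo $HADOOP_HOME
$HADOOP_HOME/bin/hadoop version    # should report 2.0.0-cdh4.4.0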

  3. Configuration files

core-site.xml


<configuration>
  <property>
   <name>fs.default.name</name>
   <value>hdfs://hadoop-001:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
     <value>/hadoop/tmp</value>
  </property>
  <property>
     <name>fs.trash.interval</name>
     <value>10080</value>
  </property>
  <property>
     <name>fs.trash.checkpoint.interval</name>
     <value>10080</value>
  </property>
<!--  <property>
     <name>io.compression.codecs</name>
     <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.SnappyCodec
    </value>
  </property>
  <property>
     <name>io.compression.codec.lzo.class</name>
     <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>-->
  <!-- OOZIE -->
  <property>
      <name>hadoop.proxyuser.hadoop.hosts</name>
      <value>hadoop-001</value>
  </property>
  <property>
      <name>hadoop.proxyuser.hadoop.groups</name>
      <value>hadoop</value>
  </property>

</configuration>


hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
<!--  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
  </property>-->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hadoop/name</value>
    <final>true</final>
   </property>
   <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:/hadoop/data</value>
     <final>true</final>
   </property>
   <property>
      <name>dfs.permissions</name>
      <value>false</value>
   </property>
   <property>
      <name>dfs.namenode.http-address</name>
      <value>hadoop-001:50070</value>
   </property>
   <property>
      <name>dfs.secondary.http.address</name>
      <value>hadoop-001:50090</value>
   </property>
   <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
   </property>
   <!--for impala
   <property>
      <name>dfs.client.read.shortcircuit</name>
      <value>true</value>
   </property>
   <property>
      <name>dfs.domain.socket.path</name>
      <value>/var/run/hadoop-hdfs/dn._PORT</value>
   </property>
   <property>
      <name>dfs.client.file-block-storage-locations.timeout</name>
      <value>3000</value>
   </property>
   <property>
      <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
      <value>true</value>
   </property>-->
</configuration>


yarn-site.xml


<configuration>  
  
<!-- Site specific YARN configuration properties -->  
  <property>  
     <name>yarn.resourcemanager.resource-tracker.address</name>  
     <value>hadoop-001:18025</value>  
  </property>  
  <property>
     <name>yarn.resourcemanager.address</name>
     <value>hadoop-001:18040</value>
  </property>
  <property>
     <name>yarn.resourcemanager.scheduler.address</name>
     <value>hadoop-001:18030</value>
  </property>
  <property>
     <name>yarn.resourcemanager.admin.address</name>
     <value>hadoop-001:18141</value>
  </property>
  <property>
      <name>yarn.resourcemanager.webapp.address</name>
      <value>hadoop-001:8088</value>
   </property>
   <property>  
       <name>yarn.nodemanager.aux-services</name>  
       <value>mapreduce.shuffle</value>  
   </property>  
   <property>  
       <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>  
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>  
   </property>  
   <property>  
     <name>yarn.application.classpath</name>  
     <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*</value>  
   </property>  
</configuration>


mapred-site.xml

<configuration>  
   <property>  
      <name>mapreduce.framework.name</name>  
      <value>yarn</value>  
  </property>  
  <property>  
      <name>mapreduce.jobhistory.address</name>  
      <value>hadoop-001:10020</value>  
  </property>  
  <property>  
      <name>mapreduce.jobhistory.webapp.address</name>  
      <value>hadoop-001:19888</value>  
  </property>  
  <property>  
      <name>mapreduce.job.tracker</name>  
      <value>hadoop-001:8021</value>  
      <final>true</final>
  </property>  
  <property>  
      <name>mapred.system.dir</name>  
      <value>file:/hadoop/mapred/system</value>  
      <final>true</final>
  </property>  
  <property>  
       <name>mapred.local.dir</name>  
       <value>file:/hadoop/mapred/local</value>  
       <final>true</final>
  </property>  
  <property>    
      <name>mapred.child.env</name>    
      <value>LD_LIBRARY_PATH=/usr/local/lib</value>    
  </property>   
  <!--<property>  
      <name>mapreduce.map.output.compress</name>  
      <value>true</value>  
  </property>  
  <property>  
      <name>mapreduce.map.output.compress.codec</name>  
      <value>com.hadoop.compression.lzo.LzoCodec</value>  
  </property>-->  
</configuration>

4. Prepare the local directories used by HDFS and MapReduce

#on every node
sudo mkdir -p /hadoop/tmp /hadoop/mapred/system /hadoop/mapred/local /hadoop/name /hadoop/data
sudo chown -R hadoop:hadoop /hadoop


5. scp CDH4.4 to the slave nodes

     scp -r cdh4.4.0/ hadoop-002:~/.  
     scp -r cdh4.4.0/ hadoop-003:~/.  
     scp -r cdh4.4.0/ hadoop-004:~/.  
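
The environment variables from step 2 also have to be present on the slaves; assuming ~/.bashrc was used as above, one way is:

     for h in hadoop-002 hadoop-003 hadoop-004; do scp ~/.bashrc $h:~/ ; done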
     


III. Startup

1. Format the filesystem

#on the hadoop-001 master node

cd cdh4.4.0/hadoop-2.0.0-cdh4.4.0/bin  
./hadoop namenode -format
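
If the format succeeds, the name directory from dfs.namenode.name.dir is populated:

ls /hadoop/name/current/    # should now contain VERSION and an fsimage file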


2. 启动

cd cdh4.4.0/hadoop-2.0.0-cdh4.4.0/sbin  
./start-all.sh

Run jps to check that the expected processes are running.
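
Roughly what to expect, given the layout at the top of this post, plus the web UIs configured earlier:

# on hadoop-001: NameNode, SecondaryNameNode, ResourceManager
# on hadoop-002/003/004: DataNode, NodeManager

http://hadoop-001:50070    # HDFS web UI
http://hadoop-001:8088     # YARN web UI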

IV. Problems encountered



Weibo: http://weibo.com/kingjames3