Cloud Computing: Hadoop Deployment and Configuration in Detail (Part 2)



 

  1. Deploying HDFS (for fuller, more accurate details see the CDH4 Installation Guide, p. 104 onward)

  1. Set the network hostnames

    Edit the hosts file:

    vim /etc/hosts

    192.168.1.2   server1

    192.168.1.3   server2

    192.168.1.4   server3

    192.168.1.5   server4


Perform this step on server1, then copy the file to the other nodes in the cluster with scp.
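The two steps above can be combined in one small script. This is only a sketch: it writes the hosts entries from the text to a scratch file (TARGET is a stand-in for /etc/hosts) and echoes, rather than runs, the scp commands, since it assumes passwordless root SSH that may not be set up yet.

```shell
#!/bin/sh
# Sketch: build the cluster hosts file and print the scp commands that would
# distribute it. TARGET defaults to a scratch copy so this is safe to try;
# point it at /etc/hosts on the real server1.
TARGET="${TARGET:-/tmp/hosts.cluster}"

cat > "$TARGET" <<'EOF'
192.168.1.2   server1
192.168.1.3   server2
192.168.1.4   server3
192.168.1.5   server4
EOF

# Dry run: remove the "echo" once SSH access between the nodes is in place.
for node in server2 server3 server4; do
    echo scp "$TARGET" "root@${node}:/etc/hosts"
done
```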


  2. HDFS architecture

    Local documentation: Hadoop\cdh\hadoop-2.2.0\share\doc\hadoop\index.html

    Remember this URL: http://archive.cloudera.com/cdh4/

    HDFS is a highly fault-tolerant distributed file system designed to run on inexpensive commodity hardware. It provides high-throughput data access and is well suited to applications with very large data sets. HDFS follows a master/slave architecture and is built from the following components: Client, NameNode, Secondary NameNode, and DataNode.


  1. Client

    The client accesses files in HDFS by interacting with the NameNode and the DataNodes.

  2. NameNode

    The NameNode is the hub of the entire system; it is responsible for managing the file system namespace and metadata.

     

    HDFS distributed file system: the core

    namenode: stores the metadata

    datanode: stores the actual data blocks

    jobtracker: allocates resources and monitors tasks

    Only the reduce output is stored in HDFS.

    HDFS replica placement policy: the first replica is stored on the local node;

    the second replica is stored on a node in a different rack;

    the third replica is stored on another node in the local rack.

    Modules deployed on each node:

    server1: RM, HS, PS, ZK-server, client

    server2: NN, NM, DN, MR, ZK, client

    server3: NM, DN, MR, ZK, client

    server4: NM, DN, MR, ZK, client

    chmod og+r <file>: make a file (or directory) readable by group and others

    cat /etc/passwd

    cat /etc/group
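The chmod og+r form above can be demonstrated on a scratch file: start from owner-only permissions, then grant read access to group (g) and others (o).

```shell
# Demonstrate chmod og+r on a scratch file.
f=/tmp/chmod-demo.txt
touch "$f"
chmod 600 "$f"      # -rw-------  owner read/write only
chmod og+r "$f"     # -rw-r--r--  group and others can now read
ls -l "$f"
```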

    Configuring CDH4 on the cluster

    Create cloudera-cdh4.repo (then scp it to all nodes):

    baseurl = http://192.168.1.2/cdh/4/

    gpgkey = http://192.168.1.2/cdh/RPM-GPG-KEY-cloudera

     

    On server1:

    yum install zookeeper-server

    yum install hadoop-yarn-resourcemanager

    yum install hadoop-mapreduce-historyserver

    yum install hadoop-yarn-proxyserver

    yum install hadoop-client

     

    scp cloudera-cdh4.repo to server2, server3, and server4.

     

    On server2:

    yum install hadoop-hdfs-namenode

    yum install hadoop-yarn-nodemanager

    yum install hadoop-hdfs-datanode

    yum install hadoop-mapreduce

    yum install zookeeper

    yum install hadoop-client

    On server3 and server4:

    yum install hadoop-yarn-nodemanager

    yum install hadoop-hdfs-datanode

    yum install hadoop-mapreduce

    yum install zookeeper

    yum install hadoop-client
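The per-node package sets above can be driven from a single helper. A sketch that only prints the install commands for review (the hostnames and package lists come from the text; running them over ssh is an assumption about your setup):

```shell
#!/bin/sh
# Sketch: emit the "yum install" command for each node's role, using the
# per-server package lists above. Dry run only -- drop the "echo" (or pipe
# the output to sh) to actually install.
packages_for() {
    case "$1" in
        server1) echo "zookeeper-server hadoop-yarn-resourcemanager hadoop-mapreduce-historyserver hadoop-yarn-proxyserver hadoop-client" ;;
        server2) echo "hadoop-hdfs-namenode hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce zookeeper hadoop-client" ;;
        server3|server4) echo "hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce zookeeper hadoop-client" ;;
    esac
}

for node in server1 server2 server3 server4; do
    echo "ssh $node yum install -y $(packages_for "$node")"
done
```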

     

    Copy out the default configuration files (mind the target directory). (To clear stale HDFS state: rm -rf /var/lib/hadoop-hdfs/cache/hdfs/dfs/*)

    cp -r /etc/hadoop/conf.dist /opt/hadoop/conf/

    alternatives --verbose --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.empty 10


alternatives --verbose --install /etc/hadoop/conf hadoop-conf /opt/hadoop/conf 50


Linux distributions are updated frequently; registering the configuration directory with alternatives keeps package updates from clobbering your configuration files.


vim /var/lib/alternatives/hadoop-conf


 


cd /etc/hadoop/conf/


vim core-site.xml


<configuration>


<property>


<name>fs.defaultFS</name>


<value>hdfs://server2/</value>


</property>


</configuration>


vim hdfs-site.xml


<configuration>


<property>


<name>dfs.permissions.superusergroup</name>


<value>hadoop</value>


</property>


<property>


<name>dfs.namenode.name.dir</name>


<value>/data/1/dfs/nn,/nfsmount/dfs/nn</value>


</property>




<property>


<name>dfs.datanode.data.dir</name>


<value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn</value>


</property>


<property>


<name>dfs.webhdfs.enabled</name>


<value>true</value>


</property>


</configuration>


 


Configure the HDFS local storage directories


On server2:


mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn


On server2, server3, and server4:


mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn




chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn


chown -R hdfs:hdfs /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
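The directory layout above can be created in one loop. A sketch, with DATA_ROOT as a hypothetical prefix so it can be tried outside the cluster; the chown step is only echoed because it needs root and the hdfs user, which exist only on the cluster nodes.

```shell
#!/bin/sh
# Sketch: create the DataNode data directories and the NameNode metadata
# directories, then show the required ownership change. DATA_ROOT is a
# hypothetical prefix for safe testing; use "" on the real servers so the
# paths land under /data.
DATA_ROOT="${DATA_ROOT:-/tmp/hdfs-demo}"

for i in 1 2 3 4; do
    mkdir -p "$DATA_ROOT/data/$i/dfs/dn"
done
# NameNode directories -- server2 only:
mkdir -p "$DATA_ROOT/data/1/dfs/nn" "$DATA_ROOT/nfsmount/dfs/nn"

# Ownership must be hdfs:hdfs; dry run here.
echo chown -R hdfs:hdfs "$DATA_ROOT/data/1/dfs/nn" "$DATA_ROOT/nfsmount/dfs/nn"
for i in 1 2 3 4; do
    echo chown -R hdfs:hdfs "$DATA_ROOT/data/$i/dfs/dn"
done
```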


 

 


vim mapred-site.xml


<configuration>


<property>


<name>mapreduce.framework.name</name>


<value>yarn</value>


</property>


<property>


<name>mapreduce.jobhistory.address</name>


<value>server1:10020</value>


</property>


<property>


<name>mapreduce.jobhistory.webapp.address</name>


<value>server1:19888</value>


</property>


<property>


<name>yarn.app.mapreduce.am.staging-dir</name>


<value>/user</value>


</property>


</configuration>


 


vim yarn-site.xml


<configuration>


<property>


<name>yarn.resourcemanager.resource-tracker.address</name>


<value>server1:8031</value>


</property>


<property>


<name>yarn.resourcemanager.address</name>


<value>server1:8032</value>


</property>


<property>


<name>yarn.resourcemanager.scheduler.address</name>


<value>server1:8030</value>


</property>


<property>


<name>yarn.resourcemanager.admin.address</name>


<value>server1:8033</value>


</property>


<property>


<name>yarn.resourcemanager.webapp.address</name>


<value>server1:8088</value>


</property>


<property>


<description>Classpath for typical applications.</description>


<name>yarn.application.classpath</name>


<value>


$HADOOP_CONF_DIR,


$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,


$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,


$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,


$YARN_HOME/*,$YARN_HOME/lib/*


</value>


</property>


<property>


<name>yarn.nodemanager.aux-services</name>


<value>mapreduce.shuffle</value>


</property>


<property>


<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>


<value>org.apache.hadoop.mapred.ShuffleHandler</value>


</property>


<property>


<name>yarn.nodemanager.local-dirs</name>


<value>/data/1/yarn/local,/data/2/yarn/local,/data/3/yarn/local</value>


</property>


<property>


<name>yarn.nodemanager.log-dirs</name>


<value>/data/1/yarn/logs,/data/2/yarn/logs,/data/3/yarn/logs</value>


</property>


<property>


<description>Where to aggregate logs</description>


<name>yarn.nodemanager.remote-app-log-dir</name>


<value>/var/log/hadoop-yarn/apps</value>


</property>


</configuration>
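After editing the four configuration files above it is worth checking that each is still well-formed XML before any daemon is restarted. A sketch that uses python3 (an assumption about what is installed on the hosts) to parse each file:

```shell
#!/bin/sh
# Sketch: verify each Hadoop config file in a directory parses as XML.
# Pass the configuration directory as an argument.
check_conf() {
    dir="$1"
    for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
        [ -f "$dir/$f" ] || continue
        if python3 -c "import sys, xml.dom.minidom as m; m.parse(sys.argv[1])" "$dir/$f" 2>/dev/null; then
            echo "OK: $f"
        else
            echo "MALFORMED: $f"
        fi
    done
}

check_conf /etc/hadoop/conf
```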


 


Format the NameNode


sudo -u hdfs hadoop namenode -format


Re-format filesystem in /data/namedir ? (Y or N)


Note: respond with an upper-case Y.


 


Configure the YARN local storage directories


mkdir -p /data/1/yarn/local /data/2/yarn/local /data/3/yarn/local /data/4/yarn/local


mkdir -p /data/1/yarn/logs /data/2/yarn/logs /data/3/yarn/logs /data/4/yarn/logs


chown -R yarn:yarn /data/1/yarn/local /data/2/yarn/local /data/3/yarn/local /data/4/yarn/local


chown -R yarn:yarn /data/1/yarn/logs /data/2/yarn/logs /data/3/yarn/logs /data/4/yarn/logs


 


Start the NameNode and the DataNodes


cd /etc/init.d


ls hadoop-hdfs-*


server2hadoop-hdfs-namenode hadoop-hdfs-datanode


On server3 and server4: hadoop-hdfs-datanode


On server2, server3, and server4:


for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start; done


 


Perform the following steps on server2:


Create the HDFS /tmp directory (with permissions drwxrwxrwt):


sudo -u hdfs hdfs dfs -mkdir /tmp


sudo -u hdfs hdfs dfs -chmod -R 1777 /tmp


Create the history directory:


sudo -u hdfs hadoop fs -mkdir /usr/history


sudo -u hdfs hadoop fs -chmod -R 1777 /usr/history


sudo -u hdfs hadoop fs -chown yarn /usr/history


Create the log directory:


sudo -u hdfs hadoop fs -mkdir /var/log/hadoop-yarn


sudo -u hdfs hadoop fs -chown yarn:mapred /var/log/hadoop-yarn


Check the HDFS directory tree:


sudo -u hdfs hadoop fs -ls -R /


You should see something like:


drwxrwxrwt - hdfs supergroup 0 2012-04-19 14:31 /tmp


drwxr-xr-x - hdfs supergroup 0 2012-05-31 10:26 /usr


drwxrwxrwt - yarn supergroup 0 2012-04-19 14:31 /usr/history


drwxr-xr-x - hdfs supergroup 0 2012-05-31 15:31 /var


drwxr-xr-x - hdfs supergroup 0 2012-05-31 15:31 /var/log


drwxr-xr-x - yarn mapred 0 2012-05-31 15:31 /var/log/hadoop-yarn




Start YARN


On server1:


service hadoop-yarn-resourcemanager start


service hadoop-mapreduce-historyserver start


On server2, server3, and server4:


service hadoop-yarn-nodemanager start


Create a home directory for the user


sudo -u hdfs hadoop fs -mkdir /usr/$USER


sudo -u hdfs hadoop fs -chown $USER /usr/$USER


$USER is the current user; check it with echo $USER.


Set HADOOP_MAPRED_HOME


cd /etc/profile.d


vim hadoop.sh


export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
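The export can be verified as soon as the file is written. A sketch that writes the same line and sources it; PROFILE_FILE is a stand-in for /etc/profile.d/hadoop.sh, which needs root to edit.

```shell
#!/bin/sh
# Sketch: write the export to a profile fragment and source it to confirm
# the variable takes effect in the current shell.
PROFILE_FILE="${PROFILE_FILE:-/tmp/hadoop.sh}"

echo 'export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce' > "$PROFILE_FILE"

. "$PROFILE_FILE"
echo "$HADOOP_MAPRED_HOME"
```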


Because each server in the cluster hosts a different set of modules, the daemons must be started individually on each server; a one-shot script such as start-all.sh cannot be used.
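The per-server start sequence can still be centralized in one helper. A sketch that maps each host to the services it runs (service names taken from the installs above) and prints the start commands; the ssh invocation is an assumption about your setup, so this is a dry run.

```shell
#!/bin/sh
# Sketch: since start-all.sh cannot be used, map each host to its daemons
# and emit the start commands. Drop the "echo" to actually run them.
services_for() {
    case "$1" in
        server1) echo "hadoop-yarn-resourcemanager hadoop-mapreduce-historyserver" ;;
        server2) echo "hadoop-hdfs-namenode hadoop-hdfs-datanode hadoop-yarn-nodemanager" ;;
        server3|server4) echo "hadoop-hdfs-datanode hadoop-yarn-nodemanager" ;;
    esac
}

for host in server1 server2 server3 server4; do
    for svc in $(services_for "$host"); do
        echo "ssh $host service $svc start"
    done
done
```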


Check the results in a browser:

server1:8088/cluster/nodes



server2/3/4:8042/node



server1:19888/jobhistory


On server2:


datanode: server4:50075


(port 1006 is unavailable)



namenode: server2:50070/dfshealth.jsp


