Hadoop Pseudo-Distributed and Fully Distributed Installation and Configuration

Client environment setup

Modify the Windows 7 hosts file so the web UI can be reached by hostname

--------------------

         [C:\Windows\System32\drivers\etc\hosts]

         127.0.0.1       localhost

         192.168.238.128 s100

         192.168.238.129 s102

         192.168.238.130 s103

         192.168.238.131 s104

         192.168.238.132 s105

--------------------

 

         IP addresses:  .100 - .104

         hostname:      s100

 

Prepare five hosts or virtual machines and change their hostnames

--------------

         [/etc/hostname]

         s100

 

Modify name resolution (hosts file)

-------------------

         [/etc/hosts]

         127.0.0.1       localhost

         192.168.231.100 s100

         192.168.231.101 s101

         192.168.231.102 s102

         192.168.231.103 s103

         192.168.231.104 s104

 

Set a static IP address

-----------------------

         [/etc/network/interfaces]

         # This file describes the network interfaces available on your system

         # and how to activate them. For more information, see interfaces(5).

         # The loopback network interface

         auto lo

         iface lo inet loopback

         #iface eth0 inet static

         iface eth0 inet static

         address 192.168.231.100

         netmask 255.255.255.0

         gateway 192.168.231.2

         dns-nameservers 192.168.231.2

         auto eth0
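
         To apply the new interface settings without rebooting, something like the following usually works on older Ubuntu releases (a sketch; the exact command depends on the Ubuntu version):

                   $>sudo ifdown eth0 && sudo ifup eth0          // or: sudo /etc/init.d/networking restart

                   $>ifconfig eth0                               // confirm the new address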

 

Installing Hadoop

------------------

         0. Create the /soft directory and change its owner and group

                    $>sudo mkdir /soft

                    $>sudo chown ubuntu:ubuntu /soft

 

         1. Install the JDK

                    a. Copy jdk-8u65-linux-x64.tar.gz to ~/Downloads

                             $>cp /mnt/hgfs/downloads/bigdata/jdk-8u65-linux-x64.tar.gz ~/Downloads

                    b. Extract jdk-8u65-linux-x64.tar.gz

                             $>cd ~/Downloads

                             $>tar -xzvf jdk-8u65-linux-x64.tar.gz

                    c. Move jdk1.8.0_65 to /soft

                             $>mv ~/Downloads/jdk1.8.0_65 /soft

                             $>ln -s /soft/jdk-xxx jdk                      // create a symbolic link

                    d. Configure environment variables

                             [/etc/environment]

                             JAVA_HOME=/soft/jdk

                             PATH="...:/soft/jdk/bin"

                    e. Reload the environment variables

                             $>source /etc/environment

                    f. Verify the installation

                             $>java -version

 

         2. Install Hadoop

                    a. Copy and extract hadoop-2.7.2.tar.gz

                             $>cp /mnt/hgfs/downloads/bigdata/hadoop-2.7.2.tar.gz ~/Downloads/

                             $>cd ~/Downloads

                             $>tar -xzvf hadoop-2.7.2.tar.gz

                             $>mv ~/Downloads/hadoop-2.7.2 /soft                    // move it into /soft

                             $>cd /soft

                             $>ln -s hadoop-2.7.2 hadoop                            // create the hadoop symbolic link

                    b. Configure environment variables.

                             $>sudo nano /etc/environment

                             [/etc/environment]

                             JAVA_HOME=/soft/jdk

                             HADOOP_HOME=/soft/hadoop

                             PATH="...:/soft/jdk/bin:/soft/hadoop/bin:/soft/hadoop/sbin"

                    c. Reboot the system

                             $>sudo reboot

                    d. Verify the Hadoop installation

                             $>hadoop version

Configuring Hadoop

----------------

         1. Standalone/local mode

                    Standalone (local) mode uses the local filesystem.

                    Nothing needs to be configured.

                    To browse the filesystem:

                    $>hadoop fs -ls

                    No Java daemons are started.

                    Used for testing and development.
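
                    As a quick smoke test in local mode you can run one of the bundled example jobs against a local directory (a sketch; the jar path follows the hadoop-2.7.2 layout, and ~/input, ~/output are arbitrary local paths):

                    $>mkdir ~/input ; echo "hello hadoop hello world" > ~/input/words.txt

                    $>hadoop jar /soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount ~/input ~/output

                    $>cat ~/output/part-r-00000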

 

         2. Pseudo-distributed mode

                    [Configuration steps]

                    a. core-site.xml

<?xml version="1.0" ?>

<configuration>

         <property>

                   <name>fs.defaultFS</name>

                   <value>hdfs://localhost/</value>

         </property>

</configuration>

                    b. hdfs-site.xml

<?xml version="1.0"?>

<configuration>

         <property>

                   <name>dfs.replication</name>

                   <value>1</value>

         </property>

</configuration>

                    c. mapred-site.xml (this file must be created by hand)

<?xml version="1.0"?>

<configuration>

         <property>

                   <name>mapreduce.framework.name</name>

                   <value>yarn</value>

         </property>

</configuration>

                    d. yarn-site.xml

<?xml version="1.0"?>

<configuration>

         <property>

                  <name>yarn.resourcemanager.hostname</name>

                   <value>localhost</value>

         </property>

         <property>

                   <name>yarn.nodemanager.aux-services</name>

                   <value>mapreduce_shuffle</value>

         </property>

</configuration>
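
                    All of the files above live under $HADOOP_HOME/etc/hadoop, i.e. /soft/hadoop/etc/hadoop with the layout used here; for example:

                             $>cd /soft/hadoop/etc/hadoop

                             $>nano core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml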

 

                    e. Configure SSH

                             Key-based (passwordless) login.

                             1) Install ssh

                                      $>sudo apt-get install ssh

                             2) Generate a key pair

                                      $>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

                                      $>cd ~/.ssh                                        // inspect the generated public/private keys

                             3) Append the public key to the authorized-keys file

                                      $>cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

                             4) Log in to localhost

                                      $>ssh localhost                                    // answer yes to the host-key prompt

                                      $>exit

                                      $>ssh localhost                                    // no password required this time

                             5) Format the HDFS filesystem

                                      $>hadoop namenode -format

                             6) Start all daemons

                                      $>start-all.sh

                             7) Check the processes

                                      $>jps                                              // 5 daemons: RM, NM, NN, DN, 2NN

                             8) Browse the filesystem

                                      $>hadoop fs -ls

                             9) Create directories in the filesystem

                                      $>hadoop fs -mkdir -p /user/ubuntu/data

                                      $>hadoop fs -ls -R /                               // equivalent to the older -lsr
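
                             To confirm HDFS is actually storing data, upload a small file into the directory just created and read it back (a sketch; hello.txt is an arbitrary test file):

                                      $>echo "hello hdfs" > ~/hello.txt

                                      $>hadoop fs -put ~/hello.txt /user/ubuntu/data/

                                      $>hadoop fs -cat /user/ubuntu/data/hello.txt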

 

 

         3. Fully distributed mode

                    Fully distributed operation; covered in detail below.

 

 

Notes on installing SSH

---------------

         1. Disable wifi.

         2. Turn off the firewall.

         3. Make sure the client can reach the Internet:

                    $>ping www.baidu.com

         4. Change the Ubuntu package sources

                    [/etc/apt/sources.list]

                    ...

                    163 (NetEase) mirror, or the Aliyun sources below

                    [Aliyun sources]

                    deb http://mirrors.aliyun.com/ubuntu/ precise main restricted universe multiverse

                    deb http://mirrors.aliyun.com/ubuntu/ precise-security main restricted universe multiverse

                    deb http://mirrors.aliyun.com/ubuntu/ precise-updates main restricted universe multiverse

                    deb http://mirrors.aliyun.com/ubuntu/ precise-proposed main restricted universe multiverse

                    deb http://mirrors.aliyun.com/ubuntu/ precise-backports main restricted universe multiverse

                    deb-src http://mirrors.aliyun.com/ubuntu/ precise main restricted universe multiverse

                    deb-src http://mirrors.aliyun.com/ubuntu/ precise-security main restricted universe multiverse

                    deb-src http://mirrors.aliyun.com/ubuntu/ precise-updates main restricted universe multiverse

                    deb-src http://mirrors.aliyun.com/ubuntu/ precise-proposed main restricted universe multiverse

                    deb-src http://mirrors.aliyun.com/ubuntu/ precise-backports main restricted universe multiverse

        

         5. Install ssh

                    $>sudo apt-get install ssh

         6. Check whether the sshd service is running

                    $>ps -Af | grep ssh

         7. Generate a key pair

                    $>ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

         8. Append the public key to the authorized-keys file

                    $>cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

         9. Log in to localhost

                    $>ssh localhost

                    (type yes at the host-key prompt)

         10. Exit, then log in again

                    $>ssh localhost

 

 

Transferring a file between two clients with nc

------------------------------

         0. Description

                    Host .100 sends a file to host .101.

         1. On the .101 machine

                    $>nc -l 8888 > ~/.ssh/id_rsa.pub.100

         2. On the .100 machine

                    $>nc 192.168.231.101 8888 < ~/.ssh/id_rsa.pub

         3. On .101, append the received public key

                    $>cat ~/.ssh/id_rsa.pub.100 >> ~/.ssh/authorized_keys
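
         After the key is appended, passwordless login from the .100 host to the .101 host should work; a quick check (assuming the s101 hostname from /etc/hosts above):

                    $>ssh s101                        // run on s100; should not prompt for a password

                    $>exit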

 

Starting Hadoop for the first time

-------------------

         1. Format the filesystem

                    $>hadoop namenode -format

         2. Start all daemons

                    $>start-all.sh

         3. Check the processes

                    $>jps

         4. Stop all daemons

                    $>stop-all.sh

 

Accessing Hadoop HDFS through the web UI

----------------------------

         1. HDFS (namenode) web UI

                    http://localhost:50070/

         2. Datanode

                    http://localhost:50075/

         3. Secondary namenode (2NN)

                    http://localhost:50090/
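
         If these pages do not load from the Windows browser, check from the server itself first, e.g. with curl (assuming curl is installed):

                    $>curl -s http://localhost:50070 | head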

Fully distributed setup

 

Custom xsync script for distributing files across the cluster

-------------------------------------

         Copies a file to the same directory on every node, in a loop. The underlying command and an example invocation:

         rsync -rvl /home/ubuntu ubuntu@s101:

         xsync hello.txt

 

         [/usr/local/bin/xsync]

         #!/bin/bash

         # number of arguments
         pcount=$#

         if (( pcount < 1 )) ; then

                   echo no args;

                   exit;

         fi

         # file (or directory) to distribute
         p1=$1;

         fname=`basename $p1`

         #echo fname=$fname;

         # absolute path of the directory that contains it
         pdir=`cd -P $(dirname $p1) ; pwd`

         #echo pdir=$pdir

         cuser=`whoami`

         # push it to the same path on s100..s104
         for (( host=100 ; host<105 ; host=host+1 )) ; do

           echo ------------ s$host ---------------

           rsync -rvl $pdir/$fname $cuser@s$host:$pdir

         done
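
         A minimal way to install and try the script (the ubuntu user and s100..s104 hostnames are the ones used throughout this setup):

         $>sudo nano /usr/local/bin/xsync              // paste the script above

         $>sudo chmod a+x /usr/local/bin/xsync

         $>xsync hello.txt                             // copies hello.txt to the same directory on every node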

 

 

Write a /usr/local/bin/xcall script that runs the same command on all hosts

---------------------------------------------------------

         [/usr/local/bin/xcall]

         #!/bin/bash

         pcount=$#

         if (( pcount < 1 )) ; then

                   echo no args;

                   exit;

         fi

         # run locally first, then on s101..s104 over ssh
         echo   -------- localhost --------

         $@

         for (( host=101 ; host<105 ; host=host+1 )) ; do

           echo -------- s$host --------

           ssh s$host $@

         done
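
         Install it the same way as xsync and use it, for example, to check the running Java daemons on every node (a sketch):

         $>sudo chmod a+x /usr/local/bin/xcall

         $>xcall jps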

 

         1. Prepare 5 client machines

         2. Install the JDK

                    (as described above)

         3. Configure environment variables

                    JAVA_HOME

                    PATH

         4. Install Hadoop

                    (as described above)

         5. Configure environment variables

                    HADOOP_HOME

                    PATH

         6. Install ssh

         7. Configuration files

                   [/soft/hadoop/etc/hadoop/core-site.xml]

                   fs.defaultFS=hdfs://s100/

 

                    [/soft/hadoop/etc/hadoop/hdfs-site.xml]

                    <?xml version="1.0"?>

                   <configuration>

                            <property>

                                     <name>dfs.replication</name>

                                     <value>3</value>

                            </property>

                            <property>

                                     <name>dfs.namenode.secondary.http-address</name>

                                     <value>s104:50090</value>

                            </property>

                   </configuration>

 

                   [/soft/hadoop/etc/hadoop/yarn-site.xml]

                   yarn.resourcemanager.hostname=s100

 

                   [/soft/hadoop/etc/hadoop/slaves]

                   s101

                   s102

                   s103

         8. Distribute the four configuration files above across the cluster

                    $>cd /soft/hadoop/etc/hadoop

                    $>xsync core-site.xml

                    $>xsync hdfs-site.xml

                    $>xsync yarn-site.xml

                    $>xsync slaves
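
                    One way to confirm the files actually reached every node is to read one of them back with the xcall script (a sketch):

                    $>xcall cat /soft/hadoop/etc/hadoop/slaves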

 

Changing the local temporary directory

-----------------------

         1. Add/modify hadoop.tmp.dir

                    Add the following to [core-site.xml]:

                   <property>

                     <name>hadoop.tmp.dir</name>

                     <value>/home/ubuntu/hadoop/</value>

                   </property>

 

         2. Distribute core-site.xml

                    $>xsync core-site.xml

         3. Stop all daemons

                    $>stop-all.sh

         4. Reformat the filesystem

                    $>hadoop namenode -format

         5. Start all daemons

                    $>start-all.sh

         6. Reboot the system

                    $>sudo reboot                   // then verify everything still comes up
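
         A final sanity check after the restart (a sketch, using the xcall script and the web UI ports listed above):

                    $>xcall jps                     // NN and RM on s100, 2NN on s104, DN and NM on s101..s103

                    $>hadoop fs -ls /               // HDFS responds

         and browse http://s100:50070/ from the Windows machine (using the hosts entries added at the top).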
