Setting Up a Hadoop Cluster on Ubuntu Server 12.04 with VirtualBox

spring8743

Article link: https://blog.csdn.net/spring8743/article/details/41008061

1. Download and install Oracle VM VirtualBox

http://www.oracle.com/technetwork/server-storage/virtualbox/downloads/index.html (VirtualBox-4.2.6-82870-Win.exe)

2. Download the Ubuntu Server edition of Linux

http://www.ubuntu.com/download/server (ubuntu-12.04.2-server-i386.iso)

3. Create three Linux virtual machines in VirtualBox, named:

feixu-master

feixu-slave1

feixu-slave2

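In VirtualBox, set each VM's network adapter to either Bridged Adapter or Host-only Adapter. A bridged adapter puts the VM on the same network as the host, so it can reach the external network. A host-only adapter works like a private LAN between the host and the VMs and cannot reach the external network.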
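The same adapter setting can also be applied from the host's command line with VBoxManage modifyvm while the VM is powered off. The sketch below only prints the commands rather than running them. The helper name vbox_nic_command is hypothetical, and the adapter name you pass in (for example eth0 or vboxnet0) is an assumption that depends on your host.

```shell
# Print (do not run) the VBoxManage command that attaches NIC 1 of a VM
# in bridged or host-only mode.
# $1 = VM name, $2 = mode (bridged | hostonly), $3 = host adapter name (assumed)
vbox_nic_command() {
    vm="$1"; mode="$2"; adapter="$3"
    case "$mode" in
        bridged)
            printf 'VBoxManage modifyvm "%s" --nic1 bridged --bridgeadapter1 "%s"\n' "$vm" "$adapter" ;;
        hostonly)
            printf 'VBoxManage modifyvm "%s" --nic1 hostonly --hostonlyadapter1 "%s"\n' "$vm" "$adapter" ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1 ;;
    esac
}
```

For example, vbox_nic_command feixu-master bridged eth0 prints the command for the master. Review the output before pasting it into a terminal on the host.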

4. Create the hadoop user and group on the master and on each slave

First create the hadoop group:

sudo addgroup hadoop

Then create the hadoop user in that group:

sudo adduser --ingroup hadoop hadoop

Grant the hadoop user sudo privileges by editing /etc/sudoers:

sudo vi /etc/sudoers

Below the line root ALL=(ALL:ALL) ALL, add the following line:

hadoop  ALL=(ALL:ALL) ALL

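5. Change the hostname on the master and on each slave

Open /etc/hostname and set it to that machine's name (feixu-master, feixu-slave1, or feixu-slave2):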

sudo vi /etc/hostname

6. Edit the hosts file on the master and on each slave

sudo vi /etc/hosts

Append the following entries:

192.168.1.100 feixu-master

192.168.1.101 feixu-slave1

192.168.1.102 feixu-slave2

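Note: the hosts file maps IP addresses to hostnames, so the machines can be reached by name instead of by IP address. It must be updated on every machine whenever an IP address changes.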
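Because the same three entries have to be added on every machine, and updated again when an address changes, it can help to script the edit so that running it twice does not create duplicates. The following is a minimal sketch; the helper name add_host_entry is hypothetical, and it takes the hosts file path as an argument so it can be tried on a copy first.

```shell
# Append "IP hostname" to a hosts file unless that hostname is already mapped.
# $1 = hosts file, $2 = IP address, $3 = hostname
add_host_entry() {
    hosts_file="$1"; ip="$2"; name="$3"
    # Look for the hostname as a whole field (field 2 onward) on non-comment lines.
    if awk -v n="$name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}
```

To apply it to the real file, call it from a root shell, for example add_host_entry /etc/hosts 192.168.1.100 feixu-master. Note that it does not update an existing entry whose IP address has changed; that case still needs a manual edit.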

7. Install the SSH service and set up passwordless SSH login on the master and on each slave

Install it with the following command:

sudo apt-get install ssh openssh-server

Create an SSH key pair using RSA:

ssh-keygen -t rsa -P ""

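(Note: after you press Enter, two files are generated under ~/.ssh/: id_rsa and id_rsa.pub. They form a key pair.)

Go to ~/.ssh/ and append id_rsa.pub to the authorized_keys file. The file does not exist at first; the commands below create it.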

cd ~/.ssh
cat id_rsa.pub >> authorized_keys

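After completing these steps on every machine, test the setup. Each command below is meant to be run on the machine it names, and it should log in to that local machine without asking for a password:

ssh feixu-master; ssh feixu-slave1; ssh feixu-slave2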

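In addition, append the master's public key (id_rsa.pub) to the authorized_keys file on each slave. Run the following on feixu-master; each command asks for the slave's password once:

cat ~/.ssh/id_rsa.pub | ssh feixu-slave1 'cat >> ~/.ssh/authorized_keys'
cat ~/.ssh/id_rsa.pub | ssh feixu-slave2 'cat >> ~/.ssh/authorized_keys'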

Test access from the master to the slaves; it should log in to each slave without asking for a password:

ssh feixu-slave1; ssh feixu-slave2
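If a login still prompts for a password, two common causes are a duplicated or missing key line and permissions that are too open: with its default StrictModes setting, sshd ignores an authorized_keys file that other users can write to. The sketch below appends a key only once and tightens the permissions. The helper name authorize_key is hypothetical, and the directory is passed in so it can be tested outside the real ~/.ssh.

```shell
# Append a public key to authorized_keys only if it is not already listed,
# then restrict permissions on the directory and the file.
# $1 = public key file, $2 = target .ssh directory
authorize_key() {
    pub_key="$1"; ssh_dir="$2"
    mkdir -p "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # -F fixed string, -x whole-line match, -f read the pattern from the key file
    if ! grep -qxF -f "$pub_key" "$ssh_dir/authorized_keys"; then
        cat "$pub_key" >> "$ssh_dir/authorized_keys"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

On a slave, after copying the master's id_rsa.pub over (the file name master_id_rsa.pub here is only an example), this would be called as authorize_key master_id_rsa.pub ~/.ssh.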

8. Install the Oracle JDK on the master and on each slave

Run the following commands in order:

sudo apt-get install python-software-properties
sudo apt-get install apt-file && apt-file update

sudo apt-get purge 'openjdk*'

sudo apt-get install software-properties-common

# If you are not behind a proxy, skip the two export lines and drop the -E option

export http_proxy=http://<proxy>:<port>
export https_proxy=http://<proxy>:<port>

sudo -E add-apt-repository ppa:webupd8team/java

sudo apt-get update

sudo vim /etc/apt/apt.conf

  # If the file has no http/https proxy settings, add them; skip this if you do not go through a proxy
  Acquire::http::proxy "http://<proxy>:<port>/";
  Acquire::https::proxy "https://<proxy>:<port>/";

sudo apt-get install oracle-java7-installer

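9. Install Hadoop on the master and on each slave

Install an FTP service, so that Hadoop can be downloaded on Windows and then transferred to the virtual machines over FTP:

sudo apt-get install vsftpd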

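Edit the FTP configuration file to allow reading and writing (typically by setting local_enable=YES and write_enable=YES):

sudo vi /etc/vsftpd.conf

Restart the FTP service: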

cd /srv/ftp; sudo /etc/init.d/vsftpd restart

Download hadoop-1.2.1.tar.gz and copy it to the installation directory /usr/local/:

sudo cp hadoop-1.2.1.tar.gz /usr/local/

Extract hadoop-1.2.1.tar.gz:

cd /usr/local

sudo tar -zxf hadoop-1.2.1.tar.gz

Rename the extracted directory to hadoop:

sudo mv hadoop-1.2.1 hadoop

Set the owner of the hadoop directory to the hadoop user:

sudo chown -R hadoop:hadoop hadoop

Open hadoop/conf/hadoop-env.sh:

sudo vim hadoop/conf/hadoop-env.sh

Configure conf/hadoop-env.sh: find the line #export JAVA_HOME=..., remove the #, and set it to the path of the local JDK:

export JAVA_HOME=/usr/lib/jvm/java-7-oracle
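Since this edit has to be repeated on all three machines, it can be scripted instead of done by hand in vim. The following is a minimal sketch using sed; the helper name set_java_home is hypothetical, and it assumes the JDK path contains no | or & characters, which would confuse the sed replacement.

```shell
# Set JAVA_HOME in a hadoop-env.sh file: rewrite the existing export line
# (commented out or not) if there is one, otherwise append a new line.
# $1 = path to hadoop-env.sh, $2 = JDK directory
set_java_home() {
    env_file="$1"; jdk_dir="$2"
    pattern='^[[:space:]]*#\{0,1\}[[:space:]]*export JAVA_HOME=.*'
    if grep -q "$pattern" "$env_file"; then
        # -i.bak edits in place and keeps the original as <file>.bak
        sed -i.bak "s|$pattern|export JAVA_HOME=$jdk_dir|" "$env_file"
    else
        printf 'export JAVA_HOME=%s\n' "$jdk_dir" >> "$env_file"
    fi
}
```

With the layout used in this article, it would be called as set_java_home /usr/local/hadoop/conf/hadoop-env.sh /usr/lib/jvm/java-7-oracle by a user who owns the file.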

Open conf/core-site.xml:

sudo vim hadoop/conf/core-site.xml

Edit it as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://feixu-master:9000</value>
</property>
</configuration>

Open conf/mapred-site.xml:

sudo vim hadoop/conf/mapred-site.xml

Edit it as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>feixu-master:9001</value>
</property>
</configuration>
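core-site.xml and mapred-site.xml have the same shape: one property with a name and a value. Because identical files are needed on the master and on both slaves, generating them from a small script avoids copy errors. This is a minimal sketch; the helper name write_site_xml is hypothetical, and it does not escape XML special characters in the value.

```shell
# Write a single-property Hadoop *-site.xml file.
# $1 = output file, $2 = property name, $3 = property value
write_site_xml() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}
```

Assuming the master is reachable under the hostname feixu-master from the hosts file in step 6, the two files would be produced with write_site_xml core-site.xml fs.default.name hdfs://feixu-master:9000 and write_site_xml mapred-site.xml mapred.job.tracker feixu-master:9001, run inside hadoop/conf as the hadoop user.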

Open conf/hdfs-site.xml:

sudo vim hadoop/conf/hdfs-site.xml

Edit it as follows: