A Hadoop Journey: Deploying the Cluster

OK, let me write down the steps I used to deploy Hadoop last month. The notes were originally written in English, so I'm leaving them untranslated.


1. Preparation

6 nodes: 1 name node, 5 data nodes

Install OS (Ubuntu 12.04 64bits)

Install the OS on all nodes. Change the default /bin/sh from dash to bash (one way is sketched below).
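(The original doesn't give the command; on Ubuntu 12.04 one way is to run dpkg-reconfigure and answer "No" when asked whether dash should remain /bin/sh, or to relink it directly.)

# dpkg-reconfigure dash

# ln -sf bash /bin/sh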

Install JDK 1.7

Download JDK 1.7.0 update 21 and extract it to

/usr/local/share/jdk1.7.0_21

Download the Hadoop 1.0.4 tarball

Download hadoop-1.0.4.tar.gz to the name node machine.
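(A sketch of the extraction steps, which the original leaves implicit; the tarball file names are assumed, and the Hadoop tarball is unpacked into the hadoopor home directory created in section 3.)

# tar xzf jdk-7u21-linux-x64.tar.gz -C /usr/local/share/

# tar xzf hadoop-1.0.4.tar.gz -C /home/hadoopor/

# chown -R hadoopor:hadoop /home/hadoopor/hadoop-1.0.4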

 

2. Configure host names

1.    Assign the host names according to the following list:

(Add these entries to /etc/hosts on every node)

10.67.254.12       namenode

10.67.254.17       datanode-1

10.67.254.18       datanode-2

10.67.254.19       datanode-3

10.67.254.20       datanode-4

10.67.254.21       datanode-5

 

2.    Edit /etc/hostname on each node to match its entry above (for example, as sketched below)
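(For example, on the first data node; repeat with the matching name on each host.)

# echo datanode-1 > /etc/hostname

# hostname datanode-1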

 

 

 

3. Create User and Group

For all nodes, do the following:

1.    Create a group hadoop

# groupadd hadoop -g 1001

 

2.    Create a user hadoopor

# useradd -m hadoopor -g hadoop
# passwd hadoopor

 

4. Setup passphraseless ssh

Make sure the name node can ssh to every data node without typing a password.

ssh-copy-id makes copying the public key over simple.

On the name node, as user hadoopor, run the following commands:

$ ssh-keygen -t rsa
$ ssh-copy-id datanode-1
$ ssh-copy-id datanode-2
$ ssh-copy-id datanode-3
$ ssh-copy-id datanode-4
$ ssh-copy-id datanode-5
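A quick check: each data node should answer without prompting for a password.

$ for n in 1 2 3 4 5; do ssh datanode-$n hostname; done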

 

 

5. Setup NFS server at name node

Why NFS? With NFS we don't have to install the Hadoop binaries and manage configuration files on every single node.

Set up an NFS server on the name node so that the data nodes can share the Hadoop binaries and configuration.

On the name node

# apt-get install nfs-server
# vi /etc/exports

# add the following line
/home/hadoopor    10.67.254.0/255.255.255.0(ro)

 

Force nfsd to re-read the /etc/exports file.

# exportfs -ra

 

On the datanodes

1.    Install the nfs client and mount the share

# apt-get install nfs-client
# mkdir /mnt/hadoopor
# mount namenode:/home/hadoopor /mnt/hadoopor

 

2.    Add an entry to /etc/fstab so the share is mounted automatically when a data node restarts.

# vi /etc/fstab

# device                   mountpoint        fs-type    options   dump  fsckorder

namenode:/home/hadoopor  /mnt/hadoopor     nfs       ro        0      0

 

3.    Link to the Hadoop directory on the NFS mount. As user hadoopor, in the home directory:

$ ln -sn /mnt/hadoopor/hadoop-1.0.4    # creates ~/hadoop-1.0.4 -> /mnt/hadoopor/hadoop-1.0.4

 

4.    Create a local logs directory on each data node (one way to do this for all nodes is sketched below)

$ mkdir /home/hadoopor/hadoop-logs
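The home directory is local on every machine, so this directory has to exist on each data node (and on the name node as well, since hadoop-env.sh points HADOOP_LOG_DIR at it). One way to create it everywhere from the name node, assuming the passphraseless ssh from section 4:

$ mkdir -p /home/hadoopor/hadoop-logs
$ for n in 1 2 3 4 5; do ssh datanode-$n mkdir -p /home/hadoopor/hadoop-logs; done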

 

6. System-wide profile

For all nodes (both name node and data nodes):

# vi /etc/profile

 

… …

JAVA_HOME=/usr/local/share/jdk1.7.0_21

HADOOP_HOME=/home/hadoopor/hadoop-1.0.4

PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

export JAVA_HOME HADOOP_HOME PATH
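A quick sanity check on any node after re-login (or after sourcing the profile):

$ source /etc/profile
$ java -version      # should report 1.7.0_21
$ hadoop version     # should report 1.0.4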

 

7. Configure cluster

One node is the name node, the others are data nodes.

The replication factor is 2.

 

$ vi conf/hadoop-env.sh

export JAVA_HOME=/usr/local/share/jdk1.7.0_21

… …

export HADOOP_LOG_DIR=/home/hadoopor/hadoop-logs

 

 

Configure Hadoop on the name node; the configuration is shared with all data nodes through the NFS export.

 

$ vi conf/core-site.xml

<configuration>

<property>

       <name>hadoop.tmp.dir</name>

       <value>/var/tmp</value>

       <description>A base for other temporary directories.</description>

</property>

<property>

       <name>fs.default.name</name>

       <value>hdfs://namenode:9000</value>

</property>

</configuration>

 

$ vi conf/hdfs-site.xml

<configuration>

<property>

       <name>dfs.replication</name>

       <value>2</value>

</property>

</configuration>

 

 

$ vi conf/mapred-site.xml

<configuration>

<property>

       <name>mapred.job.tracker</name>

       <value>namenode:9001</value>

</property>

</configuration>
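Note that start-all.sh also reads conf/masters (the host that runs the secondary name node) and conf/slaves (the hosts that run the DataNode/TaskTracker daemons). The original doesn't show them, so the following is an assumed sketch using the host names from section 2, with the secondary name node kept on the name node machine:

$ vi conf/masters
namenode

$ vi conf/slaves
datanode-1
datanode-2
datanode-3
datanode-4
datanode-5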

 

 

8. Final

Format the name node:

$ bin/hadoop namenode -format

 

Start all daemons:

$ bin/start-all.sh

 

Test:

$ bin/hadoop dfsadmin -report
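If everything is running, the web UIs should also be reachable at the Hadoop 1.x default ports: http://namenode:50070 (HDFS) and http://namenode:50030 (JobTracker).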

 

