OK, let me write down the steps I used to deploy Hadoop last month. Since the document I wrote back then was in English, I won't translate it.
1. Preparation
7 nodes: 1 name node, 6 data nodes
Install OS (Ubuntu 12.04, 64-bit)
Install the OS on all nodes. Repoint the default /bin/sh symlink from dash to bash.
Install JDK 1.7
Download JDK-1.7.0-u21, extract to
/usr/local/share/jdk1.7.0_21
Download hadoop 1.0.4 tarball
Download hadoop-1.0.4.tar.gz to the name node machine.
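If the JDK came as the usual Oracle tarball (the exact file name below is an assumption; check your download), extracting and verifying it looks roughly like this:
# tar xzf jdk-7u21-linux-x64.tar.gz -C /usr/local/share
# /usr/local/share/jdk1.7.0_21/bin/java -version    # should print java version "1.7.0_21"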
2. Configure host names
1. Assign the host names according to the following list
(insert the following into /etc/hosts):
10.67.254.12  namenode
10.67.254.17  datanode-1
10.67.254.18  datanode-2
10.67.254.19  datanode-3
10.67.254.20  datanode-4
10.67.254.21  datanode-5
2. Edit /etc/hostname on each node to match its IP address
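A quick sanity check once both files are in place: every node should know its own name and resolve the others.
$ hostname              # prints this node's name, e.g. datanode-1
$ ping -c 1 namenode    # should resolve to 10.67.254.12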
3. Create User and Group
For all nodes, do the following:
1. Create a group hadoop
# groupadd -g 1001 hadoop
2. Create a user
# useradd -m -g hadoop hadoopor
# passwd hadoopor
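To verify both (the uid below is what -g 1001 plus default useradd settings would typically give; yours may differ):
# id hadoopor
uid=1001(hadoopor) gid=1001(hadoop) groups=1001(hadoop)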
4. Setup passphraseless ssh
Make sure the name node can ssh to every data node without typing a password.
ssh-copy-id makes copying the public key to each data node simple.
On the name node, as user hadoopor, run the following commands:
$ ssh-keygen -t rsa
$ ssh-copy-id datanode-1
$ ssh-copy-id datanode-2
$ ssh-copy-id datanode-3
$ ssh-copy-id datanode-4
$ ssh-copy-id datanode-5
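To confirm it worked, each data node should answer without a password prompt:
$ ssh datanode-1 hostname    # should print datanode-1 immediately, no password asked
(repeat for datanode-2 through datanode-5)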
5. Setup NFS server on the name node
Why NFS? With NFS we don't have to install the Hadoop executables and manage configuration files on every node separately.
Set up an NFS server on the name node, so that the data nodes can share the Hadoop executables and configuration.
On the name node
# apt-get install nfs-kernel-server
# vi /etc/exports
# add the following line:
/home/hadoopor 10.67.254.0/255.255.255.0(ro)
Force nfsd to re-read the /etc/exports file.
# exportfs -ra
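To confirm the export is active:
# exportfs -v    # should list /home/hadoopor for 10.67.254.0/255.255.255.0, read-only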
On the data nodes
1. Install the NFS client
# apt-get install nfs-common
# mkdir /mnt/hadoopor
# mount namenode:/home/hadoopor /mnt/hadoopor
2. Add an entry to /etc/fstab so the share is mounted automatically when a data node restarts.
# vi /etc/fstab
# device                   mountpoint      fs-type  options  dump  fsckorder
namenode:/home/hadoopor    /mnt/hadoopor   nfs      ro       0     0
3. Link to the Hadoop tree on the NFS share. As user hadoopor:
$ ln -sn /mnt/hadoopor/hadoop-1.0.4 /home/hadoopor/hadoop-1.0.4
4. Create a logs directory on each data node
$ mkdir /home/hadoopor/hadoop-logs
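After these four steps, a quick check on any data node:
$ ls /mnt/hadoopor/hadoop-1.0.4/bin    # the hadoop scripts should be visible
$ touch /mnt/hadoopor/test             # should fail, since the export is read-only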
6. System-wide profile
For all nodes (both the name node and the data nodes):
# vi /etc/profile
… …
JAVA_HOME=/usr/local/share/jdk1.7.0_21
HADOOP_HOME=/home/hadoopor/hadoop-1.0.4
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
export JAVA_HOME HADOOP_HOME PATH
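To check the profile takes effect (log in again, or source the file in the current shell):
$ . /etc/profile
$ java -version       # should report 1.7.0_21
$ hadoop version      # should report Hadoop 1.0.4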
7. Configure cluster
One node is the name node, the others are data nodes.
Replication factor: 2
$ vi conf/hadoop-env.sh
export JAVA_HOME=/usr/local/share/jdk1.7.0_21
… …
export HADOOP_LOG_DIR=/home/hadoopor/hadoop-logs
Configure Hadoop on the name node; through NFS the configuration is shared with all data nodes.
$ vi conf/core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
$ vi conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
$ vi conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode:9001</value>
  </property>
</configuration>
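One thing these notes don't show: start-all.sh decides where to start daemons from conf/masters (the host for the secondary name node) and conf/slaves (the data nodes). If they aren't set yet, they would look something like this, using the host names from section 2:
$ vi conf/masters
namenode
$ vi conf/slaves
datanode-1
datanode-2
datanode-3
datanode-4
datanode-5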
8. Final
Format name node
$ bin/hadoop namenode -format
Start all
$ bin/start-all.sh
Test
$ bin/hadoop dfsadmin -report
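If everything came up, jps (it ships with the JDK) is a quick way to see the daemons: on the name node it should show NameNode, SecondaryNameNode and JobTracker; on each data node, DataNode and TaskTracker. dfsadmin -report should list all the data nodes as live.
$ jps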