Hadoop 2.2.0 Cluster Configuration Guide

Conventions: user input is shown as a command line, e.g. chmod +x jdk-7u51-linux-x64.rpm; system output is shown indented beneath it, e.g. java version "1.7.0_51".

March 20, 2014, by lilihao (QQ: 404536204)



1. Install the Sun JDK

(1). Download the JDK from Oracle's official site; the latest version at the time of writing is 7u51.

Installer:

http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

or the rpm package:

http://www.oracle.com/technetwork/cn/java/javase/downloads/jdk7-downloads-1880260-zhs.html

 

(2). Uninstall any existing JDK, if another version is already installed on the deployment machine

rpm -qa | grep jdk

  ldapjdk-4.18-2jpp.3.el5
  jdk-1.7.0_51-fcs

rpm -e --nodeps jdk-1.7.0_51-fcs

 

(3). Install the JDK

chmod +x jdk-7u51-linux-x64.rpm

rpm -ivh jdk-7u51-linux-x64.rpm

 

(4). Add the environment variables to /etc/profile

JAVA_HOME=/usr/java/jdk1.7.0_51
JRE_HOME=/usr/java/jdk1.7.0_51/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME JRE_HOME PATH CLASSPATH
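
These lines can also be appended non-interactively; a minimal sketch, assuming a root shell and the rpm's default install path above:

cat >> /etc/profile <<'EOF'
# Oracle JDK 1.7.0_51 (path assumes the rpm's default install location)
JAVA_HOME=/usr/java/jdk1.7.0_51
JRE_HOME=$JAVA_HOME/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME JRE_HOME PATH CLASSPATH
EOF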



(5). Apply the environment variables

source /etc/profile

 

(6). Verify the installation

java -version

 java version "1.7.0_51"
 Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
 Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

javac -version

  javac 1.7.0_51




 

2. Install other base libraries

(1). On CentOS (or other yum-based systems):

# yum install gcc-c++ autoconf automake libtool make cmake zlib-devel pkgconfig openssl-devel
# yum install openssh-clients
# yum install openssh-server

 

(1'). On Debian or Ubuntu (apt-based systems):

 # sudo apt-get install g++ autoconf automake libtool make cmake zlib1g-dev pkg-config libssl-dev
 # sudo apt-get install openssh-client
 # sudo apt-get install openssh-server

 

(2). Install protobuf

Download the latest release; at the time of writing this is protobuf-2.5.0:

https://code.google.com/p/protobuf/downloads/list

tar -xzvf protobuf-2.5.0.tar.gz

# cd protobuf-2.5.0

./configure; make; make check; make install

protoc --version

  libprotoc 2.5.0
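
If protoc instead fails with a missing libprotoc shared-library error, the dynamic linker cache usually just needs refreshing; a sketch assuming the default /usr/local install prefix:

# echo /usr/local/lib > /etc/ld.so.conf.d/protobuf.conf    (register the library directory)
# ldconfig                                                 (rebuild the linker cache)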



3. Install Maven

Maven is a software project management and build automation tool; it is used here to compile and assemble the Hadoop components.

(1). Download the latest binary package (apache-maven-3.2.1-bin.tar.gz) from the Maven site (http://maven.apache.org/download.cgi), then unpack it and copy the directory to /usr/local/apache-maven-3.2.1:

tar -xzvf apache-maven-3.2.1-bin.tar.gz

cp -r apache-maven-3.2.1 /usr/local/apache-maven-3.2.1

 
(2). Edit /etc/profile and add the environment variables

MAVEN_HOME=/usr/local/apache-maven-3.2.1
PATH=$PATH:$MAVEN_HOME/bin
export MAVEN_HOME PATH

 

(3). Apply the environment variables

source /etc/profile



(4). Verify the installation

echo $MAVEN_HOME

 /usr/local/apache-maven-3.2.1

mvn -v

 Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 2014-02-15T01:37:52+08:00)
 Maven home: /usr/local/apache-maven-3.2.1
 Java version: 1.7.0_51, vendor: Oracle Corporation
 Java home: /usr/java/jdk1.7.0_51/jre
 Default locale: en_US, platform encoding: UTF-8
 OS name: "linux", version: "2.6.18-194.el5", arch: "amd64", family: "unix"

 

(5). Adjust the Maven configuration

Since the Maven servers abroad may be unreachable, point Maven at a domestic (Chinese) mirror first. In conf/settings.xml under the Maven directory, add the following inside <mirrors></mirrors> (leave the existing entries alone), and likewise add the new profile inside <profiles></profiles>:



# vim $MAVEN_HOME/conf/settings.xml

<mirror>
  <id>nexus-osc</id>
  <mirrorOf>*</mirrorOf>
  <name>Nexus osc</name>
  <url>http://maven.oschina.net/content/groups/public/</url>
</mirror>

<profile>
  <id>jdk-1.7</id>
  <activation>
    <jdk>1.7</jdk>
  </activation>
  <repositories>
    <repository>
      <id>nexus</id>
      <name>local private nexus</name>
      <url>http://maven.oschina.net/content/groups/public/</url>
      <releases>
        <enabled>true</enabled>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
    </repository>
  </repositories>
  <pluginRepositories>
    <pluginRepository>
      <id>nexus</id>
      <name>local private nexus</name>
      <url>http://maven.oschina.net/content/groups/public/</url>
      <releases>
        <enabled>true</enabled>
      </releases>
      <snapshots>
        <enabled>false</enabled>
      </snapshots>
    </pluginRepository>
  </pluginRepositories>
</profile>
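
To confirm Maven actually picked up the mirror, dump the merged configuration with the standard maven-help-plugin goal:

mvn help:effective-settings | grep -A 2 nexus-osc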



4. Build and install Hadoop

(1). Download the Hadoop source from the official mirrors

wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz

# tar -xzvf hadoop-2.2.0-src.tar.gz

cd hadoop-2.2.0-src



(2). Fix a known bug in the released source: hadoop-auth is missing a jetty-util test dependency

vim hadoop-common-project/hadoop-auth/pom.xml

Inside <dependencies>, make sure both jetty entries are present:

<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty</artifactId>
  <scope>test</scope>
</dependency>
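
Before launching the full build, the patched pom can be sanity-checked on its own; a quick sketch (-pl restricts Maven to the hadoop-auth module):

mvn validate -pl hadoop-common-project/hadoop-auth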



(3). Build the Hadoop source (see BUILDING.txt)

mvn package -Pdist,native -DskipTests -Dtar

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................ SUCCESS [3.709s]
[INFO] Apache Hadoop Project POM ......................... SUCCESS [2.229s]
[INFO] Apache Hadoop Annotations ......................... SUCCESS [5.270s]
[INFO] Apache Hadoop Assemblies .......................... SUCCESS [0.388s]
[INFO] Apache Hadoop Project Dist POM .................... SUCCESS [3.485s]
[INFO] Apache Hadoop Maven Plugins ....................... SUCCESS [8.655s]
[INFO] Apache Hadoop Auth ................................ SUCCESS [7.782s]
[INFO] Apache Hadoop Auth Examples ....................... SUCCESS [5.731s]
[INFO] Apache Hadoop Common .............................. SUCCESS [1:52.476s]
[INFO] Apache Hadoop NFS ................................. SUCCESS [9.935s]
[INFO] Apache Hadoop Common Project ...................... SUCCESS [0.110s]
[INFO] Apache Hadoop HDFS ................................ SUCCESS [1:58.347s]
[INFO] Apache Hadoop HttpFS .............................. SUCCESS [26.915s]
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS [17.002s]
[INFO] Apache Hadoop HDFS-NFS ............................ SUCCESS [5.292s]
[INFO] Apache Hadoop HDFS Project ........................ SUCCESS [0.073s]
[INFO] hadoop-yarn ....................................... SUCCESS [0.335s]
[INFO] hadoop-yarn-api ................................... SUCCESS [54.478s]
[INFO] hadoop-yarn-common ................................ SUCCESS [39.215s]
[INFO] hadoop-yarn-server ................................ SUCCESS [0.241s]
[INFO] hadoop-yarn-server-common ......................... SUCCESS [15.601s]
[INFO] hadoop-yarn-server-nodemanager .................... SUCCESS [21.566s]
[INFO] hadoop-yarn-server-web-proxy ...................... SUCCESS [4.754s]
[INFO] hadoop-yarn-server-resourcemanager ................ SUCCESS [20.625s]
[INFO] hadoop-yarn-server-tests .......................... SUCCESS [0.755s]
[INFO] hadoop-yarn-client ................................ SUCCESS [6.748s]
[INFO] hadoop-yarn-applications .......................... SUCCESS [0.155s]
[INFO] hadoop-yarn-applications-distributedshell ......... SUCCESS [4.661s]
[INFO] hadoop-mapreduce-client ........................... SUCCESS [0.160s]
[INFO] hadoop-mapreduce-client-core ...................... SUCCESS [36.090s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher .... SUCCESS [2.753s]
[INFO] hadoop-yarn-site .................................. SUCCESS [0.151s]
[INFO] hadoop-yarn-project ............................... SUCCESS [4.771s]
[INFO] hadoop-mapreduce-client-common .................... SUCCESS [24.870s]
[INFO] hadoop-mapreduce-client-shuffle ................... SUCCESS [3.812s]
[INFO] hadoop-mapreduce-client-app ....................... SUCCESS [15.759s]
[INFO] hadoop-mapreduce-client-hs ........................ SUCCESS [6.831s]
[INFO] hadoop-mapreduce-client-jobclient ................. SUCCESS [8.126s]
[INFO] hadoop-mapreduce-client-hs-plugins ................ SUCCESS [2.320s]
[INFO] Apache Hadoop MapReduce Examples .................. SUCCESS [9.596s]
[INFO] hadoop-mapreduce .................................. SUCCESS [3.905s]
[INFO] Apache Hadoop MapReduce Streaming ................. SUCCESS [7.118s]
[INFO] Apache Hadoop Distributed Copy .................... SUCCESS [11.651s]
[INFO] Apache Hadoop Archives ............................ SUCCESS [2.671s]
[INFO] Apache Hadoop Rumen ............................... SUCCESS [10.038s]
[INFO] Apache Hadoop Gridmix ............................. SUCCESS [6.062s]
[INFO] Apache Hadoop Data Join ........................... SUCCESS [4.104s]
[INFO] Apache Hadoop Extras .............................. SUCCESS [4.210s]
[INFO] Apache Hadoop Pipes ............................... SUCCESS [9.419s]
[INFO] Apache Hadoop Tools Dist .......................... SUCCESS [2.306s]
[INFO] Apache Hadoop Tools ............................... SUCCESS [0.037s]
[INFO] Apache Hadoop Distribution ........................ SUCCESS [21.579s]
[INFO] Apache Hadoop Client .............................. SUCCESS [7.299s]
[INFO] Apache Hadoop Mini-Cluster ........................ SUCCESS [7.347s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11:53.144s
[INFO] Finished at: Fri Nov 22 16:58:32 CST 2013
[INFO] Final Memory: 70M/239M
[INFO] ------------------------------------------------------------------------

 

(4). Inspect the build output

# cd ./hadoop-dist/target/hadoop-2.2.0
# ls
# cd bin
# ./hadoop version

   Hadoop 2.2.0



(5). Copy the runtime directory

# cp -r ./hadoop-dist/target/hadoop-2.2.0 /usr/local/

 

(6). Edit /etc/profile and add the environment variables

export HADOOP_HOME=/usr/local/hadoop-2.2.0
export HADOOP_PREFIX=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop/"
export YARN_CONF_DIR=${HADOOP_CONF_DIR}
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin

 

(7). Apply the environment variables

source /etc/profile



(8). Check the built version and configuration

hadoop version

  Hadoop 2.2.0

 

5. Linux distributed setup

For a single-machine deployment, skip straight to section 7, "Hadoop single-node configuration".

(1). Decide on the master and slave machines, and set each hostname in /etc/sysconfig/network

For example, with three machines:

# vim /etc/sysconfig/network
# hostname xxx

61.129.82.157 (had-master)
61.129.82.221 (had-slave1)
61.129.82.222 (had-slave2)

(2). Configure the hosts file on every machine

Edit /etc/hosts on every machine (namenode and datanodes alike):

127.0.0.1   localhost.localdomain localhost
61.129.82.157   had-master
61.129.82.221   had-slave1
61.129.82.222   had-slave2
::1          localhost6.localdomain6 localhost6
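
Rather than editing every node by hand, the file can be pushed out from the master; a minimal sketch, assuming root ssh logins to the slaves are allowed:

# for h in 61.129.82.221 61.129.82.222; do scp /etc/hosts root@$h:/etc/hosts; done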

(3). Create a common account on every master and slave machine and set its password

# useradd hadoop
# passwd hadoop
# su hadoop
$ cd ~

(4). Generate a public/private key pair on the NameNode

hadoop@61.129.82.157$ ssh-keygen -t rsa -P ''

Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase): (press Enter)
Enter same passphrase again: (press Enter)
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.

(5). Copy the public key to the servers to be controlled

hadoop@61.129.82.221$ cd ~; mkdir .ssh
hadoop@61.129.82.222$ cd ~; mkdir .ssh
hadoop@61.129.82.157$ cd ~/.ssh
hadoop@61.129.82.157$ scp -P 22 ~/.ssh/id_rsa.pub hadoop@61.129.82.221:~/.ssh
hadoop@61.129.82.157$ scp -P 22 ~/.ssh/id_rsa.pub hadoop@61.129.82.222:~/.ssh

(6). On each server, append the public key to the trusted keys (221 shown as the example)

hadoop@61.129.82.221$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

(7). Set the .ssh directory permissions on each server (221 shown as the example)

hadoop@61.129.82.221$ cd ~/
hadoop@61.129.82.221$ chmod 700 .ssh -R
hadoop@61.129.82.221$ chmod 600 .ssh/authorized_keys

(8). Test passwordless ssh login

hadoop@61.129.82.157$ ssh 61.129.82.221

The first login asks you to confirm the host key; log in a second time to test:

hadoop@61.129.82.157$ ssh 61.129.82.221

It should log straight in and print the last-login line.
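
On systems that ship ssh-copy-id (part of the openssh-clients package installed earlier), steps (5) through (7) collapse into one command per slave; a sketch:

hadoop@61.129.82.157$ ssh-copy-id hadoop@61.129.82.221
hadoop@61.129.82.157$ ssh-copy-id hadoop@61.129.82.222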

 

6. Hadoop distributed configuration

(1). Edit core-site.xml

This mainly sets the NameNode's IP and port. The Hadoop distributed file system keeps two important directory trees, the NameNode's namespace and the DataNode's data blocks, plus a few other storage locations, and all of them are resolved relative to hadoop.tmp.dir. For example:

the NameNode namespace is stored in ${hadoop.tmp.dir}/dfs/name,
the DataNode blocks are stored in ${hadoop.tmp.dir}/dfs/data.

So once hadoop.tmp.dir is set, the other important directories all live below it; it acts as the root. Here it is set to /home/hadoop/tmp/hadoop, and whatever value you choose, the directory must already exist.

# vim $HADOOP_HOME/etc/hadoop/core-site.xml

  <property>
    <name>fs.default.name</name>
    <value>hdfs://61.129.82.157:9000/</value>
    <description>The name of the default file system.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp/hadoop</value>
    <description>A base for other temporary directories.</description>
  </property>
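
Since hadoop.tmp.dir must already exist, create it (and hand it to the hadoop account) on every node before formatting; a sketch matching the value above:

# mkdir -p /home/hadoop/tmp/hadoop
# chown -R hadoop:hadoop /home/hadoop/tmp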

(2). Edit hdfs-site.xml

# vim $HADOOP_HOME/etc/hadoop/hdfs-site.xml

  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>

 

(3). Edit mapred-site.xml

# cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml
# vim $HADOOP_HOME/etc/hadoop/mapred-site.xml

<property>
  <name>mapred.job.tracker</name>
  <value>61.129.82.157:9001</value>
  <description>The host and port of the MapReduce JobTracker.</description>
</property>

 

(4). Edit the slaves file

# vim $HADOOP_HOME/etc/hadoop/slaves

had-slave1
had-slave2

 

(5). Edit the hadoop-env file

# vim $HADOOP_HOME/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_51
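
Every node needs the same configuration; one way, sketched here under the assumption that $HADOOP_HOME is identical on all machines, is to push the directory from the master:

$ for h in had-slave1 had-slave2; do scp -r $HADOOP_HOME/etc/hadoop/* $h:$HADOOP_HOME/etc/hadoop/; done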

 

7. Hadoop single-node configuration

(1). Edit core-site.xml

# vim $HADOOP_HOME/etc/hadoop/core-site.xml

  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    <description>The name of the default file system.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/haduser/tmp/hadoop</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>io.native.lib.available</name>
    <value>false</value>
    <description>default value is true: Should native hadoop libraries, if present, be used.</description>
  </property>

 

(2). Edit hdfs-site.xml

# vim $HADOOP_HOME/etc/hadoop/hdfs-site.xml

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>



(3). Edit yarn-site.xml

# vim $HADOOP_HOME/etc/hadoop/yarn-site.xml

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

 

(4). Edit mapred-site.xml

# cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml
# vim $HADOOP_HOME/etc/hadoop/mapred-site.xml

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <final>true</final>
</property>

 

8. Start Hadoop and run an example

(1). Add the haduser account and set up passwordless login to localhost

Step 5 covered key distribution for the distributed deployment; for the single-node deployment, set up passwordless login to localhost:

# useradd haduser
# passwd haduser
# su haduser
$ cd ~
$ ssh-keygen -t rsa -P ''
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Verify that the ssh setup works:

$ ssh localhost

 last login: xxxxx



(2). Start Hadoop

$ hdfs namenode -format
$ start-dfs.sh
$ jps

1522 NameNode
1651 DataNode
1794 SecondaryNameNode
1863 Jps

$ start-yarn.sh
$ jps

2033 NodeManager
1900 ResourceManager
1522 NameNode
1651 DataNode
2058 Jps
1794 SecondaryNameNode
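
A scripted sanity check for the expected daemons (a sketch; on the distributed layout, slaves run only DataNode and NodeManager, so adjust the list per machine):

$ for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do jps | grep -qw $d && echo "$d up" || echo "$d MISSING"; done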

 

(3). Test case

$ hdfs dfs -ls /
$ hdfs dfs -mkdir /dhfile
$ hdfs dfs -ls /
$ hdfs dfs -mkdir -p /hadfile/ot/otds
$ hdfs dfs -ls /hadfile/ot/otds
$ hdfs dfs -put XXX.log /hadfile/ot/otds
$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /hadfile/ot/otds /hadout
$ hdfs dfs -ls /hadout
$ hdfs dfs -cat /hadout/part-r-00000
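
For repeated runs, the whole test can be wrapped in one pass; a sketch (the output directory must not pre-exist, or the job aborts, hence the cleanup first):

$ hdfs dfs -rm -r /hadout              # remove a previous run; an error on the first run is harmless
$ hdfs dfs -mkdir -p /hadfile/ot/otds
$ hdfs dfs -put XXX.log /hadfile/ot/otds
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /hadfile/ot/otds /hadout
$ hdfs dfs -cat /hadout/part-r-00000 | head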



(4). Web UI test

Check the monitoring page (YARN ResourceManager):

http://61.129.82.157:8088/cluster/nodes

HDFS cluster status (NameNode):

http://61.129.82.157:50070/dfshealth.jsp

For the single-node setup, substitute localhost or the machine's own address for the master IP.
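
If no browser can reach the machines, the same pages can be probed from the shell; a sketch using plain curl:

$ curl -sf http://61.129.82.157:50070/dfshealth.jsp > /dev/null && echo "NameNode UI up"
$ curl -sf http://61.129.82.157:8088/cluster/nodes > /dev/null && echo "ResourceManager UI up"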

 

 
