Compiling and Installing Hadoop from Source


First, a gripe: the Hadoop binary package on the Apache site is a 32-bit build, which cost me half a day of chasing errors!


1. Install the JDK

I installed Oracle's HotSpot JDK 1.7. A plain JRE is not enough here, because it does not ship tools.jar.

Download the RPM package of the JDK from the official site and install it:

rpm -ivh xxxxx.rpm

By default it installs under /usr/java/xxxx.
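
A quick sanity check that the JDK is usable (the jdk1.7.0_71 directory name below matches the build this post ends up using; yours may differ):

ls /usr/java/
/usr/java/jdk1.7.0_71/bin/java -version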

Configure the environment variables.


2. Install Maven

Download the binary package from the official site, extract it, and move it to /usr/local/maven.

To speed up dependency downloads, point Maven at the OSChina mirror in settings.xml. Inside the <mirrors> section, add:

<mirror>
    <id>nexus-osc</id>
    <mirrorOf>*</mirrorOf>
    <name>Nexusosc</name>
    <url>http://maven.oschina.net/content/groups/public/</url>
</mirror>

and the matching profile, inside the <profiles> section:

<profile>
    <id>jdk-1.7</id>
    <activation>
        <jdk>1.7</jdk>
    </activation>
    <repositories>
        <repository>
            <id>nexus</id>
            <name>local private nexus</name>
            <url>http://maven.oschina.net/content/groups/public/</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>
    <pluginRepositories>
        <pluginRepository>
            <id>nexus</id>
            <name>local private nexus</name>
            <url>http://maven.oschina.net/content/groups/public/</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </pluginRepository>
    </pluginRepositories>
</profile>
Configure the environment variables.
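
Once Maven is on the PATH, a quick check that it runs and picks up the right JDK:

mvn -version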


3. Install protobuf

Download protobuf 2.5.0. It has to be exactly this version; a newer release will make the Hadoop build fail (I learned that the hard way).

Extract it, then build and install:

./configure --prefix=/usr/local/protoc/

make

make install

After installation, configure the environment variables.
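
A quick check that the freshly built protoc is the one on the PATH; it should report libprotoc 2.5.0:

protoc --version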



4. Install Hadoop

Download the 2.6 source package from the Apache site, extract it, and run the build from the source root:

mvn package -Pdist,native -DskipTests -Dtar

Because the OSChina mirror is not very stable, the dependency downloads failed with network errors several times during the build; whenever that happens, just rerun the command above.

Once the build succeeds, the result sits in hadoop-dist/target/hadoop-2.6.0:

cp -r hadoop-2.6.0 /usr/local/hadoop/

cd /usr/local/hadoop/

./bin/hadoop version

You should see Hadoop's version information printed.

file lib/native/*

shows whether the native libraries were built as 32-bit or 64-bit.
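
On a successful 64-bit build the output looks roughly like this, one line per library (exact details vary):

lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped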

Configure the environment variables.

Putting together everything from the sections above, the environment variables are:

export JAVA_HOME=/usr/java/jdk1.7.0_71

export HADOOP_HOME=/usr/local/hadoop
export MAVEN_HOME=/usr/local/maven
export PROTOBUF_HOME=/usr/local/protoc
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$MAVEN_HOME/bin:$PROTOBUF_HOME/bin:
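
These exports can live in /etc/profile (or ~/.bashrc); after editing, reload them in the current shell:

source /etc/profile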


5. Configure the cluster

Set up passwordless SSH login between all nodes.

First configure each node's hostname. My environment, for reference:

192.168.36.130  Master.Hadoop

192.168.36.131  Slave1.Hadoop

192.168.36.132  Slave2.Hadoop

On CentOS 6.6, the hostname is set as follows:

vim /etc/sysconfig/network

vim /etc/hosts

hostname XXX.Hadoop

service network restart
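
For reference, on the master node the two files end up roughly like this (the slaves differ only in the HOSTNAME line; /etc/hosts is identical on all three nodes):

/etc/sysconfig/network:
NETWORKING=yes
HOSTNAME=Master.Hadoop

/etc/hosts:
127.0.0.1       localhost
192.168.36.130  Master.Hadoop
192.168.36.131  Slave1.Hadoop
192.168.36.132  Slave2.Hadoop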

Also turn off the firewall:

service iptables stop

chkconfig iptables off

Now generate the key pair:

ssh-keygen -t rsa

Just press Enter through all the prompts.

Two new files will then appear under /root/.ssh/.

Do the same on the other two nodes; everything here is done as root.

In that same directory, create a file named authorized_keys.

Its content is the public keys of all three nodes; copy the same file into the same directory on the other two nodes.

My authorized_keys looks like this, for reference:

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA4mJOu/de6KmSI7XP9LdxOlldnDI1olDM7GalikiUK3zCSkvUdXCkql7I2b1FU1OopT2keiXZptNJ8DlJi/LCkfi/+zysOmX5ppl5D4Zm9aIzyx1JYUB0pKT5mmYLuCsuHok+rPub1kzwHsWtzoYqAPgmxqnlEtgqxZj+YcaJJp9C2rF9zTaD/1sip/AguCQ2vdQc+yQYc7K33rPZXArnBfNVankIU2o2DsqdovtMCnFPU87+57S3hfT50HyLxXEMiroFypYGTNm84v3gAoCB/IpS0BwPdtHun2YrYtGKTaW0EjgG2J8lYbDUSe1eFWNidWHiDYtvzYR6vORXvMOq6Q== root@6.6.1
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtTFmf3Qpms17fsZuxYIFfTY1fGk9e1T7RJOMQIbV/nwBEiy0MDYkGwFhUi1ASWoxGnoPRt6soOE+tluaQOOfAY9HcdyS22ZHcxq4269VdTwZetANrhbI2F0LJgnS9B5D3wQPGqIMiujGria0J9iDpDhXDGWFK+RXzJDsKWTYfVeKVAiGzasebSKsyJKcxzBNzHV0AMKFPuy15DFtC+E82n1gMoPelp3iNpOBCIRzC1koeGvdPG9lu3Y22mpagn7JGw8ozt2j2tVZHl47sZ/rD0LvYK9DRwHFlzUp1h0A55SQwe6D/DVKTwdlKSLasYKlxgqV0ckNynptEvwu/KxoZQ== root@6.6.2
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAuZjZ+sWkr+9P6/NtEUzxWZyNPbJKAt0W18Cy0gUePFAbXQd9Rv3LngbCScbsNDM7Fsaao+gop87bk2BRsmN9QPzY8KevFMvN4UtysoqgFT7UUWGXRvizLH2EWKi056gu5rw493k9MDbDDFtT03v5PbKen23ILbZ/q2fKe7cyY6xRXNwxTsKm80EOqh4KrU40PkrcEkDL2BA8HGhwdsb7R6nPwcuFkKqIdVEKESHxrrLYApu6Iu5R3WJKGXJXqx7mHZnFOFkTw60BEOalONdg1XXedxCrIUtlbCGiz4xJ+mnCNPDOFoGte/E+WdyPYMqRYEk23E7xRx3a1lLBZ6FsdQ== root@6.6.3
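
A minimal sketch of how the file can be assembled and pushed out, assuming root and the default key locations:

cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys    # append each node's public key, collected from all three machines
chmod 600 /root/.ssh/authorized_keys
scp /root/.ssh/authorized_keys root@Slave1.Hadoop:/root/.ssh/
scp /root/.ssh/authorized_keys root@Slave2.Hadoop:/root/.ssh/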

Finally, test it:

ssh Slave1.Hadoop

ssh Slave2.Hadoop

Run the same test from the other two nodes as well. If you can log in without being asked for a password, the setup works.


Next, configure Hadoop itself:

cd /usr/local/hadoop/etc/hadoop

Set JAVA_HOME in hadoop-env.sh and yarn-env.sh.

In yarn-env.sh, also change HADOOP_PID_DIR to the Hadoop install directory.
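
The JAVA_HOME line in both files ends up pointing at the JDK installed earlier, e.g.:

export JAVA_HOME=/usr/java/jdk1.7.0_71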

Configure core-site.xml:

[root@systdt hadoop]# cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/home/hadoop/tmp</value>
                <description>Abase for other temporary directories.</description>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://Master.Hadoop:9000</value>
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>4096</value>
        </property>
</configuration>
Configure hdfs-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:///home/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:///home/hadoop/dfs/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.nameservices</name>
                <value>hadoop-cluster1</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>Master.Hadoop:50090</value>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
</configuration>
Configure mapred-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
                <final>true</final>
        </property>
        <property>
                <name>mapreduce.jobtracker.http.address</name>
                <value>Master.Hadoop:50030</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>Master.Hadoop:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>Master.Hadoop:19888</value>
        </property>
        <property>
                <name>mapred.job.tracker</name>
                <value>http://Master.Hadoop:9001</value>
        </property>
</configuration>
Configure yarn-site.xml:

<?xml version="1.0"?>
<configuration>

        <!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>Master.Hadoop</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>Master.Hadoop:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>Master.Hadoop:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>Master.Hadoop:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>Master.Hadoop:8033</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>192.168.36.130:8088</value>
        </property>
</configuration>
That completes the configuration.
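
Before the first start, this walkthrough takes a few things for granted: the same /usr/local/hadoop tree and configuration exist on all three nodes, etc/hadoop/slaves lists every node that should run a DataNode/NodeManager, and the NameNode has been formatted once. A sketch, run on Master.Hadoop:

cd /usr/local/hadoop
printf "Master.Hadoop\nSlave1.Hadoop\nSlave2.Hadoop\n" > etc/hadoop/slaves
./bin/hdfs namenode -format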


6. Start the cluster

On Master.Hadoop, run /usr/local/hadoop/sbin/start-all.sh.

It prints the following:

[root@Master sbin]# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [Master.Hadoop]
Master.Hadoop: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-Master.Hadoop.out
Slave1.Hadoop: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-Slave1.Hadoop.out
Slave2.Hadoop: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-Slave2.Hadoop.out
Master.Hadoop: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-Master.Hadoop.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-root-resourcemanager-Master.Hadoop.out
Slave2.Hadoop: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-Slave2.Hadoop.out
Slave1.Hadoop: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-Slave1.Hadoop.out
Master.Hadoop: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-Master.Hadoop.out

which means the startup succeeded.

Check cluster status: ./bin/hdfs dfsadmin -report

Check file blocks: ./bin/hdfs fsck / -files -blocks

Check node status: http://192.168.36.130:50070

Check the running cluster on the ResourceManager: http://192.168.36.130:8088


7. Test

echo "hello,world" >> file

hdfs dfs -mkdir /test

hdfs dfs -put file /test

hdfs dfs -cat /test/file

hdfs dfs -get /test/file file1

hdfs dfs -rm /test/file

hdfs dfs -ls /test
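
If everything is healthy, the -cat step prints back the line that was written locally, i.e. hello,world, and after the -rm the final -ls shows /test is empty again.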

OK, everything works as expected. Done!

