With the rise of internet-scale big data, Hadoop, a framework built specifically for large-scale data processing, has been getting more and more attention. It is effectively the standard tool for big-data processing today, so it is well worth learning. Learning proceeds step by step: before setting up a Hadoop cluster, we first need to know how to compile a Hadoop build suited to our own machine. This article covers recompiling Hadoop from source.
Environment:
CentOS 6.0, hadoop-2.2.0-src.tar.gz, hadoop-2.2.0.tar.gz, SecureCRT, Transmit,
apache-maven-3.3.3-bin.tar.gz, jdk-7u71-linux-x64.gz, protobuf-2.5.0, cmake-3.3.0.tar.gz
Installing the software
Extract jdk-7u71-linux-x64.gz:
tar -zxvf jdk-7u71-linux-x64.gz -C /usr/local/soft/
Installing Maven
The prebuilt Hadoop from the official download site is not necessarily suited to your server environment, which is why we compile Hadoop ourselves for our own environment.
Extract apache-maven-3.3.3-bin.tar.gz into /usr/local/soft:
tar -zxvf apache-maven-3.3.3-bin.tar.gz -C /usr/local/soft/
Setting the Maven and Java environment variables
Create a file named custom.sh under /etc/profile.d/ with the following content:
#/etc/profile.d/custom.sh
MAVEN_HOME=/usr/local/soft/apache-maven-3.3.3
export MAVEN_HOME
JAVA_HOME=/usr/local/soft/jdk1.7.0_71
export JAVA_HOME
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH
Test whether the installation succeeded.
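A quick way to check, assuming the custom.sh written above (paths are the ones used in this article; adjust if yours differ):

```shell
# Pick up the new environment variables in the current shell
source /etc/profile.d/custom.sh

# Both commands should print version banners:
#   java -version -> java version "1.7.0_71"
#   mvn -version  -> Apache Maven 3.3.3
java -version
mvn -version
```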
Installing protobuf
Extract: tar -zxvf protobuf-2.5.0.tar.gz
Configure (run inside the extracted protobuf-2.5.0 directory): ./configure --prefix=/usr/local/soft/protobuf-2.5.0
This step may report errors, depending on whether the g++/C++ toolchain is installed on your machine; the following two commands resolve them:
yum install glibc-headers
yum install gcc-c++
Build and install: make && make install
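One easy-to-miss detail: with the --prefix used above, protoc is not on the PATH, and the Hadoop build later needs to find it. A sketch of the extra lines to append to /etc/profile.d/custom.sh (the variable name PROTOBUF_HOME is my own choice, not from the original article):

```shell
# Make the freshly installed protoc visible to the Hadoop build
export PROTOBUF_HOME=/usr/local/soft/protobuf-2.5.0
export PATH=$PROTOBUF_HOME/bin:$PATH

# Verify: should print "libprotoc 2.5.0"
protoc --version
```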
Installing cmake
Extract: tar -zxvf cmake-3.3.0-rc2.tar.gz
Configure (run inside the extracted directory): ./configure --prefix=/usr/local/soft/cmake-3.3.0
Build and install: make && make install
As a final step, install the openssl and ncurses development dependencies:
yum install openssl-devel
yum install ncurses-devel
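If you prefer, the yum installs used in this article can be combined into a single command (assumes root or sudo):

```shell
# -y answers the confirmation prompts automatically
yum install -y glibc-headers gcc-c++ openssl-devel ncurses-devel
```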
Compiling the Hadoop source
Extract hadoop-2.2.0-src.tar.gz:
tar -zxvf hadoop-2.2.0-src.tar.gz
cd hadoop-2.2.0-src
Edit hadoop-common-project/hadoop-auth/pom.xml in the Hadoop source tree.
Add (or adjust) the following dependencies; without the jetty test dependencies the hadoop-auth module may fail to compile:
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty-util</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>jetty</artifactId>
  <scope>test</scope>
</dependency>
The following command runs the build: -Pdist,native produces the full distribution including the native libraries, -DskipTests skips the unit tests, and -Dtar also packages the result as a tarball.
mvn package -Pdist,native -DskipTests -Dtar
The build takes a long time; feel free to go watch a movie and come back. When it finishes, Maven prints a reactor summary like this:
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [ 2.960 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [ 1.560 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [ 3.302 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [ 0.263 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [ 2.084 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 3.628 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 3.370 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 1.950 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [02:10 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 53.316 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [ 0.061 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [05:50 min]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 30.996 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [01:59 min]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 3.839 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [ 0.030 s]
[INFO] hadoop-yarn ........................................ SUCCESS [04:04 min]
[INFO] hadoop-yarn-api .................................... SUCCESS [ 36.000 s]
[INFO] hadoop-yarn-common ................................. SUCCESS [ 26.416 s]
[INFO] hadoop-yarn-server ................................. SUCCESS [ 0.132 s]
[INFO] hadoop-yarn-server-common .......................... SUCCESS [ 9.069 s]
[INFO] hadoop-yarn-server-nodemanager ..................... SUCCESS [ 17.638 s]
[INFO] hadoop-yarn-server-web-proxy ....................... SUCCESS [ 4.468 s]
[INFO] hadoop-yarn-server-resourcemanager ................. SUCCESS [ 14.609 s]
[INFO] hadoop-yarn-server-tests ........................... SUCCESS [ 1.180 s]
[INFO] hadoop-yarn-client ................................. SUCCESS [ 6.493 s]
[INFO] hadoop-yarn-applications ........................... SUCCESS [ 0.104 s]
[INFO] hadoop-yarn-applications-distributedshell .......... SUCCESS [ 3.221 s]
[INFO] hadoop-mapreduce-client ............................ SUCCESS [ 0.100 s]
[INFO] hadoop-mapreduce-client-core ....................... SUCCESS [ 23.699 s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher ..... SUCCESS [ 2.320 s]
[INFO] hadoop-yarn-site ................................... SUCCESS [ 0.149 s]
[INFO] hadoop-yarn-project ................................ SUCCESS [ 48.019 s]
[INFO] hadoop-mapreduce-client-common ..................... SUCCESS [ 18.606 s]
[INFO] hadoop-mapreduce-client-shuffle .................... SUCCESS [ 3.093 s]
[INFO] hadoop-mapreduce-client-app ........................ SUCCESS [ 10.888 s]
[INFO] hadoop-mapreduce-client-hs ......................... SUCCESS [ 5.133 s]
[INFO] hadoop-mapreduce-client-jobclient .................. SUCCESS [ 5.858 s]
[INFO] hadoop-mapreduce-client-hs-plugins ................. SUCCESS [ 1.651 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 6.702 s]
[INFO] hadoop-mapreduce ................................... SUCCESS [ 2.894 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 5.037 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [01:20 min]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 2.927 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 6.893 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 4.745 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 3.408 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 3.200 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 7.832 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 2.250 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [ 0.027 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 17.662 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [ 7.782 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 0.090 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 22:26 min
[INFO] Finished at: 2015-12-19T04:43:28-08:00
[INFO] Final Memory: 130M/372M
[INFO] ------------------------------------------------------------------------
When you see BUILD SUCCESS, the compilation is complete.
The build output is under hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0.
cd hadoop-2.2.0
./bin/hadoop version
This prints the version information of the Hadoop we just compiled for this server. To confirm that the native libraries match your machine's architecture, run:
file lib/native/*
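To read the `file` output, what matters is whether the native libraries are reported as ELF 32-bit or 64-bit, and that this matches your server. A minimal sketch of the check (the sample string below is illustrative, not taken from a real run):

```shell
# Illustrative line in the format that `file lib/native/*` would print
sample='lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV)'

# Classify the build by the ELF word size in the description
result="unknown architecture"
case "$sample" in
  *"ELF 64-bit"*) result="64-bit native build" ;;
  *"ELF 32-bit"*) result="32-bit native build" ;;
esac
echo "$result"
```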
That wraps up the compilation. In the next article we will set up a single-node environment.