The binary releases of Hadoop 2.x.x downloaded from Apache ship with 32-bit native libraries, which occasionally do not match the target operating system. In addition, enterprises often need to modify the Hadoop source code, so it is worth knowing how to build Hadoop from source yourself. This article uses CentOS 6.7 to demonstrate compiling hadoop-2.9.0.
1. Build requirements for the hadoop-2.9.0 source
Requirements:
- Unix System
- JDK 1.8+
- Maven 3.0 or later
- Findbugs 1.3.9 (if running findbugs)
- ProtocolBuffer 2.5.0
- CMake 2.6 or newer (if compiling native code), must be 3.0 or newer on Mac
- Zlib devel (if compiling native code)
- openssl devel (if compiling native hadoop-pipes and to get the best HDFS encryption performance)
- Linux FUSE (Filesystem in Userspace) version 2.6 or above (if compiling fuse_dfs)
- Internet connection for first build (to fetch all Maven and Hadoop dependencies)
- python (for releasedocs)
- bats (for shell code testing)
- Node.js / bower / Ember-cli (for YARN UI v2 building)
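Before going further, it can help to check which of these tools are already present. A minimal sketch (tool names taken from the list above; this only checks for presence on PATH, not versions):

```shell
# Check whether each build prerequisite is on PATH; versions are not validated.
for tool in java mvn protoc cmake gcc g++ make; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "MISSING: $tool"
  fi
done
```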
2. Materials
All of the packages below are available from this Baidu Pan link: https://pan.baidu.com/s/1qYid49u (password: a7k0)
- 64-bit Linux: CentOS 6.7
- JDK 1.8
- apache-maven-3.3.9: a project-management and build tool built around a standard directory layout and a default build lifecycle
- protobuf 2.5.0: Google's language-neutral, platform-neutral data-interchange format
- hadoop-2.9.0-src
- ant-1.9.7: a tool that ties together and automates the build, test, and deployment steps
3. Setting up the environment
Create three directories:
# servers holds installed software
mkdir -p /export/servers
# software holds the uploaded archives
mkdir -p /export/software
# data holds runtime data
mkdir -p /export/data
Upload all of the archives to /export/software:
yum install lrzsz
cd /export/software
rz
3.1 Install the JDK
cd /export/software
tar -zxvf jdk-8u111.tar.gz -C /export/servers
cd /export/servers/
mv jdk1.8.0_111 jdk
vi /etc/profile
#Press G to jump to the end of the file, then i to insert
export JAVA_HOME=/export/servers/jdk
export PATH=.:$PATH:$JAVA_HOME/bin
#Press ESC, then :wq! to save and exit
source /etc/profile
Running java -version should produce output like the following:
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
3.2 Install Maven
cd /export/software
tar -zxvf apache-maven-3.3.9-bin.tar.gz -C /export/servers
cd /export/servers/
mv apache-maven-3.3.9 maven
vi /etc/profile
#Press G to jump to the end of the file, then i to insert
export MAVEN_HOME=/export/servers/maven
export PATH=.:$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin
#Press ESC, then :wq! to save and exit
source /etc/profile
Running mvn -version should produce output like the following:
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-11T00:41:47+08:00)
Maven home: /export/servers/maven
Java version: 1.8.0_111, vendor: Oracle Corporation
Java home: /export/servers/jdk/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.32-573.el6.x86_64", arch: "amd64", family: "unix"
3.3 Install protobuf
Install the C and C++ toolchain with yum:
yum install gcc
yum install gcc-c++
yum install make
Install protobuf itself:
cd /export/software
tar -zxvf protobuf-2.5.0.tar.gz -C /export/servers
cd /export/servers/
mv protobuf-2.5.0 protobuf
cd protobuf
./configure
make
make install
Running protoc --version should print the following:
libprotoc 2.5.0
3.4 Install CMake
Install the remaining native-build dependencies with yum:
yum install cmake
yum install openssl-devel
yum install ncurses-devel
3.5 Install Ant
cd /export/software
tar -zxvf apache-ant-1.9.7-bin.tar.gz -C /export/servers
cd /export/servers/
mv apache-ant-1.9.7 ant
vi /etc/profile
export ANT_HOME=/export/servers/ant
export PATH=.:$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin:$ANT_HOME/bin
source /etc/profile
Running ant -version should print the following:
Apache Ant(TM) version 1.9.7 compiled on April 9 2016
3.6 Compile Hadoop
cd /export/software
tar -zxvf hadoop-2.9.0-src.tar.gz -C /export/servers
cd /export/servers/hadoop-2.9.0-src
(The original shows a screenshot of the hadoop-2.9.0-src directory here.)
Run the following command to compile hadoop-2.9.0:
mvn package -Pdist,native -DskipTests -Dtar
or
mvn package -DskipTests -Pdist,native
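The build is memory-hungry; if it fails with an OutOfMemoryError, giving Maven a larger heap before re-running the package goal usually helps. A sketch (the heap sizes are illustrative, not from the original article):

```shell
# Illustrative Maven heap settings; tune for your machine.
export MAVEN_OPTS="-Xms256m -Xmx1536m"
echo "MAVEN_OPTS=$MAVEN_OPTS"
# Then re-run the build:
# mvn clean package -Pdist,native -DskipTests -Dtar
```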
After a long wait, output like the following indicates the build completed:
[INFO] Apache Hadoop Main ................................. SUCCESS [ 3.837 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [ 2.712 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [ 3.007 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [ 7.415 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [ 0.470 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [ 3.021 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 8.520 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [ 10.373 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 11.262 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 8.079 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [02:10 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 9.837 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 20.864 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [ 0.152 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [ 44.576 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [02:33 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [ 16.163 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 42.671 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [ 48.829 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 6.403 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [ 0.050 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [ 0.091 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 31.414 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [03:43 min]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [ 0.158 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [01:12 min]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [ 10.529 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 52.556 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [ 6.104 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 39.988 s]
[INFO] Apache Hadoop YARN Timeline Service ................ SUCCESS [ 20.587 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 42.893 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [ 2.929 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 10.088 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [ 6.438 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [ 5.302 s]
[INFO] Apache Hadoop YARN Router .......................... SUCCESS [ 9.236 s]
[INFO] Apache Hadoop YARN TimelineService HBase Backend ... SUCCESS [03:50 min]
[INFO] Apache Hadoop YARN Timeline Service HBase tests .... SUCCESS [02:05 min]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [ 0.064 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [ 4.158 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [ 2.997 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [ 0.119 s]
[INFO] Apache Hadoop YARN UI .............................. SUCCESS [ 0.080 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [ 11.104 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [ 0.373 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 41.954 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 30.493 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [ 7.215 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 14.694 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 11.014 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 11.493 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [ 2.894 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 8.967 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [ 4.284 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 15.474 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 9.117 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 3.438 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [ 3.312 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 9.584 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 6.624 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 3.789 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [ 3.349 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 5.948 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 8.316 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 6.858 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [10:30 min]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 36.425 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [ 9.178 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 2.358 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 9.512 s]
[INFO] Apache Hadoop Resource Estimator Service ........... SUCCESS [ 23.458 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [ 23.971 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 32.428 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [ 0.046 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [02:25 min]
[INFO] Apache Hadoop Cloud Storage ........................ SUCCESS [ 4.366 s]
[INFO] Apache Hadoop Cloud Storage Project ................ SUCCESS [ 0.048 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 42:57 min
[INFO] Finished at: 2017-12-05T18:26:04+08:00
[INFO] Final Memory: 157M/450M
[INFO] ------------------------------------------------------------------------
Enter the hadoop-dist module (hadoop-dist/target) to find the compiled hadoop-2.9.0.
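To confirm the point of the whole exercise — that the native libraries are now 64-bit — `file` can be run against libhadoop. A sketch; the path assumes the default hadoop-dist output layout of the 2.9.0 source tree:

```shell
# Report whether the freshly built native library is a 64-bit ELF object.
check_native() {
  # `file` prints e.g. "ELF 64-bit LSB shared object" for a 64-bit build
  if file "$1/lib/native/libhadoop.so.1.0.0" | grep -q '64-bit'; then
    echo "native libs are 64-bit"
  else
    echo "native libs are NOT 64-bit"
  fi
}
check_native /export/servers/hadoop-2.9.0-src/hadoop-dist/target/hadoop-2.9.0
```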
Notes:
- 1. Maven's default repository is hosted overseas and occasionally fails; switching to the Aliyun mirror helps:
vi /export/servers/maven/conf/settings.xml
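Inside the `<mirrors>` section of settings.xml, an Aliyun mirror entry looks roughly like this (the URL is Aliyun's public Maven mirror; the `id` and `name` values are arbitrary labels):

```xml
<mirror>
  <id>aliyunmaven</id>
  <mirrorOf>central</mirrorOf>
  <name>Aliyun public mirror</name>
  <url>https://maven.aliyun.com/repository/public</url>
</mirror>
```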
- 2. During the build, Maven may report [ERROR] TEST...: some test sources in the tree depend on packages that cannot be found in the configured Maven repository.
Fixes: 1. fall back to the upstream (overseas) Maven repository; 2. modify the source: delete the offending files under the test packages and remove the corresponding dependencies from the module's pom.xml.