Building an Offline Data Warehouse: 1. Compiling Hadoop
Compiling Hadoop from source
Download links
https://hadoop.apache.org/release.html
https://archive.apache.org/dist/hadoop/common/
Extract hadoop-3.1.3-src.tar.gz
Read BUILDING.txt in hadoop-3.1.3-src
Following the requirements listed in BUILDING.txt, download the necessary packages in advance
JDK:1.8
https://www.oracle.com/cn/java/technologies/javase/javase-jdk8-downloads.html
Maven:3.6
http://maven.apache.org/download.cgi
ProtocolBuffer 2.5.0
https://github.com/protocolbuffers/protobuf/releases/tag/v2.5.0
CMake: 3.1 or newer (3.17.0 is used here)
On the virtual machine, create the directories software and module under /opt, then create a hadoop_source directory inside each of them
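The directory layout described above can be sketched as a small shell function. The function name make_layout and its optional base-directory argument are assumptions made for illustration; only the /opt paths come from the text.

```shell
# Create the staging (software) and install (module) directories.
# make_layout is a hypothetical helper; the base directory defaults to /opt.
make_layout() {
    base="${1:-/opt}"
    mkdir -p "$base/software/hadoop_source" "$base/module/hadoop_source"
}

# Usage on the build VM (as root):
#   make_layout /opt
```

Using mkdir -p keeps the step safe to repeat, since it does not fail when the directories already exist.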
Copy the following packages into /opt/software/hadoop_source:
hadoop-3.1.3-src.tar.gz
jdk-8u212-linux-x64.tar.gz
apache-maven-3.6.3-bin.tar.gz
protobuf-2.5.0.tar.gz
cmake-3.17.0.tar.gz
Then extract them into /opt/module/hadoop_source, except the JDK, which goes to /opt/module/
# The JDK is installed under /opt/module/
[root@bogon hadoop_source]# tar -zxvf jdk-8u212-linux-x64.tar.gz -C /opt/module/
[root@bogon hadoop_source]# tar -zxvf hadoop-3.1.3-src.tar.gz -C /opt/module/hadoop_source/
[root@bogon hadoop_source]# tar -zxvf apache-maven-3.6.3-bin.tar.gz -C /opt/module/hadoop_source/
[root@bogon hadoop_source]# tar -zxvf cmake-3.17.0.tar.gz -C /opt/module/hadoop_source/
[root@bogon hadoop_source]# tar -zxvf protobuf-2.5.0.tar.gz -C /opt/module/hadoop_source/
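The repeated tar commands could also be written as a loop. This is only a sketch: extract_all is a hypothetical helper, and it sends every archive to one target directory, so the JDK would still need its own tar command if it is to live under /opt/module/ as in the steps above.

```shell
# Extract every .tar.gz found in a staging directory into a target directory.
# extract_all is a hypothetical helper, not part of any tool used here.
extract_all() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"
    for archive in "$src"/*.tar.gz; do
        [ -e "$archive" ] || continue   # the glob matched nothing
        tar -zxf "$archive" -C "$dest"
    done
}

# Usage:
#   extract_all /opt/software/hadoop_source /opt/module/hadoop_source
```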
Configure a Maven mirror
[root@bogon hadoop_source]# vi apache-maven-3.6.3/conf/settings.xml
# Add the Aliyun mirror inside the <mirrors> element
<mirror>
<id>nexus-aliyun</id>
<mirrorOf>central</mirrorOf>
<name>Nexus aliyun</name>
<url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
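As an alternative to editing the global conf/settings.xml, the same mirror entry could be placed in a user-level settings file, which Maven reads from ~/.m2/settings.xml by default. The sketch below assumes that approach; write_mirror_settings and its path argument are hypothetical names, and the mirror values are copied from the entry above.

```shell
# Write a minimal user-level Maven settings file holding only the mirror.
# write_mirror_settings is a hypothetical helper; note that it overwrites
# the target file, so back up any existing settings first.
write_mirror_settings() {
    target="${1:-$HOME/.m2/settings.xml}"
    mkdir -p "$(dirname "$target")"
    cat > "$target" <<'EOF'
<settings>
  <mirrors>
    <mirror>
      <id>nexus-aliyun</id>
      <mirrorOf>central</mirrorOf>
      <name>Nexus aliyun</name>
      <url>http://maven.aliyun.com/nexus/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
EOF
}

# Usage:
#   write_mirror_settings            # writes ~/.m2/settings.xml
```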
Configure the JDK and Maven environment variables
[root@bogon hadoop_source]# vi /etc/profile.d/my_env.sh
# Add the following
# JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_212
export PATH=$PATH:$JAVA_HOME/bin
# MAVEN_HOME
export MAVEN_HOME=/opt/module/hadoop_source/apache-maven-3.6.3
export PATH=$PATH:$MAVEN_HOME/bin
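Because my_env.sh is sourced more than once in this walkthrough, each run appends the same directories to PATH again. One way to avoid that, sketched here under the assumption that a small helper in the profile script is acceptable, is to append a directory only when it is not already present. The name path_append is hypothetical.

```shell
# Append a directory to PATH only if it is not already listed.
# path_append is a hypothetical helper for use inside my_env.sh.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already present, do nothing
        *) PATH="$PATH:$1" ;;
    esac
}

export JAVA_HOME=/opt/module/jdk1.8.0_212
export MAVEN_HOME=/opt/module/hadoop_source/apache-maven-3.6.3
path_append "$JAVA_HOME/bin"
path_append "$MAVEN_HOME/bin"
export PATH
```

Sourcing the file repeatedly then leaves PATH unchanged after the first run.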
Reload the configuration file and test the JDK and Maven
[root@bogon hadoop_source]# source /etc/profile.d/my_env.sh
[root@bogon hadoop_source]# java -version
java version "1.8.0_212"
Java(TM) SE Runtime Environment (build 1.8.0_212-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.212-b10, mixed mode)
[root@bogon hadoop_source]# mvn -version
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /opt/module/hadoop_source/apache-maven-3.6.3
Java version: 1.8.0_212, vendor: Oracle Corporation, runtime: /opt/module/hadoop_source/jdk1.8.0_212/jre
Default locale: zh_CN, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-693.el7.x86_64", arch: "amd64", family: "unix"
[root@bogon hadoop_source]#
Install the required dependencies
# C and C++ toolchain
yum install -y gcc* make
# Compression libraries used by Hadoop
yum -y install snappy* bzip2* lzo* zlib* lz4* gzip*
# Tools
yum -y install openssl* svn ncurses* autoconf automake libtool
yum -y install epel-release
yum -y install *zstd*
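After the yum commands finish, a quick check that the build tools are actually on PATH can save a failed build later. The sketch below is one possible check; missing_tools is a hypothetical helper, and the list of tool names in the usage line is an assumption about which commands the packages above provide.

```shell
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage (prints nothing when every tool is available):
#   missing_tools gcc g++ make autoconf automake libtool
```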
Install CMake: in the cmake-3.17.0 directory, run ./bootstrap to configure the build
[root@bogon hadoop_source]# cd cmake-3.17.0/
[root@bogon cmake-3.17.0]# pwd
/opt/module/hadoop_source/cmake-3.17.0
[root@bogon cmake-3.17.0]# ./bootstrap
Build and install, then test
[root@bogon cmake-3.17.0]# make && make install
...
[root@bogon cmake-3.17.0]# cmake -version
Install protobuf
[root@bogon protobuf-2.5.0]# cd /opt/module/hadoop_source/protobuf-2.5.0/
[root@bogon protobuf-2.5.0]# pwd
/opt/module/hadoop_source/protobuf-2.5.0
[root@bogon protobuf-2.5.0]# ./configure --prefix=/opt/module/hadoop_source/protobuf-2.5.0
[root@bogon protobuf-2.5.0]# make && make install
Configure the environment variables
[root@bogon protobuf-2.5.0]# vi /etc/profile.d/my_env.sh
# Add the following
# PROTOC_HOME
export PROTOC_HOME=/opt/module/hadoop_source/protobuf-2.5.0
export PATH=$PATH:$PROTOC_HOME/bin
# Reload the configuration file
[root@bogon protobuf-2.5.0]# source /etc/profile
# Verify
[root@bogon protobuf-2.5.0]# protoc --version
libprotoc 2.5.0
[root@bogon protobuf-2.5.0]#
At this point, all packages are installed and configured
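Before starting a build that takes the better part of an hour, it may be worth confirming that the installed versions meet the minimums. The following is a minimal sketch that compares version strings with GNU sort -V; version_ge is a hypothetical helper, and the way each version is parsed in the usage comments assumes the output formats shown in this walkthrough.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper and relies on GNU sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   version_ge "$(cmake --version | awk 'NR==1 {print $3}')" 3.1 \
#       || echo "CMake is too old"
#   [ "$(protoc --version)" = "libprotoc 2.5.0" ] \
#       || echo "protoc must be exactly 2.5.0"
```

The protoc check uses string equality instead of version_ge because this Hadoop release asks for ProtocolBuffer 2.5.0 specifically.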
Compile the source
Change into the extracted Hadoop source directory
[root@bogon hadoop-3.1.3-src]# pwd
/opt/module/hadoop_source/hadoop-3.1.3-src
[root@bogon hadoop-3.1.3-src]# mvn clean package -DskipTests -Pdist,native -Dtar
…
[INFO] --- maven-site-plugin:3.6:attach-descriptor (attach-descriptor) @ hadoop-cloud-storage-project ---
[INFO] No site descriptor found: nothing to attach.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Hadoop Main 3.1.3:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [01:36 min]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [ 50.872 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [ 14.503 s]
...
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 18.089 s]
[INFO] Apache Hadoop Client Modules ....................... SUCCESS [ 0.012 s]
[INFO] Apache Hadoop Cloud Storage ........................ SUCCESS [ 0.199 s]
[INFO] Apache Hadoop Cloud Storage Project ................ SUCCESS [ 0.012 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 53:05 min
[INFO] Finished at: 2021-03-31T15:08:11+08:00
[INFO] ------------------------------------------------------------------------
[root@bogon hadoop-3.1.3-src]#
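In the mvn command above, -Pdist,native activates the distribution and native-code profiles, -Dtar also packages the result as a tarball, and -DskipTests skips running the tests. Since the build runs for a long time, one option is to capture the output in a log and check it afterwards. This is a sketch only; build_succeeded and the log file name build.log are hypothetical.

```shell
# Succeed when a saved Maven log contains the final success banner.
# build_succeeded is a hypothetical helper.
build_succeeded() {
    grep -q 'BUILD SUCCESS' "$1"
}

# Usage, from the Hadoop source root:
#   mvn clean package -DskipTests -Pdist,native -Dtar 2>&1 | tee build.log
#   build_succeeded build.log && echo "build OK"
```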
The build output is in /opt/module/hadoop_source/hadoop-3.1.3-src/hadoop-dist/target
[root@bogon target]# ll
total 287536
drwxr-xr-x. 2 root root 28 Mar 31 15:07 antrun
drwxr-xr-x. 3 root root 22 Mar 31 15:07 classes
drwxr-xr-x. 9 root root 149 Mar 31 15:07 hadoop-3.1.3
-rw-r--r--. 1 root root 294433524 Mar 31 15:08 hadoop-3.1.3.tar.gz
drwxr-xr-x. 3 root root 22 Mar 31 15:07 maven-shared-archive-resources
drwxr-xr-x. 3 root root 22 Mar 31 15:07 test-classes
drwxr-xr-x. 2 root root 6 Mar 31 15:07 test-dir
[root@bogon target]# pwd
/opt/module/hadoop_source/hadoop-3.1.3-src/hadoop-dist/target
[root@bogon target]#
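Since the build used the native profile, a reasonable sanity check is to confirm that the tarball actually contains the native Hadoop library. The sketch below lists the archive and looks for lib/native/libhadoop.so; has_native_lib is a hypothetical helper.

```shell
# Succeed when the tarball contains the native Hadoop library.
# has_native_lib is a hypothetical helper.
has_native_lib() {
    tar -tzf "$1" | grep -q 'lib/native/libhadoop\.so'
}

# Usage, from the hadoop-dist/target directory:
#   has_native_lib hadoop-3.1.3.tar.gz && echo "native library packaged"
# The unpacked build can also report which native libraries it loads:
#   hadoop-3.1.3/bin/hadoop checknative -a
```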
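The Hadoop build is complete
Compiling from source is only necessary when a specific requirement forces you to modify the Hadoop source
Otherwise, simply download the official hadoop-3.1.3.tar.gz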
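When downloading the official tarball instead, it is sensible to check it against the SHA-512 digest published alongside the release. The following is a minimal sketch, assuming sha512sum is available and the expected digest is supplied as lowercase hex; sha512_matches is a hypothetical helper, and the download URL is built from the archive address listed at the top of this post.

```shell
# Succeed when the file's SHA-512 digest equals the expected hex string.
# sha512_matches is a hypothetical helper; pass the digest in lowercase.
sha512_matches() {
    actual=$(sha512sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}

# Usage:
#   curl -fLO https://archive.apache.org/dist/hadoop/common/hadoop-3.1.3/hadoop-3.1.3.tar.gz
#   sha512_matches hadoop-3.1.3.tar.gz "<digest published with the release>" \
#       || echo "checksum mismatch"
```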