打包是一项神圣、而庄严的工作。package意味着我们离生产已经非常近了。它会把我们之前的大量工作浓缩成为一个、或者多个文件。接下来,运维的同学就可以拿着这些个打包文件在生产上纵横四海了。
这么一项庄严、神圣的工作,却没有受到多数人的关注,大家习惯去网上随意copy一段pom的xml代码,往自己项目里面一扔,然后就开始执行package打包了。大多数只知道,Maven帮助我管理了JAR包的依赖,可以自动下载,很方便。确实,因为它太方便了,很多时候,我们几乎是没有感知它的存在。想起来某个功能的时候,直接去使用就可以了。
而构建的工作其实并不简单!例如:
-
打包后的程序,与生产环境JAR包冲突
-
依赖中有多个版本的依赖,如何选择、排除依赖
-
编译scala,某些JAR包的调用存在兼容问题
-
如何根据不同的环境来加载不同的配置,例如:本地环境、集群环境。
-
编译开源项目报错,根本无从下手解决。
-
...
其实,稍微离生产环境近一些,我们会发现很多的问题都暴露了出来。碰到这些问题的时候,当然可以第一时间百度。但为了能够更精准的定位问题、减少打包时候给别人挖坑,我们还是很有必要来了解一些关于Maven的细节。
目录
- 菜鸟玩dependency,神仙玩plugin
- 分析Hadoop Example模块打包
- JAR包中的META-INF目录
- Maven插件
- 分析Flink Archetype中的pom.xml
- Shade插件
- Assembly插件
- 制作一个属于我们自己的打包程序
菜鸟玩dependency,神仙玩plugin
我们使用Maven的时候,95%的时候关注是dependency,而很少有人真正会花时间去研究Maven的plugin。但小猴要告诉大家,其实Maven工作的核心是plugin,而不是dependency。好吧!再直接一点,菜鸟玩dependency,神仙玩plugin。是不是拼命想要反驳我,大家看看官网Plugin在Maven文档的位置,这意味着什么?
灵魂拷问:大家留意过吗?是不是只去官网上下载Maven,然后随便百度一个教程就开始用Maven了?
image-20210206120452130
分析Hadoop Example模块打包
学习的一种最好方式就是借鉴,借鉴优秀的开源项目。看看别人是怎么做的。所以,接下来,我们就来看看Hadoop是如何打包的。为了方便给大家演示,小猴特意用Maven给大家演示一遍编译、打包。这样效果会明显些。
操作步骤:
-
在github上找到Apahce/hadoop项目(https://github.com/apache/hadoop)
-
找到hadoop-mapreduce-project / hadoop-mapreduce-examples模块。
-
打开pom.xml文件。
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-project</artifactId>
<version>3.4.0-SNAPSHOT</version>
<relativePath>../../hadoop-project</relativePath>
</parent>
<artifactId>hadoop-mapreduce-examples</artifactId>
<version>3.4.0-SNAPSHOT</version>
<description>Apache Hadoop MapReduce Examples</description>
<name>Apache Hadoop MapReduce Examples</name>
<packaging>jar</packaging>
<properties>
<mr.examples.basedir>${basedir}</mr.examples.basedir>
</properties>
<dependencies>
<dependency>
<groupId>commons-cli</groupId>
<artifactId>commons-cli</artifactId>
</dependency>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<scope>test</scope>
<type>test-jar</type>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<scope>test</scope>
<type>test-jar</type>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs-client</artifactId>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<scope>test</scope>
<type>test-jar</type>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-tests</artifactId>
<scope>test</scope>
<type>test-jar</type>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-app</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-app</artifactId>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.sun.jersey.jersey-test-framework</groupId>
<artifactId>jersey-test-framework-grizzly2</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-hs</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.hsqldb</groupId>
<artifactId>hsqldb</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop.thirdparty</groupId>
<artifactId>hadoop-shaded-guava</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</dependency>
<dependency>
<groupId>org.assertj</groupId>
<artifactId>assertj-core</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<manifest>
<mainClass>org.apache.hadoop.examples.ExampleDriver</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>findbugs-maven-plugin</artifactId>
<configuration>
<findbugsXmlOutput>true</findbugsXmlOutput>
<xmlOutput>true</xmlOutput>
<excludeFilterFile>${mr.examples.basedir}/dev-support/findbugs-exclude.xml</excludeFilterFile>
<effort>Max</effort>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.rat</groupId>
<artifactId>apache-rat-plugin</artifactId>
<configuration>
<excludes>
<exclude>src/main/java/org/apache/hadoop/examples/dancing/puzzle1.dta</exclude>
</excludes>
</configuration>
</plugin>
</plugins>
</build>
</project>
通过浏览hadoop example的xml文件,我们发现了以下几点:
-
所有的依赖都在父工程hadoop-project的pom.xml中定义好了。在hadoop example项目中,没有出现任何一个版本号。
image-20210206104307402
-
Hadoop使用了三个插件,一个是maven-jar-plugin、一个是findbugs-maven-plugin、还有一个是apache-rat-plugin。
我们进入到example模块中pom.xml所在的目录中,直接执行mvn package试试看。
[root@compile hadoop-mapreduce-examples]# mvn package
[INFO] Scanning for projects...
[INFO]
[INFO] ------------< org.apache.hadoop:hadoop-mapreduce-examples >-------------
[INFO] Building Apache Hadoop MapReduce Examples 3.2.1
[INFO] --------------------------------[ jar ]---------------------------------
Downloading from apache.snapshots.https: https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-mapreduce-client-app/3.2.1/hadoop-mapreduce-client-app-3.2.1-tests.jar
.....
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-mapreduce-examples ---
[INFO] Executing tasks
main:
[INFO] Executed tasks
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hadoop-mapreduce-examples ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /opt/hadoop-3.2.1-src/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hadoop-mapreduce-examples ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hadoop-mapreduce-examples ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /opt/hadoop-3.2.1-src/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/test/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-mapreduce-examples ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-surefire-plugin:3.0.0-M1:test (default-test) @ hadoop-mapreduce-examples ---
Downloading from central: http://maven.aliyun.com/nexus/content/groups/public/org/apache/maven/surefire/surefire-junit4/3.0.0-M1/surefire-junit4-3.0.0-M1.jar
..........
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.examples.TestBaileyBorweinPlouffe
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.536 s - in org.apache.hadoop.examples.TestBaileyBorweinPlouffe
..........
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO]
[INFO] --- maven-jar-plugin:2.5:jar (default-jar) @ hadoop-mapreduce-examples ---
[INFO]
[INFO] --- maven-site-plugin:3.6:attach-descriptor (attach-descriptor) @ hadoop-mapreduce-examples ---
[INFO] Skipping because packaging 'jar' is not pom.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] -