I'm new to Maven. I want to package my Hadoop project together with its dependencies into a single jar, and then use it like this:
hadoop jar project.jar com.abc.def.SomeClass1 -params ...
hadoop jar project.jar com.abc.def.AnotherClass -params ...
I also want this jar to have multiple entry points (different Hadoop jobs).
How can I do this?
Thanks!
Solution
There are two ways to create a jar with dependencies:
1. Hadoop supports a jars-in-a-jar format, meaning that your jar can contain a lib folder of jars that will be added to the classpath at job submission and during map / reduce task execution.
2. You can unpack the dependency jars and re-pack them with your own classes into a single monolithic jar.
The first requires you to write a Maven assembly descriptor file, and in practice it is more hassle than it's worth. The second also uses the Maven assembly plugin, but with a built-in descriptor. To use the second, just add the following to the project -> build -> plugins section of your pom:
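    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <version>2.4</version>
      <configuration>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <!-- binds the assembly to the package phase so that mvn package builds it; the id is arbitrary -->
        <execution>
          <id>make-assembly</id>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>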
Now when you run mvn package you'll get two jars in your target folder (names shown are the defaults, assuming you haven't overridden finalName):
- ${project.artifactId}-${project.version}.jar, which contains just the classes and resources of your project
- ${project.artifactId}-${project.version}-jar-with-dependencies.jar, which contains your classes and resources plus everything from your dependency tree with compile or runtime scope, unpacked and repacked into a single jar
For multiple entry points you don't need to do anything special. Just make sure the jar manifest doesn't define a Main-Class entry. That only matters if you explicitly configure a manifest; the default one doesn't name a Main-Class, so you should be fine.
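To see why the Main-Class entry matters, here is a rough sketch of the selection logic. As far as I understand it, hadoop jar uses the manifest's Main-Class if there is one and only falls back to the first command-line argument when there isn't. This is not Hadoop's actual code; EntryPointResolver, Resolved and resolve are names I made up for the illustration.

```java
import java.util.Arrays;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Illustration only: mimics how a launcher could pick the class to run. Not Hadoop source. */
public class EntryPointResolver {

    /** The class chosen for launch and the arguments that would reach its main(). */
    public static final class Resolved {
        public final String mainClass;
        public final String[] args;

        Resolved(String mainClass, String[] args) {
            this.mainClass = mainClass;
            this.args = args;
        }
    }

    /**
     * @param manifest the jar's manifest
     * @param cliArgs  everything typed after "hadoop jar project.jar"
     */
    public static Resolved resolve(Manifest manifest, String[] cliArgs) {
        String fromManifest = manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        if (fromManifest != null) {
            // A Main-Class in the manifest wins, so every CLI token becomes a program argument.
            return new Resolved(fromManifest.trim(), cliArgs);
        }
        if (cliArgs.length == 0) {
            throw new IllegalArgumentException("no main class given");
        }
        // No Main-Class: the first token names the entry point, the rest are its arguments.
        return new Resolved(cliArgs[0], Arrays.copyOfRange(cliArgs, 1, cliArgs.length));
    }

    public static void main(String[] unused) {
        String[] cli = {"com.abc.def.AnotherClass", "-params"};

        // Default manifest without Main-Class: the class named on the command line is used.
        Resolved free = resolve(new Manifest(), cli);
        System.out.println(free.mainClass + " " + Arrays.toString(free.args));
        // prints: com.abc.def.AnotherClass [-params]

        // Manifest that pins Main-Class: AnotherClass is demoted to a plain argument.
        Manifest pinned = new Manifest();
        pinned.getMainAttributes().put(Attributes.Name.MAIN_CLASS, "com.abc.def.SomeClass1");
        Resolved fixed = resolve(pinned, cli);
        System.out.println(fixed.mainClass + " " + Arrays.toString(fixed.args));
        // prints: com.abc.def.SomeClass1 [com.abc.def.AnotherClass, -params]
    }
}
```

So with no Main-Class in the manifest, each job class only needs its own public static void main(String[] args), and you choose between them on the command line exactly as in the question.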