Configuring pom.xml dependencies for Hadoop and Hive UDFs under IntelliJ IDEA + Maven
1. Maven Configuration
1.1 Configure the local repository
Under the Maven installation's root directory, create a directory named localrepository (a single level is enough) to serve as the local repository, and copy its path, e.g. "D:\software\apache-maven-3.5.4\localrepository".
Open D:\software\apache-maven-3.5.4\conf, edit settings.xml with a text editor such as Notepad, find the <localRepository> tag, and set it to that path:
<localRepository>D:\software\apache-maven-3.5.4\localrepository</localRepository>
1.2 Configure remote mirrors
In settings.xml, find the mirrors tag and add the Huawei Cloud and Aliyun mirrors inside it. Note that Maven uses the first declared mirror whose mirrorOf pattern matches, so with <mirrorOf>*</mirrorOf> the Huawei Cloud entry takes precedence and the Aliyun entry below it is effectively a spare definition:
<mirror>
    <id>huaweicloud</id>
    <mirrorOf>*</mirrorOf>
    <url>https://mirrors.huaweicloud.com/repository/maven/</url>
</mirror>
<mirror>
    <id>nexus-aliyun</id>
    <mirrorOf>central</mirrorOf>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>
1.3 IDEA Maven settings
Open IDEA -> Settings, search for "maven", find "User settings file", point it at the conf/settings.xml path above, and tick the Override checkbox next to it.
1.3.1 Two ways to add external JAR files
- Search for the dependency on the Maven repository site and paste the generated <dependency>...</dependency> snippet into pom.xml.
- Project Structure -> Modules -> select the module -> Dependencies -> + -> "JARs or directories" -> choose xxx.jar and confirm.
1.3.2 Packaging and installing your own JAR
- Maven Projects -> Lifecycle -> package: double-click to build the artifact xxx.jar, which is placed in the project's target directory.
- Maven Projects -> Lifecycle -> install: double-click to install the JAR and its pom.xml into the local repository, e.g. apache-maven-3.5.4\localrepository\cn\xym\spark\sparkstreaming\1.0-SNAPSHOT. The path is derived from the project coordinates:
<groupId>cn.xym.spark</groupId>
<artifactId>sparkstreaming</artifactId>
<version>1.0-SNAPSHOT</version>
Other local projects can then pull the JAR in by wrapping these three coordinates in a <dependency> element in their own pom.xml.
Question: when building a project at a company, how many package levels are used?
Package level conventions (see the example after this list):
- Level 1: reversed domain name, since domains are unique, e.g. cn.kgc
- Level 2: project name, e.g. taobaooor
- Level 3: module, e.g. modcart
- Level 4: externally exposed interface layer, e.g. outcart
- Level 5: the class itself, e.g. xxxclass
- Level 6: middleware, e.g. carttmq
- Level 7: engine, e.g. engine
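As a concrete sketch of this layering (all names below are illustrative, following the convention above):

package cn.kgc.taobaooor.modcart.outcart;

// Hypothetical public interface of a shopping-cart module,
// living in the externally exposed "outcart" layer.
public interface CartService {
    void addItem(String itemId, int quantity);
}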
2. Creating a Maven project in IntelliJ
1. Click File -> New -> Project, choose Maven in the dialog, pick the JDK version you have installed, and click Next.
2. Fill in the Maven GroupId and ArtifactId:
GroupId (package): cn.xym.hadoop
ArtifactId (project name): mapreducedemo
3. Set the compile level:
Java Compiler: set the version to 1.8
Modules: set the language level to 8
(Equivalently, set <maven.compiler.source> and <maven.compiler.target> to 1.8 in pom.xml so the setting survives a Maven reimport.)
3. Hadoop dependencies
1. Edit pom.xml: set maven.compiler.source and maven.compiler.target to 1.8.
2. Add the version properties and the Cloudera repository.
At the end of the properties block, add:
<hadoop.version>2.6.0-cdh5.14.2</hadoop.version>
<log4j.version>1.2.17</log4j.version>
At the end of the dependencies block, add:
<!-- hadoop-common -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>${hadoop.version}</version>
</dependency>
<!-- hadoop-hdfs -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>${hadoop.version}</version>
</dependency>
<!-- hadoop-client -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
</dependency>
<!-- log4j -->
<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>${log4j.version}</version>
</dependency>
Above the build tag, add the Cloudera repository; CDH-versioned artifacts such as 2.6.0-cdh5.14.2 are published there, not to Maven Central. If you configured a catch-all mirror (<mirrorOf>*</mirrorOf>) in section 1.2, change it to <mirrorOf>*,!cloudera</mirrorOf> so that requests for this repository are not redirected to the mirror:
<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>
3. Click Enable Auto Import in the pop-up at the bottom right.
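Once the dependencies have resolved, a minimal sketch like the following can verify the setup; the NameNode address is an assumption, substitute your own:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode address -- point this at your own cluster.
        // (On Windows, the Hadoop client may additionally need HADOOP_HOME/winutils.)
        conf.set("fs.defaultFS", "hdfs://localhost:8020");
        FileSystem fs = FileSystem.get(conf);
        // List the HDFS root to confirm the client and cluster can talk.
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}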
4. Hive UDF dependencies
Edit pom.xml with the following configuration, adding the repository:
1. Set maven.compiler.source and maven.compiler.target to 1.8.
2. Below the name tag, paste the repository definition:
<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>
3. Inside the properties tag, add:
<hive.version>1.2.1</hive.version>
<hadoop.version>2.7.3</hadoop.version>
<junit.version>4.12</junit.version>
<mvnshade.version>2.4.1</mvnshade.version>
<!-- The two entries below appear to come from a Confluent/Kafka template:
     0.11.0.0-cp1 is a Kafka build number, not a JDK version. Neither is
     referenced by the dependencies in step 4, so both can be dropped. -->
<jdk.version>0.11.0.0-cp1</jdk.version>
<confluent.maven.repo>http://packages.confluent.io/maven/</confluent.maven.repo>
4. Replace all the tags inside dependencies with the following:
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>${junit.version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>${hive.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>${hadoop.version}</version>
</dependency>
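With these dependencies in place, a UDF is simply a class extending org.apache.hadoop.hive.ql.exec.UDF with an evaluate method. A minimal sketch (class and method contents are illustrative):

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example UDF: upper-cases a string column.
public class ToUpperUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // Hive passes NULLs through; preserve them.
        }
        return new Text(input.toString().toUpperCase());
    }
}

After mvn package, register it in Hive with ADD JAR /path/to/xxx.jar; followed by CREATE TEMPORARY FUNCTION my_upper AS 'ToUpperUDF'; (use the fully qualified class name if the class sits in a package).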
5. HBase dependencies
These are also CDH builds, so the Cloudera repository from section 3 is required:
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.0-cdh5.14.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-common</artifactId>
    <version>1.2.0-cdh5.14.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>1.2.0-cdh5.14.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0-cdh5.14.2</version>
</dependency>
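A minimal client sketch against these versions; the ZooKeeper quorum address is an assumption, replace it with your own:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class HBaseSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Hypothetical ZooKeeper quorum -- replace with your own host(s).
        conf.set("hbase.zookeeper.quorum", "localhost");
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Print the tables visible to this client.
            for (TableName name : admin.listTableNames()) {
                System.out.println(name.getNameAsString());
            }
        }
    }
}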
6. Spark SQL
Quickstart version:
<dependencies>
    <!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.38</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.12.10</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>2.4.5</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>2.4.5</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.scala-lang/scala-reflect -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-reflect</artifactId>
        <version>2.12.10</version>
    </dependency>
</dependencies>
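These coordinates pair Spark 2.4.5 with Scala 2.12 and include the MySQL driver. A minimal sketch reading a MySQL table through Spark SQL (shown in Java; the JDBC URL, credentials, and table name are assumptions):

import java.util.Properties;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class MysqlReadDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("MysqlReadDemo")
                .master("local[*]") // local mode for a quick test
                .getOrCreate();
        Properties props = new Properties();
        // Hypothetical credentials -- replace with your own.
        props.setProperty("user", "root");
        props.setProperty("password", "password");
        props.setProperty("driver", "com.mysql.jdbc.Driver");
        Dataset<Row> df = spark.read()
                .jdbc("jdbc:mysql://localhost:3306/test", "some_table", props);
        df.show();
        spark.stop();
    }
}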
Standard version:
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <mysql.version>5.1.38</mysql.version>
    <scala.version>2.11.12</scala.version>
    <spark.version>2.4.5</spark.version>
    <spark.scala>2.11</spark.scala>
    <hive.version>1.1.0</hive.version>
</properties>
<dependencies>
    <!-- mysql-connector-java -->
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>${mysql.version}</version>
    </dependency>
    <!-- scala-library -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <!-- scala-reflect -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-reflect</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <!-- spark-core -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${spark.scala}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- spark-sql -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${spark.scala}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- spark-hive -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_${spark.scala}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- hive-jdbc -->
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>${hive.version}</version>
    </dependency>
</dependencies>
<build>
    <pluginManagement>
        <!-- Note: pluginManagement only declares plugin configuration. For the Scala
             plugin to actually run during mvn compile, it must also be listed under
             <build><plugins>; IDEA's own Scala support compiles regardless. -->
        <plugins>
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>8</source>
                    <target>8</target>
                </configuration>
            </plugin>
        </plugins>
    </pluginManagement>
</build>
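The standard version adds spark-hive and hive-jdbc, which allow Hive support on the Spark session. A minimal sketch (shown in Java; without a hive-site.xml on the classpath, Spark falls back to a local embedded metastore):

import org.apache.spark.sql.SparkSession;

public class SparkHiveDemo {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("SparkHiveDemo")
                .master("local[*]")  // local mode for a quick test
                .enableHiveSupport() // requires spark-hive on the classpath
                .getOrCreate();
        // List the databases visible through the (local or configured) metastore.
        spark.sql("show databases").show();
        spark.stop();
    }
}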
Finally, a log4j.properties (placed under src/main/resources):
log4j.rootLogger=warn, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
# The file appender below is defined but not attached; to use it, change the
# first line to: log4j.rootLogger=warn, stdout, logfile
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=target/hadoop.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n