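I ran into a few pitfalls while setting up IDEA and worked through them. Without further ado, here are the steps.
1. Configure a local HADOOP_HOME
Download the Windows build of winutils.
Address: https://github.com/srccodes/hadoop-common-2.2.0-bin. Download the project's zip package directly; the downloaded file is named hadoop-common-2.2.0-bin-master.zip. Extract it to any directory, for example E:\hadoop-common\
There are two ways to make the local winutils available to the program.
The first is to add this line at the start of the code: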
System.setProperty("hadoop.home.dir", "E:\\hadoop-common")
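The second is to set an environment variable (I did this, but it did not take effect; note that an already running IDEA only sees environment changes after it is restarted):
Add a user variable HADOOP_HOME whose value is the directory where the zip was extracted (E:\hadoop-common), then add %HADOOP_HOME%\bin to the system variable Path.
Reason: the program relies on HADOOP_HOME to find winutils.exe. Because Windows has no such environment variable configured, the home directory resolves to null and the program reports an error for null\bin\winutils.exe.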
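The lookup described above can be checked before Spark starts. The following is a minimal sketch, assuming the example directory E:\hadoop-common from this step. The class name WinutilsCheck and its two helpers are hypothetical and only illustrate the idea; the lookup order (system property first, then environment variable) is an assumption about how Hadoop resolves its home directory.

```java
import java.io.File;

public class WinutilsCheck {
    // Hypothetical helper: joins <home>, "bin" and "winutils.exe".
    // With a null home, string concatenation yields a path starting with "null",
    // which matches the null\bin\winutils.exe seen in the error.
    static String winutilsPath(String hadoopHome) {
        return hadoopHome + File.separator + "bin" + File.separator + "winutils.exe";
    }

    // Assumed lookup order: the hadoop.home.dir system property first,
    // then the HADOOP_HOME environment variable.
    static String resolveHadoopHome() {
        String home = System.getProperty("hadoop.home.dir");
        if (home == null) {
            home = System.getenv("HADOOP_HOME");
        }
        return home;
    }

    public static void main(String[] args) {
        // Same setting as the first method in this step; adjust the path to your machine.
        System.setProperty("hadoop.home.dir", "E:\\hadoop-common");
        File exe = new File(winutilsPath(resolveHadoopHome()));
        System.out.println(exe.getPath() + (exe.isFile() ? " found" : " is missing"));
    }
}
```

If the output says the file is missing, the directory probably points one level too high or too low relative to the bin folder in the extracted zip.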
2. Put hive/conf/hive-site.xml, core-site.xml from hadoop-conf, and hdfs-site.xml from hdfs-conf into the project's resources directory
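In a Maven project, files under the resources directory are copied to the classpath root at build time, so one way to confirm this step worked is to look the three files up through the class loader. This is a small sketch, not part of the original setup; the class name ConfigOnClasspath and the helper locate are hypothetical.

```java
import java.net.URL;

public class ConfigOnClasspath {
    // Returns the URL of a classpath resource, or null if it is not present.
    static URL locate(String name) {
        return ConfigOnClasspath.class.getClassLoader().getResource(name);
    }

    public static void main(String[] args) {
        String[] names = {"hive-site.xml", "core-site.xml", "hdfs-site.xml"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url != null ? url : "NOT on classpath"));
        }
    }
}
```

Run it from the same IDEA module as the Spark code, since the result depends on that module's classpath.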
The hive-site.xml file is as follows:
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/usr/hive/warehouse</value>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/usr/hive/tmp</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://master:9083</value>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
</configuration>
3. Configure the pom.xml file
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.badou</groupId>
  <artifactId>spark11Pro</artifactId>
  <version>1.0-SNAPSHOT</version>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    <scala.binary.version>2.11</scala.binary.version>
    <PermGen>64m</PermGen>
    <MaxPermGen>512m</MaxPermGen>
    <spark.version>2.0.0</spark.version>
    <scala.version>2.11</scala.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.version}</artifactId>
      <version>${spark.version}</version>
      <!--<exclusions>-->
      <!--<exclusion>-->
      <!--<groupId>org.slf4j</groupId>-->
      <!--<artifactId>slf4j-log4j12</artifactId>-->
      <!--</exclusion>-->
      <!--</exclusions>-->
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-8_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dep