Spark & Hive集成

Spark & Hive集成

Code

  • Modify hive-site.xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://CentOS:3306/hive?createDatabaseIfNotExist=true</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>root</value>
</property>

 <!--开启MetaStore服务,用于Spark读取hive中的元数据-->
<property>
    <name>hive.metastore.uris</name>
    <value>thrift://CentOS:9083</value>
</property>
<property>
    <name>hive.metastore.local</name>
    <value>false</value>
</property>

<property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
</property>

  • Start the metastore service
[root@CentOS apache-hive-1.2.2-bin]# ./bin/hive --service metastore >/dev/null 2>&1 &
[1] 55017
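
To confirm the metastore service is reachable before wiring up Spark, you can query it directly with the thrift client shipped in the Hive libraries. A minimal sketch, assuming the spark-hive dependency from the next step is on the classpath:

import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient

object MetastorePing {
  def main(args: Array[String]): Unit = {
    val conf = new HiveConf()
    conf.set("hive.metastore.uris", "thrift://CentOS:9083")
    // talks to the metastore directly over thrift, bypassing Spark entirely
    val client = new HiveMetaStoreClient(conf)
    println(client.getAllDatabases) // e.g. [baizhi, default, ...]
    client.close()
  }
}
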
  • Add the following dependencies
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.4.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.11</artifactId>
    <version>2.4.5</version>
</dependency>
  • Write the following code
import org.apache.spark.sql.SparkSession

// configure the Spark session
val spark = SparkSession.builder()
  .appName("Spark Hive Example")
  .master("local[*]")
  .config("hive.metastore.uris", "thrift://CentOS:9083")
  .enableHiveSupport() // enable Hive support
  .getOrCreate()

spark.sql("show databases").show()
spark.sql("use baizhi")
spark.sql("select * from t_emp").na.fill(0.0).show()

spark.close()
+-----+------+---------+----+-------------------+-------+-------+------+
|empno| ename|      job| mgr|           hiredate|    sal|   comm|deptno|
+-----+------+---------+----+-------------------+-------+-------+------+
| 7369| SMITH|    CLERK|7902|1980-12-17 00:00:00| 800.00|   0.00|    20|
| 7499| ALLEN| SALESMAN|7698|1981-02-20 00:00:00|1600.00| 300.00|    30|
| 7521|  WARD| SALESMAN|7698|1981-02-22 00:00:00|1250.00| 500.00|    30|
| 7566| JONES|  MANAGER|7839|1981-04-02 00:00:00|2975.00|   0.00|    20|
| 7654|MARTIN| SALESMAN|7698|1981-09-28 00:00:00|1250.00|1400.00|    30|
| 7698| BLAKE|  MANAGER|7839|1981-05-01 00:00:00|2850.00|   0.00|    30|
| 7782| CLARK|  MANAGER|7839|1981-06-09 00:00:00|2450.00|   0.00|    10|
| 7788| SCOTT|  ANALYST|7566|1987-04-19 00:00:00|1500.00|   0.00|    20|
| 7839|  KING|PRESIDENT|   0|1981-11-17 00:00:00|5000.00|   0.00|    10|
| 7844|TURNER| SALESMAN|7698|1981-09-08 00:00:00|1500.00|   0.00|    30|
| 7876| ADAMS|    CLERK|7788|1987-05-23 00:00:00|1100.00|   0.00|    20|
| 7900| JAMES|    CLERK|7698|1981-12-03 00:00:00| 950.00|   0.00|    30|
| 7902|  FORD|  ANALYST|7566|1981-12-03 00:00:00|3000.00|   0.00|    20|
| 7934|MILLER|    CLERK|7782|1982-01-23 00:00:00|1300.00|   0.00|    10|
+-----+------+---------+----+-------------------+-------+-------+------+
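
Because enableHiveSupport() wires the session to the Hive catalog, query results can also be written back as Hive tables. A minimal sketch that would slot in before the spark.close() call above (the table name t_emp_clean is hypothetical):

import org.apache.spark.sql.SaveMode

// persist the null-filled result as a managed Hive table (hypothetical name)
val cleaned = spark.sql("select * from t_emp").na.fill(0.0)
cleaned.write.mode(SaveMode.Overwrite).saveAsTable("baizhi.t_emp_clean")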

Interactive Shell

1. Copy spark-hive_2.11-2.4.5.jar and spark-hive-thriftserver_2.11-2.4.5.jar into Spark's jars directory, then restart Spark.

Link: https://pan.baidu.com/s/1nkKesJyRitfvjO7bINlHdA
Extraction code: 9nkb

2. Copy the hive-site.xml file into Spark's conf directory.

3. Add Hive's jars to Hadoop's classpath, i.e. every jar under Hive's lib directory (see the environment file below).

SPARK_HOME=/usr/spark-2.4.5
KE_HOME=/usr/kafka-eagle
M2_HOME=/usr/apache-maven-3.6.3
SQOOP_HOME=/usr/sqoop-1.4.7
HIVE_HOME=/usr/apache-hive-1.2.2-bin
JAVA_HOME=/usr/java/latest
HADOOP_HOME=/usr/hadoop-2.9.2/
HBASE_HOME=/usr/hbase-1.2.4/
ZOOKEEPER_HOME=/usr/zookeeper-3.4.6
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin:$M2_HOME/bin:$HIVE_HOME/bin:$SQOOP_HOME/bin:$ZOOKEEPER_HOME/bin:$KE_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin
CLASSPATH=.
export JAVA_HOME
export PATH
export CLASSPATH
export HADOOP_HOME
export HBASE_HOME
HBASE_CLASSPATH=$(/usr/hbase-1.2.4/bin/hbase classpath)
HADOOP_CLASSPATH=/root/mysql-connector-java-5.1.49.jar:/usr/spark-2.4.5/jars/spark-hive_2.11-2.4.5.jar:/usr/spark-2.4.5/jars/spark-hive-thriftserver_2.11-2.4.5.jar:$HIVE_HOME/lib/*
export HADOOP_CLASSPATH
export M2_HOME
export HIVE_HOME
export SQOOP_HOME
export ZOOKEEPER_HOME
export KE_HOME
export SPARK_HOME
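
After reloading the environment (for example by logging in again), running hadoop classpath should now include the Hive jars and the MySQL connector.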

4. Run the following command

[root@CentOS spark-2.4.5]# ./bin/spark-sql --master spark://CentOS:7077 --total-executor-cores 6 --packages org.apache.spark:spark-hive-thriftserver_2.11:2.4.5
spark-sql> show databases;
20/11/04 12:06:33 INFO codegen.CodeGenerator: Code generated in 748.341192 ms
baizhi
default
test
Time taken: 5.818 seconds, Fetched 3 row(s)
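
The spark-sql shell and the programmatic API read the same metastore, so the listing above can be cross-checked from code through the Catalog API. A minimal sketch, reusing the session settings from the code section above:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Catalog Example")
  .master("local[*]")
  .config("hive.metastore.uris", "thrift://CentOS:9083")
  .enableHiveSupport()
  .getOrCreate()

// the Catalog API exposes what `show databases` / `show tables` print in spark-sql
spark.catalog.listDatabases().show()
spark.catalog.listTables("baizhi").show()

spark.close()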