Spark-Hive


Startup: spark-sql

(1) log4j.properties:
log4j.rootCategory=WARN,console
This prevents the flood of INFO output.
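Alternatively, verbosity can be lowered per application from code rather than by editing log4j.properties (a sketch; `spark` is assumed to be an existing SparkSession, as provided in spark-shell or spark-sql sessions):

```scala
// Suppress INFO output for this application only; similar in effect to
// log4j.rootCategory=WARN,console, but scoped to the current SparkContext.
spark.sparkContext.setLogLevel("WARN")
```

This is convenient when you cannot (or do not want to) modify the cluster-wide log4j configuration.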

(2) Copy HIVE_HOME/conf/hive-site.xml to SPARK_HOME/conf/hive-site.xml.
Note that hive-site.xml must be configured correctly; otherwise spark-sql will report a string of errors on startup.

The warehouse location in Hive's hive-site.xml is set by hive.metastore.warehouse.dir:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/home/hadoop/hdfs/hive/iotmp</value>
  <description>location of default database for the warehouse</description>
</property>

When working with Hive in Spark, the SparkSession must be instantiated with Hive support enabled. This includes connectivity to a persistent Hive metastore, support for Hive serdes (serialization/deserialization), and Hive user-defined functions (UDFs). Users who do not have an existing Hive deployment can still enable Hive support. When not configured by hive-site.xml, the context automatically creates metastore_db in the current directory and creates a directory configured by spark.sql.warehouse.dir, which defaults to spark-warehouse in the directory where the Spark application is started. Note that the hive.metastore.warehouse.dir property in hive-site.xml has been deprecated since Spark 2.0.0; use spark.sql.warehouse.dir instead to specify the default location of databases in the warehouse. You may also need to grant write privileges to the user who starts the Spark application.
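The setup described above can be sketched as follows (assumes a Spark 2.x+ application with Hive classes on the classpath; the warehouse path and table name are hypothetical examples):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical warehouse location; in practice it must be a path the
// launching user can write to (see the note on write privileges above).
val warehouseLocation = "/tmp/spark-warehouse"

val spark = SparkSession.builder()
  .appName("Spark Hive Example")
  .config("spark.sql.warehouse.dir", warehouseLocation) // replaces hive.metastore.warehouse.dir
  .enableHiveSupport() // metastore connectivity, Hive serdes, and Hive UDFs
  .getOrCreate()

// With Hive support enabled, HiveQL can be run directly:
spark.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
spark.sql("SHOW TABLES").show()
```

If no hive-site.xml is present, running this creates metastore_db and the spark-warehouse directory as described above.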
