Approach 1: connect directly to the metastore database (requires the spark-hive and hive-hcatalog-core dependencies)
val spark = SparkSession.builder().master("local").appName("datasource")
  // HDFS namenode and Hive warehouse location
  .config("fs.defaultFS", "hdfs://wml.com:9000")
  .config("spark.sql.warehouse.dir", "hdfs://wml.com:9000/test")
  // the four JDO settings for the MySQL-backed metastore
  .config("javax.jdo.option.ConnectionURL", "jdbc:mysql://wml.com:3306/test?createDatabaseIfNotExist=true")
  .config("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver")
  .config("javax.jdo.option.ConnectionUserName", "root")
  .config("javax.jdo.option.ConnectionPassword", "root")
  .enableHiveSupport().getOrCreate()
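Once the session above is created, Hive tables can be read and written through it. A minimal sketch; the file path and the table name people are illustrative assumptions, not from the source:

```scala
// Assumes the SparkSession built above with enableHiveSupport().
// Load a JSON file and register it as a Hive table in the warehouse dir
val df = spark.read.json("hdfs://wml.com:9000/data/people.json")
df.write.mode("overwrite").saveAsTable("people")

// Query it back through the metastore
spark.sql("SELECT * FROM people").show()
```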
Approach 2: connect via Hive's metastore service (requires the spark-hive and hive-hcatalog-core dependencies)
1. Configure hive-site.xml on the Hive host (VM):
1). set hive.metastore.schema.verification to false
2). set hive.metastore.uris to thrift://hadoop-senior.test.com:9083
2. Start the metastore service on the Hive host:
bin/hive --service metastore &
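The two settings from step 1 look like this as hive-site.xml properties (host name as given above):

```xml
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hadoop-senior.test.com:9083</value>
</property>
```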
val spark = SparkSession.builder().master("local").appName("hive-datasource")
  .config("fs.defaultFS", "hdfs://wml.com:9000")
  .config("spark.sql.warehouse.dir", "hdfs://wml.com:9000/test")
  .config("hive.metastore.uris", "thrift://wml.com:9083")
  .enableHiveSupport().getOrCreate()
When running on the VM (submitted to the cluster), use:
val spark = SparkSession.builder().appName("hive-datasource-server")
  .config("spark.sql.warehouse.dir", "hdfs://hadoop-senior.test.com:8020/test")
  .enableHiveSupport().getOrCreate()
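A sketch of what the HiveDataSourceServer entry point could look like; only the class name, the session settings, and the json-file argument come from these notes, while the table name people and the write logic are assumptions:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical body for the job submitted with spark-submit;
// the table name "people" is an assumption.
object HiveDataSourceServer {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("hive-datasource-server")
      .config("spark.sql.warehouse.dir", "hdfs://hadoop-senior.test.com:8020/test")
      .enableHiveSupport().getOrCreate()

    // args(0) is the JSON file passed on the spark-submit command line (e.g. people.json)
    val df = spark.read.json(args(0))
    df.write.mode("overwrite").saveAsTable("people")
    spark.sql("SELECT COUNT(*) FROM people").show()

    spark.stop()
  }
}
```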
Summary
1. Connecting directly to the metastore database:
hive-site.xml needs the four JDO connection parameters (URL, driver, user, password) and must not set hive.metastore.uris;
the MySQL driver jar must be on the classpath.
2. Connecting via the Hive metastore service:
1). only the spark-sql dependency is needed; package the application as a jar and copy it to the VM
2). VM-side configuration:
1). copy hive-hcatalog-core-xxx.jar into Spark's jars directory
2). copy hive-site.xml into Spark's conf directory
3). submit the job: bin/spark-submit --class HiveDataSourceServer xxxx.jar people.json