Connecting Spark to MySQL
Copy the mysql-connector jar into the spark/jars/ directory.
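For example (the connector jar name/version and the Spark install path are assumptions; adjust them to your environment):
# copy the MySQL JDBC driver into Spark's jars directory (assumed jar name and Spark path)
cp mysql-connector-java-5.1.38-bin.jar /opt/soft/spark/jars/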
Connecting from IDEA
The code is as follows:
package nj.zb.kb11

import org.apache.spark.sql.{DataFrame, SparkSession}

object DataFrameToMysql {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder().appName("sparktohive")
      .master("local[*]").config("hive.metastore.uris", "thrift://192.168.146.222:9083")
      .enableHiveSupport() // enables Hive support (not strictly needed for the JDBC read)
      .getOrCreate()

    // JDBC connection parameters for the MySQL emp database
    val url = "jdbc:mysql://192.168.146.222:3306/emp"
    val user = "root"
    val password = "1"
    val properties = new java.util.Properties()
    properties.setProperty("user", user)
    properties.setProperty("password", password)
    properties.setProperty("driver", "com.mysql.jdbc.Driver")

    // read the emp table into a DataFrame
    val tableDF: DataFrame = spark.read.jdbc(url, "emp", properties)
    tableDF.printSchema()
    tableDF.show()

    // aggregate and write the result back to MySQL as table tttt
    import org.apache.spark.sql.functions._
    val frame: DataFrame = tableDF.agg(max("RETENTION"))
    frame.write.jdbc(url, "tttt", properties)
  }
}
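By default, write.jdbc fails if the target table already exists. A minimal sketch of controlling that behaviour with an explicit save mode, reusing frame, url and properties from the code above (SaveMode.Overwrite is just one of the options):
import org.apache.spark.sql.SaveMode

// overwrite the target table instead of failing when it already exists
frame.write.mode(SaveMode.Overwrite).jdbc(url, "tttt", properties)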
Connecting Spark to Hive
Configuration file
In the Hive installation directory, edit the hive-site.xml file under conf/ and add the following:
<property>
  <name>hive.server2.thrift.client.user</name>
  <value>root</value>
  <description>Username to use against thrift client</description>
</property>
<property>
  <name>hive.server2.thrift.client.password</name>
  <value>root</value>
  <description>Password to use against thrift client</description>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://192.168.146.222:9083</value>
</property>
After saving and exiting, copy the file into the spark/conf/ directory.
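For example (the Hive conf path follows the /opt/soft layout used by the service commands below; the Spark home is an assumption):
# copy the Hive client configuration into Spark's conf directory (Spark path assumed)
cp /opt/soft/hive/conf/hive-site.xml /opt/soft/spark/conf/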
Starting the Hive services
nohup /opt/soft/hive/bin/hive --service metastore &      # start the Hive Metastore (metadata) service
nohup /opt/soft/hive/bin/hive --service hiveserver2 &    # start the HiveServer2 service
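A quick way to confirm both services are listening before connecting (assuming netstat is available; 9083 is the metastore port configured above, 10000 is HiveServer2's default port):
# check that the Metastore (9083) and HiveServer2 (10000) ports are open
netstat -nltp | grep -E '9083|10000'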
Connecting from IDEA
package nj.zb.kb11

import org.apache.spark.sql.SparkSession

// read Hive data with Spark
object SparkToHive {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder().appName("sparktohive")
      .master("local[*]").config("hive.metastore.uris", "thrift://192.168.146.222:9083")
      .enableHiveSupport() // required for the Hive connection
      .getOrCreate()

    // list all databases visible through the Hive metastore
    spark.sql("show databases").collect().foreach(println)
  }
}
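The same session can also query and write Hive tables directly; a minimal sketch (test.employee is the table used in the spark-shell example below, and test.employee_backup is a hypothetical target table):
// read an existing Hive table into a DataFrame
val empDF = spark.table("test.employee")
empDF.show()

// persist the DataFrame back to Hive as a managed table (hypothetical table name)
empDF.write.saveAsTable("test.employee_backup")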
Connecting from spark-shell
This is the simplest approach: once the configuration file is in place, start spark-shell and enter the command directly:
spark.table("test.employee")
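A slightly fuller spark-shell session might look like this (the spark session object is predefined in the shell; test.employee is the table from the command above):
scala> spark.sql("show databases").show()
scala> spark.table("test.employee").printSchema()
scala> spark.table("test.employee").show()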