JDBC数据源实战
Spark SQL支持使用JDBC从关系型数据库(比如MySQL)中读取数据。读取的数据是DataFrame,可以很方便地使用Spark Core提供的各种算子进行处理。
一、JDBC的读取和写入格式
读取JDBC数据的格式
Scala版本
val jdbcDF = sqlContext.read.format("jdbc")
.options(Map(
"url"->"jdbc:mysql://localhost:3306/unicom",
"user" -> "root",
"password" -> "199037",
"dbtable" -> "dataframe1"
))
.load()
或者
val dataframe_mysql = sqlcontext.read.format("jdbc")
.option("url", "jdbc:mysql://localhost:3306/unicom")
.option("driver", "com.mysql.jdbc.Driver")
.option("dbtable", "dataframe1").option("user", "root")
.option("password", "199037")
.load()
或者
val connectProperties=new java.util.Properties()
connectProperties.put("user","username")
connectProperties.put("password","yourpassword")
val url="jdbc:mysql://localhost:3306/unicom"
val df=spark.read.jdbc(url,"outDF",connectProperties)
保存到JDBC的格式
Scala版本
import java.util.Properties
import org.apache.spark.sql.SaveMode
val connectProperties = new Properties()
connectProperties.put("user", "root")
connectProperties.put("password", "199037")
val mysqlDriverUrl = "jdbc:mysql://localhost:3306/unicom?useUnicode=true&characterEncoding=utf-8"
outDF.write.mode(SaveMode.Append).jdbc(mysqlDriverUrl, "outDF", connectProperties)
二、实例讲解
案例:查询分数为优秀的学生信息
import java.util.Properties
import org.apache.spark.sql.functions._
import org.apache.spark.sql.{SQLContext, SaveMode}
import org.apache.spark.{SparkConf, SparkContext}
/**
* Created by cuiyufei on 2018/3/8.
* jdbc实战
*/
object sparkStudy {
/**
* 将学生基本信息和学生成绩信息合并,并找出优秀的学生(成绩大于85)
*/
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("dataframeOperation").setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
//学生基本信息
val stuInfoDF = sqlContext.read.format("jdbc")
.options(Map(
"url"->"jdbc:mysql://localhost:3306/unicom",
"user" -> "root",
"password" -> "199037",
"dbtable" -> "student_info"
))
.load()
//学生成绩信息
val stuScoreDF = sqlContext.read.format("jdbc")
.options(Map(
"url"->"jdbc:mysql://localhost:3306/unicom",
"user" -> "root",
"password" -> "199037",
"dbtable" -> "student_score"
))
.load()
//去除分数大于85的学生的信息
val outDF = stuInfoDF.join(stuScoreDF,"name").filter("score>85")
//将数据写入mysq以方便观察
val connectProperties = new Properties()
connectProperties.put("user", "root")
connectProperties.put("password", "199037")
val mysqlDriverUrl = "jdbc:mysql://localhost:3306/unicom?useUnicode=true&characterEncoding=utf-8"
outDF.write.mode(SaveMode.Append).jdbc(mysqlDriverUrl, "outDF", connectProperties)
}
}
学生基本信息,如下图所示:
学生成绩信息,如下图所示:
优秀学生信息,最后的结果信息如下图所示: