Project scenario:

Writing a Spark DataFrame to a Hive table with the DataFrame API:

dataFrame.write.mode("overwrite")
  .format("parquet")
  .saveAsTable("sss")

The equivalent Spark SQL approach:

sparkSessionWithHive.sql(
  """
    |insert overwrite table school.stu
    |select id,name,scores from v
    |""".stripMargin)
Problem description:

Both write paths failed at runtime. Tracing the stack, the root cause was:

java.lang.NoSuchMethodError: org.apache.hadoop.fs.FSOutputSummer.<init>(Ljava/util/zip/Checksum;II)V
Cause analysis:

It turns out that spark-core_2.11-2.1.1.jar depends on hadoop-client 2.2.0, which in turn pulls in hadoop-hdfs 2.2.0. This does not match the Hadoop 2.7.7 actually deployed, so the old hadoop-hdfs classes were loaded and the FSOutputSummer constructor signature no longer matched the one HDFS 2.7.7 expects. The conflicting dependency path can be confirmed with mvn dependency:tree.
Solution:

Exclude the transitive org.apache.hadoop:hadoop-hdfs:2.2.0 dependency and re-import hadoop-hdfs 2.7.7 to match the cluster. After rebuilding, the write succeeds.
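Assuming a Maven build, the fix above might be expressed with an exclusion plus an explicit pin; the coordinates and versions below come from the analysis above, so adjust them to your own build:

```xml
<!-- Exclude the stale hadoop-hdfs 2.2.0 pulled in transitively by spark-core -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.1.1</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
    </exclusion>
  </exclusions>
</dependency>

<!-- Explicitly pin hadoop-hdfs to the cluster's version -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
  <version>2.7.7</version>
</dependency>
```

Declaring the 2.7.7 artifact directly ensures Maven's dependency mediation cannot silently reintroduce the 2.2.0 jar through another path.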
Full code:

package SparkToHive

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import java.util.Properties

// Read data from MySQL with Spark and write it into a Hive table
object SparkHivePatition {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("ReadMysqlToHive")
      .set("spark.testing.memory", "2147480000")
    System.setProperty("HADOOP_USER_NAME", "root")

    // Create a Hive-enabled SparkSession
    val sparkSessionWithHive: SparkSession = SparkSession.builder()
      .config("hive.metastore.uris", "thrift://192.168.1.12:9083")
      // Hive warehouse location on HDFS
      .config("spark.sql.warehouse.dir", "hdfs://192.168.1.12:9000/user/hive/warehouse")
      .enableHiveSupport()
      .config("spark.sql.parquet.writeLegacyFormat", true)
      .config(sparkConf)
      .getOrCreate()
    // Port 9083 is the Hive metastore's Thrift port; make sure the metastore service is running on the VM

    // MySQL connection settings
    val prop = new Properties()
    prop.put("user", "root")
    prop.put("password", "123456")
    val url = "jdbc:mysql://master:3306/school?useUnicode=true&characterEncoding=UTF-8&serverTimezone=UTC&useSSL=false"

    // Read the stu table from MySQL (driver: mysql-connector-java-5.1.38) into a DataFrame
    val dataFrame = sparkSessionWithHive.read.jdbc(url, "stu", prop).select("*").where("id>0")
    // .createOrReplaceTempView("v")

    // Switch to the target Hive database and recreate the table
    sparkSessionWithHive.sql("use school")
    sparkSessionWithHive.sql("drop table if exists stu")
    sparkSessionWithHive.sql("create table stu(id int,name string,scores int) ")
    sparkSessionWithHive.sql("select * from stu").show()
    // "row format delimited fields terminated by '\t'")

    // Write the DataFrame out; overwrite replaces existing data, append would add to it
    dataFrame.write.mode("overwrite")
      .format("parquet")
      .saveAsTable("sss")

    // Alternative: Spark SQL insert (requires the temp view "v" registered above)
    // sparkSessionWithHive.sql(
    //   """
    //     |insert overwrite table school.stu
    //     |select id,name,scores from v
    //     |""".stripMargin)

    sparkSessionWithHive.sql("select * from stu").show() // read the Hive table and print to the console
    sparkSessionWithHive.stop()
  }
}