Writing from Spark to MySQL and setting a primary key: two Spark 2.2 JDBC write modes (append, Overwrite) with no manual MySQL table creation

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SaveMode}
import org.apache.spark.sql.hive.HiveContext

// Launch with the MySQL driver on the classpath:
// spark-shell --driver-class-path /home/hadoop/hive/lib/mysql-connector-java-5.1.46.jar

object playuser {

  def main(args: Array[String]): Unit = {

    val cf = new SparkConf().setMaster("master").setAppName("NetworkWordCount")
    val sc = new SparkContext(cf)
    val sqlContext = new SQLContext(sc)
    val hc = new HiveContext(sc)

    val format = new java.text.SimpleDateFormat("yyyy-MM-dd")
    val date = format.format(new java.util.Date().getTime - 20 * 24 * 60 * 60 * 1000)
    // val lg = sc.textFile("hdfs://master:9000/data/" + date + "*/01/*.gz")
    val lg = sc.textFile("hdfs://master:9000/data/2018-05-1*/21/*.gz")
    // val date1 = format.format(("27648000000").toLong)

    val url = "jdbc:mysql://196.168.100.88:3306/sharpbi?user=biadmin&password=bi_12345"
    // val url2 = "jdbc:mysql://rds3dabp9v2v7v596tai.mysql.rds.aliyuncs.com/r2d2?user=r2d2_admin&password=Vj0kHdve3"

    import sqlContext.implicits._

    // Pull each field out of the raw log line; "{" marks a missing field,
    // which is mapped to "null" (or to a placeholder epoch for the timestamps).
    val filed2 = lg.map(l => (
      l.split("modeType\":\"").last.split("\"").head.replace("{", "null"),
      l.split("packageName\":\"").last.split("\"").head.replace("{", "null"),
      l.split("siteName\":\"").last.split("\"").head.replace("{", "null"),
      l.split("playType\":\"").last.split("\"").head.replace("{", "null"),
      format.format(l.split("rectime\":").last.split(",").head.replace("{", "27648000000").toLong),
      format.format(l.split("time\":\"").last.split("\"").head.replace("{", "27648000000").toLong),
      l.split("playtime\":\"").last.split("\"").head.replace("{", "null"),
      l.split("custom_uuid\":\"").last.split("\"").head.replace("{", "null")
    )).toDF("modeType", "packageName", "siteName", "playType",
            "rectimedate", "timedate", "playtime", "custom_uuid")

    filed2.registerTempTable("playuser")

    val playuser = sqlContext.sql(
      "select modeType, packageName, siteName, playType, rectimedate, timedate, " +
      "sum(playtime) as playtime, count(custom_uuid) as playstotal, " +
      "count(distinct custom_uuid) as customtotal " +
      "from playuser " +
      "group by modeType, packageName, siteName, playType, rectimedate, timedate")

    val prop = new java.util.Properties

    // "append" inserts into the existing table (Spark creates it on the first write).
    playuser.write.mode("append").jdbc(url, "sharpbi.playuser", prop)
    // "Overwrite" drops and recreates the table, replacing the old data:
    // F1.write.mode("Overwrite").jdbc(url, "sharpbi.test", prop)
    // Legacy 1.x API: F1.insertIntoJDBC(url, "day_uv", false)

    // Read the table back and count rows to verify the write.
    val stud_scoreDF = sqlContext.read.jdbc(url, "sharpbi.playuser", prop)
    stud_scoreDF.count()
  }
}
```
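One caveat behind the title's mention of primary keys: when `mode("Overwrite")` recreates the table, Spark's JDBC writer emits a plain `CREATE TABLE` with no primary key, and string columns default to MySQL `TEXT`, which cannot be indexed without a length. Below is a minimal sketch of one way around this, assuming Spark 2.2's `createTableColumnTypes` write option; the column lengths and the choice of the group-by columns as the key are illustrative assumptions, not from the original post:

```scala
// Override the default TEXT mapping so the key columns are indexable,
// then recreate the table with Overwrite.
playuser.write
  .mode("Overwrite")
  .option("createTableColumnTypes",
    "modeType VARCHAR(64), packageName VARCHAR(128), siteName VARCHAR(128), " +
    "playType VARCHAR(64), rectimedate VARCHAR(10), timedate VARCHAR(10)")
  .jdbc(url, "sharpbi.playuser", prop)

// Spark never emits PRIMARY KEY itself, so add it with a plain JDBC statement
// (the url above already carries the user and password parameters).
val conn = java.sql.DriverManager.getConnection(url)
try {
  conn.createStatement().execute(
    "ALTER TABLE sharpbi.playuser ADD PRIMARY KEY " +
      "(modeType, packageName, siteName, playType, rectimedate, timedate)")
} finally {
  conn.close()
}
```

With the key in place, later `append` writes into the same table can participate in the upsert patterns discussed below.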

In Spark, you can connect to MySQL over JDBC and get insert-or-update (upsert) behavior. A typical workflow:

1. Add the MySQL connector dependency:

```xml
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.23</version>
</dependency>
```

2. Create a `DataFrame` with an explicit schema. Suppose the data is to be inserted into a MySQL table named `users` with this structure:

```sql
CREATE TABLE users (
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(100),
    age INT,
    PRIMARY KEY (id)
);
```

The matching schema can be defined as:

```scala
import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("name", StringType),
  StructField("age", IntegerType)
))
```

3. Read the data and convert it to a `DataFrame`:

```scala
val rdd = sc.parallelize(Seq(
  ("Alice", 25),
  ("Bob", 30),
  ("Charlie", 35)
))
val df = spark.createDataFrame(rdd).toDF("name", "age")
```

4. Write the `DataFrame` to the MySQL table:

```scala
val url = "jdbc:mysql://localhost:3306/mydb"
val user = "username"
val password = "password"

df.write
  .format("jdbc")
  .option("url", url)
  .option("dbtable", "users")
  .option("user", user)
  .option("password", password)
  .option("driver", "com.mysql.jdbc.Driver")
  .option("rewriteBatchedStatements", "true")
  .option("batchsize", "10000")
  .mode("append")
  .save()
```

Here `url` is the JDBC connection string for the MySQL database, `user` and `password` are the credentials, `dbtable` names the target table, and `driver` selects the MySQL JDBC driver class. `rewriteBatchedStatements` and `batchsize` tune write performance: with `rewriteBatchedStatements` set to `true`, the driver rewrites batched inserts into multi-row statements, and `batchsize` controls how many records go into each batch.

5. For a true upsert, use MySQL's `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE`. For example, to update a record's `age` when a row with the same key already exists:

```sql
INSERT INTO users (name, age) VALUES (?, ?)
ON DUPLICATE KEY UPDATE age = VALUES(age)
```

Spark's built-in JDBC writer only emits plain `INSERT` statements, so it cannot issue `ON DUPLICATE KEY UPDATE` by itself. One common workaround is to open a connection per partition and run the upsert as a JDBC batch:

```scala
import java.sql.DriverManager

df.rdd.foreachPartition { rows =>
  val conn = DriverManager.getConnection(url, user, password)
  val stmt = conn.prepareStatement(
    "INSERT INTO users (name, age) VALUES (?, ?) " +
    "ON DUPLICATE KEY UPDATE age = VALUES(age)")
  try {
    rows.foreach { row =>
      stmt.setString(1, row.getString(0))
      stmt.setInt(2, row.getInt(1))
      stmt.addBatch()
    }
    stmt.executeBatch()
  } finally {
    stmt.close()
    conn.close()
  }
}
```
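Another pattern, offered here as a hedged sketch rather than a drop-in recipe: stage the rows with Spark's normal JDBC write, then let MySQL merge them server-side in one statement. The scratch table name `users_stage` is hypothetical, and `prop` is a `java.util.Properties` object carrying the `user` and `password` settings from above:

```scala
import java.sql.DriverManager

// Connection properties for the staging write (same credentials as above).
val prop = new java.util.Properties
prop.setProperty("user", user)
prop.setProperty("password", password)

// 1. Bulk-load into a scratch table; Overwrite recreates it on every run.
df.write.mode("overwrite").jdbc(url, "users_stage", prop)

// 2. Merge the staged rows into the real table in a single server-side statement.
val conn = DriverManager.getConnection(url, user, password)
try {
  conn.createStatement().execute(
    "INSERT INTO users (name, age) " +
      "SELECT name, age FROM users_stage " +
      "ON DUPLICATE KEY UPDATE age = VALUES(age)")
} finally {
  conn.close()
}
```

Compared with per-partition upserts, this keeps the parallel write on Spark's fast batched-insert path and leaves the merge logic to the database, at the cost of one extra table.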
