Spark JDBC Development (Database Connection Test)
The following steps are performed in local mode:
1. In Eclipse 4.5, create the project RDDToJDBC and add a lib folder to hold third-party driver jars
[hadoop@CloudDeskTop software]$ cd /project/RDDToJDBC/
[hadoop@CloudDeskTop RDDToJDBC]$ mkdir -p lib
[hadoop@CloudDeskTop RDDToJDBC]$ ls
bin lib src
2. Add the required libraries
2.1. Copy the MySQL driver jar into the lib directory of the RDDToJDBC project
[hadoop@CloudDeskTop software]$ cp -a /software/hive-1.2.2/lib/mysql-connector-java-3.0.17-ga-bin.jar /project/RDDToJDBC/lib/
2.2. Append the Spark development library Spark2.1.1-All to the classpath of the RDDToJDBC project (this can be done by adding it as a user library in Eclipse)
3. Prepare the Spark source data:
[hadoop@CloudDeskTop spark]$ cd /home/hadoop/test/jdbc/
[hadoop@CloudDeskTop jdbc]$ ls
myuser testJDBC.txt
[hadoop@CloudDeskTop jdbc]$ cat myuser
lisi 123456 165 1998-9-9
lisan 123ss 187 2009-10-19
wangwu 123qqwe 177 1990-8-3
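Each line of the myuser file is one whitespace-separated record (the driver code later wraps records in a Tuple4). A minimal sketch of splitting one line into its four fields; the field meanings (name, password, height, birth date) are inferred from the sample data, not stated in the source:

```java
public class UserLineParser {
    // Split one whitespace-separated record into its fields:
    // name, password, height, birth date (inferred field meanings).
    static String[] parse(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] f = parse("lisi 123456 165 1998-9-9");
        // f[0] = name, f[2] = height, f[3] = birth date
        System.out.println(f[0] + " " + f[2] + " " + f[3]);
    }
}
```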
4. Source code:
package com.mmzs.bigdata.spark.core.local;

import java.io.File;
import java.sql.Connection;
import java.sql.Date;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.VoidFunction;

import scala.Tuple4;

public class TestMain {
    /** Global counter */
    private static int count;
    /** Database connection */
    private static Connection conn;
    /** Precompiled (prepared) statement */
    private static PreparedStatement pstat;

    private static final File OUT_PATH = new File("/home/hadoop/test/jdbc/output");

    static {
        delDir(OUT_PATH);
        try {
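The listing above breaks off inside the static initializer. As a rough, self-contained sketch of the JDBC side of the flow (the table name myuser, its column names, and the connection host, database, and credentials are all assumptions, not confirmed by the source), one parsed record could be written through a prepared statement like this:

```java
import java.sql.Connection;
import java.sql.Date;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class JdbcInsertSketch {
    // Hypothetical schema; the source never shows the table definition.
    static final String INSERT_SQL =
            "INSERT INTO myuser(uname, upwd, height, birthday) VALUES (?, ?, ?, ?)";

    // Normalize a date such as "1998-9-9" into java.sql.Date;
    // Date.valueOf accepts the yyyy-[m]m-[d]d form directly.
    static Date toSqlDate(String s) {
        return Date.valueOf(s);
    }

    // Insert one parsed record through a prepared statement.
    static void insertRecord(Connection conn, String name, String pwd,
                             int height, Date birthday) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, name);
            ps.setString(2, pwd);
            ps.setInt(3, height);
            ps.setDate(4, birthday);
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder connection details; adjust to your MySQL instance.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://127.0.0.1:3306/test", "root", "123456")) {
            insertRecord(conn, "lisi", "123456", 165, toSqlDate("1998-9-9"));
        }
    }
}
```

In the Spark driver, this kind of insert would typically run inside a VoidFunction passed to foreachPartition, opening one connection per partition rather than per record.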