Spark SQL is powerful: it can read from and write to any database with a JDBC driver. Let's start with MySQL; it's super simple:
1. Create a test table and some data in MySQL:
CREATE TABLE t_realtime.test_spark2mysql (
  id bigint(11) NOT NULL AUTO_INCREMENT,
  name varchar(30) DEFAULT NULL,
  age int DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;
INSERT INTO t_realtime.test_spark2mysql
(name, age)
VALUES('张三', 23);
INSERT INTO t_realtime.test_spark2mysql
(name, age)
VALUES('李四', 33);
INSERT INTO t_realtime.test_spark2mysql
(name, age)
VALUES('王五', 66);
2. On a Spark client machine (one with the Spark client installed), start spark-sql:
## Note: if the MySQL JDBC jar is already on the cluster classpath, there is no need to submit it; otherwise prepare the MySQL JDBC jar.
## Place mysql-connector-java-5.1.45-bin.jar on the client machine you submit from.
## --driver-class-path: jars needed by the driver
## --jars: jars needed by the executors
spark-sql --driver-class-path /home/dw/hubg/mysql-connector-java-5.1.45-bin.jar \
--jars /home/dw/hubg/mysql-connector-java-5.1.45-bin.jar
3. Run the Spark SQL statements:
-- create a temporary view in Spark mapped to the MySQL table
CREATE TEMPORARY VIEW test_spark2mysql
USING org.apache.spark.sql.jdbc
OPTIONS (
url "jdbc:mysql://10.0.马赛克:5506",
dbtable "t_realtime.test_spark2mysql",
user '马赛克_writer',
password '马赛克'
);
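The same mapping can also be set up through PySpark's DataFrame API instead of a temporary view. A minimal sketch, assuming an existing SparkSession named `spark`; the host, port, and credentials below are placeholders, not the real (redacted) values:

```python
# Connection options mirroring the OPTIONS clause of the temporary view.
# Host/port/credentials are placeholders -- substitute your own.
jdbc_options = {
    "url": "jdbc:mysql://<host>:<port>",
    "dbtable": "t_realtime.test_spark2mysql",
    "user": "<user>",
    "password": "<password>",
}

# With a live SparkSession this would register the same view:
#   df = spark.read.format("jdbc").options(**jdbc_options).load()
#   df.createOrReplaceTempView("test_spark2mysql")
```

Once the view is registered either way, the INSERT and SELECT statements below work unchanged.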
-- insert one row
INSERT INTO TABLE test_spark2mysql
SELECT
  101 AS id
, 'hubg' AS name
, 26 AS age;
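The same insert can be done from the DataFrame API with append mode. A minimal sketch, again assuming an existing SparkSession named `spark` and placeholder connection details:

```python
# The row from the SQL INSERT above, as a Python dict.
row = {"id": 101, "name": "hubg", "age": 26}

# With a live SparkSession and real connection options this would be:
#   (spark.createDataFrame([row])
#        .write.format("jdbc")
#        .options(url="jdbc:mysql://<host>:<port>",
#                 dbtable="t_realtime.test_spark2mysql",
#                 user="<user>", password="<password>")
#        .mode("append")
#        .save())
```

Note that `mode("append")` matters here: with the JDBC writer, `overwrite` drops and recreates the target table by default, which is rarely what you want against an existing MySQL table.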
-- query the results
SELECT * FROM test_spark2mysql LIMIT 10;
Done. That's a wrap!!