Official documentation: https://clickhouse.tech/docs/zh/
Reading ClickHouse data with Spark

Option 1: a plain JDBC read loads the full table before filtering
// JDBC connection properties for the legacy ClickHouse driver
val prop = new java.util.Properties
prop.setProperty("user", "default")
prop.setProperty("password", "123456")
prop.setProperty("driver", "ru.yandex.clickhouse.ClickHouseDriver")

// Reads the entire table_op table over JDBC, then applies the time filter
// on the Spark side
val readDataDf = sparkSession
  .read
  .jdbc("jdbc:clickhouse://hadoop102:8123", "table_op", prop)
  .where("LocationTime >= '2021-09-21 09:00:00' AND LocationTime <= '2021-09-21 18:00:00'")
Option 2: push the filter condition into the query so only the matching rows are loaded (recommended: if the table is very large, loading it in full can prevent the Spark driver from even starting up)
// Wrap the filtered query in a subquery that acts as a temporary table.
// Note: the original line was cut off mid-word ("a"); the tail of the
// expression below, including the $end_time variable and the alias, is a
// reconstruction.
val tablename = s"(select * from table_op where LocationTime between '$start_time' and '$end_time') as t"
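The subquery string above is presumably passed to the same JDBC reader in place of a table name. A minimal end-to-end sketch of option 2, assuming an `$end_time` variable (the original is truncated) and the same connection settings as in option 1:

```scala
import java.util.Properties
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("clickhouse-filtered-read")
  .getOrCreate()

// Hypothetical time-window values; the original snippet only shows $start_time
val start_time = "2021-09-21 09:00:00"
val end_time   = "2021-09-21 18:00:00"

val prop = new Properties
prop.setProperty("user", "default")
prop.setProperty("password", "123456")
prop.setProperty("driver", "ru.yandex.clickhouse.ClickHouseDriver")

// The derived table (note the alias) replaces the table name, so ClickHouse
// itself evaluates the BETWEEN filter and only matching rows travel over JDBC.
val tablename =
  s"(select * from table_op where LocationTime between '$start_time' and '$end_time') as t"

val readDataDf = spark.read.jdbc("jdbc:clickhouse://hadoop102:8123", tablename, prop)
readDataDf.show(10)
```

This requires a running ClickHouse server reachable at hadoop102:8123 and the ClickHouse JDBC driver on the Spark classpath; the table and column names are taken from the snippets above.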