1. Problem Background
In the real-time passenger-flow prediction pipeline of the integrated decision platform, a Flink job computes on 30-second windows and must write large volumes of results into ClickHouse in real time. The resulting flood of frequent insert requests caused ClickHouse to exceed its per-query memory limit:
User class threw exception: ru.yandex.clickhouse.except.ClickHouseException: ClickHouse exception, code: 241, host: xxx.xxx.xxx.xxx, port: 8123; Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceeded: would use 9.31 GiB (attempt to allocate chunk of 1048591 bytes), maximum: 9.31 GiB (version 19.9.5.36)
2. Solutions
Solution 1: Raise max_memory_usage in users.xml
<max_memory_usage>15032385536</max_memory_usage>
Raising the limit (here to 14 GiB) buys headroom but does not fix the root cause: the inserts are still far too frequent.
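For reference, this setting lives inside a settings profile in users.xml; a minimal sketch, assuming the default profile is the one in use:

<profiles>
    <default>
        <max_memory_usage>15032385536</max_memory_usage>
    </default>
</profiles>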
Solution 2: Batch inserts with JDBCAppendTableSink
// Flink's legacy JDBC append sink buffers rows and writes them in batches.
import org.apache.flink.api.common.typeinfo.Types
import org.apache.flink.api.java.io.jdbc.JDBCAppendTableSink

val sink2 = JDBCAppendTableSink
  .builder()
  .setDrivername("ru.yandex.clickhouse.ClickHouseDriver")
  .setDBUrl(Constants.CLICKHOUSE_URL)
  .setUsername(Constants.CLICKHOUSE_USERNAME)
  .setPassword(Constants.CLICKHOUSE_PASSWORD)
  .setQuery(sqlInsertIntoCapacity02)
  .setBatchSize(50) // rows are buffered; no insert is issued until 50 rows accumulate
  .setParameterTypes(Types.STRING, Types.STRING, Types.STRING, Types.STRING,
    Types.STRING, Types.STRING, Types.STRING, Types.STRING, Types.STRING,
    Types.STRING, Types.INT, Types.STRING, Types.STRING, Types.STRING,
    Types.STRING, Types.STRING, Types.INT)
  .build()
Problem: rows are only flushed once the batch fills up, so during quiet periods a partial batch can sit in the buffer indefinitely and never be inserted.
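For completeness, here is a minimal sketch of attaching this sink to a stream. It assumes a Flink 1.x job where JDBCAppendTableSink consumes a DataStream of Row; rowStream is a hypothetical upstream stream whose field order matches sqlInsertIntoCapacity02:

import org.apache.flink.streaming.api.scala._
import org.apache.flink.types.Row

val rowStream: DataStream[Row] = ??? // hypothetical: produced by the upstream operators
sink2.emitDataStream(rowStream.javaStream) // hand rows to the batching sink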
Solution 3: Partition with keyBy and insert one window's worth of data per batch (the approach I adopted)
.keyBy(ads => ads.time_id.toString)
.window(TumblingEventTimeWindows.of(Time.seconds(30)))
.process(new ProcessWindowFunction[XXX, List[XXX], String, TimeWindow] {
  override def process(key: String, context: Context,
                       input: Iterable[XXX], out: Collector[List[XXX]]): Unit = {
    // Collect the whole 30-second window and emit it as a single list,
    // so the sink can write it in one batch.
    val list = input.toList
    if (list.nonEmpty) {
      println("window batch size: " + list.size)
      out.collect(list)
    }
  }
}).addSink(new JDBCSinkXXX)
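Note that TumblingEventTimeWindows only fires when timestamps and watermarks are assigned upstream. A minimal sketch, assuming a Flink 1.x job, a hypothetical adsStream, and a hypothetical epoch-millis eventTime field on XXX:

import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.windowing.time.Time

// Without watermarks, the 30-second event-time window never triggers.
val withTimestamps = adsStream.assignTimestampsAndWatermarks(
  new BoundedOutOfOrdernessTimestampExtractor[XXX](Time.seconds(5)) {
    override def extractTimestamp(e: XXX): Long = e.eventTime // assumed field
  })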
The custom sink opens one connection per parallel subtask, disables auto-commit, and writes each window's list as a single batch:

import java.sql.{Connection, DriverManager, PreparedStatement, SQLException}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction

class JDBCSinkXXX extends RichSinkFunction[List[XXX]] {
  // Connection is created once per subtask in open() and reused across invocations
  var conn: Connection = _

  override def open(parameters: Configuration): Unit = {
    conn = DriverManager.getConnection("jdbc:clickhouse://xxxx:8123/db", "default", "123@123")
    conn.setAutoCommit(false) // key point: group the batch into a single commit (see the driver source comments)
  }

  override def invoke(list: List[XXX]): Unit = {
    var insertStmt: PreparedStatement = null
    try {
      insertStmt = conn.prepareStatement("insert into ads_index(xxx,xxx...) values (?,?,....)")
      for (pedNum <- list) {
        insertStmt.setString(1, "index_0011")
        // ... remaining setXxx calls elided ...
        insertStmt.addBatch() // queue the row; nothing is sent yet
      }
      val count = insertStmt.executeBatch() // one round-trip for the whole window
      conn.commit()
      insertStmt.close()
      println("inserted " + count.length + " rows")
    } catch {
      case e: SQLException => println(e)
    }
  }

  override def close(): Unit = {
    conn.close()
  }
}
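As an optional refinement (my own suggestion, not part of the original setup), the clickhouse-jdbc driver ships a BalancedClickhouseDataSource that can replace the raw DriverManager call in open() and spread connections across several ClickHouse hosts; the second host here is hypothetical:

import ru.yandex.clickhouse.BalancedClickhouseDataSource
import ru.yandex.clickhouse.settings.ClickHouseProperties

val props = new ClickHouseProperties()
props.setUser("default")
props.setPassword("123@123")
// A comma-separated host list is the driver's convention for multiple replicas.
val dataSource = new BalancedClickhouseDataSource("jdbc:clickhouse://xxxx:8123,yyyy:8123/db", props)
val conn = dataSource.getConnection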
The above is how I handled it in practice.
3. Verification: the job ran for a stretch of time and passed testing under peak-hour load.
