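Flink version 1.8.0. This post started from curiosity about how state is actually removed once the minimum and maximum times have been set with withIdleStateRetentionTime in Flink SQL.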
qConfig.withIdleStateRetentionTime(
    Time.hours(24),
    Time.hours(48));
Clicking through withIdleStateRetentionTime in the IDE leads to the StreamQueryConfig class. Its main fields are minIdleStateRetentionTime and maxIdleStateRetentionTime. From there, search for the classes that use maxIdleStateRetentionTime.
/**
* The minimum time until state which was not updated will be retained.
* State might be cleared and removed if it was not updated for the defined period of time.
*/
private long minIdleStateRetentionTime = 0L;
/**
* The maximum time until state which was not updated will be retained.
* State will be cleared and removed if it was not updated for the defined period of time.
*/
private long maxIdleStateRetentionTime = 0L;
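The class I care most about is KeyedProcessFunctionWithCleanupState, so open that one. Inside you will find initCleanupTimeState() and processCleanupTimer(). initCleanupTimeState is simple: it registers a single piece of state, whose value is the timestamp at which the key's state is due to be removed.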
protected val minRetentionTime: Long = queryConfig.getMinIdleStateRetentionTime
protected val maxRetentionTime: Long = queryConfig.getMaxIdleStateRetentionTime
protected val stateCleaningEnabled: Boolean = minRetentionTime > 1

// holds the latest registered cleanup timer
protected var cleanupTimeState: ValueState[JLong] = _

protected def initCleanupTimeState(stateName: String) {
  if (stateCleaningEnabled) {
    val inputCntDescriptor: ValueStateDescriptor[JLong] =
      new ValueStateDescriptor[JLong](stateName, Types.LONG)
    cleanupTimeState = getRuntimeContext.getState(inputCntDescriptor)
  }
}

protected def processCleanupTimer(
    ctx: KeyedProcessFunction[K, I, O]#Context,
    currentTime: Long): Unit = {
  if (stateCleaningEnabled) {
    registerProcessingCleanupTimer(
      cleanupTimeState,
      currentTime,
      minRetentionTime,
      maxRetentionTime,
      ctx.timerService()
    )
  }
}
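Now for the second method, processCleanupTimer. The stateCleaningEnabled check can be ignored here (it is true whenever minRetentionTime > 1), and the name registerProcessingCleanupTimer already says what the method does. Still, step into it. The first time a key is processed, cleanupTimeState.value() returns null, so cleanupTime = currentTime + maxRetentionTime, that is, the current time plus the maximum retention time. A timer is then registered with registerProcessingTimeTimer to remove the state, and cleanupTimeState is updated to the same value, current time plus maximum retention time. That is everything that happens on the first access.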
def registerProcessingCleanupTimer(
    cleanupTimeState: ValueState[JLong],
    currentTime: Long,
    minRetentionTime: Long,
    maxRetentionTime: Long,
    timerService: TimerService): Unit = {

  // last registered timer
  val curCleanupTime = cleanupTimeState.value()

  // check if a cleanup timer is registered and
  // that the current cleanup timer won't delete state we need to keep
  if (curCleanupTime == null || (currentTime + minRetentionTime) > curCleanupTime) {
    // we need to register a new (later) timer
    val cleanupTime = currentTime + maxRetentionTime
    // register timer and remember clean-up time
    timerService.registerProcessingTimeTimer(cleanupTime)
    // delete expired timer
    if (curCleanupTime != null) {
      timerService.deleteProcessingTimeTimer(curCleanupTime)
    }
    cleanupTimeState.update(cleanupTime)
  }
}
This is where the question comes up. The official documentation contains these two sentences:
- The minimum idle state retention time defines how long the state of an inactive key is at least kept before it is removed.
- The maximum idle state retention time defines how long the state of an inactive key is at most kept before it is removed.
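Both sentences say that the minimum and the maximum retention time bound how long state is kept while the key is inactive, in other words while the key is not being accessed. Now look again at registerProcessingCleanupTimer, specifically the second condition in the if: (currentTime + minRetentionTime) > curCleanupTime. On any access after the first, curCleanupTime already has a value, namely X + maxRetentionTime, where X is the time at which the timer was last registered. There are two cases. First, if the current time Y plus the minimum retention time is greater than curCleanupTime, a new timer is registered and the old one is deleted. Second, if it is not greater, nothing is done, which greatly reduces the work of deleting and re-registering timers. After all, a key tends to be accessed most frequently right after it is created. The net effect is that the state of an inactive key is removed somewhere between the minimum and the maximum retention time after its last access.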
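To get a feel for how much timer churn the second case avoids, here is a minimal stand-alone sketch. It is not Flink code: the class CleanupTimerCount and its methods are hypothetical names, and the only thing taken from the source is the if-condition of registerProcessingCleanupTimer. It models one key that is accessed once a minute for three days, with the 24 h / 48 h settings used at the top of this post, and counts how often a timer would be registered.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-alone model of the timer decision; not part of Flink.
class CleanupTimerCount {

    /** Same condition as the if in registerProcessingCleanupTimer. */
    static boolean needsNewTimer(Long curCleanupTime, long now, long minRetention) {
        return curCleanupTime == null || now + minRetention > curCleanupTime;
    }

    /** Counts timer registrations for one key accessed every stepMs during durationMs. */
    static int countRegistrations(long minRetention, long maxRetention,
                                  long stepMs, long durationMs) {
        Long curCleanupTime = null; // no timer registered before the first access
        int registrations = 0;
        for (long now = 0; now <= durationMs; now += stepMs) {
            if (needsNewTimer(curCleanupTime, now, minRetention)) {
                curCleanupTime = now + maxRetention; // register a new (later) timer
                registrations++;
            }
        }
        return registrations;
    }

    public static void main(String[] args) {
        long min = TimeUnit.HOURS.toMillis(24);
        long max = TimeUnit.HOURS.toMillis(48);
        long everyMinute = TimeUnit.MINUTES.toMillis(1);
        long threeDays = TimeUnit.DAYS.toMillis(3);

        System.out.println("accesses:            " + (threeDays / everyMinute + 1));
        System.out.println("timer registrations: "
                + countRegistrations(min, max, everyMinute, threeDays));
    }
}
```

Under these assumptions the key is touched several thousand times but a timer is registered only a handful of times, roughly once per (max - min) interval, instead of once per access.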
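An example:
The current time is 2019-11-20 15:00:00, the minimum retention time is 1 day and the maximum is 2 days.
When the key is initialized, a timer is registered for 2019-11-22 15:00:00.
When the key is accessed at 2019-11-20 16:00:00, the timer stays at 2019-11-22 15:00:00.
When the key is accessed again at 2019-11-21 16:00:00, a new timer is registered, because
2019-11-21 16:00:00 + 1 day is later than 2019-11-22 15:00:00. The new timer is set for 2019-11-23 16:00:00.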
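The timeline above can be replayed with a few lines of plain Java. This is only a sketch under the same assumptions as the example (minimum 1 day, maximum 2 days); RetentionExample and cleanupTimeAfterAccess are hypothetical names, and java.time values stand in for the millisecond timestamps that Flink actually uses.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical replay of the example timeline; not part of Flink.
class RetentionExample {

    /** Returns the cleanup time that is in effect after an access at 'now'. */
    static LocalDateTime cleanupTimeAfterAccess(LocalDateTime current, LocalDateTime now,
                                                Duration min, Duration max) {
        // Same rule as registerProcessingCleanupTimer: re-register only if there is
        // no timer yet, or the existing one would fire before now + min.
        if (current == null || now.plus(min).isAfter(current)) {
            return now.plus(max);
        }
        return current; // the existing timer is still late enough, keep it
    }

    public static void main(String[] args) {
        Duration min = Duration.ofDays(1);
        Duration max = Duration.ofDays(2);
        String[] accesses = {
            "2019-11-20T15:00:00", // key is initialized
            "2019-11-20T16:00:00", // accessed one hour later
            "2019-11-21T16:00:00"  // accessed one day after that
        };

        LocalDateTime timer = null;
        for (String access : accesses) {
            LocalDateTime now = LocalDateTime.parse(access);
            timer = cleanupTimeAfterAccess(timer, now, min, max);
            System.out.println("access at " + now + " -> cleanup timer at " + timer);
        }
    }
}
```

Running it prints one line per access; the timer is unchanged by the second access and moves to 2019-11-23 16:00 on the third, matching the example.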
If anything here is incorrect, corrections are welcome.