flink sql时区问题的解决
背景
读入的数据是utc时间,写出的数据也是utc时间,但是中间开的一天的窗口要是东八区的窗口。
办法
1.数据端下手,写入东八区的时间。
2.开多个窗口
insert into sink
select a,b,TUBMLE_START(rowtime,INVERTAL '1' DAY) as rowtime,sum(c) as c from (
select a,b,SESSION_ROWTIME(rowtime, INTERVAL '1' MINUTE) as rowtime,sum(c) from
(select a,b,TUBMLE_ROWTIME(rowtime, INTERVAL '8' HOUR) as rowtime,count(c) as c from source
GROUP BY a,b,TUBMLE(rowtime, INTERVAL '8' HOUR) )
GROUP BY a,b,SESSION(rowtime, INTERVAL '1' MINUTE))
GROUP BY a,b,TUBMLE(rowtime, INTERVAL '1' DAY)
这种方法很狗,而且得益于我们是东八区,正好将一天分成3个点。
这种生成的时间是utc的零点,可以使用Udf实现,人为减去8个小时
import org.apache.flink.table.functions.ScalarFunction;
import java.sql.Timestamp;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
public class UTCToLocal extends ScalarFunction {
public Timestamp eval(Timestamp s) {
long timestamp = s.getTime() - 28800000;
return new Timestamp(timestamp);
}
// 注册函数
tabEnv.registerFunction("utctolocal", new UTCToLocal());
上面的sql可以改成:
insert into sink
select a,b,utctolocal(TUBMLE_START(rowtime,INVERTAL '1' DAY)) as rowtime ,sum(c) as c from (
select a,b,SESSION_ROWTIME(rowtime, INTERVAL '1' MINUTE) as rowtime,sum(c) from
(select a,b,TUBMLE_ROWTIME(rowtime, INTERVAL '8' HOUR) as rowtime,count(c) as c from source
GROUP BY a,b,TUBMLE(rowtime, INTERVAL '8' HOUR) )
GROUP BY a,b,SESSION(rowtime, INTERVAL '1' MINUTE))
GROUP BY a,b,TUBMLE(rowtime, INTERVAL '1' DAY)
3.用ddl
但是目前flink1.9.1,还不支持变量的计算,等flink1.10吧。
https://www.alibabacloud.com/help/zh/doc-detail/110847.htm?spm=a2c63.p38356.b99.47.637b1c4emwYtDS