目标
业务统计需要按照自然周看数据,所以把每天计算都归到自然周。
展示的时候用自然周的周一来代替自然周展示。
odps 没有计算自然周的函数,所以用 Python 做了一个 udf 来实现。
一年的第一天是周一,那么这一年的自然周就是以01开始,否则是以00,
这个在计算对应的周一时会产生问题,针对跨年周做了兼容处理。
似乎 MySQL 在计算自然周的时候也有类似问题,这里没有细查。
代码
from odps.udf import annotate
import datetime
@annotate("string->string")
class get_year_week(object):
def evaluate(self, arg):
try :
ds_date = datetime.datetime.strptime(arg, '%Y%m%d')
yw = '{}'.format(ds_date.strftime("%Y%W"))
if yw[-2:] == '00':
ds_date = ds_date + datetime.timedelta(days=1 - ds_date.isoweekday())
yw = '{}'.format(ds_date.strftime("%Y%W"))
return yw
except:
return None
@annotate("string->string")
class get_year_week_day(object):
def evaluate(self, yearsweek):
iosweekday=1
try :
year = yearsweek[:4]
# if 0101 is monday, the yearweek will begin on 01,need mines 1
if datetime.datetime.strptime(year, '%Y').isoweekday() == 1:
week_num = int(yearsweek[-2:]) - 1
else:
week_num = int(yearsweek[-2:])
monday_of_the_week = datetime.datetime.strptime(year, '%Y') + datetime.timedelta(days=7 * week_num)
#
if monday_of_the_week.isoweekday() != iosweekday:
monday_of_the_week = monday_of_the_week + datetime.timedelta(days=iosweekday - monday_of_the_week.isoweekday())
return monday_of_the_week.strftime('%Y%m%d')
except:
return None
效果
select date1,yearweek(date1),yearweekday(yearweek(date1)) monday
from date_list;
date1 | _c1 | monday |
---|---|---|
20180221 | 201808 | 20180219 |
20181231 | 201853 | 20181231 |
20190101 | 201853 | 20181231 |
20190106 | 201853 | 20181231 |
20190107 | 201901 | 20190107 |
20190108 | 201901 | 20190107 |
20191231 | 201952 | 20191230 |
20200101 | 201952 | 20191230 |
20201231 | 202052 | 20201228 |
20210101 | 202052 | 20201228 |
20211231 | 202152 | 20211227 |
20220101 | 202152 | 20211227 |