Python 日期表制作

日期维度表是数据仓库设计中的重要部分,在之前一篇MySQL 制作日期表有介绍使用SQL语言制作,本文来一波如何使用Python生成日期维度表。

  1. 导入必要的库
    需要导入datetime模块,以便在Python中处理日期;导入pandas模块,便于展示,如果遇到没有次模块问题,可前往终端进行pip/conda install 模块;
from datetime import date, timedelta
import pandas as pd
  1. 进行测试,定义生成日期维度表的函数
    日期信息变量可以进行打印测试,待测试没有问题后,进行封装,定义一个名为generate_pub_calendar的函数,该函数接受起始日期和结束日期作为参数,并返回日期相关信息。
def generate_pub_calendar(start_date, end_date):
    date_data_info = []
    current_date = start_date
    # 使用循环生成日期区间的信息
    while current_date <= end_date:
        calendar_year = current_date.year
        year_start_dt = date(current_date.year, 1, 1)
        year_end_dt = date(current_date.year, 12, 31)
        calendar_week = current_date.strftime('%W')
        # calendar_year_week 年份和年中第几周拼接,乘出位数相加,用作排序和定位年周
        calendar_year_week = current_date.year * 100 + current_date.isocalendar()[1]
        calendar_weekday = current_date.weekday()+1
        calendar_weekday_name = current_date.strftime('%A')
        week_begin_dt = current_date - timedelta(days=current_date.weekday())
        week_end_dt = week_begin_dt + timedelta(days=6)
        calendar_month = current_date.month
        calendar_month_name = current_date.strftime('%B')
        month_begin_dt = date(current_date.year, current_date.month, 1)
        # month_end_dt 先得出下一个月第一天 减去一天 即可
        month_end_dt = date(current_date.year + int(current_date.month/12), current_date.month % 12 + 1, 1) - timedelta(days=1)
        calendar_year_month = current_date.year*100 + current_date.month
        # calendar_quarter //(向下取整除,它会返回整除结果的整数部分)
        calendar_quarter = (current_date.month - 1)//3 + 1
        # quarter_begin_dt 根据当前季度数*3月份数再加上1个月
        quarter_begin_dt = date(current_date.year, ((current_date.month - 1) // 3) * 3 + 1, 1)
        # quarter_end_dt 根据 month_end_dt 得出逻辑 和 calendar_quarter 配合即可得出
        quarter_end_dt = date(current_date.year + int(((current_date.month - 1)//3 + 1) * 3/12),(((current_date.month - 1)//3 + 1) * 3) % 12 + 1, 1) - timedelta(days=1)
        # calendar_year_quarter 根据 calendar_year、calendar_quarter  做拼接即可
        calendar_year_quarter = current_date.year * 100 + ((current_date.month - 1)//3 + 1)
        # week_in_calendar_quarter 根据 calendar_week 减去 quarter_begin_dt 所在的年周 即可得出
        # week_in_calendar_quarter = current_date.isocalendar()[1] - date(current_date.year, ((current_date.month - 1) // 3) * 3 + 1, 1).isocalendar()[1]
        week_in_calendar_quarter = int(current_date.strftime('%W')) - int(date(current_date.year, ((current_date.month - 1) // 3) * 3 + 1, 1).strftime('%W'))
        # calendar_year_quarter_week 根据 calendar_year、calendar_quarter 、 week_in_calendar_quarter 做拼接即可
        calendar_year_quarter_week = (current_date.year * 100 + ((current_date.month - 1)//3 + 1)) * 100 + (int(current_date.strftime('%W')) - int(date(current_date.year, ((current_date.month - 1) // 3) * 3 + 1, 1).strftime('%W')))
        # day_in_calendar_year 当前日期 减 year_start_dt 即可
        day_in_calendar_year = (current_date - date(current_date.year, 1, 1)).days
        day_in_calendar_quarter = (current_date - date(current_date.year, ((current_date.month - 1) // 3) * 3 + 1, 1)).days
        day_in_calendar_month = current_date.day
        # 追加日期区间的信息数据
        date_data_info.append({
            'calendar_date': current_date,
            'calendar_year': calendar_year,
            'year_start_dt': year_start_dt,
            'year_end_dt': year_end_dt,
            'calendar_week': calendar_week,
            'calendar_year_week': calendar_year_week,
            'calendar_weekday': calendar_weekday,
            'calendar_weekday_name': calendar_weekday_name,
            'week_begin_dt': week_begin_dt,
            'week_end_dt': week_end_dt,
            'calendar_month': calendar_month,
            'calendar_month_name': calendar_month_name,
            'month_begin_dt': month_begin_dt,
            'month_end_dt': month_end_dt,
            'calendar_year_month': calendar_year_month,
            'calendar_quarter': calendar_quarter,
            'quarter_begin_dt': quarter_begin_dt,
            'quarter_end_dt': quarter_end_dt,
            'calendar_year_quarter': calendar_year_quarter,
            'week_in_calendar_quarter': week_in_calendar_quarter,
            'calendar_year_quarter_week': calendar_year_quarter_week,
            'day_in_calendar_year': day_in_calendar_year,
            'day_in_calendar_quarter': day_in_calendar_quarter,
            'day_in_calendar_month': day_in_calendar_month
        })
        
        current_date += timedelta(days=1)
    
    return date_data_info

在这个函数中,我们使用循环和datetime模块提供的功能来计算相关日期信息,最后将这些信息存储为字典。

  1. 调用函数并生成日期维度表
start_date = date(2023, 1, 1)
end_date = date(2023, 12, 31)

pub_calendar_data = generate_pub_calendar(start_date, end_date)
df = pd.DataFrame(pub_calendar_data)
print(df.tail(10).to_markdown(index=False))
calendar_datecalendar_yearyear_start_dtyear_end_dtcalendar_weekcalendar_year_weekcalendar_weekdaycalendar_weekday_nameweek_begin_dtweek_end_dtcalendar_monthcalendar_month_namemonth_begin_dtmonth_end_dtcalendar_year_monthcalendar_quarterquarter_begin_dtquarter_end_dtcalendar_year_quarterweek_in_calendar_quartercalendar_year_quarter_weekday_in_calendar_yearday_in_calendar_quarterday_in_calendar_month
2023-12-2220232023-01-012023-12-31512023515Friday2023-12-182023-12-2412December2023-12-012023-12-3120231242023-10-012023-12-3120230412202304123558222
2023-12-2320232023-01-012023-12-31512023516Saturday2023-12-182023-12-2412December2023-12-012023-12-3120231242023-10-012023-12-3120230412202304123568323
2023-12-2420232023-01-012023-12-31512023517Sunday2023-12-182023-12-2412December2023-12-012023-12-3120231242023-10-012023-12-3120230412202304123578424
2023-12-2520232023-01-012023-12-31522023521Monday2023-12-252023-12-3112December2023-12-012023-12-3120231242023-10-012023-12-3120230413202304133588525
2023-12-2620232023-01-012023-12-31522023522Tuesday2023-12-252023-12-3112December2023-12-012023-12-3120231242023-10-012023-12-3120230413202304133598626
2023-12-2720232023-01-012023-12-31522023523Wednesday2023-12-252023-12-3112December2023-12-012023-12-3120231242023-10-012023-12-3120230413202304133608727
2023-12-2820232023-01-012023-12-31522023524Thursday2023-12-252023-12-3112December2023-12-012023-12-3120231242023-10-012023-12-3120230413202304133618828
2023-12-2920232023-01-012023-12-31522023525Friday2023-12-252023-12-3112December2023-12-012023-12-3120231242023-10-012023-12-3120230413202304133628929
2023-12-3020232023-01-012023-12-31522023526Saturday2023-12-252023-12-3112December2023-12-012023-12-3120231242023-10-012023-12-3120230413202304133639030
2023-12-3120232023-01-012023-12-31522023527Sunday2023-12-252023-12-3112December2023-12-012023-12-3120231242023-10-012023-12-3120230413202304133649131

通过调用这个函数,我们可以生成包含各种日期维度信息的日期维度表。

日期维度表为数据分析和业务智能提供了强大的基础,生成日期维度表是数据分析工作中的基础步骤之一,其实实现思路基本一致,只要你理清思路了,无论使用何种语言和工具,都可实现,大家可以实操起来,看看都掌握了几种工具。

在这里插入图片描述

  • 2
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

有请小发菜

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值