美国python程序员,范围内的美国假期的python列表

I have a need to fetch list of holidays in a given range, i.e., if start date is 20/12/2016 & end date is 10/1/2017, then I should get 25/12/2017, 1/1/2017. I can do this using Pandas, but in my case, I have limitation that I need to AWS Glue service & Pandas are not supported in AWS Glue.

I am trying to use native python library holidays, but I couldn't see API document to fetch holidays between from & to date?

Here is what I have tried:

import holidays

import datetime

from datetime import date, timedelta

import dateutil

from dateutil.relativedelta import relativedelta

us_holidays = holidays.UnitedStates()

for date2,name in sorted(holidays.US(state='CA', years=2013).items()):

print (date2,name)

days_from_closest_holiday = [(abs(fdate - hdate)).days for hdate in holidays.date.tolist()]

print days_from_closest_holiday

Output of us_holidays:

(datetime.date(2013, 1, 1), "New Year's Day")

(datetime.date(2013, 1, 21), 'Martin Luther King, Jr. Day')

(datetime.date(2013, 2, 18), "Washington's Birthday")

(datetime.date(2013, 3, 31), 'C\xc3\xa9sar Ch\xc3\xa1vez Day')

(datetime.date(2013, 4, 1), 'C\xc3\xa9sar Ch\xc3\xa1vez Day (Observed)')

(datetime.date(2013, 5, 27), 'Memorial Day')

(datetime.date(2013, 7, 4), 'Independence Day')

(datetime.date(2013, 9, 2), 'Labor Day')

(datetime.date(2013, 10, 14), 'Columbus Day')

(datetime.date(2013, 11, 11), 'Veterans Day')

(datetime.date(2013, 11, 28), 'Thanksgiving')

(datetime.date(2013, 12, 25), 'Christmas Day')

I need to pass a date range, fromdate, todate to the us_holidays object, and I expect list of holidays with in the range, something like the following:

DatetimeIndex(['2013-12-25', '2014-01-01'], dtype='datetime64[ns]', freq=None)

[10, 17]

In pandas, I can get it using the following:

cal = USFederalHolidayCalendar()

holidays = pd.to_datetime(cal.holidays(start_date, end_date))

print holidays

As mentioned above, I need to use AWS Glue, and currently Glue doesn't support Pandas.

Appreciate any help please.

Thanks

解决方案

After a while of exploring, I have the got the solution myself, sharing here for reference as a solution:

%pyspark

import holidays

import datetime

from datetime import date, timedelta

import dateutil

from dateutil.relativedelta import relativedelta

us_holidays = holidays.UnitedStates()

custom_holidays = holidays.HolidayBase()

holidays_within_range=[]

fmt = '%Y-%m-%d'

holidays2013=[]

for date2,name in sorted(holidays.US(state='CA', years=2013).items()):

holidays2013.append(date2.strftime(fmt))

print holidays2013

fdate = date(2013, 1, 1)

s_date = fdate - dateutil.relativedelta.relativedelta(days=7)

e_date = fdate + relativedelta(months=1)

start_date = s_date.strftime(fmt)

end_date = e_date.strftime(fmt)

print "Range : "

print start_date, end_date

dd = [s_date + timedelta(days=x) for x in range((e_date-s_date).days + 1)]

for d in dd:

if(d in us_holidays):

custom_holidays.append(d)

holidays_within_range.append(d.strftime(fmt))

print holidays_within_range

days_from_closest_holiday = [(abs(fdate - datetime.datetime.strptime(hdate, fmt).date())).days for hdate in holidays_within_range]

print days_from_closest_holiday

The outputs for the above would be :

['2013-01-01', '2013-01-21', '2013-02-18', '2013-03-31', '2013-04-01', '2013-05-27', '2013-07-04', '2013-09-02', '2013-10-14', '2013-11-11', '2013-11-28', '2013-12-25']

Range :

2012-12-25 2013-02-01

['2012-12-25', '2013-01-01', '2013-01-21']

[7, 0, 20]

This doesn't require Pandas, and I hope this works with AWS Glue. Haven't updated the variable names, please change as necessary.

Thanks

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值