I have a pandas date column and I want to return the date for the previous Thursday (could be any day of week). I use pd.offsets.Week but I do not get the expected result when the year changes and the Week starts over. Here is my dataframe as 'd':
raw date Thursday week_start
0 2019-01-03 2018-12-27 2018-12-27
1 2019-01-03 2018-12-27 2018-12-27
2 2019-01-03 2018-12-27 2018-12-27
3 2019-01-02 2018-12-27 2018-12-27
4 2019-01-02 2018-12-27 2018-12-27
5 2019-01-02 2018-12-27 2018-12-27
6 2019-01-03 2019-01-03 2018-12-27
7 2019-01-03 2019-01-03 2018-12-27
8 2019-01-03 2019-01-03 2018-12-27
9 2019-01-03 2019-01-03 2018-12-27
10 2019-01-02 2018-12-27 2018-12-27
11 2019-01-02 2018-12-27 2018-12-27
12 2019-01-02 2018-12-27 2018-12-27
d['week_start'] = d['raw date'] - pd.offsets.Week(weekday=3)
I expected where d['week_start'] = 1/3/2019 to return 1/3/2019, not 12/27/2018. I suspect it is because the Week at 1/3/2019 is 0 so it returns the Thursday of that week. How can I get the previous Thursday's date regardless of a change in the year?
解决方案
You can use Series.where with Series.dt.weekday for change only not Thursday values:
rng = pd.date_range('2019-01-03', periods=20)
d = pd.DataFrame({'raw date': rng})
mask = d['raw date'].dt.weekday == 3
d['week_start'] = d['raw date'].where(mask, d['raw date'] - pd.offsets.Week(weekday=3))
print(d)
raw date week_start
0 2019-01-03 2019-01-03
1 2019-01-04 2019-01-03
2 2019-01-05 2019-01-03
3 2019-01-06 2019-01-03
4 2019-01-07 2019-01-03
5 2019-01-08 2019-01-03
6 2019-01-09 2019-01-03
7 2019-01-10 2019-01-10
8 2019-01-11 2019-01-10
9 2019-01-12 2019-01-10
10 2019-01-13 2019-01-10
11 2019-01-14 2019-01-10
12 2019-01-15 2019-01-10
13 2019-01-16 2019-01-10
14 2019-01-17 2019-01-17
15 2019-01-18 2019-01-17
16 2019-01-19 2019-01-17
17 2019-01-20 2019-01-17
18 2019-01-21 2019-01-17
19 2019-01-22 2019-01-17