Extracting the year, month, and day from a time in pandas: extracting the date from a string datetime column

I have a column cash_date in a pandas DataFrame whose dtype is object. I am not able to parse it with the pandas to_datetime function. The shape of my data frame is (47654566, 5). My data frame looks like this:

cash_date amount id

02-JAN-13 12.00.00.000000000 AM 100 1

13-FEB-13 12.00.00.000000000 AM 200 2

09-MAR-13 12.00.00.000000000 AM 300 3

03-APR-13 12.00.00.000000000 AM 400 4

02-JAN-13 06.26.02.438000000 PM 500 7

17-NOV-18 08.31.47.443000000 PM 700 8

I tried the following approaches:

df.cash_date = pd.to_datetime(df['cash_date'], errors='coerce') # Not working

for i in range(len(df)):

    df.cash_date = df.cash_date.astype(str).str.split('\d\d.\d\d.\d\d.\d\d\d\d\d\d\d\d\d')[i][0] # Not working

I want the data frame to look like this:

cash_date amount id date

02-JAN-13 12.00.00.000000000 AM 100 1 02-JAN-13

13-FEB-13 12.00.00.000000000 AM 200 2 13-FEB-13

09-MAR-13 12.00.00.000000000 AM 300 3 09-MAR-13

03-APR-13 12.00.00.000000000 AM 400 4 03-APR-13

02-JAN-13 06.26.02.438000000 PM 500 7 02-JAN-13

17-NOV-18 08.31.47.443000000 PM 700 8 17-NOV-18

Solution

Specify a format=... argument. Because the strings use a 12-hour clock with an AM/PM marker, parse the hour with %I rather than %H; %p only affects the hour when it is paired with %I.

pd.to_datetime(df['cash_date'], format='%d-%b-%y %I.%M.%S.%f %p', errors='coerce')

0 2013-01-02 00:00:00.000

1 2013-02-13 00:00:00.000

2 2013-03-09 00:00:00.000

3 2013-04-03 00:00:00.000

4 2013-01-02 18:26:02.438

5 2018-11-17 20:31:47.443

Name: cash_date, dtype: datetime64[ns]

Details about acceptable formats may be found at http://strftime.org.
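As a minimal, self-contained sketch of the format-string approach, the snippet below rebuilds three of the sample rows and parses them. It assumes the hour is on a 12-hour clock, so it reads the hour with %I alongside %p (in strptime-style parsing, %p only changes the hour when the hour was read with %I). The names FMT, n_failed and hours are illustrative and do not come from the original answer.

```python
import pandas as pd

# Three sample rows copied from the question; the real frame is far larger.
df = pd.DataFrame({
    "cash_date": [
        "02-JAN-13 12.00.00.000000000 AM",
        "02-JAN-13 06.26.02.438000000 PM",
        "17-NOV-18 08.31.47.443000000 PM",
    ],
    "amount": [100, 500, 700],
    "id": [1, 7, 8],
})

# %d day, %b abbreviated month name, %y two-digit year,
# %I 12-hour clock hour, %M minute, %S second, %f fractional seconds, %p AM/PM.
FMT = "%d-%b-%y %I.%M.%S.%f %p"
parsed = pd.to_datetime(df["cash_date"], format=FMT, errors="coerce")

# errors="coerce" turns unparseable strings into NaT, so count them explicitly.
n_failed = int(parsed.isna().sum())

# Hour of day on a 24-hour clock after applying the AM/PM marker.
hours = parsed.dt.hour.tolist()
print(n_failed, hours)
```

On the full frame, checking n_failed after parsing is a cheap way to notice rows that errors='coerce' quietly turned into NaT.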

From here, you can floor the datetimes using dt.floor:

df['date'] = pd.to_datetime(
    df['cash_date'], format='%d-%b-%y %I.%M.%S.%f %p', errors='coerce'
).dt.floor('D')

df

cash_date amount id date

0 02-JAN-13 12.00.00.000000000 AM 100 1 2013-01-02

1 13-FEB-13 12.00.00.000000000 AM 200 2 2013-02-13

2 09-MAR-13 12.00.00.000000000 AM 300 3 2013-03-09

3 03-APR-13 12.00.00.000000000 AM 400 4 2013-04-03

4 02-JAN-13 06.26.02.438000000 PM 500 7 2013-01-02

5 17-NOV-18 08.31.47.443000000 PM 700 8 2018-11-17
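The question asked for strings such as 02-JAN-13 rather than datetime values. One possible way to get back to that shape after parsing is sketched below. It assumes an English locale, where %b renders as Jan, Feb, and so on, which is why the result is upper-cased. dt.normalize() appears only as an alternative to dt.floor('D') that also resets the time to midnight; the variable names are illustrative.

```python
import pandas as pd

cash_date = pd.Series(
    [
        "02-JAN-13 12.00.00.000000000 AM",
        "17-NOV-18 08.31.47.443000000 PM",
    ],
    name="cash_date",
)

parsed = pd.to_datetime(
    cash_date, format="%d-%b-%y %I.%M.%S.%f %p", errors="coerce"
)

# Both calls set the time to midnight and keep a datetime64 dtype.
floored = parsed.dt.floor("D")
normalized = parsed.dt.normalize()

# dt.strftime returns strings; upper() restores the "JAN" style of the input.
as_text = floored.dt.strftime("%d-%b-%y").str.upper()
print(as_text.tolist())
```

Keeping the datetime64 column is usually more convenient for filtering and grouping, while the string form is only needed for display or export.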

On the other hand, if you only want to extract the date component as a string, without parsing it, there are a couple of options:

str.split

df['date'] = df['cash_date'].str.split(n=1).str[0]

df

cash_date amount id date

0 02-JAN-13 12.00.00.000000000 AM 100 1 02-JAN-13

1 13-FEB-13 12.00.00.000000000 AM 200 2 13-FEB-13

2 09-MAR-13 12.00.00.000000000 AM 300 3 09-MAR-13

3 03-APR-13 12.00.00.000000000 AM 400 4 03-APR-13

4 02-JAN-13 06.26.02.438000000 PM 500 7 02-JAN-13

5 17-NOV-18 08.31.47.443000000 PM 700 8 17-NOV-18

Or, using a list comprehension.

df['date'] = [x.split(None, 1)[0] for x in df['cash_date']]

df

cash_date amount id date

0 02-JAN-13 12.00.00.000000000 AM 100 1 02-JAN-13

1 13-FEB-13 12.00.00.000000000 AM 200 2 13-FEB-13

2 09-MAR-13 12.00.00.000000000 AM 300 3 09-MAR-13

3 03-APR-13 12.00.00.000000000 AM 400 4 03-APR-13

4 02-JAN-13 06.26.02.438000000 PM 500 7 02-JAN-13

5 17-NOV-18 08.31.47.443000000 PM 700 8 17-NOV-18
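The string-based options behave differently when the column contains missing values, which is plausible in a frame of this size. The sketch below is an illustration under that assumption (the NaN row is made up for the example): the .str accessor propagates missing values, whereas a list comprehension needs an explicit guard. It also shows str.extract as a regex-based variant, and that the extracted text can still be parsed later with a much shorter format.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing value in the middle.
s = pd.Series(
    [
        "02-JAN-13 12.00.00.000000000 AM",
        np.nan,
        "17-NOV-18 08.31.47.443000000 PM",
    ]
)

# The .str accessor skips missing values instead of raising.
via_split = s.str.split(n=1).str[0]

# With one capture group and expand=False, str.extract returns a Series.
via_regex = s.str.extract(r"^(\d{2}-[A-Za-z]{3}-\d{2})", expand=False)

# A list comprehension needs a guard, because NaN is a float, not a str.
via_listcomp = [
    x.split(None, 1)[0] if isinstance(x, str) else None for x in s
]

# The extracted text can be parsed on its own; missing values become NaT.
parsed_dates = pd.to_datetime(via_split, format="%d-%b-%y", errors="coerce")
print(via_listcomp)
```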
