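In my DataFrame, I changed the index to the date field:
df.index = df.TRX_DATE # TRX_DATE is the transaction date; its type is pandas.core.series.Series
Now I want to slice my DataFrame between two dates, or by some date offset,
but I get an error.
# currentdate is today's date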
startdate = currentdate - timedelta(days=30)
dflast30 = df.loc[startdate:currentdate] # error
I also tried doing it by creating a mask:
mask = (df['TRX_DATE'] >= startdate) & (df['TRX_DATE'] <= currentdate )
dflast30 = df.loc[mask]
TypeError: unorderable types: str() > datetime.datetime()
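The message indicates that a string is being compared with a datetime, which suggests the TRX_DATE values are stored as text. A minimal sketch reproducing the same kind of failure in plain Python; the sample value and variable names are hypothetical, and the exact wording of the message differs between Python versions:

```python
import datetime as dt

value_from_column = "2015-10-08"        # hypothetical cell value stored as text
startdate = dt.datetime(2015, 10, 7)    # fixed date, chosen only for illustration

try:
    # Ordering a str against a datetime is not defined, so this raises TypeError
    value_from_column >= startdate
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If the column really holds strings, converting it to datetimes before comparing should avoid the error.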
Then I tried truncate, like this:
dflast30 = df.truncate(before = currentdate, after = startdate)
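I got the same error.
I am confused. I need advice on these questions:
> Can I change the index (the TRX_DATE field) to a datetime type?
> Or should I make that field a string type?
> Or should I leave the index unset as it is and search on the date field for my current requirement?
> Or please give an example of how to use the date field as the index and slice by date range, and show the output as well.
Solution:
I think your first approach is fine.
If you want to copy the column TRX_DATE to the index: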
df.index = pd.to_datetime(df['TRX_DATE'])
If you do not want a copy, just set the column TRX_DATE as the index:
df = df.set_index(['TRX_DATE'])
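Note that set_index alone does not change the column's type, so if TRX_DATE holds strings it may need converting first. A minimal sketch, assuming hypothetical sample data with dates stored as text, that checks the type, converts with pd.to_datetime, and then sets a sorted index:

```python
import pandas as pd

# Hypothetical sample data: dates stored as plain text
df = pd.DataFrame({"TRX_DATE": ["2015-11-05", "2015-10-08", "2015-12-06"],
                   "A": [2, 2, 3]})

# False here means the column is not yet a datetime column
is_datetime_before = pd.api.types.is_datetime64_any_dtype(df["TRX_DATE"])
print(is_datetime_before)

# Convert the text to datetimes, then use the column as a sorted index
df["TRX_DATE"] = pd.to_datetime(df["TRX_DATE"])
df = df.set_index("TRX_DATE").sort_index()

print(type(df.index).__name__)  # DatetimeIndex
```

Sorting the index is a choice made here because label slicing by date range is most predictable on a sorted index.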
Here is my demo:
import pandas as pd
import io
import datetime as dt

temp = u"""TRX_DATE;A
2013-07-05;1
2013-08-06;1
2015-09-05;2
2015-10-08;2
2015-11-05;2
2015-11-25;2
2015-12-06;3"""
# parse_dates=[0] makes the first column a datetime column
df = pd.read_csv(io.StringIO(temp), sep=";", parse_dates=[0])
print(df)
#    TRX_DATE  A
#0 2013-07-05  1
#1 2013-08-06  1
#2 2015-09-05  2
#3 2015-10-08  2
#4 2015-11-05  2
#5 2015-11-25  2
#6 2015-12-06  3
print(df.dtypes)
#TRX_DATE    datetime64[ns]
#A                    int64
#dtype: object

#copy column TRX_DATE to index
#df.index = pd.to_datetime(df['TRX_DATE'])
#no copy, only set column TRX_DATE to index
df = df.set_index(['TRX_DATE'])
print(df)
#            A
#TRX_DATE
#2013-07-05  1
#2013-08-06  1
#2015-09-05  2
#2015-10-08  2
#2015-11-05  2
#2015-11-25  2
#2015-12-06  3

# wrap today's date in a Timestamp so it compares cleanly with the DatetimeIndex
currentdate = pd.Timestamp(dt.date.today())
print(currentdate)
#2015-11-06 00:00:00 (on the day this demo was run)
startdate = currentdate - pd.Timedelta(days=30)
print(startdate)
#2015-10-07 00:00:00

# label-based slice on the DatetimeIndex; both ends are inclusive
dflast30 = df.loc[startdate:currentdate]
print(dflast30)
#            A
#TRX_DATE
#2015-10-08  2
#2015-11-05  2

# move TRX_DATE back from the index to an ordinary column
dflast30 = dflast30.reset_index()
print(dflast30)
#    TRX_DATE  A
#0 2015-10-08  2
#1 2015-11-05  2
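The truncate attempt from the question can also work once the index is a DatetimeIndex; note that before should be the earlier bound and after the later one, the reverse of the order used in the question. A minimal sketch, assuming hypothetical sample data and a fixed reference date instead of today's date so the result is reproducible:

```python
import pandas as pd

# Hypothetical sample data with a sorted DatetimeIndex
df = pd.DataFrame(
    {"A": [1, 2, 2, 2, 3]},
    index=pd.to_datetime(
        ["2013-07-05", "2015-10-08", "2015-11-05", "2015-11-25", "2015-12-06"]
    ),
)
df.index.name = "TRX_DATE"

# Fixed reference date (an assumption for illustration)
currentdate = pd.Timestamp("2015-11-06")
startdate = currentdate - pd.Timedelta(days=30)

# before = earlier bound, after = later bound; rows outside the range are dropped
dflast30_trunc = df.truncate(before=startdate, after=currentdate)
print(dflast30_trunc)
```

truncate requires a sorted index, so an unsorted frame would need sort_index() first.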
A different approach, in which you create a subset of df with a boolean mask. There is no need to set a DatetimeIndex.
import pandas as pd
import io
import datetime as dt

temp = u"""TRX_DATE;A
2013-07-05;1
2013-08-06;1
2015-09-05;2
2015-10-08;2
2015-11-05;2
2015-11-25;2
2015-12-06;3"""
# parse_dates=[0] makes the first column a datetime column
df = pd.read_csv(io.StringIO(temp), sep=";", parse_dates=[0])
print(df)
#    TRX_DATE  A
#0 2013-07-05  1
#1 2013-08-06  1
#2 2015-09-05  2
#3 2015-10-08  2
#4 2015-11-05  2
#5 2015-11-25  2
#6 2015-12-06  3
print(df.dtypes)
#TRX_DATE    datetime64[ns]
#A                    int64
#dtype: object

# wrap today's date in a Timestamp so it compares cleanly with the datetime column
currentdate = pd.Timestamp(dt.date.today())
print(currentdate)
#2015-11-06 00:00:00 (on the day this demo was run)
startdate = currentdate - pd.Timedelta(days=30)
print(startdate)
#2015-10-07 00:00:00

# boolean mask on the column; the original integer index is kept
dflast30 = df[(df.TRX_DATE >= startdate) & (df.TRX_DATE <= currentdate)]
print(dflast30)
#    TRX_DATE  A
#3 2015-10-08  2
#4 2015-11-05  2
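The two-sided comparison can also be written with Series.between, which includes both endpoints by default. A minimal sketch, assuming the same sample data as the demo and a fixed reference date in place of today's date so the result is reproducible:

```python
import pandas as pd

df = pd.DataFrame({
    "TRX_DATE": pd.to_datetime([
        "2013-07-05", "2013-08-06", "2015-09-05", "2015-10-08",
        "2015-11-05", "2015-11-25", "2015-12-06",
    ]),
    "A": [1, 1, 2, 2, 2, 2, 3],
})

# Fixed reference date (an assumption for illustration)
currentdate = pd.Timestamp("2015-11-06")
startdate = currentdate - pd.Timedelta(days=30)

# between() builds the same boolean mask as the >= and <= comparisons
dflast30 = df[df["TRX_DATE"].between(startdate, currentdate)]
print(dflast30)
```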
Tags: python, date, datetime, pandas
Source: https://codeday.me/bug/20190702/1357214.html