例子:
df = pd.DataFrame()
df['A'] = [1, 1, 2]
df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
df['C'] = df.groupby('A').B.diff()
df['C'] = df.C.dt.days
报错:
Traceback (most recent call last):
File "D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\series.py", line 2820, in _make_dt_accessor
return maybe_to_datetimelike(self)
File "D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\indexes\accessors.py", line 84, in maybe_to_datetimelike
"datetimelike index".format(type(data)))
TypeError: cannot convert an object of type <class 'pandas.core.series.Series'> to a datetimelike index
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/学习/pandas_test/pandas_learn_20190102.py", line 49, in <module>
test2()
File "D:/学习/pandas_test/pandas_learn_20190102.py", line 32, in test2
df['C'] = df.C.dt.days
File "D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\generic.py", line 3077, in __getattr__
return object.__getattribute__(self, name)
File "D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\base.py", line 243, in __get__
return self.construct_accessor(instance)
File "D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\series.py", line 2822, in _make_dt_accessor
raise AttributeError("Can only use .dt accessor with datetimelike "
AttributeError: Can only use .dt accessor with datetimelike values
原因:
分组求diff后的结果是:
A B C
0 1 2018-01-02 NaT
1 1 2018-01-03 1 days 00:00:00
2 2 2018-01-03 NaN
类型是:
A int64
B object
C object
dtype: object
预想的类型是:
A int64
B object
C timedelta64[ns]
dtype: object
解决:
原本尝试使用astype强制将object列,转成timedelta列
df['C'] = df.C.astype(pd.Timedelta)
这句代码不会报错,但是C列的类型不会改变,没有作用。
最后有两种处理方式:
提前定义B列为时间列:
df = pd.DataFrame()
df['A'] = [1, 1, 2]
df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
df.B = pd.to_datetime(df.B)
df['C'] = df.groupby('A').B.diff()
df['C'] = df.C.dt.days
增加类型转换:
df = pd.DataFrame()
df['A'] = [1, 1, 2]
df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
df['C'] = df.groupby('A').B.diff()
df['C'] = pd.to_timedelta(df.C, unit='d').dt.days