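The shift concept
shift moves the data by a given number of positions while leaving the index unchanged. A common use in pandas is subtracting each row from the row above or below it (row-over-row differences).
df.shift(periods=1, freq=None, axis=0, fill_value=None)
"""
periods: int, the number of positions to move. Defaults to 1, can be positive or negative, and can be passed positionally without the periods= keyword.
freq: applies only to time series; a DateOffset, timedelta, or time rule string (e.g. 'EOM'). Optional,
defaults to None, and is rarely needed.
When this argument is given, the time index is shifted by that amount and the data values are not changed.
axis: {0 or 'index', 1 or 'columns'}, the direction of the shift. 0 shifts along the rows (up or down), 1 shifts along the columns (left or right).
fill_value: the value used to fill the missing entries that the shift introduces.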
"""
Source docstring
help(pandas.DataFrame.shift)
shift(self, periods=1, freq=None, axis=0, fill_value=None) -> 'DataFrame'
Shift index by desired number of periods with an optional time `freq`.
When `freq` is not passed, shift the index without realigning the data.
If `freq` is passed (in this case, the index must be date or datetime,
or it will raise a `NotImplementedError`), the index will be
increased using the periods and the `freq`. `freq` can be inferred
when specified as "infer" as long as either freq or inferred_freq
attribute is set in the index.
Parameters
----------
periods : int
Number of periods to shift. Can be positive or negative.
freq : DateOffset, tseries.offsets, timedelta, or str, optional
Offset to use from the tseries module or time rule (e.g. 'EOM').
If `freq` is specified then the index values are shifted but the
data is not realigned. That is, use `freq` if you would like to
extend the index when shifting and preserve the original data.
If `freq` is specified as "infer" then it will be inferred from
the freq or inferred_freq attributes of the index. If neither of
those attributes exist, a ValueError is thrown
axis : {0 or 'index', 1 or 'columns', None}, default None
Shift direction.
fill_value : object, optional
The scalar value to use for newly introduced missing values.
the default depends on the dtype of `self`.
For numeric data, ``np.nan`` is used.
For datetime, timedelta, or period data, etc. :attr:`NaT` is used.
For extension dtypes, ``self.dtype.na_value`` is used.
.. versionchanged:: 1.1.0
Returns
-------
DataFrame
Copy of input object, shifted.
See Also
--------
Index.shift : Shift values of Index.
DatetimeIndex.shift : Shift values of DatetimeIndex.
PeriodIndex.shift : Shift values of PeriodIndex.
tshift : Shift the time index, using the index's frequency if
available.
Examples from the docstring
Examples
--------
>>> df = pd.DataFrame({"Col1": [10, 20, 15, 30, 45],
... "Col2": [13, 23, 18, 33, 48],
... "Col3": [17, 27, 22, 37, 52]},
... index=pd.date_range("2020-01-01", "2020-01-05"))
>>> df
            Col1  Col2  Col3
2020-01-01    10    13    17
2020-01-02    20    23    27
2020-01-03    15    18    22
2020-01-04    30    33    37
2020-01-05    45    48    52
>>> df.shift(periods=3)
            Col1  Col2  Col3
2020-01-01   NaN   NaN   NaN
2020-01-02   NaN   NaN   NaN
2020-01-03   NaN   NaN   NaN
2020-01-04  10.0  13.0  17.0
2020-01-05  20.0  23.0  27.0
>>> df.shift(periods=1, axis="columns")
            Col1  Col2  Col3
2020-01-01   NaN  10.0  13.0
2020-01-02   NaN  20.0  23.0
2020-01-03   NaN  15.0  18.0
2020-01-04   NaN  30.0  33.0
2020-01-05   NaN  45.0  48.0
>>> df.shift(periods=3, fill_value=0)
            Col1  Col2  Col3
2020-01-01     0     0     0
2020-01-02     0     0     0
2020-01-03     0     0     0
2020-01-04    10    13    17
2020-01-05    20    23    27
>>> df.shift(periods=3, freq="D")
            Col1  Col2  Col3
2020-01-04    10    13    17
2020-01-05    20    23    27
2020-01-06    15    18    22
2020-01-07    30    33    37
2020-01-08    45    48    52
>>> df.shift(periods=3, freq="infer")
            Col1  Col2  Col3
2020-01-04    10    13    17
2020-01-05    20    23    27
2020-01-06    15    18    22
2020-01-07    30    33    37
2020-01-08    45    48    52
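One detail in the outputs above is easy to miss: without fill_value the shifted integers are printed as 10.0, 13.0, and so on, while with fill_value=0 they stay 10, 13. The sketch below checks the column dtypes to show why. It reuses only Col1 from the docstring example, and the variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Col1": [10, 20, 15, 30, 45]},
                  index=pd.date_range("2020-01-01", "2020-01-05"))

# Default fill: the gaps become NaN, which an integer column cannot hold,
# so the column is converted to a floating-point dtype.
shifted_nan = df.shift(periods=3)

# Integer fill value: no NaN is introduced, so the integer dtype is kept.
shifted_zero = df.shift(periods=3, fill_value=0)

print(shifted_nan["Col1"].dtype)
print(shifted_zero["Col1"].dtype)
```

If a later step needs integers (for example, using the values as positions or counts), passing a fill_value avoids a separate cast back from float.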
The freq parameter
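import datetime

import pandas as pd

data = {
    "A": [1, 2, 3, 0],
    "B": [4, 5, 6, 0],
    "C": [7, 8, 9, 0],
    "D": [7, 8, 9, 0],
}
# df = pd.DataFrame(data)
df = pd.DataFrame(data, index=pd.date_range("2012-06-01", "2012-06-04"))
df2 = df.shift(freq=datetime.timedelta(1))   # move the index forward by one day
df3 = df.shift(freq=datetime.timedelta(-2))  # move the index back by two days
print(df)
print(df2)
print(df3)
# Only the index changes; the values stay the same.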
"""
            A  B  C  D
2012-06-01  1  4  7  7
2012-06-02  2  5  8  8
2012-06-03  3  6  9  9
2012-06-04  0  0  0  0
            A  B  C  D
2012-06-02  1  4  7  7
2012-06-03  2  5  8  8
2012-06-04  3  6  9  9
2012-06-05  0  0  0  0
            A  B  C  D
2012-05-30  1  4  7  7
2012-05-31  2  5  8  8
2012-06-01  3  6  9  9
2012-06-02  0  0  0  0
"""