Moving data in pandas with the shift function

**星光**

Last modified 2022-04-21 21:40:38


First published 2022-04-17 12:09:18

Article link: https://blog.csdn.net/weixin_42322206/article/details/124226550


The shift concept

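The shift function moves the data while leaving the index unchanged. A typical use in pandas is subtracting adjacent rows, or rows a fixed number of positions apart.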
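The row subtraction mentioned above can be sketched as follows. This is a minimal example on made-up data (the column name `price` is hypothetical); it also checks the result against `diff()`, which computes the same previous-row difference.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"price": [10, 12, 15, 11]})

# Each row minus the previous row; the first row has no predecessor, so it is NaN
df["change"] = df["price"] - df["price"].shift(1)

# Each row minus the row two positions above it
df["change_2"] = df["price"] - df["price"].shift(2)

# diff() gives the same result as the manual shift-and-subtract
same_as_diff = df["change"].equals(df["price"].diff())

print(df)
print(same_as_diff)
```

For a plain previous-row difference `diff()` is shorter; `shift` is the more general tool when the shifted values are needed for something other than subtraction.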

df.shift(periods=1, freq=None, axis=0, fill_value=None)
"""
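periods: int, number of periods to shift; default 1. It can be passed positionally
      and may be negative.
freq: for time series only; DateOffset, timedelta, or time rule string; optional,
      default None and rarely needed.
      If given, the time index is shifted by this offset while the data values stay unchanged.
axis: direction of the shift; 0 shifts along the rows (up/down), 1 shifts along the columns (left/right)
fill_value: value used to fill the missing entries created by the shift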

periods : int
Number of periods to move, can be positive or negative
freq : DateOffset, timedelta, or time rule string, optional
Increment to use from the tseries module or time rule (e.g. 'EOM').
See Notes.
axis : {0 or 'index', 1 or 'columns'}
"""

Source

help(pandas.DataFrame.shift)
shift(self, periods=1, freq=None, axis=0, fill_value=None) -> 'DataFrame'
    Shift index by desired number of periods with an optional time `freq`.

    When `freq` is not passed, shift the index without realigning the data.
    If `freq` is passed (in this case, the index must be date or datetime,
    or it will raise a `NotImplementedError`), the index will be
    increased using the periods and the `freq`. `freq` can be inferred
    when specified as "infer" as long as either freq or inferred_freq
    attribute is set in the index.

    Parameters
    ----------
    periods : int
        Number of periods to shift. Can be positive or negative.
    freq : DateOffset, tseries.offsets, timedelta, or str, optional
        Offset to use from the tseries module or time rule (e.g. 'EOM').
        If `freq` is specified then the index values are shifted but the
        data is not realigned. That is, use `freq` if you would like to
        extend the index when shifting and preserve the original data.
        If `freq` is specified as "infer" then it will be inferred from
        the freq or inferred_freq attributes of the index. If neither of
        those attributes exist, a ValueError is thrown
    axis : {0 or 'index', 1 or 'columns', None}, default None
        Shift direction.
    fill_value : object, optional
        The scalar value to use for newly introduced missing values.
        the default depends on the dtype of `self`.
        For numeric data, ``np.nan`` is used.
        For datetime, timedelta, or period data, etc. :attr:`NaT` is used.
        For extension dtypes, ``self.dtype.na_value`` is used.

        .. versionchanged:: 1.1.0

    Returns
    -------
    DataFrame
        Copy of input object, shifted.

    See Also
    --------
    Index.shift : Shift values of Index.
    DatetimeIndex.shift : Shift values of DatetimeIndex.
    PeriodIndex.shift : Shift values of PeriodIndex.
    tshift : Shift the time index, using the index's frequency if
        available.

Examples

 Examples
    --------
    >>> df = pd.DataFrame({"Col1": [10, 20, 15, 30, 45],
    ...                    "Col2": [13, 23, 18, 33, 48],
    ...                    "Col3": [17, 27, 22, 37, 52]},
    ...                   index=pd.date_range("2020-01-01", "2020-01-05"))
    >>> df
                Col1  Col2  Col3
    2020-01-01    10    13    17
    2020-01-02    20    23    27
    2020-01-03    15    18    22
    2020-01-04    30    33    37
    2020-01-05    45    48    52

    >>> df.shift(periods=3)
                Col1  Col2  Col3
    2020-01-01   NaN   NaN   NaN
    2020-01-02   NaN   NaN   NaN
    2020-01-03   NaN   NaN   NaN
    2020-01-04  10.0  13.0  17.0
    2020-01-05  20.0  23.0  27.0

    >>> df.shift(periods=1, axis="columns")
                Col1  Col2  Col3
    2020-01-01   NaN  10.0  13.0
    2020-01-02   NaN  20.0  23.0
    2020-01-03   NaN  15.0  18.0
    2020-01-04   NaN  30.0  33.0
    2020-01-05   NaN  45.0  48.0

    >>> df.shift(periods=3, fill_value=0)
                Col1  Col2  Col3
    2020-01-01     0     0     0
    2020-01-02     0     0     0
    2020-01-03     0     0     0
    2020-01-04    10    13    17
    2020-01-05    20    23    27

    >>> df.shift(periods=3, freq="D")
                Col1  Col2  Col3
    2020-01-04    10    13    17
    2020-01-05    20    23    27
    2020-01-06    15    18    22
    2020-01-07    30    33    37
    2020-01-08    45    48    52

    >>> df.shift(periods=3, freq="infer")
                Col1  Col2  Col3
    2020-01-04    10    13    17
    2020-01-05    20    23    27
    2020-01-06    15    18    22
    2020-01-07    30    33    37
    2020-01-08    45    48    52
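In the outputs above, the integer columns turn into floats (`10.0`) when NaN is introduced, and stay integers when `fill_value=0` is given. A small sketch to check this, assuming a hypothetical one-column frame:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"Col1": [10, 20, 15]})

shifted_nan = df.shift(1)                   # NaN is introduced, so the column becomes float
shifted_filled = df.shift(1, fill_value=0)  # an integer fill value keeps the integer dtype

print(shifted_nan.dtypes)
print(shifted_filled.dtypes)
```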

The freq parameter

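import datetime

import pandas as pd

data = {
    "A": [1, 2, 3, 0],
    "B": [4, 5, 6, 0],
    "C": [7, 8, 9, 0],
    "D": [7, 8, 9, 0],
}
# df = pd.DataFrame(data)
df = pd.DataFrame(data, index=pd.date_range('2012-06-01', '2012-06-04'))
df2 = df.shift(freq=datetime.timedelta(1))   # move the index forward by one day
df3 = df.shift(freq=datetime.timedelta(-2))  # move the index back by two days
print(df)
print(df2)
print(df3)
# Only the index changes; the values stay the same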
"""
            A  B  C  D
2012-06-01  1  4  7  7
2012-06-02  2  5  8  8
2012-06-03  3  6  9  9
2012-06-04  0  0  0  0
            A  B  C  D
2012-06-02  1  4  7  7
2012-06-03  2  5  8  8
2012-06-04  3  6  9  9
2012-06-05  0  0  0  0
            A  B  C  D
2012-05-30  1  4  7  7
2012-05-31  2  5  8  8
2012-06-01  3  6  9  9
2012-06-02  0  0  0  0

"""