1. shift()
df.shift(periods=1, freq=None, axis=0)
import datetime
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(1, 17).reshape(4, 4), columns=['A', 'B', 'C', 'D'], index=['a', 'b', 'c', 'd'])
print(df)
A B C D
a 1 2 3 4
b 5 6 7 8
c 9 10 11 12
d 13 14 15 16
print(df.shift(3))
A B C D
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
d 1.0 2.0 3.0 4.0
print(df.shift(2,axis = 1))
A B C D
a NaN NaN 1.0 2.0
b NaN NaN 5.0 6.0
c NaN NaN 9.0 10.0
d NaN NaN 13.0 14.0
print(df.shift(-1))
A B C D
a 5.0 6.0 7.0 8.0
b 9.0 10.0 11.0 12.0
c 13.0 14.0 15.0 16.0
d NaN NaN NaN NaN
df = pd.DataFrame(np.arange(1,17).reshape(4,4),columns=['A','B','C','D'],index =pd.date_range('10/1/2018','10/4/2018'))
print(df)
A B C D
2018-10-01 1 2 3 4
2018-10-02 5 6 7 8
2018-10-03 9 10 11 12
2018-10-04 13 14 15 16
print(df.shift(freq=datetime.timedelta(1)))
A B C D
2018-10-02 1 2 3 4
2018-10-03 5 6 7 8
2018-10-04 9 10 11 12
2018-10-05 13 14 15 16
print(df.shift(freq=datetime.timedelta(-1)))
A B C D
2018-09-30 1 2 3 4
2018-10-01 5 6 7 8
2018-10-02 9 10 11 12
2018-10-03 13 14 15 16
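As the name suggests, shift moves the data.
The parameters mean the following:
periods: how far to shift. It can be positive or negative and defaults to 1, which means shifting by one position. Positions left without a corresponding value after the shift are filled with NaN.
freq: DateOffset, timedelta, or time rule string. Optional, defaults to None, and only applies to time series.
axis: the direction of the shift. 0 shifts along the rows (up and down); 1 shifts along the columns (left and right).
The difference between shifting with periods and shifting with freq:
With periods alone, only the data moves; the row and column labels stay where they are.
With freq, only the index moves and the data is unchanged. This only works when the index is datetime-like.
How to think about a periods shift:
The whole block of data moves. When shifting down by 3 rows, for example, the entire block slides down, so the original bottom three rows fall outside the 4×4 frame and are no longer visible. There was never any data above the original first row, so those positions remain empty after the shift; pandas simply marks empty data with NaN (which is also why the integer columns are printed as floats).
How to think about a freq shift:
Shifting in time means moving to the next or the previous day. With the default periods=1, freq=datetime.timedelta(1) moves every timestamp one day later and freq=datetime.timedelta(-1) moves it one day earlier.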
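The difference between the two kinds of shift can be checked side by side. The following is a minimal sketch that rebuilds the date-indexed frame from above; the names by_periods and by_freq are only illustrative.

```python
import datetime

import numpy as np
import pandas as pd

# Same 4x4 frame as above, indexed by four consecutive days.
df = pd.DataFrame(
    np.arange(1, 17).reshape(4, 4),
    columns=list("ABCD"),
    index=pd.date_range("2018-10-01", periods=4),
)

by_periods = df.shift(1)                            # data moves, index stays
by_freq = df.shift(freq=datetime.timedelta(days=1))  # index moves, data stays

# periods shift: the index is unchanged and the first row is all NaN
print(by_periods.index.equals(df.index))
print(by_periods.iloc[0].isna().all())

# freq shift: the values are unchanged and every label is one day later
print((by_freq.to_numpy() == df.to_numpy()).all())
print(by_freq.index[0])
```

Because NaN has to be stored somewhere, the periods shift returns float columns, while the freq shift keeps the original integer dtype.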
2. map(), apply(), applymap()
def change(x):
    # Values greater than 2 are 'big'; everything else (NaN included) is 'small'
    if x > 2:
        return 'big'
    return 'small'
s = pd.Series([1, 2, 3, np.nan])
frame = pd.DataFrame(np.arange(1, 17).reshape(4,4), columns=list('abcd'), index=['A', 'B', 'C', 'D'])
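# na_action=None: NaN is passed to change(), which returns 'small' for it
s2 = s.map(change, na_action=None)
# na_action='ignore': NaN is left as NaN and change() is never called on it
s3 = s.map(change, na_action='ignore')
print(s2)
print(s3)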
0 small
1 small
2 big
3 small
dtype: object
0 small
1 small
2 big
3 NaN
dtype: object
print(frame)
a b c d
A 1 2 3 4
B 5 6 7 8
C 9 10 11 12
D 13 14 15 16
print(frame.apply(lambda x:x.max()/x.min()))
a 13.0
b 7.0
c 5.0
d 4.0
dtype: float64
print(frame.applymap(change))
a b c d
A small small big big
B big big big big
C big big big big
D big big big big
Series.map(arg, na_action=None)
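When na_action is None, NaN values are passed to the function. When it is 'ignore', NaN values are propagated straight to the result without being passed to the function.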
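One way to see this behaviour is to record which values the mapped function actually receives. This is only a sketch: label and the seen list are hypothetical helpers introduced for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

seen = []  # every value that the mapped function is called with


def label(x):
    seen.append(x)
    return "big" if x > 2 else "small"


s = pd.Series([1, 2, 3, np.nan])

with_none = s.map(label)  # na_action=None is the default
seen_none = list(seen)

seen.clear()
with_ignore = s.map(label, na_action="ignore")
seen_ignore = list(seen)

# NaN reached the function only in the default case
print(any(pd.isna(v) for v in seen_none))
print(any(pd.isna(v) for v in seen_ignore))
print(with_none.tolist())
print(with_ignore.tolist())
```

With the default, NaN ends up labelled 'small' because the comparison NaN > 2 is False; with 'ignore', the missing value stays missing in the result.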
Series.apply(func, convert_dtype=True, args=(), **kwds)
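func: function. The function to apply to each value.
convert_dtype: boolean, default True. Try to find a better dtype for the element-wise function results. If False, the result is left as dtype=object.
args: tuple. Positional arguments to pass to func in addition to the value itself.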
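The args and **kwds parameters can be illustrated with a variant of the labelling function that takes its cut-off as a second argument. This is a small sketch; label and threshold are hypothetical names chosen for the example.

```python
import pandas as pd


def label(x, threshold):
    # x is the Series value; threshold is supplied through args or **kwds
    return "big" if x > threshold else "small"


s = pd.Series([1, 2, 3, 4])

# Extra positional arguments go in the args tuple, after the value
by_args = s.apply(label, args=(2,))
print(by_args.tolist())   # ['small', 'small', 'big', 'big']

# Extra keyword arguments are forwarded to the function as well
by_kwds = s.apply(label, threshold=3)
print(by_kwds.tolist())   # ['small', 'small', 'small', 'big']
```

Note the trailing comma in args=(2,): it has to be a tuple even when only one extra argument is passed.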