Sorting by axis
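import pandas as pd
import numpy as np
# Six consecutive days starting 2019-03-01, used as the row index
dates = pd.date_range('20190301', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
# axis=1 sorts by column labels; ascending=False gives the order D, C, B, A
print(df.sort_index(axis=1, ascending=False))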
Result
                   D         C         B         A
2019-03-01 -0.791684 -0.487994  0.609415 -0.130605
2019-03-02 -1.547686  0.424872  0.774110 -0.444310
2019-03-03  0.017846  1.498144 -0.384752 -2.453477
2019-03-04  0.727243 -0.112879  1.297718  0.278707
2019-03-05  0.506601  0.796999  1.532483 -1.643227
2019-03-06 -0.217591 -2.295332 -1.521089 -0.051233
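To make the role of the axis argument easier to see, here is a minimal sketch on a small hand-written frame. The name df_demo and its values are made up for illustration and are not the random data above. It shows axis=0 sorting by row labels and axis=1 sorting by column labels.

```python
import pandas as pd

# Hypothetical demo data: rows and columns are deliberately out of order
df_demo = pd.DataFrame(
    {"B": [1, 2, 3], "A": [4, 5, 6], "C": [7, 8, 9]},
    index=pd.to_datetime(["2019-03-03", "2019-03-01", "2019-03-02"]),
)

# axis=0 (the default) sorts by the row labels, here the dates
by_rows = df_demo.sort_index(axis=0)

# axis=1 sorts by the column labels; ascending=False reverses the order
by_cols = df_demo.sort_index(axis=1, ascending=False)

print(by_rows["B"].tolist())     # [2, 3, 1]
print(by_cols.columns.tolist())  # ['C', 'B', 'A']
```

In both cases only the order of the labels changes; each value stays attached to its original row and column.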
Sorting by the values in column B
print(df.sort_values(by='B'))
Result
                   A         B         C         D
2019-03-06 -0.051233 -1.521089 -2.295332 -0.217591
2019-03-03 -2.453477 -0.384752  1.498144  0.017846
2019-03-01 -0.130605  0.609415 -0.487994 -0.791684
2019-03-02 -0.444310  0.774110  0.424872 -1.547686
2019-03-04  0.278707  1.297718 -0.112879  0.727243
2019-03-05 -1.643227  1.532483  0.796999  0.506601
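One point worth checking is that sort_values returns a new, sorted DataFrame and leaves the original untouched unless you reassign it. The sketch below uses a small made-up frame (df_demo is a hypothetical name) so the outcome is reproducible.

```python
import pandas as pd

# Hypothetical demo data with string row labels
df_demo = pd.DataFrame(
    {"A": [3, 1, 2], "B": [0.5, -1.0, 2.0]},
    index=["x", "y", "z"],
)

# Sort rows by the values in column B (ascending by default)
sorted_df = df_demo.sort_values(by="B")

print(sorted_df.index.tolist())  # ['y', 'x', 'z']
print(df_demo.index.tolist())    # ['x', 'y', 'z'], the original is unchanged
```

Whole rows move together, so the row labels travel with their values.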
Sorting in descending order
print(df.sort_values(by='B',ascending=False))
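The by and ascending arguments also accept lists, which allows a different sort direction per column. The following is a sketch on made-up data (df_demo is hypothetical): it sorts by A ascending and breaks ties by B descending.

```python
import pandas as pd

# Hypothetical demo data with ties in column A
df_demo = pd.DataFrame({"A": [1, 1, 2, 2], "B": [10, 20, 30, 40]})

# Sort by A ascending, then by B descending within each value of A
result = df_demo.sort_values(by=["A", "B"], ascending=[True, False])

print(result["B"].tolist())    # [20, 10, 40, 30]
print(result.index.tolist())   # [1, 0, 3, 2]
```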
Selecting data in pandas
Getting
Selecting a single column yields a Series; this is equivalent to df.A
print(df['A'])
Result
2019-03-01 -0.130605
2019-03-02 -0.444310
2019-03-03 -2.453477
2019-03-04 0.278707
2019-03-05 -1.643227
2019-03-06 -0.051233
Freq: D, Name: A, dtype: float64
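As a quick check of the claim that df['A'] and df.A are the same thing, the sketch below compares them on a small deterministic frame (df_demo is a hypothetical name). It also shows that passing a list of column names returns a DataFrame rather than a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: column A holds 0, 2, 4 and column B holds 1, 3, 5
dates = pd.date_range("20190301", periods=3)
df_demo = pd.DataFrame(np.arange(6).reshape(3, 2), index=dates, columns=list("AB"))

col = df_demo["A"]        # single label -> Series
same = col.equals(df_demo.A)  # attribute access returns the same data
sub = df_demo[["A"]]      # list of labels -> DataFrame with one column

print(type(col).__name__)  # Series
print(same)                # True
print(type(sub).__name__)  # DataFrame
```

Attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, so the bracket form is the safer default.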
Slicing rows with the [] operator (a slice selects along axis=0 by default)
dates = pd.date_range('20190101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
print(df[0:3])
Result:
                   A         B         C         D
2019-01-01  0.856435 -1.004727  1.922469  0.137247
2019-01-02 -0.197909  1.979294  1.754650  0.900661
2019-01-03  0.298422  0.446164 -0.281074 -1.514126
print(df['20190102':'20190103'])
Result:
                   A         B         C         D
2019-01-02 -0.197909  1.979294  1.754650  0.900661
2019-01-03  0.298422  0.446164 -0.281074 -1.514126
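The two slices above behave differently at the end point: an integer slice is positional and excludes the stop position, while a label slice includes the end label. Here is a minimal sketch on a made-up frame (df_demo is hypothetical) where the values make this easy to verify.

```python
import pandas as pd

# Hypothetical demo data: column A simply counts the rows 0..5
dates = pd.date_range("20190101", periods=6)
df_demo = pd.DataFrame({"A": range(6)}, index=dates)

by_position = df_demo[0:3]                     # positions 0, 1, 2 (stop excluded)
by_label = df_demo["20190102":"20190103"]      # both end labels included

print(by_position["A"].tolist())  # [0, 1, 2]
print(by_label["A"].tolist())     # [1, 2]
```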
Selection by label
Getting a cross-section using a label
import pandas as pd
import numpy as np
dates = pd.date_range('20190101',periods=6)
df = pd.DataFrame(np.random.randn(6,4),index=dates,columns=list('ABCD'))
print(df.loc[dates[0]])
Result:
A 1.755160
B 1.889182
C 1.646745
D 0.418026
Name: 2019-01-01 00:00:00, dtype: float64
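The cross-section returned by df.loc[label] is a Series whose index is the frame's column labels and whose name is the row label. The sketch below checks this on a small deterministic frame (df_demo is a hypothetical name).

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: first row holds 0 and 1
dates = pd.date_range("20190101", periods=3)
df_demo = pd.DataFrame(np.arange(6).reshape(3, 2), index=dates, columns=list("AB"))

row = df_demo.loc[dates[0]]  # one row label -> Series

print(row.index.tolist())     # ['A', 'B']
print(row.name == dates[0])   # True
print(row.tolist())           # [0, 1]
```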
print(df.loc[:,['A','B']])
Result:
                   A         B
2019-01-01  1.755160  1.889182
2019-01-02 -1.687622  1.189862
2019-01-03  0.242299 -0.810373
2019-01-04 -0.758775  0.855404
2019-01-05 -0.316525 -1.058533
2019-01-06  1.584005  1.586659
print(df.loc['20190102':'20190104',['A','C']])
Result:
                   A         C
2019-01-02 -1.687622  0.565117
2019-01-03  0.242299  0.703034
2019-01-04 -0.758775  0.297482
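With loc, a label slice on the rows includes both end points, and a list on the columns picks exactly the named columns. Here is a sketch with deterministic values (df_demo is hypothetical; row i holds 4i, 4i+1, 4i+2, 4i+3) so the selection can be verified by eye.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: 6 rows x 4 columns filled with 0..23
dates = pd.date_range("20190101", periods=6)
df_demo = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

# Rows 2019-01-02 through 2019-01-04 inclusive, columns A and C only
subset = df_demo.loc["20190102":"20190104", ["A", "C"]]

print(subset.shape)          # (3, 2)
print(subset["A"].tolist())  # [4, 8, 12]
print(subset["C"].tolist())  # [6, 10, 14]
```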
print(df.loc['20190102',['A','B']])
Result:
A -1.687622
B 1.189862
Name: 2019-01-02 00:00:00, dtype: float64
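The type that loc returns depends on the kind of indexer used on each axis: a slice or list keeps that dimension, while a single label drops it. The sketch below illustrates this on a deterministic frame (df_demo is a hypothetical name, filled with 0..23).

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: 6 rows x 4 columns filled with 0..23
dates = pd.date_range("20190101", periods=6)
df_demo = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

frame = df_demo.loc["20190102":"20190103", ["A", "B"]]  # slice + list -> DataFrame
series = df_demo.loc["20190102", ["A", "B"]]            # single label + list -> Series
scalar = df_demo.loc[dates[1], "A"]                     # single label + single label -> one value

print(type(frame).__name__)  # DataFrame
print(series.tolist())       # [4, 5]
print(scalar)                # 4
```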