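DataFrame hierarchical indexing
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
# Name the levels of the row index and of the column index
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']
print(frame)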
'''
state      ohio         colo
color       red  red  yellow
key1 key2
a    1        0    1       2
b    2        3    4       5
a    1        6    7       8
b    2        9   10      11
'''
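Reordering levels (returns a new object; assign the result)
'''
Swapping levels does not change the data; it only changes the position of the index levels.
swaplevel parameters:
i, j: position or name of each level to swap
axis=0: 0 swaps levels of the row index, 1 swaps levels of the column index
'''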
frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']
frame1 = frame.swaplevel('key1', 'key2')
print(frame1)
'''
state      ohio         colo
color       red  red  yellow
key2 key1
1    a        0    1       2
2    b        3    4       5
1    a        6    7       8
2    b        9   10      11
'''
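The axis parameter listed in the swaplevel notes can be illustrated on the column index as well. This is a minimal sketch, assuming the same frame as in these notes; the name swapped_cols is only for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']

# axis=1 swaps the levels of the column index; the row index is left as it was
swapped_cols = frame.swaplevel('state', 'color', axis=1)
print(swapped_cols.columns.names)
print(swapped_cols.columns.tolist())
```

After the swap, color should be the outer column level and state the inner one, while the values stay in the same cells.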
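frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']
# level=-1 sorts the rows by the innermost row level (key2);
# sort_index returns a new object, so assign the result
frame1 = frame.sort_index(level=-1)
print(frame1)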
'''
state      ohio         colo
color       red  red  yellow
key1 key2
a    1        0    1       2
     1        6    7       8
b    2        3    4       5
     2        9   10      11
'''
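swaplevel and sort_index are often used together: move a level to the outer position, then sort by it. This is a minimal sketch under that assumption, reusing the frame from these notes; the name by_key2 is only for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']

# Put key2 in the outer position, then sort the rows by that outer level
by_key2 = frame.swaplevel('key1', 'key2').sort_index(level=0)
print(by_key2)
```

Rows that share the same key2 value end up next to each other, which is what label-based selection on the outer level usually expects.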
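Aggregating by level
frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']
# frame.sum(level='color', axis=1) was removed in pandas 2.0; instead, transpose,
# group by the 'color' column level, sum, then transpose back
frame1 = frame.T.groupby(level='color').sum().T
print(frame1)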
'''
color      red  yellow
key1 key2
a    1       1       2
b    2       7       5
a    1      13       8
b    2      19      11
'''
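The same idea applies to a level of the row index. This is a minimal sketch, assuming the frame from these notes and using groupby(level=...) to sum the rows that share a key1 value; the name by_key1 is only for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']

# Group the rows by the key1 level of the row index and sum each group
by_key1 = frame.groupby(level='key1').sum()
print(by_key1)
```

Row a is the sum of the two original a rows (6, 8, 10) and row b is the sum of the two b rows (12, 14, 16); the two column levels are kept unchanged.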
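Columns as the index (returns a new object; assign the result)
'''
Use one or more columns as the row index
set_index(self, keys, drop=True, append=False, inplace=False,
verify_integrity=False)
drop: whether to remove those columns from the DataFrame (default True)
'''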
frame = pd.DataFrame({
    'a': range(4),
    'b': range(3, -1, -1),
    'c': ['a', 'b', 'c', 'd'],
    'd': ['one', 'two', 'three', 'four']
})
frame2 = frame.set_index(['c', 'd'])
print(frame2)
'''
         a  b
c d
a one    0  3
b two    1  2
c three  2  1
d four   3  0
'''
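The drop parameter from the set_index notes can be seen by passing drop=False. This is a minimal sketch with the same example data; the name kept is only for illustration.

```python
import pandas as pd

frame = pd.DataFrame({
    'a': range(4),
    'b': range(3, -1, -1),
    'c': ['a', 'b', 'c', 'd'],
    'd': ['one', 'two', 'three', 'four']
})

# drop=False builds the index from c and d but also keeps them as ordinary columns
kept = frame.set_index(['c', 'd'], drop=False)
print(kept)
```

With drop=False, c and d appear twice: once as index levels and once as regular columns.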
'''
reset_index(): the inverse of set_index(); moves the index levels back into ordinary columns
'''
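A round trip shows the relationship between the two calls. This is a minimal sketch, assuming the same example data as the set_index code; the names indexed and restored are only for illustration.

```python
import pandas as pd

frame = pd.DataFrame({
    'a': range(4),
    'b': range(3, -1, -1),
    'c': ['a', 'b', 'c', 'd'],
    'd': ['one', 'two', 'three', 'four']
})

# Move c and d into the row index, then move them back into columns
indexed = frame.set_index(['c', 'd'])
restored = indexed.reset_index()
print(restored)
```

The data comes back unchanged, but the restored columns are ordered c, d, a, b because the former index levels are inserted first.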