Pandas: Hierarchical Indexing

进击的rookie of python

Category: Data Analysis and Visualization. Tags: Pandas, hierarchical indexing

Article link: https://blog.csdn.net/rookie_is_me/article/details/87343916

Hierarchical indexing on a DataFrame

import numpy as np
import pandas as pd

frame=pd.DataFrame(np.arange(12).reshape(4,3),
                   index=[['a','b','a','b'],[1,2,1,2]],
                   columns=[['ohio','ohio','colo'],['red','red','yellow']])
# Name the levels of the row index and of the column index
frame.index.names=['key1','key2']
frame.columns.names=['state','color']
print(frame)
'''
state     ohio       colo
color      red red yellow
key1 key2                
a    1       0   1      2
b    2       3   4      5
a    1       6   7      8
b    2       9  10     11
'''
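
The level names can also be supplied when the indexes are built, rather than assigned afterwards. The following is a minimal sketch of that alternative using `pd.MultiIndex.from_arrays`; the variable names `rows` and `cols` are only illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Build both axes as explicit MultiIndex objects and name the levels up front.
rows = pd.MultiIndex.from_arrays([['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                                 names=['key1', 'key2'])
cols = pd.MultiIndex.from_arrays([['ohio', 'ohio', 'colo'],
                                  ['red', 'red', 'yellow']],
                                 names=['state', 'color'])
frame = pd.DataFrame(np.arange(12).reshape(4, 3), index=rows, columns=cols)

print(frame.index.names)    # level names on the row axis
print(frame.columns.names)  # level names on the column axis
print(frame.index.nlevels)  # number of levels in the row index
```

This should produce the same frame as the snippet above, with the names attached at construction time.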

Reordering and sorting levels (assign the result to a new variable)

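'''
Reordering levels does not change the data; it only changes the position of the levels.
swaplevel parameters:
i, j: the two levels to swap, given as level numbers or level names
axis=0: 0 swaps levels of the row index, 1 swaps levels of the column index
'''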
frame=pd.DataFrame(np.arange(12).reshape(4,3),
                   index=[['a','b','a','b'],[1,2,1,2]],
                   columns=[['ohio','ohio','colo'],['red','red','yellow']])
frame.index.names=['key1','key2']
frame.columns.names=['state','color']
frame1=frame.swaplevel('key1','key2')
print(frame1)
'''
state     ohio       colo
color      red red yellow
key2 key1                
1    a       0   1      2
2    b       3   4      5
1    a       6   7      8
2    b       9  10     11
'''
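
As a small illustration of the parameters listed above, the sketch below passes the levels by position instead of by name and also swaps the column levels with `axis=1`. The names `by_position` and `swapped_cols` are hypothetical and only used here.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']

# Levels given by position: equivalent to swaplevel('key1', 'key2').
by_position = frame.swaplevel(0, 1)

# axis=1 swaps the levels of the column index instead of the row index.
swapped_cols = frame.swaplevel('state', 'color', axis=1)

print(by_position.index.names)
print(swapped_cols.columns.names)
print(frame.index.names)  # the original frame is left unchanged
```

Because `swaplevel` returns a new object, the result has to be assigned for the change to be kept.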

frame=pd.DataFrame(np.arange(12).reshape(4,3),
                   index=[['a','b','a','b'],[1,2,1,2]],
                   columns=[['ohio','ohio','colo'],['red','red','yellow']])
frame.index.names=['key1','key2']
frame.columns.names=['state','color']
# Sort the rows by the innermost level (key2); sort_index returns a new DataFrame
frame1=frame.sort_index(level=-1)
print(frame1)
'''
state     ohio       colo
color      red red yellow
key1 key2                
a    1       0   1      2
     1       6   7      8
b    2       3   4      5
     2       9  10     11
'''
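
The two operations in this section are often chained: swap the levels first, then sort on the new outer level so the rows are ordered by it. A minimal sketch, assuming the same `frame` as above (the name `frame_sorted` is illustrative):

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']

# Move key2 to the outer position, then sort the rows on that outer level.
frame_sorted = frame.swaplevel('key1', 'key2').sort_index(level=0)

print(frame_sorted)
print(frame_sorted.index.is_monotonic_increasing)
```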

Aggregating by level

frame=pd.DataFrame(np.arange(12).reshape(4,3),
                   index=[['a','b','a','b'],[1,2,1,2]],
                   columns=[['ohio','ohio','colo'],['red','red','yellow']])
frame.index.names=['key1','key2']
frame.columns.names=['state','color']
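# sum(level=...) was removed in pandas 2.0; group the transposed frame by the column level instead
frame1=frame.T.groupby(level='color').sum().T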
print(frame1)
'''
color      red  yellow
key1 key2             
a    1       1       2
b    2       7       5
a    1      13       8
b    2      19      11
'''
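
The same idea applies to the row index. The sketch below groups on the row level `key1` with `groupby(level=...)`, which is the form that works on current pandas releases. The name `by_key1` is illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape(4, 3),
                     index=[['a', 'b', 'a', 'b'], [1, 2, 1, 2]],
                     columns=[['ohio', 'ohio', 'colo'], ['red', 'red', 'yellow']])
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']

# Sum the rows that share the same key1 value; the column levels are kept.
by_key1 = frame.groupby(level='key1').sum()

print(by_key1)
```

Here the two `a` rows are added together, and likewise the two `b` rows, so the result has one row per `key1` value.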

Using columns as the index (assign the result to a new variable)

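'''
Use one or more columns as the row index
set_index(self, keys, drop=True, append=False, inplace=False,
                  verify_integrity=False)
drop: if True (the default), the columns used as the index are removed from the DataFrame's columns
'''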
frame=pd.DataFrame({
    'a':range(4),
    'b':range(3,-1,-1),
    'c':['a','b','c','d'],
    'd':['one','two','three','four']
})
frame2=frame.set_index(['c','d'])
print(frame2)
'''
         a  b
c d          
a one    0  3
b two    1  2
c three  2  1
d four   3  0
'''
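
To see what the `drop` parameter does, the following sketch repeats the call with `drop=False`, which keeps `c` and `d` as ordinary columns in addition to using them as index levels. The name `kept` is illustrative.

```python
import pandas as pd

frame = pd.DataFrame({
    'a': range(4),
    'b': range(3, -1, -1),
    'c': ['a', 'b', 'c', 'd'],
    'd': ['one', 'two', 'three', 'four'],
})

# drop=False: 'c' and 'd' become index levels but also stay as columns.
kept = frame.set_index(['c', 'd'], drop=False)

print(kept.index.names)
print(kept.columns.tolist())
```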


'''
reset_index(): the inverse of set_index(); it moves the index levels back into ordinary columns
'''
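
A short round trip makes the relationship concrete. This is a sketch under the same example data as above; the name `restored` is illustrative. Note that the restored columns come back at the front of the frame, so the column order differs from the original.

```python
import pandas as pd

frame = pd.DataFrame({
    'a': range(4),
    'b': range(3, -1, -1),
    'c': ['a', 'b', 'c', 'd'],
    'd': ['one', 'two', 'three', 'four'],
})

frame2 = frame.set_index(['c', 'd'])  # 'c' and 'd' become index levels
restored = frame2.reset_index()       # index levels become columns again

print(restored.columns.tolist())
print(restored)
```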