Pandas multi-level / hierarchical indexing (MultiIndex)



1. Creating a multi-level row index


1) From a list of arrays: MultiIndex.from_arrays()

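import numpy as np
import pandas as pd

# The data is random, and the printed outputs in this article were captured from
# separate runs, so the numbers will not match your own run or each other.
df = pd.DataFrame(np.random.randn(4), columns=['v'])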
print(df.to_string())
'''
          v
0 -1.066128
1  0.393179
2  1.595051
3 -1.492583
'''
arrays = [np.array(['bar', 'bar', 'foo', 'foo']), np.array(['one', 'two', 'one', 'two'])]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
df.index = index
print(df.to_string())
'''
                     v
first second          
bar   one     1.270260
      two    -0.539714
foo   one    -0.284528
      two     0.682699
'''

2) From an array of tuples: MultiIndex.from_tuples()

df = pd.DataFrame(np.random.randn(4), columns=['v'])
arrays = [['bar', 'bar', 'foo', 'foo'], ['one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
print(tuples)    # [('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')]
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df.index = index
print(df.to_string())
'''
                     v
first second          
bar   one    -1.066128
      two     0.393179
foo   one     1.595051
      two    -1.492583
'''

3) From the Cartesian product of iterables (every pairing): MultiIndex.from_product()

df = pd.DataFrame(np.random.randn(4), columns=['v'])
iterables = [['bar', 'foo'], ['one', 'two']]
index = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df.index = index
print(df.to_string())
'''
                     v
first second          
bar   one    -0.201214
      two    -1.097793
foo   one     0.033961
      two    -0.023977
'''

4) From a DataFrame: MultiIndex.from_frame()

df = pd.DataFrame(np.random.randn(4), columns=['v'])
df_index = pd.DataFrame([['bar', 'one'], ['bar', 'two'], ['foo', 'one'], ['foo', 'two']], columns=['first', 'second'])
index = pd.MultiIndex.from_frame(df_index)
df.index = index
print(df.to_string())
'''
                     v
first second          
bar   one    -0.669057
      two     0.749700
foo   one    -1.458682
      two    -0.795187
'''
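
The four constructors above are different routes to the same kind of object. The sketch below is not from the original article; it rebuilds the same labels with each constructor and checks that the resulting indexes are equal. The variable names are illustrative only.

```python
import pandas as pd

names = ["first", "second"]

# Same labels as in the examples above, built four different ways.
from_arrays = pd.MultiIndex.from_arrays(
    [["bar", "bar", "foo", "foo"], ["one", "two", "one", "two"]], names=names
)
from_tuples = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"), ("foo", "one"), ("foo", "two")], names=names
)
from_product = pd.MultiIndex.from_product([["bar", "foo"], ["one", "two"]], names=names)
from_frame = pd.MultiIndex.from_frame(
    pd.DataFrame(
        [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]],
        columns=names,  # level names are taken from the column names
    )
)

# Index.equals compares the labels element by element.
all_equal = (
    from_arrays.equals(from_tuples)
    and from_tuples.equals(from_product)
    and from_product.equals(from_frame)
)
print(all_equal)
print(from_frame.nlevels, list(from_frame.names))
```

from_product is usually the shortest to write when every combination of labels is wanted; from_tuples and from_arrays are the better fit when only some combinations exist.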

2. Getting the labels of one index level, and aligning data


1) Get the labels of a given level: get_level_values()

print(df.index.get_level_values(-1))    # Index(['one', 'two', 'one', 'two'], dtype='object', name='second')
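
A level can be addressed by name as well as by position. The following is a small sketch, with a hand-built index rather than the article's df, that also contrasts get_level_values() (one label per row, repeats included) with the levels attribute (unique labels only).

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)

# Position 0 and the name "first" refer to the same level.
by_position = index.get_level_values(0)
by_name = index.get_level_values("first")
print(list(by_name))

# index.levels stores only the unique labels of each level.
unique_first = list(index.levels[0])
print(unique_first)
```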

2) Aligning data: reindex()

# Reindex the columns (axis=1; the default axis is rows): this can add new columns and reorder existing ones
df_reindex = df.reindex(['k', 'v'], axis=1, fill_value=0)
print(df_reindex.to_string())
'''
              k         v
first second             
bar   one     0  0.778402
      two     0 -0.773582
foo   one     0  0.102783
      two     0 -0.849725
'''
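
reindex() works the same way along the rows, where the target can itself be a MultiIndex. The sketch below uses fixed values instead of random ones so the result is predictable; the key ('baz', 'one') is a made-up label that does not exist in the source frame, so it is filled with fill_value.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame({"v": [1.0, 2.0, 3.0, 4.0]}, index=index)

# Target row index: reordered, with one key that is missing from df.
new_index = pd.MultiIndex.from_tuples(
    [("foo", "two"), ("bar", "one"), ("baz", "one")], names=["first", "second"]
)
rows = df.reindex(index=new_index, fill_value=0.0)
print(rows)
```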

3. Advanced usage of multi-level / hierarchical indexes


1) Get the values of all columns for a given index key

print(df_reindex.loc[('bar', 'two')])
'''
k    0.000000
v   -0.115918
Name: (bar, two), dtype: float64
'''

2) Get the value of one column for a given index key

print(df_reindex.loc[('bar', 'two'), 'v'])
'''
1.1971089038835172
'''

3) Partial slicing

print(df_reindex.loc['bar'])
'''
        k         v
second             
one     0 -0.141858
two     0 -0.786239
'''
print(df_reindex.loc(axis=0)[:, 'two'])
'''
              k         v
first second             
bar   two     0 -0.615808
foo   two     0  0.238987
'''

print(df_reindex.loc[('bar', 'two'):('foo', 'one')])
'''
              k         v
first second             
bar   two     0  1.055614
foo   one     0  0.899269
'''
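
The same selections can also be written with pd.IndexSlice, which some readers find easier to follow than loc(axis=0). This is a sketch with fixed values, not the article's data. Slicing a MultiIndex generally needs a sorted index, so the example calls sort_index() first even though this particular index is already sorted.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame({"k": 0, "v": [1.0, 2.0, 3.0, 4.0]}, index=index).sort_index()

idx = pd.IndexSlice

# Every 'first' label, but only 'two' on the second level.
twos = df.loc[idx[:, "two"], :]
print(twos)

# Only 'bar' on the first level, every second-level label, column 'v'.
bar_v = df.loc[idx["bar", :], "v"]
print(bar_v)
```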

4) Cross-section selection: xs(key, axis, level, drop_level)

print(df_reindex.xs(key='two', level='second', drop_level=False))
'''
              k         v
first second             
bar   two     0 -0.615808
foo   two     0  0.238987
'''
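
The drop_level argument decides whether the selected level stays in the result. A brief sketch with fixed values, comparing the default (drop_level=True) with drop_level=False:

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame({"v": [1.0, 2.0, 3.0, 4.0]}, index=index)

# Default: the 'second' level is removed from the result's index.
dropped = df.xs("two", level="second")
print(dropped.index.tolist())

# drop_level=False keeps the full two-level index.
kept = df.xs("two", level="second", drop_level=False)
print(kept.index.tolist())
```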

5) Working with a multi-level column index

df = df_reindex.T
print(df)
'''
first        bar                 foo          
second       one       two       one       two
k       0.000000  0.000000  0.000000  0.000000
v      -0.062292  0.342238 -0.520897 -0.565263
'''
# The two calls below select the same columns, so both print the output shown.
print(df.loc(axis=1)[:, 'two'])
print(df.xs('two', level=-1, axis=1, drop_level=False))
'''
first        bar       foo
second       two       two
k       0.000000  0.000000
v      -0.711593  1.917493
'''
print(df.xs(('bar', 'two'), level=(0, -1), axis=1))
'''
first       bar
second      two
k       0.00000
v       0.38138
'''

6) Swapping index levels: swaplevel()

print(df.swaplevel(axis=1))
'''
second       one       two       one       two
first        bar       bar       foo       foo
k       0.000000  0.000000  0.000000  0.000000
v       0.398076 -0.346376  0.555577 -1.454103
'''
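
swaplevel() only changes the order of the levels; it does not re-sort the labels, so the result is usually no longer sorted. The sketch below shows this on the row axis with fixed values (the article's example swaps column levels), then restores a sorted order with sort_index().

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame({"v": [1.0, 2.0, 3.0, 4.0]}, index=index)

# With no arguments, the two innermost row levels are swapped.
swapped = df.swaplevel()
print(list(swapped.index.names))
print(swapped.index.is_monotonic_increasing)

# Sorting afterwards makes label-based slicing reliable again.
resorted = swapped.sort_index()
print(resorted)
```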

7) Renaming index labels: rename(index)

print(df_reindex.rename(index={'bar': "bar_x", 'one': "one_x"}))
'''
              k         v
first second             
bar_x one_x   0  0.686636
      two     0  0.352534
foo   one_x   0  0.489657
      two     0  0.067204
'''
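
Without further arguments, the mapping passed to rename() is applied to every level. When the same label appears in more than one level, the level argument restricts the rename to one of them. The index below is a made-up example (level names 'outer' and 'inner' are hypothetical) chosen so that the label 'a' occurs in both levels.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["a", "b"], ["a", "c"]], names=["outer", "inner"]
)
df = pd.DataFrame({"v": [1, 2, 3, 4]}, index=index)

# No level given: 'a' is renamed wherever it appears.
all_levels = df.rename(index={"a": "A"})
print(all_levels.index.tolist())

# level='inner': only the second level is touched.
inner_only = df.rename(index={"a": "A"}, level="inner")
print(inner_only.index.tolist())
```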

8) Renaming the axis (level) names: rename_axis(index)

print(df_reindex.rename_axis(index=['first_x', 'second_x']))
'''
                  k         v
first_x second_x             
bar     one       0  0.792857
        two       0 -1.092279
foo     one       0 -1.822016
        two       0  0.661191
'''
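
rename_axis() also accepts a mapping, which is handy when only some level names should change. A minimal sketch with fixed values; note that the call returns a new object and leaves the original frame's level names untouched.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "foo"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame({"v": [1.0, 2.0, 3.0, 4.0]}, index=index)

# Rename only the 'first' level; 'second' keeps its name.
renamed = df.rename_axis(index={"first": "first_x"})
print(list(renamed.index.names))

# The original frame is unchanged.
print(list(df.index.names))
```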