Python pandas Indexing

小浩子7号

Article link: https://blog.csdn.net/qq_41782791/article/details/105945707


1. Duplicate Index Labels
Check whether the index contains duplicate labels

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(6), index=list('abcbda'))

a    0.848267
b    0.704451
c    0.326481
b    0.897240
d    0.220018
a    0.565038

print(s.index.is_unique)

False
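
Once is_unique reports False, it helps to see which labels are repeated. A minimal sketch, assuming the same label layout as above but with fixed values (an assumption, so the result is reproducible) and the helper names repeat_mask and dup_labels, which are illustrative only:

```python
import pandas as pd

# Same labels as the series above; fixed values replace the random ones (assumption).
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=list('abcbda'))

# Index.duplicated() flags every repeat of a label after its first occurrence.
repeat_mask = s.index.duplicated()

# Labels that occur more than once, in the order their repeats appear.
dup_labels = s.index[repeat_mask].unique().tolist()
print(dup_labels)  # ['b', 'a']
```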

Clean up the duplicate labels by grouping on the index and aggregating the values (here, by summing them)

s.groupby(s.index).sum()

a    1.413305
b    1.601691
c    0.326481
d    0.220018
dtype: float64
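
Summing is only one way to collapse duplicate labels. As a rough sketch of two alternatives (keeping the first value, or averaging), again with fixed values standing in for the random ones; the names first_only and averaged are illustrative:

```python
import pandas as pd

# Same labels as above; fixed values replace the random ones (assumption).
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=list('abcbda'))

# Option 1: keep only the first value seen for each label.
first_only = s[~s.index.duplicated(keep='first')]

# Option 2: aggregate duplicates with the mean instead of the sum.
averaged = s.groupby(level=0).mean()

print(first_only)
print(averaged)
```

Which option fits depends on what the duplicates mean in the data; neither is implied by the original example.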

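2. Multi-level Index (MultiIndex)

# `index` is not defined in the original post; this definition is reconstructed
# from the printed output below (integer second-level labels are an assumption)
index = pd.MultiIndex.from_arrays([list('aaabbcc'), [1, 2, 3, 1, 2, 2, 3]],
                                  names=['level1', 'level2'])
s = pd.Series(np.random.rand(7), index=index)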

level1  level2
a       1         0.137450
        2         0.968562
        3         0.622390
b       1         0.035829
        2         0.874913
c       2         0.043371
        3         0.226650
dtype: float64

Selecting by the first level

print(s['c'])

level2
2    0.043371
3    0.226650
dtype: float64

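Selecting by the second level

# the level2 labels are assumed to be integers, so select with 2 rather than '2'
print(s.loc[:, 2])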

level1
a    0.968562
b    0.874913
c    0.043371
dtype: float64
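
Selecting by level name can be easier to read than positional slicing. A minimal sketch using Series.xs, assuming integer second-level labels and using range(7) as stand-in values (both assumptions, not the data from the post):

```python
import pandas as pd

# Index shape matches the series above; values are fixed for reproducibility.
index = pd.MultiIndex.from_arrays(
    [list('aaabbcc'), [1, 2, 3, 1, 2, 2, 3]],
    names=['level1', 'level2'],
)
s = pd.Series(range(7), index=index)

# Cross-section: every row whose level2 label equals 2; level2 is dropped from the result.
second = s.xs(2, level='level2')

# A full key (one label per level) returns a single value.
single = s.loc[('a', 2)]

print(second)
print(single)
```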

Hierarchical indexing
A DataFrame with multi-level row and column indexes

df = pd.DataFrame(np.random.randint(1, 10, (4, 3)),
                  index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                  columns=[['one', 'one', 'two'], ['blue', 'red', 'blue']])
df.index.names = ['row-1', 'row-2']
df.columns.names = ['col-1', 'col-2']
print(df)

col-1        one      two
col-2       blue red blue
row-1 row-2              
a     1        2   4    1
      2        3   1    4
b     1        3   9    4
      2        7   9    8

Selecting by the first level

print(df.loc['a'])

col-1  one      two
col-2 blue red blue
row-2              
1        2   4    1
2        3   1    4

Selecting by the first and second levels

print(df.loc['a', 1])

col-1  col-2
one    blue     2
       red      4
two    blue     1
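
The column levels can be addressed in the same way as the row levels. A small sketch, assuming the same index layout as the DataFrame above but with np.arange values instead of random integers (an assumption made so the results are predictable):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
    columns=[['one', 'one', 'two'], ['blue', 'red', 'blue']],
)
df.index.names = ['row-1', 'row-2']
df.columns.names = ['col-1', 'col-2']

# Single cell: the row key tuple comes first, the column key tuple second.
cell = df.loc[('a', 2), ('one', 'red')]

# Every 'blue' column, selected on the second column level (that level is dropped).
blue = df.xs('blue', level='col-2', axis=1)

print(cell)
print(blue)
```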

Swapping index levels

df2 = df.swaplevel('row-1', 'row-2')
print(df2)

col-1        one      two
col-2       blue red blue
row-2 row-1              
1     a        2   4    1
2     a        3   1    4
1     b        3   9    4
2     b        7   9    8
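
swaplevel only reorders the levels; the rows stay in their original order, so the new outer level is no longer sorted. A minimal sketch of following it with sort_index, using a small stand-in frame with hypothetical columns x and y:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(8).reshape(4, 2),
    index=pd.MultiIndex.from_product([['a', 'b'], [1, 2]], names=['row-1', 'row-2']),
    columns=['x', 'y'],  # hypothetical column names for illustration
)

# Swap the two row levels; the row order is unchanged.
swapped = df.swaplevel('row-1', 'row-2')

# Sort on the new level order so equal outer labels become adjacent.
sorted_df = swapped.sort_index()

print(sorted_df)
```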

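Statistics by index level

# DataFrame.sum(level=...) was removed in pandas 2.0; group by the level instead
print(df.groupby(level=0).sum())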
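
Level-wise statistics can also be requested by level name, and along the column levels. A rough sketch, assuming the same index layout as above with np.arange values (an assumption); the column-wise version transposes first, which is one of several possible approaches:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
    columns=[['one', 'one', 'two'], ['blue', 'red', 'blue']],
)
df.index.names = ['row-1', 'row-2']
df.columns.names = ['col-1', 'col-2']

# Sum the rows that share the same 'row-1' label.
by_row1 = df.groupby(level='row-1').sum()

# Sum the columns that share the same 'col-1' label: transpose, group, transpose back.
by_col1 = df.T.groupby(level='col-1').sum().T

print(by_row1)
print(by_col1)
```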

df = pd.DataFrame({
    'a': range(7),
    'b': range(7, 0, -1),
    'c': ['one', 'one', 'one', 'one', 'two', 'two', 'two'],
    'd': [0, 1, 2, 0, 1, 2, 3]
})
print(df)

   a  b    c  d
0  0  7  one  0
1  1  6  one  1
2  2  5  one  2
3  3  4  one  0
4  4  3  two  1
5  5  2  two  2
6  6  1  two  3

Setting a column (or several columns) as the index

print(df.set_index('c'))

     a  b  d
c           
one  0  7  0
one  1  6  1
one  2  5  2
one  3  4  0
two  4  3  1
two  5  2  2
two  6  1  3

print(df.set_index(['c', 'd']))

       a  b
c   d      
one 0  0  7
    1  1  6
    2  2  5
    0  3  4
two 1  4  3
    2  5  2
    3  6  1
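
Note that the pair ('one', 0) appears twice above, so the resulting MultiIndex is not unique. A minimal sketch of two set_index options that may help here: verify_integrity, which raises on duplicate keys, and drop=False, which keeps the column in the frame as well. The variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'a': range(7),
    'b': range(7, 0, -1),
    'c': ['one', 'one', 'one', 'one', 'two', 'two', 'two'],
    'd': [0, 1, 2, 0, 1, 2, 3],
})

indexed = df.set_index(['c', 'd'])
print(indexed.index.is_unique)  # False, because ('one', 0) occurs twice

# verify_integrity=True refuses to build an index with duplicate keys.
try:
    df.set_index(['c', 'd'], verify_integrity=True)
    raised = False
except ValueError:
    raised = True

# drop=False keeps 'c' as a regular column in addition to using it as the index.
kept = df.set_index('c', drop=False)
print(kept.columns.tolist())
```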

Converting the index back into columns

df2 = df.set_index(['c', 'd'])
print(df2.reset_index())

     c  d  a  b
0  one  0  0  7
1  one  1  1  6
2  one  2  2  5
3  one  0  3  4
4  two  1  4  3
5  two  2  5  2
6  two  3  6  1
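
reset_index does not have to restore every level. A short sketch of resetting a single level and of discarding the index altogether, using the same frame as above; partial and dropped are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'a': range(7),
    'b': range(7, 0, -1),
    'c': ['one', 'one', 'one', 'one', 'two', 'two', 'two'],
    'd': [0, 1, 2, 0, 1, 2, 3],
})
df2 = df.set_index(['c', 'd'])

# Move only level 'd' back into the columns; 'c' stays as the index.
partial = df2.reset_index(level='d')

# Throw the index away instead of turning it into columns.
dropped = df2.reset_index(drop=True)

print(partial)
print(dropped)
```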

Sorting by column labels

# axis must be passed by keyword in pandas 2.0 and later
print(df2.reset_index().sort_index(axis='columns'))

   a  b    c  d
0  0  7  one  0
1  1  6  one  1
2  2  5  one  2
3  3  4  one  0
4  4  3  two  1
5  5  2  two  2
6  6  1  two  3
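
The same call can sort the column labels in descending order. A minimal sketch on a small stand-in frame whose columns start out unordered (the data is hypothetical):

```python
import pandas as pd

# Columns deliberately out of alphabetical order.
df = pd.DataFrame({'b': [1, 2], 'd': [3, 4], 'a': [5, 6], 'c': [7, 8]})

asc = df.sort_index(axis='columns')
desc = df.sort_index(axis='columns', ascending=False)

print(asc.columns.tolist())   # ['a', 'b', 'c', 'd']
print(desc.columns.tolist())  # ['d', 'c', 'b', 'a']
```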
