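Creating Hierarchical Indexes in Pandas

1. Creating a multi-level index

1.1 Implicit construction

  • The most common approach is to pass two or more arrays to the index parameter of the DataFrame constructor; pandas builds the multi-level index from them.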

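# Import NumPy and pandas
import numpy as np
import pandas as pd

# 6 x 6 random integers in [0, 100)
data = np.random.randint(0, 100, size=(6, 6))

# Row index: one array per level
index = [
    ["Class 1", "Class 1", "Class 1", "Class 2", "Class 2", "Class 2"],
    ["Zhang San", "Li Si", "Wang Wu", "Lu Ban", "Zhang Sanfeng", "Zhang Wuji"],
]
# Column index: one array per level
columns = [
    ["Midterm", "Midterm", "Midterm", "Final", "Final", "Final"],
    ["Chinese", "Math", "English", "Chinese", "Math", "English"],
]
df = pd.DataFrame(data=data, index=index, columns=columns)
df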

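Output: a 6 × 6 table of random integers (cell values omitted because they change on every run), with a two-level row index and a two-level column index:

                          Midterm                  Final
                          Chinese  Math  English   Chinese  Math  English
Class 1  Zhang San
         Li Si
         Wang Wu
Class 2  Lu Ban
         Zhang Sanfeng
         Zhang Wuji
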
  • A Series can be given a multi-level index in the same way.

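data = np.random.randint(0, 100, size=6)
index = [
    ["Class 1", "Class 1", "Class 1", "Class 2", "Class 2", "Class 2"],
    ["Zhang San", "Li Si", "Wang Wu", "Lu Ban", "Zhang Sanfeng", "Zhang Wuji"],
]
s = pd.Series(data=data, index=index)
s

Class 1  Zhang San         7
         Li Si             9
         Wang Wu          57
Class 2  Lu Ban           88
         Zhang Sanfeng    36
         Zhang Wuji        5
dtype: int32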
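
To confirm what implicit construction produces, the index of the resulting object can be inspected. The following is a minimal sketch, assuming the same class and student labels as the example (written in English here) and using np.arange instead of random data so the result is deterministic:

```python
import numpy as np
import pandas as pd

index = [
    ["Class 1", "Class 1", "Class 1", "Class 2", "Class 2", "Class 2"],
    ["Zhang San", "Li Si", "Wang Wu", "Lu Ban", "Zhang Sanfeng", "Zhang Wuji"],
]
# np.arange replaces the random data so the sketch is deterministic.
s = pd.Series(np.arange(6), index=index)

# pandas has converted the list of two arrays into a two-level MultiIndex.
is_multi = isinstance(s.index, pd.MultiIndex)
n_levels = s.index.nlevels
first_label = s.index[0]
print(is_multi, n_levels, first_label)  # True 2 ('Class 1', 'Zhang San')
```

Each entry of the index is a tuple with one element per level, which is why the explicit constructors in the next section can also be fed tuples directly.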

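1.2 Explicit construction with pd.MultiIndex

  • From arrays

data = np.random.randint(0, 100, size=(6, 6))

# Row index: built explicitly from one array per level
index = pd.MultiIndex.from_arrays([
    ["Class 1", "Class 1", "Class 1", "Class 2", "Class 2", "Class 2"],
    ["Zhang San", "Li Si", "Wang Wu", "Lu Ban", "Zhang Sanfeng", "Zhang Wuji"],
])
# Column index: still built implicitly from plain lists
columns = [
    ["Midterm", "Midterm", "Midterm", "Final", "Final", "Final"],
    ["Chinese", "Math", "English", "Chinese", "Math", "English"],
]
df = pd.DataFrame(data=data, index=index, columns=columns)
df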

Output: a 6 × 6 table of random integers. Rows are labeled by (class, student) and columns by (exam, subject).

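  • From tuples

data = np.random.randint(0, 100, size=(6, 6))

# Row index: one tuple per row, with one element per level
index = pd.MultiIndex.from_tuples(
    (
        ("Class 1", "Zhang San"), ("Class 1", "Li Si"), ("Class 1", "Wang Wu"),
        ("Class 2", "Lu Ban"), ("Class 2", "Zhang Sanfeng"), ("Class 2", "Zhang Wuji"),
    )
)
# Column index
columns = [
    ["Midterm", "Midterm", "Midterm", "Final", "Final", "Final"],
    ["Chinese", "Math", "English", "Chinese", "Math", "English"],
]
df = pd.DataFrame(data=data, index=index, columns=columns)
df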

Output: a 6 × 6 table of random integers. Rows are labeled by (class, student) and columns by (exam, subject).
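
The array form and the tuple form describe the same labels, just grouped differently: arrays go level by level, tuples go row by row. A small sketch, assuming the labels from the examples (in English) and using zip as one possible way to convert between the two, suggests they yield equal indexes:

```python
import pandas as pd

classes = ["Class 1", "Class 1", "Class 1", "Class 2", "Class 2", "Class 2"]
students = ["Zhang San", "Li Si", "Wang Wu", "Lu Ban", "Zhang Sanfeng", "Zhang Wuji"]

# One array per level.
from_arrays = pd.MultiIndex.from_arrays([classes, students])
# zip pairs the arrays element by element, giving one tuple per row.
from_tuples = pd.MultiIndex.from_tuples(list(zip(classes, students)))

same = from_arrays.equals(from_tuples)
print(same)  # True
```

Which constructor to use is mostly a matter of how the labels are already stored.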

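  • From a Cartesian product

Cartesian product: {a, b} × {c, d} ==> (a, c), (a, d), (b, c), (b, d)

data = np.random.randint(0, 100, size=(6, 6))

# Row index: every combination of class and student (2 x 3 = 6 rows)
index = pd.MultiIndex.from_product([
    ["Class 1", "Class 2"],
    ["Zhang San", "Li Si", "Wang Wu"],
])
# Column index
columns = [
    ["Midterm", "Midterm", "Midterm", "Final", "Final", "Final"],
    ["Chinese", "Math", "English", "Chinese", "Math", "English"],
]
df = pd.DataFrame(data=data, index=index, columns=columns)
df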

Output: a 6 × 6 table of random integers. The row index holds every (class, student) pair, so Class 1 and Class 2 each contain Zhang San, Li Si, and Wang Wu. Columns are labeled by (exam, subject).
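
The Cartesian product that from_product computes can be spelled out with the standard library. The sketch below is an illustration only: itertools.product is used as a stand-in to generate the pairs, and the labels are the English versions of those in the example.

```python
from itertools import product

import pandas as pd

classes = ["Class 1", "Class 2"]
students = ["Zhang San", "Li Si", "Wang Wu"]

by_product = pd.MultiIndex.from_product([classes, students])
# The same pairs, generated explicitly and passed in as tuples.
by_tuples = pd.MultiIndex.from_tuples(list(product(classes, students)))

pairs = list(by_product)
same = by_product.equals(by_tuples)
print(len(pairs), same)  # 6 True
```

The first level varies slowest: all students of Class 1 come first, followed by all students of Class 2. This also means from_product only fits data where every outer label has the same set of inner labels.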

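2. Multi-level column index

Like the row index (index), the column index (columns) can be built as a multi-level index with the same constructors.

  • From arrays

data = np.random.randint(0, 100, size=(6, 6))

# Row index
index = pd.MultiIndex.from_arrays([
    ["Class 1", "Class 1", "Class 1", "Class 2", "Class 2", "Class 2"],
    ["Zhang San", "Li Si", "Wang Wu", "Lu Ban", "Zhang Sanfeng", "Zhang Wuji"],
])
# Column index
columns = pd.MultiIndex.from_arrays([
    ["Midterm", "Midterm", "Midterm", "Final", "Final", "Final"],
    ["Chinese", "Math", "English", "Chinese", "Math", "English"],
])
df = pd.DataFrame(data=data, index=index, columns=columns)
df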

Output: a 6 × 6 table of random integers. Rows are labeled by (class, student) and columns by (exam, subject).

  • From tuples

data = np.random.randint(0, 100, size=(6, 6))

# Row index
index = pd.MultiIndex.from_tuples(
    (
        ("Class 1", "Zhang San"), ("Class 1", "Li Si"), ("Class 1", "Wang Wu"),
        ("Class 2", "Lu Ban"), ("Class 2", "Zhang Sanfeng"), ("Class 2", "Zhang Wuji"),
    )
)
# Column index
columns = pd.MultiIndex.from_tuples(
    (
        ("Midterm", "Chinese"), ("Midterm", "Math"), ("Midterm", "English"),
        ("Final", "Chinese"), ("Final", "Math"), ("Final", "English"),
    )
)
df = pd.DataFrame(data=data, index=index, columns=columns)
df

Output: a 6 × 6 table of random integers. Rows are labeled by (class, student) and columns by (exam, subject).

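  • From a Cartesian product

data = np.random.randint(0, 100, size=(6, 6))

# Row index: 2 classes x 3 students = 6 rows
index = pd.MultiIndex.from_product([
    ["Class 1", "Class 2"],
    ["Zhang San", "Li Si", "Wang Wu"],
])
# Column index: 2 exams x 3 subjects = 6 columns
columns = pd.MultiIndex.from_product([
    ["Midterm", "Final"],
    ["Chinese", "Math", "English"],
])
df = pd.DataFrame(data=data, index=index, columns=columns)
df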

Output: a 6 × 6 table of random integers. The row index holds every (class, student) pair and the column index every (exam, subject) pair.
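
Because the examples draw random data, their printed values change on every run. The sketch below shows one way to make such an example reproducible and to check that both axes ended up with two levels. The seeded generator (np.random.default_rng(0)) and sizing the data from the index lengths are additions for illustration; they are not part of the original examples.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [["Class 1", "Class 2"], ["Zhang San", "Li Si", "Wang Wu"]]
)
columns = pd.MultiIndex.from_product(
    [["Midterm", "Final"], ["Chinese", "Math", "English"]]
)

# Assumption: a fixed seed, so repeated runs give the same table.
rng = np.random.default_rng(0)
# Sizing the data from the indexes keeps the shapes in agreement.
data = rng.integers(0, 100, size=(len(index), len(columns)))

df = pd.DataFrame(data=data, index=index, columns=columns)

shape = df.shape
row_levels = df.index.nlevels
col_levels = df.columns.nlevels
exams = list(df.columns.get_level_values(0).unique())
print(shape, row_levels, col_levels, exams)  # (6, 6) 2 2 ['Midterm', 'Final']
```

If the data shape does not match the lengths of the two indexes, the DataFrame constructor raises an error, so deriving the size from len(index) and len(columns) avoids a common mistake when the level lists change.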
