pandas offers several ways to create a MultiIndex. Here are the most common ones:
1. pd.MultiIndex.from_tuples
>>> import pandas as pd
>>> import numpy as np
>>> arrays = [["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],["one", "two", "one", "two", "one", "two", "one", "two"]]
>>> tuples = list(zip(*arrays))
>>> index = pd.MultiIndex.from_tuples(tuples)
>>> index
MultiIndex([('bar', 'one'),
            ('bar', 'two'),
            ('baz', 'one'),
            ('baz', 'two'),
            ('foo', 'one'),
            ('foo', 'two'),
            ('qux', 'one'),
            ('qux', 'two')],
           )
>>> s = pd.Series(np.random.randn(8), index=index)
>>> s
bar  one    0.612171
     two   -0.615973
baz  one   -1.611292
     two   -0.708034
foo  one   -1.588981
     two   -0.998916
qux  one   -0.320857
     two   -1.623501
dtype: float64
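As a small follow-up sketch: from_tuples also accepts a names argument, and a Series indexed this way can be selected by its outer level or by a full key. The level names "first" and "second" and the fixed values below are chosen for illustration only, so that the result is reproducible.

```python
import pandas as pd

# Four (outer, inner) pairs; the values are arbitrary illustration data.
tuples = [("bar", "one"), ("bar", "two"), ("baz", "one"), ("baz", "two")]
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)

# Select every row whose outer level is "bar"; the result is indexed
# by the remaining inner level.
bar_part = s.loc["bar"]

# Select a single value with a full (outer, inner) key.
single = s.loc[("baz", "two")]

print(bar_part)
print(single)
```

Naming the levels is optional, but it makes later operations that refer to a level by name easier to read.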
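2. pd.MultiIndex.from_product
# Unlike from_tuples, this method takes the values of each level and builds every combination of them (their Cartesian product), as the example shows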
>>> iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]
>>> pd.MultiIndex.from_product(iterables, names=["first", "second"])
MultiIndex([('bar', 'one'),
            ('bar', 'two'),
            ('baz', 'one'),
            ('baz', 'two'),
            ('foo', 'one'),
            ('foo', 'two'),
            ('qux', 'one'),
            ('qux', 'two')],
           names=['first', 'second'])
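To make the difference from from_tuples concrete, the sketch below builds the same pairs by hand with itertools.product and checks that both constructors give an equal index. The variable names are illustrative.

```python
import itertools

import pandas as pd

iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]]

# from_product expands the level values into all combinations itself.
from_product = pd.MultiIndex.from_product(iterables, names=["first", "second"])

# The same pairs, spelled out explicitly and passed to from_tuples.
pairs = list(itertools.product(*iterables))
from_tuples = pd.MultiIndex.from_tuples(pairs, names=["first", "second"])

# equals() compares the index entries element by element.
same = from_product.equals(from_tuples)
print(same)
```

from_product is the shorter choice when every combination is wanted; from_tuples remains useful when only some pairs should appear.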
3. pd.MultiIndex.from_frame, which builds the index directly from an existing DataFrame
>>> df = pd.DataFrame([["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]],columns=["first", "second"])
>>> df
  first second
0   bar    one
1   bar    two
2   foo    one
3   foo    two
>>> pd.MultiIndex.from_frame(df)
MultiIndex([('bar', 'one'),
            ('bar', 'two'),
            ('foo', 'one'),
            ('foo', 'two')],
           names=['first', 'second'])
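A related sketch, under the assumption that the DataFrame also carries a data column (the "value" column and the "outer"/"inner" names here are made up for illustration): from_frame can rename the levels through its names argument, and DataFrame.set_index produces an equivalent index while keeping the remaining columns as data.

```python
import pandas as pd

# Same key columns as above, plus a hypothetical "value" column.
df = pd.DataFrame(
    [["bar", "one", 1], ["bar", "two", 2], ["foo", "one", 3], ["foo", "two", 4]],
    columns=["first", "second", "value"],
)

# Build the index from the two key columns and override the level names.
index = pd.MultiIndex.from_frame(df[["first", "second"]], names=["outer", "inner"])

# set_index moves the key columns into the index in one step.
indexed = df.set_index(["first", "second"])

print(index.names)
print(indexed)
```

When the goal is simply to index an existing DataFrame by some of its columns, set_index is usually the more direct route; from_frame is handy when the index is needed as a separate object.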
4. Passing a list of arrays directly as the index
>>> arrays = [np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]),np.array(["one", "two", "one", "two", "one", "two", "one", "two"])]
>>> s = pd.Series(np.random.randn(8), index=arrays)
>>> s
bar  one    0.953744
     two    0.069687
baz  one   -0.205349
     two    1.093807
foo  one   -0.845178
     two   -1.040949
qux  one    1.646463
     two   -0.244347
dtype: float64
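The sketch below checks what happens in this fourth approach: passing a list of arrays as the index yields a MultiIndex automatically, equal to the one that pd.MultiIndex.from_arrays builds explicitly. The shortened arrays and fixed values are illustrative.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(["bar", "bar", "baz", "baz"]),
    np.array(["one", "two", "one", "two"]),
]

# pandas turns the list of arrays into a two-level MultiIndex.
s = pd.Series([0.1, 0.2, 0.3, 0.4], index=arrays)

# The explicit constructor, which additionally lets the levels be named.
explicit = pd.MultiIndex.from_arrays(arrays, names=["first", "second"])

print(type(s.index).__name__)
print(s.index.equals(explicit))
```

The implicit form leaves the levels unnamed, so from_arrays is worth the extra line when level names matter.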
Those are a few common ways to build a MultiIndex.