3.1.4 MultiIndex and Panel
1. MultiIndex
A multi-level (hierarchical) index object.
- Attributes of the index:
- names: the name of each level
- levels: the unique values of each level
new_df.index
new_df.index.levels
new_df.index.names
MultiIndex([(2012, 1),
(2014, 4),
(2013, 7),
(2014, 10)],
names=['year', 'month'])
FrozenList([[2012, 2013, 2014], [1, 4, 7, 10]])
FrozenList(['year', 'month'])
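The construction of new_df is not shown in this section. As a minimal sketch, assuming it was built by moving two columns into the index with set_index, the snippet below reproduces an index of the same shape. The column name sale and its values are hypothetical and exist only to make the example self-contained.

```python
import pandas as pd

# Hypothetical sample data; year and month mirror the index printed above,
# the "sale" column is invented for illustration.
df = pd.DataFrame({
    "year": [2012, 2014, 2013, 2014],
    "month": [1, 4, 7, 10],
    "sale": [55, 40, 84, 31],
})

# Moving two columns into the index produces a MultiIndex.
new_df = df.set_index(["year", "month"])

index_names = list(new_df.index.names)                     # name of each level
index_levels = [lv.tolist() for lv in new_df.index.levels]  # unique values per level

print(new_df.index)
print(index_names)
print(index_levels)
```

Note that levels holds the sorted unique values of each level, not the row-by-row tuples, which is why 2014 appears only once in the first level.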
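2. Panel
- class pandas.Panel(data=None, items=None, major_axis=None, minor_axis=None, copy=False, dtype=None)
- A container for three-dimensional data
- Now deprecated: recent pandas versions have removed it, so the call below raises an error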
p = pd.Panel(np.arange(24).reshape(4,3,2),
items=list('ABCD'),
major_axis=pd.date_range('20130101', periods=3),
minor_axis=['first', 'second'])
D:\Anaconda\Anaconda3\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: The Panel class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version
"""Entry point for launching an IPython kernel.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-65-050bc3687ffb> in <module>
2 items=list('ABCD'),
3 major_axis=pd.date_range('20130101', periods=3),
----> 4 minor_axis=['first', 'second'])
TypeError: Panel() takes no arguments
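Since Panel can no longer be constructed, one common substitute is a DataFrame with a MultiIndex on its rows. The sketch below is an illustration under that assumption, not part of the original notes: it stores the same 4 x 3 x 2 array by folding the items and major_axis dimensions into a two-level index. The level names item and major are chosen here for illustration.

```python
import numpy as np
import pandas as pd

values = np.arange(24).reshape(4, 3, 2)   # same 4 x 3 x 2 array as above
items = list("ABCD")
major_axis = pd.date_range("20130101", periods=3)
minor_axis = ["first", "second"]

# Fold the first two axes into a two-level row index (item outer, date inner).
row_index = pd.MultiIndex.from_product([items, major_axis], names=["item", "major"])
frame = pd.DataFrame(values.reshape(12, 2), index=row_index, columns=minor_axis)

# Selecting one item yields the 3 x 2 table for that item.
slice_a = frame.loc["A"]
print(slice_a)
```

For data that is genuinely N-dimensional, the xarray package is another option, but it is a separate dependency.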
3.1.5 Series
- One-dimensional data with an index
- Has only a row index (index), no column index
- Attributes
- index
- values
- Methods
sr = data.iloc[1, :]
sr
sr.index
sr.values
2018-01-01 -0.801225
2018-01-02 1.522848
2018-01-03 -1.492739
2018-01-04 -0.549901
2018-01-05 -0.960143
Freq: B, Name: 股票-1, dtype: float64
DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
'2018-01-05'],
dtype='datetime64[ns]', freq='B')
array([-0.80122527, 1.52284836, -1.49273893, -0.54990148, -0.96014347])
1. Creating a Series
# Create from an existing array
pd.Series(np.arange(10))
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
dtype: int32
# Specify the index explicitly
pd.Series(np.arange(3, 9, 2), index=['a', 'b', 'c'])
a 3
b 5
c 7
dtype: int32
# Create from a dictionary (keys become the index)
pd.Series({'red':100, 'blue':200, 'green':500, 'yellow':1000})
red 100
blue 200
green 500
yellow 1000
dtype: int64
2. Getting the index and values of a Series
- index
- values
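The two attributes can be tried on the dictionary-based Series created earlier. This is a minimal sketch; the variable names colors, labels, and data are introduced here for illustration only.

```python
import pandas as pd

colors = pd.Series({"red": 100, "blue": 200, "green": 500, "yellow": 1000})

labels = colors.index.tolist()   # index labels, in insertion order
data = colors.values.tolist()    # underlying NumPy array, converted to a list

print(colors.index)     # an Index object holding the labels
print(colors.values)    # a NumPy array holding the data
print(colors["green"])  # label-based lookup of a single value
```

index returns a pandas Index object and values returns a NumPy array, so both can be passed on to further pandas or NumPy operations.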