Overview
The main index operations in pandas are:
- DataFrame.rename
- DataFrame.rename_axis
- DataFrame.reindex
- DataFrame.reindex_axis
- DataFrame.reset_index
- pandas.Index.reindex
- pandas.Index.set_names
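rename and rename_axis relabel the existing index; by default they return a new object and leave the original untouched. reindex and reindex_axis add or drop labels: a label that already exists keeps its values, and a label that does not exist is filled with missing values. reset_index replaces the index with a fresh default one.
The five DataFrame methods return a DataFrame.
The two Index methods return an Index (Index.reindex returns it together with an indexer array).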
rename
DataFrame.rename(index=None, columns=None, **kwargs)
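Parameters
- index, columns : scalar, list-like, dict-like or function, optional (a function or dict mapping must be one-to-one)
- copy : boolean, default True (also copy the underlying data)
- inplace : boolean, default False (modify the original object in place instead of returning a new one)
- level : int or level name, default None (used with a MultiIndex)
Returns
- renamed : DataFrame (new object)
Example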
In [1]: import pandas as pd
...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
...: df
Out[1]:
A B
0 1 4
1 2 5
2 3 6
In [2]: df.rename(index={0:3,1:4,2:5}, columns={"A": "a", "C": "c"})
Out[2]:
a B
3 1 4
4 2 5
5 3 6
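The mapper can also be a function, and a dict only affects the labels it mentions (which is why "C" in the call above has no effect). A minimal sketch, assuming a reasonably recent pandas; the variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# A function is applied to every label on the chosen axis.
lowered = df.rename(columns=str.lower)

# A dict only touches the labels it mentions; "C" is not a column, so it is skipped.
partial = df.rename(index={0: 3, 1: 4, 2: 5}, columns={"A": "a", "C": "c"})

# rename returns a new object by default, so df keeps its original labels.
print(list(lowered.columns))
print(list(partial.columns), list(partial.index))
print(list(df.columns))
```

Passing inplace=True would modify df itself instead of returning a new DataFrame.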
rename_axis
DataFrame.rename_axis(mapper, axis=0, copy=True, inplace=False)
Parameters
- mapper : scalar, list-like, dict-like or function, optional
- axis : int or string, default 0
- copy : boolean, default True
- inplace : boolean, default False
Returns
- renamed : type of caller
Example
In [1]: import pandas as pd
...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
...: df
Out[1]:
A B
0 1 4
1 2 5
2 3 6
In [2]: df.rename_axis({0:3,1:4,2:5})
Out[2]:
A B
3 1 4
4 2 5
5 3 6
In [3]: df.rename_axis({"A": "a", "C": "c"},axis=1)
Out[3]:
a B
0 1 4
1 2 5
2 3 6
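The dict form shown above reflects older pandas releases. Relabelling through rename_axis was deprecated in pandas 0.21; in recent releases rename_axis sets the name of an axis, and rename is the method that changes labels. A sketch under that assumption (the axis names "row_id" and "field" are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Newer pandas: rename_axis names the axes; the labels themselves are unchanged.
named = df.rename_axis(index="row_id", columns="field")

# Changing the labels is done with rename instead.
relabelled = df.rename(index={0: 3, 1: 4, 2: 5})

print(named.index.name, named.columns.name)
print(list(named.index), list(relabelled.index))
```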
reindex
Can also be used to fill in values for newly introduced labels.
DataFrame.reindex(index=None, columns=None, **kwargs)
Parameters
- index, columns : array-like, optional (can be specified in order, or as keywords)
- method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional (how to fill labels that are missing from the original index)
- copy : boolean, default True
- level : int or name
- fill_value : scalar, default np.NaN
- limit : int, default None
- tolerance : optional
Returns
- reindexed : DataFrame
Example
In [1]: import pandas as pd
...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
...: df
Out[1]:
A B
0 1 4
1 2 5
2 3 6
In [2]: df.reindex(index=(1,2,3))
Out[2]:
A B
1 2.0 5.0
2 3.0 6.0
3 NaN NaN
In [3]: df.reindex(columns=("B","C"))
Out[3]:
B C
0 4 NaN
1 5 NaN
2 6 NaN
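The fill behaviour mentioned above is controlled by fill_value and method. A minimal sketch of both, assuming the same small frame as in the example; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Label 3 does not exist in df: fill_value replaces the default NaN.
zero_filled = df.reindex(index=[1, 2, 3], fill_value=0)

# method="ffill" copies the nearest preceding row forward
# (this requires a monotonic index).
forward = df.reindex(index=[1, 2, 3], method="ffill")

print(zero_filled)
print(forward)
```

With method="ffill", the new label 3 takes the values of label 2, the closest existing label before it.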
reindex_axis
DataFrame.reindex_axis(labels, axis=0, method=None, level=None, copy=True, limit=None, fill_value=nan)
Parameters
- labels : array-like
- axis : {0 or 'index', 1 or 'columns'}
- method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}, optional
- copy : boolean, default True
- level : int or name
- limit : int, default None
- tolerance : optional
Returns
- reindexed : DataFrame
Example
In [1]: import pandas as pd
...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
...: df
Out[1]:
A B
0 1 4
1 2 5
2 3 6
In [2]: df.reindex_axis((1,2,3))
Out[2]:
A B
1 2.0 5.0
2 3.0 6.0
3 NaN NaN
In [3]: df.reindex_axis(("B","C"),axis=1)
Out[3]:
B C
0 4 NaN
1 5 NaN
2 6 NaN
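reindex_axis was deprecated in pandas 0.21 and is no longer available from pandas 1.0 onward, so the session above only runs on older releases. A sketch of the equivalent calls with reindex, assuming a newer pandas:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Counterpart of df.reindex_axis((1, 2, 3)): conform the row labels.
rows = df.reindex([1, 2, 3], axis=0)

# Counterpart of df.reindex_axis(("B", "C"), axis=1): conform the column labels.
cols = df.reindex(["B", "C"], axis=1)

print(rows)
print(cols)
```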
reset_index
DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')
Parameters
- level : int, str, tuple, or list, default None
- drop : boolean, default False
- inplace : boolean, default False
- col_level : int or str, default 0
- col_fill : object, default ''
Returns
- resetted : DataFrame
Example
In [1]: import pandas as pd
...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
...: df=df.reindex_axis((1,2,3))
...: df
Out[1]:
A B
1 2.0 5.0
2 3.0 6.0
3 NaN NaN
In [2]: df.reset_index()
Out[2]:
index A B
0 1 2.0 5.0
1 2 3.0 6.0
2 3 NaN NaN
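By default the old index is kept as a new column named "index"; drop=True discards it instead. A minimal sketch, assuming a frame whose index is 1, 2, 3 (the data is illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=[1, 2, 3])

kept = df.reset_index()              # old index becomes a column named "index"
dropped = df.reset_index(drop=True)  # old index is discarded

print(list(kept.columns))
print(list(dropped.columns), list(dropped.index))
```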
set_index
set_index turns an existing column into the index, whereas reset_index builds a new index of ascending integers (0, 1, 2, ...).
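The two methods undo each other, which a short round trip makes visible. A minimal sketch; using column "A" as the index is just an illustrative choice:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

by_a = df.set_index("A")       # column "A" becomes the index
restored = by_a.reset_index()  # "A" returns as a column, index is 0..n-1 again

print(by_a)
print(restored)
```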
Index.reindex
Index.reindex(target, method=None, level=None, limit=None, tolerance=None)
Parameters
- target : an iterable
Returns
- new_index : pd.Index
- indexer : np.ndarray or None
Example
In [1]: import pandas as pd
...: df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
...: df
Out[1]:
A B
0 1 4
1 2 5
2 3 6
In [2]: df.index.reindex((1,2,3))
Out[2]: (Int64Index([1, 2, 3], dtype='int64'), array([ 1, 2, -1], dtype=int64))
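The second element of the returned tuple is the part that usually needs explaining: for each requested label it gives the position in the original index, with -1 marking a label that was not found. A sketch on a standalone Index (the exact repr of the index type varies between pandas versions, so only the values are printed):

```python
import pandas as pd

index = pd.Index([0, 1, 2])

# new_index holds the requested labels; indexer gives, for each of them,
# its position in the original index, or -1 when the label is missing.
new_index, indexer = index.reindex([1, 2, 3])

print(list(new_index))
print(list(indexer))
```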
Index.set_names
Index.set_names(names, level=None, inplace=False)
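A short sketch of how set_names might be used, both on a flat Index and on one level of a MultiIndex. The names "row_id", "k1", "k2" and "letter" are hypothetical and chosen only for illustration:

```python
import pandas as pd

index = pd.Index([0, 1, 2])
named = index.set_names("row_id")  # returns a new Index; the original is untouched

multi = pd.MultiIndex.from_tuples([("a", 1), ("b", 2)], names=["k1", "k2"])
renamed = multi.set_names("letter", level=0)  # rename only the first level

print(named.name, index.name)
print(list(renamed.names))
```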