DataFrame.insert(loc, column, value, allow_duplicates=False)
Insert column into DataFrame at specified location.
Raises a ValueError if column is already contained in the DataFrame, unless allow_duplicates is set to True.
Parameters:
loc : int
Insertion index, given as an integer position. Must satisfy 0 <= loc <= len(columns).
column : string, number, or hashable object
Label of the inserted column.
value : int, Series, or array-like
Content of the inserted column.
allow_duplicates : bool, optional
If the DataFrame already contains a column with this label, the column can only be inserted when allow_duplicates is set to True.
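The bounds on loc and the accepted value types can be illustrated with a small sketch. The frame and the column names 'first', 'last' and 'flag' are made up for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x3 frame used only for this illustration.
df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=['a', 'b', 'c'])

# loc=0 puts the new column first; a scalar value is broadcast to every row.
df.insert(0, 'first', 7)

# loc == len(df.columns) is the largest valid position and appends at the end.
df.insert(len(df.columns), 'last', [10, 20])

# insert() modifies df in place and returns None.
result = df.insert(1, 'flag', True)

print(list(df.columns))  # ['first', 'flag', 'a', 'b', 'c', 'last']
print(result)            # None
```

Because insert() works in place, assigning its return value back to df would replace the frame with None.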
Worked example:
import numpy as np
import pandas as pd
from pandas import DataFrame, Series
df = pd.DataFrame(np.arange(12).reshape(4,3),columns=['a','b','c'])
df
Out[4]:
   a   b   c
0  0   1   2
1  3   4   5
2  6   7   8
3  9  10  11
Insert data as the second column (loc=1):
df.insert(1,'d',np.ones(4))
df
Out[6]:
   a    d   b   c
0  0  1.0   1   2
1  3  1.0   4   5
2  6  1.0   7   8
3  9  1.0  10  11
If allow_duplicates=True is not set and the column being added already exists, a ValueError is raised:
df.insert(1,'d',np.ones(4))
Traceback (most recent call last):
File "<ipython-input-11-0e09dfb193a4>", line 1, in <module>
df.insert(1,'d',np.ones(4))
File "C:\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2449, in insert
allow_duplicates=allow_duplicates)
File "C:\Anaconda3\lib\site-packages\pandas\core\internals.py", line 3510, in insert
raise ValueError('cannot insert %s, already exists' % item)
ValueError: cannot insert d, already exists
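If the duplicate insert might happen in a script rather than an interactive session, the error can be caught instead of letting it propagate. The following is a minimal sketch that rebuilds the same frame as above; the variable names message and columns_after are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the example, including the first insert of 'd'.
df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=['a', 'b', 'c'])
df.insert(1, 'd', np.ones(4))

message = None
try:
    # Second insert with the same label and the default allow_duplicates=False.
    df.insert(1, 'd', np.ones(4))
except ValueError as exc:
    message = str(exc)
    print(message)

# The failed call leaves the frame unchanged.
columns_after = list(df.columns)
print(columns_after)  # ['a', 'd', 'b', 'c']
```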
To insert a column whose label already exists, pass allow_duplicates=True:
df.insert(1, 'd', np.ones(4), allow_duplicates=True)
df
Out[13]:
   a    d    d   b   c
0  0  1.0  1.0   1   2
1  3  1.0  1.0   4   5
2  6  1.0  1.0   7   8
3  9  1.0  1.0  10  11
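One consequence of duplicate labels is worth knowing: selecting by the shared label returns every matching column, so positional selection is needed to pick out one of them. The sketch below inserts zeros for the second 'd' column (a change from the example above, made only so the two columns can be told apart).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=['a', 'b', 'c'])
df.insert(1, 'd', np.ones(4))
# Zeros instead of ones here, purely to distinguish the two 'd' columns.
df.insert(1, 'd', np.zeros(4), allow_duplicates=True)

both = df['d']          # DataFrame holding both columns labelled 'd'
newest = df.iloc[:, 1]  # Series: the column at position 1, selected by position

print(type(both).__name__, both.shape)  # DataFrame (4, 2)
print(newest.tolist())                  # [0.0, 0.0, 0.0, 0.0]
print(df.columns.is_unique)             # False
```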