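The Pandas DataFrame append() function appends rows from another DataFrame object. It returns a new DataFrame object and doesn't change the source objects. If the columns don't match, the new columns are added to the result DataFrame and the missing cells are filled with NaN. Note that append() was deprecated in pandas 1.4 and removed in pandas 2.0, where pd.concat() is the replacement; the examples here apply to the older versions.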
1. Pandas DataFrame append() Parameters
The append() function syntax is:
append(other, ignore_index=False, verify_integrity=False, sort=None)
- other: the DataFrame, Series or dict-like object whose rows will be added to the caller DataFrame.
- ignore_index: if True, the indexes from the source DataFrame objects are ignored.
- verify_integrity: if True, raise ValueError on creating an index with duplicates.
- sort: sort columns if the source DataFrame columns are not aligned. Sorting by default is deprecated, so we have to pass sort=True to sort and silence the warning message. If sort=False is passed, the columns are not sorted and no warning is shown.
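Because append() is not available on pandas 2.0 and later, the parameters above can be mapped onto pd.concat(), which accepts the same ignore_index, verify_integrity and sort keywords. The following is a minimal sketch under that assumption; the helper name append_rows is hypothetical and not part of pandas.

```python
import pandas as pd


def append_rows(df, other, ignore_index=False, verify_integrity=False, sort=False):
    """Hypothetical helper (not a pandas API): mimic df.append(other, ...) via pd.concat."""
    # A plain dict is treated as a single row, so wrap it in a one-row DataFrame.
    if isinstance(other, dict):
        other = pd.DataFrame([other])
    return pd.concat(
        [df, other],
        ignore_index=ignore_index,
        verify_integrity=verify_integrity,
        sort=sort,
    )


df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
row = {'Name': 'David', 'ID': 3}

# A dict carries no index of its own, so ignore_index=True renumbers the result.
df3 = append_rows(df1, row, ignore_index=True)
print(df3)
```

As with append(), the helper returns a new DataFrame and leaves df1 untouched.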
Let’s look into some examples of the DataFrame append() function.
2. Appending Two DataFrames
import pandas as pd
df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})
print(df1)
print(df2)
df3 = df1.append(df2)
print('\nResult DataFrame:\n', df3)
Output:
Name ID
0 Pankaj 1
1 Lisa 2
Name ID
0 David 3
Result DataFrame:
Name ID
0 Pankaj 1
1 Lisa 2
0 David 3
3. Appending and Ignoring DataFrame Indexes
If you look at the previous example, the output contains duplicate indexes. We can pass ignore_index=True to ignore the source indexes and assign a new index to the output DataFrame.
df3 = df1.append(df2, ignore_index=True)
print(df3)
Output:
Name ID
0 Pankaj 1
1 Lisa 2
2 David 3
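On pandas 2.0 and later, where append() is no longer available, the same renumbering can be sketched with pd.concat(), which takes an ignore_index keyword of its own. This is a swapped-in equivalent rather than the append() call itself; the variable name dup is only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})

# Without ignore_index the source labels are kept, so label 0 appears twice.
dup = pd.concat([df1, df2])
print(dup.index.is_unique)

# With ignore_index=True the result gets a fresh 0..n-1 index.
df3 = pd.concat([df1, df2], ignore_index=True)
print(df3)
```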
4. Raise ValueError for Duplicate Indexes
We can pass verify_integrity=True to raise a ValueError if the two DataFrame objects have duplicate indexes.
import pandas as pd
df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})
df3 = df1.append(df2, verify_integrity=True)
Output:
ValueError: Indexes have overlapping values: Int64Index([0], dtype='int64')
Let’s look at another example where we don’t have duplicate indexes.
import pandas as pd
df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]}, index=[100, 200])
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]}, index=[300])
df3 = df1.append(df2, verify_integrity=True)
print(df3)
Output:
Name ID
100 Pankaj 1
200 Lisa 2
300 David 3
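Both cases of this section can also be sketched with pd.concat(), which supports verify_integrity as well and is the available route on pandas 2.0 and later. Catching the exception, as done here, is only one way to handle the check; the names overlap_error, df1b and df2b are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})

# Both frames use the default index, so label 0 overlaps and the check fails.
overlap_error = None
try:
    pd.concat([df1, df2], verify_integrity=True)
except ValueError as err:
    overlap_error = err
    print('ValueError:', err)

# With distinct index labels the same call succeeds.
df1b = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]}, index=[100, 200])
df2b = pd.DataFrame({'Name': ['David'], 'ID': [3]}, index=[300])
df3 = pd.concat([df1b, df2b], verify_integrity=True)
print(df3)
```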
5. Appending DataFrame Objects with Non-Matching Columns
import pandas as pd
df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'David'], 'ID': [1, 3], 'Role': ['CEO', 'Author']})
df3 = df1.append(df2, sort=False)
print(df3)
Output:
Name ID Role
0 Pankaj 1 NaN
1 Lisa 2 NaN
0 Pankaj 1 CEO
1 David 3 Author
We are explicitly passing sort=False to avoid sorting the columns and to suppress the FutureWarning. If you don't pass this parameter, the output will contain the following warning message.
FutureWarning: Sorting because the non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.
To accept the future behavior, pass 'sort=False'.
To retain the current behavior and silence the warning, pass 'sort=True'.
Let's see what happens when we pass sort=True.
import pandas as pd
df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'David'], 'ID': [1, 3], 'Role': ['CEO', 'Author']})
df3 = df1.append(df2, sort=True)
print(df3)
Output:
ID Name Role
0 1 Pankaj NaN
1 2 Lisa NaN
0 1 Pankaj CEO
1 3 David Author
Notice that the columns are sorted in the result DataFrame object. Note that it is the implicit sorting (when sort is not passed) that is deprecated: from pandas 1.0 on, the columns are not sorted unless sort=True is passed.
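The effect of the sort flag on column order can be checked directly. The sketch below uses pd.concat() instead of append(), on the assumption that it runs on pandas 2.0 or later; the variable names unsorted and sorted_cols are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'David'], 'ID': [1, 3], 'Role': ['CEO', 'Author']})

# sort=False keeps the columns in order of first appearance.
unsorted = pd.concat([df1, df2], sort=False)
print(list(unsorted.columns))

# sort=True orders the column labels alphabetically.
sorted_cols = pd.concat([df1, df2], sort=True)
print(list(sorted_cols.columns))
```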
Let’s look at another example where we have non-matching columns with int values.
import pandas as pd
df1 = pd.DataFrame({'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'Lisa']})
df3 = df1.append(df2, sort=False)
print(df3)
Output:
ID Name
0 1.0 NaN
1 2.0 NaN
0 NaN Pankaj
1 NaN Lisa
Notice that the ID values are changed to floating-point numbers, because an int column cannot hold NaN values.
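If the IDs should stay integers, one option is to convert the column afterwards to the nullable 'Int64' extension dtype, which represents missing values as <NA>. This is a sketch of one possible workaround, not something the append() API does for you, and it uses pd.concat() so that it also runs on pandas 2.0 and later.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'Lisa']})

df3 = pd.concat([df1, df2], sort=False)
# The missing IDs force the column to float64.
float_dtype = str(df3['ID'].dtype)
print(float_dtype)

# Nullable integer dtype: whole-number floats convert cleanly, NaN becomes <NA>.
df3['ID'] = df3['ID'].astype('Int64')
print(df3)
```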
6. References
- Python Pandas Module Tutorial
- Pandas concat() function
- Pandas DataFrame append() API Docs
Translated from: https://www.journaldev.com/33465/pandas-dataframe-append-function