Pandas DataFrame append() Function

The Pandas DataFrame append() function adds the rows of another DataFrame object to the end of the caller. It returns a new DataFrame object and does not change the source objects. If the columns do not match, the extra columns are added to the result DataFrame and the missing values are filled with NaN. Note that append() was deprecated in pandas 1.4 and removed in pandas 2.0; in current releases pandas.concat() is the replacement.

1. Pandas DataFrame append() Parameters

The append() function syntax is:

append(other, ignore_index=False, verify_integrity=False, sort=None)
  • other: the DataFrame, Series, or dict-like object whose rows are appended to the caller DataFrame.
  • ignore_index: if True, the index labels of the source objects are discarded and the result is labeled 0, 1, ..., n-1.
  • verify_integrity: if True, raise ValueError if the resulting index contains duplicates.
  • sort: whether to sort the columns when the columns of the two objects are not aligned. Starting with pandas 0.23, leaving sort unset in that situation sorted the columns and emitted a FutureWarning, because that default was deprecated; since pandas 1.0 the default is sort=False. Pass sort=True to sort the columns or sort=False to leave them unsorted; either explicit value silences the warning.
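Since append() is not available in pandas 2.0 and later, the parameters above can be mapped onto pandas.concat(), which accepts the same ignore_index, verify_integrity, and sort keywords. The following is a minimal sketch, not part of the original article; the helper name append_rows is hypothetical, and it assumes that other is already a DataFrame.

```python
import pandas as pd


def append_rows(df, other, ignore_index=False, verify_integrity=False, sort=False):
    """Hypothetical helper that mimics DataFrame.append() via pd.concat()."""
    # Assumption: `other` is a DataFrame. A Series or dict would have to be
    # converted to a DataFrame before calling this helper.
    return pd.concat(
        [df, other],
        ignore_index=ignore_index,
        verify_integrity=verify_integrity,
        sort=sort,
    )


df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})

# Returns a new DataFrame; df1 and df2 are left unchanged.
df3 = append_rows(df1, df2)
print(df3)
```

Like append(), pd.concat() builds a new object, so calling it repeatedly inside a loop copies the data each time; collecting the pieces in a list and concatenating once is usually cheaper.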

Let’s look into some examples of the DataFrame append() function.

2. Appending Two DataFrames

import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})

print(df1)
print(df2)

df3 = df1.append(df2)
print('\nResult DataFrame:\n', df3)

Output:

     Name  ID
0  Pankaj   1
1    Lisa   2
    Name  ID
0  David   3

Result DataFrame:
      Name  ID
0  Pankaj   1
1    Lisa   2
0   David   3

3. Appending and Ignoring DataFrame Indexes

In the previous example, the output contains duplicate index labels (0 appears twice). We can pass ignore_index=True to discard the source indexes and assign a new 0-based index to the output DataFrame.

df3 = df1.append(df2, ignore_index=True)
print(df3)

Output:

     Name  ID
0  Pankaj   1
1    Lisa   2
2   David   3
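On pandas versions where append() has been removed, the same renumbering can be sketched with pd.concat(). The variable names renumbered and reset below are illustrative only; the second form shows that resetting the index after concatenation gives the same result.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})

# Discard the source indexes and label the rows 0..n-1.
renumbered = pd.concat([df1, df2], ignore_index=True)

# Two-step alternative: concatenate first, then drop the old index.
reset = pd.concat([df1, df2]).reset_index(drop=True)

print(renumbered)
print(renumbered.equals(reset))
```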

4. Raising ValueError for Duplicate Indexes

We can pass verify_integrity=True to raise a ValueError if the two DataFrame objects share any index labels, so that the result would contain duplicate indexes.

import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})

df3 = df1.append(df2, verify_integrity=True)

Output:

ValueError: Indexes have overlapping values: Int64Index([0], dtype='int64')

Let’s look at another example where we don’t have duplicate indexes.

import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]}, index=[100, 200])

df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]}, index=[300])

df3 = df1.append(df2, verify_integrity=True)

print(df3)

Output:

       Name  ID
100  Pankaj   1
200    Lisa   2
300   David   3
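The integrity check can be handled in code rather than left to crash the script. The sketch below uses pd.concat(), which takes the same verify_integrity flag, and also shows Index.is_unique as a way to inspect a result after the fact. The flag name overlap_detected is an illustrative choice, not something from the original article.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['David'], 'ID': [3]})

# Both frames use the default index, so label 0 appears in each of them.
try:
    pd.concat([df1, df2], verify_integrity=True)
    overlap_detected = False
except ValueError as err:
    overlap_detected = True
    print('ValueError:', err)

# Without the flag the concatenation succeeds; the duplicates can be
# detected afterwards through the index.
combined = pd.concat([df1, df2])
print(combined.index.is_unique)
```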

5. Appending DataFrame Objects with Non-Matching Columns

import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'David'], 'ID': [1, 3], 'Role': ['CEO', 'Author']})

df3 = df1.append(df2, sort=False)

print(df3)

Output:

     Name  ID    Role
0  Pankaj   1     NaN
1    Lisa   2     NaN
0  Pankaj   1     CEO
1   David   3  Author

We explicitly pass sort=False to keep the columns unsorted and to suppress the FutureWarning. If you omit this parameter on a pandas version that still has the old default (0.23 to 0.25), the output contains the following warning message.

FutureWarning: Sorting because the non-concatenation axis is not aligned. A future version
of pandas will change to not sort by default.

To accept the future behavior, pass 'sort=False'.

To retain the current behavior and silence the warning, pass 'sort=True'.

Let’s see what happens when we pass sort=True.

import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'David'], 'ID': [1, 3], 'Role': ['CEO', 'Author']})

df3 = df1.append(df2, sort=True)

print(df3)

Output:

   ID    Name    Role
0   1  Pankaj     NaN
1   2    Lisa     NaN
0   1  Pankaj     CEO
1   3   David  Author

Notice that the columns are sorted alphabetically in the result DataFrame object. Note that what was deprecated is the implicit sorting when sort is not specified; passing sort=True explicitly remains supported.
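The effect of sort on column order can be checked directly by comparing the column lists. This is a small sketch using pd.concat(), which takes the same sort keyword; the variable names unsorted and sorted_cols are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Pankaj', 'Lisa'], 'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'David'], 'ID': [1, 3], 'Role': ['CEO', 'Author']})

# sort=False keeps the columns in order of first appearance.
unsorted = pd.concat([df1, df2], sort=False)
print(list(unsorted.columns))     # ['Name', 'ID', 'Role']

# sort=True orders the non-aligned columns alphabetically.
sorted_cols = pd.concat([df1, df2], sort=True)
print(list(sorted_cols.columns))  # ['ID', 'Name', 'Role']
```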

Let’s look at another example, where the two DataFrames have no columns in common and one of the columns holds int values.

import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'Lisa']})

df3 = df1.append(df2, sort=False)
print(df3)

Output:

    ID    Name
0  1.0     NaN
1  2.0     NaN
0  NaN  Pankaj
1  NaN    Lisa

Notice that the ID values are converted to floating-point numbers. NaN is a float value and cannot be stored in a regular integer column, so pandas upcasts the column to float64.
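If the IDs should stay integers, one option is to convert the column to the nullable Int64 extension dtype after combining the frames. This is a sketch of that workaround under the assumption that a pandas version with nullable integer support is installed; it is not something the original article covers, and dtype_before is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2]})
df2 = pd.DataFrame({'Name': ['Pankaj', 'Lisa']})

df3 = pd.concat([df1, df2], sort=False)
dtype_before = df3['ID'].dtype
print(dtype_before)  # float64, because of the NaN values

# Convert to the nullable integer dtype; missing entries become <NA>.
df3 = df3.astype({'ID': 'Int64'})
print(df3)
```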

6. References

Translated from: https://www.journaldev.com/33465/pandas-dataframe-append-function
