[Python3] Pandas v1.0: (4) Combining Datasets


[ Pandas version: 1.0.1 ]


6. Combining Datasets: Concat and Append

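Combining data from different sources includes:

  • Simple concatenation of two different datasets
  • Database-style join and merge operations for datasets with overlapping fields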
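
import numpy as np
import pandas as pd

# Helper that builds a DataFrame of a fixed form
def make_df(cols, ind):
    """Quickly make a simple DataFrame."""
    # Each cell is the column label followed by the index label, e.g. 'A0'
    data = {c: [str(c) + str(i) for i in ind] for c in cols}
    return pd.DataFrame(data, ind)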

# Example DataFrame
make_df('ABC', range(3))

#     A   B   C
# 0  A0  B0  C0
# 1  A1  B1  C1
# 2  A2  B2  C2

(1) Concatenating NumPy arrays: np.concatenate()

x = [1, 2, 3]
y = [4, 5, 6]
z = [7, 8, 9]
np.concatenate([x, y, z])
# array([1, 2, 3, 4, 5, 6, 7, 8, 9])

x = [[1, 2], [3, 4]]
np.concatenate([x, x], axis=1)
# array([[1, 2, 1, 2],
#        [3, 4, 3, 4]])
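
The two-dimensional example above uses axis=1 only. As a minimal sketch for comparison, the snippet below concatenates the same array along both axes; the names `rows` and `cols` are illustrative and not from the original post.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# axis=0 is the default: the arrays are stacked row-wise
rows = np.concatenate([x, x])
# axis=1: the arrays are placed side by side, column-wise
cols = np.concatenate([x, x], axis=1)

print(rows.shape)  # (4, 2)
print(cols.shape)  # (2, 4)
```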

(2) Simple concatenation with pd.concat

pd.concat() accepts more parameters than np.concatenate() and is correspondingly more powerful.
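
One practical difference is that pd.concat() carries the index and column labels along with the data. The sketch below reuses the make_df helper defined earlier (repeated here so the snippet runs on its own); the variable names `stacked` and `wide` are illustrative assumptions.

```python
import pandas as pd

def make_df(cols, ind):
    """Quickly make a simple DataFrame (same helper as earlier)."""
    data = {c: [str(c) + str(i) for i in ind] for c in cols}
    return pd.DataFrame(data, ind)

df1 = make_df('AB', [1, 2])
df2 = make_df('AB', [3, 4])
df3 = make_df('CD', [1, 2])

# Default axis=0: rows are stacked and the original index labels are kept
stacked = pd.concat([df1, df2])
# axis=1: columns are placed side by side, aligned on the shared index
wide = pd.concat([df1, df3], axis=1)

print(stacked)
print(wide)
```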

pandas.concat — pandas 1.0.3 documentation

pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)

Parameters:

objs:	a sequence or mapping of Series or DataFrame objects
		If a dict is passed, the sorted keys will be used as the keys argument, unless it is passed, in which case the values will be selected (see below). Any None objects will be dropped silently unless they are all None in which case a ValueError will be raised.

axis:	{0/'index', 1/'columns'}, default 0
		The axis to concatenate along.

join:	{'inner', 'outer'}, default 'outer'
		How to handle indexes on other axis (or axes).

ignore_index:	bool, default False
		If True, do not use the index values along the concatenation axis. The resulting axis will be labeled 0, ..., n - 1. This is useful if you are concatenating objects where the concatenation axis does not have meaningful indexing information. Note the index values on the other axes are still respected in the join.

keys:	sequence, default None
		If multiple levels passed, should contain tuples. Construct hierarchical index using the passed keys as the outermost level.

levels:	list of sequences, default None
		Specific levels (unique values) to use for constructing a MultiIndex. Otherwise they will be inferred from the keys.

names:	list, default None
		Names for the levels in the resulting hierarchical index.

verify_integrity:	bool, default False
		Check whether the new concatenated axis contains duplicates. This can be very expensive relative to the actual data concatenation.

sort:	bool, default False
		Sort non-concatenation axis if it is not already aligned when join is ‘outer’. This has no effect when join='inner', which already preserves the order of the non-concatenation axis.

copy:	bool, default True
		If False, do not copy data unnecessarily.
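
Several of the parameters listed above are easier to follow on a small example. The following is a minimal sketch, assuming two toy DataFrames (`df1`, `df2`) that share one column and have overlapping index labels; the frames and variable names are illustrative and not from the original post.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']}, index=[0, 1])
df2 = pd.DataFrame({'B': ['B2', 'B3'], 'C': ['C2', 'C3']}, index=[0, 1])

# join='outer' (default): union of the columns, missing cells become NaN
outer = pd.concat([df1, df2])

# join='inner': only the columns present in every input are kept
inner = pd.concat([df1, df2], join='inner')

# ignore_index=True: discard the original labels and renumber 0..n-1
renumbered = pd.concat([df1, df2], ignore_index=True)

# keys: add an outer index level that records which input each row came from
keyed = pd.concat([df1, df2], keys=['x', 'y'])

# verify_integrity=True: raise ValueError if the resulting index has duplicates
try:
    pd.concat([df1, df2], verify_integrity=True)
    duplicates_found = False
except ValueError:
    duplicates_found = True

print(outer)
print(inner)
print(renumbered)
print(keyed)
print(duplicates_found)
```

Here both inputs use the index labels 0 and 1, so the plain concatenation contains repeated labels; ignore_index, keys, and verify_integrity are three different ways of dealing with that.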

Returns:	object, type of objs
		When concatenating all Series along the index (axis=0), a Series is returned. When objs contains at least one DataFrame, a DataFrame is returned. When concatenating along the columns (axis=1), a DataFrame is returned.
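
The return-type rule can be checked directly. This is a small sketch with two illustrative Series (the values and index labels are assumptions chosen for the example):

```python
import pandas as pd

ser1 = pd.Series(['A', 'B', 'C'], index=[1, 2, 3])
ser2 = pd.Series(['D', 'E', 'F'], index=[4, 5, 6])

# All inputs are Series and axis=0: the result is a Series
along_index = pd.concat([ser1, ser2])

# Concatenating along the columns (axis=1): the result is a DataFrame
along_columns = pd.concat([ser1, ser2], axis=1)

# At least one DataFrame among the inputs: the result is a DataFrame
mixed = pd.concat([ser1.to_frame('x'), ser2])

print(type(along_index).__name__)
print(type(along_columns).__name__)
print(type(mixed).__name__)
```
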
# One-dimensional concatenation
ser1 