pandas_Arithmetic and broadcasting (np/df/Series)_Functions and mapping

火锅午餐肉

Category: python_pandas

Article link: https://blog.csdn.net/weixin_38717734/article/details/94588356

Arithmetic and broadcasting

Series

Create two one-dimensional Series:
    import pandas as pd

    s1 = pd.Series([4.2, 2.6, 5.4, -1.9], index=list('acde'))

    s2 = pd.Series([-2.3, 1.2, 5.6, 7.2, 3.4], index=list('acefg'))

    s1
    a    4.2
    c    2.6
    d    5.4
    e   -1.9
    dtype: float64

    s2
    a   -2.3
    c    1.2
    e    5.6
    f    7.2
    g    3.4
    dtype: float64

Arithmetic on the data
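For the two Series above, a minimal sketch of how arithmetic aligns on index labels. The variable names `total` and `total_filled` are illustrative and not from the original post.

```python
import pandas as pd

s1 = pd.Series([4.2, 2.6, 5.4, -1.9], index=list('acde'))
s2 = pd.Series([-2.3, 1.2, 5.6, 7.2, 3.4], index=list('acefg'))

# The + operator aligns on index labels; a label present in only one
# Series produces NaN in the result.
total = s1 + s2

# add() with fill_value=0 treats a label missing from one side as 0,
# so no NaN appears here because every label exists in at least one Series.
total_filled = s1.add(s2, fill_value=0)

print(total)
print(total_filled)
```

The result index is the union of both indexes (a, c, d, e, f, g); only the labels shared by both Series (a, c, e) get a numeric value from the plain `+`.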

df1.add(df2, fill_value=0)  # a value missing from one frame is treated as 0; NaN remains only where both frames lack the value
         b    c     d     e
five   6.0  NaN   7.0   8.0
one    0.0  1.0   2.0   NaN
six    9.0  NaN  10.0  11.0
three  9.0  7.0  12.0   5.0
two    3.0  4.0   6.0   2.0
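The post does not show how `df1` and `df2` were built. The definitions below are an assumption, chosen only because they reproduce the printed output above; the original frames may have been created differently.

```python
import numpy as np
import pandas as pd

# Assumed definitions (not in the original post), picked to match its output.
df1 = pd.DataFrame(np.arange(9).reshape(3, 3),
                   columns=list('bcd'), index=['one', 'two', 'three'])
df2 = pd.DataFrame(np.arange(12).reshape(4, 3),
                   columns=list('bde'), index=['two', 'three', 'five', 'six'])

# Plain + gives NaN wherever either frame lacks the (row, column) cell.
plain = df1 + df2

# fill_value=0 substitutes 0 when exactly one frame lacks the cell;
# cells missing from both frames (e.g. row 'five', column 'c') stay NaN.
filled = df1.add(df2, fill_value=0)

print(plain)
print(filled)
```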

Redefine the columns of df1 so that they follow the columns of df2:
    df1.reindex(columns=df2.columns, fill_value=0)  # columns that df1 lacks are filled with 0
           b  d  e
    one    0  2  0
    two    3  5  0
    three  6  8  0

Methods that work like add:
add: addition
sub: subtraction
div: division
floordiv: floor (integer) division
mul: multiplication
pow: exponentiation
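As a quick check of the list above, each method should give the same result as its operator. The small frame `df` is made up for illustration. Each method also has a reflected form prefixed with `r` (for example `rdiv`), which swaps the operands.

```python
import pandas as pd

# Illustrative data, not from the original post.
df = pd.DataFrame({'x': [1.0, 2.0, 4.0], 'y': [8.0, 16.0, 32.0]})

print(df.add(1).equals(df + 1))        # add      <-> +
print(df.sub(1).equals(df - 1))        # sub      <-> -
print(df.mul(2).equals(df * 2))        # mul      <-> *
print(df.div(2).equals(df / 2))        # div      <-> /
print(df.floordiv(3).equals(df // 3))  # floordiv <-> //
print(df.pow(2).equals(df ** 2))       # pow      <-> **

# Reflected form: df.rdiv(1) computes 1 / df rather than df / 1.
print(df.rdiv(1).equals(1 / df))
```

The method forms are mainly useful because they accept extra arguments such as `fill_value` and `axis`, which the operators cannot take.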

numpy

a = np.arange(12).reshape(3,4)

a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

a[0]   # the first row of a, which is a one-dimensional array
array([0, 1, 2, 3])

a - a[0] # 2-D array minus 1-D array: the 1-D array is broadcast across every row
array([[0, 0, 0, 0],
       [4, 4, 4, 4],
       [8, 8, 8, 8]])
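The example above subtracts a row from every row. To subtract a column from every column instead, the 1-D array needs an explicit trailing axis so that its shape is (3, 1). A minimal sketch follows; the names `row_diff`, `first_col` and `col_diff` are illustrative.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Shape (3, 4) minus shape (4,): the row is repeated down the 3 rows.
row_diff = a - a[0]

# a[:, 0] has shape (3,), which does not line up with the last axis of a.
# Adding a new axis gives shape (3, 1), which broadcasts across the 4 columns.
first_col = a[:, 0]
col_diff = a - first_col[:, np.newaxis]

print(row_diff)
print(col_diff)
```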

DataFrame
Operations between a DataFrame and a Series work in a similar way:

df = pd.DataFrame(np.arange(12).reshape(4,3),columns=list('bde'),index=['one','two','three','four'])
s = df.iloc[0]  # take the first row of df as a Series

 df
       b   d   e
one    0   1   2
two    3   4   5
three  6   7   8
four   9  10  11

 s
b    0
d    1
e    2
Name: one, dtype: int32

df - s # the Series index is matched to the columns of df, and the subtraction is broadcast down the rows
       b  d  e
one    0  0  0
two    3  3  3
three  6  6  6
four   9  9  9
#---------------------------------------------------------

s2 = pd.Series(range(3), index=list('bef'))
df + s2  # labels that appear in only one of df's columns or s2's index produce missing values

         b   d     e   f
one    0.0 NaN   3.0 NaN
two    3.0 NaN   6.0 NaN
three  6.0 NaN   9.0 NaN
four   9.0 NaN  12.0 NaN
#---------------------------------------------------------

s3 = df['d'] # take one column of df

s3
one       1
two       4
three     7
four     10
Name: d, dtype: int32

df.sub(s3, axis='index')  # match s3 against the row index, so it is broadcast across the columns
       b  d  e
one   -1  0  1
two   -1  0  1
three -1  0  1
four  -1  0  1
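To see why `axis='index'` matters here, the sketch below compares it with the plain `-` operator, which always matches the Series index against the DataFrame's columns. The names `by_columns` and `by_rows` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=list('bde'),
                  index=['one', 'two', 'three', 'four'])
s3 = df['d']

# Plain subtraction matches s3's labels (one, two, ...) against the columns
# (b, d, e). Nothing matches, so the result holds only NaN and its columns
# are the union of both label sets.
by_columns = df - s3

# axis='index' matches s3's labels against the row index instead.
by_rows = df.sub(s3, axis='index')

print(by_columns)
print(by_rows)
```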

Functions and mapping

apply(max/min, axis=0): the maximum/minimum value of each column
apply(max/min, axis=1): the maximum/minimum value of each row
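A small sketch of the two axis settings, using made-up integer data so the results are easy to check by hand. Lambdas are used here instead of the built-in `max`/`min`; that is a choice of this example, not something the post prescribes.

```python
import pandas as pd

# Illustrative data, not from the original post.
df = pd.DataFrame({'b': [3, -1, 4], 'd': [1, 5, -9], 'e': [2, 6, 5]},
                  index=['one', 'two', 'three'])

# axis=0: the function receives each column in turn.
col_max = df.apply(lambda col: col.max(), axis=0)

# axis=1: the function receives each row in turn.
row_min = df.apply(lambda row: row.min(), axis=1)

print(col_max)  # indexed by column label: b, d, e
print(row_min)  # indexed by row label: one, two, three
```

For simple reductions like these, `df.max(axis=0)` and `df.min(axis=1)` give the same results without `apply`.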

    Create a sample table
    # Some NumPy universal functions (ufuncs) also work on pandas objects:
    import pandas as pd
    import numpy as np
    df = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=['one', 'two', 'three', 'four'])
    df
                  b         d         e
    one   -1.194842 -1.372962  0.723438
    two    0.180274 -0.117977 -0.172359
    three  0.115074 -0.586764  0.570921
    four   1.095042  0.721313 -0.287133
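The comment above says NumPy ufuncs work on pandas objects, but the post does not show one being applied. A minimal sketch with `np.abs` follows; the seeded generator is an assumption added only to make the run repeatable.

```python
import numpy as np
import pandas as pd

# Seeded generator so the values are repeatable; the seed is arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 3)), columns=list('bde'),
                  index=['one', 'two', 'three', 'four'])

# The ufunc is applied element by element and returns a DataFrame
# with the same index and columns.
abs_df = np.abs(df)

print(abs_df)
```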

#-------------------------------------------------------------

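Mapping a function
# Difference between the maximum and minimum of each column of df
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=['one', 'two', 'three', 'four'])
f = lambda x: x.max() - x.min()
df.apply(f, axis='index')  # axis='index' passes each column to f; axis='columns' would pass each row instead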
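Because the `axis` argument of `apply` is easy to read the wrong way round, here is a sketch on fixed data that runs the same range function along both axes. The integer frame replaces the random one only so the results can be checked by hand.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=list('bde'),
                  index=['one', 'two', 'three', 'four'])

f = lambda x: x.max() - x.min()

# axis='index' (same as axis=0): f receives each column,
# so the result has one value per column.
per_column = df.apply(f, axis='index')

# axis='columns' (same as axis=1): f receives each row,
# so the result has one value per row.
per_row = df.apply(f, axis='columns')

print(per_column)
print(per_row)
```

Here every column spans 0..9, 1..10 or 2..11, so each per-column range is 9, while each row holds three consecutive integers, so each per-row range is 2.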
