Arithmetic and Data Alignment (NaN)
from pandas import Series,DataFrame
import numpy as np
import pandas as pd
Series
s1 = Series([-7.3,-2.5,3.4,1.5],index=['a','c','d','e'])
s1
a -7.3
c -2.5
d 3.4
e 1.5
dtype: float64
s2 = Series([-2.1,3.6,-1.5,4,3.1], index=['a','c','e','f','g'])
s2
a -2.1
c 3.6
e -1.5
f 4.0
g 3.1
dtype: float64
s1 + s2
a -9.4
c 1.1
d NaN
e 0.0
f NaN
g NaN
dtype: float64
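The NaN entries above appear because labels d, f and g exist in only one of the two Series. As a minimal sketch of what the `+` operator seems to do under the hood, the snippet below aligns both Series by hand on the union of their indexes. The names `labels`, `manual` and `automatic` are introduced here purely for illustration.

```python
import pandas as pd

s1 = pd.Series([-7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])
s2 = pd.Series([-2.1, 3.6, -1.5, 4, 3.1], index=['a', 'c', 'e', 'f', 'g'])

# Union of the two indexes: every label that appears in either Series.
labels = s1.index.union(s2.index)

# Reindexing inserts NaN for labels a Series does not have,
# and NaN plus anything is NaN.
manual = s1.reindex(labels) + s2.reindex(labels)

# The plain operator performs the same alignment automatically.
automatic = s1 + s2

print(manual.equals(automatic))
# Keep only the labels that were present in both operands.
print(automatic.dropna())
```

If the NaN rows are not wanted, `dropna()` keeps only the labels shared by both operands.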
DataFrame
print(list('bde'))
['b', 'd', 'e']
df1 = DataFrame(np.arange(9).reshape((3,3)),
columns=list('bcd'),index=['Ohio','Texas','Colorado'])
df1
| | b | c | d |
|---|---|---|---|
| Ohio | 0 | 1 | 2 |
| Texas | 3 | 4 | 5 |
| Colorado | 6 | 7 | 8 |
df2 = DataFrame(np.arange(12).reshape((4,3)),
columns=list('bde'),index=['Utah','Ohio','Texas','Oregon'])
df2
| | b | d | e |
|---|---|---|---|
| Utah | 0 | 1 | 2 |
| Ohio | 3 | 4 | 5 |
| Texas | 6 | 7 | 8 |
| Oregon | 9 | 10 | 11 |
df1 + df2
| | b | c | d | e |
|---|---|---|---|---|
| Colorado | NaN | NaN | NaN | NaN |
| Ohio | 3.0 | NaN | 6.0 | NaN |
| Oregon | NaN | NaN | NaN | NaN |
| Texas | 9.0 | NaN | 12.0 | NaN |
| Utah | NaN | NaN | NaN | NaN |
| Method | Description |
|---|---|
| add | Method for addition (+) |
| sub | Method for subtraction (-) |
| div | Method for division (/) |
| mul | Method for multiplication (*) |
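To make the table concrete, here is a small sketch showing that each method gives the same result as its operator. The frame `df` and the `same_*` names are made up for this illustration. The last line uses `rdiv`, a reversed variant that pandas also provides but that is not listed in the table above.

```python
import numpy as np
import pandas as pd

# Hypothetical demo frame holding the values 1..6.
df = pd.DataFrame(np.arange(1, 7).reshape((2, 3)), columns=list('abc'))

# Each method mirrors the corresponding operator.
same_add = df.add(10).equals(df + 10)
same_sub = df.sub(1).equals(df - 1)
same_mul = df.mul(2).equals(df * 2)
same_div = df.div(2).equals(df / 2)
print(same_add, same_sub, same_mul, same_div)

# Reversed variant: df.rdiv(1) computes 1 / df rather than df / 1.
reciprocal = df.rdiv(1)
print(reciprocal)
```

The method form is mainly useful when extra arguments such as `fill_value` or `axis` are needed, which the operators cannot accept.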
Fill values in arithmetic methods
df1 = DataFrame(np.arange(12).reshape((3,4)),columns=list('abcd'))
df1
df2 = DataFrame(np.arange(20).reshape((4,5)),columns=list('abcde'))
df2
| | a | b | c | d | e |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 |
| 1 | 5 | 6 | 7 | 8 | 9 |
| 2 | 10 | 11 | 12 | 13 | 14 |
| 3 | 15 | 16 | 17 | 18 | 19 |
df1.add(df2,fill_value=0)
| | a | b | c | d | e |
|---|---|---|---|---|---|
| 0 | 0.0 | 2.0 | 4.0 | 6.0 | 4.0 |
| 1 | 9.0 | 11.0 | 13.0 | 15.0 | 9.0 |
| 2 | 18.0 | 20.0 | 22.0 | 24.0 | 14.0 |
| 3 | 15.0 | 16.0 | 17.0 | 18.0 | 19.0 |
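For contrast, the sketch below compares the plain `+` operator with `add(..., fill_value=0)` on the same two frames. It also checks one edge case: when a cell is NaN in both operands, `fill_value` does not fill it. The names `plain`, `filled`, `df1_nan`, `df2_nan` and `both_missing` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(12).reshape((3, 4)), columns=list('abcd'))
df2 = pd.DataFrame(np.arange(20).reshape((4, 5)), columns=list('abcde'))

# Plain operator: NaN wherever either frame lacks the cell.
plain = df1 + df2
# fill_value=0: a cell missing from one frame is treated as 0.
filled = df1.add(df2, fill_value=0)

print(plain.isna().sum().sum())   # number of NaN cells without a fill value
print(filled.isna().sum().sum())  # number of NaN cells with fill_value=0

# Edge case: put a real NaN at the same position in both frames.
df1_nan = df1.astype(float)
df2_nan = df2.astype(float)
df1_nan.loc[0, 'a'] = np.nan
df2_nan.loc[0, 'a'] = np.nan

# A cell that is missing on both sides stays missing.
both_missing = df1_nan.add(df2_nan, fill_value=0)
print(both_missing.loc[0, 'a'])
```

So `fill_value` stands in for the absent side of an operation. It is not a blanket replacement for every NaN in the result.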
Operations between DataFrame and Series
A motivating example
Compute the difference between a two-dimensional array and one of its rows (broadcasting)
arr = np.arange(12).reshape((3,4))
arr
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
arr[0]
array([0, 1, 2, 3])
arr - arr[0]
array([[0, 0, 0, 0],
       [4, 4, 4, 4],
       [8, 8, 8, 8]])
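The subtraction above works because `arr[0]` has shape (4,), which matches the last axis of `arr`, so NumPy repeats it for every row. Subtracting a column needs an explicit reshape, which is the NumPy counterpart of choosing an axis in pandas. The following is a minimal sketch; `row_diff`, `col` and `col_diff` are illustrative names.

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))

# Shape (4,) lines up with the last axis, so the row is repeated
# for each of the 3 rows.
row_diff = arr - arr[0]

# A column has shape (3,), which does not match the last axis (length 4).
# Reshaping it to (3, 1) lets NumPy stretch it across the 4 columns.
col = arr[:, 0].reshape((3, 1))
col_diff = arr - col

print(row_diff)
print(col_diff)
```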
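DataFrame and Series arithmetic
Row-wise operations (the Series index is matched against the columns and broadcast down the rows)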
frame = DataFrame(np.arange(12).reshape((4,3)),
columns=list('bde'),index=['Utah','Ohio','Texas','Oregon'])
frame
| | b | d | e |
|---|---|---|---|
| Utah | 0 | 1 | 2 |
| Ohio | 3 | 4 | 5 |
| Texas | 6 | 7 | 8 |
| Oregon | 9 | 10 | 11 |
series = frame.iloc[0]  # first row, selected by position
series
b 0
d 1
e 2
Name: Utah, dtype: int64
frame - series
| | b | d | e |
|---|---|---|---|
| Utah | 0 | 0 | 0 |
| Ohio | 3 | 3 | 3 |
| Texas | 6 | 6 | 6 |
| Oregon | 9 | 9 | 9 |
series2 = Series(range(3),index=['b','e','f'])
series2
b 0
e 1
f 2
dtype: int64
frame + series2
| | b | d | e | f |
|---|---|---|---|---|
| Utah | 0.0 | NaN | 3.0 | NaN |
| Ohio | 3.0 | NaN | 6.0 | NaN |
| Texas | 6.0 | NaN | 9.0 | NaN |
| Oregon | 9.0 | NaN | 12.0 | NaN |
Column-wise operations (the Series index is matched against the row index with axis=0)
frame
| | b | d | e |
|---|---|---|---|
| Utah | 0 | 1 | 2 |
| Ohio | 3 | 4 | 5 |
| Texas | 6 | 7 | 8 |
| Oregon | 9 | 10 | 11 |
frame['d']
Utah 1
Ohio 4
Texas 7
Oregon 10
Name: d, dtype: int64
series3 = frame['d']
series3
Utah 1
Ohio 4
Texas 7
Oregon 10
Name: d, dtype: int64
frame.sub(series3,axis=0)
| | b | d | e |
|---|---|---|---|
| Utah | -1 | 0 | 1 |
| Ohio | -1 | 0 | 1 |
| Texas | -1 | 0 | 1 |
| Oregon | -1 | 0 | 1 |
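To see why `axis=0` is needed here, the sketch below compares the default behaviour with the explicit axis. By default the Series index (the state names) is matched against the column labels, nothing lines up, and every cell becomes NaN. The names `misaligned`, `aligned` and `centered` are illustrative, and the row-centering step at the end is an added example rather than part of the original walkthrough.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape((4, 3)),
                     columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])

# Default: the Series index is matched against the columns.
# State names and column names share no labels, so all cells are NaN
# and the result carries the union of both label sets as columns.
misaligned = frame - frame['d']

# axis=0 matches the Series index against the row labels instead.
aligned = frame.sub(frame['d'], axis=0)

# The same pattern centers each row on its own mean.
centered = frame.sub(frame.mean(axis=1), axis=0)

print(misaligned.shape)
print(aligned)
print(centered)
```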