6.5 Study Notes (Missing Values)

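isnull(): checks whether each value is missing

import pandas as pd
import numpy as np
# Build a 5x3 frame, then reindex so that the new rows 'b', 'd' and 'g' are all NaN
df = pd.DataFrame(np.random.randn(5,3),index=['a','c','e','f','h'],columns=['one','two','three'])
df = df.reindex(['a','b','c','d','e','f','g','h'])
# isnull() returns True wherever a value is missing
print(df['two'].isnull())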

Output:
a    False
b     True
c    False
d     True
e    False
f    False
g     True
h    False
Name: two, dtype: bool

notnull(): checks whether each value is not missing (the inverse of isnull())
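To see how the two checks relate, here is a minimal sketch on a small hand-made Series (the values are made up for illustration and are not from the notes above). It suggests that notnull() is simply the element-wise negation of isnull(), and that either mask can be used to count or select values.

```python
import numpy as np
import pandas as pd

# Toy data chosen for illustration only
s = pd.Series([1.0, np.nan, 3.0], index=['a', 'b', 'c'])

missing = s.isnull()    # True where the value is NaN
present = s.notnull()   # True where the value is not NaN

print(missing.tolist())            # [False, True, False]
print(present.tolist())            # [True, False, True]
print(present.equals(~missing))    # True: notnull() negates isnull()

# A boolean mask can count or select values
print(int(missing.sum()))          # 1
print(s[present].tolist())         # [1.0, 3.0]
```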

When summing, NA values are treated as 0 (sum() skips NaN by default)

  print(df['one'].sum())
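A small sketch of what "treated as 0" means in practice, using a made-up Series rather than the random frame above. With the default skipna=True the NaN entries are skipped, so they add nothing to the sum, but they are not literally replaced by 0: the mean, for example, ignores them as well instead of counting them as zeros.

```python
import numpy as np
import pandas as pd

# Toy data chosen for illustration only
s = pd.Series([1.0, np.nan, 2.0])

total = s.sum()                      # NaN is skipped: 1.0 + 2.0
strict_total = s.sum(skipna=False)   # NaN propagates into the result
all_nan_total = pd.Series([np.nan, np.nan]).sum()  # nothing left to add
average = s.mean()                   # averages the 2 valid values only

print(total)           # 3.0
print(strict_total)    # nan
print(all_nan_total)   # 0.0
print(average)         # 1.5
```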

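Handling missing values
i. Clean or fill in the missing data
ii. Pandas provides several methods for cleaning up missing values
iii. The fillna() function can "fill" NA values with non-null data in a few ways
iv. Replace NaN with a scalar value
Replace NaN with 0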

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(3,3),index=['a','c','e'],columns=['one','two','three'])
df= df.reindex(['a','b','c'])
print(df.fillna(0))

Output:
        one       two     three
a -1.282845 -0.455397  1.043505
b  0.000000  0.000000  0.000000
c -0.228125  0.679129  0.809223
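A single scalar is not the only option. As a rough sketch (the frame and the fill values here are made up for illustration), fillna() also accepts a dict that maps column names to fill values, or a Series such as the column means, so each column can get its own replacement.

```python
import numpy as np
import pandas as pd

# Toy data chosen for illustration only
df = pd.DataFrame(
    {'one': [1.0, np.nan, 3.0], 'two': [np.nan, 4.0, 6.0]},
    index=['a', 'b', 'c'],
)

# A dict gives every column its own fill value
by_column = df.fillna({'one': 0.0, 'two': -1.0})

# A Series indexed by column name works the same way;
# df.mean() skips NaN, so 'one' -> 2.0 and 'two' -> 5.0
by_mean = df.fillna(df.mean())

print(by_column)
print(by_mean)
```

Filling with the mean keeps each column's average unchanged, which a fill value of 0 generally does not.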

Filling NA forward and backward: the fill methods known from reindexing can also be used to fill in missing values
Forward fill (pad/ffill)

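import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5,3),index=['a','c','e','f','g'],columns=['one','two','three'])
df = df.reindex(['a','b','c','d','e','f','g'])
# ffill() copies the last valid value forward into each gap;
# it replaces fillna(method='pad'), which is deprecated since pandas 2.1
print(df.ffill())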

Output:
        one       two     three
a  1.626482 -0.054672  2.649389
b  1.626482 -0.054672  2.649389
c  2.185441  0.499460 -0.219088
d  2.185441  0.499460 -0.219088
e  1.667473  0.750116 -0.927406
f  0.193735 -0.968799 -0.420697
g  0.776946 -0.644994  0.139583
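Two details of forward filling are easy to miss, so here is a small sketch on a made-up Series (not the random frame above). A gap at the very start has no earlier value to copy, so it stays NaN, and the limit parameter caps how many consecutive NaN values are filled.

```python
import numpy as np
import pandas as pd

# Toy data chosen for illustration only: one leading gap, one gap of length 2
s = pd.Series([np.nan, 1.0, np.nan, np.nan, 5.0])

filled_all = s.ffill()          # every gap after the first valid value is filled
filled_one = s.ffill(limit=1)   # at most 1 consecutive NaN is filled per gap

print(filled_all.tolist())   # [nan, 1.0, 1.0, 1.0, 5.0]
print(filled_one.tolist())   # [nan, 1.0, 1.0, nan, 5.0]
```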

Backward fill (bfill/backfill)

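import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5,3),index=['a','c','e','f','h'],columns=['one','two','three'])
df = df.reindex(['a','b','c','d','e','f','g','h'])
# bfill() copies the next valid value backward into each gap;
# it replaces fillna(method='bfill'), which is deprecated since pandas 2.1
print(df.bfill())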

Output:
        one       two     three
a  1.144226 -0.278943  0.711329
b -0.210107 -1.473591 -1.013294
c -0.210107 -1.473591 -1.013294
d  1.415249 -0.355017 -1.420598
e  1.415249 -0.355017 -1.420598
f -0.689457 -1.626022  0.900874
g -0.808437 -1.290974 -0.786823
h -0.808437 -1.290974 -0.786823
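Backward fill mirrors forward fill, so it leaves a gap at the end instead of at the start. The sketch below uses a made-up Series to compare the two directions, and chains them as one possible way (an illustration, not something the notes prescribe) to close gaps at both ends.

```python
import numpy as np
import pandas as pd

# Toy data chosen for illustration only: gaps at the start, middle and end
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

backward = s.bfill()         # trailing gap has no later value, stays NaN
forward = s.ffill()          # leading gap has no earlier value, stays NaN
both = s.ffill().bfill()     # forward first, then backward for what is left

print(backward.tolist())   # [1.0, 1.0, 3.0, 3.0, nan]
print(forward.tolist())    # [nan, 1.0, 1.0, 3.0, 3.0]
print(both.tolist())       # [1.0, 1.0, 1.0, 3.0, 3.0]
```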

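Dropping missing values
To simply exclude missing values, use the dropna function together with the axis argument. By default axis=0, so every row that contains an NA value is dropped.
Drop whole rows that contain missing values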

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5,3),index=['a','c','e','f','g'],columns=['one','two','three'])
df= df.reindex(['a','b','c','d','e','f','g','h'])
print(df.dropna())

Output:
        one       two     three
a  0.895142  1.652897 -0.329553
c -1.846283 -1.107428  0.610238
e  0.309367 -0.485988  1.061889
f  0.488847 -0.043163  0.705120
g -0.001079  0.865002  0.925668
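Besides axis, dropna takes a few other parameters that control how strict the filtering is. The following sketch uses a made-up frame (not the random one above) to compare the default with how='all', thresh, subset and axis=1.

```python
import numpy as np
import pandas as pd

# Toy data chosen for illustration only
df = pd.DataFrame(
    {
        'one': [1.0, np.nan, 3.0, np.nan],
        'two': [4.0, np.nan, np.nan, np.nan],
        'three': [7.0, np.nan, 9.0, 10.0],
    },
    index=['a', 'b', 'c', 'd'],
)

rows_any = df.dropna()                     # drop rows with any NaN
rows_all = df.dropna(how='all')            # drop rows where every value is NaN
rows_thresh = df.dropna(thresh=2)          # keep rows with at least 2 valid values
rows_subset = df.dropna(subset=['three'])  # only look at column 'three'

# axis=1 drops columns instead of rows (shown without the all-NaN row 'b')
cols_any = df.loc[['a', 'c', 'd']].dropna(axis=1)

print(rows_any.index.tolist())      # ['a']
print(rows_all.index.tolist())      # ['a', 'c', 'd']
print(rows_thresh.index.tolist())   # ['a', 'c']
print(rows_subset.index.tolist())   # ['a', 'c', 'd']
print(cols_any.columns.tolist())    # ['three']
```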
