#A little every day#
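Python pandas: handling missing data
Create a DataFrame df with 6 rows and 4 columns, using dates as the row index and A, B, C, D as the column labels, then set two cells to NaN.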
import numpy as np
import pandas as pd
dates = pd.date_range('20130101',periods=6)
df = pd.DataFrame(np.arange(24).reshape((6,4)),index=dates,columns=['A','B','C','D'])
# Mark two cells as missing: row 0 of column B and row 1 of column C
df.iloc[0,1] = np.nan
df.iloc[1,2] = np.nan
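Before dropping anything, it can help to confirm where the missing values are. The following is a minimal sketch, not part of the original steps: it rebuilds the same df but with float data (an assumption made here so that assigning NaN needs no dtype change), and the names missing_mask and missing_per_column are illustrative only.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Assumption of this sketch: float data, so NaN fits without changing dtype.
df = pd.DataFrame(np.arange(24, dtype=float).reshape((6, 4)),
                  index=dates, columns=['A', 'B', 'C', 'D'])
df.iloc[0, 1] = np.nan
df.iloc[1, 2] = np.nan

# isna() returns a boolean frame of the same shape; True marks a missing cell.
missing_mask = df.isna()
# Summing the boolean mask counts the missing cells in each column.
missing_per_column = missing_mask.sum()
print(missing_per_column)
```

Columns B and C each report one missing value, which matches the two assignments above.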
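1: Drop every row that contains at least one NaN
The how parameter accepts two values: any and all.
any drops a row or column as soon as it contains a single NaN; all drops it only when every value in that row or column is NaN.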
df_drop1 = df.dropna(axis=0,how='any')
Output:
             A     B     C   D
2013-01-03   8   9.0  10.0  11
2013-01-04  12  13.0  14.0  15
2013-01-05  16  17.0  18.0  19
2013-01-06  20  21.0  22.0  23
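With the df above, how='all' would remove nothing, because no row is entirely NaN. To make the difference between any and all visible, the sketch below adds a hypothetical third step that blanks out a whole row; that step, the float data, and the names rows_any and rows_all are assumptions of this example rather than part of the original df.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Assumption of this sketch: float data, so NaN fits without changing dtype.
df = pd.DataFrame(np.arange(24, dtype=float).reshape((6, 4)),
                  index=dates, columns=['A', 'B', 'C', 'D'])
df.iloc[0, 1] = np.nan   # first row: one NaN
df.iloc[1, 2] = np.nan   # second row: one NaN
df.iloc[2, :] = np.nan   # hypothetical extra step: third row is all NaN

# how='any' drops every row that has at least one NaN (the first three rows).
rows_any = df.dropna(axis=0, how='any')
# how='all' drops only rows where every value is NaN (just the third row).
rows_all = df.dropna(axis=0, how='all')

print(len(rows_any), len(rows_all))  # prints: 3 5
```

The rows with a single NaN survive how='all' but not how='any', so all is the more conservative choice when partially filled rows are still useful.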
2: Drop every column that contains at least one NaN
df_drop2 = df.dropna(axis=1,how='any')