I am trying to run a RandomForest on a dataset that contains missing values.
My dataset looks like this:
train_data = [['1' 'NaN' 'NaN' '0.0127034' '0.0435092']
['1' 'NaN' 'NaN' '0.0113187' '0.228205']
['1' '0.648' '0.248' '0.0142176' '0.202707']
...,
['1' '0.357' '0.470' '0.0328121' '0.255039']
['1' 'NaN' 'NaN' '0.00311825' '0.0381745']
['1' 'NaN' 'NaN' '0.0332604' '0.2857']]
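Note that the entries printed above are strings, so 'NaN' here is text rather than a floating-point NaN. The following is a minimal sketch, assuming the array holds strings exactly as printed, of how such data could be converted to floats and checked for missing entries. The names train_float and nan_counts are illustrative and not part of the original code, and only the six rows visible above are reproduced.

```python
import numpy as np

# Illustrative copy of the six visible rows, stored as strings
# (the rows elided by "..." in the printout are not reproduced).
train_data = np.array([
    ['1', 'NaN', 'NaN', '0.0127034', '0.0435092'],
    ['1', 'NaN', 'NaN', '0.0113187', '0.228205'],
    ['1', '0.648', '0.248', '0.0142176', '0.202707'],
    ['1', '0.357', '0.470', '0.0328121', '0.255039'],
    ['1', 'NaN', 'NaN', '0.00311825', '0.0381745'],
    ['1', 'NaN', 'NaN', '0.0332604', '0.2857'],
])

# astype(float) parses the text 'NaN' into a real floating-point NaN.
train_float = train_data.astype(float)

# Count the missing entries in each column.
nan_counts = np.isnan(train_float).sum(axis=0)
print(nan_counts)
```

With the data in float form, the missing entries can be detected with np.isnan instead of being compared against the text 'NaN'.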
To impute the 'NaN' values, I use:
from sklearn.preprocessing import Imputer
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(train_data[0::, 1::])
new_train_data = imp.transform(train_data)
However, I get the following error:
Traceback (most recent call last):