When doing data analysis, calling mutual_info_regression sometimes fails with the following error:
Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required.
For example, this call triggers it:
make_mi_scores(X2_pca.iloc[:,:10],y_train)
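This error puzzled me for a long time, because my DataFrame clearly does contain data.
Eventually I went to the source code and traced the error to line 164.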
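That made me wonder: is the problem related to whether my explanatory variables X are treated as discrete?
In my make_mi_scores function, the type of each column Xi is specified like this: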
discrete_features =[pd.api.types.is_numeric_dtype(t) for t in X.dtypes ]
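Here is the full function:
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

def make_mi_scores(X, y):
    X = X.copy()
    # Flags every numeric column as discrete, continuous float columns included
    discrete_features = [pd.api.types.is_numeric_dtype(t) for t in X.dtypes]
    MI_scores = mutual_info_regression(X, y, discrete_features=discrete_features, random_state=21)
    # Wrap the scores in a Series indexed by column name, sorted ascending
    MI_scores = pd.Series(MI_scores, name='MI_scores', index=X.columns)
    MI_scores = MI_scores.sort_values()
    return MI_scores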
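One way to check this suspicion is to look at which columns the expression flags as discrete, next to their dtypes and unique-value counts. The sketch below uses a made-up frame, X_demo, as a stand-in for the PCA output (the column names and data are assumptions, not the original X2_pca). If every float column is flagged as discrete and has as many unique values as rows, the flags are almost certainly wrong.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the PCA output: three continuous float columns.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(100, 3)), columns=["PC1", "PC2", "PC3"])

# Same expression as in make_mi_scores: True for any numeric dtype.
flags = [pd.api.types.is_numeric_dtype(t) for t in X_demo.dtypes]

# Put the flags side by side with dtype and number of distinct values.
report = pd.DataFrame({
    "dtype": X_demo.dtypes.astype(str),
    "flagged_discrete": flags,
    "n_unique": X_demo.nunique(),
})
print(report)
```

A truly discrete feature normally has far fewer distinct values than rows, so a column where n_unique equals the row count is a warning sign.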
After changing the code and running it again, the call succeeded.
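The exact change is not shown here, so the following is only a minimal sketch of one possible fix, not necessarily the one I used: mark a column as discrete only when its dtype is an integer type, so that float columns such as PCA components are treated as continuous. The name make_mi_scores_fixed and the demo data are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

def make_mi_scores_fixed(X, y):
    """Hypothetical variant: only integer-typed columns count as discrete."""
    X = X.copy()
    # Assumption: integer dtype means discrete, float dtype means continuous.
    discrete_features = [pd.api.types.is_integer_dtype(t) for t in X.dtypes]
    scores = mutual_info_regression(
        X, y, discrete_features=discrete_features, random_state=21
    )
    return pd.Series(scores, name="MI_scores", index=X.columns).sort_values()

# Made-up data: two continuous columns and one small-range integer column.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    "PC1": rng.normal(size=200),
    "PC2": rng.normal(size=200),
    "count": rng.integers(0, 3, size=200),
})
# The target depends strongly on PC1 and on nothing else.
y_demo = 2.0 * X_demo["PC1"] + rng.normal(scale=0.1, size=200)

scores = make_mi_scores_fixed(X_demo, y_demo)
print(scores)
```

If all features are known to be continuous, as with PCA output, passing discrete_features=False directly is a simpler alternative.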