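Radiomics Roadmap
Thanks to poppelmann for sharing two late-fusion strategies for machine learning algorithms.
Case study
Automated breast cancer diagnosis and data analysis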
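I. The first is the stacking strategy. The basic idea is to concatenate the results produced by different models and then feed the concatenated result into a next-stage model for prediction. This amounts to giving the original data additional dimensions of interpretation, which can improve the model's accuracy. The steps below use XGB to fuse SVM, DT, KNN, RF, ET, and LGB as the example.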
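As a compact illustration of the idea, here is a minimal sketch of stacking, not the walkthrough's own code. It rests on several assumptions: it uses scikit-learn's built-in copy of the Wisconsin breast cancer data (where 0 = malignant and 1 = benign) instead of `bc_data`; it uses `GradientBoostingClassifier` as a stand-in for XGB and omits LGB so that only scikit-learn is required; and it feeds only the base-model probabilities (not the original features) to the second-stage model. The split ratio, fold count, and hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (ExtraTreesClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: sklearn's bundled dataset stands in for the bc_data DataFrame.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# First-stage (base) models; SVM and KNN are scaled because they are distance based.
base_models = {
    "svm": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "dt": DecisionTreeClassifier(random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
    "et": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

train_cols, test_cols = [], []
for name, model in base_models.items():
    # Out-of-fold probabilities: each training row is predicted by a model
    # that was not fit on that row, which limits leakage into the second stage.
    oof = cross_val_predict(model, X_train, y_train, cv=5,
                            method="predict_proba")[:, 1]
    train_cols.append(oof)
    # Refit on the full training set to produce the test-set column.
    model.fit(X_train, y_train)
    test_cols.append(model.predict_proba(X_test)[:, 1])

# Concatenate the per-model results column-wise: one column per base model.
stack_train = np.column_stack(train_cols)
stack_test = np.column_stack(test_cols)

# Second-stage model (stand-in for XGB) trained on the concatenated results.
meta_model = GradientBoostingClassifier(random_state=0)
meta_model.fit(stack_train, y_train)
stack_accuracy = meta_model.score(stack_test, y_test)
print("stacked features:", stack_train.shape, "test accuracy:", stack_accuracy)
```

Using out-of-fold predictions for the training columns is one common design choice; fitting the base models and predicting on the same rows would hand the second-stage model overly optimistic inputs.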
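1. Build the concatenated training and test sets. To do this, first record the id that each model's result corresponds to: X_train_id stores the ids for the training set, and X_test_id stores them for the test set.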
import numpy as np
from sklearn.model_selection import train_test_split
X_train_colums = ['id', 'radius_mean', 'texture_mean', 'perimeter_mean',
'area_mean', 'smoothness_mean', 'compactness_mean', 'concavity_mean',
'concave points_mean', 'symmetry_mean', 'fractal_dimension_mean',
'radius_se', 'texture_se', 'perimeter_se', 'area_se', 'smoothness_se',
'compactness_se', 'concavity_se', 'concave points_se', 'symmetry_se',
'fractal_dimension_se', 'radius_worst', 'texture_worst',
'perimeter_worst', 'area_worst', 'smoothness_worst',
'compactness_worst', 'concavity_worst', 'concave points_worst',
'symmetry_worst', 'fractal_dimension_worst']
X_data = bc_data[X_train_colums]
n_classes = 2
# Encode the label: malignant ('M') -> 0, anything else (benign) -> 1
y_data = np.ravel(bc_data['diagnosis'].map(lambda x: 0 if x == 'M' else 1))
X_train_all, X_test_all, y_