Model Evaluation and Hyperparameter Tuning (Part 1): The Pipeline Mechanism

Latest recommended article published on 2024-07-24 21:11:43

Amy_mm


Categories: Machine Learning, sklearn. Tags: pipeline

Article link: https://blog.csdn.net/Amy_mm/article/details/79890979


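This post covers best practices for model evaluation: obtaining unbiased performance estimates, diagnosing common problems, tuning models, and choosing performance metrics. It focuses on using the Pipeline mechanism to simplify the workflow: load the Breast Cancer Wisconsin dataset, convert the features, split the data into training and test sets, and then use a Pipeline to chain several transformers with an estimator (StandardScaler, PCA, and LogisticRegression) to build and train the model.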


Reading notes on "Python Machine Learning", Chapter 6

Learning Best Practices for Model Evaluation and Hyperparameter Tuning

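【Main topics】

(1) Obtain unbiased estimates of a model's performance

(2) Diagnose common problems of machine learning algorithms

(3) Tune machine learning models

(4) Evaluate predictive models using different performance metrics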

Source code on GitHub: https://github.com/xuman-Amy/Model-evaluation-and-Hypamameter-tuning

【Streamlining workflows with pipelines】

1. 【Load the dataset】

We use the Breast Cancer Wisconsin dataset.

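# Import the dataset
import pandas as pd

# Raw string, so the backslashes in the Windows path are not read as escape sequences
df = pd.read_csv(r"G:\Machine Learning\python machine learning\python machine learning code\code\ch06\wdbc.data", header=None)

# Columns 0 and 1: sample ID and diagnosis (M = malignant, B = benign)
# Columns 2-31: the 30 features used for diagnosis
df.head()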
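
The path above is specific to the author's machine. If the wdbc.data file is not at hand, a rough alternative is the copy of this dataset bundled with scikit-learn. This is only a sketch, and it differs from the file in two ways: there is no ID column, and the labels arrive already encoded as integers (0 = malignant, 1 = benign) rather than as the strings M and B. The names `data`, `X_builtin` and `y_builtin` are made up for this illustration and are not used elsewhere in the post.

```python
from sklearn.datasets import load_breast_cancer

# Load scikit-learn's bundled copy of the Breast Cancer Wisconsin (Diagnostic) data.
data = load_breast_cancer()

X_builtin = data.data      # feature matrix: one row per sample, 30 feature columns
y_builtin = data.target    # integer labels: 0 = malignant, 1 = benign

print(X_builtin.shape)            # number of samples and features
print(list(data.target_names))    # label names in encoding order
```

Because the label encoding differs from the M/B strings in the raw file, results computed on this copy should be interpreted with that mapping in mind.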

2. 【Store the 30 features in an array】