What the Python Machine Learning Library scikit-learn (sklearn) Contains (Table of Contents)


dingcheng998


# Learning sklearn
"""
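I. Main task categories in scikit-learn:
      1. Preprocessing
      2. Model selection
      3. Classification
      4. Regression
      5. Clustering
      6. Dimensionality reduction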
      
Main contents (user guide chapters):
    1. Supervised learning
        1.1 Generalized Linear Models
        1.2 Linear and Quadratic Discriminant Analysis
        1.3 Kernel ridge regression
        1.4 Support Vector Machines
        1.5 Stochastic Gradient Descent
        1.6 Nearest Neighbors
        1.7 Gaussian Processes
        1.8 Cross decomposition
        1.9 Naive Bayes
        1.10 Decision Trees
        1.11 Ensemble methods
        1.12 Multiclass and multilabel algorithms
        1.13 Feature selection
        1.14 Semi-Supervised
        1.15 Isotonic regression
        1.16 Probability calibration
        1.17 Neural network models (supervised)
        
    2. Unsupervised learning
        2.1 Gaussian mixture models
        2.2 Manifold learning
        2.3 Clustering
        2.4 Biclustering
        2.5 Decomposing signals in components (matrix factorization problems)
        2.6 Covariance estimation
        2.7 Novelty and Outlier Detection
        2.8 Density Estimation
        2.9 Neural network models (unsupervised)
        
    3. Model selection and evaluation
        3.1 Cross-validation: evaluating estimator performance
        3.2 Tuning the hyper-parameters of an estimator
        3.3 Model evaluation: quantifying the quality of predictions
        3.4 Model persistence
        3.5 Validation curves: plotting scores to evaluate models
        
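    4. Dataset transformations
        4.1 Pipeline and FeatureUnion: combining estimators
        4.2 Feature extraction
        4.3 Preprocessing data
        4.4 Unsupervised dimensionality reduction
        4.5 Random Projection
        4.6 Kernel Approximation
        4.7 Pairwise metrics, Affinities and Kernels
        4.8 Transforming the prediction target (y)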
        
    5. Dataset loading utilities
        5.1 General dataset API
        5.2 Toy datasets
        5.3 Sample images
        5.4 Sample generators
        5.5 Datasets in svmlight / libsvm format
        5.6 Loading from external datasets
        5.7 The Olivetti faces dataset
        5.8 The 20 newsgroups text dataset
        5.9 Downloading datasets from the mldata.org repository
        5.10 The Labeled Faces in the Wild face recognition dataset
        5.11 Forest covertypes
        5.12 RCV1 dataset
        5.13 Boston House Prices dataset
        5.14 Breast Cancer Wisconsin (Diagnostic) Database
        5.15 Diabetes dataset
        5.16 Optical Recognition of Handwritten Digits Data Set
        5.17 Iris Plants Database
        5.18 Linnerrud dataset

    6. Strategies to scale computationally: bigger data
        6.1 Scaling with instances using out-of-core learning
        
    7. Computational performance
        7.1 Prediction Latency
        7.2 Prediction Throughput
        7.3 Tips and Tricks
    
"""
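
To show how a few of the chapters listed above fit together, here is a minimal sketch that is not part of the original outline. It assumes scikit-learn is installed, and the specific choices (the iris toy dataset, StandardScaler, LogisticRegression, 5 folds) are illustrative picks rather than anything the outline prescribes. It loads a toy dataset (5.2), chains preprocessing (4.3) and a linear model (1.1) in a pipeline (4.1), and scores the result with cross-validation (3.1).

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy dataset (chapter 5.2): 150 iris samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Pipeline (4.1) that standardizes features (4.3) before fitting
# a linear classifier (1.1). max_iter=1000 is an illustrative choice.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation (3.1): one accuracy score per fold.
scores = cross_val_score(model, X, y, cv=5)
print("fold accuracies:", scores)
print("mean accuracy:", scores.mean())
```

Wrapping the scaler and the classifier in one pipeline means the scaler is refit on the training part of each fold, so no information from the held-out fold leaks into preprocessing.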