Learning Artificial Intelligence from Scratch (Machine Learning Fundamentals) (Part 2)

假如我是大牛

Category: Artificial Intelligence. Tags: machine learning

Article link: https://blog.csdn.net/qq_28776523/article/details/106642823

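Decision Tree

In everyday life we face one decision after another, and various factors influence each choice. In a computer, this decision-making process can be abstracted as a tree structure, in which each internal node represents a test on an attribute and each branch represents one outcome of that test. In general, a decision tree consists of one root node, several internal nodes, and several leaf nodes.
Below is a decision tree for deciding whether to reply to a WeChat message received in the middle of the night while asleep.
[Figure: decision tree for whether to reply to a late-night WeChat message]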

Building a decision tree involves two stages: construction and pruning.

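Construction

Construction means generating a complete decision tree. Put simply, it is the process of choosing which attribute to test at each node. Three kinds of nodes arise during construction:

Root node: the topmost node of the tree, where the decision starts. In the figure above, "girlfriend" is the root node;

Internal node: a node in the middle of the tree, such as "sent a WeChat message";

Leaf node: a node at the bottom of the tree, that is, a decision result, such as "keep sleeping".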

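Nodes are related as parents and children. During construction, three important questions have to be answered:

Which attribute to choose as the root node;
Which attributes to choose as child nodes;
When to stop and reach a target state, that is, a leaf node.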
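The article does not say how the root attribute is actually chosen. One common criterion, which is an assumption here and not something the text prescribes, is information gain: pick the attribute whose split reduces the entropy of the class labels the most. The sketch below illustrates that idea; the class `RootSelector`, its method names, and the row/column data layout are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: chooses a split attribute by information gain.
final class RootSelector {

    // Shannon entropy (in bits) of a list of class labels.
    static double entropy(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Entropy of all labels minus the weighted entropy after splitting on column attr.
    static double informationGain(List<String[]> rows, List<String> labels, int attr) {
        Map<String, List<String>> groups = new HashMap<>();
        for (int i = 0; i < rows.size(); i++) {
            groups.computeIfAbsent(rows.get(i)[attr], k -> new ArrayList<>())
                  .add(labels.get(i));
        }
        double remainder = 0.0;
        for (List<String> group : groups.values()) {
            remainder += (double) group.size() / labels.size() * entropy(group);
        }
        return entropy(labels) - remainder;
    }

    // Index of the attribute with the largest gain, or -1 if there are no attributes.
    static int bestAttribute(List<String[]> rows, List<String> labels, int attrCount) {
        int best = -1;
        double bestGain = Double.NEGATIVE_INFINITY;
        for (int attr = 0; attr < attrCount; attr++) {
            double gain = informationGain(rows, labels, attr);
            if (gain > bestGain) {
                bestGain = gain;
                best = attr;
            }
        }
        return best;
    }
}
```

Applying `bestAttribute` recursively to each subset answers the second question (child nodes) in the same way; a subset whose labels are all identical has entropy 0 and is a natural place to stop and create a leaf.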

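Pruning

Pruning slims the decision tree down. The goal is to get comparably good results with fewer tests. This is done to prevent overfitting.

Overfitting: the model fits the training data "too well", so that in real use it behaves rigidly and misclassifies new samples.

Underfitting: the model fails to fit even the training data well.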

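Causes of overfitting:

One cause is a small training set. If the decision tree uses too many attributes, the resulting tree can classify the training samples "perfectly", but in doing so it treats quirks of some training samples as properties of all data. Those quirks do not necessarily hold for the full data set, so the tree makes mistakes when classifying real data. In other words, the model generalizes poorly.

Generalization ability: the classification ability that a classifier abstracts from the training set; you can think of it as the ability to carry what was learned from known examples over to new cases. If we rely too heavily on the training data, the resulting tree tolerates little deviation and generalizes poorly, because the training set is only a sample of the full data and cannot reflect all of its characteristics.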

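Pruning methods:

Pre-pruning: prune while the tree is being constructed. Each node is evaluated during construction; if splitting a node brings no accuracy improvement on the validation set, the split is pointless, so the current node is kept as a leaf and is not split.
Post-pruning: prune after the tree has been generated. This usually starts at the leaf nodes and works upward level by level, evaluating each node. If removing a node's subtree changes classification accuracy very little compared with keeping it, or improves accuracy on the validation set, that subtree can be pruned. To prune, replace the node's subtree with a single leaf node whose class label is the most frequent class in that subtree.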
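The post-pruning procedure above can be sketched as follows. This is a minimal illustration, not the article's implementation: the class `PostPruning`, its `Node` type, and the rule "keep the pruned version whenever validation accuracy does not drop" are assumptions made for the example. Each internal node is assumed to already store the majority class of its subtree in `label`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of bottom-up post-pruning against a validation set.
final class PostPruning {

    static final class Node {
        String attribute;                 // attribute tested here (unused on a leaf)
        String label;                     // leaf: predicted class; internal: majority class of the subtree
        Map<String, Node> children = new HashMap<>();

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    static Node leaf(String label) {
        Node node = new Node();
        node.label = label;
        return node;
    }

    static Node internal(String attribute, String majorityLabel) {
        Node node = new Node();
        node.attribute = attribute;
        node.label = majorityLabel;
        return node;
    }

    // Follows branches until a leaf; falls back to the majority class on an unseen value.
    static String classify(Node node, Map<String, String> sample) {
        while (!node.isLeaf()) {
            Node next = node.children.get(sample.get(node.attribute));
            if (next == null) {
                return node.label;
            }
            node = next;
        }
        return node.label;
    }

    static double accuracy(Node root, List<Map<String, String>> samples, List<String> labels) {
        if (samples.isEmpty()) {
            return 0.0;
        }
        int correct = 0;
        for (int i = 0; i < samples.size(); i++) {
            if (labels.get(i).equals(classify(root, samples.get(i)))) {
                correct++;
            }
        }
        return (double) correct / samples.size();
    }

    // Prunes the children first, then tries replacing this node's subtree with a leaf.
    static void prune(Node node, Node root,
                      List<Map<String, String>> valSamples, List<String> valLabels) {
        if (node.isLeaf()) {
            return;
        }
        for (Node child : node.children.values()) {
            prune(child, root, valSamples, valLabels);
        }
        double before = accuracy(root, valSamples, valLabels);
        Map<String, Node> saved = node.children;
        node.children = new HashMap<>();  // node now acts as a leaf predicting its majority class
        double after = accuracy(root, valSamples, valLabels);
        if (after < before) {
            node.children = saved;        // pruning hurt validation accuracy: restore the subtree
        }
    }
}
```

Pre-pruning applies the same comparison earlier: before splitting a node during construction, compare validation accuracy with and without the split, and keep the node as a leaf when the split does not help.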

The following code builds a decision tree:

// Each public class below goes in its own .java file; @Data requires Lombok on the classpath.

// TreeNode.java
import java.util.HashMap;
import lombok.Data;

@Data
public class TreeNode {
    // Feature tested at this node; on a leaf node this holds the decision result
    private String feature;
    // key = condition (branch value), value = child node
    private HashMap<String, TreeNode> featuresMap;
}

// FeatureAndConditionBean.java
// Not shown in the original post; assumed here to be a plain holder for one step of a decision path.
import lombok.Data;

@Data
public class FeatureAndConditionBean {
    private String feature;
    private String condition;
}

// DecisionTreeUtils.java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import lombok.Data;

@Data
public class DecisionTreeUtils {

    // Node that the next step of the path is attached to during an insertion
    private TreeNode treeNode;
    // Root node returned to the caller
    private TreeNode treeRoot;

    // Inserts one decision path (root feature first, result last) and returns the root.
    public TreeNode addDecision(TreeNode root, List<FeatureAndConditionBean> featureAndConditionBeanList) {
        String beforeCondition = null;
        treeRoot = root;
        treeNode = root;
        for (FeatureAndConditionBean featureAndConditionBean : featureAndConditionBeanList) {
            beforeCondition = create(beforeCondition, featureAndConditionBean);
        }
        return treeRoot;
    }

    // Attaches one step under the branch beforeCondition of the current node and
    // returns this step's condition, which becomes the branch for the next step.
    private String create(String beforeCondition, FeatureAndConditionBean featureAndConditionBean) {
        if (beforeCondition == null) {
            // First step of a path: create the root only if the tree is still empty,
            // so an existing root's feature is not overwritten
            if (treeNode == null) {
                treeNode = new TreeNode();
                treeNode.setFeature(featureAndConditionBean.getFeature());
            }
            treeRoot = treeNode;
            return featureAndConditionBean.getCondition();
        }
        HashMap<String, TreeNode> treeNodeHashMap = treeNode.getFeaturesMap();
        if (treeNodeHashMap == null) {
            treeNodeHashMap = new HashMap<>(16);
            treeNode.setFeaturesMap(treeNodeHashMap);
        }
        // Reuse the child if this branch already exists, otherwise create it
        TreeNode child = treeNodeHashMap.get(beforeCondition);
        if (child == null) {
            child = new TreeNode();
            child.setFeature(featureAndConditionBean.getFeature());
            treeNodeHashMap.put(beforeCondition, child);
        }
        treeNode = child;
        return featureAndConditionBean.getCondition();
    }

    // Convenience factory for one step of a path; condition is null for the final result.
    private static FeatureAndConditionBean step(String feature, String condition) {
        FeatureAndConditionBean bean = new FeatureAndConditionBean();
        bean.setFeature(feature);
        bean.setCondition(condition);
        return bean;
    }

    public static void main(String[] args) {
        // Path 1: face = chubby -> eyes = big -> nose = straight -> "handsome"
        List<FeatureAndConditionBean> featureAndConditionBeanList = new ArrayList<>(10);
        featureAndConditionBeanList.add(step("face", "chubby"));
        featureAndConditionBeanList.add(step("eyes", "big"));
        featureAndConditionBeanList.add(step("nose", "straight"));
        featureAndConditionBeanList.add(step("handsome", null));
        DecisionTreeUtils decisionTreeUtils = new DecisionTreeUtils();
        TreeNode treeNode = decisionTreeUtils.addDecision(null, featureAndConditionBeanList);

        // Path 2: face = chubby -> eyes = small -> nose = flat -> "not handsome"
        List<FeatureAndConditionBean> featureAndConditionBeanList2 = new ArrayList<>(10);
        featureAndConditionBeanList2.add(step("face", "chubby"));
        featureAndConditionBeanList2.add(step("eyes", "small"));
        featureAndConditionBeanList2.add(step("nose", "flat"));
        featureAndConditionBeanList2.add(step("not handsome", null));
        treeNode = decisionTreeUtils.addDecision(treeNode, featureAndConditionBeanList2);

        // Prints the nested structure via the toString() generated by @Data
        System.out.println(treeNode);
    }
}
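The listing above only builds and prints the tree. As a complement, here is a hedged sketch of how such a tree could be queried to classify a sample. It is self-contained and does not use Lombok: `TreeQueryDemo`, its `Node` type, `addPath`, and `classify` are hypothetical names that mirror the feature-plus-branch-map layout of `TreeNode`, and the two paths are English renderings of the face/eyes/nose paths from the listing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: build the same two paths, then walk the tree to classify a sample.
final class TreeQueryDemo {

    static final class Node {
        final String feature;                              // tested feature, or the result on a leaf
        final Map<String, Node> children = new HashMap<>(); // condition -> child node

        Node(String feature) {
            this.feature = feature;
        }
    }

    // Each path element is {feature, condition}; the last element is {result, null}.
    static Node addPath(Node root, String[][] path) {
        if (root == null) {
            root = new Node(path[0][0]);
        }
        Node current = root;
        for (int i = 1; i < path.length; i++) {
            String condition = path[i - 1][1];
            String feature = path[i][0];
            // Reuse an existing branch or create a new child for this condition
            current = current.children.computeIfAbsent(condition, c -> new Node(feature));
        }
        return root;
    }

    // Follows the branch matching the sample's value at each node; null if no branch matches.
    static String classify(Node root, Map<String, String> sample) {
        Node current = root;
        while (!current.children.isEmpty()) {
            Node next = current.children.get(sample.get(current.feature));
            if (next == null) {
                return null;
            }
            current = next;
        }
        return current.feature;
    }

    public static void main(String[] args) {
        Node root = addPath(null, new String[][]{
            {"face", "chubby"}, {"eyes", "big"}, {"nose", "straight"}, {"handsome", null}});
        root = addPath(root, new String[][]{
            {"face", "chubby"}, {"eyes", "small"}, {"nose", "flat"}, {"not handsome", null}});

        Map<String, String> sample = Map.of("face", "chubby", "eyes", "big", "nose", "straight");
        System.out.println(classify(root, sample)); // prints "handsome"
    }
}
```

Returning `null` for an unmatched branch is one possible design choice; a tree that stores a majority class at every node could fall back to that class instead.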
