Article link: https://blog.csdn.net/weixin_35830789/article/details/122370392

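This post introduces the random forest algorithm in Python, an ensemble learning method built on decision trees. Building on the decision tree algorithm covered earlier in the series, it explains how a random forest works and provides working code along with its output.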

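Python Machine Learning: Random Forest Algorithm
This is the fifth article in a series on learning machine learning with Python.
The random forest algorithm builds on the decision tree algorithm: it trains many trees on random subsets of the data and combines their predictions. The base decision tree used here is the one implemented in the second article of this series.
Link: Python Machine Learning: Decision Tree Algorithm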
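Before the full class, the core idea can be sketched in a few lines. This is a minimal illustration, not the implementation from this series: it uses only the standard library, the one-split `Stump` class is a hypothetical stand-in for the `DecisionTree` module from the second article, and the names `fit_forest`, `predict`, and `majority` are made up for this example. It also assumes that rows are drawn with replacement, which is the usual bagging scheme but may differ from how this series samples them.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


class Stump:
    """One-split tree; a hypothetical stand-in for a full decision tree."""

    def fit(self, rows, labels):
        # Fallback when no split is possible: always predict the majority label.
        self.feature, self.threshold = 0, float("inf")
        self.left = self.right = majority(labels)
        best_errors = len(labels) + 1
        for f in range(len(rows[0])):
            for t in sorted({r[f] for r in rows}):
                left = [y for r, y in zip(rows, labels) if r[f] <= t]
                right = [y for r, y in zip(rows, labels) if r[f] > t]
                if not right:  # threshold at the maximum value splits nothing
                    continue
                l_lab, r_lab = majority(left), majority(right)
                errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
                if errors < best_errors:  # keep the split with the fewest mistakes
                    best_errors = errors
                    self.feature, self.threshold = f, t
                    self.left, self.right = l_lab, r_lab
        return self

    def predict_one(self, row):
        return self.left if row[self.feature] <= self.threshold else self.right


def fit_forest(rows, labels, n_trees=20, sample_ratio=0.5, seed=0):
    """Train n_trees learners, each on its own random subset of the rows."""
    rng = random.Random(seed)
    size = max(1, round(len(rows) * sample_ratio))
    forest = []
    for _ in range(n_trees):
        # Assumption: sample row indices with replacement (bootstrap style).
        idx = [rng.randrange(len(rows)) for _ in range(size)]
        forest.append(Stump().fit([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict(forest, rows):
    """Each tree votes; the most common vote is the forest's prediction."""
    return [majority([tree.predict_one(r) for tree in forest]) for r in rows]
```

Calling `fit_forest(rows, labels)` on a list of feature rows and a list of labels returns the list of trained learners, and `predict(forest, new_rows)` returns one voted label per row. Because every learner sees a different random subset, their individual mistakes tend to differ, and the vote smooths them out.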
The random forest code is as follows:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
from sklearn import preprocessing
import re
from collections import defaultdict
from sklearn.model_selection import train_test_split
import DecisionTree as de

# Random forest
class RandomForest:

    # Initialization
    def __init__(self, criterion='gini', max_depth=10, max_tree=20, random_sample=0.5):
        self.max_depth = max_depth  # maximum depth of each tree
        self.criterion = criterion  # split criterion: ID3, ID4.5 (C4.5), or gini
        self.max_tree = max_tree  # number of trees to grow
        self.random_sample = random_sample  # fraction of samples drawn for each tree
        self.forest = []  # the trained trees

    # Fit the forest; y must be a column vector of shape (n_samples, 1) so it can be stacked onto x
    def fit(self, x, y):
        data = np.hstack((x, y))
        for i in range(self.max_tree):
            ranData = self.randomSample(data)
            x2 = ranData[:, :-1]
            y2 = ranData[:, -1]
            model = de.DecisionTree(
                criterion=self.criterion, max_depth=self.max_depth)
            model.fit(x2, y2.reshape(len(y2), 1))
            self.forest.append(model)
        return self

    # Predict multiple samples
    def predict(self, x):
        return np.array([self.hat(i) for i in x])

    # Predict a single sample
    def hat(self, x):
        result