Machine Learning: A Random Forest Example

Carrie_Lei

Last modified: 2024-09-06 10:29:08

Category: Machine Learning. Tags: machine learning, random forest, artificial intelligence

First published: 2024-09-06 10:27:20

Article link: https://blog.csdn.net/finly4599/article/details/141953045

Below is an example of classification with a random forest (Random Forest), using the scikit-learn library in Python. We continue to use the classic Iris dataset for the demonstration.

Code:

# Import libraries
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

# Load the dataset
iris = load_iris()
X = iris.data  # features
y = iris.target  # labels

# Split the dataset into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create the Random Forest classifier
rf_classifier = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=42)

# Train the Random Forest model
rf_classifier.fit(X_train, y_train)

# Predict on the test set
y_pred = rf_classifier.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Random Forest model accuracy: {accuracy:.2f}")

# Print the classification report
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print("Classification report:\n", report)
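
As a quick sanity check on the data used above, the following sketch prints the dataset shape and the sizes of the two splits. It reuses the same `train_test_split` settings; the per-class count of the test set is an extra illustration that is not part of the original example.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# Same split settings as the example above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("Full dataset (samples, features):", X.shape)
print("Classes:", [str(name) for name in iris.target_names])
print("Train / test sizes:", X_train.shape[0], "/", X_test.shape[0])

# Without stratify=y, the per-class counts in the test set need not be equal.
test_counts = Counter(str(iris.target_names[label]) for label in y_test)
print("Test-set class counts:", dict(test_counts))
```

If balanced classes in the test set matter, passing `stratify=y` to `train_test_split` is one option; the original example does not do this.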

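Code explanation:

Loading the dataset:
- load_iris() loads the Iris dataset, a classic classification dataset with 150 samples, 4 features, and 3 classes.
Splitting the dataset:
- train_test_split splits the data into a training set (80%) and a test set (20%); random_state=42 makes the split reproducible.
Creating the Random Forest classifier:
- RandomForestClassifier is the random forest classifier implemented in scikit-learn.
- n_estimators=100: the forest uses 100 decision trees.
- max_depth=3: caps the depth of each decision tree at 3, which limits model complexity and reduces the risk of overfitting.
Training and prediction:
- The fit method trains the random forest model on the training set.
- The predict method predicts labels for the test set.
Model evaluation:
- accuracy_score computes the model's accuracy.
- classification_report prints a more detailed report with per-class precision, recall, and F1 score.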
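
To see how `max_depth` influences the model, one option is to compare cross-validated accuracy across a few depths. The sketch below is a minimal illustration: the list of candidate depths and the use of 5-fold cross-validation are assumptions chosen for this example, not settings from the original code.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Candidate depths are an illustrative choice; None lets each tree grow fully.
depths = [1, 2, 3, 5, None]
mean_scores = {}

for depth in depths:
    model = RandomForestClassifier(n_estimators=100, max_depth=depth, random_state=42)
    # 5-fold cross-validated accuracy (an integer cv uses stratified folds for classifiers).
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    mean_scores[depth] = scores.mean()
    print(f"max_depth={depth}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

On a dataset as small and easy as Iris the differences between depths are likely to be small, so this is better read as a template for tuning than as evidence for a particular depth.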

Results:

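The script prints the model's accuracy on the test set, followed by the classification report. For example (illustrative figures; your exact numbers may differ):

Random Forest model accuracy: 0.97
Classification report:
               precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        10
  versicolor       0.94      1.00      0.97        11
   virginica       1.00      0.89      0.94         9

    accuracy                           0.97        30
   macro avg       0.98      0.96      0.97        30
weighted avg       0.97      0.97      0.97        30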
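
To make the columns of the report concrete, the sketch below recomputes per-class precision, recall, and F1 from the confusion matrix and compares them with scikit-learn's `precision_recall_fscore_support`. It retrains the same model as in the example; the variable names are chosen here for illustration, and the formulas assume every class is predicted at least once (otherwise a division by zero occurs).

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=42)
y_pred = model.fit(X_train, y_train).predict(X_test)

# Rows are true classes, columns are predicted classes.
labels = [0, 1, 2]
cm = confusion_matrix(y_test, y_pred, labels=labels)
true_positives = np.diag(cm)

precision = true_positives / cm.sum(axis=0)  # correct / everything predicted as the class
recall = true_positives / cm.sum(axis=1)     # correct / everything truly in the class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_positives.sum() / cm.sum()

# Reference values from scikit-learn for comparison.
p_ref, r_ref, f_ref, support = precision_recall_fscore_support(
    y_test, y_pred, labels=labels
)

for i, name in enumerate(iris.target_names):
    print(f"{name:>10}: precision={precision[i]:.2f} recall={recall[i]:.2f} "
          f"f1={f1[i]:.2f} support={support[i]}")
print(f"accuracy={accuracy:.2f}")
```

The `support` column is simply the number of test samples that truly belong to each class, which is the row sum of the confusion matrix.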