Random forest case study: Otto product classification (random forest classification example)

Latest recommended article published 2024-05-10 15:03:16

2401_84166396


Category: Programmer. Tags: random forest, classification, algorithm

Article link: https://blog.csdn.net/2401_84166396/article/details/138374742



Check the shape of the data after undersampling (original data first, then the resampled data):

x.shape,y.shape

((61878, 93), (61878,))

x_resampled.shape,y_resampled.shape

((17361, 93), (17361,))
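
The sampler that produced `x_resampled` is not shown in this excerpt. As a rough illustration of what random undersampling does, here is a minimal NumPy-only sketch: every class is cut down to the size of the smallest class. The function name `random_undersample` and the toy data are hypothetical and not taken from the article. Note that 17361 = 9 × 1929, which is what one would expect if nine classes were each reduced to 1929 rows.

```python
import numpy as np

def random_undersample(x, y, seed=0):
    """Randomly keep n_min rows per class, where n_min is the smallest class size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    # For each class, draw n_min row indices without replacement
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    keep.sort()  # preserve the original row order
    return x[keep], y[keep]

# Toy data (hypothetical): 3 classes with 50, 20 and 5 rows, 4 features each
y_toy = np.repeat(["Class_1", "Class_2", "Class_3"], [50, 20, 5])
x_toy = np.arange(len(y_toy) * 4).reshape(len(y_toy), 4)

x_small, y_small = random_undersample(x_toy, y_toy)
print(x_small.shape, y_small.shape)  # (15, 4) (15,)
```

Undersampling discards rows from the larger classes, so it trades training data for class balance.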


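Check whether the classes are balanced after undersampling:

# pass the labels by keyword; this form works across seaborn versions
sns.countplot(x=y_resampled)
plt.show()


![Class counts after undersampling](https://img-blog.csdnimg.cn/8069ff2e3d49452d9c7f24dbb44f200c.png)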


(3) Convert the label values to numbers

y_resampled


![y_resampled before encoding](https://img-blog.csdnimg.cn/2c4a6b7fec244219852665cc2691f904.png)

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
y_resampled = le.fit_transform(y_resampled)
y_resampled


![y_resampled after encoding](https://img-blog.csdnimg.cn/bb791b640e914e6587f1ecbb13f0da37.png)

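
To make the encoding step concrete, here is a small sketch of how `LabelEncoder` maps string labels to integers and back. The four labels are hypothetical toy values, not the article's data.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy labels, for illustration only
labels = np.array(["Class_2", "Class_9", "Class_1", "Class_2"])

le = LabelEncoder()
encoded = le.fit_transform(labels)

print(le.classes_)                    # sorted unique labels
print(encoded)                        # each code is an index into le.classes_
print(le.inverse_transform(encoded))  # recovers the original strings
```

Keeping the fitted `le` around is useful: `inverse_transform` turns integer predictions back into the original class names.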
(4) Split the data

from sklearn.model_selection import train_test_split

x_train,x_test,y_train,y_test = train_test_split(x_resampled,y_resampled,test_size=0.2)
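
The split above is random on every run. As an optional variation (not what the article does), the sketch below fixes `random_state` for reproducibility and uses `stratify` so that each class keeps the same share in the test set. The toy arrays and the seed value 22 are arbitrary assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 9 classes with 10 rows each, 2 features
x_demo = np.arange(90 * 2).reshape(90, 2)
y_demo = np.repeat(np.arange(9), 10)

x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo,
    test_size=0.2,
    random_state=22,    # arbitrary seed so the split is repeatable
    stratify=y_demo,    # keep class proportions equal in train and test
)

print(x_tr.shape, x_te.shape)  # (72, 2) (18, 2)
print(np.bincount(y_te))       # 2 test rows per class
```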


### 4.3 Model training

from sklearn.ensemble import RandomForestClassifier

estimator = RandomForestClassifier(oob_score=True)
estimator.fit(x_train,y_train)
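
Setting `oob_score=True` asks the forest to score each training row using only the trees that did not see that row during bootstrapping. A minimal sketch on synthetic data (the dataset and hyperparameters here are assumptions, not the article's settings) shows where the result ends up:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic 3-class problem, for illustration only
x_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
forest.fit(x_demo, y_demo)

print(forest.oob_score_)                    # out-of-bag accuracy estimate
print(forest.oob_decision_function_.shape)  # (300, 3): per-class OOB probabilities
```

The out-of-bag estimate gives a rough validation signal without setting aside extra data, although it reports accuracy rather than log loss.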


### 4.4 Model evaluation


This task requires log loss as the evaluation metric.

y_pre = estimator.predict(x_test)
y_test,y_pre


![y_test and y_pre](https://img-blog.csdnimg.cn/a4b07c3c514e4374957d883aa3f489a9.png)



> Note: log loss is computed from one value per class for each sample, so the outputs need to be in one-hot form (one column per class).

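from sklearn.preprocessing import OneHotEncoder

# scikit-learn >= 1.2 uses sparse_output; older versions use sparse=False
one_hot = OneHotEncoder(sparse_output=False)
# fit once on the training labels so both arrays share the same class columns
one_hot.fit(y_train.reshape(-1,1))
y_pre = one_hot.transform(y_pre.reshape(-1,1))
y_test = one_hot.transform(y_test.reshape(-1,1))
y_test,y_pre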
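
The rest of the evaluation is cut off in this excerpt, so the following is only an illustrative sketch of how multiclass log loss behaves. The function `multiclass_log_loss` and the toy arrays are hypothetical. It compares one-hot hard predictions with per-class probabilities, the kind of output a classifier's `predict_proba` returns.

```python
import numpy as np

def multiclass_log_loss(y_true_onehot, y_prob, eps=1e-15):
    """Mean negative log-likelihood; each row of y_prob holds per-class probabilities."""
    p = np.clip(y_prob, eps, 1 - eps)      # avoid log(0)
    p = p / p.sum(axis=1, keepdims=True)   # renormalize rows after clipping
    return float(-np.sum(y_true_onehot * np.log(p)) / len(y_true_onehot))

# Three samples, three classes (toy values)
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

# Hard one-hot predictions: the third sample is misclassified
hard = np.array([[1, 0, 0], [0, 1, 0], [0, 1, 0]], dtype=float)

# Probabilistic predictions: the third sample is also wrong, but with some doubt
soft = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.5, 0.4]])

print(multiclass_log_loss(y_true, hard))  # roughly 11.5
print(multiclass_log_loss(y_true, soft))  # roughly 0.45
```

With hard predictions, a single mistake is treated as a prediction of probability zero for the true class, so it is penalized very heavily. Log loss is more informative when it is given probabilities.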


![One-hot encoded y_test and y_pre](https://img-blog.csdnimg.cn/bba47f1eeb2f49b09d9
