R C50 parameters: classification with the R C50 package gives a tree size of 0

Latest recommended article published on 2023-05-07 21:45:12

李应寰

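I want to use a decision tree to analyze how attractive the projects on a crowdfunding site (众筹网) are,

to see which industries' projects attract more interest, with larger amounts raised and more backers. However, the fitted decision tree reports a tree size of 0: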

information_model

Call:

C5.0.default(x = information_train[-5], y = information_train$倍数分类)

Classification Tree

Number of samples: 3567

Number of predictors: 12

Tree size: 0

Non-standard options: attempt to group attributes

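The code is as follows:

# 1. Data description and preparation

# Set the folder where the downloaded data is stored as the R working directory

setwd("D:\\凌筱玥\\研究生\\数据挖掘\\第二次作业")

# Import the data file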

information

information$已筹款分类

information$目标筹资分类

information$倍数分类

information$支持数分类

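# Use str() to inspect the variable types

str(information)

# Use table() to view the frequency distribution of the factor variable 行业 (industry)

table(information$行业)

# Use summary() to describe a numeric variable

summary(information$已筹款)

# 倍数分类 is a dummy variable: 1 means the multiple exceeds 1.5, 2 means the multiple is below 1.5

table(information$倍数分类)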

###########################################

###########################################

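# 2. Data preparation: create random training and test sets

# Here we randomly select 90% of the data as training data; the remainder is test data.

# Approach: put the rows in random order, then split the shuffled data at the 90% mark

set.seed(12345) # the number in parentheses is the random seed; the same seed reproduces the same random numbers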

information_rand

# Compare the 已筹款分类 variable before and after shuffling

head(information$已筹款分类) # before shuffling

head(information_rand$已筹款分类) # after shuffling

# Create the training and test sets

information_train

information_test

# Check the class proportions of 倍数分类 in the training and test sets

prop.table(table(information_train$倍数分类))

prop.table(table(information_test$倍数分类))
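The shuffle-and-split approach described in the comments above can be sketched as follows. This is a minimal sketch on made-up data, not the original code: the data frame `toy`, its columns, and the names `toy_rand`, `toy_train` and `toy_test` are all hypothetical.

```r
# Hypothetical toy data standing in for the real `information` data frame
set.seed(12345)
toy <- data.frame(
  id    = 1:100,
  label = factor(rep(c("1", "2"), times = 50))
)

# Shuffle the rows: ordering by uniform random numbers gives a random permutation
toy_rand <- toy[order(runif(nrow(toy))), ]

# Split the shuffled rows at the 90% mark
n_train   <- floor(0.9 * nrow(toy_rand))
toy_train <- toy_rand[seq_len(n_train), ]
toy_test  <- toy_rand[(n_train + 1):nrow(toy_rand), ]

# The class proportions should be roughly similar in both parts
prop.table(table(toy_train$label))
prop.table(table(toy_test$label))
```

Shuffling before splitting matters when the file is sorted (for example by industry or by date); without it the last 10% of rows may not resemble the rest.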

# 3. Train the model

# Use the C50 package to call the C5.0 algorithm for model training.

# install.packages('C50')

library(C50)
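A tree size of 0 suggests that C5.0 did not produce a usable tree, so it can help to check the inputs before training. The helper below is a hypothetical sketch in base R (the name `check_c50_inputs` is not part of C50). It looks for a few conditions that can cause trouble: an outcome that is not a factor, an outcome with fewer than two observed classes, and predictors with only one distinct value. It is not a confirmed diagnosis of the result above.

```r
# Hypothetical helper (not part of C50): returns a character vector of warnings
check_c50_inputs <- function(x, y) {
  issues <- character(0)

  # C5.0 expects the outcome to be a factor
  if (!is.factor(y)) {
    issues <- c(issues, "outcome is not a factor")
  }

  # With fewer than two observed classes there is nothing to split on
  if (length(unique(y[!is.na(y)])) < 2) {
    issues <- c(issues, "outcome has fewer than two observed classes")
  }

  # Predictors with a single distinct value carry no information
  constant <- vapply(
    x,
    function(col) length(unique(col[!is.na(col)])) < 2,
    logical(1)
  )
  if (any(constant)) {
    issues <- c(issues, paste("constant predictor:", names(x)[constant]))
  }

  issues
}

# Toy data, made up for illustration
toy_x <- data.frame(a = c(1, 2, 3, 4), b = c("k", "k", "k", "k"))
toy_y <- c(1, 2, 1, 2)  # numeric rather than factor, so it should be flagged

check_c50_inputs(toy_x, toy_y)
check_c50_inputs(toy_x["a"], factor(toy_y))
```

For the data in this post the call would be `check_c50_inputs(information_train[-5], information_train$倍数分类)`. With C50 loaded, `summary(information_model)` prints the raw C5.0 output, which is worth reading for any error message.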
