identify_zero_importance in the feature_selector package raises an error for a continuous target

Package source: GitHub - WillKoehrsen/feature-selector: Feature selector is a tool for dimensionality reduction of machine learning datasets

fs.identify_zero_importance(task = 'regression', eval_metric = 'l2', 
                            n_iterations = 10, early_stopping = True)

Running the code above returns:

ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.

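In a regression problem the dependent variable is continuous (here it ranges from -3 to +3), so the notion of a class with only one member should not apply at all. Looking at the function's source code, one line reads: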

train_features, valid_features, train_labels, valid_labels = train_test_split(features, labels, test_size = 0.15, stratify=labels)

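Here stratify=labels is hard-coded. That clearly does not suit a continuous variable: a stratified split treats every distinct label value as its own class, so nearly every "class" ends up with a single member. After removing the argument, the function runs normally.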
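The behaviour can be reproduced outside the package. The following is a minimal sketch on synthetic data, not the package's own code: the array shapes, the random seed, and the helper name `split` are assumptions made for illustration. It shows that `train_test_split` rejects a continuous target when `stratify` is set and accepts it when `stratify` is left out.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 200 rows, 5 features,
# and a continuous target in [-3, 3] like the one described in the post.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 5))
labels = rng.uniform(-3.0, 3.0, size=200)


def split(features, labels, stratify):
    """Hypothetical helper: 85/15 split, optionally stratified on the labels."""
    return train_test_split(
        features, labels, test_size=0.15, random_state=0,
        stratify=labels if stratify else None,
    )


# With stratify=labels every distinct float is treated as its own class,
# so scikit-learn raises ValueError.
try:
    split(features, labels, stratify=True)
    stratified_error = None
except ValueError as exc:
    stratified_error = str(exc)

# Without stratification the same call succeeds.
train_features, valid_features, train_labels, valid_labels = split(
    features, labels, stratify=False
)

print("stratified split error:", stratified_error)
print("train rows:", len(train_labels), "validation rows:", len(valid_labels))
```

Stratification only makes sense when the labels are discrete classes, so for a regression task it seems reasonable to drop the argument rather than work around the error.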

In addition, the following warning appears when the code runs:

UserWarning: 'early_stopping_rounds' argument is deprecated and will be removed in a future release of LightGBM. Pass 'early_stopping()' callback via 'callbacks' argument instead.

Changing the code at line 299 to the following fixes it:

if _early_stopping:
    # stratify=labels removed: stratification does not work with a continuous target
    train_features, valid_features, train_labels, valid_labels = train_test_split(features, labels, test_size = 0.15)

    # Train the model with early stopping
    model.fit(train_features, train_labels, eval_metric = eval_metric,
              eval_set = [(valid_features, valid_labels)],
              callbacks = [lgb.log_evaluation(period=100), lgb.early_stopping(stopping_rounds=30)])

Note that the early_stopping parameter has to be renamed to _early_stopping, otherwise this also raises an error.
