Fixing the LightGBM dump_model error

LightGBM's dump_model converts a trained model into a JSON dictionary that is easy to inspect and manipulate. Occasionally, however, the call fails with:

JSONDecodeError: Expecting ',' delimiter: line 7 column 95 (char 209)

The example below reproduces the error and presents a fix.


from tsfresh import extract_features
from tsfresh.feature_extraction import ComprehensiveFCParameters
import lightgbm as lgb
import pandas as pd
import numpy as np

# 500 random time series of length 50 (one per column) with random binary labels
samples_timeseries = np.random.random((50, 500))
y = np.random.randint(2, size=500)
ts_fc_comprehensive_settings = ComprehensiveFCParameters()
df = pd.DataFrame(samples_timeseries)
df.loc[:, 'col_id'] = 0

# Extract tsfresh features for each series; every call yields one row of features
feature_rows = []
for i in range(500):
    timeseries_container = df.loc[:, [i, 'col_id']]
    timeseries_container.columns = [0, 'col_id']
    statics_feats_position = extract_features(timeseries_container=timeseries_container,
                                              column_id='col_id',
                                              column_value=0,
                                              default_fc_parameters=ts_fc_comprehensive_settings,
                                              n_jobs=1,
                                              disable_progressbar=True)
    feature_rows.append(statics_feats_position)

# Concatenate once instead of growing X inside the loop
X = pd.concat(feature_rows, axis=0)
X.sample()


Now train a model and try to call dump_model:

dtrain = lgb.Dataset(X,y,free_raw_data=False,feature_name='auto',
                         categorical_feature='auto')
gbm = lgb.train(params={},num_boost_round=5,train_set=dtrain)
gbm.dump_model()

This raises the JSONDecodeError shown above.
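The same message can be produced with Python's standard json module alone. The sketch below is only an illustration: the feature name is a made-up tsfresh-style name, and the string concatenation stands in for any code that builds JSON text without escaping quotes. It does not claim to show how LightGBM builds its dump internally.

```python
import json

# Illustrative tsfresh-style feature name containing double quotes (an assumption).
feature_name = 'value__fft_coefficient__attr_"real"__coeff_0'

# Naive string building: the inner quotes end the JSON string early.
broken = '{"feature_names": ["' + feature_name + '"]}'

try:
    json.loads(broken)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = str(exc)

# json.dumps escapes the quotes, so the round trip succeeds.
escaped = json.dumps({"feature_names": [feature_name]})
round_trip = json.loads(escaped)["feature_names"][0]

print(error_message)
print(round_trip == feature_name)
```

The parser reads the string up to the first inner quote and then expects a comma or a closing bracket, which is why the message complains about a missing ',' delimiter.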


Testing showed that the error is caused by double quotes (") in the column names of the DataFrame X. Renaming the columns avoids it:

X_rename = X.copy()
X_rename.columns = ['col_%03d'%i for i in range(len(X_rename.columns))]
dtrain = lgb.Dataset(X_rename,y,free_raw_data=False,feature_name='auto',
                         categorical_feature='auto')
gbm = lgb.train(params={},num_boost_round=5,train_set=dtrain)
gbm.dump_model()

dump_model now completes without error.
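Renaming columns to col_000, col_001, ... discards the original feature names. A possible variation, sketched here with a hypothetical helper make_safe_names (not part of LightGBM, pandas, or tsfresh), keeps a readable name and a mapping back to the original. The sample names are illustrative.

```python
import re


def make_safe_names(columns):
    """Return (safe_names, mapping), where mapping[safe_name] is the original name."""
    safe_names = []
    mapping = {}
    for i, original in enumerate(columns):
        # Replace every character other than letters, digits and underscore.
        cleaned = re.sub(r"[^0-9A-Za-z_]", "_", str(original))
        # The index prefix keeps names unique even if two cleaned names collide.
        safe = "f%03d_%s" % (i, cleaned)
        safe_names.append(safe)
        mapping[safe] = original
    return safe_names, mapping


# Illustrative tsfresh-style names (assumptions, not taken from a real run).
names = ['0__fft_coefficient__attr_"real"__coeff_0', "0__mean"]
safe_names, name_map = make_safe_names(names)
print(safe_names)
print(name_map[safe_names[0]])
```

With a DataFrame, the result would be assigned as X_rename.columns = safe_names, and name_map can later translate feature names found in the dumped model back to the tsfresh names.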



To verify the cause, print the column names that trigger the exception:

normal_name = []
anomaly_name = []
for column in X.columns:
    feats = X.loc[:, column]
    # Skip features with fewer than 4 distinct values
    if len(feats.unique()) < 4:
        continue
    # Train on this single feature so that a failure can be attributed to its name
    dtrain = lgb.Dataset(pd.DataFrame(feats),
                         y, free_raw_data=False, feature_name='auto',
                         categorical_feature='auto')
    gbm = lgb.train(params={}, num_boost_round=5, train_set=dtrain)
    try:
        gbm.dump_model()
        normal_name.append(column)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        anomaly_name.append(column)
print('normal:')
print(normal_name)

print('\nanomaly:')
print(anomaly_name)
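If the double quote really is the only trigger, the suspicious columns can also be found without training any model, simply by scanning the names. The helper find_quoted_names below is hypothetical and the sample names are illustrative. It is meant as a quick cross-check against anomaly_name, not a replacement for the test above.

```python
def find_quoted_names(columns):
    """Return the column names that contain a double quote."""
    return [name for name in columns if '"' in str(name)]


# Illustrative tsfresh-style names (assumptions, not taken from a real run).
columns = [
    "0__mean",
    '0__fft_coefficient__attr_"abs"__coeff_1',
    "0__standard_deviation",
    '0__agg_linear_trend__attr_"slope"__chunk_len_5__f_agg_"max"',
]
quoted = find_quoted_names(columns)
print(quoted)
```

Calling find_quoted_names(X.columns) and comparing the result with anomaly_name should show whether every failing column contains a quote. Columns skipped by the distinct-value filter will appear only in the name scan.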
