Fixing the xgboost error AttributeError: 'DMatrix' object has no attribute 'handle'

Latest recommended article published on 2024-09-13 23:06:46

水...琥珀

Category: Python in practice, problem solving

Article link: https://blog.csdn.net/shuihupo/article/details/83239697

xgboost error: AttributeError: 'DMatrix' object has no attribute 'handle'

sys:1: DtypeWarning: Columns (65) have mixed types. Specify dtype option on import or set low_memory=False.
....

if self.handle is not None:
AttributeError: 'DMatrix' object has no attribute 'handle'
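The `if self.handle is not None:` line in the traceback reads like cleanup code, not the place where the data was rejected. One plausible reading, which the original post does not confirm, is that the constructor failed before it ever assigned `handle`, and the AttributeError is only a side effect of destroying the half-built object. The toy class below is a hypothetical stand-in (it is not xgboost code) that reproduces the pattern: the real error is the `ValueError`, and Python additionally reports an ignored AttributeError on stderr.

```python
class ToyMatrix:
    """Hypothetical stand-in for a wrapper class; this is not xgboost code."""

    def __init__(self, dtypes):
        # Validation runs before the handle attribute is assigned.
        bad = [d for d in dtypes if d not in ("int64", "float64", "bool")]
        if bad:
            raise ValueError(f"unsupported dtypes: {bad}")
        self.handle = object()

    def __del__(self):
        # If __init__ raised, self.handle was never set, so this lookup
        # raises AttributeError while the half-built object is destroyed.
        if self.handle is not None:
            self.handle = None


def build(dtypes):
    """Return 'ok', or the message of the error raised by the constructor."""
    try:
        ToyMatrix(dtypes)
    except ValueError as exc:
        return str(exc)
    return "ok"


# The ValueError is the real cause; the AttributeError from __del__ is
# reported separately on stderr as an ignored exception.
message = build(["float64", "object"])
print(message)
```

If this reading applies, the message to act on is the earlier one about the data, which matches the post's advice to look at the dtype warning first.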

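When this error appears, look at the sys:1 warning first. It says that one column (65) had mixed types while the data was being read, and it suggests a fix. Add this parameter when reading the data: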

low_memory=False

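import pandas as pd
# train_path: path to the training CSV (not shown in the original post)
traindata_df = pd.read_csv(train_path, sep=',', index_col='user_id', low_memory=False)
traindata_df.info()  # info() prints its report itself and returns None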

....
remove_caller_fee               714686 non-null float64
logremove_caller_fee            713857 non-null object
dtypes: float64(13), int64(50), object(1)
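The info() output above still lists one object column. A minimal sketch for finding such columns and counting which Python types they actually hold is shown below. The sample frame and the value "bad" are made up for illustration; the real data is not shown in the post.

```python
import pandas as pd

# Made-up sample standing in for the real training data.
df = pd.DataFrame({
    "remove_caller_fee": [1.5, 2.0, 3.25],
    "logremove_caller_fee": [0.4, "bad", 1.2],  # floats mixed with a string
})

# Columns that pandas could not store as a numeric dtype.
object_cols = df.select_dtypes(include="object").columns.tolist()

# For each such column, count the Python type of every value.
type_counts = {
    col: df[col].map(lambda v: type(v).__name__).value_counts().to_dict()
    for col in object_cols
}

print(object_cols)
print(type_counts)
```

A column that reports both `float` and `str` is the kind of mixed column the DtypeWarning refers to.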

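The data now loads, but the xgb model still raises the error. The reason is that the training data cannot contain object columns; they must be float or int, and there is no way around that. However, converting with float(x) or traindata_df["logremove_caller_fee"].astype(float) does not work either. I won't paste the specific errors here.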

# An attempt that does not work: traindata_df["logremove_caller_fee"] = list(map(lambda x: float(x), traindata_df["logremove_caller_fee"]))
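The post does not show why the strict conversions failed. One common cause is that the column contains text that cannot be parsed as a number. The sketch below assumes that cause and uses made-up values: it shows that `astype(float)` raises on such a value, and uses `pd.to_numeric(..., errors="coerce")` to list the offending entries.

```python
import pandas as pd

# Made-up values; "bad" stands in for whatever non-numeric text the real column holds.
col = pd.Series([0.4, "1.2", "bad", float("nan")], dtype=object,
                name="logremove_caller_fee")

# A strict cast raises as soon as it meets a value it cannot parse.
try:
    col.astype(float)
    strict_cast_failed = False
except ValueError:
    strict_cast_failed = True

# A lenient parse turns unparseable values into NaN instead of raising.
converted = pd.to_numeric(col, errors="coerce")

# Values that were present originally but became NaN are the offenders.
offending = col[converted.isna() & col.notna()]

print(strict_cast_failed)
print(offending.tolist())
```

Inspecting `offending` on the real column shows which raw values would be lost to NaN before deciding to coerce them.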

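The trick is the following command. It converts object columns to numeric values, and it does so sensibly: the column that was a mix of floats and objects becomes uniformly float, and columns that were not object are left untouched: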

traindata_df = traindata_df.convert_objects(convert_numeric=True)

# Inspect the dtypes again after the conversion
traindata_df.info()

...
mean_service_caller_time_fee    714686 non-null float64
remove_caller_fee               714686 non-null float64
logremove_caller_fee            713855 non-null float64
dtypes: float64(14), int64(50)
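In the output above, the non-null count of logremove_caller_fee drops from 713857 to 713855, which is consistent with two values having been turned into NaN by the conversion. Note that `convert_objects` was deprecated and then removed in pandas 1.0, so the command above fails on current releases. A minimal sketch of an alternative using `pd.to_numeric` follows. The helper name `coerce_object_columns` and the sample data are hypothetical, and the helper coerces every object column, so a genuinely textual column would become all NaN.

```python
import pandas as pd


def coerce_object_columns(df):
    """Return a copy in which every object column is parsed as numeric.

    Values that cannot be parsed become NaN (errors="coerce").
    """
    out = df.copy()
    for name in out.select_dtypes(include="object").columns:
        out[name] = pd.to_numeric(out[name], errors="coerce")
    return out


# Made-up sample standing in for the real training data.
sample = pd.DataFrame({
    "remove_caller_fee": [1.5, 2.0, 3.25],
    "logremove_caller_fee": [0.4, "bad", "1.2"],
})

cleaned = coerce_object_columns(sample)
print(cleaned.dtypes)
print(cleaned["logremove_caller_fee"].isna().sum())
```

On the real data this would be called as `traindata_df = coerce_object_columns(traindata_df)`. Passing an explicit list of columns instead of all object columns is the safer choice when some columns are meant to stay as text.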