xgb_enc_1 = OneHotEncoder()
xgb_enc_2 = OneHotEncoder()
xgb_enc_1.fit(model_1.apply(train_gb))
xgb_enc_2.fit(model_2.apply(train_gb))
# transform returns a sparse matrix; train_lr is a dense numpy array
temp_1 = xgb_enc_1.transform(model_1.apply(train_lr))
temp_2 = xgb_enc_2.transform(model_2.apply(train_lr))
temp_3 = train_lr
temp_1
Out[24]:
<256x1624 sparse matrix of type '<class 'numpy.float64'>'
with 217600 stored elements in Compressed Sparse Row format>
temp_2
Out[25]:
<256x1977 sparse matrix of type '<class 'numpy.float64'>'
with 217600 stored elements in Compressed Sparse Row format>
temp_3.shape
Out[31]: (256, 14)
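The objects above (`model_1`, `model_2`, `train_gb`, `train_lr`) come from earlier in the post. A self-contained sketch of the same leaf-encoding step, using scikit-learn's `GradientBoostingClassifier` and synthetic data as stand-ins for the original XGBoost models and dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the original data: 14 features, split in half
# so one part trains the GBDT and the other feeds the LR stage.
X, y = make_classification(n_samples=512, n_features=14, random_state=0)
X_gb, X_lr, y_gb, y_lr = train_test_split(X, y, test_size=0.5, random_state=0)

gbdt = GradientBoostingClassifier(n_estimators=50, random_state=0)
gbdt.fit(X_gb, y_gb)

# apply() returns the leaf index each sample lands in, per tree; for
# GradientBoostingClassifier the shape is (n_samples, n_estimators,
# n_classes), so we flatten the trailing dimensions to get a 2D array.
leaves_gb = gbdt.apply(X_gb).reshape(X_gb.shape[0], -1)
leaves_lr = gbdt.apply(X_lr).reshape(X_lr.shape[0], -1)

# One-hot encode the leaf indices; handle_unknown="ignore" guards
# against leaf values absent from the fitting split.
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(leaves_gb)
leaf_features = enc.transform(leaves_lr)  # sparse CSR matrix
print(leaf_features.shape)
```

As in the transcript, `transform` yields a sparse matrix with one column per (tree, leaf) pair, which is why the column count grows into the thousands.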
Concatenating directly with np.hstack:
train_lr_ext_2 = np.hstack((temp_1,temp_3))
raises:
ValueError: all the input arrays must have same number of dimensions
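The error is easy to reproduce with toy data: `np.hstack` does not understand scipy sparse matrices, so it wraps the CSR matrix in a 0-d object array, and the dimension check against the 2-D dense array fails:

```python
import numpy as np
from scipy import sparse

sp = sparse.csr_matrix(np.eye(2))  # 2x2 sparse matrix
dense = np.ones((2, 2))            # 2x2 dense array

# np.hstack coerces the sparse matrix into an object array of
# different dimensionality, so concatenation raises ValueError.
try:
    np.hstack((sp, dense))
    raised = False
except ValueError as e:
    raised = True
    print("ValueError:", e)
```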
NumPy sees the sparse matrix and the dense matrix as having different numbers of dimensions. There are two ways to solve this:
- Use the todense() method
a = temp_1.todense()
train_lr_ext_2 = np.hstack((a,temp_3))
train_lr_ext_2.shape
Out[34]: (256, 1638)
- Use scipy.sparse's hstack() function for the concatenation
from scipy.sparse import hstack
b = hstack((temp_1,temp_3))
b.shape
Out[39]: (256, 1638)
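A minimal self-contained demo of mixing a sparse and a dense block with `scipy.sparse.hstack`; the result stays sparse, and `format="csr"` requests CSR instead of the default COO:

```python
import numpy as np
from scipy import sparse

sp = sparse.csr_matrix(np.eye(2))              # 2x2 sparse block
dense = np.arange(6, dtype=float).reshape(2, 3)  # 2x3 dense block

# scipy's hstack accepts a mix of sparse and dense blocks and
# returns a sparse result, never materializing a dense copy.
combined = sparse.hstack((sp, dense), format="csr")
print(combined.shape)  # (2, 5)
```

Unlike `todense()`, this avoids materializing the full dense matrix, which matters once the one-hot leaf features reach thousands of columns; scikit-learn's `LogisticRegression` accepts the sparse matrix directly.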