[Solved] ValueError: Incompatible dimension for X and Y matrices: X.shape[1] == 11 while Y.shape[1] == 1

Latest recommended article published on 2024-07-16 14:28:29

彤小彤_tong

Tags: python, artificial intelligence

Article link: https://blog.csdn.net/m0_64669072/article/details/128311380

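I was using CountVectorizer for text feature extraction. After converting the news articles into vectors, the KNN step failed with this error:
ValueError: Incompatible dimension for X and Y matrices: X.shape[1] == 11 while Y.shape[1] == 1
The original code was: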

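from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(stop_words=stop_words)
# Learn the vocabulary from the training text and vectorize it
X_train = vectorizer.fit_transform(data_train[3])
# Bug: fit_transform refits on the test text and learns a different vocabulary
X_test = vectorizer.fit_transform(data_test[1])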

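The cause turned out to be incompatible matrices: printing the shapes of X_train and X_test showed that the two have different numbers of columns. The second fit_transform call refits the vectorizer on the test set, so it builds a new vocabulary and the test matrix no longer shares the feature space of the training matrix.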
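To make the mismatch concrete, here is a minimal sketch that reproduces it on toy data. The two small document lists are hypothetical stand-ins for data_train[3] and data_test[1], and the stop-word list is left out for brevity.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpora standing in for data_train[3] and data_test[1]
train_docs = ["stocks rise as markets rally", "team wins the final match"]
test_docs = ["markets fall"]

vectorizer = CountVectorizer()

# Learns a vocabulary of 10 distinct words from the training documents
X_train = vectorizer.fit_transform(train_docs)

# Calling fit_transform again discards that vocabulary and learns a new one
# from the test documents alone (only 2 distinct words here)
X_test_wrong = vectorizer.fit_transform(test_docs)

print(X_train.shape)       # (2, 10)
print(X_test_wrong.shape)  # (1, 2)
```

The column counts differ because each fit builds its own vocabulary, which is the same kind of mismatch KNN reports when it tries to compute distances between the two matrices.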

Solution

Change the original code to:

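from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(stop_words=stop_words)
# Learn the vocabulary from the training text and vectorize it
X_train = vectorizer.fit_transform(data_train[3])
# Reuse the training vocabulary: transform only, do not refit
X_test = vectorizer.transform(data_test[1])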

Checking the shapes again shows that both matrices now have the same number of columns. After rerunning the training, the problem was resolved.
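One way to make this mistake harder to repeat is to wrap the vectorizer and the classifier in a scikit-learn pipeline, so the vectorizer is fitted only when fit is called and is merely applied at prediction time. The following is a sketch under assumptions, not the code from this post: the documents, the labels, and the choice of n_neighbors=1 are all made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the real post uses data_train[3] and data_test[1]
train_docs = [
    "stocks rise as markets rally",
    "team wins the final match",
    "markets fall on weak earnings",
    "coach praises the team after the match",
]
train_labels = ["finance", "sports", "finance", "sports"]
test_docs = ["markets fall sharply", "the team loses the match"]

# fit() fits the vectorizer on the training text only;
# predict() calls transform() on new text, never fit_transform()
model = make_pipeline(CountVectorizer(), KNeighborsClassifier(n_neighbors=1))
model.fit(train_docs, train_labels)
predictions = model.predict(test_docs)

# Sanity check: both matrices share the training vocabulary's column count
vec = model.named_steps["countvectorizer"]
X_train = vec.transform(train_docs)
X_test = vec.transform(test_docs)
print(X_train.shape[1] == X_test.shape[1])  # True
print(predictions)
```

With a pipeline there is no separate X_test step to get wrong, which is the main reason to prefer it over calling the vectorizer by hand.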