Regarding keeping categorical features represented as an array of integers, i.e. [1, 2, 3, 4, 5], we have the following:
Such integer representation can not be used directly with scikit-learn
estimators, as these expect continuous input, and would interpret the
categories as being ordered, which is often not desired (i.e. the set
of browsers was ordered arbitrarily). One possibility to convert
categorical features to features that can be used with scikit-learn
estimators is to use a one-of-K or one-hot encoding, which is
implemented in OneHotEncoder. This estimator transforms each
categorical feature with m possible values into m binary features,
with only one active.
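To make the quoted description concrete, here is a minimal plain-Python sketch of the idea: each of the m distinct values gets its own binary column, and exactly one column is set per sample. The helper name one_hot and the choice to order columns by sorted category value are assumptions made for illustration. This is not how OneHotEncoder is implemented internally.

```python
def one_hot(values):
    """Encode a list of categorical values as rows of 0/1 indicators.

    Hypothetical helper for illustration only, not part of scikit-learn.
    """
    # Assumption of this sketch: columns follow the sorted distinct values.
    categories = sorted(set(values))
    column_of = {cat: i for i, cat in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)   # m binary features for m categories
        row[column_of[value]] = 1     # exactly one active feature per sample
        rows.append(row)
    return categories, rows


categories, encoded = one_hot([1, 2, 3, 4, 5])
for row in encoded:
    print(row)
```

With five distinct input values the result has five columns, and each row contains a single 1 marking which category that sample belongs to.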
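So you can use one-hot encoding to convert the array into 5 new columns (5 in this example, because you have 5 possible values).
Here is some working code. The input is a single column of categorical values [1, 2, 3, 4, 5], and the output is a matrix with 5 columns, one for each of the 5 possible options:
from sklearn.preprocessing import OneHotEncoder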
enc = OneHotEncoder()
enc.fit([[1],[2],