to_categorical


hello~bye~


Original link: https://blog.csdn.net/moyu123456789/article/details/83444140


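1. What to_categorical does

Simply put, to_categorical converts a class vector into a binary matrix (one containing only 0s and 1s). In other words, it turns the original class vector into its one-hot encoding. Let's start with some code to see the effect: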

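# Older Keras releases also exposed this function in keras.utils.np_utils
from keras.utils import to_categorical

# Define the class vector
b = [0, 1, 2, 3, 4, 5, 6, 7, 8]

# Call to_categorical to convert b, treating it as 9 classes
b = to_categorical(b, 9)

print(b)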



The output is:

[[1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1.]]

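to_categorical is a utility function provided by Keras. As the output above shows, each value in the original class vector becomes one row vector of the matrix, and the columns from left to right stand for classes 0, 1, 2, ..., 8. For example, 2 is represented as [0. 0. 1. 0. 0. 0. 0. 0. 0.]: only the third position is 1 (the active bit), and all the others are 0.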
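Because each row contains exactly one 1, the original label can be read back as the column index of that 1. The following is a minimal sketch using NumPy only. The matrix is built with np.eye as a stand-in for the to_categorical output above (an assumption made so the snippet runs without Keras), and the variable names are illustrative.

```python
import numpy as np

# Stand-in for the matrix printed above: row i of the 9x9 identity
# matrix is the one-hot vector for class i.
one_hot = np.eye(9, dtype="float32")

# Every row contains exactly one 1, so each row sums to 1.
row_sums = one_hot.sum(axis=1)

# The position of the 1 in each row is the original class label.
recovered = np.argmax(one_hot, axis=1)

print(row_sums)
print(recovered)
```

The same argmax step is commonly applied to a model's predicted probability rows to turn them back into class labels.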

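2. Introduction to one-hot encoding

One-hot encoding is also known as one-bit-effective encoding. The code example above does exactly this: it converts a class vector into a one-hot class matrix, that is, the following mapping: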

0 1 2 3 4 5 6 7 8

0 => [1. 0. 0. 0. 0. 0. 0. 0. 0.]
1 => [0. 1. 0. 0. 0. 0. 0. 0. 0.]
2 => [0. 0. 1. 0. 0. 0. 0. 0. 0.]
3 => [0. 0. 0. 1. 0. 0. 0. 0. 0.]
4 => [0. 0. 0. 0. 1. 0. 0. 0. 0.]
5 => [0. 0. 0. 0. 0. 1. 0. 0. 0.]
6 => [0. 0. 0. 0. 0. 0. 1. 0. 0.]
7 => [0. 0. 0. 0. 0. 0. 0. 1. 0.]
8 => [0. 0. 0. 0. 0. 0. 0. 0. 1.]

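Here is a question to think about: if you had to write the conversion from a class vector to one-hot encoding yourself, how would you do it?

Below is a small, rough example I wrote myself, for reference only: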

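import numpy as np

def convert_to_one_hot(labels, num_classes):
    # Accept plain Python lists as well as arrays
    labels = np.asarray(labels, dtype=int)
    # Number of labels, i.e. the number of rows in the output
    num_labels = len(labels)
    # Create the one-hot matrix, initially all zeros
    labels_one_hot = np.zeros((num_labels, num_classes))
    # Start position of each row in the "flattened" matrix
    index_offset = np.arange(num_labels) * num_classes
    # For every row, set the entry at (row start + label) to 1
    labels_one_hot.flat[index_offset + labels] = 1
    return labels_one_hot

# Test
b = [2, 4, 6, 8, 6, 2, 3, 7]
print(convert_to_one_hot(b, 9))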



Test result:

[[0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1. 0.]]
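As an alternative to the flat-index approach, indexing an identity matrix with the labels gives the same matrix in one line. This is only a sketch and not how Keras does it: `one_hot_eye` is a name made up for this illustration, and it assumes every label is an integer in the range 0 to num_classes - 1. The flat-index construction is repeated inline so the two results can be compared.

```python
import numpy as np

def one_hot_eye(labels, num_classes):
    """Hypothetical helper: pick row `label` of the identity matrix for each label."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes)[labels]

b = [2, 4, 6, 8, 6, 2, 3, 7]
result = one_hot_eye(b, 9)

# Flat-index construction, repeated here for comparison:
# row i starts at flat position i * 9, and the 1 goes at offset b[i].
expected = np.zeros((len(b), 9))
expected.flat[np.arange(len(b)) * 9 + np.asarray(b)] = 1

print(np.array_equal(result, expected))  # True
```

The identity-matrix version is shorter, but it builds a full num_classes x num_classes matrix first, so the flat-index version may be preferable when the number of classes is large.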

3. Source code walkthrough

to_categorical is defined in utils/np_utils.py in Keras. The source is as follows:


def to_categorical(y, num_classes=None, dtype='float32'):
    """Converts a class vector (integers) to binary class matrix.

    E.g. for use with categorical_crossentropy.

    # Arguments
        y: class vector to be converted into a matrix
            (integers from 0 to num_classes).
        num_classes: total number of classes.
        dtype: The data type expected by the input, as a string
            (`float32`, `float64`, `int32`...)

    # Returns
        A binary matrix representation of the input. The classes axis
        is placed last.

    # Example

    ```python
    # Consider an array of 5 labels out of a set of 3 classes {0, 1, 2}:
    > labels
    array([0, 2, 1, 2, 0])
    # `to_categorical` converts this into a matrix with as many
    # columns as there are classes. The number of rows
    # stays the same.
    > to_categorical(labels)
    array([[ 1.,  0.,  0.],
           [ 0.,  0.,  1.],
           [ 0.,  1.,  0.],
           [ 0.,  0.,  1.],
           [ 1.,  0.,  0.]], dtype=float32)
    ```
    """
    # Convert the input y to an integer array
    y = np.array(y, dtype='int')
    # Record the shape of the input
    input_shape = y.shape
    # Drop a trailing dimension of size 1, e.g. (n, 1) -> (n,)
    if input_shape and input_shape[-1] == 1 and len(input_shape) > 1:
        input_shape = tuple(input_shape[:-1])
    # Flatten y to a 1-D array
    y = y.ravel()
    # If the caller did not pass the number of classes, infer it from the largest label
    if not num_classes:
        num_classes = np.max(y) + 1
    n = y.shape[0]
    # Create an n x num_classes matrix of zeros
    categorical = np.zeros((n, num_classes), dtype=dtype)
    # np.arange(n) gives the row index of each sample; y gives its column index
    categorical[np.arange(n), y] = 1
    # Reshape so the output has the input's shape with the class axis appended
    output_shape = input_shape + (num_classes,)
    categorical = np.reshape(categorical, output_shape)
    return categorical
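Two details of this source are easy to miss: a trailing dimension of size 1 is dropped before encoding, and the class axis is appended last for inputs of any shape. The sketch below re-traces those steps with NumPy alone so the shapes can be inspected without installing Keras. `one_hot_like_keras` is a hypothetical name, and the function is a condensed paraphrase of the listing above, not the library's own code.

```python
import numpy as np

def one_hot_like_keras(y, num_classes=None, dtype="float32"):
    """Hypothetical condensed paraphrase of the listing above."""
    y = np.array(y, dtype="int")
    input_shape = y.shape
    # Drop a trailing axis of size 1, e.g. (n, 1) -> (n,)
    if input_shape and input_shape[-1] == 1 and len(input_shape) > 1:
        input_shape = tuple(input_shape[:-1])
    y = y.ravel()
    if not num_classes:
        num_classes = np.max(y) + 1  # infer from the largest label
    n = y.shape[0]
    categorical = np.zeros((n, num_classes), dtype=dtype)
    # Fancy indexing: set element (i, y[i]) to 1 for every row i
    categorical[np.arange(n), y] = 1
    return categorical.reshape(input_shape + (num_classes,))

column = one_hot_like_keras([[0], [2], [1]])     # input shape (3, 1)
grid = one_hot_like_keras([[0, 1], [2, 0]], 4)   # input shape (2, 2)

print(column.shape)  # (3, 3): trailing 1 dropped, 3 classes inferred
print(grid.shape)    # (2, 2, 4): class axis appended last
```

Since the class count is inferred as the largest label plus one, a batch that happens to lack the highest class would yield a narrower matrix. Passing num_classes explicitly avoids that.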

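After reading the source, I can see that my own code still has room for improvement. For the APIs in a framework, we can first try writing them ourselves and then compare our version against the source. This is a very good way to learn.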
