An Analysis of TensorFlow tf.one_hot()

Article link: https://blog.csdn.net/ghy_111/article/details/80362597

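This article explains one-hot encoding and how it is implemented in TensorFlow, covering the parameters of tf.one_hot and typical use cases, with example code showing how to encode class labels for supervised learning tasks.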

1. One-hot encoding

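One-hot encoding is a way of handling discrete (categorical) features in machine learning, and is commonly used to encode the label data of samples in supervised classification problems. For example, a feature with three possible values {label1, label2, label3} needs three bits; the position of the bit set to 1 identifies the original value, so the three values are encoded as {100, 010, 001}.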
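The three-label example can be sketched in plain Python. The helper name `one_hot_bits` is made up for illustration and is not part of TensorFlow; it only shows how each label maps to a bit string with a single 1.

```python
def one_hot_bits(label, labels):
    """Return the one-hot bit string for `label` within the ordered list `labels`."""
    return "".join("1" if candidate == label else "0" for candidate in labels)


labels = ["label1", "label2", "label3"]

# Each label gets as many bits as there are labels, with exactly one bit set.
codes = {label: one_hot_bits(label, labels) for label in labels}
print(codes)  # {'label1': '100', 'label2': '010', 'label3': '001'}
```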

2. tf.one_hot(label_batch, class_num)

The signature of tf.one_hot() is:

one_hot(indices, depth, on_value=None, off_value=None, axis=None, dtype=None, name=None)

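indices: the positions at which on_value is placed; every other position holds off_value. It is a tensor, and its shape together with depth determines the shape of the output tensor.

depth: the encoding depth, i.e. the size of the one-hot dimension.

on_value and off_value: the values written at the "on" and "off" positions, defaulting to 1 and 0 respectively; the positions given by indices receive on_value.

axis: the axis along which the encoding is laid out. Depending on the rank of indices, it can be -1 or 0 (vector indices), or -1, 0 or 1 (matrix indices); the default is -1.

dtype: the data type of the output. By default it is the type of on_value or off_value; if neither is provided, it defaults to tf.float32.

Returns a one-hot tensor.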
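A minimal sketch of how these parameters interact, assuming TensorFlow 2.x with eager execution (under TensorFlow 1.x the tensors would have to be evaluated in a session instead of calling `.numpy()`). The index values are arbitrary.

```python
import tensorflow as tf

indices = tf.constant([0, 2, 1])

# No on_value, off_value or dtype given: ones and zeros of type tf.float32.
default_encoding = tf.one_hot(indices, depth=3)

# Explicit dtype with the default on/off values.
int_encoding = tf.one_hot(indices, depth=3, dtype=tf.int32)

# Custom on/off values; the output dtype follows their type.
custom_encoding = tf.one_hot(indices, depth=3, on_value=5.0, off_value=0.0)

print(default_encoding.dtype, int_encoding.dtype)
print(custom_encoding.numpy())
```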

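2. Applications

(1) If indices is a scalar, the output is a vector of length depth;

(2) If indices is a vector of length features, the output shape is: (a) features × depth when axis == -1; (b) depth × features when axis == 0

(3) If indices is a matrix of shape [batch, features], the output shape is:

(a) batch × features × depth when axis == -1; (b) batch × depth × features when axis == 1; (c) depth × batch × features when axis == 0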
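The shape rules above amount to inserting a new dimension of size depth at position axis, with -1 meaning "append as the last dimension". The following pure-Python sketch expresses that rule; `one_hot_output_shape` is a hypothetical helper written for this illustration, not a TensorFlow function, and the sizes are arbitrary.

```python
def one_hot_output_shape(indices_shape, depth, axis=-1):
    """Shape of the one-hot result: `depth` is inserted as a new axis at `axis`."""
    rank = len(indices_shape)
    if axis == -1:
        axis = rank  # -1 means "append as the innermost (last) axis"
    if not 0 <= axis <= rank:
        raise ValueError(f"axis must be in [-1, {rank}], got {axis}")
    shape = list(indices_shape)
    shape.insert(axis, depth)
    return tuple(shape)


batch, features, depth = 2, 4, 3

print(one_hot_output_shape((), depth))                          # (3,)
print(one_hot_output_shape((features,), depth, axis=-1))        # (4, 3)
print(one_hot_output_shape((features,), depth, axis=0))         # (3, 4)
print(one_hot_output_shape((batch, features), depth, axis=-1))  # (2, 4, 3)
print(one_hot_output_shape((batch, features), depth, axis=1))   # (2, 3, 4)
print(one_hot_output_shape((batch, features), depth, axis=0))   # (3, 2, 4)
```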

Example for case (2):

```python
indices = [0, 2, -1, 1]
depth = 3
on_value = 5.0
off_value = 0.0
axis = -1

```

The output has shape 4 × 3:

```python
output = [
    [5.0, 0.0, 0.0],  # one_hot(0)
    [0.0, 0.0, 5.0],  # one_hot(2)
    [0.0, 0.0, 0.0],  # one_hot(-1)
    [0.0, 5.0, 0.0],  # one_hot(1)
]
```
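The result for case (2) can be reproduced with a small pure-Python sketch of the axis == -1 behaviour. `one_hot_rows` is a hypothetical stand-in that mimics the documented semantics, not TensorFlow's implementation. Note that the index -1 matches no position in the range 0 to depth-1, so its row consists entirely of off_value.

```python
def one_hot_rows(indices, depth, on_value=1.0, off_value=0.0):
    """One row per index; every position other than the index holds off_value."""
    return [
        [on_value if position == index else off_value for position in range(depth)]
        for index in indices
    ]


output = one_hot_rows([0, 2, -1, 1], depth=3, on_value=5.0, off_value=0.0)
for row in output:
    print(row)
```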

Example for case (3):

```python
indices = [[0, 2], [1, -1]]
depth = 3
on_value = 1.0
off_value = 0.0
axis = -1

```

The output has shape 2 × 2 × 3:

```python
output = [
    [
        [1.0, 0.0, 0.0],  # one_hot(0)
        [0.0, 0.0, 1.0],  # one_hot(2)
    ],
    [
        [0.0, 1.0, 0.0],  # one_hot(1)
        [0.0, 0.0, 0.0],  # one_hot(-1)
    ],
]
```
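For matrix indices, the effect of axis can be illustrated in plain Python as well. In this sketch, `one_hot_matrix` is a hypothetical helper that builds the axis == -1 layout (batch × features × depth), and the axis == 1 layout (batch × depth × features) is then obtained by transposing the last two dimensions of each batch item. It is an illustration of the shape rules, not TensorFlow code.

```python
def one_hot_matrix(indices, depth, on_value=1.0, off_value=0.0):
    """axis == -1 layout: result[b][f][d] is on_value iff indices[b][f] == d."""
    return [
        [
            [on_value if position == index else off_value for position in range(depth)]
            for index in row
        ]
        for row in indices
    ]


indices = [[0, 2], [1, -1]]
output_last_axis = one_hot_matrix(indices, depth=3)

# axis == 1 layout: swap the features and depth dimensions of every batch item.
output_axis_1 = [
    [list(column) for column in zip(*batch_item)] for batch_item in output_last_axis
]

print(output_last_axis)
print(output_axis_1)
```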

3. Handling label data in classification code:

onehot_label_batch = tf.one_hot(label_batch, class_num)

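The label data is represented with one-hot vectors: a numeric label n becomes a class_num-dimensional vector in which only dimension n is 1, for n = 0, 1, ..., class_num-1. With 5 classes, class_num = 5, label 0 is represented as [1, 0, 0, 0, 0], label 1 as [0, 1, 0, 0, 0], and so on.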
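A runnable version of the call might look as follows, assuming TensorFlow 2.x with eager execution. The contents of `label_batch` are invented for illustration; in a real pipeline they would come from the dataset.

```python
import tensorflow as tf

class_num = 5

# Hypothetical batch of three integer class labels.
label_batch = tf.constant([0, 1, 4])

# Each label n becomes a class_num-dimensional vector with a 1 at position n.
onehot_label_batch = tf.one_hot(label_batch, class_num)

print(onehot_label_batch.shape)  # (3, 5)
print(onehot_label_batch.numpy())
```

With the default dtype the entries are tf.float32 values, which is usually what a cross-entropy loss on one-hot targets expects.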