Deep Learning (9) TensorFlow Basic Operations, Part 5: Broadcasting

Latest recommended article published on 2023-08-19 21:41:06

炎武丶航

Categories: Deep Learning, TensorFlow2. Tags: deep learning, tensorflow

Article link: https://blog.csdn.net/weixin_43360025/article/details/119569453

1. The idea
2. Concrete examples
3. Understanding
- (1) How to understand?
- (2) Why Broadcasting?
4. When Broadcasting applies
5. Advantages of Broadcasting
6. x+tf.random.normal([3])
7. tf.broadcast_to()
8. Broadcast VS Tile

Broadcasting

expand
without copying data
- VS tf.tile
tf.broadcast_to
In $Y = X @ W + b$, the shapes evolve as [b, 784]@[784, 10]+[10] $\to$ [b, 10]+[10] $\to$ [b, 10]. In the step [b, 10]+[10] $\to$ [b, 10], the [10] operand has to be turned into [b, 10] before the two can be added to produce out. This [10] $\to$ [b, 10] process is called Broadcasting.
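The shape flow of this layer can be checked with a minimal sketch. The batch size of 4 and the random or zero values are assumptions chosen only for illustration.

```python
import tensorflow as tf

# Assumed sizes: batch b = 4, 784 inputs, 10 outputs.
x = tf.random.normal([4, 784])
w = tf.random.normal([784, 10])
b = tf.zeros([10])

xw = x @ w      # [4, 784] @ [784, 10] -> [4, 10]
out = xw + b    # [4, 10] + [10]: b is broadcast to [4, 10]

print(xw.shape, out.shape)
```

The bias b is never reshaped by hand here; the addition itself applies the broadcast.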

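1. The idea

Insert 1 dim ahead if needed;
If a and b do not have the same number of dimensions, insert dimensions;
For example: a.shape=[4, 16, 16, 32]; b.shape=[32]. First align a and b at the "small dimension". Here the "small dimension" means the last (rightmost) one, such as b's [32], and the "large dimension" means the first (leftmost) one, such as a's [4]. With the small dimensions aligned, insert dimensions of size 1 moving from the small dimension toward the large one, so that b becomes [1, 1, 1, 32];
expand dims with size 1 to same size;
Expand each inserted size-1 dimension of b to the size a has in that position, giving b.shape=[4, 16, 16, 32];
After these steps, a and b can be combined in the intended math operation.
- Feature maps: [4, 32, 32, 3]
- Bias: [3] $\to$ [1, 1, 1, 3] $\to$ [4, 32, 32, 3]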

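Note: Broadcasting does not copy or add data. It only adds virtual dimensions during the computation, as an optimization of the operation; the virtual dimensions and data added this way do not occupy memory;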
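The two steps can be spelled out by hand and compared with the automatic result. This is only a sketch for the feature-map and bias shapes: the explicit `tf.reshape` and `tf.broadcast_to` calls are there to make each step visible and are not needed in normal code.

```python
import tensorflow as tf

a = tf.random.normal([4, 32, 32, 3])   # feature maps
b = tf.random.normal([3])              # bias, one value per channel

# Step 1: insert size-1 dims in front, aligned at the last dim.
b_inserted = tf.reshape(b, [1, 1, 1, 3])
# Step 2: expand every size-1 dim to the matching size of a.
b_expanded = tf.broadcast_to(b_inserted, [4, 32, 32, 3])

manual = a + b_expanded   # explicit expansion
auto = a + b              # broadcasting applied by the + operator

same = bool(tf.reduce_all(tf.equal(manual, auto)))
print(same)
```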

2. Concrete examples

(1) [4, 3] + [4, 3] $\to$ [4, 3] + [4, 3] $\to$ [4, 3]
(2) [4, 3] + [1, 3] $\to$ [4, 3] + [4, 3] $\to$ [4, 3]
(3) [4, 1] + [1, 3] $\to$ [4, 3] + [4, 3] $\to$ [4, 3]
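The three cases can be reproduced with small integer tensors. The concrete values below are assumptions picked so that the row and column contributions are easy to tell apart.

```python
import tensorflow as tf

full = tf.zeros([4, 3], dtype=tf.int32)       # shape [4, 3]
rows = tf.constant([[0], [10], [20], [30]])   # shape [4, 1]
cols = tf.constant([[0, 1, 2]])               # shape [1, 3]

case1 = full + full   # (1) [4, 3] + [4, 3]: nothing to expand
case2 = full + cols   # (2) [4, 3] + [1, 3]: cols is repeated along rows
case3 = rows + cols   # (3) [4, 1] + [1, 3]: both operands are expanded

print(case3.numpy())
```

In case (3) every output element is `rows[i, 0] + cols[0, j]`, so the last row is 30, 31, 32.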

3. Understanding

(1) How to understand?

When it has no axis
- Create a new concept
- [classes, students, scores] + [scores]
When it has dim of size 1
- Treat it shared by all
- [classes, students, scores] + [students, 1]
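Both readings can be tried on a [classes, students, scores] tensor. The sizes 4, 32 and 8 and the tensor values are assumptions for illustration only.

```python
import tensorflow as tf

# [classes, students, scores] with assumed sizes 4, 32, 8.
scores = tf.zeros([4, 32, 8])

# No class or student axis: the same 8 offsets apply everywhere.
per_score = tf.range(8, dtype=tf.float32)   # shape [8]
# Size-1 score axis: each student's value is shared by all 8 scores.
per_student = tf.ones([32, 1])              # shape [32, 1]

by_score = scores + per_score       # [4, 32, 8] + [8]
by_student = scores + per_student   # [4, 32, 8] + [32, 1]

print(by_score.shape, by_student.shape)
```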

(2) Why Broadcasting?

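for real demand (practical need)
- [classes, students, scores];
- Add bias for every student: + 5 score;
memory consumption (saves memory)

No extra memory has to be spent just for the computation. If we expanded the dimensions by hand instead of relying on Broadcasting, the expanded copy would occupy memory, whereas Broadcasting greatly reduces memory use;
- [4, 32, 8] $\to$ 1024;
- bias = [8]: [5.0, 5.0, 5.0, …] $\to$ 8;
As shown, a bias expanded by hand to [4, 32, 8] takes 1024 units of storage (4 × 32 × 8), while with Broadcasting the bias takes only 8.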
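The element counts can be verified directly. In this sketch the hand-expanded bias is built with `tf.tile`, which is one possible way to materialize it and is an assumption about what expanding by hand means here.

```python
import tensorflow as tf

scores = tf.zeros([4, 32, 8])
bias = tf.fill([8], 5.0)   # 8 stored values

# Materialize the bias at full size by copying it.
tiled_bias = tf.tile(tf.reshape(bias, [1, 1, 8]), [4, 32, 1])

n_small = int(tf.size(bias))         # elements stored for the [8] bias
n_tiled = int(tf.size(tiled_bias))   # elements stored for the copy

# Both routes give the same sum; only the stored operand differs.
same = bool(tf.reduce_all(tf.equal(scores + bias, scores + tiled_bias)))
print(n_small, n_tiled, same)
```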

4. When Broadcasting applies

Broadcastable?

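Match from Last dim! (Always align starting from the small dimension!)
If the sizes are equal, leave them as they are;
If current dim=1, expand to same;
If either has no dim, insert one dim and expand to same;
otherwise, NOT broadcastable.

Note: always align starting from the small dimension, that is, from the rightmost one!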
(1) Situation 1:

[4, 32, 14, 14]
[1, 32, 1, 1] $\to$ [4, 32, 14, 14]
(2) Situation 2:
[4, 32, 14, 14]
[14, 14] $\to$ [1, 1, 14, 14] $\to$ [4, 32, 14, 14]
(3) Situation 3:
[4, 32, 14, 14]
[2, 32, 14, 14]

The first dimension sizes differ (4 vs 2) and neither is 1, so Broadcasting is not possible!
If [2, 32, 14, 14] is changed to [1, 32, 14, 14], Broadcasting works.
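The matching rule can be written as a few lines of plain Python. The function name `broadcast_shape` is hypothetical and this is a sketch of the rule as stated in this section, not TensorFlow's own implementation.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or None if not broadcastable.

    Hypothetical helper for illustration; shapes are lists of ints.
    """
    result = []
    # Walk both shapes from the last dim; a missing dim counts as size 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)   # equal sizes, or y is expanded to x
        elif x == 1:
            result.append(y)   # x is expanded to y
        else:
            return None        # sizes differ and neither is 1
    return result[::-1]


s1 = broadcast_shape([4, 32, 14, 14], [1, 32, 1, 1])    # Situation 1
s2 = broadcast_shape([4, 32, 14, 14], [14, 14])         # Situation 2
s3 = broadcast_shape([4, 32, 14, 14], [2, 32, 14, 14])  # Situation 3
print(s1, s2, s3)
```

Situations 1 and 2 both resolve to [4, 32, 14, 14], while Situation 3 returns None because 4 and 2 differ and neither is 1.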

5. Advantages of Broadcasting

It’s efficient and intuitive!

[4, 32, 32, 3]
- [3]
- [32, 32, 1]
- [4, 1, 1, 1]

6. x+tf.random.normal([3])

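(1) x+tf.random.normal([3]): adds x and a bias of shape [3];

The system checks automatically whether the Broadcasting conditions are met;

(2) x+tf.random.normal([32, 32, 1]): adds x and a bias of shape [32, 32, 1];
(3) x+tf.random.normal([4, 1, 1, 1]): adds x and a bias of shape [4, 1, 1, 1];
(4) x+tf.random.normal([1, 4, 1, 1]): the Broadcasting conditions are not met because the dimension sizes do not line up, so an error is raised;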
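The four additions can be run as follows. The shape of x is not stated in the text, so x.shape=[4, 32, 32, 3] is an assumption, chosen because it is consistent with all four cases. Catching both exception types is a precaution, since the exact type can depend on whether the code runs eagerly.

```python
import tensorflow as tf

x = tf.random.normal([4, 32, 32, 3])   # assumed shape of x

shape1 = (x + tf.random.normal([3])).shape            # case (1)
shape2 = (x + tf.random.normal([32, 32, 1])).shape    # case (2)
shape3 = (x + tf.random.normal([4, 1, 1, 1])).shape   # case (3)

# Case (4): aligned from the right, 4 meets 32 in the second dim.
try:
    x + tf.random.normal([1, 4, 1, 1])
    failed = False
except (tf.errors.InvalidArgumentError, ValueError):
    failed = True

print(shape1, shape2, shape3, failed)
```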

7. tf.broadcast_to()

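(1) b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [4, 32, 32, 3]): calling tf.broadcast_to() turns the [4, 1, 1, 1] tensor into a new one with shape=[4, 32, 32, 3];
(2) If the dimension sizes do not match, an error is raised. For example: b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), [3, 32, 32, 3]). Here the first dimensions of [4, 1, 1, 1] and [3, 32, 32, 3] differ (4 vs 3), so the Broadcasting conditions are not met and an error is raised;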
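Both calls can be sketched as below. The variable names `small`, `shared` and `failed` are illustrative, and catching two exception types is a precaution rather than a statement about which one TensorFlow raises.

```python
import tensorflow as tf

small = tf.random.normal([4, 1, 1, 1])

# (1) Valid: every size-1 dim is expanded to the target size.
b = tf.broadcast_to(small, [4, 32, 32, 3])
# Each sample's single value is shared by all of its positions.
shared = bool(tf.reduce_all(tf.equal(b, small)))

# (2) Invalid: the first dim is 4 in the input but 3 in the target.
try:
    tf.broadcast_to(small, [3, 32, 32, 3])
    failed = False
except (tf.errors.InvalidArgumentError, ValueError):
    failed = True

print(b.shape, shared, failed)
```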

8. Broadcast VS Tile

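(1) a1 = tf.broadcast_to(a, [2, 3, 4]): applies Broadcasting to a, going from a.shape=[3, 4] $\to$ a1.shape=[2, 3, 4];
(2) a2 = tf.expand_dims(a, axis=0): expands the dimensions of a by inserting a new dimension at the first position, so a2.shape=[1, 3, 4];
(3) a2 = tf.tile(a2, [2, 1, 1]): tile() repeats the first dimension of a2 2 times and the second and third dimensions 1 time each (that is, leaves them unchanged), so after tile() a2.shape=[2, 3, 4].

a1 and a2 hold identical values. Note that the memory saving described earlier applies to implicit Broadcasting inside an operation such as x + a, where the expanded operand is never materialized. According to the TensorFlow documentation, the tensor returned by tf.broadcast_to() itself takes the full memory of the broadcast shape, just like the tile() result. In practice, implicit Broadcasting is the efficient choice, and expand_dims() plus tile() is the more verbose way to get the same values.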
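The comparison can be run end to end. The random values in a are an assumption; any [3, 4] tensor would do.

```python
import tensorflow as tf

a = tf.random.normal([3, 4])

# Route 1: broadcast directly to the target shape.
a1 = tf.broadcast_to(a, [2, 3, 4])

# Route 2: insert a leading dim, then copy along it.
a2 = tf.expand_dims(a, axis=0)   # [1, 3, 4]
a2 = tf.tile(a2, [2, 1, 1])      # [2, 3, 4]

equal = bool(tf.reduce_all(tf.equal(a1, a2)))
print(a1.shape, a2.shape, equal)
```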
