The bug arises because the tf.keras optimizers apply gradients to variable objects (of type tf.Variable), while you are trying to apply them to tensors (of type tf.Tensor). Tensor objects are immutable in TensorFlow, so the optimizer cannot update them. Wrapping the input in a tf.Variable fixes this:
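To see the mutability difference in isolation, here is a small sketch (the tensor values are placeholders, not your model input): a tf.Variable supports in-place updates like assign_sub, which is exactly what an optimizer needs, while a plain tf.Tensor does not.

```python
import tensorflow as tf

t = tf.constant([1.0, 2.0])   # tf.Tensor: immutable
v = tf.Variable([1.0, 2.0])   # tf.Variable: mutable

# An optimizer step boils down to in-place updates like this:
v.assign_sub([0.1, 0.1])      # v is now [0.9, 1.9]

# Plain tensors have no assign/assign_sub methods, which is why
# opt.apply_gradients(...) fails when given a tf.Tensor.
print(hasattr(t, "assign_sub"))  # False
```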
img = tf.Variable(img)
opt = tf.optimizers.Adam(learning_rate=lr, decay=1e-6)
for _ in range(epoch):
    with tf.GradientTape() as tape:
        tape.watch(img)
        y = model(img.value())[:, :, :, filter]
        loss = -tf.math.reduce_mean(y)
    grads = tape.gradient(loss, img)
    opt.apply_gradients(zip([grads], [img]))
Also, it is recommended to calculate the gradients outside the tape’s context: if the tape.gradient call stays inside the with block, the tape tracks the gradient calculation itself, which increases memory usage. That is only desirable when you want to calculate higher-order gradients. Since you don’t need those, I have kept the call outside.
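A minimal sketch of the contrast, using a toy scalar rather than your model: the first-order gradient is taken outside the tape, while a second-order gradient requires an outer tape that records the inner gradient computation.

```python
import tensorflow as tf

x = tf.Variable(3.0)

# First-order gradient: compute it OUTSIDE the tape's context,
# so the calculation itself is not recorded.
with tf.GradientTape() as tape:
    y = x ** 2
dy_dx = tape.gradient(y, x)       # 2*x = 6.0

# Higher-order gradient: here the inner gradient computation must
# stay INSIDE the outer tape so it gets tracked.
with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        y = x ** 2
    dy_dx_tracked = inner.gradient(y, x)
d2y_dx2 = outer.gradient(dy_dx_tracked, x)  # 2.0
```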
