How does TensorFlow compute derivatives?

Latest recommended article published 2021-12-20 00:37:51

weixin_39575648


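Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. Please include a link to the original source and this notice when reposting.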

Article link: https://blog.csdn.net/weixin_39575648/article/details/111552081


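Symbolic differentiation isn't that hard.

For symbolic differentiation, the key is to record, at each symbolic operation, how that operation affects the derivative, and then apply the chain rule:

$$\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}$$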

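Take an example:

$$y = x^2$$

We know that the derivative of this function with respect to x is

$$\frac{dy}{dx} = 2x$$

In TensorFlow it looks like this (in math_grad.py):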

    @ops.RegisterGradient("Square")
    def _SquareGrad(op, grad):
        x = op.inputs[0]
        # Added control dependencies to prevent 2*x from being computed too early.
        with ops.control_dependencies([grad.op]):
            x = math_ops.conj(x)
            return grad * (2.0 * x)

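Every function in this file is wrapped with the RegisterGradient decorator, and each of them takes two arguments, op and grad. The decorator is also used elsewhere, wherever an op is registered, for example batch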

The class docstring of the RegisterGradient decorator describes what it does:

This decorator is only used when defining a new op type. For an op with `m` inputs and `n` outputs, the gradient function is a function that takes the original `Operation` and `n` `Tensor` objects (representing the gradients with respect to each output of the op), and returns `m` `Tensor` objects (representing the partial gradients with respect to each input of the op).

For example, assuming that operations of type `"Sub"` take two inputs `x` and `y`, and return a single output `x - y`, the following gradient function would be registered:

```python
@tf.RegisterGradient("Sub")
def _sub_grad(unused_op, grad):
  return grad, tf.negative(grad)
```

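The first argument, op, is the operation. The second argument, grad, is the gradient computed so far, that is, the gradient of the final output with respect to this op's output. It is the part of the chain-rule product that has already been evaluated. The example in the docstring shows the gradient of x - y.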

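An operation may take a single symbol as input, or two. In the first case the gradient function returns the gradient with respect to that one symbol; in the second it returns the gradients of the output with respect to both. The function above returns the two gradients $\frac{\partial (x - y)}{\partial x}$ and $\frac{\partial (x - y)}{\partial y}$, each multiplied by grad. Clearly the former is 1 and the latter is -1, and combining them with the chain rule gives grad and -grad.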
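To make the two-input case concrete, here is a small plain-Python sketch that backpropagates through z = (x - y)^2 by hand. The function names are illustrative only, and the values are ordinary floats rather than tensors.

```python
# Plain-Python sketch of reverse-mode differentiation for z = (x - y)^2.
# Function names are illustrative only.

def sub_grad(grad):
    # d(x - y)/dx = 1 and d(x - y)/dy = -1, each scaled by the incoming gradient.
    return grad, -grad


def square_grad(u, grad):
    # d(u^2)/du = 2u, scaled by the incoming gradient.
    return grad * (2.0 * u)


x, y = 5.0, 3.0
u = x - y  # forward pass through Sub
z = u * u  # forward pass through Square

grad_z = 1.0                      # dz/dz, the starting gradient
grad_u = square_grad(u, grad_z)   # dz/du = 2u
grad_x, grad_y = sub_grad(grad_u) # dz/dx = 2u, dz/dy = -2u
print(grad_x, grad_y)
```

Each gradient function only knows its own local derivative; the `grad` argument carries everything that happened downstream.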

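Once the chain rule has produced the gradient for the whole graph, a matching numerical estimate can be computed by perturbing the inputs by a small delta.

The code for this lives in gradient_checker.py; you can jump straight into it from your editor.

The function looks roughly like this: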

    def compute_gradient(x, x_shape, y, y_shape, x_init_value=None, delta=1e-3,
                         extra_feed_dict=None):  # some parameters omitted
        dx, dy = _compute_dx_and_dy(x, y, y_shape)
        ret = _compute_gradient(x, x_shape, dx, y, y_shape, dy, x_init_value, delta,
                                extra_feed_dict=extra_feed_dict)
        return ret

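The first function, _compute_dx_and_dy, calls gradients in gradients_impl.py. According to its docstring, it builds the symbolic ops for the corresponding partial derivatives: "Constructs symbolic partial derivatives of sum of `ys` w.r.t. x in `xs`."

This is the chain rule described earlier.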

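The second function, _compute_gradient, computes Jacobian matrices. It returns a tuple (theoretical Jacobian, numerical Jacobian), and delta is used for the numerical one.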
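The relationship between the two results can be sketched for a scalar function. This is a simplified illustration, not the code in gradient_checker.py: it assumes a central difference for the numerical estimate and uses a hand-written derivative in place of the symbolic gradient. All function names are hypothetical.

```python
# Scalar sketch of "theoretical vs. numerical" gradient checking.
# Assumes a central difference; names are hypothetical.

def f(x):
    return x * x


def theoretical_gradient(x):
    # Derivative obtained analytically, standing in for the symbolic gradient.
    return 2.0 * x


def numeric_gradient(func, x, delta=1e-3):
    # Perturb the input by +/- delta and take the slope between the two points.
    return (func(x + delta) - func(x - delta)) / (2.0 * delta)


x0 = 3.0
theoretical = theoretical_gradient(x0)
numeric = numeric_gradient(f, x0)
error = abs(theoretical - numeric)
print(theoretical, numeric, error)
```

If a registered gradient function were wrong, the two values would disagree by far more than floating-point noise, which is what makes this comparison useful as a test.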

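Also, the derivative of a computation graph only needs to be constructed once, when the graph is built; after that, plugging in the corresponding x and y is enough to evaluate it. Even if the graph is changed dynamically later, the chain rule simply continues to apply. So for each incoming sample, only x and y differ.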
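The "build once, evaluate many times" idea can be shown with a tiny symbolic differentiator. This is only a sketch under simplifying assumptions: expressions are nested tuples over a single variable, and the tuple encoding and the `diff` and `evaluate` helpers are invented for this example.

```python
# Tiny symbolic differentiator over one variable.
# Expressions are nested tuples: ("var",), ("const", c),
# ("add", a, b), ("sub", a, b), ("mul", a, b). The encoding is hypothetical.

def diff(expr):
    """Build a new expression that represents d(expr)/dx."""
    kind = expr[0]
    if kind == "var":
        return ("const", 1.0)
    if kind == "const":
        return ("const", 0.0)
    if kind in ("add", "sub"):
        return (kind, diff(expr[1]), diff(expr[2]))
    if kind == "mul":
        a, b = expr[1], expr[2]
        # Product rule: (a * b)' = a' * b + a * b'
        return ("add", ("mul", diff(a), b), ("mul", a, diff(b)))
    raise ValueError("unknown expression kind: %r" % (kind,))


def evaluate(expr, x):
    """Evaluate an expression for a concrete value of x."""
    kind = expr[0]
    if kind == "var":
        return x
    if kind == "const":
        return expr[1]
    left = evaluate(expr[1], x)
    right = evaluate(expr[2], x)
    if kind == "add":
        return left + right
    if kind == "sub":
        return left - right
    if kind == "mul":
        return left * right
    raise ValueError("unknown expression kind: %r" % (kind,))


x_sym = ("var",)
square = ("mul", x_sym, x_sym)  # x * x
d_square = diff(square)         # derivative expression, built only once

# Only the input value changes from sample to sample.
values = [evaluate(d_square, sample) for sample in [1.0, 2.0, 3.0]]
print(values)
```

Here `diff` runs a single time, and each sample only costs one call to `evaluate` on the already-built derivative expression.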
