Computing the gradient of a complex loss function inside a TensorFlow py_func


ww09


Article link: https://blog.csdn.net/weixin_35112865/article/details/114766869


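While trying to use tf.py_func in TensorFlow to implement a custom loss function that involves many loops and conditionals, the author found that the gradient was None. A custom gradient function was tried, but the model did not train well. The post asks how to correctly compute and return the gradient for a complex py_func.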

Summary generated automatically by CSDN.

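In my model, the neural network produces an output of shape (1000, 1234), and I want to compute a loss from it. I cannot compute the loss with TensorFlow ops, because the computation involves many for loops and conditional code (i.e. if statements), so I want to do it in pure Python through tf.py_func, which is more flexible.

The problem is that the gradient of tf.py_func is None, which means I have to define the gradient function myself. There are many solutions around, such as @harpone's gist, but those examples are too simple to be useful. In my case the input tensor has shape (1000, 1234) and the py_func is complex.

The good news is that no variable is involved in the custom py_func, so I thought I could return the gradient as it is: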

def _my_py_func_grad(op, grad):
    return tf.ones_like(op.inputs[0]) * grad

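But when I run the code, the neural network does not seem to learn anything: after several iterations the final loss does not decrease (it stays the same).

Changing it to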

def _my_py_func_grad(op, grad):
    return op.inputs[0] * grad

did not help either.
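A registered gradient function is expected to return the incoming gradient multiplied by the derivative of the op's output with respect to its input. Returning ones therefore only matches a loss of the form sum(x), and returning the input only matches 0.5 * sum(x^2). The minimal sketch below illustrates this on a small stand-in loss, assuming a squared-error form similar to the objectness term. `toy_loss`, `numeric_grad` and the sample values are hypothetical names and data introduced for illustration, not part of the original code.

```python
def toy_loss(xs, targets):
    """Hypothetical stand-in for the py_func body: sum of squared errors."""
    return sum((x - t) ** 2 for x, t in zip(xs, targets))


def numeric_grad(f, xs, eps=1e-6):
    """Central finite-difference estimate of df/dx_i for every input x_i."""
    grads = []
    for i in range(len(xs)):
        up = list(xs)
        down = list(xs)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


# Made-up sample data, only for illustration.
xs = [0.5, -1.0, 2.0]
targets = [1.0, 0.0, 2.0]

# Estimated true gradient; analytically this is 2 * (x - t) per element.
true_grad = numeric_grad(lambda v: toy_loss(v, targets), xs)

# What the two attempted gradient functions would return for grad == 1.
ones_grad = [1.0] * len(xs)   # tf.ones_like(op.inputs[0]) * grad
input_grad = list(xs)         # op.inputs[0] * grad

print("true :", true_grad)
print("ones :", ones_grad)
print("input:", input_grad)
```

In this toy case neither candidate agrees with the estimated gradient, even in sign for some elements, so an optimizer fed with either one would not be following the slope of the loss.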

What is my problem here?

If I cannot return the gradient as it is, is there a good way to compute it?

In case anyone wants to see the code:

import math

import numpy as np
import tensorflow as tf

# Below is a complex function to calculate loss.
# num_of_anchor_boxes, num_of_classes, num_of_gt_bnx_per_cell and the helpers
# max_iou_with_op_boxes() and cal_iou() are defined elsewhere (not shown).
def py_cal_loss(ts):
    """ ts of shape
    [-1, num_of_anchor_boxes*(5+num_of_classes) + 5*num_of_gt_bnx_per_cell]
    """
    def py_cal_loss_onecell(cell):
        """ cell of shape
        [num_of_anchor_boxes*(5+num_of_classes) + 5*num_of_gt_bnx_per_cell]
        """
        # Split the flat cell vector into predicted boxes and ground-truth boxes.
        split_num = num_of_anchor_boxes*(5+num_of_classes)
        op_boxes = np.reshape(
            cell[0:split_num],
            [num_of_anchor_boxes, 5+num_of_classes]
        )
        gt_boxes = np.reshape(
            cell[split_num:],
            [num_of_gt_bnx_per_cell, 5]
        )

        # Match every non-empty ground-truth box to the predicted box with max IoU.
        max_oboxes = set()
        gt_op_matched_pairs = []
        for g_idx, gbox in enumerate(gt_boxes):
            # Comparing a NumPy row to a list with == gives an element-wise
            # array, which cannot be used directly in an if statement.
            if not np.any(gbox):
                print("all zero found")
                continue
            o_idx = max_iou_with_op_boxes(gbox, op_boxes)
            max_oboxes.add(o_idx)
            gt_op_matched_pairs.append((g_idx, o_idx))

        # calculate coordinate loss & objectness confidence loss
        coor_loss = 0.0
        objectness_loss = 0.0
        for g_idx, o_idx in gt_op_matched_pairs:
            gbox_coord = gt_boxes[g_idx][1:]
            obox_coord = op_boxes[o_idx][0:4]
            coor_loss += (math.pow(gbox_coord[0] - obox_coord[0], 2) +
                          math.pow(gbox_coord[1] - obox_coord[1], 2) +
                          math.pow(math.sqrt(gbox_coord[2]) - math.sqrt(obox_coord[2]), 2) +
                          math.pow(math.sqrt(gbox_coord[3]) - math.sqrt(obox_coord[3]), 2))
            obox_obj = op_boxes[o_idx][4]
            iou = cal_iou(gbox_coord, obox_coord)
            objectness_loss += math.pow(obox_obj-iou, 2)

        # calculate noobjectness confidence loss
        noobjectness_loss = 0.0
        for o_idx, o_box in enumerate(op_boxes):
            if o_idx in max_oboxes:
                continue
            obox_obj = op_boxes[o_idx][4]
            noobjectness_loss += math.pow(obox_obj, 2)

        # calculate classness loss (TODO)

        return 0.5 * coor_loss + 3 * objectness_loss + noobjectness_loss
    # ----- END DEF py_cal_loss_onecell ----

    total_loss = 0.0
    for cell in ts:
        total_loss += py_cal_loss_onecell(cell)
    # The py_func output is declared as tf.float32, so return a float32 scalar.
    return np.float32(total_loss)
# ---- END DEF py_cal_loss ----

# return the grad as it is
def _my_py_func_grad(op, grad):
    return tf.ones_like(op.inputs[0]) * grad

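# calculate loss using tf.py_func.
# The input, "op_and_gt_batch" is of shape (1000, 1234)
# NOTE: this fragment sits inside a function whose definition is not shown.
# tf.name_scope(name, default_name, values) stands in for the deprecated ops.op_scope.
with tf.name_scope("pyfunction", "MyLoss", [op_and_gt_batch]):
    # Register the custom gradient under a random, unique name.
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, int(1E+8)))
    tf.RegisterGradient(rnd_name)(_my_py_func_grad)
    default_graph = tf.get_default_graph()
    # Make every PyFunc op created in this block use the registered gradient.
    with default_graph.gradient_override_map({"PyFunc": rnd_name}):
        loss_out = tf.py_func(py_cal_loss, [op_and_gt_batch], [tf.float32], name="pyfunction")
        # pdb.set_trace()
        return loss_out[0]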
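One possible way to obtain a usable gradient is to differentiate each term of the loss by hand, in the same Python style as the forward pass, and to check the result against finite differences. The sketch below does this only for the coordinate loss of a single matched pair, treating the matching itself as fixed. `coord_loss`, `coord_loss_grad`, `numeric_grad` and the sample boxes are hypothetical names and values introduced here; the objectness term (which depends on the IoU) and the wiring into the registered gradient function are not covered.

```python
import math


def coord_loss(gbox, obox):
    """Coordinate loss for one matched pair, same form as in py_cal_loss_onecell.

    gbox and obox are (x, y, w, h) sequences; w and h must be positive.
    """
    return ((gbox[0] - obox[0]) ** 2 +
            (gbox[1] - obox[1]) ** 2 +
            (math.sqrt(gbox[2]) - math.sqrt(obox[2])) ** 2 +
            (math.sqrt(gbox[3]) - math.sqrt(obox[3])) ** 2)


def coord_loss_grad(gbox, obox):
    """Analytic gradient of coord_loss with respect to the predicted box obox."""
    dx = -2.0 * (gbox[0] - obox[0])
    dy = -2.0 * (gbox[1] - obox[1])
    # d/dw (sqrt(gw) - sqrt(w))^2 = -(sqrt(gw) - sqrt(w)) / sqrt(w)
    dw = -(math.sqrt(gbox[2]) - math.sqrt(obox[2])) / math.sqrt(obox[2])
    dh = -(math.sqrt(gbox[3]) - math.sqrt(obox[3])) / math.sqrt(obox[3])
    return [dx, dy, dw, dh]


def numeric_grad(f, xs, eps=1e-6):
    """Central finite-difference estimate of df/dx_i, used only as a check."""
    grads = []
    for i in range(len(xs)):
        up = list(xs)
        down = list(xs)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


# Made-up sample boxes, only for illustration.
gbox = [0.5, 0.5, 0.4, 0.6]
obox = [0.4, 0.7, 0.9, 0.3]

analytic = coord_loss_grad(gbox, obox)
numeric = numeric_grad(lambda b: coord_loss(gbox, b), obox)

print("analytic:", analytic)
print("numeric :", numeric)
```

If the two printed vectors agree, the hand-written derivative is at least consistent with the forward computation for this term. The same check could be repeated for the other terms before assembling a full gradient with the shape of the input tensor.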
