@[TOC]
```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 1, 88, 512, 512]], which is output 0 of SigmoidBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```
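The hint at the end of the message is worth following first. Here is a minimal sketch of turning anomaly detection on; the forward/backward below is a placeholder, not the original training code:

```python
import torch

# Enable anomaly detection globally; backward() will then also print a
# traceback of the forward operation that created the offending tensor.
# It slows training down noticeably, so use it only while debugging.
torch.autograd.set_detect_anomaly(True)

# It can also be scoped to a single step with the context manager:
with torch.autograd.detect_anomaly():
    x = torch.randn(4, requires_grad=True)
    torch.sigmoid(x).sum().backward()  # with a real bug, this pinpoints the op
```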
# Locating torch.cuda.FloatTensor [1, 1, 88, 512, 512]
From this message you can see that a tensor of shape [1, 1, 88, 512, 512] is the one causing trouble.
So where does the problem come from?
In my case the error came from the loss function, so that is where I started looking.
Check every tensor there whose shape is [1, 1, 88, 512, 512].
You can inspect a tensor's shape with `.shape` or `.size()` (`np.shape()` also works).
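For example, a minimal shape check of this kind; the tensor names here are hypothetical stand-ins for whatever your loss function actually receives:

```python
import torch

# Stand-in tensors with the shape from the error message; the names
# `pred` and `gt_dose_o` are hypothetical, not from the original code.
pred = torch.zeros(1, 1, 88, 512, 512)
gt_dose_o = torch.zeros(1, 1, 88, 512, 512)

print(pred.size())      # torch.Size([1, 1, 88, 512, 512])
print(gt_dose_o.shape)  # .shape and .size() are equivalent
```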
Once you have found these tensors, go through them one by one and look for any initialization step. Such a step can prevent gradients from flowing back, because the tensor is re-created every time the loss is computed. In my case the loss function contained this:
```python
dvh_loss = torch.zeros_like(gt_dose_o)
```
Later code fills in dvh_loss to produce a loss term that is added to the final loss, so backpropagation has to pass through this tensor. But because it is freshly created inside the loss function and then written to in place, a tensor that autograd saved for the backward pass gets modified, and the gradient can no longer be computed.
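The same class of error can be reproduced in a few lines. This is a minimal illustration of the failure mode named in the message (a sigmoid output modified in place), not the original loss code:

```python
import torch

x = torch.randn(1, 1, 8, 16, 16, requires_grad=True)  # smaller stand-in shape
pred = torch.sigmoid(x)   # output 0 of SigmoidBackward0, at version 0
loss = pred.sum()         # sigmoid's backward needs `pred` itself
pred[pred < 0.5] = 0.0    # in-place write: `pred` is now at version 1
loss.backward()           # RuntimeError: ... is at version 1; expected version 0
```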
This procedure only helps you locate the problem; the concrete fix depends on your code.
The usual remedy is to build the quantity by combining existing tensors with out-of-place operations, as sketched below.
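A hedged sketch of that idea, with hypothetical names (`pred`, `gt_dose_o`, the thresholds) standing in for the real DVH computation:

```python
import torch

# Hypothetical stand-ins; in the real code `pred` would be the network's
# sigmoid output and `gt_dose_o` the ground-truth dose volume.
pred = torch.sigmoid(torch.randn(1, 1, 8, 16, 16, requires_grad=True))
gt_dose_o = torch.rand(1, 1, 8, 16, 16)

# Problematic pattern: initialize a buffer, then fill it with in-place
# indexed writes (each write bumps a version counter autograd checks):
#   dvh_loss = torch.zeros_like(gt_dose_o)
#   dvh_loss[gt_dose_o > 0.5] = (pred - gt_dose_o)[gt_dose_o > 0.5]

# Out-of-place alternative: every operation returns a new tensor, so no
# tensor saved for backward is ever modified.
dvh_loss = sum(
    ((pred - gt_dose_o) * (gt_dose_o > t)).abs().mean()
    for t in (0.3, 0.5, 0.7)
)
dvh_loss.backward()  # gradients now flow back through `pred` without error
```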