The full warning message is as follows:
UserWarning: Gradient of Parameter bisruaudio0_dense0_weight on context gpu(0) has not been updated by backward since last step. This could mean a bug in your model that maked it only use a subset of the Parameters (Blocks) for this iteration. If you are intentionally only using a subset, call step with ignore_stale_grad=True to suppress this warning and skip updating of Parameters with stale gradient
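The offending code is shown below. The slice assignments into z break automatic differentiation (autograd): the gradient cannot be propagated back through them, so the parameters of dense1 never receive a gradient.
def forward(self, x, y, z):
    x = self.dense1(x)
    z[0:10] = x   # in-place slice write: the gradient does not flow back to dense1
    z[10:20] = y
    x = self.dense2(z)
    return x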
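The fix is to join the two arrays with nd.concat instead of writing into slices.
from mxnet import nd

def forward(self, x, y):
    x = self.dense1(x)
    x = nd.concat(x, y, dim=0)  # concatenation is recorded by autograd
    x = self.dense2(x)
    return x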