1. transpose or permute leaves the tensor non-contiguous in memory.
# before
output_tensor = in_tensor.transpose(1, 3)
# after
output_tensor = in_tensor.transpose(1, 3).contiguous()
2. Slicing leaves the tensor non-contiguous in memory.
# before
input_tensor = input_tensor[:, :H, :W, :]
# after
input_tensor = input_tensor[:, :H, :W, :].contiguous()
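
Both cases can be checked with is_contiguous() and stride(). The sketch below is only an illustration: the tensor shape (2, 3, 4, 5) and the values of H and W are assumptions chosen for the example and do not come from the original code. Note that a slice stays contiguous when it keeps the full extent of the sliced dimensions, so the check only reports a problem when H or W is smaller than the original size.

```python
import torch

# Hypothetical 4-D tensor; the shape is chosen only for illustration.
in_tensor = torch.arange(2 * 3 * 4 * 5, dtype=torch.float32).reshape(2, 3, 4, 5)

# Case 1: transpose returns a view with swapped strides, not a new layout.
transposed = in_tensor.transpose(1, 3)
print(transposed.is_contiguous(), transposed.stride())

# .contiguous() copies the data into a fresh, densely packed buffer.
fixed_transposed = transposed.contiguous()
print(fixed_transposed.is_contiguous(), fixed_transposed.stride())

# Case 2: slicing inner dimensions keeps the parent's strides,
# so the view skips over elements of the original buffer.
H, W = 2, 3  # assumed crop sizes, smaller than the original 3 and 4
sliced = in_tensor[:, :H, :W, :]
print(sliced.is_contiguous(), sliced.stride())

fixed_sliced = sliced.contiguous()
print(fixed_sliced.is_contiguous(), fixed_sliced.stride())
```

The values are unchanged by .contiguous(); only the memory layout differs, at the cost of one extra copy.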