Errors Encountered When Running the VQA-ReGat Project

snow5618

Article link: https://blog.csdn.net/snow_maple521/article/details/109890744

VQA-ReGat: Relation-Aware Graph Attention Network for VQA

Project link
 Paper link

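1. torch error: StopIteration: Caught StopIteration in replica 0 on device 0.
Cause: the error appears when the project runs on multiple GPUs; it is probably a torch version incompatibility.
Fix: following other blog posts, change weight = next(self.parameters()).data to weight = torch.float32
2. This still fails: AttributeError: 'torch.dtype' object has no attribute 'new' (torch.dtype has no attribute new).
Cause: after the change in step 1, weight is a torch.dtype object, not a torch.Tensor.
Fix: reading the source shows that next(self.parameters()).data is only used to obtain the data type, which in almost all cases is torch.float32 on CUDA.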

So the final fix is:

weight = 0
weight = torch.tensor(weight,dtype=torch.float32)
weight = weight.cuda()
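
As an alternative to hard-coding a CUDA float32 tensor, the initial hidden state can take its dtype and device from the input batch, which avoids calling next(self.parameters()) altogether. The background, as far as I understand it, is that newer PyTorch releases (1.5 and later) no longer expose parameters on the replicas created by nn.DataParallel, so the iterator is empty and next() raises StopIteration. The following is only a minimal sketch under that assumption: TinyQuestionEncoder, its dimensions, and the GRU layout are hypothetical stand-ins, not the project's actual code.

```python
import torch
import torch.nn as nn


class TinyQuestionEncoder(nn.Module):
    """Hypothetical stand-in for an RNN question encoder."""

    def __init__(self, in_dim=8, num_hid=16, nlayers=1):
        super().__init__()
        self.rnn = nn.GRU(in_dim, num_hid, nlayers, batch_first=True)
        self.num_hid = num_hid
        self.nlayers = nlayers

    def init_hidden(self, x):
        # Take dtype and device from the input batch instead of
        # next(self.parameters()), which can be empty inside a replica.
        hid_shape = (self.nlayers, x.size(0), self.num_hid)
        return x.new_zeros(hid_shape)

    def forward(self, x):
        hidden = self.init_hidden(x)
        output, hidden = self.rnn(x, hidden)
        return output[:, -1]  # features of the last time step


x = torch.randn(4, 5, 8)  # (batch, sequence length, feature dim)
enc = TinyQuestionEncoder()
h0 = enc.init_hidden(x)
out = enc(x)
print(h0.shape, out.shape)
```

Note that x here is a float tensor (for example, word embeddings). If the input were integer token ids, new_zeros would inherit the integer dtype, so dtype=torch.float32 would have to be passed explicitly.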

3. Error: RuntimeError: unsupported operation: more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation
Fix: change q_expand = q.expand(*repeat_vals) to q_expand = q.expand(*repeat_vals).clone()
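
Why clone() helps in error 3: expand() returns a view whose expanded dimension has stride 0, so every "row" points at the same memory, and an in-place write would hit one location several times. clone() materializes an independent copy. A small sketch with made-up shapes (the tensor q and the sizes below are illustrative only, not the project's real values):

```python
import torch

q = torch.zeros(1, 3)
q_expand = q.expand(4, 3)  # view: 4 rows that all alias the same 3 values
print(q_expand.stride())   # the expanded dimension has stride 0

try:
    q_expand.add_(1.0)     # in-place write to overlapping memory
    inplace_failed = False
except RuntimeError as err:
    inplace_failed = True
    print("in-place write rejected:", str(err)[:60])

q_safe = q.expand(4, 3).clone()  # real copy with its own memory
q_safe[0, 0] = 5.0               # changes only one row of the copy
print(q_safe)
```

The copy costs extra memory proportional to the expanded size, which is worth keeping in mind given the out-of-memory error described in item 4.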
4. Error: RuntimeError: CUDA out of memory. Tried to allocate 292.00 MiB (GPU 0; 10.76 GiB total capacity; 4.34 GiB already allocat
Cause: the project probably has a large number of parameters, so set a smaller batch_size.
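
If a smaller batch_size changes training behavior too much, one common workaround (not something the original project is known to do) is gradient accumulation: run several small batches and step the optimizer once, so the effective batch size stays the same while peak memory drops. A minimal sketch on toy data; the model, dataset, and the value of accum_steps are all placeholders:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
loader = DataLoader(dataset, batch_size=16)  # smaller per-step batch
accum_steps = 4                              # 16 * 4 = effective batch of 64

model = nn.Linear(10, 1)                     # toy model, stands in for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

optimizer_steps = 0
optimizer.zero_grad()
for step, (x, y) in enumerate(loader, start=1):
    # Scale the loss so the accumulated gradient is an average over the effective batch.
    loss = loss_fn(model(x), y) / accum_steps
    loss.backward()                          # gradients add up across small batches
    if step % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        optimizer_steps += 1

print("optimizer steps:", optimizer_steps)
```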

VQA-ReGat results

Only three epochs of results were recorded:

--------------mutan.json--------------------------------
epoch 10: train_loss: 2.69, norm: 2.9384, score: 71.8663
	eval score: 76.90 (92.66)
	
-------------butd.json----------------------------------
epoch 28, time: 758.79
	train_loss: 2.53, norm: 2.6914, score: 73.75
	eval score: 75.40 (92.66)
	
epoch 29, time: 765.29
	train_loss: 2.53, norm: 2.6903, score: 73.72
	eval score: 75.40 (92.66)
