CodesErrors - m0_46429066's blog - CSDN Blog

CodesErrors

Articles: 19 · Views: 66885 · Favorites: 26

Author: m0_46429066

nvcc fatal : Unsupported gpu architecture 'compute_75'

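Problem: while installing detectron2, the build failed after I moved from a TITAN XP to a TITAN RTX. From write-ups on detectron2 and on this nvcc error, I learned that the TITAN Xp has compute capability 6.1 and the TITAN RTX has 7.5, while the CUDA 9.0 toolkit I was using supports at most 7.0, so compute_75 is rejected as unsupported. Fix: add two lines to setup.py: extra_compile_args["nvcc"] = ['--gpu-architecture=compute_70','--gpu-code=sm_7...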
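
As a rough illustration of the flag override described above, the sketch below builds the two nvcc flags from a compute capability string. The helper name `nvcc_arch_flags` and the stand-in `extra_compile_args` dict are assumptions for illustration, and the target 7.0 is taken from the post's statement that CUDA 9.0 tops out at 7.0; the excerpt itself is cut off before the full second flag is shown.

```python
def nvcc_arch_flags(compute_capability):
    """Build nvcc flags that pin the target architecture, e.g. "7.0" -> compute_70 / sm_70."""
    cc = compute_capability.replace(".", "")
    return [
        "--gpu-architecture=compute_{}".format(cc),
        "--gpu-code=sm_{}".format(cc),
    ]


# Hypothetical stand-in for the dict that a setup.py passes to its CUDA extension.
extra_compile_args = {"cxx": [], "nvcc": []}
extra_compile_args["nvcc"] = nvcc_arch_flags("7.0")  # assumed target: highest arch CUDA 9.0 supports
print(extra_compile_args["nvcc"])
```

Pinning the architecture this way only works around the toolkit limit; code built for 7.0 is not tuned for a 7.5 card, and upgrading the CUDA toolkit is the alternative.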

Original · 2021-04-30 17:21:29 · 850 views · 1 comment
ProxyError: Conda cannot proceed due to an error in your proxy configuration.

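This error appeared when creating an environment with Anaconda on Linux. Searching showed that it was a network proxy problem. Running env | grep -i "_PROXY" to check whether a proxy was in use printed two lines. The .bashrc file turned out to contain two proxy settings; after commenting them out and logging in to the server again, the environment was created successfully...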
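
The same check can be done from Python. This is a minimal sketch, assuming only that the proxy is configured through environment variables; the helper name `find_proxy_vars` is made up for illustration, and unlike the grep command it matches variable names only, not values.

```python
import os


def find_proxy_vars(environ=None):
    """Return environment variables whose name contains "_proxy" (case-insensitive)."""
    environ = os.environ if environ is None else environ
    return {name: value for name, value in environ.items() if "_proxy" in name.lower()}


# Print whatever proxy variables the current shell session exports.
for name, value in sorted(find_proxy_vars().items()):
    print("{}={}".format(name, value))
```

If this prints nothing after the .bashrc lines are commented out and the session is restarted, the proxy settings are no longer being inherited.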

Original · 2021-03-14 22:55:29 · 2175 views · 1 comment
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool

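Cause: when installing packages hosted abroad from a network inside China, some packages cannot be installed from a mirror and have to come from the original overseas index, which makes the download very slow and leads to read timeouts. Raising pip's timeout resolves this: --default-timeout=6000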

Original · 2021-02-10 22:50:13 · 283 views · 0 comments
Git: fatal: refusing to merge unrelated histories

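Problem: git pull fails with fatal: refusing to merge unrelated histories; the author notes it may also appear around git push and git merge. Cause: the remote branch and the current local branch share no common history. Fix: append --allow-unrelated-histories to the merging command, for example git pull --allow-unrelated-histories; git merge takes the flag in the same way...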

Original · 2020-11-30 15:34:18 · 95 views · 0 comments
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++

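This is only a warning and does not affect the installation. There are ways to silence it; the cleanest is to wrap build_ext with the needed change and replace setup.py's build_ext subcommand with the wrapped version, as described in the blog post "Fixing the -Wstrict-prototypes warning when setup.py compiles C++ code".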
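
One way such a wrapper could look is sketched below. It is not taken from the referenced post: the names `drop_strict_prototypes` and `make_build_ext` are hypothetical, and the sketch assumes a Unix-style compiler object whose flags live in a `compiler_so` list. setuptools is imported lazily so the helper can be tried without it.

```python
from types import SimpleNamespace


def drop_strict_prototypes(compiler):
    """Remove -Wstrict-prototypes from a compiler's shared-object flags, if present."""
    flags = getattr(compiler, "compiler_so", None)
    if flags and "-Wstrict-prototypes" in flags:
        flags.remove("-Wstrict-prototypes")
    return compiler


def make_build_ext():
    """Return a build_ext subclass that strips the flag before compiling."""
    from setuptools.command.build_ext import build_ext  # imported only when needed

    class BuildExtNoStrictPrototypes(build_ext):
        def build_extensions(self):
            drop_strict_prototypes(self.compiler)
            super().build_extensions()

    return BuildExtNoStrictPrototypes


# Demonstration on a fake compiler object (no real build involved).
demo = SimpleNamespace(compiler_so=["gcc", "-Wstrict-prototypes", "-O2"])
drop_strict_prototypes(demo)
print(demo.compiler_so)
```

In a setup.py this would be wired in with something like `cmdclass={"build_ext": make_build_ext()}`.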

Original · 2020-10-21 15:14:50 · 10976 views · 0 comments
RuntimeError: CUDA error: no kernel image is available for execution on the device

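Server environment: four GPUs, three of them GeForce GTX … cards and the fourth a TITAN Xp. Multi-GPU distributed training always raised the error above, and so did single-GPU runs when switched to one of the GeForce GTX … cards. Many sources blame a torch version that is too new and suggest downgrading from 1.3 to 1.2, but my project requires 1.3 or later, so that was not an option. Other sources say the GPU is too old to be supported by 1.3 and later. Testing confirmed that only the TITAN XP runs without the error, so the GeForce GTX … cards indeed cannot run some operations of torch 1.3. The only options were single-GPU training or moving to another server for multi...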

Original · 2020-10-16 12:06:22 · 1995 views · 0 comments
shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.

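Cause: the code calls intersection on a Polygon. According to what I found, the error may occur because the polygon contains a small polygon inside it. The fix is to add a small buffer to the Polygon, as follows: shgeo.Polygon([(left, up), (right, up), (right, down), (left, down)]).buffer(0.001). This resolved the problem. Reference: https://blog.csdn.net/s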

Original · 2020-10-15 16:46:30 · 4288 views · 0 comments
cc1plus: fatal error: cuda_runtime.h: No such file or directory compilation terminated.

Full error: cc1plus: fatal error: cuda_runtime.h: No such file or directory; compilation terminated.; error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1. It occurred while installing mmdetection with python setup.py develop. My environment: python: 3.7.0, cuda: 10.0, pytorch: 1.3.1, gc...

Original · 2020-10-13 21:11:50 · 2978 views · 0 comments
pytorch: RuntimeError: DataLoader worker (pid(s) 27292) exited unexpectedly

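I was running Python code that uses PyTorch inside an anaconda3 virtual environment when the error above occurred. The fixes suggested online are to comment out num_workers = ** and to put the DataLoader code inside the if __name__ == "__main__": block, but neither solved my problem, because in my case the error came from having changed the Python version used to run the file; see the previous post, "ImportError: No module named torch when running a .py file on a Linux server". So the current problem is P...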

Original · 2020-08-02 15:57:37 · 1845 views · 0 comments
ImportError: No module named torch when running a .py file on a Linux server

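The file had run fine until shortly before the error appeared. A missing package cannot be the cause, because pip list shows it; the package is simply not found when the file is run with Python. Typing python on the command line prints Python 2.7.5 (default, Aug 4 2017, 00:39:18), which suggests that inside my virtual environment the server's default python is being invoked rather than the one configured for the environment, so running python xx.py directly cannot find the packages installed with pip in the virtual environment. Some approaches online change the lookup path, edit .bashrc, and so on, but because...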
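
A quick way to confirm which interpreter is actually running a script is sketched below. The function name `describe_interpreter` is made up for illustration; the code sticks to calls available in both Python 2 and Python 3 so it can run under whichever `python` the shell resolves to.

```python
import sys


def describe_interpreter(module_name="torch"):
    """Report which Python is running and whether it can import `module_name`."""
    try:
        __import__(module_name)
        found = True
    except ImportError:
        found = False
    return {
        "executable": sys.executable,          # path of the interpreter in use
        "version": sys.version.split()[0],     # e.g. "2.7.5" or "3.7.0"
        "module_found": found,
    }


if __name__ == "__main__":
    print(describe_interpreter())
```

If the reported executable points outside the virtual environment, the script is being run by the system Python rather than the environment's own.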

Original · 2020-07-27 21:09:39 · 3365 views · 3 comments
GPU memory overflow caused by using register_hook()

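Hooks are powerful. PyTorch normally discards the intermediate results of graph computation, so to get the output of an intermediate layer or the gradient of some variable you can use a hook. There are hooks on tensors and hooks on nn.Module, and they are used in similar ways. The main ones are x.register_hook(hook), layer.register_forward_hook() and layer.register_backward_hook(), where x is a model parameter. The first is mainly used to obtain the gradient of x; the latter two are mainly used during the model's forward and backward passes to obtain...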
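
A minimal sketch of a tensor hook follows. The excerpt is cut off before it explains what caused the memory growth, so storing a detached copy and removing the hook handle afterwards are general precautions assumed here, not necessarily the author's fix; the tensor values are made up.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2            # intermediate (non-leaf) tensor; its gradient is not kept by default
captured = {}


def save_grad(grad):
    # Keep a detached copy so the stored tensor does not reference the autograd graph.
    captured["y"] = grad.detach().clone()


handle = y.register_hook(save_grad)
loss = (y ** 2).sum()
loss.backward()
handle.remove()      # drop the hook once it is no longer needed

print(captured["y"])
```

Here the hook receives the gradient of the loss with respect to y, which would otherwise be discarded after backward().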

Original · 2020-05-24 12:19:33 · 1019 views · 4 comments
Output or loss becomes -inf or nan

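There are many possible causes, such as bad input data or a problem in the model. Exploding gradients or an exploding loss are the usual suspects; running in debug mode helps find the cause by showing when the problem first appears. In my case I had called model.eval() while training the model; changing it to model.train() fixed it...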
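
To see when the problem first appears without stepping through a debugger, the loss values can be scanned for the first non-finite entry. This is a small sketch with made-up loss values; the helper name `first_non_finite` is an assumption, not something from the post.

```python
import math


def first_non_finite(losses):
    """Return the index of the first loss that is NaN or +/-inf, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


history = [0.9, 0.7, float("nan"), 0.5]   # made-up loss values for illustration
print(first_non_finite(history))
```

In a real training loop the same check could be applied to each step's loss (for example via `loss.item()`) to stop as soon as it goes bad.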

Original · 2020-05-01 18:03:11 · 3759 views · 1 comment
RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found

Problem: RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cuda:1. Cause: the model runs in single-machine multi-GPU mode, model = torch.nn.DataParallel(model); for how to configure it and handle it afterwards, see...

Original · 2020-04-20 22:02:34 · 3703 views · 3 comments
a leaf Variable that requires grad has been used in an in-place operation

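In PyTorch's computation graph, a variable that requires gradients cannot be modified during the forward pass with in-place operations such as += or -=; use an out-of-place form such as x = x + 1 instead.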
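
The difference can be reproduced in a few lines. This is a minimal sketch with made-up values: the in-place update on a leaf tensor raises, while the out-of-place form creates a new tensor and backpropagates normally.

```python
import torch

w = torch.ones(3, requires_grad=True)   # leaf tensor that requires grad

raised = False
try:
    w += 1                               # in-place update of a leaf tensor
except RuntimeError as err:
    raised = True
    print("in-place update failed:", err)

out = w + 1                              # out-of-place: builds a new tensor in the graph
out.sum().backward()
print(w.grad)
```

Optimizer-style parameter updates are a separate case: they are normally done in place inside a `torch.no_grad()` block, where autograd does not track the operation.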

Original · 2020-03-31 14:38:57 · 956 views · 0 comments
IndexError: too many indices for tensor of dimension 1

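I hit this while indexing a tensor in PyTorch; for how indexing works, see the post https://blog.csdn.net/xpy870663266/article/details/101597144. I was using an image tensor as the index and a one-dimensional tensor as the lookup table. A tensor used as an index must be of type long, byte, or bool; along the way I had converted the three-dimensional image to uint8, which produced the above...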
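
The excerpt stops before the fix, so the following is an assumption about what resolves it: converting the index tensor to long before the lookup, so that its values are read as positions rather than as a mask. The table and label values are made up.

```python
import torch

table = torch.tensor([10.0, 20.0, 30.0])                      # 1-D lookup table
labels = torch.tensor([[0, 2], [1, 1]], dtype=torch.uint8)    # stand-in for image data

# A uint8 (byte) index is interpreted as a mask, not as positions,
# so convert it to long before using it to look up values.
looked_up = table[labels.long()]
print(looked_up)
```

The result has the same shape as the index tensor, with each entry replaced by the table value at that position.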

Original · 2020-03-20 13:35:50 · 15745 views · 1 comment
Function MulBackward0 returned an invalid gradient: expected torch.FloatTensor but got torch.cuda.FloatTensor

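The key cause is that some variables used in forward are torch.Tensor objects that were never moved to CUDA. Use torchsnooper to check the type of every variable; once they are all consistent, the problem goes away. In short: some variables initialized in forward were not placed on CUDA. See: https://discuss.pytorch.org/t/runtimee...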
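
One common way to avoid this mismatch is to create any helper tensor inside forward on the input's own device. The module below is a hypothetical example, not the author's model, and it falls back to the CPU when no GPU is available.

```python
import torch
import torch.nn as nn


class Scale(nn.Module):
    def forward(self, x):
        # Create the helper tensor on the same device and dtype as the input,
        # instead of a bare tensor that would default to the CPU.
        factor = torch.full((1,), 2.0, device=x.device, dtype=x.dtype)
        return x * factor


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.ones(4, device=device, requires_grad=True)
y = Scale()(x)
y.sum().backward()
print(y.device, x.grad)
```

Because every tensor in the multiplication lives on the same device, the backward pass produces gradients of the expected type.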

Original · 2020-03-05 21:07:41 · 2226 views · 0 comments
RuntimeError: Function MulBackward0 returned an invalid gradient at index 0

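Problem: RuntimeError: Function MulBackward0 returned an invalid gradient at index 0 - expected type torch.FloatTensor but got torch.cuda.FloatTensor. Fix: the tensor type changed somewhere during data processing, and finding where the change starts is tedious; checking with print takes a lot of time, so I used...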

Original · 2020-03-05 18:15:30 · 5117 views · 0 comments
TypeError: build_optimizer() missing 1 required positional argument: 'params'

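named_parameters() iterates over the parameters together with their names, so both can be printed. Use it to check whether the parameters have been initialized. See the linked introduction to named_parameters().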
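
A minimal sketch of that check, using a tiny `nn.Linear` as a stand-in for the real model (an assumption for illustration):

```python
import torch.nn as nn

model = nn.Linear(4, 2)   # hypothetical tiny model for illustration

names = []
for name, param in model.named_parameters():
    names.append(name)
    # Print each parameter's name, shape and whether it will receive gradients.
    print(name, tuple(param.shape), param.requires_grad)
```

An empty listing would mean the model has no registered parameters to hand to the optimizer.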

Original · 2020-02-28 17:00:15 · 1361 views · 0 comments
TypeError: Caught TypeError in DataLoader worker process 0. TypeError: 'tuple' object is not callable

TypeError: Caught TypeError in DataLoader worker process 0. Original Traceback (most recent...

Original · 2020-02-28 13:37:38 · 4153 views · 0 comments
