Problems encountered when training models with PyTorch

qq_33343450

Tags: pytorch, deep learning, artificial intelligence

Article link: https://blog.csdn.net/qq_33343450/article/details/120646268

1. AttributeError: 'DataParallel' object has no attribute 'fc'

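With multi-GPU training in PyTorch, if you saved the whole model (rather than model.state_dict()), loading it back returns the DataParallel wrapper, so accessing a layer such as model.fc raises this error. Unwrap the wrapper through its .module attribute before using the model: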

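import torch

# On recent PyTorch versions, loading a whole pickled model may require weights_only=False
model = torch.load('path/to/model')
# A model saved while wrapped in DataParallel keeps the real network in .module
if isinstance(model, torch.nn.DataParallel):
    model = model.module

# The model can now be used normally
model.eval()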
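
To sidestep the wrapper problem at save time, one option is to save the state_dict of the unwrapped module instead of the whole model. The sketch below is only an illustration, not the author's code: `TinyNet`, the layer sizes, the `unwrap` helper, and the file name `tiny_net.pt` are all hypothetical, and nothing in it needs a GPU.

```python
import os
import tempfile

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a real network that has an `fc` layer."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


def unwrap(model):
    """Return the underlying module if `model` is wrapped in DataParallel."""
    return model.module if isinstance(model, nn.DataParallel) else model


wrapped = nn.DataParallel(TinyNet())

# The wrapper itself does not expose `fc`; the attribute lives on `.module`.
wrapper_has_fc = hasattr(wrapped, "fc")
module_has_fc = hasattr(unwrap(wrapped), "fc")

# Save only the weights of the unwrapped module, so the checkpoint keys
# carry no "module." prefix and load straight into a plain TinyNet.
path = os.path.join(tempfile.mkdtemp(), "tiny_net.pt")
torch.save(unwrap(wrapped).state_dict(), path)

restored = TinyNet()
restored.load_state_dict(torch.load(path))
restored.eval()
```

Saving weights this way means the loading side never sees a DataParallel object, so no isinstance check is needed there.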

2. TypeError: zip argument #122 must support iteration

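3. ValueError: Sum of input lengths does not equal the length of the input dataset!

The split lengths do not add up to the dataset length because of how each part is rounded.

# Do not use round() to compute split lengths. Python 3 rounds halves to the
# nearest even integer; here both halves round up, giving 354 + 152 = 506, not 505.
round(505 * 0.7)  # 353.5 -> 354
round(505 * 0.3)  # 151.5 -> 152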

import math

import torch
from torch.utils.data import random_split

# dataset_ and device are assumed to be defined earlier in the training script

# Before: round() can make the two lengths sum to more than len(dataset_)
len_dataset = len(dataset_)
train_ratio, valid_ratio = 0.7, 0.3
train_dataset, valid_dataset = random_split(
    dataset=dataset_,
    lengths=[round(train_ratio * len_dataset), round(valid_ratio * len_dataset)],
    generator=torch.Generator(device=device.type).manual_seed(0),
)

# After: round the training part down and the validation part up
len_dataset = len(dataset_)
train_ratio, valid_ratio = 0.7, 0.3
train_dataset, valid_dataset = random_split(
    dataset=dataset_,
    lengths=[int(train_ratio * len_dataset), math.ceil(valid_ratio * len_dataset)],
    generator=torch.Generator(device=device.type).manual_seed(0),
)
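
A more general way to avoid the mismatch is to round every part down and hand the leftover samples to one part, so the lengths sum to the dataset size by construction. The helper below is a sketch under that assumption; `split_lengths` is a hypothetical name, not a PyTorch function, and giving the remainder to the last split is an arbitrary choice.

```python
import math


def split_lengths(total, ratios):
    """Turn fractional ratios into integer lengths that sum to `total`.

    Every part is rounded down, then the leftover samples go to the last part.
    """
    if not math.isclose(sum(ratios), 1.0):
        raise ValueError("ratios must sum to 1")
    lengths = [int(total * r) for r in ratios]
    lengths[-1] += total - sum(lengths)
    return lengths


lengths = split_lengths(505, [0.7, 0.3])
print(lengths, sum(lengths))  # [353, 152] 505
```

The result can be passed as `lengths=split_lengths(len(dataset_), [0.7, 0.3])` to `random_split`, and the same helper works for three-way train/valid/test splits.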

4. TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'NoneType'>

5. TypeError: pic should be PIL Image or ndarray. Got <class 'int'>

6. TypeError: pic should be Tensor or ndarray. Got <class 'PIL.Image.Image'>

7. TypeError: pic should be Tensor or ndarray. Got <class 'PIL.Image.Image'>

8. RuntimeError: DataLoader worker (pid 182436) is killed by signal: Terminated.

9. RuntimeError: stack expects each tensor to be equal size, but got [1] at entry 0 and [2] at entry 2

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

CUDA initialization: The NVIDIA driver on your system is too old (found version 10010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at /opt/conda/conda-bld/pytorch_1607370120218/work/c10/cuda/CUDAFunctions.cpp:100.)
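Cause: the CUDA and torch versions do not match (the installed NVIDIA driver is too old for the CUDA version this PyTorch build was compiled with).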
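
To see which side is out of date, it can help to print what the installed PyTorch build expects. The attributes used below exist in PyTorch; comparing the output against the driver version reported by `nvidia-smi` is left to the reader, and this sketch does not check it.

```python
import torch

# Version of the installed PyTorch package.
print("PyTorch:", torch.__version__)
# CUDA toolkit version this build was compiled against (None for CPU-only builds).
print("Built with CUDA:", torch.version.cuda)
# False when no usable GPU driver is found, including a driver too old for this build.
cuda_ok = torch.cuda.is_available()
print("CUDA usable:", cuda_ok)
```

If the build's CUDA version is newer than the driver supports, either update the driver or install a PyTorch build compiled for an older CUDA version, as the warning itself suggests.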

Common Python bugs

1. TypeError: unsupported operand type(s) for -: 'builtin_function_or_method' and 'float'

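The call parentheses are missing, so the function object itself is used in the subtraction:
print(f'{time.time - t0}')
should be
print(f'{time.time() - t0}')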
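
The difference is easy to see in isolation: `time.time` is the function object, while `time.time()` is the float it returns. A small sketch, where the `sleep` call merely stands in for whatever work is being timed:

```python
import time

t0 = time.time()
time.sleep(0.01)  # placeholder for the work being timed

try:
    time.time - t0  # missing parentheses: function object minus float
    error_message = None
except TypeError as exc:
    error_message = str(exc)

elapsed = time.time() - t0  # call the function first, then subtract
print(error_message)
print(f"elapsed: {elapsed:.3f}s")
```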

pip download bug
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.

# 1. Raise the timeout when downloading from the default index
pip install --default-timeout=100 alphabet
# 2. Raise the timeout when downloading from the Tsinghua mirror
pip --default-timeout=100 install tensorflow==2.0.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
Blog link: https://blog.csdn.net/woai8339/article/details/91351707

os summary
https://www.cnblogs.com/funsion/p/4017989.html

1. Converting PIL to OpenCV

import cv2
import numpy
from PIL import Image

image = Image.open("plane.jpg")
image.show()
# PIL stores channels as RGB, while OpenCV expects BGR
img = cv2.cvtColor(numpy.asarray(image), cv2.COLOR_RGB2BGR)
cv2.imshow("OpenCV", img)
cv2.waitKey()

2. Converting OpenCV to PIL

import cv2
from PIL import Image

img = cv2.imread("plane.jpg")
cv2.imshow("OpenCV", img)
# cv2.imread returns BGR, so convert to RGB before handing the array to PIL
image = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
image.show()
cv2.waitKey()
