BrokenPipeError: [Errno 32] Broken pipe
Preface: Today, while training YOLOv5 6.1, the error BrokenPipeError: [Errno 32] Broken pipe suddenly appeared.
1. Running the command python train.py produces the following error
Traceback (most recent call last):
  File "train.py", line 643, in <module>
    main(opt)
  File "train.py", line 539, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 237, in train
    prefix=colorstr('test: '))[0]
  File "D:\liufq\yolov5-6.1\utils\datasets.py", line 122, in create_dataloader
    collate_fn=LoadImagesAndLabels.collate_fn4 if quad else LoadImagesAndLabels.collate_fn), dataset
  File "D:\liufq\yolov5-6.1\utils\datasets.py", line 134, in __init__
    self.iterator = super().__iter__()
  File "E:\Anaconda3\envs\yolov550\lib\site-packages\torch\utils\data\dataloader.py", line 359, in __iter__
    return self._get_iterator()
  File "E:\Anaconda3\envs\yolov550\lib\site-packages\torch\utils\data\dataloader.py", line 305, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "E:\Anaconda3\envs\yolov550\lib\site-packages\torch\utils\data\dataloader.py", line 918, in __init__
    w.start()
  File "E:\Anaconda3\envs\yolov550\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "E:\Anaconda3\envs\yolov550\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "E:\Anaconda3\envs\yolov550\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "E:\Anaconda3\envs\yolov550\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "E:\Anaconda3\envs\yolov550\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
2. Comparing against the previous version's code, nothing looks wrong. The traceback ends in PyTorch's dataloader.py while it is starting DataLoader worker processes: on Windows, multiprocessing uses the spawn start method, so reduction.dump pickles the worker objects (including the dataset) into the child process's pipe, and that write is what fails with a broken pipe. The call chain starts in create_dataloader in utils/datasets.py:
def create_dataloader(path, imgsz, batch_size, stride, single_cls=False, hyp=None, augment=False, cache=False, pad=0.0,
                      rect=False, rank=-1, workers=8, image_weights=False, quad=False, prefix='', shuffle=False):
    if rect and shuffle:
        LOGGER.warning('WARNING: --rect is incompatible with DataLoader shuffle, setting shuffle=False')
        shuffle = False
    with torch_distributed_zero_first(rank):  # init dataset *.cache only once if DDP
        dataset = LoadImagesAndLabels(path, imgsz, batch_size,
                                      augment=augment,  # augmentation
                                      hyp=hyp,  # hyperparameters
                                      rect=rect,  # rectangular batches
                                      cache_images=cache,
                                      single_cls=single_cls,
                                      stride=int(stride),
                                      pad=pad,
                                      image_weights=image_weights,
                                      prefix=prefix)

    batch_size = min(batch_size, len(dataset))
    nd = torch.cuda.device_count()  # number of CUDA devices
    nw = min([os.cpu_count() // max(nd, 1), batch_size if batch_size > 1 else 0, workers])  # number of workers
    sampler = None if rank == -1 else distributed.DistributedSampler(dataset, shuffle=shuffle)
    loader = DataLoader if image_weights else InfiniteDataLoader  # only DataLoader allows for attribute updates
    return loader(dataset,
                  batch_size=batch_size,
                  shuffle=shuffle and sampler is None,
                  num_workers=nw,
                  sampler=sampler,
                  pin_memory=True,
                  collate_fn=LoadImagesAndLabels.collate_fn4 if quad else LoadImagesAndLabels.collate_fn), dataset
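The nw expression above is what eventually becomes num_workers. A quick sketch of that arithmetic with assumed example values (8 CPU cores, 1 CUDA device, batch size 16; not taken from the run below) shows why passing workers=0 forces nw to 0 and keeps data loading in the main process:

def compute_nw(workers, batch_size, nd, cpu_count):
    # same min(...) expression as in create_dataloader above;
    # compute_nw is a hypothetical helper name, not part of YOLOv5
    return min([cpu_count // max(nd, 1), batch_size if batch_size > 1 else 0, workers])

# assumed example values: 8 CPU cores, 1 CUDA device, batch size 16
print(compute_nw(workers=8, batch_size=16, nd=1, cpu_count=8))  # -> 8 worker subprocesses
print(compute_nw(workers=0, batch_size=16, nd=1, cpu_count=8))  # -> 0, data loaded in the main process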
3. Check the official DataLoader documentation for num_workers
num_workers (int, optional) – how many subprocesses to use for data loading. 0
means that the data will be loaded in the main process. (default: 0)
This parameter sets how many worker subprocesses are spawned to load the dataset; with 0, the data is loaded in the main process and no child processes are created at all.
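A minimal, self-contained sketch (a toy dataset, not YOLOv5 code) shows where those subprocesses come from: any num_workers > 0 makes the DataLoader spawn workers exactly as in the traceback above, which on Windows also requires the if __name__ == '__main__' guard, while num_workers=0 keeps everything in the main process.

import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    # stand-in dataset for illustration only
    def __len__(self):
        return 8

    def __getitem__(self, i):
        return torch.zeros(3, 640, 640), i

if __name__ == '__main__':  # required on Windows whenever num_workers > 0
    # num_workers=2 would spawn worker subprocesses (the step that breaks above);
    # num_workers=0 loads every batch in the main process, so nothing is pickled
    loader = DataLoader(ToyDataset(), batch_size=4, num_workers=0)
    for imgs, idx in loader:
        print(imgs.shape, idx)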
4. Set workers=0 in the training arguments (which makes create_dataloader pass num_workers=0)
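There are two equivalent ways to do this; the paths in the command are simply the ones from the run below, and the argparse line is reproduced from memory of train.py, so treat this as a sketch rather than an exact quote. Either run python train.py --cfg ./models/yolov5s-se-ghost.yaml --data data/PV_Data/PV.yaml --workers 0, or change the default of the --workers option in train.py:

import argparse

parser = argparse.ArgumentParser()
# in YOLOv5 6.1 train.py the option is defined roughly like this; change the default from 8 to 0
parser.add_argument('--workers', type=int, default=0, help='max dataloader workers (per RANK in DDP mode)')
opt = parser.parse_args([])  # no CLI args here, just to show the resulting default
print(opt.workers)  # -> 0, which create_dataloader turns into num_workers=0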
With this change the problem is resolved and training starts normally:
train: weights=, cfg=./models/yolov5s-se-ghost.yaml, data=data\PV_Data\PV.yaml, hyp=data\hyps\hyp.scratch-low.yaml, epochs=300, batch_size=16, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=0, project=runs\train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: skipping check (not a git repository), for updates see https://github.com/ultralytics/yolov5
YOLOv5 2022-2-22 torch 1.9.0+cu102 CUDA:0 (GeForce RTX 2080 Ti, 11264MiB)
hyperparameters: lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
Weights & Biases: run 'pip install wandb' to automatically track and visualize YOLOv5 runs (RECOMMENDED)
TensorBoard: Start with 'tensorboard --logdir runs\train', view at http://localhost:6006/
Overriding model.yaml nc=2 with nc=3
from n params module arguments
0 -1 1 3520 models.common.Conv [3, 32, 6, 2, 2]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 3440 models.common.GhostBottleneck [64, 64, 3, 1]
3 -1 1 18784 models.common.GhostBottleneck [64, 128, 3, 2]
4 -1 3 32928 models.common.GhostBottleneck [128, 128, 3, 1]
5 -1 1 2184 models.common.SElayer [128, 16]
6 -1 1 66240 models.common.GhostBottleneck [128, 256, 3, 2]
7 -1 3 115008 models.common.GhostBottleneck [256, 256, 3, 1]
8 -1 1 8464 models.common.SElayer [256, 16]
9 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
10 -1 1 656896 models.common.SPPF [512, 512, 5]
11 -1 1 33312 models.common.SElayer [512, 16]
12 -1 1 142208 models.common.GhostBottleneck [512, 512, 3, 1]
13 -1 1 1024 models.common.DWConv [512, 256, 1, 1]
14 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
15 [-1, 8] 1 0 models.common.Concat [1]
16 -1 1 5120 models.common.DWConv [512, 256, 3, 1]
17 -1 1 38336 models.common.GhostBottleneck [256, 256, 3, 1]
18 -1 1 512 models.common.DWConv [256, 128, 1, 1]
19 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
20 [-1, 5] 1 0 models.common.Concat [1]
21 -1 1 2560 models.common.DWConv [256, 128, 3, 1]
22 -1 1 10976 models.common.GhostBottleneck [128, 128, 3, 1]
23 -2 1 1408 models.common.DWConv [128, 128, 3, 2]
24 [-1, 18] 1 0 models.common.Concat [1]
25 -1 1 38336 models.common.GhostBottleneck [256, 256, 3, 1]
26 -2 1 2816 models.common.DWConv [256, 256, 3, 2]
27 [-1, 13] 1 0 models.common.Concat [1]
28 -1 1 142208 models.common.GhostBottleneck [512, 512, 3, 1]
29 [22, 25, 28] 1 21576 models.yolo.Detect [3, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
E:\Anaconda3\envs\yolov550\lib\site-packages\torch\_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at ..\aten\src\ATen\native\BinaryOps.cpp:467.)
return torch.floor_divide(self, other)
Model Summary: 412 layers, 2547088 parameters, 2547088 gradients, 5.5 GFLOPs
Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 73 weight (no decay), 82 weight, 82 bias