SSDA-YOLO: Environment Setup and Training

Running SSDA

1. Install the environment: CUDA 11.8, Python 3.11

pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
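
After installing, it can help to confirm that the interpreter and the PyTorch build match what this guide expects. The following is a minimal sketch: `describe_env` is a hypothetical helper name (not part of SSDA-YOLO), and it reports `None` for the PyTorch fields when torch cannot be found.

```python
import importlib.util
import platform


def describe_env() -> dict:
    """Collect the Python version and, if PyTorch is installed, its build details."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "torch_cuda_build": None,
        "cuda_available": None,
    }
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__              # version string of the installed build
        info["torch_cuda_build"] = torch.version.cuda  # CUDA version torch was built against
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

For the setup above one would expect Python 3.11.x, a torch 2.2.0 build, and a CUDA build of 11.8; `cuda_available` additionally depends on the driver and GPU of the machine.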

2. Prepare the datasets for **Cityscapes --> Foggy Cityscapes**, and download the converted datasets that the code needs

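Convert the datasets to the YOLOv5 dataset layout (VOC format) using the following two scripts.

You only need to change the file paths inside the scripts.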
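
The conversion scripts themselves are not reproduced here, so the following is only a sketch of the coordinate step such a conversion usually performs: turning a VOC-style pixel box into a normalized YOLO box. The function name `voc_to_yolo` is hypothetical, and whether the repository's scripts do exactly this is an assumption.

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert a VOC box (xmin, ymin, xmax, ymax) in pixels to YOLO
    (x_center, y_center, width, height), each normalized by the image size."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Cityscapes images are 2048x1024 pixels.
print(voc_to_yolo((100, 200, 300, 400), 2048, 1024))
```

Each line of a YOLO label file then holds the class index followed by these four normalized values.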

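PS: Before running the scripts, rearrange how the Cityscapes and Foggy Cityscapes dataset files are stored on disk.

The annotations folder in the image below is not required by this algorithm!

The image paths are laid out as follows:

The repository also provides already-converted datasets. Follow the instructions in Readme.md to download them: click the GoogleDrive links in the second- and third-to-last lines shown in the image.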

3. Edit the paths in cityscapes_csfoggy_VOC.yaml

Change the file path after path. Because the code expects specific folder names, rename the dataset folders to match the code:

cityscapes --> CityScapes

foggy_cityscapes --> CityScapesFoggy
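
The two renames above can also be scripted. This is a minimal sketch using only the standard library; `rename_dataset_dirs` is a hypothetical helper, and the demo runs against a temporary directory rather than a real dataset root.

```python
import tempfile
from pathlib import Path

# Old folder name -> name expected by the code, as listed above.
RENAMES = {
    "cityscapes": "CityScapes",
    "foggy_cityscapes": "CityScapesFoggy",
}


def rename_dataset_dirs(root: Path) -> list:
    """Rename dataset folders under root; return the new names that were applied."""
    existing = {p.name for p in root.iterdir() if p.is_dir()}
    applied = []
    for old, new in RENAMES.items():
        if old in existing and new not in existing:
            (root / old).rename(root / new)
            applied.append(new)
    return applied


# Demo on a throwaway directory standing in for the dataset root.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "cityscapes").mkdir()
    (demo_root / "foggy_cityscapes").mkdir()
    print(rename_dataset_dirs(demo_root))
```

To use it for real, pass the directory that the path entry in cityscapes_csfoggy_VOC.yaml points to.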

4. Training

Preparation before training

# Install the required dependency
pip install wandb

# Download the weight files and put them in the weights folder
https://github.com/ultralytics/yolov5/releases/download/v5.0/yolov5s.pt
https://github.com/ultralytics/yolov5/releases/download/v5.0/yolov5m.pt
https://github.com/ultralytics/yolov5/releases/download/v5.0/yolov5l.pt
https://github.com/ultralytics/yolov5/releases/download/v5.0/yolov5x.pt
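
If you prefer to fetch the checkpoints from Python, a sketch along these lines should work. It uses only the standard library; `weight_url` and `download_weights` are hypothetical helper names, and the base URL is taken from the links listed above.

```python
from pathlib import Path
from urllib.request import urlretrieve

BASE_URL = "https://github.com/ultralytics/yolov5/releases/download/v5.0"
MODELS = ("yolov5s", "yolov5m", "yolov5l", "yolov5x")


def weight_url(model: str) -> str:
    """Build the release URL for one checkpoint name."""
    return f"{BASE_URL}/{model}.pt"


def download_weights(models=MODELS, dest=Path("weights")) -> None:
    """Download each checkpoint into dest, skipping files that already exist."""
    dest.mkdir(parents=True, exist_ok=True)
    for model in models:
        target = dest / f"{model}.pt"
        if not target.exists():
            urlretrieve(weight_url(model), target)
```

Calling `download_weights(models=("yolov5l",))` would fetch only the checkpoint used by the training command.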
python ssda_yolov5_train.py \
  --weights weights/yolov5l.pt \
  --data cityscapes_csfoggy_VOC.yaml \
  --name cityscapes2foggy_ssda_960_yolov5l \
  --img 960 --device 0 --batch-size 4 --epochs 200 \
  --lambda_weight 0.005 --consistency_loss --alpha_weight 2.0

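PS: nproc_per_node is the number of GPUs to use. It applies only when launching through torch.distributed.launch or torchrun, not to the single-process command above.

name is the name of the folder where the run is saved.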
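
To make the relationship between the single-GPU command and a multi-GPU launch explicit, here is a small sketch that assembles either form. `build_train_command` is a hypothetical helper; passing a comma-separated list to --device follows the usual YOLOv5 convention and is an assumption here. The multi-GPU branch is untested in this guide, and the distributed launch failed for the author (see item 3 under troubleshooting).

```python
import shlex


def build_train_command(num_gpus: int = 1, batch_size: int = 4) -> list:
    """Return the training command as an argument list for 1 or more GPUs."""
    script_args = [
        "ssda_yolov5_train.py",
        "--weights", "weights/yolov5l.pt",
        "--data", "cityscapes_csfoggy_VOC.yaml",
        "--name", "cityscapes2foggy_ssda_960_yolov5l",
        "--img", "960",
        "--device", ",".join(str(i) for i in range(num_gpus)),
        "--batch-size", str(batch_size),
        "--epochs", "200",
        "--lambda_weight", "0.005",
        "--consistency_loss",
        "--alpha_weight", "2.0",
    ]
    if num_gpus > 1:
        # One worker process per GPU, started by the distributed launcher.
        launcher = ["python", "-m", "torch.distributed.run",
                    "--nproc_per_node", str(num_gpus)]
    else:
        launcher = ["python"]
    return launcher + script_args


print(shlex.join(build_train_command()))
```

With the defaults this prints the same single-GPU command shown above, on one line.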

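Troubleshooting

1. Launching with torch.distributed.launch prints the deprecation warning quoted below. To fix it, use torchrun in place of torch.distributed.launch; torchrun is the newer launcher and can also be invoked as a module:

python -m torch.distributed.run

The warning: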
/home/ubuntu/anaconda3/envs/SSDA/lib/python3.11/site-packages/torch/distributed/launch.py:183: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use-env is set by default in torchrun.
If your script expects `--local-rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions
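
The warning's advice about `--local-rank` can be sketched as follows: read the rank from the `LOCAL_RANK` environment variable that torchrun exports, and fall back to the command-line flag for older launchers. `parse_local_rank` is a hypothetical helper, not a function from the SSDA-YOLO code.

```python
import argparse
import os


def parse_local_rank(argv=None) -> int:
    """Return the local rank, preferring the LOCAL_RANK environment variable."""
    parser = argparse.ArgumentParser()
    # Accept both spellings; older launchers pass the rank on the command line.
    parser.add_argument("--local_rank", "--local-rank", type=int, default=-1)
    args, _ = parser.parse_known_args(argv)
    # torchrun sets LOCAL_RANK for every worker; -1 means "not distributed".
    return int(os.environ.get("LOCAL_RANK", args.local_rank))


print(parse_local_rank([]))
```

A value of -1 is the conventional marker for a plain single-process run.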

2. The wandb visualization tool prompts for a login at startup

wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 

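You can request an API key (free) on the Wandb (Weights & Biases) website and enter it while you have network access.

Alternatively, modify the source so that wandb never starts, as follows.

[Screenshot: source change that disables wandb]

PS: in my copy of the code, init contains no code, so I skipped the step above.

In wandb_utils.py, comment out the following code and add wandb = None: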

Before:

try:
    import wandb

    assert hasattr(wandb, '__version__')  # verify package import not local dir
    LOGGER.warning(DEPRECATION_WARNING)

except (ImportError, AssertionError):
    wandb = None

After:

# try:
#     import wandb
#     assert hasattr(wandb, '__version__')  # verify package import not local dir
#     LOGGER.warning(DEPRECATION_WARNING)
# except (ImportError, AssertionError):
#     wandb = None
wandb = None
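
Why does forcing `wandb = None` silence the prompt? Code written in this style typically checks the module variable before every logging call, so a `None` value turns those calls into no-ops. The sketch below illustrates the pattern in isolation; `load_wandb`, `log_metrics`, and the `DISABLE_WANDB` variable are hypothetical names for illustration, not part of the repository.

```python
import importlib
import os


def load_wandb(disabled: bool):
    """Return the wandb module, or None when disabled or not installed."""
    if disabled:
        return None
    try:
        return importlib.import_module("wandb")
    except ImportError:
        return None


# Disabled unless DISABLE_WANDB=0 is set explicitly (hypothetical switch).
wandb = load_wandb(os.environ.get("DISABLE_WANDB", "1") != "0")


def log_metrics(metrics: dict) -> bool:
    """Log to wandb when it is active; report whether anything was logged."""
    if wandb is None:
        return False
    wandb.log(metrics)
    return True


print(log_metrics({"loss": 0.5}))
```

Compared with commenting out the import, a switch like this keeps the original code path available for later use.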

3. Training with python -m torch.distributed.run fails with the following error:

torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 78738) of binary: /home/ubuntu/anaconda3/envs/SSDA/bin/python

 

After several attempts, I dropped the distributed launcher and ran the script directly:

python ssda_yolov5_train.py \
  --weights weights/yolov5l.pt \
  --data cityscapes_csfoggy_VOC.yaml \
  --name cityscapes2foggy_ssda_960_yolov5l \
  --img 960 --device 0 --batch-size 24 --epochs 200 \
  --lambda_weight 0.005 --consistency_loss --alpha_weight 2.0

4. Runtime error

AttributeError: module 'numpy' has no attribute 'int'.
`np.int` was a deprecated alias for the builtin `int`. To avoid this error in existing code, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'inf'?

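Following the hint, replace np.int with np.int_ in the affected .py files (the builtin int, which the message itself recommends, works as well).

Some posts online suggest changing the NumPy version instead. I tried that without success, but you are welcome to try it yourself.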
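
When several files are affected, the replacement can be done with a regular expression that leaves names such as np.int_, np.int32, and np.int64 untouched. This is a minimal sketch; `fix_np_int` is a hypothetical helper, and it is worth reviewing the diff before overwriting any file.

```python
import re

# Matches np.int exactly, not np.int_, np.int32, np.int64, or jnp.int.
NP_INT = re.compile(r"\bnp\.int\b")


def fix_np_int(source: str, replacement: str = "np.int_") -> str:
    """Replace the removed alias np.int in a piece of source code."""
    return NP_INT.sub(replacement, source)


print(fix_np_int("idx = x.astype(np.int)"))
```

To patch a file in place one could write `path.write_text(fix_np_int(path.read_text()))` for each `pathlib.Path` of interest.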

5. CUDA out of memory

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 338.00 MiB. GPU 0 has a total capacity of 23.63 GiB of which 246.06 MiB is free. Including non-PyTorch memory, this process has 22.64 GiB memory in use. Of the allocated memory 21.91 GiB is allocated by PyTorch, and 284.71 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Reduce the batch size in the training command. I ended up setting it to 4, which runs.
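
Finding a workable batch size by hand amounts to retrying with smaller values. The sketch below automates that idea by halving until a trial step succeeds. `find_batch_size` and `fake_step` are hypothetical; `fake_step` only simulates a GPU that runs out of memory above batch size 4, and a real trial function would run one forward and backward pass instead.

```python
def find_batch_size(try_step, start: int = 24, minimum: int = 1) -> int:
    """Halve the batch size until try_step(batch_size) stops raising an OOM error."""
    batch_size = start
    while batch_size >= minimum:
        try:
            try_step(batch_size)
            return batch_size
        except RuntimeError as err:  # torch.cuda.OutOfMemoryError is a RuntimeError
            if "out of memory" not in str(err):
                raise
            batch_size //= 2
    raise RuntimeError("no batch size fits in memory")


def fake_step(batch_size: int) -> None:
    """Stand-in for one training step: anything above 4 'exhausts' memory."""
    if batch_size > 4:
        raise RuntimeError("CUDA out of memory (simulated)")


print(find_batch_size(fake_step))
```

Halving from 24 tries 24, 12, 6, and then 3, so the search can settle slightly below the largest size that fits; a hand-picked value such as 4 may still be the better choice.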

6. RuntimeError: result type Float can't be cast to the desired output type long int

Traceback (most recent call last):
  File "/home/ubuntu/domain_projects/SSDA-YOLO-master/ssda_yolov5_train.py", line 831, in <module>
    main(opt)
  File "/home/ubuntu/domain_projects/SSDA-YOLO-master/ssda_yolov5_train.py", line 824, in main
    train(opt.hyp, opt, device)
  File "/home/ubuntu/domain_projects/SSDA-YOLO-master/ssda_yolov5_train.py", line 492, in train
    loss_sr, loss_items_sr = compute_loss(pred_sr, targets_sr.to(device))  # loss scaled by batch_size
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/domain_projects/SSDA-YOLO-master/utils/loss.py", line 118, in __call__
    tcls, tbox, indices, anchors = self.build_targets(p, targets)  # targets
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/domain_projects/SSDA-YOLO-master/utils/loss.py", line 216, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1)))  # image, anchor, grid indices
                          ^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: result type Float can't be cast to the desired output type long int

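Fix: change two places in loss.py under utils.

1. Open loss.py in your utils folder.

2. Press Ctrl+F, search for for i in range(self.nl), and find the line just below it that assigns anchors from self.anchors[i].

Replace that line with the following, which also captures the shape of the prediction tensor:

anchors, shape = self.anchors[i], p[i].shape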

3. Press Ctrl+F, search for indices.append, and find the line shown in the traceback above.

Replace it in the same way with:

indices.append((b, a, gj.clamp_(0, shape[2] - 1), gi.clamp_(0, shape[3] - 1)))  # image, anchor, grid

Run the training again and the error should be gone.
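
The reason this fix works is that the original bounds (gain[3] - 1 and gain[2] - 1) are float tensors, so clamping the integer index tensors in place would have to produce a float result. The values shape[2] - 1 and shape[3] - 1 are plain Python ints, which keeps the indices as integers. The sketch below isolates that step; `clamp_grid_indices` is a hypothetical helper, and the demo part runs only when PyTorch is installed.

```python
import importlib.util


def clamp_grid_indices(gj, gi, shape):
    """Clamp row (gj) and column (gi) indices in place using integer bounds.

    shape is the prediction shape (batch, anchors, grid_h, grid_w, outputs),
    so shape[2] - 1 and shape[3] - 1 are the last valid row and column.
    """
    return gj.clamp_(0, shape[2] - 1), gi.clamp_(0, shape[3] - 1)


if importlib.util.find_spec("torch") is not None:  # demo needs PyTorch
    import torch

    p_i = torch.zeros(1, 3, 20, 30, 85)  # toy prediction for one detection layer
    gj = torch.tensor([-1, 5, 25])       # integer row indices, some out of range
    gi = torch.tensor([-2, 7, 40])       # integer column indices, some out of range
    gj, gi = clamp_grid_indices(gj, gi, p_i.shape)
    print(gj.tolist(), gi.tolist(), gj.dtype)
```

In the demo the grid is 20 rows by 30 columns, so row indices are limited to 0..19 and column indices to 0..29.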
