YOLOv5s training process

森林盲点

Last modified: 2023-11-23 10:16:59

Tags: YOLO, deep learning, artificial intelligence

First published: 2023-07-26 17:46:31

Article link: https://blog.csdn.net/qq_27245699/article/details/131943855

1. Training

Training command: python -m torch.distributed.run --master_port 2950 --nproc_per_node 3 train.py --batch 12 --data ./data/base.yaml --epochs 200 --weights ./weight/yolov5s.pt --multi-scale --workers 24 --device 1,2,3 --img 960 >>train_lj.log 2>&1

python -m torch.distributed.run --master_port 2950 --nproc_per_node 3 train.py --batch 12 --data ./data/base.yaml --epochs 200 --weights ./weight/yolov5m.pt --multi-scale --workers 24 --device 1,2,3 --img 960 >>train_lj.log 2>&1

python -m torch.distributed.run --master_port 2950 --nproc_per_node 4 train.py --batch 48 --data ./data/base.yaml --epochs 200 --weights ./weight/yolov5m.pt --multi-scale --workers 48 --device 0,1,2,3 --img 640 >>train_lj.log 2>&1

python -m torch.distributed.run --master_port 2950 --nproc_per_node 4 train.py --batch 24 --data ./data/base.yaml --epochs 200 --weights ./weight/yolov5m.pt --multi-scale --workers 24 --device 0,1,2,3 --img 960 >>train_lj.log 2>&1

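nproc_per_node: number of GPUs, one process per GPU.

batch: total batch size across all GPUs; 12 in total with 3 cards is 4 per card.

workers: number of CPU threads used for data loading; 24 in total is treated here as 8 threads per card. (YOLOv5's own help text describes --workers as the maximum number of dataloader workers per rank in DDP mode, so check python train.py -h for the version you use.)

img: input image size in pixels.

>>train_lj.log: appends the output to the log file train_lj.log; 2>&1 sends stderr to the same file.

Note: I am using T4 cards with yolov5s. At --img 960 the batch size can only be 4 per card; setting workers to 24 improved training speed somewhat.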
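The per-card arithmetic above can be written down as a small helper. This is only a sketch that follows this article's reading of the flags (both totals divided evenly by the number of cards); the function name per_gpu_split is made up for illustration and is not part of YOLOv5.

```python
from typing import Tuple


def per_gpu_split(total_batch: int, total_workers: int, n_gpus: int) -> Tuple[int, int]:
    """Split the totals given on the command line evenly across the GPUs."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    if total_batch % n_gpus != 0:
        # The total batch has to divide evenly across the processes.
        raise ValueError(f"--batch {total_batch} is not a multiple of {n_gpus} GPUs")
    # Workers are divided with integer division, following this article's reading.
    return total_batch // n_gpus, total_workers // n_gpus


if __name__ == "__main__":
    print(per_gpu_split(12, 24, 3))    # the --img 960 run on 3 cards
    print(per_gpu_split(84, 114, 3))   # the --img 640 run on 3 cards
```

For the two three-card runs in this article this gives 4 and 8 per card at 960, and 28 and 38 per card at 640.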

Use a tmux session to keep the training running in the background:

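sudo apt-get install tmux   # install
tmux new -s ${session-name} # create a session and set its name
# [ Ctrl+b ] is the tmux prefix key; after pressing it, press another key to run the corresponding command
[ Ctrl+b ] [ d ]                         # detach the session from the window, or run tmux detach
tmux ls                                  # list all sessions, or use tmux list-sessions
tmux attach -t ${session-name}           # attach the terminal window to the session by name
tmux kill-session -t ${session-name}     # kill the session by name
tmux switch -t ${session-name}           # switch to the session by name
tmux rename-session -t 0 ${session-name} # rename session 0 to the given name

tmux                        # start tmux (opens a new session)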

Training workflow:

[terminal]: tmux new -s train_model       # create a session named train_model
[tmux]: conda activate env_name           # inside the tmux session, activate the conda environment to use
[tmux]: python train.py                   # inside the tmux session, start training the model
[tmux]: [ Ctrl+b ] [ d ]                  # detach the session from the window
[terminal]: tmux ls                       # list the session just created
[terminal]: watch -n 1 -c gpustat --color # monitor GPU status (not used here)
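If tmux is not available, a similar effect can be approximated from Python. The sketch below is an alternative technique, not what I used above: it starts a command in its own session and appends its output to a log file, which roughly corresponds to running the command with >>train_lj.log 2>&1 in the background. The helper name launch_detached is hypothetical, and start_new_session only has an effect on POSIX systems.

```python
import subprocess
import sys


def launch_detached(cmd: list, log_path: str) -> subprocess.Popen:
    """Start cmd in its own session, appending stdout and stderr to log_path."""
    log_file = open(log_path, "ab")  # append mode mirrors the shell's >>
    try:
        return subprocess.Popen(
            cmd,
            stdout=log_file,
            stderr=subprocess.STDOUT,   # same effect as 2>&1
            stdin=subprocess.DEVNULL,
            start_new_session=True,     # POSIX only: detach from the terminal's session
        )
    finally:
        log_file.close()  # the child keeps its own copy of the file descriptor


# Example (assumes train.py is in the current directory):
# proc = launch_detached([sys.executable, "train.py"], "train_lj.log")
# print("started PID", proc.pid)
```

Unlike tmux, there is no session to re-attach to afterwards; progress has to be followed through the log file.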

tensorboard --logdir runs/train --port 7101 --bind_all

This sets the TensorBoard port to 7101; open ip:7101 in a browser to view it.

--bind_all: lets other devices view it through the web page.

python -m torch.distributed.run --master_port 2950 --nproc_per_node 3 train.py --batch 84 --data ./data/secondstage.yaml --epochs 200 --weights ./weight/yolov5s.pt --multi-scale --workers 114 --device 1,2,3 --img 640 >>train_lj.log 2>&1

python -m torch.distributed.run --master_port 2950 --nproc_per_node 4 train.py --batch 80 --data ./data/secondstage.yaml --epochs 200 --weights ./weight/yolov5m.pt --multi-scale --workers 96 --device 0,1,2,3 --img 512 >>train_lj.log 2>&1

python -m torch.distributed.run --master_port 2950 --nproc_per_node 7 train.py --batch 126 --data ./data/secondstage.yaml --epochs 200 --weights ./runs/train/exp53/weights/best.pt --multi-scale --workers 140 --device 0,1,2,3,4,5,6 --img 512 >>train_lj.log 2>&1

python -m torch.distributed.run --master_port 2950 --nproc_per_node 4 train.py --batch 88 --data ./data/secondstage.yaml --epochs 200 --weights ./weight/yolov5m.pt --multi-scale --workers 80 --device 0,1,2,3 --img 512 >>train_lj.log 2>&1

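Adjust the batch size and workers (the CPU threads that feed data to the GPUs) to maximize resource utilization.

I am using Tesla T4 GPUs; three are available, each with 15 GB of GPU memory.

Running top and then pressing 1 shows per-CPU usage and RAM usage. Both have plenty of headroom, so the only bottleneck is GPU memory.

With training images at 640*640: batch size per card 28, workers per card 38.

By comparison, at 960*960: batch size per card 4, workers per card 7 or 8.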

Server 10: stage-one training command:

python -m torch.distributed.run --master_port 2950 --nproc_per_node 2 train.py --batch 24 --data ./data/base.yaml --epochs 200 --weights ./weight/yolov5s.pt --multi-scale --workers 48 --device 1,2 --img 640 >>train_lj.log 2>&1
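All the training commands in this section share one shape and differ only in the device list, batch size, workers, image size, data file and weights. As a rough sketch, they could be assembled by a helper like the one below. The function build_ddp_command is hypothetical; it only reuses the flags that already appear in the commands above, and leaves the log redirection to the shell.

```python
import shlex
from typing import List


def build_ddp_command(devices: List[int], batch: int, workers: int, img: int,
                      data: str, weights: str, epochs: int = 200,
                      master_port: int = 2950) -> List[str]:
    """Assemble the torch.distributed.run launch line used in this section."""
    if batch % len(devices) != 0:
        raise ValueError("batch must be a multiple of the number of GPUs")
    return [
        "python", "-m", "torch.distributed.run",
        "--master_port", str(master_port),
        "--nproc_per_node", str(len(devices)),   # one process per listed GPU
        "train.py",
        "--batch", str(batch),
        "--data", data,
        "--epochs", str(epochs),
        "--weights", weights,
        "--multi-scale",
        "--workers", str(workers),
        "--device", ",".join(str(d) for d in devices),
        "--img", str(img),
    ]


if __name__ == "__main__":
    cmd = build_ddp_command([1, 2], batch=24, workers=48, img=640,
                            data="./data/base.yaml", weights="./weight/yolov5s.pt")
    print(shlex.join(cmd))  # paste into the shell and add >>train_lj.log 2>&1
```

Deriving --nproc_per_node from the device list keeps the two values from drifting apart when the set of GPUs changes.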

2. Testing

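python detect.py -h
--weights: path to the pretrained model .pt file; default: weights/yolov5s.pt
--source: input source; can be an image, directory, video, webcam, or http/rtsp stream; default: inference/images
--output: path where detection results are written; default: inference/output
--img-size: inference image size (pixels); default: 640
--conf-thres: object confidence threshold; default: 0.4
--iou-thres: IOU threshold for NMS; default: 0.5
--fourcc: codec for the output video (must be supported by ffmpeg), e.g. H264; default: mp4v
--half: half precision FP16 inference; boolean switch, enabled by passing the flag
--device: cuda device, e.g. 0 or 0,1,2,3 or cpu; default: ''
--view-img: display results; boolean switch, enabled by passing the flag
--save-txt: save results to *.txt files
--classes: filter by class, CLASSES [CLASSES …]
--agnostic-nms: class-agnostic NMS
--augment: augmented inference
Option names and defaults vary between YOLOv5 versions; python detect.py -h prints the list for the installed one.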
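To make --conf-thres and --iou-thres more concrete, here is a minimal pure-Python sketch of what the two thresholds control: boxes below the confidence threshold are dropped, then greedy NMS removes boxes that overlap a higher-scoring box by more than the IOU threshold. It is a simplified, single-class illustration, not the code detect.py actually runs, and the names Box, iou and filter_detections are made up for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, confidence


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes: List[Box], conf_thres: float = 0.4,
                      iou_thres: float = 0.5) -> List[Box]:
    """Drop low-confidence boxes, then apply greedy NMS to the rest."""
    candidates = sorted((b for b in boxes if b[4] >= conf_thres),
                        key=lambda b: b[4], reverse=True)
    kept: List[Box] = []
    for box in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, k) <= iou_thres for k in kept):
            kept.append(box)
    return kept


if __name__ == "__main__":
    detections = [(0, 0, 10, 10, 0.9), (1, 1, 10, 10, 0.8),
                  (20, 20, 30, 30, 0.3), (50, 50, 60, 60, 0.6)]
    print(filter_detections(detections))                  # default thresholds
    print(filter_detections(detections, conf_thres=0.8))  # stricter confidence
```

Raising conf_thres removes weak detections; lowering iou_thres makes NMS more aggressive about merging overlapping boxes.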

Inference on a single image
python detect.py --source inference/1_input/1_img/bus.jpg --weights ./weights/yolov5s.pt --output inference/2_output/1_img/
Inference on a directory
python detect.py --source inference/1_input/2_imgs --weights ./weights/yolov5s.pt --output inference/2_output/2_imgs
Detections with low confidence can be filtered out with the --conf-thres parameter
python detect.py --source inference/1_input/2_imgs --weights ./weights/yolov5s.pt --output inference/2_output/2_imgs --conf-thres 0.8
Inference on a video
python detect.py --source test.mp4 --weights ./weights/yolov5s.pt --output test_result/3_video
# Example syntax
python detect.py --source ./file.jpg  # image
                          ./file.mp4  # video
                          ./dir  # directory
                          0  # webcam
      'rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa' # rtsp stream
      'http://112.50.243.8/PLTV/88888888/224/3221225900/1.m3u8'  # http stream
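The source types listed above can be told apart from the string alone. The following is a rough sketch of such a classification, under my own assumptions: the suffix sets and the function classify_source are illustrative and are not taken from detect.py, which may recognize more or fewer formats.

```python
from pathlib import Path

# Illustrative suffix lists; the real tool may accept a different set.
IMAGE_SUFFIXES = {".bmp", ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".webp"}
VIDEO_SUFFIXES = {".avi", ".mkv", ".mov", ".mp4", ".mpeg", ".mpg", ".wmv"}
STREAM_PREFIXES = ("rtsp://", "rtmp://", "http://", "https://")


def classify_source(source: str) -> str:
    """Return a coarse label for a --source value."""
    if source.isdigit():
        return "webcam"          # a bare number is a camera index
    if source.lower().startswith(STREAM_PREFIXES):
        return "stream"          # checked before suffixes, since URLs can end in .mp4
    suffix = Path(source).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return "image"
    if suffix in VIDEO_SUFFIXES:
        return "video"
    return "directory"           # anything else is treated as a folder of inputs


if __name__ == "__main__":
    for s in ["./file.jpg", "./file.mp4", "./dir", "0",
              "rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa"]:
        print(s, "->", classify_source(s))
```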

python detect.py --source ./test_video/1.mp4 --weights ./runs/train/exp21/weights/best.pt --img 960


python detect.py --weights ./runs/train/exp27/weights/best.pt --device 0 --source ./test_video/test_tailer01.mp4

Detection results are saved in: runs/detect/exp6
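Since each run writes to a new exp directory (exp6 here, exp21 and exp27 for the training runs above), it can be handy to look up the most recent one instead of remembering the number. A small sketch, assuming the directories sit directly under runs/detect and that modification time is a good enough proxy for "latest"; latest_run is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def latest_run(root: str = "runs/detect") -> Optional[Path]:
    """Return the most recently modified exp* directory under root, if any."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    runs = [p for p in root_path.glob("exp*") if p.is_dir()]
    # default=None covers the case where no exp* directory exists yet.
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(latest_run())                # latest detection output, or None
    print(latest_run("runs/train"))    # same lookup for training runs
```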
