This guide covers renting a GPU instance on the AutoDL platform and training YOLOv5 on it.
1. Registering on AutoDL
Website: https://www.autodl.com/home
Registration is straightforward; the screenshots below walk through it.
After renting an instance, you will see the panel shown below:
- JupyterLab: for uploading/downloading files and running training
- AutoPanel: for resource monitoring and training visualization
2. JupyterLab
2.1 File operations
- Compress the yolov5 source code before uploading. On Windows, install 7-Zip and choose the tar format when compressing; Ubuntu ships with tar, so a .tar archive extracts cleanly, whereas .rar, .zip, and similar formats may fail to extract with the default tools.
- Extraction command:

```shell
tar -xvf yolov5_7.0_Attention_Multiple.tar  # extract
```

Use the **-xvf** flags as shown; other invocations may report an error.
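The pack-and-unpack round trip can be rehearsed locally before uploading anything; the folder and file names below are placeholders, not part of the actual project:

```shell
# Pack a sample folder into a .tar archive (the same format 7-Zip
# produces on Windows), then extract it the way you would on the server.
mkdir -p yolov5_demo
echo "print('ok')" > yolov5_demo/train_stub.py
tar -cvf yolov5_demo.tar yolov5_demo   # -c create, -v verbose, -f archive file
rm -rf yolov5_demo                     # simulate a fresh server
tar -xvf yolov5_demo.tar               # -x extract
```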
2.2 Path issues
The paths in train.py:
```python
def parse_opt(known=False):
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='/root/yolov5Attention/yolov5s.pt', help='initial weights path')
    parser.add_argument('--cfg', type=str, default='/root/yolov5Attention/models/yolov5s.yaml', help='model.yaml path')
    parser.add_argument('--data', type=str, default='/root/yolov5Attention/datasets/Mydatasets.yaml', help='dataset.yaml path')
    parser.add_argument('--hyp', type=str, default=ROOT / 'data/hyps/hyp.scratch-low.yaml', help='hyperparameters path')
    parser.add_argument('--epochs', type=int, default=240, help='total training epochs')
    parser.add_argument('--batch-size', type=int, default=32, help='total batch size for all GPUs, -1 for autobatch')
    parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=640, help='train, val image size (pixels)')
    parser.add_argument('--rect', action='store_true', help='rectangular training')
    parser.add_argument('--resume', nargs='?', const=True, default=False, help='resume most recent training')
    parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
    parser.add_argument('--noval', action='store_true', help='only validate final epoch')
    parser.add_argument('--noautoanchor', action='store_true', help='disable AutoAnchor')
    parser.add_argument('--noplots', action='store_true', help='save no plot files')
    parser.add_argument('--evolve', type=int, nargs='?', const=300, help='evolve hyperparameters for x generations')
    parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
    parser.add_argument('--cache', type=str, nargs='?', const='ram', help='image --cache ram/disk')
    parser.add_argument('--image-weights', action='store_true', help='use weighted image selection for training')
    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
    parser.add_argument('--single-cls', action='store_true', help='train multi-class data as single-class')
    parser.add_argument('--optimizer', type=str, choices=['SGD', 'Adam', 'AdamW'], default='SGD', help='optimizer')
    parser.add_argument('--sync-bn', action='store_true', help='use SyncBatchNorm, only available in DDP mode')
    parser.add_argument('--workers', type=int, default=8, help='max dataloader workers (per RANK in DDP mode)')
    parser.add_argument('--project', default=ROOT / 'runs/train', help='save to project/name')
    parser.add_argument('--name', default='yolov5s', help='save to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--quad', action='store_true', help='quad dataloader')
    parser.add_argument('--cos-lr', action='store_true', help='cosine LR scheduler')
    parser.add_argument('--label-smoothing', type=float, default=0.0, help='Label smoothing epsilon')
    parser.add_argument('--patience', type=int, default=100, help='EarlyStopping patience (epochs without improvement)')
    parser.add_argument('--freeze', nargs='+', type=int, default=[0], help='Freeze layers: backbone=10, first3=0 1 2')
    parser.add_argument('--save-period', type=int, default=-1, help='Save checkpoint every x epochs (disabled if < 1)')
    parser.add_argument('--seed', type=int, default=0, help='Global training seed')
    parser.add_argument('--local_rank', type=int, default=-1, help='Automatic DDP Multi-GPU argument, do not modify')

    # Logger arguments
    parser.add_argument('--entity', default=None, help='Entity')
    parser.add_argument('--upload_dataset', nargs='?', const=True, default=False, help='Upload data, "val" option')
    parser.add_argument('--bbox_interval', type=int, default=-1, help='Set bounding-box image logging interval')
    parser.add_argument('--artifact_alias', type=str, default='latest', help='Version of dataset artifact to use')

    return parser.parse_known_args()[0] if known else parser.parse_args()
```
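The `known` switch in the return line is worth noting: with `known=True`, parse_opt uses `parse_known_args`, which keeps the recognized flags and silently ignores the rest instead of raising an error. A minimal standalone sketch of that behavior:

```python
import argparse

# Minimal reproduction of parse_opt's return line:
# parse_known_args() returns (namespace, leftover_args), so [0] keeps
# the recognized flags and unknown flags are simply set aside.
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='/root/yolov5Attention/yolov5s.pt')
parser.add_argument('--epochs', type=int, default=240)

opt, unknown = parser.parse_known_args(['--epochs', '10', '--unrecognized-flag', '1'])
print(opt.epochs)   # 10
print(unknown)      # ['--unrecognized-flag', '1']
```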
Use absolute paths: /root/xxxx
Paths in the dataset config file Mydatasets.yaml:
- ./yolo: the yolo folder under the current directory
- ../yolo: the yolo folder under the parent directory (one level up)
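As an illustration, a Mydatasets.yaml using absolute paths might look like the sketch below; the class count and names are placeholders, not taken from the actual project:

```yaml
# Hypothetical dataset config with an absolute root path (adjust to your data)
path: /root/yolov5Attention/datasets   # dataset root
train: images/train                    # resolved relative to 'path'
val: images/val
nc: 2                                  # number of classes (placeholder)
names: ['class0', 'class1']            # class names (placeholder)
```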
Mind the details above; once the paths are configured and the training parameters are set, training can begin.
2.3 Training
First, cd into the yolov5 folder, then set up a conda environment:

```shell
conda create --name myenv  # create the environment
conda activate myenv       # activate the environment
```

Install the training dependencies:

```shell
pip install -r requirements.txt
```

Note: if the requirements.txt generated on Windows pins a CPU-only build of torch, delete the related lines from the file, because the AutoDL image comes with a GPU build of torch preinstalled.
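One way to strip the CPU-only torch pins before installing is to filter them out with grep; the requirements file contents below are fabricated for the demo:

```shell
# Build a demo requirements file containing CPU-pinned torch lines, then
# filter them out so pip leaves the preinstalled GPU torch untouched.
printf 'numpy\ntorch==2.0.1+cpu\ntorchvision==0.15.2+cpu\nopencv-python\n' > requirements_demo.txt
grep -vE '^(torch|torchvision|torchaudio)' requirements_demo.txt > requirements_gpu.txt
cat requirements_gpu.txt   # numpy and opencv-python remain
# pip install -r requirements_gpu.txt
```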
To start training, run in the terminal:

```shell
python train.py
```
3. Visualization
3.1 GPU info
Open a second terminal.
To watch GPU utilization, run:

```shell
watch -n 4 nvidia-smi  # refresh every 4 s
```
3.2训练过程
首先 cd到yolov5根目录下
AutoPanel----->TensorBoard
新建终端3,运行指令:
ps -ef | grep tensorboard | awk '{print $2}' | xargs kill -9 # 先杀死之前的进程
tensorboard --port 6007 --logdir runs/train # 在开启新进程
刷新AutoPanel----->TensorBoard界面
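The kill-then-restart pipeline above can be rehearsed on a harmless throwaway process before pointing it at tensorboard; this variant matches the exact command name with `comm` instead of grepping, which avoids accidentally killing the grep in the pipeline:

```shell
# Disposable stand-in for tensorboard: start it, then remove it with the
# same ps | awk | xargs kill pattern used for the real process.
sleep 300 &
pid=$!
ps -eo pid,comm | awk '$2 == "sleep" {print $1}' | xargs -r kill -9
wait "$pid" 2>/dev/null || true          # reap the killed process
kill -0 "$pid" 2>/dev/null || echo "process gone"
```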