YOLOv8-OBB Rotated Bounding Boxes: Training, Testing, and Inference

nice-wyh

Last modified: 2024-12-16 16:57:20

First published: 2024-12-16 14:36:28

Article link: https://blog.csdn.net/qq_46454669/article/details/144507629

I. Code Preparation

Download the code from the official Ultralytics GitHub: https://github.com/ultralytics

The environment setup is the same as for YOLOv5.

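II. Converting the DOTA 1.0 Dataset

(1) The original annotation format is as follows: eight corner coordinates in pixels, followed by the class name and a difficulty flag.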

937.0 913.0 921.0 912.0 923.0 874.0 940.0 875.0 small-vehicle 0

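(2) The supported OBB dataset format specifies each bounding box by its four corner points, with coordinates normalized to between 0 and 1. Fields are separated by spaces in the label file:

class_index x1 y1 x2 y2 x3 y3 x4 y4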
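To make the relationship between the two formats concrete, here is a minimal sketch of what the conversion does to a single annotation line. It is not the Ultralytics implementation: the function name `dota_line_to_yolo_obb`, the one-entry mapping, and the 1024x1024 image size are assumptions chosen for illustration.

```python
def dota_line_to_yolo_obb(line, img_w, img_h, class_mapping):
    """Convert one DOTA annotation line to a YOLO OBB label line.

    DOTA line: x1 y1 x2 y2 x3 y3 x4 y4 class_name difficulty (pixel coordinates).
    YOLO OBB:  class_index x1 y1 x2 y2 x3 y3 x4 y4 (coordinates normalized to [0, 1]).
    """
    parts = line.split()
    if len(parts) < 9:
        raise ValueError(f"expected at least 9 fields, got {len(parts)}")
    coords = [float(v) for v in parts[:8]]
    class_index = class_mapping[parts[8]]
    # Even positions are x values (divide by width); odd positions are y values (divide by height).
    normalized = [c / img_w if i % 2 == 0 else c / img_h for i, c in enumerate(coords)]
    return " ".join([str(class_index)] + [f"{v:.6f}" for v in normalized])


# Hypothetical example: a 1024x1024 image and a one-entry class mapping.
example = "937.0 913.0 921.0 912.0 923.0 874.0 940.0 875.0 small-vehicle 0"
print(dota_line_to_yolo_obb(example, 1024, 1024, {"small-vehicle": 4}))
# → 4 0.915039 0.891602 0.899414 0.890625 0.901367 0.853516 0.917969 0.854492
```

The difficulty flag at the end of the DOTA line is simply dropped, since the OBB label format has no field for it.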

(3) Create a new file pre_data.py to convert the labels:

from ultralytics.data.converter import convert_dota_to_yolo_obb

convert_dota_to_yolo_obb('./datasets/DOTAv1')

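Note: if your images are JPG or another non-PNG format, remember to comment out the lines in convert_dota_to_yolo_obb that skip images whose suffix is not .png.

(4) Jump to the convert_dota_to_yolo_obb function (in ultralytics/data/converter.py) and modify class_mapping: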

class_mapping = {
    "plane": 0,
    "baseball-diamond": 1,
    "bridge": 2,
    "ground-track-field": 3,
    "small-vehicle": 4,
    "large-vehicle": 5,
    "ship": 6,
    "tennis-court": 7,
    "basketball-court": 8,
    "storage-tank": 9,
    "soccer-ball-field": 10,
    "roundabout": 11,
    "harbor": 12,
    "swimming-pool": 13,
    "helicopter": 14,
}

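(5) Under ultralytics-main, create a dataset folder with the following structure.

images/train and images/val hold the image tiles produced by splitting the DOTA dataset. labels/train_original and labels/val_original hold the original label files. labels/train and labels/val start out empty. Then run the code from step (3); when it finishes, the converted labels are saved in labels/train and labels/val, in the following format.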

4 0.915039 0.891602 0.899414 0.890625 0.901367 0.853516 0.917969 0.854492
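The folder layout described in step (5) can be created with a few lines of Python. This is only a sketch: the helper name `make_dota_layout` is hypothetical, and the root folder you pass in should be the same path you give to the converter.

```python
from pathlib import Path

# Sub-folders named in step (5), relative to the dataset root.
SUBDIRS = [
    "images/train",
    "images/val",
    "labels/train_original",
    "labels/val_original",
    "labels/train",
    "labels/val",
]


def make_dota_layout(root):
    """Create the dataset folder skeleton under `root` and return the created paths."""
    root = Path(root)
    created = []
    for sub in SUBDIRS:
        folder = root / sub
        folder.mkdir(parents=True, exist_ok=True)  # safe to call again on an existing tree
        created.append(folder)
    return created
```

For example, `make_dota_layout("./datasets/DOTAv1")` prepares the tree used in step (3); the images and original label files still have to be copied in by hand.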

III. Running the Code

(1) Download the pretrained weights (optional: they are also downloaded automatically the first time train.py runs).

https://docs.ultralytics.com/tasks/obb/

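(2) Build the dataset using the directory layout below. The test split may be empty, but images and labels must correspond one to one.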
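Since images and labels have to correspond, a quick check before training can save a failed run. The sketch below compares file stems in an image folder and a label folder; the function name `find_unmatched` and the list of image suffixes are assumptions, so adjust them to your data.

```python
from pathlib import Path

# Image extensions to consider (an assumption; extend as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def find_unmatched(images_dir, labels_dir):
    """Return (image stems without a label file, label stems without an image)."""
    image_stems = {
        p.stem for p in Path(images_dir).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    }
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

A call such as `find_unmatched("datasets/DOTAv1/images/train", "datasets/DOTAv1/labels/train")` should return two empty lists when every image has a label file and vice versa.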

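(3) Create a dataset config file, e.g. dota8-obb.yaml, and change the paths and class names to your own. The training and validation scripts in this post load it as datasets/DOTAv1.yaml, so either save it under that name or adjust the data argument accordingly.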

# Ultralytics YOLO 🚀, AGPL-3.0 license
# DOTA 1.0 dataset https://captain-whu.github.io/DOTA/index.html for object detection in aerial images by Wuhan University
# Documentation: https://docs.ultralytics.com/datasets/obb/dota-v2/
# Example usage: yolo train model=yolov8n-obb.pt data=DOTAv1.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── dota1  ← downloads here (2GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/DOTAv1 # dataset root dir
train: images/train # train images (relative to 'path') 1411 images
val: images/val # val images (relative to 'path') 458 images
# test: images/test # test images (optional) 937 images

# Classes for DOTA 1.0 (indices must match the class_mapping used during label conversion)
names:
  0: plane
  1: baseball diamond
  2: bridge
  3: ground track field
  4: small vehicle
  5: large vehicle
  6: ship
  7: tennis court
  8: basketball court
  9: storage tank
  10: soccer ball field
  11: roundabout
  12: harbor
  13: swimming pool
  14: helicopter
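The converter writes the indices from class_mapping into the label files, so the names table in the dataset YAML has to use exactly the same indices. One way to avoid a mismatch is to generate the table from the mapping. The sketch below is an illustration only: both helper names are hypothetical, and replacing hyphens with spaces is a cosmetic choice.

```python
def names_from_class_mapping(class_mapping):
    """Build the index -> name table for the dataset YAML from the converter's class_mapping."""
    indices = sorted(class_mapping.values())
    if indices != list(range(len(indices))):
        raise ValueError("class indices must be unique and run from 0 to N-1")
    ordered = sorted(class_mapping.items(), key=lambda item: item[1])
    return {idx: name.replace("-", " ") for name, idx in ordered}


def to_yaml_names_block(names):
    """Render the table as the `names:` section of a dataset YAML file."""
    lines = ["names:"]
    lines += [f"  {idx}: {name}" for idx, name in names.items()]
    return "\n".join(lines)


# Two-entry mapping used only to show the output shape.
print(to_yaml_names_block(names_from_class_mapping({"small-vehicle": 1, "plane": 0})))
```

Passing the full class_mapping from step (4) of section II yields a 15-entry block that can be pasted into the YAML.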

(4) Modify yolov8-obb.yaml; only nc needs to change.

yolov8-obb.yaml is located under yolo/ultralytics/cfg/models/v8. Set nc to your own number of classes.

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 Oriented Bounding Boxes (OBB) model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 15 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024] # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768] # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512] # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512] # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9

# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, C2f, [512]] # 12

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 3, C2f, [256]] # 15 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]] # cat head P4
  - [-1, 3, C2f, [512]] # 18 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]] # cat head P5
  - [-1, 3, C2f, [1024]] # 21 (P5/32-large)

  - [[15, 18, 21], 1, OBB, [nc, 1]] # OBB(P3, P4, P5)

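(5) Create a new train.py. I use the weights "yolov8n-obb.pt"; set the parameters as follows and run it. Note: if you use different weights, for example "yolov8s-obb.pt", just change the config file in the code below from yolov8n-obb.yaml to yolov8s-obb.yaml, and so on for the other scales.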

from ultralytics import YOLO

def main():
    # Build the model from the YAML config, then load the pretrained weights into it
    model = YOLO("yolov8n-obb.yaml").load("yolov8n-obb.pt")

    # Train the model
    results = model.train(data="datasets/DOTAv1.yaml", epochs=100, imgsz=640, task="obb", device=0, workers=4, batch=4)

if __name__ == '__main__':
    main()
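The rule from step (5), that the config name follows the weights name, can be written down as a tiny helper so the two never drift apart. This is a sketch; `config_for_weights` is a hypothetical name, and it assumes the weights file follows the yolov8{n,s,m,l,x}-obb.pt naming used in this post.

```python
from pathlib import Path


def config_for_weights(weights):
    """Return the model YAML name matching a weights file, e.g. yolov8s-obb.pt -> yolov8s-obb.yaml."""
    path = Path(weights)
    if path.suffix != ".pt":
        raise ValueError(f"expected a .pt weights file, got {weights!r}")
    # Only the file name matters; the scale letter in it selects the model scale.
    return path.with_suffix(".yaml").name
```

With it, the model line in train.py could read `YOLO(config_for_weights(w)).load(w)` for any `w` such as "yolov8s-obb.pt".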

IV. Validation

from ultralytics import YOLO
 
def main():
    model = YOLO(r'runs/obb/train/weights/best.pt')
    model.val(data='datasets/DOTAv1.yaml', imgsz=640, batch=4, workers=4)
 
    # If you have a test split, use the following line instead
    # model.val(data='datasets/DOTAv1.yaml',split='test', imgsz=640, batch=4, workers=4)
 
if __name__ == '__main__':
    main()

V. Inference

from ultralytics import YOLO
from PIL import Image

# Load a model
model = YOLO("runs/obb/train/weights/best.pt")  # best weights from the training run above

# Run batched inference on a list of images
results = model(["datasets/0496.png", "datasets/0497.png"])  # return a list of Results objects

# Process results list
for idx, result in enumerate(results):
    print(result)
    boxes = result.boxes  # Boxes object for axis-aligned detections (None for an OBB model)
    masks = result.masks  # Masks object for segmentation outputs (None for an OBB model)
    keypoints = result.keypoints  # Keypoints object for pose outputs (None for an OBB model)
    probs = result.probs  # Probs object for classification outputs (None for an OBB model)
    obb = result.obb  # Oriented boxes object: the output to use with an OBB model
    # result.show()  # display to screen
    im_bgr = result.plot(labels=False)  # BGR-order numpy array
    im_rgb = Image.fromarray(im_bgr[..., ::-1])  # RGB-order PIL image
    im_rgb.save("result{}.jpg".format(idx))

    # result.save(filename="result{}.jpg".format(idx))  # save to disk
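Besides saving the plotted image, it is often useful to dump the detections as text. The sketch below formats already-extracted values; in practice they would presumably come from `result.obb.xyxyxyxy.tolist()`, `result.obb.cls.tolist()`, `result.obb.conf.tolist()`, and `result.names`. The function name and the output line layout are assumptions for illustration, not a standard format.

```python
def format_obb_detections(polygons, class_ids, confidences, names):
    """Format detections as 'class_name conf x1 y1 x2 y2 x3 y3 x4 y4' lines.

    polygons:    list of detections, each a list of four [x, y] corner points (pixels)
    class_ids:   list of class indices (floats are accepted and cast to int)
    confidences: list of confidence scores
    names:       mapping from class index to class name
    """
    lines = []
    for poly, cls_id, conf in zip(polygons, class_ids, confidences):
        flat = [coord for point in poly for coord in point]  # 4 points -> 8 numbers
        coords = " ".join(f"{v:.1f}" for v in flat)
        lines.append(f"{names[int(cls_id)]} {conf:.2f} {coords}")
    return lines


# Made-up values, only to show the shape of the inputs and the output.
demo = format_obb_detections(
    [[[940, 875], [937, 913], [921, 912], [923, 874]]],
    [4.0],
    [0.87],
    {4: "small-vehicle"},
)
print(demo[0])  # → small-vehicle 0.87 940.0 875.0 937.0 913.0 921.0 912.0 923.0 874.0
```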