1、Label your own dataset with labelImg (videos must first be converted into individual frames; one conversion tool is Convert to JPG - Convert images, documents and videos to JPG (img2go.com)), marking the parts to be tracked. When labeling is finished, an XML file is generated for each image, i.e. VOC format.
2、Generate a gt.txt file for the whole dataset from the XML (VOC-format) files. Each line of gt.txt has the format:
<frame>,<id>,<bb_left>,<bb_top>,<bb_width>,<bb_height>,<conf>
Here <frame> is the frame in which the target appears and <id> is the ID of the tracklet the target belongs to. The next four values give the target's bounding box in 2D frame coordinates, expressed as the top-left corner plus the box width and height. <conf> marks whether the target should be considered (1) or ignored (0); in this dataset all annotated targets are considered, so the value is always 1.
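For illustration, a gt.txt for two objects tracked across two frames might look like this (the coordinates are made-up values, not from the rocket dataset):

```
1,1,604,336,53,41,1
1,2,120,250,40,38,1
2,1,610,338,53,41,1
2,2,118,252,40,38,1
```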
The code is as follows:
import os
import sys
import xml.etree.ElementTree as ET

if __name__ == "__main__":
    xmls_path = "./data/rocket1/xml_labels/rocket3-2"
    target_path = "./data/rocket1/xml_labels/"
    f = open(os.path.join(target_path, 'rocket3-2gt.txt'), 'w')
    i = 0
    path_list = os.listdir(xmls_path)
    # Sort by the frame number embedded in the file name
    # (assumes the digits start at index 10, e.g. "rocket3-2-<frame>.xml")
    path_list.sort(key=lambda x: int(x[10:-4]))
    for xmlFilePath in path_list:
        print(os.path.join(xmls_path, xmlFilePath))
        try:
            tree = ET.parse(os.path.join(xmls_path, xmlFilePath))
            # get the root node
            root = tree.getroot()
        except Exception:
            print("parse %s fail!" % xmlFilePath)
            sys.exit()
        i += 1
        for obj in root.iter('object'):
            name = obj.find('name')
            # Map class names to tracklet IDs; skip unrecognized classes
            if name.text == 'Top':
                item = 1
            elif name.text == 'flag':
                item = 2
            elif name.text == 'bottom':
                item = 3
            else:
                continue
            bndbox = obj.find('bndbox')
            # Read coordinates by tag name rather than relying on child order
            xmin = int(bndbox.find('xmin').text)
            ymin = int(bndbox.find('ymin').text)
            xmax = int(bndbox.find('xmax').text)
            ymax = int(bndbox.find('ymax').text)
            width = xmax - xmin
            height = ymax - ymin
            # <frame>,<id>,<bb_left>,<bb_top>,<bb_width>,<bb_height>,<conf>
            # bb_top is the top edge of the box, i.e. ymin
            line = ','.join(str(v) for v in (i, item, xmin, ymin, width, height, 1))
            f.write(line + '\n')
    f.close()
3、The dataset directory structure is as follows:
src
└── data
    └── rocket1
        └── images
            └── train
                └── rocket2-1
                    ├── gt
                    │   └── gt.txt   (move the gt.txt generated in step 2 here)
                    ├── img1         (the JPG files of the dataset)
                    └── seqinfo.ini
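For convenience, the layout above can be created with a short script (a minimal sketch; the make_sequence_dirs helper is illustrative, with the src root and sequence name taken from the example above):

```python
import os

def make_sequence_dirs(root, seq):
    """Create the gt/ and img1/ folders for one sequence."""
    seq_dir = os.path.join(root, 'data', 'rocket1', 'images', 'train', seq)
    for sub in ('gt', 'img1'):
        os.makedirs(os.path.join(seq_dir, sub), exist_ok=True)
    return seq_dir

if __name__ == '__main__':
    print(make_sequence_dirs('src', 'rocket2-1'))
```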
The content of seqinfo.ini is shown below; adjust name, seqLength, imWidth and imHeight for your own dataset. Note that gen_labels.py in step 4 searches for newline-separated keys, so each key must be on its own line:
[Sequence]
name=rocket2-1
imDir=img1
frameRate=30
seqLength=26
imWidth=1920
imHeight=1080
imExt=.jpg
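gen_labels.py reads the image size with raw string searches; as an alternative sketch, Python's standard configparser can parse the same file more robustly (the read_seqinfo helper is illustrative, not part of the FairMOT repository):

```python
import configparser

def read_seqinfo(path):
    """Return (imWidth, imHeight, seqLength) parsed from a seqinfo.ini file."""
    cfg = configparser.ConfigParser()
    cfg.read(path)
    seq = cfg['Sequence']  # key lookups are case-insensitive
    return int(seq['imWidth']), int(seq['imHeight']), int(seq['seqLength'])
```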
4、Run gen_labels.py to generate a label file for each image from gt.txt, i.e. the data format required for training.
The data format FairMOT expects for training:
<class> <id> <x_center/img_width> <y_center/img_height> <w/img_width> <h/img_height>
class: target class
id: target ID
x_center/img_width: normalized column coordinate of the box center
y_center/img_height: normalized row coordinate of the box center
w/img_width: normalized width
h/img_height: normalized height
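Converting one gt.txt row into a FairMOT label line is simple arithmetic; a minimal sketch (the box values and the 1920x1080 image size are illustrative):

```python
def gt_to_label(tid, x, y, w, h, img_w, img_h):
    """Convert a top-left box (x, y, w, h) into a normalized FairMOT label line."""
    cx = x + w / 2  # box center, column
    cy = y + h / 2  # box center, row
    return '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}'.format(
        tid, cx / img_w, cy / img_h, w / img_w, h / img_h)

# A 120x60 box at (600, 300) in a 1920x1080 frame:
print(gt_to_label(1, 600, 300, 120, 60, 1920, 1080))
# → 0 1 0.343750 0.305556 0.062500 0.055556
```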
The gen_labels.py code is as follows:
import os
import os.path as osp

import numpy as np

def mkdirs(d):
    if not osp.exists(d):
        os.makedirs(d)

seq_root = './data/rocket1/images/train'
label_root = './data/rocket1/labels_with_ids/train'
mkdirs(label_root)
seqs = ['rocket2-1', 'rocket3-1', 'rocket3-2']

tid_curr = 0
tid_last = -1
for seq in seqs:
    # Read the image size from seqinfo.ini (keys must be on separate lines)
    seq_info = open(osp.join(seq_root, seq, 'seqinfo.ini')).read()
    seq_width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find('\nimHeight')])
    seq_height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find('\nimExt')])

    gt_txt = osp.join(seq_root, seq, 'gt', 'gt.txt')
    gt = np.loadtxt(gt_txt, dtype=np.float64, delimiter=',')
    # Sort by tracklet ID, then by frame
    idx = np.lexsort(gt.T[:2, :])
    gt = gt[idx, :]

    seq_label_root = osp.join(label_root, seq, 'img1')
    mkdirs(seq_label_root)

    # The gt.txt written in step 2 has seven columns:
    # <frame>,<id>,<bb_left>,<bb_top>,<bb_width>,<bb_height>,<conf>
    for fid, tid, x, y, w, h, mark in gt:
        if mark == 0:
            continue
        fid = int(fid)
        tid = int(tid)
        # Assign globally unique tracklet IDs across all sequences
        if not tid == tid_last:
            tid_curr += 1
            tid_last = tid
        # Convert the top-left corner to the box center
        x += w / 2
        y += h / 2
        label_fpath = osp.join(seq_label_root, seq + '_' + '{:03d}.txt'.format(fid - 1))
        label_str = '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
            tid_curr, x / seq_width, y / seq_height, w / seq_width, h / seq_height)
        with open(label_fpath, 'a') as f:
            f.write(label_str)
5、After gen_labels.py has run, an img1 folder is created under src/data/rocket1/labels_with_ids/train/rocket2-1; img1 contains one txt label file for each image.
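Before training, it is worth sanity-checking the generated labels; a minimal sketch (the check_labels helper is illustrative, not part of FairMOT):

```python
import glob
import os

def check_labels(label_dir):
    """Return (file, line) pairs whose label lines do not have 6 fields
    with the four normalized box values inside [0, 1]."""
    bad = []
    for path in glob.glob(os.path.join(label_dir, '*.txt')):
        for line in open(path):
            parts = line.split()
            if len(parts) != 6 or not all(0.0 <= float(v) <= 1.0 for v in parts[2:]):
                bad.append((path, line.strip()))
    return bad
```

Run it on src/data/rocket1/labels_with_ids/train/rocket2-1/img1; an empty result means every line is well-formed.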
6、Train:
python train.py mot --exp_id rocket1 --gpus 0 --batch_size 6 --load_model ''
7、Test with the model_last.pth produced by training; a video file can be fed in directly:
python demo.py mot --load_model ../models/model_last.pth --conf_thres 0.4