An Incomplete Guide to Training CornerNet

Introduction

CornerNet is a one-stage object detection paper from ECCV 2018. It reaches 42.1 AP on COCO, which was state of the art among one-stage detectors at the time. The paper has two eye-catching ideas:

  • It abandons conventional anchors and localizes objects by estimating the top-left and bottom-right corner points.

  • A brand-new Corner Pooling operation.
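
As described in the paper, top-left corner pooling takes two feature maps: for each position it max-pools one map over everything to the right, max-pools the other over everything below, and sums the two results. A minimal NumPy sketch of the idea (my own illustration, not the official C++ kernel):

```python
import numpy as np

def top_left_corner_pool(f_h, f_w):
    """Top-left corner pooling over two (H, W) feature maps."""
    # for every position: max over itself and everything to its right
    right_max = np.maximum.accumulate(f_h[:, ::-1], axis=1)[:, ::-1]
    # for every position: max over itself and everything below it
    bottom_max = np.maximum.accumulate(f_w[::-1, :], axis=0)[::-1, :]
    return right_max + bottom_max
```

Bottom-right corner pooling is the mirror image: scan left-to-right and top-to-bottom instead.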

The network structure is as follows:

From left to right: the Hourglass backbone feeds two corner prediction branches (top-left and bottom-right). After Corner Pooling, each branch produces:

Branch 1: Heatmaps, predicting the coordinates of all corner points in the image.

Branch 2: Embeddings, used to pair top-left corners with bottom-right corners.

Branch 3: Offsets, compensating for the coordinate shift caused by the size change as the image passes through the network.
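
The offset target is the fractional part of a coordinate that gets lost when the input location is mapped onto the downsampled heatmap. A small sketch (the stride value of 4 is my assumption for illustration; CornerNet's heatmaps are downsampled relative to the input):

```python
import math

def corner_offset(x, y, stride=4):
    # fractional part lost when mapping (x, y) onto the downsampled heatmap;
    # this is what the Offsets branch learns to regress
    return (x / stride - math.floor(x / stride),
            y / stride - math.floor(y / stride))
```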

If you are interested in how CornerNet works, see this blog post or read the paper:

https://blog.csdn.net/SIGAI_CSDN/article/details/87858196

Paper: https://arxiv.org/abs/1808.01244

This post focuses on how to train CornerNet on your own data.

Official code: https://github.com/princeton-vl/CornerNet

Installation

Download the code

git clone https://github.com/princeton-vl/CornerNet.git

Create a virtual environment

conda create --name CornerNet --file conda_packagelist.txt

When creating the virtual environment, some packages may download very slowly; I suggest using a proxy, or downloading them first and installing locally.

I have packaged all the dependency files of the virtual environment; just replace the paths in conda_packagelist.txt with the paths to your local files.

Link: https://pan.baidu.com/s/1wi4Ac-90jjakANO4v9f2RQ
Extraction code: 5cu0

Activate the CornerNet virtual environment:

source activate CornerNet

The Corner Pooling layers are implemented in C++ by the authors and need to be compiled separately:

cd <CornerNet dir>/models/py_utils/_cpools/
python setup.py install --user

Compile the NMS code:

cd <CornerNet dir>/external
make

If you hit the error No module named 'Cython', install it with conda install Cython.

Install the COCO API:

mkdir <CornerNet dir>/data
cd <CornerNet dir>/data
git clone git@github.com:cocodataset/cocoapi.git coco
cd <CornerNet dir>/data/coco/PythonAPI
make

If the URL git@github.com:cocodataset/cocoapi.git gives you trouble, you can also download the COCO API from:

https://github.com/cocodataset/cocoapi.git

Note: after downloading it into the data folder, rename cocoapi to coco.

Training on Your Own Data

Training on VOC-format data

By default, CornerNet loads COCO-style JSON annotation files through the COCO API. To train on VOC-format data there are two options: (1) modify the data-loading code to parse XML annotations directly, or (2) convert the XML annotations into COCO-style JSON. This post takes the second approach.

First, make sure image file names and annotation file names match. Also, COCO orders image names by image id, so the names should contain only digits. I recommend renaming everything consistently, for example with the following code:

# coding: utf-8
import os

def renameData(xmlDir, imgDir):
    xmlFiles = os.listdir(xmlDir)
    total = len(xmlFiles)
    cur = 0
    for xml in xmlFiles:
        cur += 1
        if cur % 500 == 1:
            print("Total/cur:", total, "/", cur)
        imgPath = os.path.join(imgDir, xml[:-4] + ".jpg")

        # rename to a purely numeric, zero-padded name, e.g. 00000001
        outName = "%08d" % cur

        outXMLPath = os.path.join(xmlDir, outName + ".xml")
        outImgPath = os.path.join(imgDir, outName + ".jpg")

        os.rename(os.path.join(xmlDir, xml), outXMLPath)
        os.rename(imgPath, outImgPath)

    print("picked number:", cur)


if __name__ == '__main__':
    # change these to your actual annotation and image directories
    xmlDir = "/home/yzy/yzy/ImageData/test/Annotations/"
    imgDir = "/home/yzy/yzy/ImageData/test/JPEGImages/"

    print(xmlDir)
    print(imgDir)

    renameData(xmlDir, imgDir)

Then convert the VOC XML annotations to COCO JSON with the following code:

# coding: utf-8

# pip install lxml

import os
import json
import xml.etree.ElementTree as ET


START_BOUNDING_BOX_ID = 1

# This dict holds the categories you actually detect; change it for your data.
# Here my own two-class dataset (person and hat) is used as an example; for
# the VOC dataset it would be the 20 VOC classes.
# The category names must match the object names in the xml annotations.
PRE_DEFINE_CATEGORIES = {"person": 0, "hat": 1}

def get(root, name):
    return root.findall(name)


def get_and_check(root, name, length):
    vars = root.findall(name)
    if len(vars) == 0:
        raise NotImplementedError('Can not find %s in %s.' % (name, root.tag))
    if length > 0 and len(vars) != length:
        raise NotImplementedError('The size of %s is supposed to be %d, but is %d.' % (name, length, len(vars)))
    if length == 1:
        vars = vars[0]
    return vars


def get_filename_as_int(filename):
    try:
        filename = os.path.splitext(filename)[0]
        return int(filename)
    except ValueError:
        raise NotImplementedError('Filename %s is supposed to be an integer.' % (filename))


def convert(xml_dir, json_file):
    xmlFiles = os.listdir(xml_dir)

    json_dict = {"images": [], "type": "instances", "annotations": [],
                 "categories": []}
    categories = PRE_DEFINE_CATEGORIES
    bnd_id = START_BOUNDING_BOX_ID
    num = 0
    for line in xmlFiles:
        num += 1
        if num % 50 == 0:
            print("processing ", num, "; file ", line)

        xml_f = os.path.join(xml_dir, line)
        tree = ET.parse(xml_f)
        root = tree.getroot()
        ## The filename must be a number
        filename = line[:-4]
        image_id = get_filename_as_int(filename)
        size = get_and_check(root, 'size', 1)
        width = int(get_and_check(size, 'width', 1).text)
        height = int(get_and_check(size, 'height', 1).text)
        image = {'file_name': filename + '.jpg', 'height': height,
                 'width': width, 'id': image_id}
        json_dict['images'].append(image)
        ## Currently we do not support segmentation
        #  segmented = get_and_check(root, 'segmented', 1).text
        #  assert segmented == '0'
        for obj in get(root, 'object'):
            category = get_and_check(obj, 'name', 1).text
            if category not in categories:
                new_id = len(categories)
                categories[category] = new_id
            category_id = categories[category]
            bndbox = get_and_check(obj, 'bndbox', 1)
            xmin = int(get_and_check(bndbox, 'xmin', 1).text) - 1
            ymin = int(get_and_check(bndbox, 'ymin', 1).text) - 1
            xmax = int(get_and_check(bndbox, 'xmax', 1).text)
            ymax = int(get_and_check(bndbox, 'ymax', 1).text)
            assert xmax > xmin
            assert ymax > ymin
            o_width = abs(xmax - xmin)
            o_height = abs(ymax - ymin)
            ann = {'area': o_width * o_height, 'iscrowd': 0, 'image_id':
                   image_id, 'bbox': [xmin, ymin, o_width, o_height],
                   'category_id': category_id, 'id': bnd_id, 'ignore': 0,
                   'segmentation': []}
            json_dict['annotations'].append(ann)
            bnd_id = bnd_id + 1

    for cate, cid in categories.items():
        cat = {'supercategory': 'none', 'id': cid, 'name': cate}
        json_dict['categories'].append(cat)
    with open(json_file, 'w') as json_fp:
        json_fp.write(json.dumps(json_dict))


'''
Before generating the COCO-format annotation files:
1. run renameData.py to give the xml and jpg files consistent numeric names;
2.
3. run the splitData method to split the data into train/val/test sets.
'''
if __name__ == '__main__':
    folder_list = ["train", "val", "test"]
    # change base_dir to your actual local image/annotation path
    base_dir = "/home/yzy/yzy/ImageData/hat/"
    for folderName in folder_list:
        xml_dir = base_dir + folderName + "/Annotations/"
        json_dir = base_dir + folderName + "/instances_" + folderName + ".json"

        print("deal: ", folderName)
        print("xml dir: ", xml_dir)
        print("json file: ", json_dir)

        convert(xml_dir, json_dir)

After conversion the directory structure should look like the tree below. hat is the local directory holding the images and annotations; annotations contains the annotation files for the training, test, and validation sets, and images contains the training images.

.
└── hat
    ├── annotations
    │   ├── instances_test.json
    │   ├── instances_train.json
    │   └── instances_val.json
    └── images

Note: once the conversion is done, move the image/annotation folder hat into CornerNet/data.

Go into the CornerNet/db directory and build your own data-loading interface modeled on coco.py, saving it under db. Here is hat.py for the two-class person/hat task as an example:

import sys
sys.path.insert(0, "data/coco/PythonAPI/")

import os
import json
import numpy as np
import pickle

from tqdm import tqdm
from db.detection import DETECTION
from config import system_configs
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# the class name can be chosen to match your own dataset
class HAT(DETECTION):
    def __init__(self, db_config, split):
        super(HAT, self).__init__(db_config)
        data_dir   = system_configs.data_dir
        result_dir = system_configs.result_dir
        cache_dir  = system_configs.cache_dir

        self._split = split
        # The values for trainval, minival and testdev depend on the names of
        # the annotation json files: e.g. if the training set is named
        # instances_train.json, the value for trainval is "train".
        # self._dataset = {
        #     "trainval": "trainval2014",
        #     "minival": "minival2014",
        #     "testdev": "testdev2017"
        # }[self._split]
        self._dataset = {
            "trainval": "train",
            "minival": "val",
            "testdev": "test"
        }[self._split]
        
        # change "hat" here to match the name of your data interface file;
        # my interface is hat.py, so it becomes "hat"
        #self._coco_dir = os.path.join(data_dir, "coco")
        self._coco_dir = os.path.join(data_dir, "hat")

        self._label_dir  = os.path.join(self._coco_dir, "annotations")
        self._label_file = os.path.join(self._label_dir, "instances_{}.json")
        self._label_file = self._label_file.format(self._dataset)

        self._image_dir  = os.path.join(self._coco_dir, "images", self._dataset)
        self._image_file = os.path.join(self._image_dir, "{}")
        # likewise, change this to match the name of the interface file
        #self._data = "coco"
        self._data = "hat"
        self._mean = np.array([0.40789654, 0.44719302, 0.47026115], dtype=np.float32)
        self._std  = np.array([0.28863828, 0.27408164, 0.27809835], dtype=np.float32)
        self._eig_val = np.array([0.2141788, 0.01817699, 0.00341571], dtype=np.float32)
        self._eig_vec = np.array([
            [-0.58752847, -0.69563484, 0.41340352],
            [-0.5832747, 0.00994535, -0.81221408],
            [-0.56089297, 0.71832671, 0.41158938]
        ], dtype=np.float32)
        
        # change the category ids
        # self._cat_ids = [
        #     1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 
        #     14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 
        #     24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 
        #     37, 38, 39, 40, 41, 42, 43, 44, 46, 47, 
        #     48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 
        #     58, 59, 60, 61, 62, 63, 64, 65, 67, 70, 
        #     72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 
        #     82, 84, 85, 86, 87, 88, 89, 90
        # ]
        self._cat_ids = [0,1]
        self._classes = {
            ind + 1: cat_id for ind, cat_id in enumerate(self._cat_ids)
        }
        self._coco_to_class_map = {
            value: key for key, value in self._classes.items()
        }

        # change the name of the cache file
        #self._cache_file =os.path.join(cache_dir,"coco_{}.pkl".format(self._dataset))
        self._cache_file = os.path.join(cache_dir, "hat_{}.pkl".format(self._dataset))
        self._load_data()
        self._db_inds = np.arange(len(self._image_ids))

        self._load_coco_data() 

    def _load_data(self):
        print("loading from cache file: {}".format(self._cache_file))
        if not os.path.exists(self._cache_file):
            print("No cache file found...")
            self._extract_data()
            with open(self._cache_file, "wb") as f:
                pickle.dump([self._detections, self._image_ids], f)
        else:
            with open(self._cache_file, "rb") as f:
                self._detections, self._image_ids = pickle.load(f)

    def _load_coco_data(self):
        self._coco = COCO(self._label_file)
        with open(self._label_file, "r") as f:
            data = json.load(f)

        coco_ids = self._coco.getImgIds()
        eval_ids = {
            self._coco.loadImgs(coco_id)[0]["file_name"]: coco_id
            for coco_id in coco_ids
        }

        self._coco_categories = data["categories"]
        self._coco_eval_ids   = eval_ids

    def class_name(self, cid):
        cat_id = self._classes[cid]
        cat    = self._coco.loadCats([cat_id])[0]
        return cat["name"]

    def _extract_data(self):
        self._coco    = COCO(self._label_file)
        self._cat_ids = self._coco.getCatIds()

        coco_image_ids = self._coco.getImgIds()

        self._image_ids = [
            self._coco.loadImgs(img_id)[0]["file_name"] 
            for img_id in coco_image_ids
        ]
        self._detections = {}
        for ind, (coco_image_id, image_id) in enumerate(tqdm(zip(coco_image_ids, self._image_ids))):
            image      = self._coco.loadImgs(coco_image_id)[0]
            bboxes     = []
            categories = []

            for cat_id in self._cat_ids:
                annotation_ids = self._coco.getAnnIds(imgIds=image["id"], catIds=cat_id)
                annotations    = self._coco.loadAnns(annotation_ids)
                category       = self._coco_to_class_map[cat_id]
                for annotation in annotations:
                    bbox = np.array(annotation["bbox"])
                    bbox[[2, 3]] += bbox[[0, 1]]
                    bboxes.append(bbox)

                    categories.append(category)

            bboxes     = np.array(bboxes, dtype=float)
            categories = np.array(categories, dtype=float)
            if bboxes.size == 0 or categories.size == 0:
                self._detections[image_id] = np.zeros((0, 5), dtype=np.float32)
            else:
                self._detections[image_id] = np.hstack((bboxes, categories[:, None]))

    def detections(self, ind):
        image_id = self._image_ids[ind]
        detections = self._detections[image_id]

        return detections.astype(float).copy()

    def _to_float(self, x):
        return float("{:.2f}".format(x))

    def convert_to_coco(self, all_bboxes):
        detections = []
        for image_id in all_bboxes:
            coco_id = self._coco_eval_ids[image_id]
            for cls_ind in all_bboxes[image_id]:
                category_id = self._classes[cls_ind]
                for bbox in all_bboxes[image_id][cls_ind]:
                    bbox[2] -= bbox[0]
                    bbox[3] -= bbox[1]

                    score = bbox[4]
                    bbox  = list(map(self._to_float, bbox[0:4]))

                    detection = {
                        "image_id": coco_id,
                        "category_id": category_id,
                        "bbox": bbox,
                        "score": float("{:.2f}".format(score))
                    }

                    detections.append(detection)
        return detections

    def evaluate(self, result_json, cls_ids, image_ids, gt_json=None):
        if self._split == "testdev":
            return None

        coco = self._coco if gt_json is None else COCO(gt_json)

        eval_ids = [self._coco_eval_ids[image_id] for image_id in image_ids]
        cat_ids  = [self._classes[cls_id] for cls_id in cls_ids]

        coco_dets = coco.loadRes(result_json)
        coco_eval = COCOeval(coco, coco_dets, "bbox")
        coco_eval.params.imgIds = eval_ids
        coco_eval.params.catIds = cat_ids
        coco_eval.evaluate()
        coco_eval.accumulate()
        coco_eval.summarize()
        return coco_eval.stats[0], coco_eval.stats[12:]

Modify datasets.py under CornerNet/db to register the data interface class just defined in hat.py:

from db.coco import MSCOCO
from db.hat import HAT

datasets = {
    "MSCOCO": MSCOCO,
    "HAT": HAT
}

Modify the configuration file CornerNet.json under CornerNet/config as follows:

"dataset": "HAT"            #dataset的值修改为数据接口hat.py中定义的类的名字
"batch_size": 16             #batch_size根据自己电脑配置修改
"chunk_sizes": [4,4,4,4]    #[4,4,4,4]代表使用4个gpu,每个gpu每个batch处理4张图片,4*4=16
"learning_rate": 0.00025    #学习率可能需要修改,还在摸索中
"max_iter": 500000          #最大iteration次数根据自己的实际情况来设置
"categories": 2             #类别数目根据自己数据集修改

 

Modify line 8 of db/detection.py, self._configs["categories"] = 2, setting the number of categories for your dataset.

Copy coco.py under CornerNet/sample, rename the copy hat.py to match the data interface, and put it under sample (the code under sample mainly handles data augmentation).

Train the model:

Single-GPU training:

python train.py CornerNet

Multi-GPU training:

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py CornerNet

I could not find where the official code lets you specify GPU ids for multi-GPU training, so I restrict the set of visible GPUs instead.

Note that the number of training GPUs must match batch_size and chunk_sizes in CornerNet.json.

In my tests, a single 1080 Ti supports at most batch_size = 4.

The training output looks like this:

---------- (to be filled in) ----------

Q1: How well does CornerNet work on my own dataset?

It is still training; see you with results in about a week _(:з」∠)_

Q2: Should the normalization mean and variance be computed from the actual dataset?

These are the _mean and _std parameters in CornerNet/db/coco.py. I think they should be adjusted for your own dataset, but so far I have normalized with the default mean and std; I will test later whether they need changing.

Q3: How fast is it?

I have not measured it myself; a colleague says he got about 1 FPS on a 1080 Ti... I will test it.

Q4: How should the hyperparameters be tuned?

No idea yet; each training run takes a long time. I will experiment... o( ̄ヘ ̄o#)
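
If you do want to compute them from your own data, the per-channel mean and std over all pixels can be estimated like this (a sketch; in practice you would load each image, e.g. with cv2.imread, and scale it to [0, 1] first):

```python
import numpy as np

def channel_mean_std(images):
    # images: iterable of (H, W, 3) float arrays scaled to [0, 1]
    pixels = np.concatenate([img.reshape(-1, 3) for img in images], axis=0)
    return pixels.mean(axis=0), pixels.std(axis=0)
```

The resulting two 3-vectors would replace _mean and _std in the data interface.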
