Fast R-CNN: Sharing the Code of the Data Preparation Stage

MultiMediaGroup_USTC

Article link: https://blog.csdn.net/u013854886/article/details/53433110

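This article walks through the data preparation stage of Fast R-CNN: the data flow, the initial data, the overall pipeline, and the concrete implementation steps. Starting from the initial image names, boxes, and overlaps, it covers data augmentation and the computation of bounding-box regression targets, explains how the roidb dataset is generated, and describes the parameter setup before training.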

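1. Introduction to the R-CNN family

The R-CNN family is a set of classic algorithms in object detection. From R-CNN to Fast R-CNN to Faster R-CNN, Ross Girshick made major contributions to all three papers. Many blog posts explain the ideas behind these detection algorithms clearly (I recommend "cs231n study notes: CNN object detection, localization, and segmentation"), but walkthroughs of the code in the Fast R-CNN or Faster R-CNN projects are hard to find.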

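Step 1: Data preparation stage (roidb)

Overview of the overall flow

We start with the overall framework of the data preparation stage. Once the big picture is clear, reading the code that implements each function becomes much easier: it helps to sort out the logic quickly and avoids getting lost in tedious code details.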

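Initial data

The initial data consist of image_index, groundtruth_annotation, and selective_search_box:
1. image_index is the list of image names
2. gt_annotation (short for groundtruth_annotation) holds the hand-labelled box positions; each box is a 4-tuple
3. selective_search_box holds the proposal boxes, computed offline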
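To make the three inputs concrete, here is a minimal sketch of what they might look like for two images. The image names, the coordinates, and the (x1, y1, x2, y2) corner layout are illustrative assumptions; the text above only says that each box is a 4-tuple. `check_boxes` is a hypothetical helper, not part of the Fast R-CNN code.

```python
import numpy as np

# Illustrative values only (assumed, not taken from any real dataset).
image_index = ['img_0001', 'img_0002']        # image names, no file extension

# Hand-labelled boxes: one (K, 4) array per image, rows are (x1, y1, x2, y2).
gt_annotation = {
    'img_0001': np.array([[48, 240, 195, 371], [8, 12, 352, 498]]),
    'img_0002': np.array([[139, 200, 207, 301]]),
}

# Proposals computed offline by selective search: one (N, 4) array per image.
selective_search_box = {
    'img_0001': np.array([[40, 230, 200, 380], [0, 0, 100, 100], [5, 10, 350, 499]]),
    'img_0002': np.array([[130, 190, 210, 310], [0, 0, 50, 60]]),
}


def check_boxes(boxes):
    """Return True if `boxes` is an (N, 4) array with x1 <= x2 and y1 <= y2."""
    boxes = np.asarray(boxes)
    if boxes.ndim != 2 or boxes.shape[1] != 4:
        return False
    return bool(np.all(boxes[:, 0] <= boxes[:, 2]) and
                np.all(boxes[:, 1] <= boxes[:, 3]))
```

Running a check like this on every image before building the roidb is a cheap way to catch malformed annotations early.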

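Pipeline outline

Read the gt_annotation of each image and store the key information, [box_location, gt_class, gt_overlap …], in the roidb.
Read the selective_box of each image and store the key information, [box_location, gt_class, gt_overlap …], in the roidb.
Flip the images horizontally as data augmentation, and store each image path so the image can be loaded during training.
Compute the regression targets of every box, then normalize the box information on a per-class basis.
Initialize the network and feed the roidb into its first layer.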
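The first four steps can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the repository's actual code: boxes are (x1, y1, x2, y2) with inclusive pixel coordinates, class 0 is background, a box receives regression targets when its best overlap is at least 0.5, and all function names (`iou_matrix`, `make_entry`, `flip_entry`, `add_bbox_targets`, `normalize_targets`) are hypothetical.

```python
import numpy as np


def iou_matrix(boxes, gt_boxes):
    """Pairwise IoU between (N, 4) boxes and (K, 4) ground-truth boxes."""
    b, g = boxes[:, None, :], gt_boxes[None, :, :]
    iw = np.minimum(b[..., 2], g[..., 2]) - np.maximum(b[..., 0], g[..., 0]) + 1
    ih = np.minimum(b[..., 3], g[..., 3]) - np.maximum(b[..., 1], g[..., 1]) + 1
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
    area_b = (b[..., 2] - b[..., 0] + 1) * (b[..., 3] - b[..., 1] + 1)
    area_g = (g[..., 2] - g[..., 0] + 1) * (g[..., 3] - g[..., 1] + 1)
    return inter / (area_b + area_g - inter)


def make_entry(gt_boxes, gt_classes, ss_boxes, num_classes):
    """Steps 1-2: stack gt boxes and proposals, record per-class overlap."""
    gt_boxes = np.asarray(gt_boxes, dtype=np.float64)
    gt_classes = np.asarray(gt_classes)
    boxes = np.vstack([gt_boxes, np.asarray(ss_boxes, dtype=np.float64)])
    iou = iou_matrix(boxes, gt_boxes)
    best_gt = iou.argmax(axis=1)              # closest gt box for each row
    gt_overlaps = np.zeros((len(boxes), num_classes))
    gt_overlaps[np.arange(len(boxes)), gt_classes[best_gt]] = iou.max(axis=1)
    return {'boxes': boxes, 'gt_overlaps': gt_overlaps, 'best_gt': best_gt,
            'num_gt': len(gt_boxes), 'flipped': False}


def flip_entry(entry, width):
    """Step 3: mirror boxes horizontally for an image `width` pixels wide."""
    boxes = entry['boxes'].copy()
    boxes[:, 0] = width - entry['boxes'][:, 2] - 1
    boxes[:, 2] = width - entry['boxes'][:, 0] - 1
    return dict(entry, boxes=boxes, flipped=True)


def add_bbox_targets(entry, thresh=0.5):
    """Step 4a: store rows of (class, dx, dy, dw, dh); zeros below `thresh`."""
    boxes, overlaps = entry['boxes'], entry['gt_overlaps']
    keep = np.where(overlaps.max(axis=1) >= thresh)[0]
    ex = boxes[keep]
    gt = boxes[:entry['num_gt']][entry['best_gt'][keep]]
    ex_w, ex_h = ex[:, 2] - ex[:, 0] + 1.0, ex[:, 3] - ex[:, 1] + 1.0
    gt_w, gt_h = gt[:, 2] - gt[:, 0] + 1.0, gt[:, 3] - gt[:, 1] + 1.0
    targets = np.zeros((len(boxes), 5))
    targets[keep, 0] = overlaps.argmax(axis=1)[keep]
    targets[keep, 1] = (gt[:, 0] + 0.5 * gt_w - ex[:, 0] - 0.5 * ex_w) / ex_w
    targets[keep, 2] = (gt[:, 1] + 0.5 * gt_h - ex[:, 1] - 0.5 * ex_h) / ex_h
    targets[keep, 3] = np.log(gt_w / ex_w)
    targets[keep, 4] = np.log(gt_h / ex_h)
    entry['bbox_targets'] = targets


def normalize_targets(roidb, num_classes):
    """Step 4b: subtract the per-class mean and divide by the per-class std."""
    means, stds = np.zeros((num_classes, 4)), np.ones((num_classes, 4))
    for c in range(1, num_classes):           # class 0 = background (assumed)
        rows = np.vstack([e['bbox_targets'][e['bbox_targets'][:, 0] == c, 1:]
                          for e in roidb])
        if len(rows) == 0:
            continue
        means[c] = rows.mean(axis=0)
        stds[c] = np.maximum(rows.std(axis=0), 1e-6)   # avoid division by zero
        for e in roidb:
            mask = e['bbox_targets'][:, 0] == c
            e['bbox_targets'][mask, 1:] = (e['bbox_targets'][mask, 1:] - means[c]) / stds[c]
    return means, stds
```

A typical call sequence would be `make_entry` for each image, `flip_entry` to double the list, `add_bbox_targets` on every entry, and finally `normalize_targets` over the whole list. The returned means and standard deviations need to be kept so that predictions can be un-normalized at test time. Step 5, feeding the roidb to the network, is handled by the training code.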

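Detailed walkthrough

The script that launches training is ./tools/train_net.py. Following the order of the code, the key steps of its data preparation are explained below.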

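### ./tools/train_net.py
# Imports used by this excerpt; module paths follow the Fast R-CNN repository
# layout. parse_args() is defined earlier in the same script.
import _init_paths
from fast_rcnn.train import get_training_roidb, train_net
from fast_rcnn.config import cfg, cfg_from_file, cfg_from_list, get_output_dir
from datasets.factory import get_imdb
import caffe
import pprint
import numpy as np

if __name__ == '__main__':
    args = parse_args()  # 1. parse the command-line arguments

    print('Called with args:')
    print(args)

    # optional config overrides: first from a file, then from key/value pairs
    if args.cfg_file is not None:
        cfg_from_file(args.cfg_file)
    if args.set_cfgs is not None:
        cfg_from_list(args.set_cfgs)

    print('Using config:')
    pprint.pprint(cfg)

    if not args.randomize:
        # fix the random seeds (numpy and caffe) for reproducibility
        np.random.seed(cfg.RNG_SEED)
        caffe.set_random_seed(cfg.RNG_SEED)

    # set up caffe
    caffe.set_mode_gpu()
    if args.gpu_id is not None:
        caffe.set_device(args.gpu_id)

    imdb = get_imdb(args.imdb_name)  # 2. build the roidb dataset
    print('Loaded dataset `{:s}` for training'.format(imdb.name))
    roidb = get_training_roidb(imdb)  # 3. add the information the roidb needs for training

    output_dir = get_output_dir(imdb, None)
    print('Output will be saved to `{:s}`'.format(output_dir))

    train_net(args.solver, roidb, output_dir,  # 4. set the parameters and train
              pretrained_model=args.pretrained_model,
              max_iters=args.max_iters)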
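The script above relies on a `parse_args()` helper that is not shown here. A minimal sketch of what it could look like is given below, built with the standard `argparse` module so that it produces the attribute names used above (`cfg_file`, `set_cfgs`, `randomize`, `gpu_id`, `imdb_name`, `solver`, `pretrained_model`, `max_iters`). The flag spellings and default values are assumptions for illustration and may differ from the real script.

```python
import argparse


def parse_args(argv=None):
    """Parse training options; flag names and defaults are assumed."""
    parser = argparse.ArgumentParser(description='Train a Fast R-CNN network')
    parser.add_argument('--gpu', dest='gpu_id', type=int, default=0,
                        help='GPU device id to use')
    parser.add_argument('--solver', dest='solver', type=str, default=None,
                        help='path to the solver prototxt')
    parser.add_argument('--iters', dest='max_iters', type=int, default=40000,
                        help='number of training iterations')
    parser.add_argument('--weights', dest='pretrained_model', type=str,
                        default=None, help='pretrained model weights')
    parser.add_argument('--cfg', dest='cfg_file', type=str, default=None,
                        help='optional config file')
    parser.add_argument('--imdb', dest='imdb_name', type=str,
                        default='voc_2007_trainval', help='dataset to train on')
    parser.add_argument('--rand', dest='randomize', action='store_true',
                        help='do not fix the random seeds')
    parser.add_argument('--set', dest='set_cfgs', nargs='*', default=None,
                        help='config overrides given as KEY VALUE pairs')
    # argv=None makes argparse read sys.argv[1:], as a script normally would
    return parser.parse_args(argv)
```

Accepting an explicit `argv` list keeps the function easy to exercise without touching the real command line, for example `parse_args(['--gpu', '1', '--set', 'TRAIN.SCALES', '[600]'])`.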