An Introduction to Detectron2
1. Background
Detectron was built on Caffe2 and Python and implemented more than ten state-of-the-art computer vision results. Facebook AI Research has since open-sourced its successor, which is what we introduce here: Detectron2.
Detectron2 is a computer vision library from Facebook AI Research. It implements state-of-the-art object detection algorithms and is a complete rewrite of the earlier Detectron; it is often counted among the three major open-source object detection toolkits (Detectron2 / mmdetection / SimpleDet).
Like mmdetection and the TensorFlow Object Detection API, Detectron2 sets all of its parameters through configuration files, which you tweak step by step until the detector does what you need.
2. Features
- Built on top of the PyTorch deep learning framework: PyTorch provides an intuitive imperative programming model, so developers can iterate on model designs and experiments faster.
- More functionality: supports panoptic segmentation (Kirillov, He, et al., CVPR 2019), DensePose, Cascade R-CNN, rotated bounding boxes, and more.
- Highly extensible: starting with Detectron2, Facebook introduced a custom-design mechanism that lets users more easily build detectors tailored to their own tasks, making Detectron2 much more flexible.
- More timely and comprehensive support for the latest research in semantic and panoptic segmentation, with updates planned to continue.
- Implementation quality: Detectron2 was rewritten from scratch, fixing several implementation issues in the original Detectron, and runs faster than the original.
The model is the core of detection and segmentation, and detectron2's models are modular, built from four main parts:
- backbone: the model's CNN feature extractor. Currently only ResNet is supported; note that detectron2 also treats FPN as part of the backbone.
- proposal_generator: the candidate-box generator. Currently only RPN is supported. It is usually one component of Faster R-CNN, although an RPN taken on its own is also a simple one-stage detector.
- roi_heads: the detector part of the Faster R-CNN family, including the ROIPooler, box_head, mask_head, and so on; ROIPooler refers to the ROIPool and ROIAlign operations.
- meta_arch: defines the final model. It is less a standalone module than the glue that assembles the backbone, proposal_generator, and roi_heads into the complete model.
The benefit of modularity is reuse through composition: for example, you can combine a backbone, a proposal_generator, and roi_heads to build different models. Since detectron2 officially supports only the R-CNN family, the modules it ships may not be fully general.
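The composition idea can be sketched in a few lines of plain Python. Everything below is a toy illustration of the pattern, not detectron2's real API (its actual classes live in detectron2.modeling):

```python
# Toy sketch of detectron2's modular design: a meta-architecture composing a
# backbone, a proposal generator, and ROI heads. All names are illustrative.
class ToyRCNN:
    def __init__(self, backbone, proposal_generator, roi_heads):
        self.backbone = backbone                      # image -> features
        self.proposal_generator = proposal_generator  # features -> candidate boxes
        self.roi_heads = roi_heads                    # features + boxes -> detections

    def forward(self, image):
        features = self.backbone(image)
        proposals = self.proposal_generator(features)
        return self.roi_heads(features, proposals)

# Swapping any one component yields a different model without touching the rest:
model = ToyRCNN(
    backbone=lambda img: {"res4": img},                 # stand-in feature extractor
    proposal_generator=lambda feats: [(0, 0, 10, 10)],  # stand-in RPN
    roi_heads=lambda feats, props: [{"box": b} for b in props],
)
detections = model.forward("fake image")  # [{'box': (0, 0, 10, 10)}]
```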
3. Requirements
- Linux or macOS with Python ≥ 3.6
- PyTorch ≥ 1.3
- torchvision matching your PyTorch installation. You can install them together from pytorch.org to make sure of this.
- OpenCV, needed for the demo and visualization (optional)
- pycocotools
4. Setup
Official installation guide: https://detectron2.readthedocs.io/en/latest/tutorials/install.html
Detectron2 source code:
https://github.com/facebookresearch/detectron2
Detectron2 documentation:
https://detectron2.readthedocs.io/index.html
Detectron2 setup on Ubuntu (video):
https://www.bilibili.com/video/BV1bK4y1Y7Qq?from=search&seid=8204924034312637977
Detectron2 setup on Windows 10 (video):
https://www.bilibili.com/video/BV1jZ4y1W7Nb?from=search&seid=8204924034312637977
If you are interested in the Windows 10 setup, give it a try.
5. The default configuration (defaults.py) explained
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
from .config import CfgNode as CN
# -----------------------------------------------------------------------------
# Convention about Training / Test specific parameters
# -----------------------------------------------------------------------------
# Whenever an argument can be either used for training or for testing, the
# corresponding name will be post-fixed by a _TRAIN for a training parameter,
# or _TEST for a test-specific parameter.
# For example, the number of images during training will be
# IMAGES_PER_BATCH_TRAIN, while the number of images for testing will be
# IMAGES_PER_BATCH_TEST
# -----------------------------------------------------------------------------
# Config definition
# -----------------------------------------------------------------------------
_C = CN()  # the root config node
# The version number, to upgrade from old configs to new ones if any
# changes happen. It's recommended to keep a VERSION in your config file.
_C.VERSION = 2
_C.MODEL = CN()
_C.MODEL.LOAD_PROPOSALS = False  # whether to load pre-computed proposals instead of generating them with an RPN
_C.MODEL.MASK_ON = False  # whether this is a segmentation task (used by Mask R-CNN)
_C.MODEL.KEYPOINT_ON = False  # whether this is a keypoint detection task
_C.MODEL.DEVICE = "cuda"  # run on GPU ("cuda") or CPU; set to "cpu" to force CPU
_C.MODEL.META_ARCHITECTURE = "GeneralizedRCNN"  # the meta-architecture, i.e. how the overall network is assembled
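# META_ARCHITECTURE is a string rather than a class because detectron2 resolves
# the name through a registry at build time. The following is a hand-rolled
# sketch of that pattern for illustration, not detectron2's actual registry:
META_ARCH_SKETCH_REGISTRY = {}

def register_meta_arch(cls):
    # decorator that registers a model class under its class name
    META_ARCH_SKETCH_REGISTRY[cls.__name__] = cls
    return cls

@register_meta_arch
class SketchRCNN:
    def __init__(self, device):
        self.device = device

def build_sketch_model(name, device):
    # the config only stores the string; the registry maps it back to a class
    return META_ARCH_SKETCH_REGISTRY[name](device)

sketch_model = build_sketch_model("SketchRCNN", "cpu")  # an instance of SketchRCNN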
# Path (a file path, or URL like detectron2://.., https://..) to a checkpoint file
# to be loaded to the model. You can find available models in the model zoo.
# detectron2 ships many pre-trained weights; after training you can also point this at your own checkpoint
_C.MODEL.WEIGHTS = ""
# Values to be used for image normalization (BGR order, since INPUT.FORMAT defaults to BGR).
# To train on images of different number of channels, just set different mean & std.
# Default values are the mean pixel value from ImageNet: [103.53, 116.28, 123.675]
_C.MODEL.PIXEL_MEAN = [103.530, 116.280, 123.675]
# When using pre-trained models in Detectron1 or any MSRA models,
# std has been absorbed into its conv1 weights, so the std needs to be set 1.
# Otherwise, you can use [57.375, 57.120, 58.395] (ImageNet std)
_C.MODEL.PIXEL_STD = [1.0, 1.0, 1.0]
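# The normalization implied by PIXEL_MEAN / PIXEL_STD is, per channel c (BGR order):
#     out[c] = (in[c] - PIXEL_MEAN[c]) / PIXEL_STD[c]
# A tiny pure-Python sketch of this (illustration only, not detectron2's code):
def normalize_pixel(bgr, mean=(103.530, 116.280, 123.675), std=(1.0, 1.0, 1.0)):
    return [(v - m) / s for v, m, s in zip(bgr, mean, std)]

# A pixel equal to the per-channel mean maps to [0.0, 0.0, 0.0]:
# normalize_pixel([103.530, 116.280, 123.675]) -> [0.0, 0.0, 0.0]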
# -----------------------------------------------------------------------------
# INPUT: input / preprocessing configuration
# -----------------------------------------------------------------------------
_C.INPUT = CN()
# Size of the smallest side of the image during training
_C.INPUT.MIN_SIZE_TRAIN = (800,)
# How the smallest-side size is sampled from INPUT.MIN_SIZE_TRAIN:
# "choice" picks one of the listed values; "range" samples uniformly from the
# interval given by INPUT.MIN_SIZE_TRAIN
_C.INPUT.MIN_SIZE_TRAIN_SAMPLING = "choice"
# Maximum size of the side of the image during training
_C.INPUT.MAX_SIZE_TRAIN = 1333
# Size of the smallest side of the image during testing. Set to zero to disable resize in testing.
_C.INPUT.MIN_SIZE_TEST = 800
# Maximum size of the side of the image during testing
_C.INPUT.MAX_SIZE_TEST = 1333
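# MIN_SIZE_* and MAX_SIZE_* together implement a shortest-edge resize: scale the
# image so its short side reaches MIN_SIZE, but reduce the scale if that would
# push the long side past MAX_SIZE. A simplified sketch of that rule (detectron2
# implements it in its ResizeShortestEdge transform; this re-derivation is for
# illustration only):
def shortest_edge_resize(h, w, min_size=800, max_size=1333):
    scale = min_size / min(h, w)          # bring the short side up to min_size
    if max(h, w) * scale > max_size:      # cap the long side at max_size
        scale = max_size / max(h, w)
    return round(h * scale), round(w * scale)

# shortest_edge_resize(480, 640)  -> (800, 1067): long side stays under 1333
# shortest_edge_resize(480, 1280) -> (500, 1333): the max-size cap kicks in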
# `True` if cropping is used for data augmentation during training
_C.INPUT.CROP = CN({"ENABLED": False})
# Cropping type:
# - "relative": crop a (H * CROP.SIZE[0], W * CROP.SIZE[1]) part of an input of size (H, W)
# - "relative_range": uniformly sample a relative crop size between
#   [CROP.SIZE[0], CROP.SIZE[1]] and [1, 1], then crop as in the "relative" case
# - "absolute": crop a part of the input with an absolute pixel size (CROP.SIZE[0], CROP.SIZE[1])
_C.INPUT.CROP.TYPE = "relative_range"
# Size of crop in range (0, 1] if CROP.TYPE is "relative" or "relative_range" and in number of
# pixels if CROP.TYPE is "absolute"
_C.INPUT.CROP.SIZE = [0.9, 0.9]
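# The three crop types can be summarized as a small function; this is a
# re-derivation of the semantics described in the comments above, not
# detectron2's own code:
import random

def sample_crop_size(h, w, crop_type, size):
    if crop_type == "relative":
        # fixed fraction of the input size
        return int(h * size[0]), int(w * size[1])
    if crop_type == "relative_range":
        # sample a fraction uniformly between size and [1, 1]
        return int(h * random.uniform(size[0], 1.0)), int(w * random.uniform(size[1], 1.0))
    if crop_type == "absolute":
        # fixed pixel size
        return size[0], size[1]
    raise ValueError(crop_type)

# sample_crop_size(480, 640, "relative", [0.9, 0.9]) -> (432, 576)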
# Whether the model needs RGB, YUV, HSV etc.
# Should be one of the modes defined here, as we use PIL to read the image:
# https://pillow.readthedocs.io/en/stable/handbook/concepts.html#concept-modes
# with BGR being the one exception. One can set image format to BGR, we will
# internally use RGB for conversion and flip the channels over
# There are many color modes (e.g. RGB, YUV, HSV), each suited to different
# purposes; here we only specify the mode, and detectron2 converts internally
_C.INPUT.FORMAT = "BGR"
# The ground truth mask format that the model will use.
# Mask R-CNN supports either "polygon" or "bitmask" as ground truth.
_C.INPUT.MASK_FORMAT = "polygon" # alternative: "bitmask"
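# To make the two ground-truth formats concrete: "polygon" stores mask
# boundaries as coordinate lists, while "bitmask" stores a per-pixel 0/1 array.
# A toy conversion for an axis-aligned rectangle (real rasterization, e.g. in
# pycocotools, handles arbitrary polygons; this sketch is illustration only):
def rect_polygon_to_bitmask(poly, height, width):
    # poly is a flat list [x0, y0, x1, y1, ...] describing a rectangle
    xs, ys = poly[0::2], poly[1::2]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    return [[1 if x0 <= x < x1 and y0 <= y < y1 else 0
             for x in range(width)] for y in range(height)]

# rect_polygon_to_bitmask([1, 1, 3, 1, 3, 3, 1, 3], 4, 4) ->
# [[0, 0, 0, 0],
#  [0, 1, 1, 0],
#  [0, 1, 1, 0],
#  [0, 0, 0, 0]]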