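This file specifies the default config options for Detectron. You should not
change values in this file. Instead, write a config file (in yaml)
and load it with merge_cfg_from_file(yaml_file) to override the default
options.
Most tools in the tools directory take a --cfg option to specify an override
file, plus an optional list of override (key, value) pairs:
- See tools/{train,test}_net.py for example code that uses merge_cfg_from_file
- See configs/*/*.yaml for example config files
Detectron supports many different model types, each of which has many
different options. The result is a huge set of configuration options.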
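The override mechanism described above can be illustrated with plain dictionaries. This is only a minimal sketch, not Detectron's implementation: `merge_overrides` and the sample values are hypothetical, and the real merge_cfg_from_file works on the `__C` object and a parsed yaml file.

```python
import copy


def merge_overrides(defaults, overrides):
    """Return a copy of `defaults` with the values in `overrides` applied.

    Hypothetical helper: unknown keys are rejected so that a typo in an
    override file fails loudly instead of silently adding a new option.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if key not in merged:
            raise KeyError('Non-existent config key: {}'.format(key))
        if isinstance(merged[key], dict) and isinstance(value, dict):
            # Recurse into nested sections such as MODEL or TRAIN.
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# A few defaults from this file, written as nested dicts.
defaults = {
    'NUM_GPUS': 1,
    'MODEL': {'TYPE': '', 'NUM_CLASSES': -1, 'MASK_ON': False},
}
# What a parsed yaml override file might contain (illustrative values).
overrides = {
    'NUM_GPUS': 8,
    'MODEL': {'TYPE': 'generalized_rcnn', 'NUM_CLASSES': 81},
}

cfg = merge_overrides(defaults, overrides)
print(cfg['MODEL'])
```

Options that the override does not mention (MASK_ON here) keep their default values, which is why a config file only needs to list what it changes.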
MODEL
# The type of model to use
# The string must match a function in the modeling.model_builder module
# (e.g., 'generalized_rcnn', 'mask_rcnn', ...)
__C.MODEL.TYPE = ''
# The backbone conv body to use
# The string must match a function that is imported in modeling.model_builder
# (e.g., 'FPN.add_fpn_ResNet101_conv5_body' to specify a ResNet-101-FPN
# backbone)
__C.MODEL.CONV_BODY = ''
# Number of classes in the dataset; must be set
# E.g., 81 for COCO (80 foreground + 1 background)
__C.MODEL.NUM_CLASSES = -1
# The meaning of FASTER_RCNN depends on the context (training vs. inference):
# 1) During training, FASTER_RCNN = True means that end-to-end training will be
# used to jointly train the RPN subnetwork and the Fast R-CNN subnetwork
# (Faster R-CNN = RPN + Fast R-CNN).
# 2) During inference, FASTER_RCNN = True means that the model's RPN subnetwork
# will be used to generate proposals rather than relying on precomputed
# proposals. Note that FASTER_RCNN = True can be used at inference time even
# if the Faster R-CNN model was trained with stagewise training (which
# consists of alternating between RPN and Fast R-CNN training in a way that
# finally leads to a single network).
__C.MODEL.FASTER_RCNN = False
# Indicates the model makes instance mask predictions (as in Mask R-CNN)
__C.MODEL.MASK_ON = False
# ---------------------------------------------------------------------------- #
# Misc options
# ---------------------------------------------------------------------------- #
# Number of GPUs to use (applies to both training and testing)
__C.NUM_GPUS = 1
SOLVER
# L2 regularization hyperparameter
__C.SOLVER.WEIGHT_DECAY = 0.0005
# Schedule type (see functions in utils.lr_policy for options)
# E.g., 'step', 'steps_with_decay', ...
__C.SOLVER.LR_POLICY = 'step'
# Some LR Policies (by example):
# 'step'
# lr = SOLVER.BASE_LR * SOLVER.GAMMA ** (cur_iter // SOLVER.STEP_SIZE)
# 'steps_with_decay'
# SOLVER.STEPS = [0, 60000, 80000]
# SOLVER.GAMMA = 0.1
# lr = SOLVER.BASE_LR * SOLVER.GAMMA ** current_step
# iters [0, 59999] are in current_step = 0, iters [60000, 79999] are in
# current_step = 1, and so on
# 'steps_with_lrs'
# SOLVER.STEPS = [0, 60000, 80000]
# SOLVER.LRS = [0.02, 0.002, 0.0002]
# lr = LRS[current_step]
# 'cosine_decay'
# lr = SOLVER.BASE_LR * (cos(PI * cur_iter / SOLVER.MAX_ITER) * 0.5 + 0.5)
# 'exp_decay'
# lr smoothly decays from SOLVER.BASE_LR to SOLVER.GAMMA * SOLVER.BASE_LR
# lr = SOLVER.BASE_LR * exp(np.log(SOLVER.GAMMA) * cur_iter / SOLVER.MAX_ITER)
# Hyperparameter used by the specified policy
# For 'step', the current LR is multiplied by SOLVER.GAMMA at each step
# For 'exp_decay', SOLVER.GAMMA is the ratio between the final and initial LR.
__C.SOLVER.GAMMA = 0.1
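The policy formulas listed above can be collected into one function. This is a sketch that follows the comments literally rather than the code in utils.lr_policy; the function name `learning_rate` and its keyword arguments are assumptions, and `base_lr` stands in for SOLVER.BASE_LR, which is not defined in this excerpt.

```python
import bisect
import math


def learning_rate(policy, cur_iter, base_lr, gamma=0.1, step_size=30000,
                  steps=(0, 60000, 80000), lrs=(0.02, 0.002, 0.0002),
                  max_iter=40000):
    """Learning rate at iteration `cur_iter` for the policies listed above."""
    if policy == 'step':
        return base_lr * gamma ** (cur_iter // step_size)
    if policy in ('steps_with_decay', 'steps_with_lrs'):
        # Index of the last step boundary that is <= cur_iter.
        current_step = bisect.bisect_right(steps, cur_iter) - 1
        if policy == 'steps_with_decay':
            return base_lr * gamma ** current_step
        return lrs[current_step]
    if policy == 'cosine_decay':
        return base_lr * (math.cos(math.pi * cur_iter / max_iter) * 0.5 + 0.5)
    if policy == 'exp_decay':
        return base_lr * math.exp(math.log(gamma) * cur_iter / max_iter)
    raise ValueError('Unknown LR policy: {}'.format(policy))


# Iterations on either side of each step boundary.
for it in (0, 59999, 60000, 80000):
    print(it, learning_rate('steps_with_decay', it, base_lr=0.02))
```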
# Uniform step size for 'step' policy
__C.SOLVER.STEP_SIZE = 30000
# Non-uniform step iterations for 'steps_with_decay' or 'steps_with_lrs'
# policies
__C.SOLVER.STEPS = []
# Learning rates to use with 'steps_with_lrs' policy
__C.SOLVER.LRS = []
# Maximum number of SGD iterations
__C.SOLVER.MAX_ITER = 40000
# Momentum to use with SGD
__C.SOLVER.MOMENTUM = 0.9
FAST-RCNN
# The type of RoI head to use for bounding box classification and regression
# The string must match a function that is imported in modeling.model_builder
# (e.g., 'head_builder.add_roi_2mlp_head' to specify a two hidden layer MLP)
__C.FAST_RCNN.ROI_BOX_HEAD = ''
# Hidden layer dimension when using an MLP for the RoI box head
__C.FAST_RCNN.MLP_HEAD_DIM = 1024
# Hidden Conv layer dimension when using Convs for the RoI box head
__C.FAST_RCNN.CONV_HEAD_DIM = 256
# Number of stacked Conv layers in the RoI box head
__C.FAST_RCNN.NUM_STACKED_CONVS = 4
# RoI transformation function (e.g., RoIPool or RoIAlign)
# (RoIPoolF is the same as RoIPool; ignore the trailing 'F')
__C.FAST_RCNN.ROI_XFORM_METHOD = 'RoIPoolF'
# Number of grid sampling points in RoIAlign (usually use 2)
# Only applies to RoIAlign
__C.FAST_RCNN.ROI_XFORM_SAMPLING_RATIO = 0
# RoI transform output resolution
# Note: some models may have constraints on what they can use, e.g. they use
# pretrained FC layers like in VGG16, and will ignore this option
__C.FAST_RCNN.ROI_XFORM_RESOLUTION = 14
FPN
# ---------------------------------------------------------------------------- #
# FPN options
# ---------------------------------------------------------------------------- #
__C.FPN = AttrDict()
# FPN is enabled if True
__C.FPN.FPN_ON = False
# Channel dimension of the FPN feature levels
__C.FPN.DIM = 256
# Initialize the lateral connections to output zero if True
__C.FPN.ZERO_INIT_LATERAL = False
# Stride of the coarsest FPN level
# This is needed so the input can be padded properly
__C.FPN.COARSEST_STRIDE = 32
#
# FPN may be used for just RPN, just object detection, or both
#
# Use FPN for RoI transform for object detection if True
__C.FPN.MULTILEVEL_ROIS = False
# Hyperparameters for the RoI-to-FPN level mapping heuristic
__C.FPN.ROI_CANONICAL_SCALE = 224 # s0
__C.FPN.ROI_CANONICAL_LEVEL = 4 # k0: where s0 maps to
# Coarsest level of the FPN pyramid
__C.FPN.ROI_MAX_LEVEL = 5
# Finest level of the FPN pyramid
__C.FPN.ROI_MIN_LEVEL = 2
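The RoI-to-FPN level heuristic these four options parameterize is, in the FPN paper, k = floor(k0 + log2(sqrt(w * h) / s0)), clamped to the available levels. A minimal sketch under that assumption follows; the function name and the small epsilon that guards against log2(0) are choices made for this example.

```python
import math


def map_roi_to_fpn_level(width, height, k_min=2, k_max=5, s0=224, k0=4):
    """Pick the FPN level for an RoI of the given size (in pixels)."""
    scale = math.sqrt(width * height)
    # An RoI at the canonical scale s0 maps to the canonical level k0;
    # each halving or doubling of the scale moves one level down or up.
    level = math.floor(k0 + math.log2(scale / s0 + 1e-6))
    # Clamp to [ROI_MIN_LEVEL, ROI_MAX_LEVEL].
    return int(min(max(level, k_min), k_max))


for size in (8, 112, 224, 448, 2000):
    print(size, map_roi_to_fpn_level(size, size))
```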
# Use FPN for RPN if True
__C.FPN.MULTILEVEL_RPN = False
# Coarsest level of the FPN pyramid
__C.FPN.RPN_MAX_LEVEL = 6
# Finest level of the FPN pyramid
__C.FPN.RPN_MIN_LEVEL = 2
# FPN RPN anchor aspect ratios
__C.FPN.RPN_ASPECT_RATIOS = (0.5, 1, 2)
# RPN anchors start at this size on RPN_MIN_LEVEL
# The anchor size doubles at each level after that
# With a default of 32 and levels 2 to 6, we get anchor sizes of 32 to 512
__C.FPN.RPN_ANCHOR_START_SIZE = 32
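The doubling rule in the comment above works out as follows. The helper name is hypothetical; the arithmetic is only what the comment states.

```python
def fpn_rpn_anchor_sizes(start_size=32, min_level=2, max_level=6):
    """Anchor size per FPN level: start_size on min_level, doubling after."""
    return {
        lvl: start_size * 2 ** (lvl - min_level)
        for lvl in range(min_level, max_level + 1)
    }


print(fpn_rpn_anchor_sizes())
```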
# Use extra FPN levels, as done in the RetinaNet paper
__C.FPN.EXTRA_CONV_LEVELS = False
# Use GroupNorm in the FPN-specific layers (lateral, etc.)
__C.FPN.USE_GN = False
Mask-RCNN
# The type of RoI head to use for instance mask prediction
# The string must match a function that is imported in modeling.model_builder
# (e.g., 'mask_rcnn_heads.ResNet_mask_rcnn_fcn_head_v1up4convs')
__C.MRCNN.ROI_MASK_HEAD = ''
# Resolution of mask predictions
__C.MRCNN.RESOLUTION = 14
# RoI transformation function (e.g., RoIPool or RoIAlign)
__C.MRCNN.ROI_XFORM_METHOD = 'RoIAlign'
# RoI transform output resolution
__C.MRCNN.ROI_XFORM_RESOLUTION = 7
# Number of grid sampling points in RoIAlign (usually use 2)
# Only applies to RoIAlign
__C.MRCNN.ROI_XFORM_SAMPLING_RATIO = 0
# Number of channels in the mask head
__C.MRCNN.DIM_REDUCED = 256
# Use dilated convolution in the mask head
__C.MRCNN.DILATION = 2
# Upsample the predicted masks by this factor
__C.MRCNN.UPSAMPLE_RATIO = 1
# Use a fully-connected layer to predict the final masks instead of a conv layer
__C.MRCNN.USE_FC_OUTPUT = False
# Weight initialization method for the mask head and mask output layers
__C.MRCNN.CONV_INIT = 'GaussianFill'
# Use class specific mask predictions if True (otherwise use class agnostic mask
# predictions)
__C.MRCNN.CLS_SPECIFIC_MASK = True
# Multi-task loss weight for masks
__C.MRCNN.WEIGHT_LOSS_MASK = 1.0
# Binarization threshold for converting soft masks to hard masks
__C.MRCNN.THRESH_BINARIZE = 0.5
TRAIN
# Initialize network with weights from this .pkl file
__C.TRAIN.WEIGHTS = ''
# Datasets to train on
# Available dataset list: detectron.datasets.dataset_catalog.datasets()
# If multiple datasets are listed, the model is trained on their union
__C.TRAIN.DATASETS = ()
# Scales to use during training
# Each scale is the pixel size of an image's shortest side
# If multiple scales are listed, then one is selected uniformly at random for
# each training image (i.e., scale jitter data augmentation)
__C.TRAIN.SCALES = (600, ) # should match the shortest side of the actual images; otherwise training takes longer
# Max pixel size of the longest side of a scaled input image
__C.TRAIN.MAX_SIZE = 1000 # should match the longest side of the actual images; otherwise training takes longer
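How the two settings interact can be sketched as a single scale factor: the image is resized so its shortest side reaches the target scale, unless that would push the longest side past the maximum. The function name is an assumption, and this is a simplified reading of the comments rather than the library's preprocessing code.

```python
def compute_im_scale(height, width, target_size=600, max_size=1000):
    """Resize factor for one image given TRAIN.SCALES[i] and TRAIN.MAX_SIZE."""
    im_size_min = min(height, width)
    im_size_max = max(height, width)
    # First try to bring the shortest side to target_size.
    im_scale = float(target_size) / im_size_min
    # If the longest side would exceed max_size, cap it there instead.
    if round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / im_size_max
    return im_scale


print(compute_im_scale(480, 640))
print(compute_im_scale(500, 2000))
```

For a 480x640 image the shortest-side rule applies (factor 1.25); for a 500x2000 image the cap on the longest side takes over (factor 0.5).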
# Images *per GPU* in the training minibatch
# Total images per minibatch = TRAIN.IMS_PER_BATCH * NUM_GPUS
__C.TRAIN.IMS_PER_BATCH = 2
# RoI minibatch size *per image* (number of regions of interest [ROIs])
# Total number of RoIs per training minibatch =
# TRAIN.BATCH_SIZE_PER_IM * TRAIN.IMS_PER_BATCH * NUM_GPUS
# E.g., a common configuration is: 512 * 2 * 8 = 8192
__C.TRAIN.BATCH_SIZE_PER_IM = 64
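The minibatch arithmetic in the comments above, written out. The helper names are hypothetical.

```python
def images_per_minibatch(ims_per_batch, num_gpus):
    """Total images per minibatch = TRAIN.IMS_PER_BATCH * NUM_GPUS."""
    return ims_per_batch * num_gpus


def rois_per_minibatch(batch_size_per_im, ims_per_batch, num_gpus):
    """Total RoIs = TRAIN.BATCH_SIZE_PER_IM * TRAIN.IMS_PER_BATCH * NUM_GPUS."""
    return batch_size_per_im * images_per_minibatch(ims_per_batch, num_gpus)


# The "common configuration" from the comment: 512 RoIs, 2 images, 8 GPUs.
print(rois_per_minibatch(512, 2, 8))
# The defaults in this file: 64 RoIs, 2 images, 1 GPU.
print(rois_per_minibatch(64, 2, 1))
```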
# Target fraction of RoI minibatch that is labeled foreground (i.e. class > 0)
__C.TRAIN.FG_FRACTION = 0.25
# Overlap threshold for an RoI to be considered foreground (if >= FG_THRESH)
__C.TRAIN.FG_THRESH = 0.5
# Overlap threshold for an RoI to be considered background (class = 0 if
# overlap in [LO, HI))
__C.TRAIN.BG_THRESH_HI = 0.5
__C.TRAIN.BG_THRESH_LO = 0.0
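Read together, the three thresholds partition RoIs by their maximum overlap (IoU) with any ground-truth box. A small sketch of that rule follows; `label_roi` is a hypothetical name, and the 'ignored' outcome is inferred from the comments (an RoI below BG_THRESH_LO is neither foreground nor background).

```python
def label_roi(max_overlap, fg_thresh=0.5, bg_thresh_hi=0.5, bg_thresh_lo=0.0):
    """Classify an RoI as 'fg', 'bg', or 'ignored' from its max IoU."""
    if max_overlap >= fg_thresh:
        return 'fg'
    if bg_thresh_lo <= max_overlap < bg_thresh_hi:
        return 'bg'
    return 'ignored'


# Target number of foreground RoIs per image with the defaults above.
fg_rois_per_image = int(round(0.25 * 64))

print(label_roi(0.7), label_roi(0.3), fg_rois_per_image)
```

With the defaults (LO = 0.0, HI = FG_THRESH = 0.5) every RoI is either foreground or background; raising BG_THRESH_LO excludes RoIs with almost no overlap from the background set.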
# Use horizontally-flipped images during training?
__C.TRAIN.USE_FLIPPED = True
# Overlap required between an RoI and a ground-truth box in order for that
# (RoI, gt box) pair to be used as a bounding-box regression training example
__C.TRAIN.BBOX_THRESH = 0.5
# Snapshot (model checkpoint) period
# Divide by NUM_GPUS to determine actual period (e.g., 80000/8 => 10000 iters)
# to allow for linear training schedule scaling
__C.TRAIN.SNAPSHOT_ITERS = 80000
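The division described in the comment can be written as a one-line helper. The name is hypothetical, and integer division is an assumption for the case where NUM_GPUS does not divide SNAPSHOT_ITERS evenly.

```python
def snapshot_period(snapshot_iters=80000, num_gpus=1):
    """Actual number of iterations between checkpoints."""
    return snapshot_iters // num_gpus


print(snapshot_period(80000, 8))
```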
# Train using these proposals
# During training, all proposals specified in the file are used (no limit is
# applied)
# Proposal files must be in correspondence with the datasets listed in
# TRAIN.DATASETS
__C.TRAIN.PROPOSAL_FILES = ()
# Make minibatches from images that have similar aspect ratios (i.e. both
# tall and thin or both short and wide)
# This feature is critical for saving memory (and makes training slightly
# faster)
__C.TRAIN.ASPECT_GROUPING = True
TEST
# Initialize network with weights from this .pkl file
__C.TEST.WEIGHTS = ''
# Datasets to test on
# Available dataset list: detectron.datasets.dataset_catalog.datasets()
# If multiple datasets are listed, testing is performed on each one sequentially
__C.TEST.DATASETS = ()
# Scale to use during testing
__C.TEST.SCALE = 600
# Max pixel size of the longest side of a scaled input image
__C.TEST.MAX_SIZE = 1000
# Overlap threshold used for non-maximum suppression (suppress boxes with
# IoU >= this threshold)
__C.TEST.NMS = 0.3
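For readers unfamiliar with non-maximum suppression, here is a greedy pure-Python sketch that follows the wording of the comment above (a box is suppressed when its IoU with an already kept box is >= the threshold). It is not the library's optimized implementation; boxes are plain (x1, y1, x2, y2) tuples and areas are computed without any +1 pixel convention, both assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, thresh=0.3):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any kept box too much.
        if all(iou(boxes[i], boxes[j]) < thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, thresh=0.3))
```

The first two boxes overlap with an IoU of about 0.68, so the lower-scoring one is dropped at the default threshold of 0.3 but would survive at a threshold of 0.7.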
# Apply Fast R-CNN style bounding-box regression if True
__C.TEST.BBOX_REG = True
# Test using these proposal files (must correspond with TEST.DATASETS)
__C.TEST.PROPOSAL_FILES = ()
# Run GenerateProposals on GPU if set to True
__C.TEST.GENERATE_PROPOSALS_ON_GPU = False
# Limit on the number of proposals per image used during inference
__C.TEST.PROPOSAL_LIMIT = 2000
# NMS threshold used on RPN proposals
__C.TEST.RPN_NMS_THRESH = 0.7
# Number of top scoring RPN proposals to keep before applying NMS
# When FPN is used, this is *per FPN level* (not total)
__C.TEST.RPN_PRE_NMS_TOP_N = 12000
# Number of top scoring RPN proposals to keep after applying NMS
# This is the total number of RPN proposals produced (for both FPN and non-FPN
# cases)
__C.TEST.RPN_POST_NMS_TOP_N = 2000
# Proposal height and width both need to be greater than RPN_MIN_SIZE
# (at orig image scale; not scale used during training or inference)
__C.TEST.RPN_MIN_SIZE = 0
# Maximum number of detections to return per image (100 is based on the limit
# established for the COCO dataset)
__C.TEST.DETECTIONS_PER_IM = 100
# Minimum score threshold (assuming scores in a [0, 1] range); a value chosen to
# balance obtaining high recall with not having too many low precision
# detections that will slow down inference post processing steps (like NMS)
__C.TEST.SCORE_THRESH = 0.05
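The score threshold and the per-image detection limit act as two successive filters. The sketch below shows only those two steps and leaves out the per-class NMS that would normally sit between them; the function name is hypothetical.

```python
def filter_detections(scores, score_thresh=0.05, detections_per_im=100):
    """Indices of detections that survive both filters, best score first."""
    # Drop low-confidence detections early so later steps stay cheap.
    keep = [i for i, s in enumerate(scores) if s > score_thresh]
    # Keep at most detections_per_im of the highest-scoring ones.
    keep.sort(key=lambda i: scores[i], reverse=True)
    return keep[:detections_per_im]


scores = [0.9, 0.01, 0.5, 0.2]
print(filter_detections(scores))
print(filter_detections(scores, detections_per_im=2))
```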
# Save detection results files if True
# If false, results files are cleaned up (they can be large) after local
# evaluation
__C.TEST.COMPETITION_MODE = True
# Evaluate detections with the COCO json dataset eval code even if it's not the
# evaluation code for the dataset (e.g. evaluate PASCAL VOC results using the
# COCO API to get COCO style AP on PASCAL VOC)
__C.TEST.FORCE_JSON_DATASET_EVAL = False
# [Inferred value; do not set directly in a config]
# Indicates if precomputed proposals are used at test time
# Not set for 1-stage models and 2-stage models with RPN subnetwork enabled
__C.TEST.PRECOMPUTED_PROPOSALS = True
# Evaluate proposals in class-specific Average Recall (AR).
# It means that one first computes AR within each category and then averages
# over the categories. It is not biased towards the AR of frequent categories
# compared with class-agnostic AR.
__C.TEST.CLASS_SPECIFIC_AR = False
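The difference between the two averaging schemes shows up as soon as categories are imbalanced. The numbers below are made up for illustration, and the sketch uses recall at a single overlap threshold; Average Recall would additionally average over several IoU thresholds.

```python
def class_agnostic_recall(per_class):
    """Pool all ground-truth boxes, then compute one recall value."""
    recalled = sum(r for r, _ in per_class.values())
    total = sum(g for _, g in per_class.values())
    return recalled / total


def class_specific_recall(per_class):
    """Compute recall within each category, then average over categories."""
    recalls = [r / g for r, g in per_class.values()]
    return sum(recalls) / len(recalls)


# category -> (ground-truth boxes recalled, ground-truth boxes in total)
per_class = {'frequent': (90, 100), 'rare': (2, 10)}
print(class_agnostic_recall(per_class))
print(class_specific_recall(per_class))
```

The pooled value is dominated by the frequent category, while the class-specific value gives the rare category equal weight.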
# ---------------------------------------------------------------------------- #
# Test-time augmentations for bounding box detection
# See configs/test_time_aug/e2e_mask_rcnn_R-50-FPN_2x.yaml for an example
# ---------------------------------------------------------------------------- #
__C.TEST.BBOX_AUG = AttrDict()
# Enable test-time augmentation for bounding box detection if True
__C.TEST.BBOX_AUG.ENABLED = False
# Heuristic used to combine predicted box scores
# Valid options: ('ID', 'AVG', 'UNION')
__C.TEST.BBOX_AUG.SCORE_HEUR = 'UNION'
# Heuristic used to combine predicted box coordinates
# Valid options: ('ID', 'AVG', 'UNION')
__C.TEST.BBOX_AUG.COORD_HEUR = 'UNION'
# Horizontal flip at the original scale (id transform)
__C.TEST.BBOX_AUG.H_FLIP = False
# Each scale is the pixel size of an image's shortest side
__C.TEST.BBOX_AUG.SCALES = ()
# Max pixel size of the longer side
__C.TEST.BBOX_AUG.MAX_SIZE = 4000
# Horizontal flip at each scale
__C.TEST.BBOX_AUG.SCALE_H_FLIP = False
# Apply scaling based on object size
__C.TEST.BBOX_AUG.SCALE_SIZE_DEP = False
__C.TEST.BBOX_AUG.AREA_TH_LO = 50**2
__C.TEST.BBOX_AUG.AREA_TH_HI = 180**2
# Each aspect ratio is relative to image width
__C.TEST.BBOX_AUG.ASPECT_RATIOS = ()
# Horizontal flip at each aspect ratio
__C.TEST.BBOX_AUG.ASPECT_RATIO_H_FLIP = False
# ---------------------------------------------------------------------------- #
# Test-time augmentations for mask detection
# See configs/test_time_aug/e2e_mask_rcnn_R-50-FPN_2x.yaml for an example
# ---------------------------------------------------------------------------- #
__C.TEST.MASK_AUG = AttrDict()
# Enable test-time augmentation for instance mask detection if True
__C.TEST.MASK_AUG.ENABLED = False
# Heuristic used to combine mask predictions
# SOFT prefix indicates that the computation is performed on soft masks
# Valid options: ('SOFT_AVG', 'SOFT_MAX', 'LOGIT_AVG')
__C.TEST.MASK_AUG.HEUR = 'SOFT_AVG'
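One plausible reading of the three heuristic names is sketched below: average the soft masks, take their per-pixel maximum, or average in logit space and map back through a sigmoid. The function name, the flat-list mask representation, and the clamping constant are assumptions for this example, not the library's code.

```python
import math


def combine_soft_masks(masks, heur='SOFT_AVG'):
    """Combine soft masks from several augmentations, pixel by pixel.

    `masks` is a list of equally sized flat lists of probabilities in [0, 1].
    """
    pixels = list(zip(*masks))
    if heur == 'SOFT_AVG':
        return [sum(p) / len(p) for p in pixels]
    if heur == 'SOFT_MAX':
        return [max(p) for p in pixels]
    if heur == 'LOGIT_AVG':
        def logit(y):
            # Clamp to avoid log(0) or division by zero at 0 and 1.
            y = min(max(y, 1e-6), 1.0 - 1e-6)
            return math.log(y / (1.0 - y))

        combined = []
        for p in pixels:
            mean_logit = sum(logit(y) for y in p) / len(p)
            combined.append(1.0 / (1.0 + math.exp(-mean_logit)))
        return combined
    raise ValueError('Unknown heuristic: {}'.format(heur))


masks = [[0.2, 0.9], [0.8, 0.9]]
for heur in ('SOFT_AVG', 'SOFT_MAX', 'LOGIT_AVG'):
    print(heur, combine_soft_masks(masks, heur))
```

Whichever heuristic is used, the combined soft mask would still be binarized afterwards with MRCNN.THRESH_BINARIZE.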
# Horizontal flip at the original scale (id transform)
__C.TEST.MASK_AUG.H_FLIP = False
# Each scale is the pixel size of an image's shortest side
__C.TEST.MASK_AUG.SCALES = ()
# Max pixel size of the longer side
__C.TEST.MASK_AUG.MAX_SIZE = 4000
# Horizontal flip at each scale
__C.TEST.MASK_AUG.SCALE_H_FLIP = False
# Apply scaling based on object size
__C.TEST.MASK_AUG.SCALE_SIZE_DEP = False
__C.TEST.MASK_AUG.AREA_TH = 180**2
# Each aspect ratio is relative to image width
__C.TEST.MASK_AUG.ASPECT_RATIOS = ()
# Horizontal flip at each aspect ratio
__C.TEST.MASK_AUG.ASPECT_RATIO_H_FLIP = False
# ---------------------------------------------------------------------------- #
# Test-time augmentations for keypoint detection
# See configs/test_time_aug/keypoint_rcnn_R-50-FPN_1x.yaml for an example
# ---------------------------------------------------------------------------- #
__C.TEST.KPS_AUG = AttrDict()
# Enable test-time augmentation for keypoint detection if True
__C.TEST.KPS_AUG.ENABLED = False
# Heuristic used to combine keypoint predictions
# Valid options: ('HM_AVG', 'HM_MAX')
__C.TEST.KPS_AUG.HEUR = 'HM_AVG'
# Horizontal flip at the original scale (id transform)
__C.TEST.KPS_AUG.H_FLIP = False
# Each scale is the pixel size of an image's shortest side
__C.TEST.KPS_AUG.SCALES = ()
# Max pixel size of the longer side
__C.TEST.KPS_AUG.MAX_SIZE = 4000
# Horizontal flip at each scale
__C.TEST.KPS_AUG.SCALE_H_FLIP = False
# Apply scaling based on object size
__C.TEST.KPS_AUG.SCALE_SIZE_DEP = False
__C.TEST.KPS_AUG.AREA_TH = 180**2
# Each aspect ratio is relative to image width
__C.TEST.KPS_AUG.ASPECT_RATIOS = ()
# Horizontal flip at each aspect ratio
__C.TEST.KPS_AUG.ASPECT_RATIO_H_FLIP = False