TensorRT插件RPROI的使用说明

NvPluginFasterRCNN Plugin

Table Of Contents

Description

The NvPluginFasterRCNN performs object detection for the Faster R-CNN model. This plugin is included in TensorRT and used in sampleFasterRCNN to perform inference.

NvPluginFasterRCNN decodes predicted bounding boxes, extracts their corresponding objectness score, extracts region of interest from predicted bounding boxes using non maximum suppression, and extracts the feature map of region of interest (ROI) using ROI pooling for downstreaming object classification tasks.

This plugin is optimized for the above steps and it allows you to do Faster R-CNN inference in TensorRT.

Structure

The NvPluginFasterRCNN takes four inputs; scores, deltas, fmap and iinfo.

scores
Bounding box (region proposal) objectness scores. scores has shape [N, A x 2, H, W] where N is the batch size, A is the number anchor boxes per pixel on the feature map, H is the height of feature map, and W is the width of feature map. The second dimension is A x 2 because Faster R-CNN uses binary Softmax (probability of having object and probability of not having object) to classify the objectness for each bounding box.

deltas
Predicted bounding box offsets. deltas has shape [N, A x 4, H, W] where N is the batch size, A is the number anchor boxes per pixel on the feature map, H is the height of feature map, and W is the width of feature map. The second dimension is A x 4 because each anchor box or bounding box consists of four parameters.

fmap
Feature map using for bounding box regression and classification. fmap has shape [N, C, H, W] where N is the batch size, C is the number of channels in feature map, H is the height of feature map, and W is the width of feature map.

iinfo
Original image input information. iinfo has shape [N, 3] where N is the batch size, 3 represents the height, width, and resize scale (the same as featureStride) of original input image.

The NvPluginFasterRCNN generates the following two outputs:

rois
Coordinates of region of interest bounding boxes on the original input image. rois has shape [N, 1, nmsMaxOut, 4], where N is the batch size, nmsMaxOut is the maximum number of region of interest bounding boxes, and 4 represents and region of interest bounding box coordinates [x_1, y_1, x_2, y_2]. Here, x_1 and y_1 are the coordinates of bounding box at the top-left corner, and x_2 and y_2 are the coordinates of bounding box at the bottom-right corner.

pfmap
ROI pooled feature map corresponding to the region of interest. pfmap has shape [N, nmsMaxOut, C, poolingH, poolingW] where N is the batch size, nmsMaxOut is the maximum number of region of interest bounding boxes, C is the number of channels in the feature map, poolingH is the height of ROI pooled feature map, and poolingW is the width of ROI pooled feature map.

NvPluginFasterRCNN essentially does region proposal inference followed by region of interest (ROI) pooling.

The proposal inference step includes three steps: extract objectness scores from scores input, decode predicted bounding box from deltas input, non-maximum suppression and get the region of interest bounding boxes using the extracted objectness scores and the decoded bounding boxes.

The ROI pooling step uses the inferred region of interest bounding boxes information to extract its corresponding regions on feature map, and does POI pooling to get uniformly shaped features from different shaped region of interest bounding boxes.

Parameters

NvPluginFasterRCNN has plugin creator class RPROIPluginCreator and plugin class RPROIPlugin.

The RPROIParams data structure was used to create RPROIPlugin instance. The data structure is defined below and consists of the following attributes:

struct RPROIParams
{
	int poolingH, poolingW, featureStride, preNmsTop,
		nmsMaxOut, anchorsRatioCount, anchorsScaleCount;
	float iouThreshold, minBoxSize, spatialScale;
};
TypeParameterDescription
intpoolingHThe height of the output in pixels after ROI pooling on the feature map.
intpoolingWThe width of the output in pixels after ROI pooling on the feature map.
intfeatureStrideThe ratio of the input image size to the feature map size; assuming the max pooling layers in the neural network uses square filters. For example, the input image size is [1600, 800], after max pooling of size [4, 4] twice, the feature map now becomes [100, 50], and featureStride = 4^2 = 16. In the Faster R-CNN settings from the paper, the value is 16.
intpreNmsTopThe number of region proposals before applying NMS using objectness which is the probability of containing an object in the region proposal. The region proposals will be sorted using its objectness. If the number of regions you proposed from the previous region proposal network (RPN) is greater than preNmsTop, the exceeded region proposals with low objectness will be ignored. This value is particularly useful during training to control the number of bounding boxes for regression, but is theoretically useless during inference. In the Faster R-CNN settings from the paper, the value is 6000.
intnmsMaxOutThe number of region proposals after applying NMS. The region proposals will be sorted using its objectness and then applied NMS. At most the nmsMaxOut region proposals exist after NMS is considered as regions of interest.
intanchorsRatioCountThe number of anchor box ratios. For example, if the anchor box ratios (aspect ratios) are 1:1, 1:2, and 2:1, then anchorsRatioCount = 3.
intanchorsScaleCountThe number of anchor box scales. If the anchor box scales (scale factors) are 8, 16, and 32, then anchorsScaleCount = 3.
floatiouThresholdThe IOU threshold used for the NMS step.
floatminBoxSizeThe minimum box size used for the anchor box calculation.
floatspatialScaleThe inverse of featureStride, in other words, spatialScale = 1.0 / featureStride.

Additional resources

The following resources provide a deeper understanding of the NvPluginFasterRCNN plugin:

Networks:

Documentation:

License

For terms and conditions for use, reproduction, and distribution, see the TensorRT Software License Agreement
documentation.

Changelog

May 2019
This is the first release of this README.md file.

Known issues

There are no known issues in this plugin.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值