Explaining the model parameters of Mask R-CNN in PyTorch

TomcatLikeYou

Tags: pytorch, artificial intelligence, python

Article link: https://blog.csdn.net/qq_37293230/article/details/138081000

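This article walks through training an image recognition model with Mask R-CNN ResNet50 FPN, covering the losses computed during training (classification, bounding-box regression, and mask losses) and the prediction output at inference time. The components and steps involved are key to understanding how deep learning is applied to object detection.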

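The examples below assume 1920x1080 input images and batch_size=8.

Training phase

Input arguments
- images: List(Tensor(3,1920,1080))[8], i.e. a list of 8 image tensors
- targets: List(dict()[3])[8], i.e. one dict per image with 3 required keys; the dict is detailed in the table below: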

key	type	dtype	size	remark
boxes	Tensor	float32	(n,4)¹	the ground-truth boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
labels	Tensor	int64	(n,)	the class label for each ground-truth box
masks	Tensor	uint8	(n,1920,1080) [N,H,W]	the segmentation binary masks for each instance; the values are simply 0 and 1 (1 where the object is, 0 elsewhere), and an image with n objects has n masks
area*	Tensor	float32	(n,)	object area
iscrowd*	Tensor	int64	(n,)	whether the annotation covers a crowd of objects (annotated in the COCO dataset)
image_id*	int			image ID

Keys marked with * are optional; some datasets add them during preprocessing.
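To make the input format concrete, here is a minimal sketch that builds a batch of random images and matching target dicts. `make_dummy_target` is a hypothetical helper written for this illustration (it is not part of torchvision), the masks are simply filled boxes, and the images are scaled down to 192x108 so the example stays cheap to run.

```python
import torch

H, W = 192, 108      # scaled-down stand-in for the article's (1920, 1080)
BATCH_SIZE = 8


def make_dummy_target(num_objects, height, width):
    """Hypothetical helper: one target dict with random boxes and box-shaped masks."""
    # Boxes are [x1, y1, x2, y2] with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
    x1 = torch.randint(0, width // 2, (num_objects,))
    y1 = torch.randint(0, height // 2, (num_objects,))
    x2 = x1 + torch.randint(1, width // 2, (num_objects,))
    y2 = y1 + torch.randint(1, height // 2, (num_objects,))
    int_boxes = torch.stack([x1, y1, x2, y2], dim=1)

    # One binary mask per object: 1 inside the object, 0 elsewhere.
    masks = torch.zeros((num_objects, height, width), dtype=torch.uint8)
    for i, (bx1, by1, bx2, by2) in enumerate(int_boxes.tolist()):
        masks[i, by1:by2, bx1:bx2] = 1

    return {
        # Required keys
        "boxes": int_boxes.to(torch.float32),                  # (n, 4)
        "labels": torch.ones(num_objects, dtype=torch.int64),  # (n,)
        "masks": masks,                                        # (n, H, W)
        # Optional keys
        "area": ((x2 - x1) * (y2 - y1)).to(torch.float32),     # (n,)
        "iscrowd": torch.zeros(num_objects, dtype=torch.int64),
    }


images = [torch.rand(3, H, W) for _ in range(BATCH_SIZE)]
targets = [make_dummy_target(i % 3 + 1, H, W) for i in range(BATCH_SIZE)]

for key, value in targets[0].items():
    print(key, value.dtype, tuple(value.shape))
```

Note that n can differ from image to image, which is why both `images` and `targets` are plain Python lists rather than one stacked tensor.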

Return values

key	type	dtype	size	loss function	remark
loss_classifier	Tensor	float32	()	Cross-Entropy Loss	object classification loss
loss_box_reg	Tensor	float32	()	Smooth L1 Loss	bounding-box regression loss
loss_mask	Tensor	float32	()	Binary Cross-Entropy Loss	mask loss
loss_objectness	Tensor	float32	()	Binary Cross-Entropy Loss	RPN classification loss: foreground/background binary classification
loss_rpn_box_reg	Tensor	float32	()	Smooth L1 Loss	RPN bounding-box regression loss

Inference phase

Return values (one dict per image)

key	type	dtype	size	remark
boxes	Tensor	float32	(m,4)²	the predicted boxes in [x1, y1, x2, y2] format, i.e. all predicted bounding boxes
labels	Tensor	int64	(m,)	the predicted labels for each instance
scores	Tensor	float32	(m,)	the scores of each instance
masks	Tensor	float32	(m,1,1920,1080) [M,1,H,W]	the predicted masks for each instance, in 0-1 range. In order to obtain the final segmentation masks, the soft masks can be thresholded, generally with a value of 0.5 (mask >= 0.5). What is actually stored is a soft mask: values below 0.5 are present too, and the transition at object edges is fairly smooth.