[Taking the Wheel Apart] The NormalizeImage preprocessing operator in PaddleDetection

Latest recommended article published on 2023-11-18 21:55:19

氵文大师

Category: PaddleDetection. Tags: python, algorithm, numpy

Link to this article: https://blog.csdn.net/HaoZiHuang/article/details/128418895

The source lives at the relative path ppdet/data/transform/operators.py

The previous post, https://blog.csdn.net/HaoZiHuang/article/details/128398000, briefly covered its base class BaseOperator.

The base class __init__ initializes self._id. For example, after instantiating the class below, printing this attribute gives:

>>> self._id
'NormalizeImage_d78ed6'
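
To make the shape of that id concrete, here is a minimal sketch of how a value like it could be produced. SketchOperator and NormalizeImageSketch are hypothetical stand-ins, not the PaddleDetection classes, and the assumption that the suffix is the last six characters of a random UUID is mine, based only on how the printed value looks.

```python
import uuid


class SketchOperator:
    """Hypothetical stand-in for BaseOperator, for illustration only."""

    def __init__(self, name=None):
        # Fall back to the class name when no explicit name is given.
        if name is None:
            name = self.__class__.__name__
        # Assumption: the suffix is the last 6 characters of a random UUID.
        self._id = name + '_' + str(uuid.uuid4())[-6:]


class NormalizeImageSketch(SketchOperator):
    """Hypothetical subclass; it inherits the id logic unchanged."""


op = NormalizeImageSketch()
print(op._id)  # 'NormalizeImageSketch_' followed by 6 random hex characters
```

Because the suffix is random, two instances of the same operator get different ids.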

class NormalizeImage(BaseOperator):
    def __init__(self,
                 mean=[0.485, 0.456, 0.406],
                 std=[0.229, 0.224, 0.225],
                 is_scale=True,
                 norm_type='mean_std'):
        """
        Args:
            mean (list): the pixel mean
            std (list): the pixel variance
            is_scale (bool): scale the pixel to [0,1]
            norm_type (str): type in ['mean_std', 'none']
        """
        super(NormalizeImage, self).__init__()
        self.mean = mean
        self.std = std
        self.is_scale = is_scale
        self.norm_type = norm_type
        if not (isinstance(self.mean, list) and isinstance(self.std, list) and
                isinstance(self.is_scale, bool) and
                self.norm_type in ['mean_std', 'none']):
            raise TypeError("{}: input type is invalid.".format(self))
        from functools import reduce
        if reduce(lambda x, y: x * y, self.std) == 0:
            raise ValueError('{}: std is invalid!'.format(self))

    def apply(self, sample, context=None):
        """Normalize the image.
        Operators:
            1.(optional) Scale the pixel to [0,1]
            2.(optional) Each pixel minus mean and is divided by std
        """
        im = sample['image']
        im = im.astype(np.float32, copy=False)
        if self.is_scale:
            scale = 1.0 / 255.0
            im *= scale

        if self.norm_type == 'mean_std':
            mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
            std = np.array(self.std)[np.newaxis, np.newaxis, :]
            im -= mean
            im /= std
        sample['image'] = im
        return sample

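self.mean and self.std are the per-channel parameters used to standardize the image. Their defaults are [0.485, 0.456, 0.406] and [0.229, 0.224, 0.225]. Note that the docstring describes std as the pixel variance, but apply divides by it directly, so it acts as a standard deviation.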

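If self.is_scale is True, the image is first scaled to [0, 1] by multiplying by 1/255.
If self.norm_type is 'none', no mean/std normalization is applied; if it is 'mean_std', the image is normalized with self.mean and self.std.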
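
The two steps above can be tried on a toy array. The function below is an illustrative re-implementation for a channels-last (H x W x C) image, not the library code; normalize_sketch and the toy input are names I made up for this example.

```python
import numpy as np


def normalize_sketch(im, mean, std, is_scale=True, norm_type='mean_std'):
    """Illustrative re-implementation for an H x W x C image."""
    im = im.astype(np.float32)  # always copies here, unlike copy=False
    if is_scale:
        im *= 1.0 / 255.0  # map [0, 255] to [0, 1]
    if norm_type == 'mean_std':
        # Shape (1, 1, C) broadcasts over height and width.
        im -= np.array(mean, dtype=np.float32)[np.newaxis, np.newaxis, :]
        im /= np.array(std, dtype=np.float32)[np.newaxis, np.newaxis, :]
    return im


MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

# A 1 x 2 x 3 toy image: one black pixel and one white pixel.
toy = np.array([[[0, 0, 0], [255, 255, 255]]], dtype=np.uint8)
out = normalize_sketch(toy, MEAN, STD)
print(out.shape)  # (1, 2, 3)
print(out[0, 0])  # black pixel: -mean / std per channel
print(out[0, 1])  # white pixel: (1 - mean) / std per channel
```

With the default parameters a black pixel maps to roughly -2.1 in the first channel and a white pixel to roughly 2.2, which is why the normalized image contains negative values.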

NormalizeImage only modifies sample['image']; the other fields of sample are left unchanged.

>>> pprint(sample)
{'curr_iter': 0,

 'flipped': True,
 
 'gt_bbox': array([[ 639.524    ,  241.79735  ,  683.641    ,  366.2275   ],
       [ 827.6553   ,  287.004    , 1065.       ,  456.85568  ],
       [   0.       ,  361.1787   ,  111.67373  ,  502.13394  ],
       [ 308.9322   ,  400.6204   ,  533.1966   ,  559.8373   ]],
      dtype=float32),
      
 'gt_class': array([[58],
		......
       [60]], dtype=int32),
       
 'h': 426.0,
 
 'im_id': array([139]),
 
 'im_shape': array([ 736., 1065.], dtype=float32),
 
 'image': array([[[-0.7650483 , -0.757703  , -1.0724183 ],
 		......
        [ 0.8618033 , -0.23249283, -0.7238344 ]]], dtype=float32),
        
 'is_crowd': array([[0],
		......
       [0]], dtype=int32),
       
 'scale_factor': array([1.7903621, 1.7861136], dtype=float32),
 
 'w': 640.0}

Note that, compared with the output of Decode, there is an extra 'flipped': True entry, because this sample had already passed through RandomFlip.

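You may run into a problem here: check whether your boxes are annotated as $x_1y_1x_2y_2$, $x_cy_cx_2y_2$, or $x_cy_cwh$. Drawing one box on the image is a quick way to check: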

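import cv2

# im is the image array being processed, e.g. sample['image']
x1, y1, x2, y2 = sample['gt_bbox'][1].astype(int).tolist()  # plain Python ints for OpenCV
xx = cv2.rectangle(im, (x1, y1), (x2, y2), 255, thickness=2, lineType=8)
cv2.imwrite("xxx.png", xx)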
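
If the box is drawn after NormalizeImage has run, the pixel values are no longer in [0, 255] (see the negative values in the printed sample), so the saved picture may look wrong. A minimal sketch of undoing the normalization first is shown below. denormalize_sketch is a hypothetical helper, not part of PaddleDetection, and it assumes a channels-last image normalized with is_scale=True and norm_type='mean_std'.

```python
import numpy as np


def denormalize_sketch(im, mean, std, is_scale=True):
    """Hypothetical helper: invert mean/std normalization, return uint8."""
    out = im.astype(np.float32) * np.array(std, dtype=np.float32)
    out += np.array(mean, dtype=np.float32)
    if is_scale:
        out *= 255.0
    # Round to the nearest integer and clamp into the valid uint8 range.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

# Round trip on random data: normalize by hand, then restore.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)
normalized = (original.astype(np.float32) / 255.0 - np.array(MEAN)) / np.array(STD)
restored = denormalize_sketch(normalized, MEAN, STD)
print(np.array_equal(restored, original))  # True
```

The restored uint8 array can then be passed to cv2.rectangle and cv2.imwrite as in the snippet above.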

Functions for converting between the box formats above are collected here:
https://blog.csdn.net/HaoZiHuang/article/details/128213305
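
For a quick idea of what such conversions look like, here is a small sketch between the corner format (x1, y1, x2, y2) and the center format (xc, yc, w, h). The function names are my own and are not taken from the linked post or from PaddleDetection.

```python
import numpy as np


def xyxy_to_cxcywh(boxes):
    """Convert (x1, y1, x2, y2) boxes to (xc, yc, w, h)."""
    boxes = np.asarray(boxes, dtype=np.float32)
    x1, y1, x2, y2 = (boxes[..., i] for i in range(4))
    return np.stack([(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1], axis=-1)


def cxcywh_to_xyxy(boxes):
    """Convert (xc, yc, w, h) boxes to (x1, y1, x2, y2)."""
    boxes = np.asarray(boxes, dtype=np.float32)
    cx, cy, w, h = (boxes[..., i] for i in range(4))
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=-1)


box = np.array([[0.0, 10.0, 100.0, 50.0]])
center = xyxy_to_cxcywh(box)
print(center)                  # center (50, 30), width 100, height 40
print(cxcywh_to_xyxy(center))  # back to the original corners
```

Both functions accept an array of shape (N, 4), so they can be applied directly to an array like sample['gt_bbox'].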
