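Super-resolution: separate the background and the faces, super-resolve each on its own, then paste the faces back onto the background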

庄仪浩

Last modified 2022-03-25 17:55:06


Category: Python. Tags: python, computer vision, artificial intelligence

First published 2021-11-21 21:02:00

Article link: https://blog.csdn.net/qq_20265015/article/details/121457113


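Project scenario: when super-resolving an image, extract the faces and run a dedicated face super-resolution model on them, then paste the results back into the image, so that faces and background are super-resolved separately and then combined.

Combining background (ordinary natural-scene) super-resolution with face super-resolution can give a better overall result and a more pleasing image.

Problem description and cause analysis: when super-resolving an image that contains faces with nothing but a natural-scene super-resolution network, the background should be restored reasonably well, since the model was trained on large natural-scene datasets. The face regions are a different matter: a scene network does not necessarily do well there, because faces are best restored by a network trained on a large amount of face (frontal face) data.

Solution:

The overall idea is to detect every face in the image, super-resolve the face regions on their own, super-resolve the background separately, and then paste the faces onto the result image using a mask.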
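
The mask-based paste at the end of this pipeline comes down to a per-pixel weighted sum. The snippet below is only a minimal NumPy sketch of that idea; the function name `composite` and the tiny 4x4 test images are made up for illustration and are not part of facexlib.

```python
import numpy as np


def composite(face, background, mask):
    """Blend `face` over `background`: mask * face + (1 - mask) * background.

    face, background: HxWx3 uint8 images of the same size.
    mask: HxW float array in [0, 1]; 1 means "take the face pixel".
    """
    m = mask.astype(np.float32)[:, :, None]  # add a channel axis for broadcasting
    out = m * face.astype(np.float32) + (1.0 - m) * background.astype(np.float32)
    return out.round().astype(np.uint8)


# Hypothetical toy data: a flat grey background and a brighter "face".
background = np.full((4, 4, 3), 100, dtype=np.uint8)
face = np.full((4, 4, 3), 200, dtype=np.uint8)

mask = np.zeros((4, 4), dtype=np.float32)
mask[1:3, 1:3] = 1.0   # fully face in the centre
mask[0, 0] = 0.5       # a half-blended pixel, as on a soft mask edge

result = composite(face, background, mask)
```

Pixels where the mask is 1 come entirely from the face, pixels where it is 0 keep the background, and values in between mix the two, which is what hides the seam once the mask edge is softened.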

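The first step is face detection. I use the facexlib face detection library here, simply because I found similar code and reused it; the well-known dlib library would also work. Both detect faces with learned models. Each detected face is cropped out and resized to 512*512.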

pip install facexlib

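from facexlib.utils.face_restoration_helper import FaceRestoreHelper

# Define the detector. This snippet sits in the enhancer class's __init__,
# where `upscale` is the overall upscaling factor.
self.face_helper = FaceRestoreHelper(
    upscale,
    face_size=512,
    crop_ratio=(1, 1),
    det_model='retinaface_resnet50',
    save_ext='png',
    device=self.device)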


# Run detection and restoration. The input is a single image.
# Requires: import numpy as np; from basicsr.utils import img2tensor
def enhance_with_face(self, img, only_center_face=False, paste_back=True):
    self.face_helper.clean_all()
    self.face_helper.read_image(img)
    # get face landmarks for each face
    self.face_helper.get_face_landmarks_5(only_center_face=only_center_face)
    # align and warp each face
    self.face_helper.align_warp_face()
    # After these steps the helper holds the aligned 512*512 face crops (faces
    # that are too small are resized up) together with the alignment information.
    # Alignment is essentially an image rotation. It helps restoration, but the
    # restored face must be rotated back before it can be pasted into the image.

    # face restoration
    for cropped_face in self.face_helper.cropped_faces:
        # prepare data
        # cropped_face is numpy uint8 data that OpenCV can save directly.
        # The steps below scale it to [0, 1], convert BGR to RGB and to float32,
        # turn it into a tensor and add a batch dimension, so that it matches
        # the input format of the face super-resolution model.
        # TODO: make the batch dimension, normalization and BGR-to-RGB steps
        # conditional and refactor this code.
        cropped_face_t = img2tensor(cropped_face / 255., bgr2rgb=True, float32=True)
        # normalize(cropped_face_t, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5), inplace=True)
        cropped_face_t = cropped_face_t.unsqueeze(0).to(self.device)

        try:
            # TODO: a more recent SOTA model can be swapped in here for higher-quality faces
            output_img = self.model(cropped_face_t)
            output_img = output_img.data.squeeze().float().cpu().clamp_(0, 1).numpy()
            # RGB -> BGR and CHW -> HWC
            output_img = np.transpose(output_img[[2, 1, 0], :, :], (1, 2, 0))
            output_img = (output_img * 255.0).round().astype(np.uint8)
        except RuntimeError as error:
            print(f'\tFailed inference for GFPGAN: {error}.')
            # fall back to the unrestored crop, which is already uint8 BGR
            output_img = cropped_face

        # output_img is now uint8 data that cv2 can save
        self.face_helper.add_restored_face(output_img)

    # Handle the background with Real-ESRGAN; bg_upsampler is a natural-scene
    # super-resolution network. paste_back is always True here.
    if paste_back:
        if self.bg_upsampler is not None:
            # Now only support RealESRGAN
            bg_img = self.bg_upsampler.enhance(img, outscale=self.scale)[0]
        else:
            bg_img = None
        self.face_helper.get_inverse_affine(None)

        # Rotate each restored face crop back to its original angle (it was
        # aligned horizontally earlier and the transform was kept), then paste
        # it into the image. This is a fairly involved operation, roughly
        #     output = Mask * face + (1 - Mask) * background
        # To avoid a jigsaw look with visible seams around the pasted face,
        # the mask is eroded and blurred. The details are in
        # paste_faces_to_input_image at the bottom.
        restored_img = self.face_helper.paste_faces_to_input_image(upsample_img=bg_img)
        return self.face_helper.cropped_faces, self.face_helper.restored_faces, restored_img
    else:
        return self.face_helper.cropped_faces, self.face_helper.restored_faces, None
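
The layout conversions around the model call are easy to get wrong, so here is a minimal NumPy-only sketch of the same round trip (uint8 BGR HWC image to float RGB NCHW batch and back). It uses no torch and no real model; `to_model_input` and `to_bgr_uint8` are hypothetical helper names, and the "model" is just the identity.

```python
import numpy as np


def to_model_input(face_bgr_uint8):
    """HWC uint8 BGR image -> 1xCxHxW float32 RGB array in [0, 1]."""
    x = face_bgr_uint8.astype(np.float32) / 255.0
    x = x[:, :, ::-1]                           # BGR -> RGB
    x = np.transpose(x, (2, 0, 1))              # HWC -> CHW
    return np.ascontiguousarray(x[None, ...])   # add the batch dimension


def to_bgr_uint8(output):
    """1xCxHxW float RGB array -> HWC uint8 BGR image that OpenCV could save."""
    y = np.clip(np.squeeze(output, axis=0), 0.0, 1.0)
    y = np.transpose(y[[2, 1, 0], :, :], (1, 2, 0))  # RGB -> BGR, CHW -> HWC
    return (y * 255.0).round().astype(np.uint8)


# A random stand-in for one aligned 512*512 face crop.
rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)

x = to_model_input(face)
restored = to_bgr_uint8(x)  # identity "model": the output equals the input
```

With an identity model the round trip returns the original crop unchanged, which is a quick way to check that the channel order and axis order are handled consistently before plugging in a real network.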


# The main idea: rotate the face image back, build a mask that is eroded and
# blurred, then paste the face into the original image.
def paste_faces_to_input_image(self, save_path=None, upsample_img=None):
    h, w, _ = self.input_img.shape
    h_up, w_up = int(h * self.upscale_factor), int(w * self.upscale_factor)

    if upsample_img is None:
        # simply resize the background
        upsample_img = cv2.resize(self.input_img, (w_up, h_up), interpolation=cv2.INTER_LANCZOS4)
    else:
        upsample_img = cv2.resize(upsample_img, (w_up, h_up), interpolation=cv2.INTER_LANCZOS4)

    assert len(self.restored_faces) == len(
        self.inverse_affine_matrices), ('length of restored_faces and affine_matrices are different.')
    for restored_face, inverse_affine in zip(self.restored_faces, self.inverse_affine_matrices):
        # Add an offset to inverse affine matrix, for more precise back alignment
        if self.upscale_factor > 1:
            extra_offset = 0.5 * self.upscale_factor
        else:
            extra_offset = 0
        inverse_affine[:, 2] += extra_offset
        inv_restored = cv2.warpAffine(restored_face, inverse_affine, (w_up, h_up))
        mask = np.ones(self.face_size, dtype=np.float32)
        inv_mask = cv2.warpAffine(mask, inverse_affine, (w_up, h_up))
        # remove the black borders
        inv_mask_erosion = cv2.erode(
            inv_mask, np.ones((int(2 * self.upscale_factor), int(2 * self.upscale_factor)), np.uint8))
        pasted_face = inv_mask_erosion[:, :, None] * inv_restored
        total_face_area = np.sum(inv_mask_erosion)  # // 3
        # compute the fusion edge based on the area of face
        w_edge = int(total_face_area**0.5) // 20
        erosion_radius = w_edge * 2
        inv_mask_center = cv2.erode(inv_mask_erosion, np.ones((erosion_radius, erosion_radius), np.uint8))
        blur_size = w_edge * 2
        inv_soft_mask = cv2.GaussianBlur(inv_mask_center, (blur_size + 1, blur_size + 1), 0)
        if len(upsample_img.shape) == 2:  # upsample_img is gray image
            upsample_img = upsample_img[:, :, None]
        inv_soft_mask = inv_soft_mask[:, :, None]

        if len(upsample_img.shape) == 3 and upsample_img.shape[2] == 4:  # alpha channel
            alpha = upsample_img[:, :, 3:]
            upsample_img = inv_soft_mask * pasted_face + (1 - inv_soft_mask) * upsample_img[:, :, 0:3]
            upsample_img = np.concatenate((upsample_img, alpha), axis=2)
        else:
            upsample_img = inv_soft_mask * pasted_face + (1 - inv_soft_mask) * upsample_img

    if np.max(upsample_img) > 256:  # 16-bit image
        upsample_img = upsample_img.astype(np.uint16)
    else:
        upsample_img = upsample_img.astype(np.uint8)
    if save_path is not None:
        path = os.path.splitext(save_path)[0]
        save_path = f'{path}.{self.save_ext}'
        imwrite(upsample_img, save_path)
    return upsample_img
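
To get a feel for how wide the blended border is, the arithmetic from the method above can be pulled out on its own. This is only a sketch that repeats those three lines; `fusion_edge_sizes` is a hypothetical name, and the face area is taken to be a full, unrotated 512*512 crop at scale 1.

```python
def fusion_edge_sizes(face_area):
    """Repeat the edge-width arithmetic used when pasting a face back.

    face_area: number of mask pixels covered by the face (sum of the mask).
    Returns (w_edge, erosion_radius, blur_kernel_size).
    """
    w_edge = int(face_area ** 0.5) // 20
    erosion_radius = w_edge * 2
    blur_kernel = w_edge * 2 + 1  # always odd, as cv2.GaussianBlur requires
    return w_edge, erosion_radius, blur_kernel


sizes_512 = fusion_edge_sizes(512 * 512)
sizes_1024 = fusion_edge_sizes(1024 * 1024)
print(sizes_512)   # (25, 50, 51)
print(sizes_1024)  # (51, 102, 103)
```

So for a 512*512 face the mask is eroded with a 50-pixel kernel and then blurred with a 51*51 Gaussian, and both sizes grow in proportion to the side length of the face region.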

Comparison
Before face alignment:

After face alignment:
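
Alignment can be written as a 2x3 affine matrix (rotation, scale and translation), and pasting back needs its inverse. The sketch below shows that relationship in plain NumPy on five made-up landmark points; the matrix values and the helper names `invert_affine` and `apply_affine` are assumptions for illustration. OpenCV offers `cv2.invertAffineTransform` for the same inversion.

```python
import numpy as np


def invert_affine(m):
    """Invert a 2x3 affine matrix [A | t], giving [A^-1 | -A^-1 t]."""
    a, t = m[:, :2], m[:, 2]
    a_inv = np.linalg.inv(a)
    return np.hstack([a_inv, (-a_inv @ t)[:, None]])


def apply_affine(m, pts):
    """Apply a 2x3 affine matrix to an Nx2 array of (x, y) points."""
    return pts @ m[:, :2].T + m[:, 2]


# Hypothetical alignment: rotate by 15 degrees, scale by 1.8, then shift.
theta = np.deg2rad(15.0)
scale = 1.8
c, s = scale * np.cos(theta), scale * np.sin(theta)
align = np.array([[c, -s, 40.0],
                  [s, c, -25.0]])

# Five made-up landmarks (eyes, nose tip, mouth corners) in image coordinates.
landmarks = np.array([[120.0, 150.0], [180.0, 148.0], [150.0, 185.0],
                      [128.0, 215.0], [175.0, 213.0]])

aligned = apply_affine(align, landmarks)
recovered = apply_affine(invert_affine(align), aligned)
```

Applying the inverse to the aligned points returns the original landmark positions, which is why the alignment matrix has to be kept around until the restored face is pasted back.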

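Results:
Original image, with line structure in the background as well as a face.

Using only a natural-scene super-resolution network (Real-ESRGAN): the window lattice in the background is much sharper, but the face is still not great; for example, the teeth and hair lack detail.

Super-resolving the background and the face separately: the face is clearly improved, for example around the teeth and the facial contour, but it still looks somewhat unnatural. The face model used here was a SOTA method as of this year (2021); if a better algorithm comes along, I will come back and record it here.

Faces generated by a model I fine-tuned myself.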
