A while back we published a post about an artificial mask dataset, and many readers asked in the comments how the masks were actually put onto the faces in the photos. Today let's walk through exactly that, at the code level.
First, the prerequisites. This is the environment the project runs in:
python 3.7.4
numpy==1.20.2
Pillow==8.2.0
face_recognition==1.3.0
opencv_python==4.5.1.48
Next, let's pick a photo of Shun Oguri.
Load the image and get the bounding-box information for the face in it:
import face_recognition
face_path = "E:\\ImagesFavour\\xlx.jpg"
# Read the image in the default "RGB" format; returns a numpy array
face_image_np = face_recognition.load_image_file(face_path)
# Detect the face bounding boxes
face_locations = face_recognition.face_locations(face_image_np)
print(face_locations)
face_locations is a list of face bounding-box coordinates, one entry per detected face, each in (top, right, bottom, left) order. Here is the printed result:
[(96, 225, 225, 96)]
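Each entry comes back in (top, right, bottom, left) order, while OpenCV's drawing functions expect (x, y) points, so it is worth unpacking the tuple explicitly. A quick check using the box printed above:

```python
# face_recognition boxes are (top, right, bottom, left); OpenCV points are (x, y)
top, right, bottom, left = (96, 225, 225, 96)  # the box printed above
box_width = right - left
box_height = bottom - top
print(box_width, box_height)  # 129 129
```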
Let's draw the bounding box on the image:
import cv2
from PIL import Image
# Draw a 2D bounding box around each face in the image
for face_location in face_locations:
    top, right, bottom, left = face_location
    cv2.rectangle(face_image_np, (left, top), (right, bottom), (255, 0, 0), 3)
face_img = Image.fromarray(face_image_np)
face_img.show()
The image is shown below.
Next, get the facial landmark coordinates:
# Returns a list with one dict per face
face_landmarks = face_recognition.face_landmarks(face_image_np, face_locations)
print(face_landmarks[0].keys())
face_landmarks is a list with one dictionary per face. Let's see which facial features the dictionary contains:
dict_keys(['chin', 'left_eyebrow', 'right_eyebrow', 'nose_bridge', 'nose_tip', 'left_eye', 'right_eye', 'top_lip', 'bottom_lip'])
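These landmarks come from dlib's 68-point model, so (assuming that standard model) each feature has a fixed number of points, and the index arithmetic in the mask-fitting function further down picks out specific landmarks from these lists. A quick sketch of what those indices resolve to:

```python
# Point counts per feature in dlib's 68-point landmark model
chin_len = 17        # 'chin' runs ear to ear along the jawline
nose_bridge_len = 4  # 'nose_bridge' runs from between the eyes down the nose

# The mask-fitting code later selects these indices:
print(nose_bridge_len * 1 // 4)  # 1  -> a point high on the nose bridge
print(chin_len // 2)             # 8  -> the bottom of the chin
print(chin_len // 8)             # 2  -> the left side of the jaw
print(chin_len * 7 // 8)         # 14 -> the right side of the jaw
```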
To put a mask on the face we only need the "chin" and "nose_bridge" entries. Let's first see where these two sets of landmarks fall on the image:
for face_landmark in face_landmarks:
    # Draw the nose_bridge landmarks
    for point in face_landmark['nose_bridge']:
        cv2.circle(face_image_np, point, 2, (0, 0, 255), thickness=2)
    # Draw the chin landmarks
    for point in face_landmark['chin']:
        cv2.circle(face_image_np, point, 2, (0, 0, 255), thickness=2)
face_img = Image.fromarray(face_image_np)
face_img.show()
The result is below; the detection is quite accurate.
Now we can start putting the mask on. First, prepare a mask image.
Then we resize the mask and work out where it sits on the face; this is all handled in the mask_face function:
import numpy as np

def get_distance_from_point_to_line(point, line_point1, line_point2):
    # Standard point-to-line distance |a*x + b*y + c| / sqrt(a^2 + b^2),
    # with the line given by two points
    distance = np.abs((line_point2[1] - line_point1[1]) * point[0] +
                      (line_point1[0] - line_point2[0]) * point[1] +
                      (line_point2[0] - line_point1[0]) * line_point1[1] +
                      (line_point1[1] - line_point2[1]) * line_point1[0]) / \
               np.sqrt((line_point2[1] - line_point1[1]) * (line_point2[1] - line_point1[1]) +
                       (line_point1[0] - line_point2[0]) * (line_point1[0] - line_point2[0]))
    return int(distance)
def mask_face(face_landmark: dict, mask_img, face_img):
    nose_bridge = face_landmark['nose_bridge']
    # Use a point a quarter of the way down the nose bridge as the mask's top
    nose_point = nose_bridge[len(nose_bridge) * 1 // 4]
    nose_v = np.array(nose_point)
    chin = face_landmark['chin']
    chin_len = len(chin)
    chin_bottom_point = chin[chin_len // 2]
    chin_bottom_v = np.array(chin_bottom_point)
    chin_left_point = chin[chin_len // 8]
    chin_right_point = chin[chin_len * 7 // 8]
    # split mask and resize: each half is scaled to the distance between the
    # jawline on that side and the nose-to-chin centre line
    width = mask_img.width
    height = mask_img.height
    width_ratio = 1.2
    new_height = int(np.linalg.norm(nose_v - chin_bottom_v))
    # left half
    mask_left_img = mask_img.crop((0, 0, width // 2, height))
    mask_left_width = get_distance_from_point_to_line(chin_left_point, nose_point, chin_bottom_point)
    mask_left_width = int(mask_left_width * width_ratio)
    mask_left_img = mask_left_img.resize((mask_left_width, new_height))
    # right half
    mask_right_img = mask_img.crop((width // 2, 0, width, height))
    mask_right_width = get_distance_from_point_to_line(chin_right_point, nose_point, chin_bottom_point)
    mask_right_width = int(mask_right_width * width_ratio)
    mask_right_img = mask_right_img.resize((mask_right_width, new_height))
    # merge the two halves onto a white canvas
    size = (mask_left_img.width + mask_right_img.width, new_height)
    mask_img = Image.new('RGBA', size, (255, 255, 255))
    mask_img.paste(mask_left_img, (0, 0), mask_left_img)
    mask_img.paste(mask_right_img, (mask_left_img.width, 0), mask_right_img)
    # make the near-white background pixels fully transparent before rotating
    pixeldata = mask_img.load()
    for i in range(mask_img.width):
        for j in range(mask_img.height):
            if pixeldata[i, j][0] > 200 and pixeldata[i, j][1] > 200:
                pixeldata[i, j] = (255, 255, 255, 0)
    # rotate the mask so its vertical axis follows the nose-to-chin line;
    # np.arctan2 returns radians, while PIL's rotate() expects degrees
    angle = np.degrees(np.arctan2(chin_bottom_point[1] - nose_point[1],
                                  chin_bottom_point[0] - nose_point[0]))
    rotated_mask_img = mask_img.rotate(90 - angle, expand=True)
    # calculate mask location: centre it on the midpoint between nose and chin,
    # shifted sideways when the two halves have different widths
    center_x = (nose_point[0] + chin_bottom_point[0]) // 2
    center_y = (nose_point[1] + chin_bottom_point[1]) // 2
    offset = mask_img.width // 2 - mask_left_img.width
    radian = np.radians(angle)
    box_x = center_x + int(offset * np.cos(radian)) - rotated_mask_img.width // 2
    box_y = center_y + int(offset * np.sin(radian)) - rotated_mask_img.height // 2
    face_img.paste(rotated_mask_img, (box_x, box_y), rotated_mask_img)
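get_distance_from_point_to_line above is just the two-point form of the point-to-line distance formula, |a·x + b·y + c| / sqrt(a² + b²). Restating it compactly makes it easy to sanity-check against cases that can be verified by hand (point_line_distance here is a local rename for illustration, not part of the project):

```python
import numpy as np

def point_line_distance(point, p1, p2):
    # Same |a*x + b*y + c| / sqrt(a^2 + b^2) formula, line given by p1 and p2
    a = p2[1] - p1[1]
    b = p1[0] - p2[0]
    c = (p2[0] - p1[0]) * p1[1] + (p1[1] - p2[1]) * p1[0]
    return int(np.abs(a * point[0] + b * point[1] + c) / np.sqrt(a * a + b * b))

# Distance from (0, 5) to the x-axis (through (0, 0) and (10, 0)) is 5
print(point_line_distance((0, 5), (0, 0), (10, 0)))  # 5
# Distance from the origin to the line x + y = 2 is 2 / sqrt(2), truncated to 1
print(point_line_distance((0, 0), (2, 0), (0, 2)))   # 1
```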
Finally, let's perform the mask-wearing step:
mask_img = Image.open("E:\\ImagesFavour\\kzh.png").convert('RGBA')
# Put a mask on each detected face
for face_landmark in face_landmarks:
    mask_face(face_landmark, mask_img, face_img)
face_img.show()
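The trick that makes the composite look clean is Pillow's paste(im, box, mask) call: passing the RGBA mask image as its own mask means only its opaque pixels overwrite the face. A minimal self-contained illustration with synthetic images (the sizes and colours are arbitrary):

```python
from PIL import Image

# A red 10x10 "face" and a 4x4 sticker that is opaque blue on its left half
# and fully transparent on its right half
face = Image.new('RGB', (10, 10), (255, 0, 0))
sticker = Image.new('RGBA', (4, 4), (0, 0, 255, 255))
for j in range(4):
    for i in range(2, 4):
        sticker.putpixel((i, j), (0, 0, 255, 0))  # transparent right half

# Passing the RGBA image as its own mask makes paste() honour its alpha channel
face.paste(sticker, (3, 3), sticker)

print(face.getpixel((3, 3)))  # (0, 0, 255): covered by the opaque half
print(face.getpixel((6, 3)))  # (255, 0, 0): transparent half leaves the face untouched
```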
As you can see, the result looks quite good.
The project originates from Prajna Bhandary:
https://github.com/prajnasb/observations/tree/master/mask_classifier/Data_Generator