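First, a note: why is this section titled "AI Beauty Skin-Smoothing Algorithm, Part 1" rather than simply "AI Beauty Skin-Smoothing Algorithm"?
AI-based skin smoothing has no precise definition yet, and the major companies are all still exploring it. The split here simply reflects my own implementations: the algorithm in this article differs substantially from the one in the next article, "AI Beauty Skin-Smoothing Algorithm, Part 2", so I treat them separately.
Let us first look at the general pipeline of a skin-smoothing algorithm:
This flowchart shows the pipeline of a typical traditional skin-smoothing algorithm; this article builds on it and improves it with deep learning.
The pipeline has two main modules: the filtering module and the skin-region detection module.
The filtering module covers three kinds of algorithms: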
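1. Edge-preserving filter algorithms
These methods smooth the image with a filter that preserves edges, which makes the skin smooth.
Such filters mainly include:
① Bilateral filter
② Guided filter
③ Surface Blur filter
④ Local mean filter
⑤ Weighted least squares filter (WLS filter)
⑥ Smart Blur, and so on; see my blog for details.
Skin processed this way is quite smooth but has little detail, so detail has to be added back afterwards to keep some natural texture.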
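To make the idea of edge-preserving smoothing concrete, here is a minimal brute-force bilateral filter in NumPy. It is only a sketch: the function name `bilateral_filter`, the parameter values, and the toy image are assumptions chosen for illustration, not the implementation used in this article.

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_space=2.0, sigma_range=0.1):
    """Brute-force bilateral filter for a 2-D float image in [0, 1]."""
    img = np.asarray(img, dtype=np.float64)
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    acc = np.zeros_like(img)
    weight_sum = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbour image shifted by (dy, dx).
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            # Spatial weight: nearby pixels count more.
            w_space = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_space ** 2))
            # Range weight: pixels with a similar value count more,
            # which is what keeps strong edges from being averaged away.
            w_range = np.exp(-((shifted - img) ** 2) / (2.0 * sigma_range ** 2))
            weight = w_space * w_range
            acc += weight * shifted
            weight_sum += weight
    return acc / weight_sum

# Toy image: dark left half, bright right half, plus mild noise.
rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[:, 8:] = 1.0
noisy = np.clip(clean + rng.normal(0.0, 0.03, clean.shape), 0.0, 1.0)
smoothed = bilateral_filter(noisy)
```

The noise inside each half is averaged out, while the step between the two halves stays sharp, because pixels across the edge receive a range weight close to zero.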
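2. High-pass attenuation algorithm
This algorithm uses a high-pass (high-contrast) layer to obtain a MASK of skin detail. Guided by the detail regions in the MASK, such as the positions of spots on the skin, it lightens the corresponding regions of the original image, which weakens the spots and improves the look of the skin.
The method reduces the color of blemishes and spots while preserving texture, so the skin looks smooth and natural.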
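The high-pass idea can be sketched in a few lines of NumPy. This is one possible reading of the description above, not this article's actual code: the box blur, the `gain` and `strength` parameters, and the rule "lighten pixels that are darker than their neighborhood" are all assumptions made for illustration.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter over a (2*radius+1)^2 window with edge padding."""
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (2 * radius + 1) ** 2

def attenuate_blemishes(img, radius=3, gain=4.0, strength=0.8):
    """Lighten pixels that are darker than their surroundings."""
    img = np.asarray(img, dtype=np.float64)
    low = box_blur(img, radius)
    high = img - low                          # high-pass detail layer
    # Mask is close to 1 where a pixel is much darker than its neighbours.
    mask = np.clip(-high * gain, 0.0, 1.0)
    # Pull masked pixels toward the local mean; leave the rest untouched.
    return img + strength * mask * (low - img), mask

# Uniform "skin" patch with one dark spot in the middle.
skin = np.full((15, 15), 0.7)
skin[7, 7] = 0.3
result, mask = attenuate_blemishes(skin)
```

Only the dark spot is brightened; pixels around it keep their original values, which is why this family of methods retains texture better than plain smoothing.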
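3. Other algorithms
This covers algorithms not listed above. One known example is skin smoothing based on both edge-preserving filtering and the high-pass layer: it applies steps 1 and 2 to the original image to obtain a smooth filtered image and a detail MASK from the high-pass layer, then uses the MASK as the alpha channel to alpha-blend the original image with the filtered image. This smooths the skin, removes spots, and preserves texture at the same time.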
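Skin-region detection module
The most common skin detection today is a statistical skin-color method based on a color space.
This method has a high false-positive rate: it easily classifies skin-like colors as skin, so non-skin regions are smoothed by the filter as well. In other words, image regions that should not be smoothed end up blurred.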
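The false-positive problem is easy to reproduce. The sketch below uses a fixed box in YCbCr space; the conversion is the standard full-range BT.601 formula, while the thresholds (Cb in [77, 127], Cr in [133, 173]) are commonly quoted values that I assume here for illustration. They are not necessarily the ones used in any particular product, and the sample pixels are made up.

```python
import numpy as np

def skin_mask_ycbcr(rgb):
    """Classify pixels as skin with a fixed Cb/Cr box (assumed thresholds)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Full-range BT.601 RGB -> CbCr conversion.
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)

pixels = np.array([[[224, 172, 150],   # light skin tone
                    [181, 134, 84],    # brown, hair- or wood-like color
                    [30, 60, 200]]])   # blue background
print(skin_mask_ycbcr(pixels))
```

The brown pixel passes the test just like the skin pixel does, so a smoothing filter driven by this mask would also blur brown hair or a wooden background. Only the blue pixel is rejected.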
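Now for the key point: we use deep learning within the traditional pipeline to improve the quality of the smoothing. For example, segmenting the skin region with deep learning gives a more accurate skin region, so the final result surpasses that of the traditional algorithm.
Next, we introduce deep-learning-based skin-region segmentation: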
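There are many segmentation methods, such as CNN, FCN, UNet, and DenseNet; here we use UNet for skin segmentation.
For UNet-based image segmentation, see the paper "U-Net: Convolutional Networks for Biomedical Image Segmentation".
Its original network model is as follows:
It is a fully convolutional network: both input and output are images, and there are no fully connected layers. The shallower, high-resolution layers handle pixel localization, and the deeper layers handle pixel classification.
The left side performs convolution and downsampling while keeping each intermediate result. The right side, when upsampling, fuses each upsampled result with the corresponding result from the left side, which improves segmentation quality.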
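The fusion step can be illustrated with plain NumPy arrays. This is only a shape-level sketch: the tensor sizes are assumed for illustration, and `upsample2x` is a hypothetical helper that mimics nearest-neighbor upsampling by repeating rows and columns.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbor 2x upsampling of an (N, H, W, C) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

# Feature map kept from the left (encoder) side, high resolution.
encoder_feat = np.zeros((1, 16, 16, 512), dtype=np.float32)
# Deeper feature map from the right (decoder) side, low resolution.
decoder_feat = np.zeros((1, 8, 8, 1024), dtype=np.float32)

# Upsample the deep features, then stack both along the channel axis.
merged = np.concatenate([encoder_feat, upsample2x(decoder_feat)], axis=3)
print(merged.shape)
```

After the merge, every spatial position carries both the fine localization cues from the shallow layer and the semantic cues from the deep layer; the convolutions that follow learn how to combine them.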
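The original network is not symmetric between its left and right sides; later UNet variants are mostly symmetric in image resolution. Here I implement the network with Keras, and its structure is as follows: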
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 256, 256, 3) 0
__________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 256, 256, 32) 896 input_1[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 256, 256, 32) 128 conv2d_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation) (None, 256, 256, 32) 0 batch_normalization_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 256, 256, 32) 9248 activation_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 256, 256, 32) 128 conv2d_2[0][0]
__________________________________________________________________________________________________
activation_2 (Activation) (None, 256, 256, 32) 0 batch_normalization_2[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 128, 128, 32) 0 activation_2[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D) (None, 128, 128, 64) 18496 max_pooling2d_1[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 128, 128, 64) 256 conv2d_3[0][0]
__________________________________________________________________________________________________
activation_3 (Activation) (None, 128, 128, 64) 0 batch_normalization_3[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D) (None, 128, 128, 64) 36928 activation_3[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 128, 128, 64) 256 conv2d_4[0][0]
__________________________________________________________________________________________________
activation_4 (Activation) (None, 128, 128, 64) 0 batch_normalization_4[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 64, 64, 64) 0 activation_4[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D) (None, 64, 64, 128) 73856 max_pooling2d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 64, 64, 128) 512 conv2d_5[0][0]
__________________________________________________________________________________________________
activation_5 (Activation) (None, 64, 64, 128) 0 batch_normalization_5[0][0]
__________________________________________________________________________________________________
conv2d_6 (Conv2D) (None, 64, 64, 128) 147584 activation_5[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 64, 64, 128) 512 conv2d_6[0][0]
__________________________________________________________________________________________________
activation_6 (Activation) (None, 64, 64, 128) 0 batch_normalization_6[0][0]
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D) (None, 32, 32, 128) 0 activation_6[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D) (None, 32, 32, 256) 295168 max_pooling2d_3[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 32, 32, 256) 1024 conv2d_7[0][0]
__________________________________________________________________________________________________
activation_7 (Activation) (None, 32, 32, 256) 0 batch_normalization_7[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D) (None, 32, 32, 256) 590080 activation_7[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 32, 32, 256) 1024 conv2d_8[0][0]
__________________________________________________________________________________________________
activation_8 (Activation) (None, 32, 32, 256) 0 batch_normalization_8[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D) (None, 16, 16, 256) 0 activation_8[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D) (None, 16, 16, 512) 1180160 max_pooling2d_4[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 16, 16, 512) 2048 conv2d_9[0][0]
__________________________________________________________________________________________________
activation_9 (Activation) (None, 16, 16, 512) 0 batch_normalization_9[0][0]
__________________________________________________________________________________________________
conv2d_10 (Conv2D) (None, 16, 16, 512) 2359808 activation_9[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 16, 16, 512) 2048 conv2d_10[0][0]
__________________________________________________________________________________________________
activation_10 (Activation) (None, 16, 16, 512) 0 batch_normalization_10[0][0]
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D) (None, 8, 8, 512) 0 activation_10[0][0]
__________________________________________________________________________________________________
conv2d_11 (Conv2D) (None, 8, 8, 1024) 4719616 max_pooling2d_5[0][0]
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 8, 8, 1024) 4096 conv2d_11[0][0]
__________________________________________________________________________________________________
activation_11 (Activation) (None, 8, 8, 1024) 0 batch_normalization_11[0][0]
__________________________________________________________________________________________________
conv2d_12 (Conv2D) (None, 8, 8, 1024) 9438208 activation_11[0][0]
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 8, 8, 1024) 4096 conv2d_12[0][0]
__________________________________________________________________________________________________
activation_12 (Activation) (None, 8, 8, 1024) 0 batch_normalization_12[0][0]
__________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D) (None, 16, 16, 1024) 0 activation_12[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 16, 16, 1536) 0 activation_10[0][0]
up_sampling2d_1[0][0]
__________________________________________________________________________________________________
conv2d_13 (Conv2D) (None, 16, 16, 512) 7078400 concatenate_1[0][0]
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 16, 16, 512) 2048 conv2d_13[0][0]
__________________________________________________________________________________________________
activation_13 (Activation) (None, 16, 16, 512) 0 batch_normalization_13[0][0]
__________________________________________________________________________________________________
conv2d_14 (Conv2D) (None, 16, 16, 512) 2359808 activation_13[0][0]
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 16, 16, 512) 2048 conv2d_14[0][0]
__________________________________________________________________________________________________
activation_14 (Activation) (None, 16, 16, 512) 0 batch_normalization_14[0][0]
__________________________________________________________________________________________________
conv2d_15 (Conv2D) (None, 16, 16, 512) 2359808 activation_14[0][0]
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 16, 16, 512) 2048 conv2d_15[0][0]
__________________________________________________________________________________________________
activation_15 (Activation) (None, 16, 16, 512) 0 batch_normalization_15[0][0]
__________________________________________________________________________________________________
up_sampling2d_2 (UpSampling2D) (None, 32, 32, 512) 0 activation_15[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate) (None, 32, 32, 768) 0 activation_8[0][0]
up_sampling2d_2[0][0]
__________________________________________________________________________________________________
conv2d_16 (Conv2D) (None, 32, 32, 256) 1769728 concatenate_2[0][0]
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 32, 32, 256) 1024 conv2d_16[0][0]
__________________________________________________________________________________________________
activation_16 (Activation) (None, 32, 32, 256) 0 batch_normalization_16[0][0]
__________________________________________________________________________________________________
conv2d_17 (Conv2D) (None, 32, 32, 256) 590080 activation_16[0][0]
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 32, 32, 256) 1024 conv2d_17[0][0]
__________________________________________________________________________________________________
activation_17 (Activation) (None, 32, 32, 256) 0 batch_normalization_17[0][0]
__________________________________________________________________________________________________
conv2d_18 (Conv2D) (None, 32, 32, 256) 590080 activation_17[0][0]
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 32, 32, 256) 1024 conv2d_18[0][0]
__________________________________________________________________________________________________
activation_18 (Activation) (None, 32, 32, 256) 0 batch_normalization_18[0][0]
__________________________________________________________________________________________________
up_sampling2d_3 (UpSampling2D) (None, 64, 64, 256) 0 activation_18[0][0]
__________________________________________________________________________________________________
concatenate_3 (Concatenate) (None, 64, 64, 384) 0 activation_6[0][0]
up_sampling2d_3[0][0]
__________________________________________________________________________________________________
conv2d_19 (Conv2D) (None, 64, 64, 128) 442496 concatenate_3[0][0]
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 64, 64, 128) 512 conv2d_19[0][0]
__________________________________________________________________________________________________
activation_19 (Activation) (None, 64, 64, 128) 0 batch_normalization_19[0][0]
__________________________________________________________________________________________________
conv2d_20 (Conv2D) (None, 64, 64, 128) 147584 activation_19[0][0]
__________________________________________________________________________________________________
batch_normalization_20 (BatchNo (None, 64, 64, 128) 512 conv2d_20[0][0]
__________________________________________________________________________________________________
activation_20 (Activation) (None, 64, 64, 128) 0 batch_normalization_20[0][0]
__________________________________________________________________________________________________
conv2d_21 (Conv2D) (None, 64, 64, 128) 147584 activation_20[0][0]
__________________________________________________________________________________________________
batch_normalization_21 (BatchNo (None, 64, 64, 128) 512 conv2d_21[0][0]
__________________________________________________________________________________________________
activation_21 (Activation) (None, 64, 64, 128) 0 batch_normalization_21[0][0]
__________________________________________________________________________________________________
up_sampling2d_4 (UpSampling2D) (None, 128, 128, 128) 0 activation_21[0][0]
__________________________________________________________________________________________________
concatenate_4 (Concatenate) (None, 128, 128, 192) 0 activation_4[0][0]
up_sampling2d_4[0][0]
__________________________________________________________________________________________________
conv2d_22 (Conv2D) (None, 128, 128, 64) 110656 concatenate_4[0][0]
__________________________________________________________________________________________________
batch_normalization_22 (BatchNo (None, 128, 128, 64) 256 conv2d_22[0][0]
__________________________________________________________________________________________________
activation_22 (Activation) (None, 128, 128, 64) 0 batch_normalization_22[0][0]
__________________________________________________________________________________________________
conv2d_23 (Conv2D) (None, 128, 128, 64) 36928 activation_22[0][0]
__________________________________________________________________________________________________
batch_normalization_23 (BatchNo (None, 128, 128, 64) 256 conv2d_23[0][0]
__________________________________________________________________________________________________
activation_23 (Activation) (None, 128, 128, 64) 0 batch_normalization_23[0][0]
__________________________________________________________________________________________________
conv2d_24 (Conv2D) (None, 128, 128, 64) 36928 activation_23[0][0]
__________________________________________________________________________________________________
batch_normalization_24 (BatchNo (None, 128, 128, 64) 256 conv2d_24[0][0]
__________________________________________________________________________________________________
activation_24 (Activation) (None, 128, 128, 64) 0 batch_normalization_24[0][0]
__________________________________________________________________________________________________
up_sampling2d_5 (UpSampling2D) (None, 256, 256, 64) 0 activation_24[0][0]
__________________________________________________________________________________________________
concatenate_5 (Concatenate) (None, 256, 256, 96) 0 activation_2[0][0]
up_sampling2d_5[0][0]
__________________________________________________________________________________________________
conv2d_25 (Conv2D) (None, 256, 256, 32) 27680 concatenate_5[0][0]
__________________________________________________________________________________________________
batch_normalization_25 (BatchNo (None, 256, 256, 32) 128 conv2d_25[0][0]
__________________________________________________________________________________________________
activation_25 (Activation) (None, 256, 256, 32) 0 batch_normalization_25[0][0]
__________________________________________________________________________________________________
conv2d_26 (Conv2D) (None, 256, 256, 32) 9248 activation_25[0][0]
__________________________________________________________________________________________________
batch_normalization_26 (BatchNo (None, 256, 256, 32) 128 conv2d_26[0][0]
__________________________________________________________________________________________________
activation_26 (Activation) (None, 256, 256, 32) 0 batch_normalization_26[0][0]
__________________________________________________________________________________________________
conv2d_27 (Conv2D) (None, 256, 256, 32) 9248 activation_26[0][0]
__________________________________________________________________________________________________
batch_normalization_27 (BatchNo (None, 256, 256, 32) 128 conv2d_27[0][0]
__________________________________________________________________________________________________
activation_27 (Activation) (None, 256, 256, 32) 0 batch_normalization_27[0][0]
__________________________________________________________________________________________________
conv2d_28 (Conv2D) (None, 256, 256, 1) 33 activation_27[0][0]
==================================================================================================
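The Param # column in the summary can be checked by hand. A convolution has one weight per kernel position, input channel, and output channel, plus one bias per output channel; Keras reports four values per channel for batch normalization (gamma, beta, moving mean, moving variance). The helper names below are my own, used only for this check.

```python
def conv2d_params(kernel, c_in, c_out):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return (kernel * kernel * c_in + 1) * c_out

def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance per channel."""
    return 4 * channels

print(conv2d_params(3, 3, 32))       # conv2d_1: RGB input -> 32 filters
print(conv2d_params(3, 1536, 512))   # conv2d_13: after concatenate_1
print(conv2d_params(1, 32, 1))       # conv2d_28: final 1x1 classifier
print(batchnorm_params(32))          # batch_normalization_1
```

These reproduce the 896, 7078400, 33, and 128 shown in the table. The large count for conv2d_13 comes from the 1536 input channels produced by the concatenation.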
The UNet network code is as follows:
from keras.models import Model
from keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                          MaxPooling2D, UpSampling2D, concatenate)


def get_unet_256(input_shape=(256, 256, 3),
                 num_classes=1):
    """Build a UNet for 256x256 inputs; the "# N" comments mark the resolution reached."""
    inputs = Input(shape=input_shape)
    # 256

    # Encoder: two conv-BN-ReLU blocks, then 2x2 max pooling, at each level.
    down0 = Conv2D(32, (3, 3), padding='same')(inputs)
    down0 = BatchNormalization()(down0)
    down0 = Activation('relu')(down0)
    down0 = Conv2D(32, (3, 3), padding='same')(down0)
    down0 = BatchNormalization()(down0)
    down0 = Activation('relu')(down0)
    down0_pool = MaxPooling2D((2, 2), strides=(2, 2))(down0)
    # 128

    down1 = Conv2D(64, (3, 3), padding='same')(down0_pool)
    down1 = BatchNormalization()(down1)
    down1 = Activation('relu')(down1)
    down1 = Conv2D(64, (3, 3), padding='same')(down1)
    down1 = BatchNormalization()(down1)
    down1 = Activation('relu')(down1)
    down1_pool = MaxPooling2D((2, 2), strides=(2, 2))(down1)
    # 64

    down2 = Conv2D(128, (3, 3), padding='same')(down1_pool)
    down2 = BatchNormalization()(down2)
    down2 = Activation('relu')(down2)
    down2 = Conv2D(128, (3, 3), padding='same')(down2)
    down2 = BatchNormalization()(down2)
    down2 = Activation('relu')(down2)
    down2_pool = MaxPooling2D((2, 2), strides=(2, 2))(down2)
    # 32

    down3 = Conv2D(256, (3, 3), padding='same')(down2_pool)
    down3 = BatchNormalization()(down3)
    down3 = Activation('relu')(down3)
    down3 = Conv2D(256, (3, 3), padding='same')(down3)
    down3 = BatchNormalization()(down3)
    down3 = Activation('relu')(down3)
    down3_pool = MaxPooling2D((2, 2), strides=(2, 2))(down3)
    # 16

    down4 = Conv2D(512, (3, 3), padding='same')(down3_pool)
    down4 = BatchNormalization()(down4)
    down4 = Activation('relu')(down4)
    down4 = Conv2D(512, (3, 3), padding='same')(down4)
    down4 = BatchNormalization()(down4)
    down4 = Activation('relu')(down4)
    down4_pool = MaxPooling2D((2, 2), strides=(2, 2))(down4)
    # 8

    center = Conv2D(1024, (3, 3), padding='same')(down4_pool)
    center = BatchNormalization()(center)
    center = Activation('relu')(center)
    center = Conv2D(1024, (3, 3), padding='same')(center)
    center = BatchNormalization()(center)
    center = Activation('relu')(center)
    # center

    # Decoder: upsample, concatenate with the matching encoder output,
    # then three conv-BN-ReLU blocks at each level.
    up4 = UpSampling2D((2, 2))(center)
    up4 = concatenate([down4, up4], axis=3)
    up4 = Conv2D(512, (3, 3), padding='same')(up4)
    up4 = BatchNormalization()(up4)
    up4 = Activation('relu')(up4)
    up4 = Conv2D(512, (3, 3), padding='same')(up4)
    up4 = BatchNormalization()(up4)
    up4 = Activation('relu')(up4)
    up4 = Conv2D(512, (3, 3), padding='same')(up4)
    up4 = BatchNormalization()(up4)
    up4 = Activation('relu')(up4)
    # 16

    up3 = UpSampling2D((2, 2))(up4)
    up3 = concatenate([down3, up3], axis=3)
    up3 = Conv2D(256, (3, 3), padding='same')(up3)
    up3 = BatchNormalization()(up3)
    up3 = Activation('relu')(up3)
    up3 = Conv2D(256, (3, 3), padding='same')(up3)
    up3 = BatchNormalization()(up3)
    up3 = Activation('relu')(up3)
    up3 = Conv2D(256, (3, 3), padding='same')(up3)
    up3 = BatchNormalization()(up3)
    up3 = Activation('relu')(up3)
    # 32

    up2 = UpSampling2D((2, 2))(up3)
    up2 = concatenate([down2, up2], axis=3)
    up2 = Conv2D(128, (3, 3), padding='same')(up2)
    up2 = BatchNormalization()(up2)
    up2 = Activation('relu')(up2)
    up2 = Conv2D(128, (3, 3), padding='same')(up2)
    up2 = BatchNormalization()(up2)
    up2 = Activation('relu')(up2)
    up2 = Conv2D(128, (3, 3), padding='same')(up2)
    up2 = BatchNormalization()(up2)
    up2 = Activation('relu')(up2)
    # 64

    up1 = UpSampling2D((2, 2))(up2)
    up1 = concatenate([down1, up1], axis=3)
    up1 = Conv2D(64, (3, 3), padding='same')(up1)
    up1 = BatchNormalization()(up1)
    up1 = Activation('relu')(up1)
    up1 = Conv2D(64, (3, 3), padding='same')(up1)
    up1 = BatchNormalization()(up1)
    up1 = Activation('relu')(up1)
    up1 = Conv2D(64, (3, 3), padding='same')(up1)
    up1 = BatchNormalization()(up1)
    up1 = Activation('relu')(up1)
    # 128

    up0 = UpSampling2D((2, 2))(up1)
    up0 = concatenate([down0, up0], axis=3)
    up0 = Conv2D(32, (3, 3), padding='same')(up0)
    up0 = BatchNormalization()(up0)
    up0 = Activation('relu')(up0)
    up0 = Conv2D(32, (3, 3), padding='same')(up0)
    up0 = BatchNormalization()(up0)
    up0 = Activation('relu')(up0)
    up0 = Conv2D(32, (3, 3), padding='same')(up0)
    up0 = BatchNormalization()(up0)
    up0 = Activation('relu')(up0)
    # 256

    # 1x1 convolution with sigmoid: per-pixel skin probability.
    classify = Conv2D(num_classes, (1, 1), activation='sigmoid')(up0)
    model = Model(inputs=inputs, outputs=classify)
    #model.compile(optimizer=RMSprop(lr=0.0001), loss=bce_dice_loss, metrics=[dice_coeff])
    return model
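The input is a 256×256×3 color image and the output is a 256×256×1 MASK. The training settings are as follows:
# image_train: (N, 256, 256, 3) images; label_train: (N, 256, 256, 1) masks, prepared beforehand.
model = get_unet_256()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(image_train, label_train, epochs=100, verbose=1, validation_split=0.2, shuffle=True, batch_size=8)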
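Pixel accuracy is not the only way to judge a predicted mask. The commented-out compile line in the network code refers to a `dice_coeff` metric whose definition is not shown; the NumPy sketch below is my assumption of a typical smoothed Dice coefficient, not the author's implementation, and the toy masks are made up.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Smoothed Dice overlap between two masks with values in [0, 1]."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (
        np.sum(y_true) + np.sum(y_pred) + smooth)

# Ground truth covers 4 pixels; the prediction covers those 4 plus 2 extra.
truth = np.zeros((4, 4))
truth[1:3, 1:3] = 1.0
pred = np.zeros((4, 4))
pred[1:3, 1:4] = 1.0

print(dice_coefficient(truth, pred))
print(dice_coefficient(truth, truth))
```

Dice only rewards overlap with the labeled skin pixels, so it is less flattered by large background areas than plain accuracy is. The `smooth` term keeps the ratio defined when both masks are empty.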
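The results are shown below:
In my training set, the samples are labeled with the whole face region as skin, so the facial features are not excluded. To get a skin region without the facial features, you only need to swap in samples labeled accordingly.
With an accurate skin region in hand, we can update the skin-smoothing algorithm. Here is a set of result images: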
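One way to use the segmentation result is to treat the predicted skin probability as a blending weight between the original image and its smoothed version. The sketch below assumes this compositing scheme; the `feather` helper, its radius, and the toy data are illustrative choices, and the actual pipeline behind the result images may differ.

```python
import numpy as np

def feather(mask, radius=1):
    """Soften a 2-D mask with a box blur so the blend has no hard seam."""
    padded = np.pad(mask, radius, mode="edge")
    h, w = mask.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (2 * radius + 1) ** 2

def smooth_skin_only(image, smoothed, skin_prob, radius=1):
    """Use the smoothed pixels on skin and the original pixels elsewhere."""
    image = np.asarray(image, dtype=np.float64)
    smoothed = np.asarray(smoothed, dtype=np.float64)
    alpha = feather(np.clip(skin_prob, 0.0, 1.0), radius)[..., None]
    return alpha * smoothed + (1.0 - alpha) * image

# Toy data: left half is "skin", right half is "hair".
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
smoothed = np.full((8, 8, 3), 0.5)   # stand-in for any edge-preserving filter
skin_prob = np.zeros((8, 8))
skin_prob[:, :4] = 1.0               # stand-in for the UNet output
result = smooth_skin_only(image, smoothed, skin_prob)
```

Pixels well inside the skin region take the smoothed values, pixels well outside keep the original texture, and the feathered band in between avoids a visible seam.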

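As you can see, the traditional color-space-based algorithm cannot reliably separate skin from skin-like regions, so it also smooths the hair and loses the hair's texture detail. The algorithm based on UNet skin segmentation separates skin from skin-like regions such as hair well, so the hair's texture detail is preserved: it smooths where it should and leaves the rest alone, and the result is clearly better than the traditional method.
Mainstream players such as 美图秀秀 (Meitu) and 天天P图 have also adopted skin segmentation based on deep learning to improve their smoothing results. This article is a brief introduction to help you understand the approach.
Of course, using deep learning to improve a traditional method is only one pattern, which is why this article is titled "AI Beauty Skin-Smoothing Algorithm, Part 1". In Part 2, I will drop the traditional method entirely and implement skin smoothing purely with deep learning.
Finally, my training samples come from the LFW dataset available online; it is easy to find with a search. If you need a precisely labeled sample set that excludes the facial features, you are better off labeling it yourself. My QQ: 1358009172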