Main applications:
1. Filtering, e.g. L0 smoothing;
2. Contrast enhancement;
3. Style transfer;
4. Dehazing of foggy images;
5. Pencil drawing.
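Network architecture
The network has 9 convolutional layers, all with stride 1. Layers 1 to 8 use 3×3 kernels. The dilation rate doubles across layers 1 to 7 (1, 2, 4, 8, 16, 32, 64), and layers 8 and 9 use a dilation rate of 1. In the code below, layer 9 is a 1×1 convolution without an activation function. The feature maps of layer s are computed as:

$$L_i^s = \Phi\left(\Psi^s\left(b_i^s + \sum_j L_j^{s-1} *_{r_s} K_{i,j}^s\right)\right)$$

Here $L^s$ and $L^{s-1}$ are the feature maps of layers s and s-1, $K_{i,j}^s$ are the kernel weights, $b_i^s$ are the biases, and $*_{r_s}$ denotes dilated convolution with dilation rate $r_s$. $\Phi$ is the activation function and $\Psi^s$ the normalization operator, both described below. The purpose of dilated convolution is to enlarge the receptive field. At image coordinate x, the convolution value is the sum (not the mean) of the products $L_j^{s-1}(a)\,K_{i,j}^s(b)$ over all pairs (a, b) that satisfy $a + r_s b = x$:

$$\left(L_j^{s-1} *_{r_s} K_{i,j}^s\right)(x) = \sum_{a + r_s b = x} L_j^{s-1}(a)\, K_{i,j}^s(b)$$

A 3×3 kernel with dilation rate $r_s$ therefore samples nine points spaced $r_s$ pixels apart, so each feature depends on a $(2r_s+1)\times(2r_s+1)$ window rather than only on its immediate 3×3 neighbourhood. For an explanation of how dilated convolution works, see: http://blog.csdn.net/u011961856/article/details/77141761 .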
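To make the index condition a + r·b = x concrete, here is a minimal 1-D NumPy sketch of dilated convolution. The function name `dilated_conv1d` and the zero padding at the borders are assumptions made for illustration; the network itself uses 2-D convolutions through `slim.conv2d`.

```python
import numpy as np

def dilated_conv1d(signal, kernel, rate):
    """1-D dilated convolution: out[x] = sum of signal[a] * kernel[b] over a + rate*b == x.

    Kernel offsets b are centred on zero (e.g. -1, 0, 1 for a 3-tap kernel).
    Positions outside the signal are treated as zero (assumed padding).
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    half = len(kernel) // 2
    out = np.zeros_like(signal)
    for x in range(len(signal)):
        for k, weight in enumerate(kernel):
            b = k - half          # centred kernel offset
            a = x - rate * b      # guarantees a + rate*b == x
            if 0 <= a < len(signal):
                out[x] += signal[a] * weight
    return out

# An impulse shows which positions a 3-tap kernel reaches with rate 2.
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
print(dilated_conv1d(impulse, [1, 2, 3], rate=2))
```

With rate 2 the three kernel taps land two samples apart, so a 3-tap kernel spans 5 input positions without adding any parameters.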
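Every convolutional layer except the last uses the leaky ReLU (lrelu) activation function. Adaptive batch normalization is used, defined as:

$$\Psi^s(x) = \lambda_s x + \mu_s BN(x)$$

where $\lambda_s$ and $\mu_s$ are learnable parameters and $BN(x)$ is batch normalization.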
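The activation and the adaptive normalization can be sketched in NumPy as follows. This is an illustrative sketch, not the `lrelu` and `nm` helpers used by the network code: the leaky slope `alpha=0.2`, the `eps` value, the NHWC layout, and the use of plain batch statistics (no moving averages, no learned scale or offset inside BN) are all assumptions.

```python
import numpy as np

def lrelu(x, alpha=0.2):
    # Leaky ReLU: max(alpha * x, x). alpha=0.2 is an assumed slope.
    return np.maximum(alpha * x, x)

def batch_norm(x, eps=1e-5):
    # Normalize each channel (last axis) over the batch and spatial axes.
    # Assumes NHWC layout; uses the statistics of the current batch only.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def adaptive_norm(x, lam, mu):
    # Psi(x) = lam * x + mu * BN(x), with lam and mu learned scalars.
    return lam * x + mu * batch_norm(x)

x = np.random.default_rng(0).normal(size=(2, 4, 4, 3))
y = lrelu(adaptive_norm(x, lam=0.5, mu=0.5))
print(y.shape)
```

With lam = 1 and mu = 0 the operator reduces to the identity, and with lam = 0 and mu = 1 it is plain batch normalization, so the network can learn how much normalization each layer applies.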
The network is implemented as follows:
import tensorflow.contrib.slim as slim  # assumes TensorFlow 1.x

# lrelu, nm (adaptive normalization) and identity_initializer are helper
# functions assumed to be defined elsewhere in the script.
def build(input):
    # All layers use stride 1; `rate` is the dilation rate.
    net = slim.conv2d(input, 24, [3, 3], rate=1, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv1')
    net = slim.conv2d(net, 24, [3, 3], rate=2, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv2')
    net = slim.conv2d(net, 24, [3, 3], rate=4, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv3')
    net = slim.conv2d(net, 24, [3, 3], rate=8, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv4')
    net = slim.conv2d(net, 24, [3, 3], rate=16, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv5')
    net = slim.conv2d(net, 24, [3, 3], rate=32, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv6')
    net = slim.conv2d(net, 24, [3, 3], rate=64, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv7')
    # Disabled layer with rate=128:
    # net = slim.conv2d(net, 24, [3, 3], rate=128, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv8')
    net = slim.conv2d(net, 24, [3, 3], rate=1, activation_fn=lrelu, normalizer_fn=nm, weights_initializer=identity_initializer(), scope='g_conv9')
    # Final 1x1 linear layer maps the 24 feature channels to 3 output channels.
    net = slim.conv2d(net, 3, [1, 1], rate=1, activation_fn=None, scope='g_conv_last')
    return net
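The dilation schedule in `build` determines how much of the input each output pixel can see. The sketch below computes the receptive field for stride-1 layers, where a k×k kernel with dilation rate r adds (k-1)·r pixels. The helper name `receptive_field` is hypothetical and used only for this illustration.

```python
def receptive_field(layers):
    """Receptive field (in pixels, per axis) of stacked stride-1 convolutions.

    layers: iterable of (kernel_size, dilation_rate) pairs.
    """
    rf = 1
    for kernel_size, rate in layers:
        rf += (kernel_size - 1) * rate
    return rf

# Active layers of build(): seven dilated 3x3 layers, one undilated 3x3
# layer (g_conv9), and the final 1x1 layer (g_conv_last).
rates = [1, 2, 4, 8, 16, 32, 64, 1]
layers = [(3, r) for r in rates] + [(1, 1)]
print(receptive_field(layers))  # 257
```

Eight plain 3×3 layers without dilation would only reach 17×17 pixels, which is why the exponentially growing rates matter for operators that need wide image context.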
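The output has the same size as the input. The loss function is the L2 loss:

$$\ell(K, B) = \sum_i \frac{1}{N_i} \left\lVert \hat{f}(I_i; K, B) - f(I_i) \right\rVert^2$$

where $I_i$ is a training image, $f(I_i)$ is the target output, $\hat{f}(I_i; K, B)$ is the network output for weights K and biases B, and $N_i$ is the number of pixels in $I_i$.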
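As a rough illustration of this loss, the NumPy sketch below sums the squared error of each output image against its target and divides by that image's pixel count. The per-image normalization by H×W and the function name `l2_loss` are assumptions of this sketch; the actual training script may reduce the error differently.

```python
import numpy as np

def l2_loss(outputs, targets):
    """Sum over images of the squared error divided by the pixel count.

    outputs, targets: sequences of arrays shaped (H, W, C). Images may have
    different resolutions, which is why each one is normalized separately.
    """
    total = 0.0
    for out, tgt in zip(outputs, targets):
        diff = np.asarray(out, dtype=float) - np.asarray(tgt, dtype=float)
        num_pixels = diff.shape[0] * diff.shape[1]
        total += np.sum(diff ** 2) / num_pixels
    return total

outputs = [np.ones((2, 2, 3)), np.zeros((4, 3, 3))]
targets = [np.zeros((2, 2, 3)), np.zeros((4, 3, 3))]
print(l2_loss(outputs, targets))  # 3.0
```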
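Training:
The input image size is not fixed, and the output has the same size as the input. So that the model can handle images of different resolutions, each input image is randomly resized to a different size during training.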
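One way to sketch this random resizing is shown below. The size range (`min_side`, `max_side`), the nearest-neighbour interpolation, and both function names are hypothetical choices for illustration; the source does not state which range or interpolation is used. The input and its target are resized together so that they stay aligned.

```python
import numpy as np

def resize_nearest(img, new_h, new_w):
    # Nearest-neighbour resize using plain NumPy indexing.
    rows = (np.arange(new_h) * img.shape[0] / new_h).astype(int)
    cols = (np.arange(new_w) * img.shape[1] / new_w).astype(int)
    return img[rows][:, cols]

def random_resize_pair(inp, target, rng, min_side=256, max_side=512):
    """Resize an (input, target) pair to the same random resolution.

    The shorter side is drawn uniformly from [min_side, max_side] (assumed
    range) and the aspect ratio is preserved.
    """
    h, w = inp.shape[:2]
    short = int(rng.integers(min_side, max_side + 1))
    scale = short / min(h, w)
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    return resize_nearest(inp, new_h, new_w), resize_nearest(target, new_h, new_w)

rng = np.random.default_rng(0)
inp = np.zeros((100, 200, 3))
target = np.ones((100, 200, 3))
a, b = random_resize_pair(inp, target, rng)
print(a.shape, b.shape)
```

Because the network is fully convolutional, no layer depends on a fixed spatial size, so batches of differing resolution can be fed one image at a time.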
The parameters are updated with the Adam optimizer, with learning rate lr=0.0001.
The model size is 1.1M.
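For reference, a single Adam update with the learning rate given above can be sketched in NumPy as follows. The values of `beta1`, `beta2` and `eps` are the commonly used Adam defaults and are assumptions here, since the text only specifies the learning rate; in the actual code the update would be performed by the framework's optimizer.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. t is the 1-based step count; m and v are running moments."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Minimize a toy quadratic (w - 3)^2 for a few steps.
w = np.array([0.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 101):
    grad = 2 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t)
print(w)
```

Each step moves a parameter by roughly lr regardless of the gradient's magnitude, so with lr=0.0001 the weights change slowly and steadily.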
Testing
Set is_training to False in demo.py, then run python demo.py.
To display the output image with OpenCV, append the following lines to the end of the code:
cv2.imshow('output',np.uint8(output_image[0,:,:,:]))
cv2.waitKey(0)