Paper walkthrough: "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network" (SRGAN)

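1 Introduction

In this paper, the authors present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). They describe it as the first framework capable of inferring photo-realistic natural images at a 4x upscaling factor. The paper proposes a perceptual loss function made up of an adversarial loss and a content loss, where the content loss is driven by perceptual similarity rather than similarity in pixel space. An extensive mean opinion score (MOS) test shows that SRGAN significantly improves perceptual quality: the MOS scores obtained with SRGAN are closer to those of the original high-resolution images than the scores obtained with any of the state-of-the-art methods it was compared against.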

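2 Contributions

• A new state of the art for image SR at a high upscaling factor (4x), as measured by PSNR and structural similarity (SSIM), set by SRResNet, the paper's MSE-optimized deep residual network.
• SRGAN, a GAN-based network optimized for a new perceptual loss: the MSE-based content loss is replaced with a loss computed on feature maps of the VGG network, which are more invariant to changes in pixel space.
• An extensive mean opinion score (MOS) test on images from three public benchmark datasets, which confirms the superior performance of SRGAN.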
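
Since PSNR appears as one of the headline metrics, here is a minimal sketch of how it is usually computed from the pixel-wise MSE. The function names, the 8-bit peak value of 255, and the tiny 2x2 "images" are my own illustrative assumptions, not anything taken from the paper's code.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count

def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)

# Hypothetical pixel values, for illustration only.
hr = [[52, 55], [61, 59]]
sr = [[54, 55], [60, 62]]
print(mse(hr, sr), psnr(hr, sr))
```

A higher PSNR only means a lower pixel-wise error. It does not by itself say the image looks more realistic, which is the gap the perceptual loss is meant to address.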

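3 Loss function

The perceptual loss has two parts, a content loss and an adversarial loss. The overall formula shows how the two are weighted:

$$l^{SR} = l_X^{SR} + 10^{-3}\, l_{Gen}^{SR}$$

where $l_X^{SR}$ is the content loss and $l_{Gen}^{SR}$ is the adversarial loss.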

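3.1 Content loss

Instead of a conventional pixel-wise L1 or L2 (MSE) loss, the authors define the content loss on VGG feature maps:

$$l_{VGG/i,j}^{SR} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y} \right)^2$$

Their argument is that pixel-wise losses cope poorly with high-frequency detail and tend to give overly smooth results, which is why this loss is used instead.
Here $\phi_{i,j}$ is the feature map obtained from the j-th convolution (after activation) before the i-th max-pooling layer of the VGG19 network, and $W_{i,j}$ and $H_{i,j}$ are the dimensions of that feature map. The notation is a little convoluted, but the idea is simple: the VGG loss is the Euclidean distance between the feature maps of the reconstructed image $G_{\theta_G}(I^{LR})$ and those of the reference image $I^{HR}$.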
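
To make the "distance between feature maps" idea concrete, here is a small sketch in plain Python. The extractor `toy_features` is a hypothetical stand-in for the VGG19 feature map (a horizontal difference followed by ReLU); a real implementation would use a pretrained VGG19, and all names here are my own assumptions.

```python
def feature_mse(feat_hr, feat_sr):
    """Sum of squared differences between two same-shaped feature maps,
    normalized by width * height (each map is a list of rows)."""
    height = len(feat_hr)
    width = len(feat_hr[0])
    total = 0.0
    for row_hr, row_sr in zip(feat_hr, feat_sr):
        for v_hr, v_sr in zip(row_hr, row_sr):
            total += (v_hr - v_sr) ** 2
    return total / (width * height)

def toy_features(image):
    """Hypothetical stand-in for the VGG feature map:
    horizontal differences followed by ReLU. NOT the real VGG19 features."""
    return [
        [max(0.0, row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]

def content_loss(hr, sr, phi=toy_features):
    """Feature-space loss: compare phi(hr) with phi(sr) instead of raw pixels."""
    return feature_mse(phi(hr), phi(sr))

# Hypothetical 2x3 images, for illustration only.
hr = [[0.0, 1.0, 3.0], [2.0, 2.0, 5.0]]
sr = [[0.0, 1.0, 2.0], [2.0, 3.0, 5.0]]
print(content_loss(hr, sr))
```

The only structural difference from a plain MSE loss is the `phi` step. Swapping in a different layer changes what kind of detail the loss is sensitive to.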

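3.2 Adversarial loss

The adversarial loss is the standard GAN generator loss, summed over the N training samples:

$$l_{Gen}^{SR} = \sum_{n=1}^{N} -\log D_{\theta_D}(G_{\theta_G}(I^{LR}))$$

where $D_{\theta_D}(G_{\theta_G}(I^{LR}))$ is the probability the discriminator assigns to the reconstructed image being a natural HR image.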
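
As a rough sketch of how the two terms fit together numerically: the generator's adversarial term sums -log D(G(I_LR)) over a batch, and the total loss adds it to the content loss with a small weight. The function names and the `eps` clamp (added only to avoid log(0)) are my own assumptions, and the 1e-3 default is the weight as I read it from the paper's overall formula.

```python
import math

def generator_adversarial_loss(d_outputs, eps=1e-12):
    """Sum of -log D(G(I_LR)) over discriminator outputs in (0, 1].
    eps is an assumed numerical safeguard, not part of the formula."""
    return sum(-math.log(max(p, eps)) for p in d_outputs)

def perceptual_loss(content, adversarial, adv_weight=1e-3):
    """Weighted sum: content loss + adv_weight * adversarial loss."""
    return content + adv_weight * adversarial

# Hypothetical discriminator outputs for a batch of two generated images.
d_out = [0.5, 0.25]
adv = generator_adversarial_loss(d_out)
print(adv, perceptual_loss(0.75, adv))
```

The loss shrinks toward 0 as the discriminator's outputs on generated images approach 1, that is, as the generator gets better at fooling it.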

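4 Network architecture

The architecture figure in the paper is very clear. Each convolutional layer is annotated with k (kernel size), n (number of feature maps), and s (stride).
Generator: built on a ResNet-style architecture of residual blocks.

Discriminator: uses LeakyReLU (α = 0.2) activations, with the number of feature maps growing from 64 to 512, followed by two fully connected layers and a sigmoid that outputs the probability that the input is a real HR image rather than a generated one.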
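
The k/n/s notation is easy to work with programmatically. The sketch below parses such labels and traces the spatial size through a stack of convolutions. The layer list is my reading of the paper's discriminator figure (eight 3x3 convolutions, 64 to 512 feature maps, a stride of 2 on every second layer). The padding of 1 and the 96x96 input size are assumptions for illustration.

```python
import re

# Assumed discriminator conv stack in k/n/s notation (my reading of the figure).
DISCRIMINATOR_CONVS = [
    "k3n64s1", "k3n64s2",
    "k3n128s1", "k3n128s2",
    "k3n256s1", "k3n256s2",
    "k3n512s1", "k3n512s2",
]

def parse_layer(label):
    """Split a label like 'k3n64s1' into (kernel, feature_maps, stride)."""
    match = re.fullmatch(r"k(\d+)n(\d+)s(\d+)", label)
    if match is None:
        raise ValueError(f"not a k/n/s label: {label!r}")
    kernel, maps, stride = (int(g) for g in match.groups())
    return kernel, maps, stride

def conv_output_size(size, kernel, stride, padding):
    """Standard convolution output size along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_shapes(input_size, labels, padding=1):
    """Return (label, feature_maps, spatial_size) after each conv layer."""
    shapes = []
    size = input_size
    for label in labels:
        kernel, maps, stride = parse_layer(label)
        size = conv_output_size(size, kernel, stride, padding)
        shapes.append((label, maps, size))
    return shapes

for label, maps, size in trace_shapes(96, DISCRIMINATOR_CONVS):
    print(f"{label:10s} -> {maps} x {size} x {size}")
```

Under these assumptions each stride-2 layer halves the resolution, so a 96x96 input ends up as 512 feature maps of size 6x6 before the fully connected layers.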

5 Datasets

Experiments were run on three widely used benchmark datasets: Set5 [3], Set14 [69], and BSD100 (the test set of BSD300 [41]).

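6 Experimental results

1) MOS evaluation
[Image: MOS results for the compared methods]
Of the compared methods, SRGAN's MOS distribution comes closest to that of the original HR images.
2) Performance of loss functions built on features from different VGG layers
[Image: results for the different VGG-layer losses]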
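
For readers unfamiliar with MOS: it is simply the average of the quality ratings human raters give an image, and the "distribution" is the share of each rating. A minimal sketch, assuming integer ratings on a 1 (bad) to 5 (excellent) scale. The ratings below are made up for illustration and are not results from the paper.

```python
from collections import Counter

def mean_opinion_score(ratings):
    """Average of integer quality ratings on an assumed 1..5 scale."""
    if not ratings:
        raise ValueError("need at least one rating")
    for r in ratings:
        if r not in range(1, 6):
            raise ValueError(f"rating out of range: {r}")
    return sum(ratings) / len(ratings)

def rating_distribution(ratings):
    """Fraction of ratings at each score from 1 to 5."""
    counts = Counter(ratings)
    return {score: counts.get(score, 0) / len(ratings) for score in range(1, 6)}

# Hypothetical ratings from five raters for one image.
ratings = [4, 5, 3, 4, 4]
print(mean_opinion_score(ratings))
print(rating_distribution(ratings))
```

Comparing two methods by MOS then means comparing these averages (or the full distributions) over the same set of images and raters.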
