Paper reading: EDVR: Video Restoration with Enhanced Deformable Convolutional Networks

Abstract

**1. Benchmark:** REDS, released in the NTIRE19 Challenge, contains larger and more complex motions.
2. Challenges:
(1) aligning multiple frames given large motions
(2) effectively fusing different frames with diverse motion and blur.
3. Method: EDVR, video restoration with Enhanced Deformable convolutions, built on
(1) a Pyramid, Cascading and Deformable (PCD) alignment module, in which frame alignment is done at the feature level using deformable convolutions in a coarse-to-fine manner
(2) a Temporal and Spatial Attention (TSA) fusion module that emphasizes important features for subsequent restoration.
4. Code: https://github.com/xinntao/EDVR
**Paper:** https://arxiv.org/abs/1905.02716v1

Introduction

1. Earlier studies: treat video restoration as a simple extension of image restoration, so the temporal redundancy among neighboring frames is not fully exploited.
**2. Recent studies:** more elaborate pipelines, typically with four components: feature extraction, alignment, fusion, and reconstruction. With occlusion, large motion, or severe blur, the main challenges lie in the alignment and fusion modules. To obtain high-quality output, one must (1) align and establish accurate correspondences among multiple frames, and (2) effectively fuse the aligned features for reconstruction.
**3. Alignment:** mostly flow-based methods.
Fusion: use convolutions to perform early fusion on all frames, or adopt recurrent networks to gradually fuse multiple frames. Ding Liu et al. ("Robust video super-resolution with learned temporal dynamics") propose a temporal adaptive network that can dynamically fuse across different temporal scales.
4. Our solution: EDVR has two key components: (1) an alignment module known as Pyramid, Cascading and Deformable convolutions (PCD), and (2) a fusion module known as Temporal and Spatial Attention (TSA).
PCD: inspired by Yapeng Tian et al., "TDAN: Temporally deformable alignment network for video super-resolution", which uses deformable convolutions to align each neighboring frame to the reference frame at the feature level. Different from TDAN, EDVR performs alignment (1) in a coarse-to-fine manner to handle large and complex motions, (2) with a pyramid structure, and (3) with an additional deformable convolution cascaded after the pyramidal alignment.
TSA: (1) temporal attention (2) spatial attention

Related Work

1. Video Restoration: SRCNN was the first to adopt deep learning for super-resolution. Other methods are flow-based, but accurate flow is difficult to obtain given occlusion and large motions. DUF [10] and TDAN [40] circumvent the problem by implicit motion compensation and surpass the flow-based methods.
2. Deformable Convolution: Jifeng Dai et al. ("Deformable convolutional networks") let the network obtain information away from its regular local neighborhood, improving the capability of regular convolutions. Deformable convolutions are widely used in various tasks such as video object detection [1], action recognition [53], semantic segmentation [3], and video super-resolution [40]. In particular, TDAN [40] uses deformable convolutions to align the input frames at the feature level without explicit motion estimation or image warping.
3. Attention Mechanism: attention has proven its effectiveness in many tasks.

Methodology

1. Structure
[Figure: overall EDVR structure]
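2. PCD Alignment: aligns the features of each neighboring frame to those of the reference frame. Alignment operates on the features of each frame rather than on the images themselves. DConv denotes the deformable convolution, and the module uses a three-level pyramid.
[Figure: PCD alignment module]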
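The sampling step behind deformable alignment can be sketched in plain Python. This is a minimal illustration, not the EDVR implementation: it evaluates a single-channel 3x3 deformable convolution at one location, the offsets are set by hand (in PCD they are predicted from the neighboring and reference features), and the helper names `bilinear_sample` and `deform_conv_at` are hypothetical.

```python
import math

def bilinear_sample(feat, y, x):
    """Bilinearly sample a 2D grid (list of rows) at a fractional (y, x).
    Positions outside the grid contribute zero."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feat[yy][xx]
    return value

# Regular 3x3 sampling grid, relative to the output location.
GRID_3X3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def deform_conv_at(feat, weights, offsets, y, x):
    """Deformable 3x3 convolution response at one location (y, x).

    weights: 9 kernel weights; offsets: 9 (dy, dx) pairs that shift each
    sampling position away from the regular grid (assumed given here).
    """
    out = 0.0
    for w_k, (py, px), (oy, ox) in zip(weights, GRID_3X3, offsets):
        out += w_k * bilinear_sample(feat, y + py + oy, x + px + ox)
    return out

# Toy data: the neighbor frame is the reference shifted right by one pixel.
reference = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
neighbor = [[row[c - 1] if c >= 1 else 0.0 for c in range(5)] for row in reference]

identity_kernel = [0, 0, 0, 0, 1, 0, 0, 0, 0]  # passes the center sample through
no_offsets = [(0.0, 0.0)] * 9
shift_offsets = [(0.0, 1.0)] * 9               # look one pixel to the right

unaligned = deform_conv_at(neighbor, identity_kernel, no_offsets, 2, 2)
aligned = deform_conv_at(neighbor, identity_kernel, shift_offsets, 2, 2)
print(reference[2][2], unaligned, aligned)     # 12.0 11.0 12.0
```

With the right offsets, the neighbor's feature at the sampled position matches the reference value at the same location, which is what alignment at the feature level amounts to. The pyramid in PCD would estimate such offsets at a coarse scale first and refine them at finer scales.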
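3. Fusion with TSA: inter-frame temporal relations and intra-frame spatial relations are critical in fusion. Temporal attention assigns more weight to neighboring frames whose features are more similar to the reference frame.
[Figure: TSA fusion module]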
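The temporal attention idea can be sketched as follows. This is a simplified illustration under stated assumptions: similarity is a plain per-pixel dot product between a frame's feature vector and the reference frame's feature vector, passed through a sigmoid. The paper computes the similarity in a learned embedding space and follows it with a fusion convolution and spatial attention, all of which are omitted here. The function name `temporal_attention` is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def temporal_attention(aligned_feats, ref_index):
    """Weight each frame's features by their similarity to the reference.

    aligned_feats: list over frames; each frame is a list of per-pixel
    feature vectors (lists of floats), already aligned to the reference.
    Returns (attn, weighted): per-frame, per-pixel attention values and
    the attention-modulated features.
    """
    ref = aligned_feats[ref_index]
    attn, weighted = [], []
    for frame in aligned_feats:
        frame_attn, frame_weighted = [], []
        for f_pix, r_pix in zip(frame, ref):
            similarity = sum(a * b for a, b in zip(f_pix, r_pix))
            a = sigmoid(similarity)
            frame_attn.append(a)
            frame_weighted.append([a * v for v in f_pix])
        attn.append(frame_attn)
        weighted.append(frame_weighted)
    return attn, weighted

# Three frames, one pixel each, two feature channels; frame 1 is the reference.
frames = [
    [[-1.0, -2.0]],  # opposite to the reference -> low attention
    [[1.0, 2.0]],    # the reference itself      -> high attention
    [[2.0, -1.0]],   # orthogonal to reference   -> attention of 0.5
]
attn, weighted = temporal_attention(frames, ref_index=1)
for i, frame_attn in enumerate(attn):
    print(i, round(frame_attn[0], 4))
```

Frames whose features disagree with the reference, for example because of occlusion or imperfect alignment, are down-weighted before fusion.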
4. Two-Stage Restoration: a similar but shallower EDVR network is cascaded after the first one to refine its output frames.

Experiments and Results

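1. Datasets: REDS [26] is a newly proposed high-quality (720p) video dataset in the NTIRE19 Competition.
2. Results: strong. According to the paper, EDVR won all four tracks of the NTIRE19 video restoration and enhancement challenges.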
