Paper Reading: Image Super-Resolution by Neural Texture Transfer

  1. The method searches IRef for matching textures in the feature space and then transfers the matched textures to ISR in a multi-scale fashion. The multi-scale texture transfer simultaneously considers semantic (higher-level) and textural (lower-level) similarity between ILR and IRef, so that related textures are transferred while irrelevant textures are suppressed.
    We further regularize the texture consistency between ISR and the matched textures from IRef, which enforces the effectiveness of the texture transfer.
  2. We first conduct feature swapping, which searches over the entire IRef for locally similar textures that can be used to replace (or swap) the texture features of ILR for enhanced SR recovery. [Note: features from the reference image directly replace those of the low-resolution image.]
    Here ILR↑ denotes the bicubically up-sampled ILR. We also sequentially apply bicubic down-sampling and up-sampling with the same factor on IRef to obtain a blurry Ref image IRef↓↑ that matches the frequency band of ILR↑. [Note: the Ref image still goes through up-sampling.] Instead of estimating a global transformation or optical flow, we match the local patches in ILR↑ and IRef↓↑. As LR and Ref patches may also differ in color and illumination, we match their similarity in the neural feature space φ(I) to
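    emphasize the structural and textural information. We use the inner product to measure the similarity between neural features:

$$ s_{i,j} = \left\langle P_i\big(\phi(I^{LR\uparrow})\big),\ \frac{P_j\big(\phi(I^{Ref\downarrow\uparrow})\big)}{\big\| P_j\big(\phi(I^{Ref\downarrow\uparrow})\big) \big\|} \right\rangle, \tag{1} $$

where P_i(·) denotes sampling the i-th patch from the neural feature map, and s_{i,j} is the similarity between the i-th LR patch and the j-th Ref patch. Only the Ref patch feature is normalized, which is enough for selecting the best match over all j: the norm of the LR patch is the same for every j, so it does not change which j attains the maximum. The similarity computation can be efficiently implemented as a set of convolution (or correlation) operations over all LR patches, with each kernel corresponding to a Ref patch:

$$ S_j = \phi(I^{LR\uparrow}) * \frac{P_j\big(\phi(I^{Ref\downarrow\uparrow})\big)}{\big\| P_j\big(\phi(I^{Ref\downarrow\uparrow})\big) \big\|}, \tag{2} $$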
where S_j is the similarity map for the j-th Ref patch, and ∗ denotes the correlation operation. We use S_j(x, y) to denote the similarity between the LR patch centered at location (x, y) and the j-th Ref patch. Both LR and Ref patches are densely sampled from their images. Based on the similarity score, we can construct a swapped feature map M to represent the texture-enhanced LR image. Each patch in M centered at (x, y) is defined as

$$ P_{\omega(x,y)}(M) = P_{j^*}\big(\phi(I^{Ref})\big), \qquad j^* = \arg\max_{j} S_j(x, y), \tag{3} $$
where ω(·, ·) maps a patch center to its patch index. Note that while IRef↓↑ is used for matching (Eq. 2), the raw Ref image IRef is used in swapping (Eq. 3), so that the HR information from the original reference is preserved. Due to the dense sampling of LR patches, we take the average of the swapped features P_{j*}(φ(IRef)) in the regions where they overlap. The resulting swapped feature map M is used as the basis for the next texture transfer stage.
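
The matching and swapping steps above (Eqs. 2 and 3) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation. The function names `extract_patches` and `swap_features`, the 3×3 patch size, the stride of 1, and indexing patches by their top-left corner rather than their center are all assumptions made here. The feature maps are taken as given arrays instead of being computed by a network φ. The similarities are computed with one matrix product, which gives the same values as correlating the LR feature map with each normalized Ref patch when no padding is used.

```python
import numpy as np


def extract_patches(feat, k=3):
    """Densely sample k x k patches (stride 1) from a (C, H, W) feature map.

    Returns an array of shape (N, C, k, k) with N = (H-k+1) * (W-k+1),
    ordered row-major by the top-left corner of each patch.
    """
    _, H, W = feat.shape
    return np.stack([feat[:, y:y + k, x:x + k]
                     for y in range(H - k + 1)
                     for x in range(W - k + 1)])


def swap_features(feat_lr_up, feat_ref_blur, feat_ref, k=3, eps=1e-8):
    """Patch matching (Eq. 2) followed by feature swapping (Eq. 3).

    feat_lr_up    : features of the up-sampled LR image, shape (C, H, W)
    feat_ref_blur : features of the blurry Ref image, shape (C, Hr, Wr); used for matching
    feat_ref      : features of the raw Ref image, shape (C, Hr, Wr); used for swapping
    Returns the swapped feature map M (C, H, W) and the best index j* per LR patch.
    """
    lr = extract_patches(feat_lr_up, k)
    ref_blur = extract_patches(feat_ref_blur, k)
    ref_raw = extract_patches(feat_ref, k)

    # Flatten patches; normalize only the Ref patches, as in Eq. 1.
    lr_flat = lr.reshape(len(lr), -1)
    ref_flat = ref_blur.reshape(len(ref_blur), -1)
    ref_flat = ref_flat / (np.linalg.norm(ref_flat, axis=1, keepdims=True) + eps)

    sim = lr_flat @ ref_flat.T   # sim[i, j] = s_{i,j}
    best = sim.argmax(axis=1)    # j* for every LR patch

    # Paste the raw Ref patch at each LR patch location and average overlaps.
    C, H, W = feat_lr_up.shape
    out = np.zeros((C, H, W))
    count = np.zeros((1, H, W))
    patches_per_row = W - k + 1
    for i, j in enumerate(best):
        y, x = divmod(i, patches_per_row)
        out[:, y:y + k, x:x + k] += ref_raw[j]
        count[:, y:y + k, x:x + k] += 1
    return out / count, best
```

As a sanity check on this sketch: if the LR features are an exact crop of the blurry Ref features, every LR patch matches its source location, and M reproduces the corresponding crop of the raw Ref features. With real images the matches are only approximate.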





