Paper: Blind Super-Resolution With Iterative Kernel Correction
I only skimmed the paper, so there may be misunderstandings — corrections are welcome (and please excuse my poor English).
Blind Super-Resolution With Iterative Kernel Correction
Motivation
The point is that different kernels generate different artifacts and textures in the result, so you should choose the right kernel. For example, with a Gaussian kernel of width $\sigma_{LR}$, the SR results show unnatural ringing artifacts when $\sigma_{SR} > \sigma_{LR}$, and over-smoothing when $\sigma_{SR} < \sigma_{LR}$.
Estimate the kernel
A straightforward method is to adopt a function that estimates the kernel from the LR image. Let

$$k' = \mathcal{P}(I^{LR})$$
Then we can optimize the function by minimizing the $L_2$ distance:

$$\theta_{\mathcal{P}} = \argmin_{\theta_{\mathcal{P}}} \|k - \mathcal{P}(I^{LR}; \theta_{\mathcal{P}})\|_2^2$$
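As a toy illustration of what this objective measures (not the paper's code), the sketch below builds isotropic Gaussian kernels in numpy and evaluates the $L_2$ distance between a ground-truth kernel and a hypothetical prediction with the wrong width; the kernel size and the widths are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Isotropic Gaussian blur kernel, normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

# Ground-truth kernel used to synthesize the LR image.
k_true = gaussian_kernel(21, sigma=2.0)

# Stand-in for P(I_LR): a predictor that guesses the wrong width.
k_pred = gaussian_kernel(21, sigma=2.6)

# The L2 objective the predictor is trained to minimize.
l2_loss = np.sum((k_true - k_pred) ** 2)
```

A prediction with a width closer to the true one yields a smaller loss, which is exactly the signal the training drives down.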
But accurate estimation of the kernel is impossible, as the problem is ill-posed, so they try to find a way to correct the estimation.
Correct the kernel
The idea is to adopt the intermediate SR results. Let $\mathcal{C}$ be the corrector function; then

$$\theta_{\mathcal{C}} = \argmin_{\theta_{\mathcal{C}}} \|k - (\mathcal{C}(I^{SR}; \theta_{\mathcal{C}}) + k')\|_2^2$$
To avoid over- or under-correction, smaller correction steps are used to refine the kernel gradually until it reaches the ground truth.
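A toy numpy sketch of why damped steps help — the real corrector is a trained CNN conditioned on the SR result, so the oracle error signal and the step size `alpha` below are assumptions purely for illustration:

```python
import numpy as np

# Toy 1-D stand-in for the kernel (e.g. its PCA coefficients).
k_true = np.array([1.0, 0.5, 0.2])
k_est  = np.array([0.4, 0.9, -0.1])   # initial (wrong) estimate

# An idealized corrector that sees the full error; the step size
# alpha < 1 mimics the small correction steps used in the paper.
alpha = 0.5
errors = []
for _ in range(8):
    delta = alpha * (k_true - k_est)   # stand-in for the corrector output
    k_est = k_est + delta              # k_i = k_{i-1} + Δk_i
    errors.append(np.sum((k_true - k_est) ** 2))
```

With `alpha < 1` the error shrinks monotonically; a full step (`alpha = 1`) with a noisy corrector could overshoot the ground truth instead.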
Method
Let $\mathcal{F}$ be an SR model, $\mathcal{P}$ a kernel predictor, and $\mathcal{C}$ a corrector. PCA can be used to reduce the dimensionality of the kernel space; the kernel after dimension reduction is denoted by $h$, where $h = Mk$ and $M$ is the dimension-reduction matrix. An initial estimate $h_0 = \mathcal{P}(I^{LR})$ is given by the predictor, and the first SR result is $I_0^{SR} = \mathcal{F}(I^{LR}, h_0)$. The iterative kernel correction algorithm can then be written as
$$\begin{array}{rcl} \Delta h_i &=& \mathcal{C}(I^{SR}_{i-1}, h_{i-1}) \\ h_i &=& h_{i-1} + \Delta h_i \\ I^{SR}_i &=& \mathcal{F}(I^{LR}, h_i) \end{array}$$
After $t$ iterations, $I^{SR}_t$ is the final result of IKC.
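The loop above can be sketched in Python as follows; `predictor`, `corrector`, and `sr_model` stand in for the trained networks $\mathcal{P}$, $\mathcal{C}$, and $\mathcal{F}$, which are not reproduced here:

```python
import numpy as np

def ikc(I_LR, predictor, corrector, sr_model, t=5):
    """Iterative Kernel Correction loop (sketch, not the paper's code).

    predictor, corrector and sr_model are placeholders for the trained
    networks P, C and F; here they can be any callables with the same
    interfaces.
    """
    h = predictor(I_LR)                # h_0 = P(I_LR), PCA-reduced kernel
    I_SR = sr_model(I_LR, h)           # I_0^SR = F(I_LR, h_0)
    for _ in range(t):
        delta_h = corrector(I_SR, h)   # Δh_i = C(I_SR_{i-1}, h_{i-1})
        h = h + delta_h                # h_i = h_{i-1} + Δh_i
        I_SR = sr_model(I_LR, h)       # I_i^SR = F(I_LR, h_i)
    return I_SR, h
```

Note that the kernel estimate `h` is the only state carried across iterations; the SR image is recomputed from scratch from `I_LR` each time with the refined kernel.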
Network Architecture of SR model
SRMD, an existing SR method that handles multiple blur kernels, has two problems:
- The kernel maps do not actually contain the information of the image.
- The influence of kernel information is only considered at the first layer.
So SFTMD is proposed, which uses spatial feature transform (SFT) layers.
Use SRResNet as the backbone (of course you can change it) and then employ SFT layers to apply an affine transformation to the feature maps $F$, conditioned on the kernel maps $\mathcal{H}$, via a scaling and shifting operation:

$$\mathrm{SFT}(F, \mathcal{H}) = \gamma \odot F + \beta$$
The kernel maps $\mathcal{H}$ are stretched from $h$: all elements of the $i$-th map are equal to the $i$-th element of $h$.
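A minimal numpy sketch of the stretching operation and the SFT affine transform; in the paper $\gamma$ and $\beta$ come from small conv branches conditioned on the kernel maps, so the random stand-ins below are assumptions that only show the shapes and the elementwise operation:

```python
import numpy as np

rng = np.random.default_rng(0)

# PCA-reduced kernel code h (length d) and a feature map F of shape (C, H, W).
d, C, H, W = 10, 64, 32, 32
h = rng.standard_normal(d)
F = rng.standard_normal((C, H, W))

# Stretch h into kernel maps: the i-th map is constant, filled with h[i].
kernel_maps = np.broadcast_to(h[:, None, None], (d, H, W))

# SFT layer: gamma and beta would be produced by small conv nets conditioned
# on the kernel maps; random stand-ins of the right shape are used here so
# the affine transform itself is visible.
gamma = rng.standard_normal((C, H, W))
beta = rng.standard_normal((C, H, W))
out = gamma * F + beta   # SFT(F, H) = gamma ⊙ F + beta
```

Because $\gamma$ and $\beta$ are full spatial maps, the transform can in principle vary per pixel, unlike a single per-layer scale and shift.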
Network Architecture of $\mathcal{P}$ and $\mathcal{C}$
Experiments
IKC is always the best in the reported comparisons.
Comprehension
- How can I train the predictor $\mathcal{P}$? Although the paper says the problem is ill-posed, they still obtain a trained model. Does it over-fit or under-fit? What does the loss curve look like, and when should I stop training?
- Spatially uniform: how? The paper says this differs from the application of SFT in semantic super-resolution, which just employs the transformation characteristic of SFT layers. So I don't understand whether the segmentation information can provide spatial variability.
- Is IKC with PCA really the best? Why? The paper says PCA provides a feature representation, so IKC learns the relationship between the SR images and the features rather than the Gaussian kernel itself. But why can't IKC learn features from the kernel directly?