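Simply setting scale=2 in the config file raises an error.
1. Judging from the error, my guess was that the network structure is different. I went to read the corresponding rrdbnet_arch.py and found the following docstring from the author: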
    """Networks consisting of Residual in Residual Dense Block, which is used in ESRGAN.

    ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks.

    We extend ESRGAN for scale x2 and scale x1.
    Note: This is one option for scale 1, scale 2 in RRDBNet.
    We first employ the pixel-unshuffle (an inverse operation of pixelshuffle to reduce the spatial size
    and enlarge the channel size before feeding inputs into the main ESRGAN architecture.

    Args:
        num_in_ch (int): Channel number of inputs.
        num_out_ch (int): Channel number of outputs.
        num_feat (int): Channel number of intermediate features. Default: 64
        num_block (int): Block number in the trunk network. Defaults: 23
        num_grow_ch (int): Channels for each growth. Default: 32.
    """
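The author points out that ESRGAN has already been extended to support 2x reconstruction.
The docstring mentions pixel-unshuffle, the inverse of pixelshuffle: it shrinks the spatial size and enlarges the channel size before the input is fed into the main network. I had never looked into pixelshuffle, so I read up on it:
pixelshuffle rearranges a tensor of shape (B, C*r*r, H, W) into (B, C, H*r, W*r). It is not a plain reshape, since the elements are also permuted. pixel-unshuffle does the opposite and rearranges (B, C, H*r, W*r) into (B, C*r*r, H, W). In the author's code, scale=2 corresponds to r=2 and scale=1 corresponds to r=4. This way the reduction in the input's spatial size is converted into an increase in channel size, and since the author also enlarges the input channel count in the network structure, training can run normally~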
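To make the shapes concrete, here is a minimal pure-Python sketch of pixel-unshuffle on nested lists. It is only an illustration, not the implementation in rrdbnet_arch.py: the name `pixel_unshuffle_lists`, the (C, H, W) list layout without a batch dimension, and the output channel ordering are my own assumptions.

```python
def pixel_unshuffle_lists(img, r):
    """Rearrange a (C, H, W) nested list into (C*r*r, H//r, W//r).

    Hypothetical helper for illustration only.
    """
    c, h, w = len(img), len(img[0]), len(img[0][0])
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    out = []
    for ch in range(c):
        for i in range(r):      # row offset inside each r x r patch
            for j in range(r):  # column offset inside each r x r patch
                out.append([[img[ch][y * r + i][x * r + j]
                             for x in range(w // r)]
                            for y in range(h // r)])
    return out


def shape(t):
    """Return (C, H, W) of a nested list."""
    return (len(t), len(t[0]), len(t[0][0]))


# One channel, 4x4 image holding the values 0..15
img = [[[y * 4 + x for x in range(4)] for y in range(4)]]
out = pixel_unshuffle_lists(img, 2)
print(shape(img), "->", shape(out))  # (1, 4, 4) -> (4, 2, 2)
print(out[0])                        # [[0, 2], [8, 10]]
```

No pixel is thrown away: each 2x2 patch of the input is spread across 4 output channels, so the image gets smaller but deeper.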
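The author implements the corresponding operations in the code as well.
    # in forward(): shrink the spatial size before the main network
    if self.scale == 2:
        feat = pixel_unshuffle(x, scale=2)

    # in __init__(): the first layer must accept 4x as many input channels
    if scale == 2:
        num_in_ch = num_in_ch * 4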
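The channel multiplier is simply r*r, so it has to match the pixel-unshuffle factor. A rough sketch of the shape arithmetic, assuming the mapping described earlier (scale=2 uses r=2, scale=1 uses r=4, anything else leaves the input alone); `trunk_input_shape` is a hypothetical helper, not part of the repo:

```python
def trunk_input_shape(shape, scale):
    """Shape of the tensor that reaches the first conv, given a (B, C, H, W) input.

    Assumed mapping: scale 2 -> r = 2, scale 1 -> r = 4, otherwise r = 1.
    """
    b, c, h, w = shape
    r = {2: 2, 1: 4}.get(scale, 1)
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    return (b, c * r * r, h // r, w // r)


print(trunk_input_shape((1, 3, 64, 64), scale=2))  # (1, 12, 32, 32)
print(trunk_input_shape((1, 3, 64, 64), scale=1))  # (1, 48, 16, 16)
print(trunk_input_shape((1, 3, 64, 64), scale=4))  # (1, 3, 64, 64)
```

For an RGB input at scale 2, the first layer therefore needs 12 input channels instead of 3, which is why a network built for a different scale does not fit.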
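2. So where did it go wrong??? Is there another parameter in the config that needs changing?
Yes!!!
After experimenting with the channel counts and other parameters in both the config file and rrdbnet_arch.py, I found that the scale=2 set at the very top of the config file is not passed through to the network. So I added one line to the network section of the config: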
scale: 2
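It gets passed in now! No more errors!
Summary:
1. Change the top-level setting to scale: 2
2. Add a line under network_g in the network structures section: scale: 2
3. Change crop_border in the metrics section to 2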
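My understanding of why the top-level scale never arrived: the network seems to be built only from the entries under network_g, which are handed to the constructor as keyword arguments, so a missing scale falls back to the constructor's default. The sketch below mimics that behavior with stand-ins. `fake_rrdbnet`, `build_network_sketch`, the option values, and the default of 4 are assumptions for illustration, not the repo's actual code.

```python
from copy import deepcopy


def fake_rrdbnet(num_in_ch, num_out_ch, scale=4, num_feat=64,
                 num_block=23, num_grow_ch=32):
    """Hypothetical stand-in for the network constructor: report what it received."""
    return {"scale": scale}


def build_network_sketch(net_opt):
    """Assumed behavior: drop 'type', pass the remaining keys as keyword arguments."""
    net_opt = deepcopy(net_opt)
    net_opt.pop("type")
    return fake_rrdbnet(**net_opt)


opt = {
    "scale": 2,  # top-level setting, never seen by the constructor in this sketch
    "network_g": {"type": "RRDBNet", "num_in_ch": 3, "num_out_ch": 3,
                  "num_feat": 64, "num_block": 23, "num_grow_ch": 32},
}

before = build_network_sketch(opt["network_g"])
opt["network_g"]["scale"] = 2  # the extra line added under network_g
after = build_network_sketch(opt["network_g"])
print(before["scale"], after["scale"])  # 4 2
```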