JVET-W0113
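To improve the quality of the reconstructed picture, separate neural network models are trained for I slices and B slices. In the in-loop filtering process, the NN filter replaces the deblocking filter (Deblock) and SAO.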
Method
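The reconstructed picture after LMCS is fed to the network. Specifically, the U and V channels are upsampled and concatenated with the Y channel to form the network input; after filtering, the U and V channels are downsampled again. Finally, the output of the NN filter is processed by ALF and CCALF.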
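The packing and unpacking around the filter can be sketched as follows. This is only an illustration: the proposal does not state how the chroma planes are resampled, so the 4:2:0 layout, nearest-neighbour upsampling, 2x2 averaging for downsampling, and all function names here are assumptions.

```python
# Sketch of the input/output packing around the NN filter.
# Assumptions (not from the proposal): 4:2:0 content, nearest-neighbour
# upsampling, 2x2 averaging for downsampling, planes stored as lists of rows.

def upsample2x(plane):
    """Nearest-neighbour 2x upsampling of a chroma plane."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]  # repeat each sample horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out

def downsample2x(plane):
    """2x downsampling by averaging each 2x2 block."""
    h, w = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]

def pack_yuv420(y, u, v):
    """Stack Y with upsampled U and V into a 3-channel input [Y, U_up, V_up]."""
    u_up, v_up = upsample2x(u), upsample2x(v)
    assert len(u_up) == len(y) and len(u_up[0]) == len(y[0])
    return [y, u_up, v_up]

def unpack_yuv420(channels):
    """Split a 3-channel network output back into 4:2:0 planes."""
    y, u_up, v_up = channels
    return y, downsample2x(u_up), downsample2x(v_up)

# Round trip with an identity "filter" in place of the network.
y = [[16, 17, 18, 19], [20, 21, 22, 23], [24, 25, 26, 27], [28, 29, 30, 31]]
u = [[100, 110], [120, 130]]
v = [[200, 210], [220, 230]]
packed = pack_yuv420(y, u, v)
y_out, u_out, v_out = unpack_yuv420(packed)
```

With an identity filter the chroma planes come back unchanged, which is a quick sanity check that the two resampling steps are consistent with each other.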
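Network structure
The figure below shows the network structure of the proposed NN filter. The first and last layers are 3x3 convolutions. The skip layer is a 5x5 convolutional layer. Each residual block (ResBlock) consists of three convolutional layers, and N denotes the number of ResBlocks. In this proposal, N is set to 8. Within a ResBlock, the first layer is a 1x1 convolution followed by a ReLU activation, the second layer is a 1x1 convolution, and the third layer is a 3x3 convolution. For the internal convolutional layers, the number of feature maps is set to 64.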
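To get a feel for the size of such a network, the layer list can be turned into a rough weight count. The sketch below is not taken from the proposal: it assumes 3 input and 3 output channels (Y plus upsampled U and V), a 5x5 skip convolution that maps 3 channels to 3 channels, and a bias term on every convolution. The helper names are hypothetical.

```python
# Rough weight count for the described structure.
# Assumptions (not from the proposal): 3 input/output channels, the 5x5
# skip convolution maps 3 -> 3 channels, every convolution has a bias.

def conv_params(kernel, c_in, c_out, bias=True):
    """Weights (plus optional biases) of a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)

def resblock_params(features=64):
    """1x1 conv (+ReLU), 1x1 conv, 3x3 conv, all with `features` channels."""
    return (conv_params(1, features, features)
            + conv_params(1, features, features)
            + conv_params(3, features, features))

def nn_filter_params(num_blocks=8, features=64, io_channels=3):
    """Total count: first 3x3 conv, N ResBlocks, last 3x3 conv, 5x5 skip conv."""
    first = conv_params(3, io_channels, features)
    body = num_blocks * resblock_params(features)
    last = conv_params(3, features, io_channels)
    skip = conv_params(5, io_channels, io_channels)
    return first + body + last + skip

total = nn_filter_params()  # N = 8 and 64 feature maps, as in the text
```

Under these assumptions almost all weights sit in the 3x3 convolutions of the ResBlocks, so the count scales roughly linearly with N.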
Training
Network Information in Training Stage
Category | Item | Value |
Mandatory | GPU type | Tesla V100 x 8 x 16GB |
Mandatory | Framework | PyTorch v1.5 |
Mandatory | Number of GPUs per task | |
Mandatory | Epoch | 40 |
Mandatory | Batch size | 128Kx64 |
Mandatory | Loss function | L1 for 30 epochs, then L2 for 10 epochs |
Mandatory | Training time | 72h |
Mandatory | Training data information | DIV2K, BVI-DVC |
Mandatory | Training configurations for generating compressed training data (if different to VTM CTC) | Disable Deblock and SAO |
Optional | Number of iterations | |
Optional | Patch size | 128x128 |
Optional | Learning rate | |
Optional | Optimizer | |
Optional | Preprocessing | |
Optional | Mini-batch selection process | |
Optional | Other information | |
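The two-stage loss schedule in the table (L1 for 30 epochs, then L2 for 10 epochs) could be selected per epoch as sketched below. The mean reduction, the 0-based epoch index, and the function names are assumptions made for illustration; the proposal only gives the epoch split.

```python
# Sketch of the loss schedule: L1 for the first 30 epochs, L2 for the last 10.
# Assumptions (not from the proposal): mean reduction, 0-based epoch index.

L1_EPOCHS = 30
TOTAL_EPOCHS = 40

def l1_loss(pred, target):
    """Mean absolute error over flat lists of samples."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def l2_loss(pred, target):
    """Mean squared error over flat lists of samples."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def loss_for_epoch(epoch):
    """Return the loss function to use in the given 0-based epoch."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in [0, {TOTAL_EPOCHS})")
    return l1_loss if epoch < L1_EPOCHS else l2_loss

# Name of the loss used in each of the 40 epochs.
schedule = [loss_for_epoch(e).__name__ for e in range(TOTAL_EPOCHS)]
```

In a real training loop the returned function would be replaced by the framework's own L1 and MSE criteria; the epoch test stays the same.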
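Integration
1. Deblock and SAO are disabled, and the proposed filter is placed before ALF.
2. The proposed filter can be switched on or off at the CTU level and at the slice level.
3. A scaling operation is applied to refine the result of the NN filter. The scaling factor for each component is signalled in the slice header.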
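The on/off control and the scaling step can be pictured with the sketch below. The proposal does not give the scaling formula, so applying the factor to the difference between the NN output and the unfiltered reconstruction is an assumption, as are the floating-point factor, the flat per-CTU sample lists, and all names used here.

```python
# Sketch of slice/CTU on-off control plus scaling for one colour component.
# Assumptions (not from the proposal): the scale is applied to the residual
# (NN output minus unfiltered reconstruction) and is a floating-point value.

def scale_filtered(rec, nn_out, scale):
    """Return rec + scale * (nn_out - rec), sample by sample."""
    return [r + scale * (f - r) for r, f in zip(rec, nn_out)]

def apply_nn_filter(rec_ctus, nn_ctus, ctu_flags, slice_enabled, scale):
    """Pick scaled NN samples for CTUs whose flag is on, else keep the input."""
    if not slice_enabled:
        # Slice-level switch off: every CTU keeps its unfiltered samples.
        return [list(ctu) for ctu in rec_ctus]
    out = []
    for rec, nn_out, on in zip(rec_ctus, nn_ctus, ctu_flags):
        out.append(scale_filtered(rec, nn_out, scale) if on else list(rec))
    return out

# Two CTUs with two samples each; the filter is enabled only for the first CTU.
rec_ctus = [[100, 102], [50, 52]]
nn_ctus = [[104, 98], [54, 48]]
filtered = apply_nn_filter(rec_ctus, nn_ctus, [True, False],
                           slice_enabled=True, scale=0.5)
```

In this toy example the first CTU moves halfway towards the NN output and the second CTU is left untouched.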
Experimental results
Random access Main10
BD-rate over VTM-11.0+V0056
Class | Y-PSNR | U-PSNR | V-PSNR | Y-MSIM | U-MSIM | V-MSIM | EncT | DecT |
Class A1 | -3.02% | -6.43% | -7.01% | -3.95% | -9.49% | -8.13% | 175% | 114554% |
Class A2 | -3.70% | -12.16% | -8.83% | -3.47% | -12.00% | -6.48% | 170% | 97254% |
Class B | -3.44% | -10.95% | -10.54% | -3.05% | -10.52% | -9.61% | 171% | 94209% |
Class C | -3.94% | -13.61% | -13.33% | -3.03% | -10.54% | -9.76% | 143% | 89578% |
Class E | - | - | - | - | - | - | - | - |
Overall | -3.54% | -10.99% | -10.24% | -3.31% | -10.61% | -8.73% | 164% | 97275% |
Class D | -5.83% | -15.41% | -16.15% | -3.27% | -11.78% | -11.27% | 138% | 62376% |
Class F | -0.77% | -6.74% | -6.74% | -0.52% | -6.70% | -7.19% | 227% | 28503% |
Class H | - | - | - | - | - | - | - | - |
All Intra Main10
BD-rate over VTM-11.0+V0056
Class | Y-PSNR | U-PSNR | V-PSNR | Y-MSIM | U-MSIM | V-MSIM | EncT | DecT |
Class A1 | -4.15% | -7.35% | -8.85% | -4.78% | -8.84% | -8.97% | 230% | 60980% |
Class A2 | -4.10% | -10.57% | -9.15% | -4.09% | -10.54% | -7.23% | 168% | 47767% |
Class B | -4.12% | -10.89% | -11.75% | -3.70% | -11.42% | -11.60% | 162% | 42477% |
Class C | -5.13% | -14.07% | -14.99% | -4.36% | -13.21% | -12.92% | 128% | 40496% |
Class E | -6.39% | -9.56% | -11.47% | -5.76% | -8.77% | -11.32% | 166% | 60336% |
Overall | -4.73% | -10.73% | -11.51% | -4.43% | -10.80% | -10.68% | 165% | 48263% |
Class D | -5.77% | -14.48% | -16.90% | -4.43% | -14.05% | -15.21% | 126% | 36850% |
Class F | -2.07% | -8.05% | -8.55% | -2.01% | -8.62% | -8.93% | 130% | 38788% |
Class H | - | - | - | - | - | - | - | - |