Paper: https://arxiv.org/pdf/1907.02665.pdf
This paper uses higher-order features for image quality assessment.
The authors propose a deep bilinear model for blind image quality assessment (BIQA) that handles both synthetic and authentic distortions.
The network consists of two parts, each responsible for one kind of distortion. For synthetic distortions, we pre-train a CNN to classify image distortion type and level, where we enjoy large-scale training data. For authentic distortions, we adopt a pre-trained CNN for image classification. The features from the two CNNs are pooled bilinearly into a unified representation for final quality prediction. We then fine-tune the entire model on target subject-rated databases using a variant of stochastic gradient descent.
Image quality assessment is abbreviated IQA.
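For synthetic distortions, we construct a large-scale pre-training set based on the Waterloo Exploration Database and the PASCAL VOC Database, where the images are synthesized with nine distortion types and two to five distortion levels. Because the distortion type and level of every synthesized image are known, they serve as labels for pre-training a CNN on a multi-class classification task.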
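The labeling scheme above can be sketched as a mapping from each (distortion type, level) pair to one class index. This is only an illustration: the type names `type_0` ... `type_8` and the per-type level counts are placeholders chosen to match the "nine types, two to five levels" description, not values taken from these notes.

```python
# Hypothetical level counts: nine distortion types, each with two to five levels.
# The names and the 5/2 split are assumptions made for illustration only.
LEVELS_PER_TYPE = {f"type_{i}": 5 for i in range(7)}
LEVELS_PER_TYPE.update({"type_7": 2, "type_8": 2})


def build_label_map(levels_per_type):
    """Assign one class index to every (distortion type, level) pair."""
    label_map = {}
    for dist_type, n_levels in levels_per_type.items():
        for level in range(1, n_levels + 1):
            # The next free index is the current size of the map.
            label_map[(dist_type, level)] = len(label_map)
    return label_map


LABEL_MAP = build_label_map(LEVELS_PER_TYPE)
NUM_CLASSES = len(LABEL_MAP)  # output width of the pre-training classifier
```

With such a map, a synthesized image is labeled by looking up its known type and level, for example `LABEL_MAP[("type_0", 3)]`, and the CNN is pre-trained with an ordinary `NUM_CLASSES`-way classification loss.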
For authentic distortions, the degradation process is hard to simulate, so the model adopts VGG-16 pre-trained on ImageNet.
We model synthetic and authentic distortions as two-factor variations, and pool the two feature sets bilinearly into a unified representation for final quality prediction.
We consider bilinear pooling to combine S-CNN for synthetic distortions and VGG-16 for authentic distortions into a unified model. Bilinear models have been shown to be effective in modeling two-factor variations, such as style and content of images, location and appearance for fine-grained recognition, spatial and temporal characteristics for video analysis, and text and visual information for question answering.
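A minimal NumPy sketch of bilinear pooling may make the combination step concrete. It assumes the two branches produce feature maps with the same spatial size; the channel counts (128 and 512), the 7x7 grid, and the signed square root plus L2 normalization are illustrative assumptions, not details stated in these notes.

```python
import numpy as np


def bilinear_pool(feat_a, feat_b, eps=1e-8):
    """Pool feature maps of shape (C_a, H, W) and (C_b, H, W) into one vector."""
    c_a, h, w = feat_a.shape
    c_b = feat_b.shape[0]
    assert feat_b.shape[1:] == (h, w), "both maps must share the spatial size"

    a = feat_a.reshape(c_a, h * w)
    b = feat_b.reshape(c_b, h * w)

    # Outer product of the two feature vectors at each location,
    # averaged over all H*W locations: result has shape (C_a, C_b).
    pooled = (a @ b.T) / (h * w)
    vec = pooled.reshape(-1)

    # Assumed post-processing: signed square root, then L2 normalization.
    vec = np.sign(vec) * np.sqrt(np.abs(vec))
    return vec / (np.linalg.norm(vec) + eps)


# Random stand-ins for the S-CNN and VGG-16 feature maps (shapes are assumptions).
rng = np.random.default_rng(0)
s_cnn_feat = rng.standard_normal((128, 7, 7))
vgg_feat = rng.standard_normal((512, 7, 7))
representation = bilinear_pool(s_cnn_feat, vgg_feat)
```

The resulting vector has one entry per pair of channels (here 128 x 512), so every synthetic-distortion feature interacts with every authentic-distortion feature. A quality regressor, such as a single fully connected layer, would then map this vector to a score.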