Basic information
Paper: Scene Classification Based on Two-Stage Deep Feature Fusion
Tier-2 (二区) journal; the authors are from 华师.
Notes
The authors claim roughly three contributions.
They build on an ICCV 2015 paper, “Do deep features generalize from everyday objects to remote sensing and aerial scenes domains?”, and point out about it:
their fusion is performed at score level instead of at feature level
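The goal is to bring low-level features into the representation as well. The feature fusion is done as a weighted combination.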
we propose a novel feature-level fusion method for adaptively combining the information from lower layers and FC layers, in which the fusion coefficients are automatically learned from data.
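The paper describes this in more detail later on:
our fusion is performed via a linear combination of feature vectors instead of feature concatenation, and the fusion weights are learned via training and fine-tuning.
Two networks are used: CaffeNet and VGG-VD-16. This is also where the "Two-Stage" in the title comes from: within each model, features are extracted and fused (stage one), and then the results of the two CNNs are fused again (stage two).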
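To make the contrast between linear-combination fusion and concatenation concrete, here is a minimal sketch. It is not the authors' implementation: the function names are made up for illustration, the feature vectors are assumed to already share the same dimension, and the weights are fixed placeholder numbers, whereas in the paper they are learned during training and fine-tuning.

```python
def fuse_linear(features, weights):
    """Weighted sum of equal-length feature vectors (one weight per vector).

    Hypothetical helper: in the paper the weights are learned from data;
    here they are simply passed in as fixed numbers.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


def fuse_concat(features):
    """Concatenation, shown only for comparison: the dimension grows."""
    return [x for f in features for x in f]


low_level = [1.0, 2.0]   # toy stand-in for a lower-layer feature
fc_level = [3.0, 4.0]    # toy stand-in for an FC-layer feature

fused = fuse_linear([low_level, fc_level], [0.25, 0.75])
stacked = fuse_concat([low_level, fc_level])
```

With linear combination the fused vector keeps the original dimension (2 here), while concatenation doubles it (4 here).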
Why pick two different networks?
The authors argue that different CNNs contain complementary information.
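Experiments
Network architecture
The main merit of this paper is probably that the experimental section is described in considerable detail, which makes it useful as a reference.
Roughly, the procedure is: first pretrain the two networks on ILSVRC-2012, then fine-tune them on RS (remote sensing) data.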
1) Freeze the trunk of CaffeNet, and train its branches and fusion layer using ILSVRC-2012.
2) Unfreeze the trunk of CaffeNet, and train the whole converted CaffeNet using ILSVRC-2012.
3) For VGG-VD-16, do things analogous to steps 1)–2).
4) Train the composite CNN using ILSVRC-2012.
5) Fine-tune the composite CNN using an RS data set.
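The five steps above can be written down as an explicit schedule, which makes it easier to see what is frozen when. This is only a bookkeeping sketch, not training code: `Stage`, `build_schedule`, and the group name "trunk" are hypothetical, and steps 4) and 5) are assumed to freeze nothing because the steps do not mention freezing there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    dataset: str
    frozen: tuple  # parameter groups NOT updated in this stage


def build_schedule(rs_dataset):
    """Return the training stages in order (hypothetical helper)."""
    stages = []
    # Steps 1)-3): the same two sub-steps for each network.
    for net in ("CaffeNet", "VGG-VD-16"):
        stages.append(Stage(f"{net}: train branches and fusion layer",
                            "ILSVRC-2012", frozen=("trunk",)))
        stages.append(Stage(f"{net}: train whole converted network",
                            "ILSVRC-2012", frozen=()))
    # Step 4): composite CNN on ILSVRC-2012 (assumed: nothing frozen).
    stages.append(Stage("composite: train", "ILSVRC-2012", frozen=()))
    # Step 5): fine-tune on the RS data set (assumed: nothing frozen).
    stages.append(Stage("composite: fine-tune", rs_dataset, frozen=()))
    return stages


schedule = build_schedule("AID")
for stage in schedule:
    print(stage.name, "|", stage.dataset, "| frozen:", stage.frozen)
```

Only the last stage touches the RS data; everything before it runs on ILSVRC-2012.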
The RS data sets used are AID and RSSCN7.
The fine-tuning setup is as follows:
the two RS data sets are both randomly and equally divided into two subsets for fine-tuning and testing. We conduct 10 repetitions of each experiment, each repetition having different data set division.
In other words, each data set is split in half into a training (fine-tuning) set and a test set.
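A minimal sketch of this evaluation protocol, assuming a plain random 50/50 split over sample indices. Whether the paper's split is stratified per class is not stated in these notes, and the function names, the seeds, and the sample count below are illustrative only.

```python
import random


def split_half(n_samples, seed):
    """Randomly split indices 0..n_samples-1 into two equal halves."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so each split is reproducible
    half = n_samples // 2
    return indices[:half], indices[half:]


def repeated_splits(n_samples, repetitions=10):
    """One different division per repetition (a different seed each time)."""
    return [split_half(n_samples, seed) for seed in range(repetitions)]


# Hypothetical data set of 100 samples; 10 repetitions as in the quoted setup.
splits = repeated_splits(100)
for fine_tune_idx, test_idx in splits:
    # Fine-tune on fine_tune_idx, evaluate on test_idx, then aggregate
    # the accuracy over the 10 repetitions.
    pass
```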
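Summary
The authors adapt and extend ideas from two papers, one from ICCV 2015 and one from CVPR 2015. The main ideas are weighted fusion and combining high-level and low-level features; it amounts to a rudimentary form of ensemble learning.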