Just finished reading this paper and organized my thoughts. Based on the observation that neurons show different activation levels when learning different features, the paper proposes the DGD (Domain Guided Dropout) method, which is quite clever.
Paper: https://arxiv.org/abs/1604.07528
Code: https://github.com/Cysu/dgd_person_reid
Paper analysis
- At the beginning, the author explains why multiple datasets are used for training:
- Learning generic and robust feature representations with data from multiple domains for the same problem is of great value, especially for the problems that have multiple datasets but none of them are large enough to provide abundant data variations.
- That is, when no single dataset for a problem provides enough data variation on its own, training on multiple datasets can be considered; the datasets "complement" each other.
- The author also adds some background:
- In computer vision, a domain often refers to a dataset where samples follow the same underlying data distribution.
- A domain usually refers to a dataset whose samples follow the same underlying data distribution.
- It’s common that multiple datasets with different data distributions are proposed to target the same or similar problems.
- Datasets with different data distributions are often proposed to target the same or similar problems.
- Multiple-domain learning aims to solve the problem with datasets across different domains simultaneously by using all the data they provide.
- Multiple-domain learning tackles a problem by using all the data provided by datasets across different domains at the same time.
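The multiple-domain learning setup quoted above amounts to pooling all datasets into one training set. A minimal sketch (my own toy data and function names, not code from the paper's repo): for person re-identification, joint learning treats the identities of all domains as one label space, so each domain's identity labels are offset to keep them distinct.

```python
# Toy "domains": each maps a domain name to (sample, identity-label) pairs.
# Purely illustrative data, not from any real dataset.
datasets = {
    "domain_a": [("img_a1", 0), ("img_a2", 1)],
    "domain_b": [("img_b1", 0)],
    "domain_c": [("img_c1", 2), ("img_c2", 0), ("img_c3", 1)],
}

def merge_domains(datasets):
    """Pool all domains into one list of (sample, global_label, domain).

    Identity labels are local to each domain, so we shift each domain's
    labels by a running offset to build one global label space.
    """
    merged, offset = [], 0
    for domain, samples in sorted(datasets.items()):
        n_ids = max(y for _, y in samples) + 1
        merged.extend((x, y + offset, domain) for x, y in samples)
        offset += n_ids
    return merged

pool = merge_domains(datasets)
print(len(pool))  # 6 samples, with 6 distinct global identity labels
```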
- The success of deep learning is driven by the emergence of large-scale learning. Many studies have shown that fine-tuning a deep model pretrained on a large-scale dataset is effective.
- That is, the progress of deep learning has been driven by large-scale learning. Many studies first pretrain a model on a large dataset and then fine-tune it on a task-specific dataset to obtain the final model.
- However, in many specific areas, there is no such large-scale dataset for learning robust and generic feature representation.
- However, not every field has such a large-scale dataset for learning robust and generic feature representations, which is why many research groups have released a number of small datasets instead.
- Therefore, the author argues:
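The pretrain-then-fine-tune recipe described above can be illustrated with a toy model. A hedged sketch using logistic regression on synthetic data in place of a deep CNN (everything below is my own illustration, not the paper's setup): train on a large generic pool first, then continue training from the learned weights on a small target set with a lower learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(w, X, y, lr, steps):
    """Plain gradient descent on the logistic log-loss."""
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)   # gradient step
    return w

# Stage 1: "pretrain" on a large synthetic dataset.
X_big = rng.normal(size=(1000, 5))
y_big = (X_big[:, 0] + X_big[:, 1] > 0).astype(float)
w = train(np.zeros(5), X_big, y_big, lr=0.5, steps=200)

# Stage 2: fine-tune on a small target set, starting from the pretrained
# weights instead of from scratch, with a smaller learning rate.
X_small = rng.normal(size=(40, 5))
y_small = (X_small[:, 0] + X_small[:, 1] > 0).astype(float)
w = train(w, X_small, y_small, lr=0.05, steps=50)

acc = np.mean(((X_small @ w) > 0) == y_small)
```

The point is only the two-stage workflow: stage 2 starts from the stage-1 weights, which is what distinguishes fine-tuning from training from scratch.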
- It