I recently set out to read through all of Andrew Ng's and Hinton's papers.
Andrew Ng's first paper:
The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization
Coates A, Ng A Y. The importance of encoding versus training with sparse coding and vector quantization[C]//Proceedings of the 28th International Conference on Machine Learning (ICML-11). 2011: 921-928.
Background: vector quantization
Vector quantization (VQ) is a method of data compression and encoding; more details can be found online.
The idea of vector quantization is to represent all the points falling in a region by a single point (k-means is one form of VQ), as shown in the figure:
Here, every pair of numbers falling in a particular region is approximated by the red star associated with that region.
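As a minimal sketch of the idea (function names and the toy data are my own), k-means learns a small codebook and then replaces each point by its nearest codeword, i.e. the "red star" of its region:

```python
import numpy as np

def kmeans_vq(points, k, iters=20, seed=0):
    """Minimal k-means vector quantizer: learn k codewords, then
    replace every point by its nearest codeword."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # hard assignment: each point goes to its nearest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update: move each center to the mean of its assigned points
        for j in range(k):
            if (labels == j).any():
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels

# three well-separated 2-D blobs, quantized with k = 3
rng = np.random.default_rng(1)
pts = np.concatenate([rng.normal(c, 0.1, size=(50, 2))
                      for c in ([0, 0], [1, 1], [0, 1])])
centers, labels = kmeans_vq(pts, k=3)
quantized = centers[labels]  # each point replaced by its region's codeword
print(quantized.shape)       # (150, 2)
```

After quantization only k distinct values remain, which is exactly the compression the figure illustrates.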
Abstract
VQ was for a long time one of the main methods for feature encoding, before being gradually displaced by sparse coding (SC). Both encoding methods consist of two phases: a learning phase (learning the dictionary / basis functions used for encoding) and an encoding phase. This paper analyzes why SC outperforms VQ, and what roles the learning and encoding phases each play.
Conclusion: sparse coding's success comes from its effective encoding phase; SC's encoding step, paired with other training procedures, can still achieve very good results.
The implication: when the training set is large, we can choose a fast, simple training algorithm and combine it with an effective encoding strategy.
Introduction
In feature extraction, VQ typically plays a finishing role: it is applied on top of already-extracted low-level features in order to abstract high-level features. This role can now be filled by SC, and with better results. The question is: is that because SC learns a dictionary that better represents the structure of the data, or simply because sparse encoding beats a simple nonlinear feature mapping? Are there training algorithms or encoding strategies that are simpler than sparse coding yet can still compete with it?
D: the dictionary (basis functions) learned in the learning phase -- the training stage
M: the mapping rule, based on D, from an input x to a feature vector f -- the encoding stage
The training and encoding phases do not have to match. For example, using hard assignment during training but soft assignment during encoding performs better than using hard assignment for both.
Recent work also suggests that the choice of basis functions matters far less than we might think: Jarrett et al. [7] showed that even random weights can achieve fairly good classification results, though not as good as learned weights.
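To make the two encoding rules concrete, here is a rough sketch (function names are my own; the Gaussian-kernel form of the soft code follows the spirit of van Gemert et al. [1], with an arbitrary bandwidth beta):

```python
import numpy as np

def hard_assign(x, centers):
    """1-of-K code: only the single nearest codeword fires."""
    d2 = ((centers - x) ** 2).sum(axis=1)
    f = np.zeros(len(centers))
    f[d2.argmin()] = 1.0
    return f

def soft_assign(x, centers, beta=1.0):
    """Soft code: every codeword fires in proportion to a Gaussian
    kernel of its distance from x, normalized to sum to 1."""
    d2 = ((centers - x) ** 2).sum(axis=1)
    w = np.exp(-beta * d2)
    return w / w.sum()

C = np.array([[0.0, 0.0], [1.0, 1.0]])
x = np.array([0.1, 0.0])
print(hard_assign(x, C))             # [1. 0.]
print(soft_assign(x, C).round(2))    # [0.86 0.14]
```

The soft code keeps information about how close x is to every codeword, which the hard code throws away; this is the extra information the encoding phase can exploit.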
Discussion
1) Even with only a small amount of labeled data, sparse coding still performs well.
Useful takeaways
1. Soft assignment beats hard assignment; see references [1], [2].
2. The soft threshold function: used as the nonlinear module in deep architectures.
3. "Locality preserving" encodings: a newer family of encoding strategies [3-6].
4. Patch-based systems: the original images are too large to process whole, so work on patches instead.
5. Separating positive and negative features: "the positive and negative polarities split into separate features".
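Items 2 and 5 combine naturally: the soft-threshold encoder applies a fixed nonlinearity max(0, z - alpha) to the dictionary projections, and splitting polarities doubles the feature count. A rough sketch (function name and the alpha value are my own choices):

```python
import numpy as np

def soft_threshold_encode(x, D, alpha=0.5):
    """Soft-threshold encoding of x against dictionary D:
    f = [max(0, z - alpha), max(0, -z - alpha)] with z = D^T x.
    A fixed nonlinearity on the projections -- no optimization
    problem is solved at encoding time."""
    z = D.T @ x
    # split the positive and negative polarities into separate features
    return np.concatenate([np.maximum(z - alpha, 0.0),
                           np.maximum(-z - alpha, 0.0)])

# toy example: 4-dim input, 3 dictionary atoms -> 6 features
rng = np.random.default_rng(0)
D = rng.normal(size=(4, 3))
x = rng.normal(size=4)
f = soft_threshold_encode(x, D)
print(f.shape)  # (6,)
```

Compared with sparse coding's encoding step, this costs only one matrix-vector product, which is why it pairs well with cheap training on large datasets.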
References
[1] van Gemert, J. C., Geusebroek, J. M., Veenman, C. J., and Smeulders, A. W. M. Kernel codebooks for scene categorization. In European Conference on Computer Vision, 2008.
[2] Agarwal, A. and Triggs, B. Hyperfeatures: multilevel local coding for visual recognition. In European Conference on Computer Vision, 2006.
[3] Yu, K., Zhang, T., and Gong, Y. Nonlinear learning using local coordinate coding. In Advances in Neural Information Processing Systems, 2009.
[4] Yu, K. and Zhang, T. Improved local coordinate coding using local tangents. In International Conference on Machine Learning, 2010.
[5] Yang, J., Yu, K., Gong, Y., and Huang, T. S. Linear spatial pyramid matching using sparse coding for image classification. In Computer Vision and Pattern Recognition, 2009.
[6] Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., and Gong, Y. Locality-constrained linear coding for image classification. In Computer Vision and Pattern Recognition, 2010.
[7] Jarrett, K., Kavukcuoglu, K., Ranzato, M., and LeCun, Y. What is the best multi-stage architecture for object recognition? In International Conference on Computer Vision, 2009.