2.1 Image Classification: K-Nearest Neighbors
Hyperparameters: K
In general, the larger K is, the smoother the decision boundary becomes.
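To make the effect of K concrete, here is a minimal sketch of a K-nearest-neighbor classifier using majority vote. The names `l1_distance` and `knn_predict` and the toy data are illustrative assumptions, not part of the original notes.

```python
from collections import Counter

def l1_distance(a, b):
    """Sum of absolute coordinate differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_x, train_y, query, k=1, distance=l1_distance):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Rank all training points by their distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: distance(train_x[i], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): class "a" near the origin, class "b" far away,
# plus one "b" point at (1, 1) sitting inside the "a" cluster.
train_x = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (1, 1)]
train_y = ["a", "a", "a", "b", "b", "b"]
query = (1.2, 1.2)

pred_k1 = knn_predict(train_x, train_y, query, k=1)  # decided by the single nearest point
pred_k5 = knn_predict(train_x, train_y, query, k=5)  # decided by a vote among five points
print(pred_k1, pred_k5)
```

With K = 1 the lone "b" point at (1, 1) decides the prediction. With K = 5 it is outvoted by the surrounding "a" points, which is the smoothing effect described above.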
Hyperparameters: Distance Metric
- L1 (Manhattan) distance = $\sum_p |I_1^p - I_2^p|$
- L2 (Euclidean) distance = $\sqrt{\sum_p (I_1^p - I_2^p)^2}$
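PS: Rotating the coordinate frame changes L1 distances but leaves L2 distances unchanged. So if individual features in your feature vector carry important meaning, L1 is usually the more natural choice; if it is just a generic vector, use L2.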
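The rotation remark can be checked numerically. The sketch below, with hypothetical helper names, rotates two 2D points by 45 degrees and compares both distances before and after.

```python
import math

def l1_distance(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rotate(point, angle):
    """Rotate a 2D point counterclockwise about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

p, q = (0.0, 0.0), (1.0, 0.0)
angle = math.pi / 4  # 45 degrees
p_rot, q_rot = rotate(p, angle), rotate(q, angle)

l1_before, l1_after = l1_distance(p, q), l1_distance(p_rot, q_rot)
l2_before, l2_after = l2_distance(p, q), l2_distance(p_rot, q_rot)
print("L1:", l1_before, "->", l1_after)
print("L2:", l2_before, "->", l2_after)
```

The L1 distance grows from 1 to about 1.414 (the square root of 2), while the L2 distance stays at 1 up to floating-point error.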
Setting Hyperparameters
Idea #3: Split data into train, val, and test; choose hyperparameters on val and evaluate on test.
Idea #4: Cross-Validation: split data into folds, try each fold as validation, and average the results. (Useful for small datasets, but not used too frequently in deep learning.)
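As a rough sketch of the cross-validation idea, the code below picks K for a small nearest-neighbor classifier by averaging validation accuracy over folds. The function names, the synthetic two-cluster data, and the candidate values of K are all assumptions made for illustration. A real experiment would also hold out a separate test set that is touched only once at the end, which is not shown here.

```python
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (squared L2 distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def cross_val_accuracy(xs, ys, k, num_folds=5):
    """Average validation accuracy of k-NN, using each fold once as validation."""
    n = len(xs)
    accuracies = []
    for fold in range(num_folds):
        # Assign example i to fold (i % num_folds); the rest form the training split.
        val_idx = [i for i in range(n) if i % num_folds == fold]
        train_idx = [i for i in range(n) if i % num_folds != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in val_idx)
        accuracies.append(correct / len(val_idx))
    return sum(accuracies) / num_folds

# Synthetic data (hypothetical): two Gaussian clusters with a fixed random seed.
rng = random.Random(0)
xs, ys = [], []
for label, center in (("a", (0.0, 0.0)), ("b", (3.0, 3.0))):
    for _ in range(30):
        xs.append((rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)))
        ys.append(label)

# Evaluate each candidate K and keep the one with the best average accuracy.
results = {k: cross_val_accuracy(xs, ys, k) for k in (1, 3, 5, 7)}
best_k = max(results, key=results.get)
print(results, "best K:", best_k)
```

Each candidate K costs one training and evaluation pass per fold. That repeated cost is one reason the technique suits small datasets better than large deep learning runs.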
Weaknesses on images
- Very slow at test time
- Distance metrics on pixels are not informative
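- Curse of dimensionality (to keep the training samples densely and evenly spread over the feature space, the number of training samples has to grow exponentially with the dimension)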
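The exponential growth behind the curse of dimensionality can be illustrated with simple arithmetic. In this sketch the function name and the choice of 4 points per axis are assumptions: covering each axis with a fixed number of evenly spaced points requires that number raised to the power of the dimension.

```python
def samples_needed(points_per_axis, dim):
    """Number of grid points needed to cover `dim` axes at a fixed density per axis."""
    return points_per_axis ** dim

# Low-dimensional cases: the count is multiplied by 4 with each extra dimension.
counts = {d: samples_needed(4, d) for d in (1, 2, 3)}
print(counts)

# A 32x32 RGB image has 32 * 32 * 3 = 3072 pixel values, one dimension each.
image_dim = 32 * 32 * 3
digits = len(str(samples_needed(4, image_dim)))
print(f"{image_dim} dimensions would need a sample count with {digits} digits")
```

Even at this coarse density, the count for raw pixel space is far beyond any realistic dataset size, so nearest neighbors in pixel space are rarely close in any useful sense.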