cs231n Lecture Notes

leobobsix

Category: Deep Learning. Tags: deep learning, stanford, cnn, neural networks, youtube

Link to this post: https://blog.csdn.net/liubo7887/article/details/78636523

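I recently signed up for a Deep Learning course on Udacity. Unfortunately, it emphasizes practice over theory, and I struggled with the assignments and projects. So I found this spring's Stanford cs231n (Convolutional Neural Networks for Visual Recognition) lectures on YouTube to catch up on the fundamentals, took brief notes, and added links to the papers mentioned, for future reference. The assignments will be added on GitHub later.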

Lecture 1. Introduction

Lecture slides are here.

Images have been called the "dark matter" of the Internet.
David Marr, 1970s, stages of visual representation:
1. Input image
2. Primal sketch (edge image)
3. 2½-D sketch
4. 3-D model

Face Detection, 2001
Histogram of Gradients (HoG), Dalal & Triggs, 2005
PASCAL Visual Object Challenge (20 object categories)

Modern image recognition problems involve many features and high dimensionality, so algorithms often overfit.

Evolution of the algorithms used in the ImageNet Challenge
- Lin, CVPR 2011: SVM
- Krizhevsky, NIPS 2012: CNN, SuperVision (AlexNet)
- Szegedy, arXiv 2014 (GoogLeNet) / Simonyan, arXiv 2014 (VGG)
- Microsoft Research Asia, 2015: 152-layer Residual Networks

Lecture 2. Image Classification

Lecture slides are here.

Data-Driven Approach

collect a dataset of images and labels
use machine learning to train a classifier
evaluate the classifier on new images

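A machine learning approach to classification generally consists of two functions (steps):

Input function → train (takes images and labels, outputs a model)
Output function → predict (takes the model and test images, outputs labels)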

Common classifier 1: Nearest Neighbor

Memorize all data and labels
Predict the label of the most similar training image

Distance Metric to compare images
L1 distance(Manhattan distance):

$d_1(I_1,I_2)=\sum_{P}\left | I_1^P - I_2^P \right |$
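To make the train/predict split concrete, here is a minimal sketch of a Nearest Neighbor classifier using the L1 distance. The class name `NearestNeighbor`, the helper `l1_distance`, and the toy 2×2 "images" are illustrative assumptions, not code from the course.

```python
def l1_distance(a, b):
    """Sum of absolute pixel-wise differences between two flattened images."""
    return sum(abs(p - q) for p, q in zip(a, b))


class NearestNeighbor:
    def train(self, images, labels):
        # "Training" only memorizes the data; no computation per example.
        self.images = list(images)
        self.labels = list(labels)

    def predict(self, image):
        # Compare the query against every stored image (linear in N).
        best = min(range(len(self.images)),
                   key=lambda i: l1_distance(self.images[i], image))
        return self.labels[best]


# Toy 2x2 grayscale "images", flattened to 4 numbers each (made-up data).
train_images = [[0, 0, 0, 0], [255, 255, 255, 255]]
train_labels = ["dark", "bright"]

clf = NearestNeighbor()
clf.train(train_images, train_labels)
print(clf.predict([10, 20, 0, 5]))  # dark
```

The query is 35 away from the dark image and 985 away from the bright one, so the stored label "dark" is returned.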

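Training is O(1) and prediction is O(N). This is the opposite of what we want: training may take a long time, but prediction should be as fast as possible.
Drawback: inaccurate; noisy points cause misclassification.
Improvement: K-Nearest Neighbors. Given K, take a majority vote among the labels of the K nearest samples as the final prediction.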

Using Euclidean distance as the distance metric
L2 (Euclidean) distance:

$d_2(I_1,I_2)= \sqrt{\sum_{P}(I_1^P - I_2^P)^{2}}$
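A small sketch of K-Nearest Neighbors with the L2 distance and a majority vote may help here. The function names and the 2-D toy points are assumptions chosen for illustration; the point labelled as noisy shows how K=1 can be fooled where K=3 is not.

```python
import math
from collections import Counter


def l2_distance(a, b):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def knn_predict(train_points, train_labels, query, k):
    # Sort training indices by distance to the query and keep the k closest.
    nearest = sorted(range(len(train_points)),
                     key=lambda i: l2_distance(train_points[i], query))[:k]
    # Majority vote among the labels of those k neighbors.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Made-up data: (1, 1) is a noisy "b" sitting inside the "a" cluster.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (1, 1)]
labels = ["a", "a", "a", "b", "b", "b"]
query = (0.9, 0.9)

print(knn_predict(points, labels, query, k=1))  # b (fooled by the noisy point)
print(knn_predict(points, labels, query, k=3))  # a (two of three neighbors are "a")
```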

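Differences between L1 and L2

L1 depends on the choice of coordinate system: rotating the coordinate axes changes the L1 distance.
L2 does not depend on the coordinate system.
If the individual input features have specific meanings, L1 is likely the better fit; if the features are generic with no distinction among them, L2 is likely the better fit.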
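The coordinate dependence can be checked numerically. The sketch below rotates a difference vector by 45 degrees and compares its L1 and L2 norms before and after; the helper names are illustrative assumptions.

```python
import math


def rotate(v, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    x, y = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


def l1_norm(v):
    return sum(abs(c) for c in v)


def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))


diff = (1.0, 0.0)                    # difference between two points
rotated = rotate(diff, math.pi / 4)  # same difference in rotated coordinates

# L1 grows from 1 to about 1.414 (sqrt(2)); L2 stays at 1 up to rounding.
print(l1_norm(diff), l1_norm(rotated))
print(l2_norm(diff), l2_norm(rotated))
```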

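Setting the hyperparameters of K-Nearest Neighbors

× Choose the hyperparameters (such as K) that work best on the training data (BAD: K=1 always fits the training set best)
× Split into a training set and a test set, and choose what works best on the test set (BAD: it only tells you what does well on the test set, not how the model performs on unseen data)
√ Split into training, validation, and test sets
√ Cross-validation (very useful for small datasets, but not commonly used in deep learning)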
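The train/validation/test recipe can be sketched as follows. The 1-D data, the candidate values of K, and the helper names are all made up for illustration; the point is only the workflow: pick K on the validation set, then touch the test set once.

```python
from collections import Counter


def knn_predict(train, query, k):
    # train is a list of (value, label) pairs; distance is |value - query|.
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, data, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)


# Made-up 1-D data; 2.6 is a noisy "b" among the "a" samples.
train = [(0.0, "a"), (1.0, "a"), (2.0, "a"), (2.6, "b"),
         (7.0, "b"), (8.0, "b"), (9.0, "b")]
val = [(2.5, "a"), (1.5, "a"), (7.5, "b"), (8.5, "b")]
test = [(0.5, "a"), (8.2, "b")]

candidates = (1, 3, 5)
for k in candidates:
    # K=1 scores 1.0 on the training set because each point is its own neighbor.
    print(k, accuracy(train, train, k), accuracy(train, val, k))

# Choose K on the validation set, then evaluate on the test set exactly once.
best_k = max(candidates, key=lambda k: accuracy(train, val, k))
print(best_k, accuracy(train, test, best_k))
```

On this toy data K=1 is perfect on the training set but gets only 3 of 4 validation points right, which is why the training set is a poor guide for choosing K.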

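K-Nearest Neighbor is never used on images
- Very slow at test time
- Distance metrics on pixels are not informative
Pixel-wise L2 distance does not track visible changes to an image: differently altered copies (e.g. occluded or shifted) can end up at the same L2 distance from the original
- Curse of dimensionality
The number of samples needed to cover the space densely grows exponentially with the number of dimensions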
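A quick back-of-the-envelope calculation shows the exponential growth. The density of 4 samples per axis is an arbitrary assumption, and `grid_points` is a hypothetical helper; 3072 is the dimension of a flattened 32×32×3 image.

```python
def grid_points(samples_per_axis, dim):
    """Grid points needed when each of `dim` axes gets `samples_per_axis` samples."""
    return samples_per_axis ** dim


for dim in (1, 2, 3, 10):
    print(dim, grid_points(4, dim))  # 4, 16, 64, 1048576

# For 3072 dimensions the count is too large to read, so report how many
# decimal digits it has instead of the number itself.
digits = len(str(grid_points(4, 3072)))
print(digits)
```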

Summary

In image classification we start with a training set of images and labels, and must predict labels on the test set
The K-Nearest Neighbors classifier predicts labels based on the nearest training examples
Distance metric and K are hyperparameters
Choose hyperparameters using the validation set; only run on the test set once at the very end!

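Common classifier 2: Linear Classification

Parametric Approach

An image of size 32 (pixels) × 32 (pixels) × 3 (RGB) is flattened into an array of 3072 numbers, which the weight matrix W maps to scores for 10 given classes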

$f(x) = Wx + b$
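As a rough illustration of the score function, the sketch below uses a tiny case with 4 pixels and 3 classes instead of 3072 pixels and 10 classes. The weights, bias, and pixel values are made-up numbers, and `linear_scores` is a hypothetical helper name.

```python
def linear_scores(W, x, b):
    """Return f(x) = Wx + b, where each row of W holds the weights of one class."""
    return [sum(w * xi for w, xi in zip(row, x)) + bias
            for row, bias in zip(W, b)]


# 3 classes x 4 pixels (in the full setting W would be 10 x 3072).
W = [[0.2, -0.5, 0.1, 2.0],
     [1.5, 1.3, 2.1, 0.0],
     [0.0, 0.25, 0.2, -0.3]]
b = [1.1, 3.2, -1.2]
x = [56, 231, 24, 2]  # flattened image

scores = linear_scores(W, x, b)
predicted_class = max(range(len(scores)), key=lambda j: scores[j])
print(scores)           # roughly [-96.8, 437.9, 60.75]
print(predicted_class)  # 1, the class with the highest score
```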

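The training data is not needed at test time; only the learned parameters are kept.

This method tries to separate the classes with linear boundaries in a high-dimensional space, so it fails on data that is not linearly separable.

How to use a cost function to judge how good W is will be covered in the next lecture.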

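Lecture 3. Loss Functions and Optimization

Lecture slides are here.

Evaluating the weight matrix W

Define a loss function that quantifies how good the classification is
Find the parameters that minimize that function (optimization)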

Loss function

General form:

Suppose the samples of the dataset are written as $\left\{(x_i,y_i)\right\}^{N}_{i=1}$,
where $x_i$ is an image and $y_i$ is its label (an integer).
The loss over the dataset is:

$L = \frac{1}{N} \sum_{i} L_i(f(x_i,W),y_i)$

Multiclass SVM loss (hinge loss):

$L_i = \sum_{j\neq y_i}\max\left ( 0,s_j - s_{y_i} + 1 \right )$
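where:

$s=f(x_i,W)$ is the vector of class scores,

$s_j$ is the score of class $j$ (here an incorrect class, since $j \neq y_i$),

$s_{y_i}$ is the score of the correct class.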
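The hinge loss and the dataset average can be checked with a few lines of code. The score vectors below are assumed toy values for three images and three classes, and the function names are illustrative, not from the course code.

```python
def svm_loss_single(scores, correct_class, margin=1.0):
    """L_i: sum over incorrect classes j of max(0, s_j - s_{y_i} + margin)."""
    correct_score = scores[correct_class]
    return sum(max(0.0, s - correct_score + margin)
               for j, s in enumerate(scores) if j != correct_class)


def svm_loss_dataset(all_scores, labels):
    """L: the mean of the per-example losses L_i."""
    losses = [svm_loss_single(s, y) for s, y in zip(all_scores, labels)]
    return sum(losses) / len(losses)


# One score vector per image (made-up values); labels are correct class indices.
all_scores = [[3.2, 5.1, -1.7],
              [1.3, 4.9, 2.0],
              [2.2, 2.5, -3.1]]
labels = [0, 1, 2]

for s, y in zip(all_scores, labels):
    print(svm_loss_single(s, y))  # roughly 2.9, 0.0, 12.9
print(svm_loss_dataset(all_scores, labels))  # roughly 5.27
```

The second image incurs zero loss because its correct-class score beats every other score by more than the margin of 1.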
