CS231n Image Classification 01

Image Classification

These are my notes on Stanford's CS231n course, Convolutional Neural Networks for Visual Recognition.


The Computer's Task

Given an input image, assign it one label from a fixed set of labels.

  • The Problems:
  1. Semantic gap (the computer only sees a grid of numbers, not the concept)
  2. Viewpoint variation
  3. Illumination
  4. Deformation
  5. Occlusion
  6. Intraclass variation

An image classifier

Unlike sorting a list of numbers, there is no obvious way to hard-code an algorithm for this:

def classify_image(image):
    # Do Some Magic
    return class_label
  • Attempts

[Image: https://s1.ax1x.com/2020/11/07/BIMSmD.png]


Data-Driven Approach

  1. Collect a dataset of images and labels
  2. Use Machine Learning to train a classifier
  3. Evaluate the classifier on new images
  • First classifier: Nearest Neighbor

Training: just memorize all the data and labels

def train(images, labels):
    # Machine Learning!
    return model

Prediction: return the label of the most similar training image

def predict(model, test_images):
    # Use model to predict labels
    return test_labels
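
A minimal sketch of what the two stubs above could look like for a Nearest Neighbor classifier, assuming images are flattened into rows of a NumPy array and compared with the L1 distance. The class name `NearestNeighbor` and the toy pixel values are made up for illustration. Note that training is O(1) while prediction has to scan all N training images.

```python
import numpy as np


class NearestNeighbor:
    """1-nearest-neighbor classifier using the L1 (Manhattan) distance."""

    def train(self, images, labels):
        # "Training" only memorizes the data: images is (N, D), labels is (N,)
        self.train_images = np.asarray(images, dtype=np.float64)
        self.train_labels = np.asarray(labels)

    def predict(self, test_images):
        test_images = np.asarray(test_images, dtype=np.float64)
        predictions = np.empty(len(test_images), dtype=self.train_labels.dtype)
        for i, image in enumerate(test_images):
            # L1 distance from this test image to every training image
            distances = np.abs(self.train_images - image).sum(axis=1)
            # Copy the label of the closest training image
            predictions[i] = self.train_labels[np.argmin(distances)]
        return predictions


# Toy data: four "images" with three pixels each (made-up values)
train_images = np.array([[0, 0, 0], [10, 10, 10], [200, 200, 200], [255, 250, 245]])
train_labels = np.array([0, 0, 1, 1])

classifier = NearestNeighbor()
classifier.train(train_images, train_labels)
predicted = classifier.predict(np.array([[5, 5, 5], [220, 210, 230]]))
print(predicted)  # [0 1]
```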

Example Dataset: CIFAR-10 (10 classes, 50,000 training images and 10,000 test images, each 32×32×3)

[Image: https://s1.ax1x.com/2020/11/07/BIM9TH.png]

Issue: although the retrieved nearest neighbors may look visually similar to the test image, the classifier still makes many errors.


  • Comparison function used

K-Nearest Neighbors Method

L1 (Manhattan) distance: $d_1(I_1, I_2) = \sum\limits_{p} \mid I_1^p - I_2^p \mid$

[Image: https://s1.ax1x.com/2020/11/07/BIKXSx.png]

Sum the absolute pixel-wise differences; the training image with the smallest sum is the most similar one.
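
As a small worked example of the L1 distance, here is a sketch on two made-up 2×2 grayscale images (the pixel values are arbitrary). The arrays use a signed integer type on purpose: with `uint8` pixels the subtraction would wrap around instead of going negative.

```python
import numpy as np

# Two made-up 2x2 grayscale "images"; signed dtype avoids uint8 wraparound
image_a = np.array([[56, 32], [90, 23]], dtype=np.int64)
image_b = np.array([[10, 20], [8, 10]], dtype=np.int64)

# Absolute difference per pixel, then sum over all pixels
pixel_differences = np.abs(image_a - image_b)
l1_distance = pixel_differences.sum()

print(pixel_differences)  # [[46 12] [82 13]]
print(l1_distance)        # 153
```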

Backwards: training is O(1) but prediction is O(N). We would rather have fast prediction and tolerate slow training.

[Image: https://s1.ax1x.com/2020/11/07/BIKjl6.png]

What it looks like

Issues

  1. An isolated yellow point
  2. Noise from a single point (green intruding into blue)

Use K-Nearest Neighbors to smooth this out: instead of copying the label of the single nearest point, take a majority vote among the K closest points.
[Image: https://s1.ax1x.com/2020/11/07/BIMitA.png]
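
A minimal sketch of K-Nearest Neighbors, assuming the prediction is a majority vote among the K closest training points under the L1 distance. The function name `knn_predict` and the toy 2D points are hypothetical. The data contains one noisy point of class 1 inside the class 0 cluster: with K = 1 the noisy point wins, with K = 3 it is outvoted.

```python
from collections import Counter

import numpy as np


def knn_predict(train_points, train_labels, query, k):
    # L1 distance from the query to every training point
    distances = np.abs(train_points - query).sum(axis=1)
    # Indices of the k closest points (stable sort keeps ties in input order)
    nearest = np.argsort(distances, kind="stable")[:k]
    # Majority vote among their labels
    votes = Counter(train_labels[nearest].tolist())
    return votes.most_common(1)[0][0]


train_points = np.array([
    [0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],  # class 0 cluster
    [0.4, 0.4],                                      # noisy class 1 point
    [5.0, 5.0], [6.0, 5.0], [5.0, 6.0],              # class 1 cluster
])
train_labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
query = np.array([0.5, 0.5])

label_k1 = knn_predict(train_points, train_labels, query, k=1)
label_k3 = knn_predict(train_points, train_labels, query, k=3)
print(label_k1, label_k3)  # 1 0
```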


An Alternative Comparison Function
L2 (Euclidean) distance: $d_2(I_1, I_2) = \sqrt{\sum\limits_{p} (I_1^p - I_2^p)^2}$

The L1 distance depends on the choice of coordinate system: rotating the coordinate frame changes L1 distances, whereas L2 distances stay the same. (The set of points at a fixed L2 distance from the origin is a circle, while for L1 it is a square aligned with the axes.)
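
The rotation argument can be checked numerically. This is a small sketch with two made-up 2D points; the helper `rotate` is hypothetical. After a 45° rotation the L1 distance grows from 1 to √2, while the L2 distance stays at 1.

```python
import numpy as np


def rotate(points, degrees):
    # Rotate each row of `points` counter-clockwise around the origin
    theta = np.deg2rad(degrees)
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta), np.cos(theta)]])
    return points @ rotation.T


point_a = np.array([0.0, 0.0])
point_b = np.array([1.0, 0.0])
rotated_a, rotated_b = rotate(np.stack([point_a, point_b]), 45)

l1_before = np.abs(point_a - point_b).sum()
l1_after = np.abs(rotated_a - rotated_b).sum()
l2_before = np.sqrt(((point_a - point_b) ** 2).sum())
l2_after = np.sqrt(((rotated_a - rotated_b) ** 2).sum())

print(l1_before, l1_after)  # L1 changes under rotation
print(l2_before, l2_after)  # L2 does not
```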


Hyperparameters
  • What's the best value of k?
  • What's the best distance to use? (L1, L2, or something else)

These are hyperparameters: choices about the algorithm that we set ourselves rather than learn from the data.

They are very problem-dependent, so we have to try them out and see what works best. But how?

[Image: https://s1.ax1x.com/2020/11/07/BIME1P.png]

Training and validation must never touch the test data: split the data into train, validation, and test sets, and use the test set only once at the very end.
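
One way to keep the test data separate is to split the example indices once, up front. This is a sketch only: the function name `split_dataset`, the 80/10/10 proportions, and the fixed seed are assumptions, not something prescribed by the course.

```python
import numpy as np


def split_dataset(num_examples, val_fraction=0.1, test_fraction=0.1, seed=0):
    # Shuffle the indices once so the three sets are disjoint random subsets
    rng = np.random.default_rng(seed)
    indices = rng.permutation(num_examples)
    num_test = int(num_examples * test_fraction)
    num_val = int(num_examples * val_fraction)
    test_idx = indices[:num_test]
    val_idx = indices[num_test:num_test + num_val]
    train_idx = indices[num_test + num_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_dataset(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
```

Hyperparameters are then tuned using only `train_idx` and `val_idx`; `test_idx` is evaluated a single time with the final choice.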

  • Cross-Validation: split the data into folds, use each fold in turn as the validation set, and average the results

[Image: https://s1.ax1x.com/2020/11/07/BIMApt.png]

  • Validation process

[Image: https://s1.ax1x.com/2020/11/07/BIMV6f.png]

Use the validation data to choose the best hyperparameters.
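
A rough sketch of choosing k with 5-fold cross-validation. Everything here is illustrative: the synthetic two-cluster data, the helper names `knn_accuracy` and `cross_validate`, and the candidate values of k are assumptions. Each fold serves once as the validation set while the remaining folds are used for training, and the accuracies are averaged per k.

```python
import numpy as np


def knn_accuracy(train_x, train_y, val_x, val_y, k):
    # Fraction of validation points whose k-NN majority vote is correct
    correct = 0
    for x, y in zip(val_x, val_y):
        distances = np.abs(train_x - x).sum(axis=1)
        nearest_labels = train_y[np.argsort(distances, kind="stable")[:k]]
        predicted = np.bincount(nearest_labels).argmax()
        correct += int(predicted == y)
    return correct / len(val_y)


def cross_validate(x, y, k_choices, num_folds=5):
    x_folds = np.array_split(x, num_folds)
    y_folds = np.array_split(y, num_folds)
    mean_accuracy = {}
    for k in k_choices:
        fold_accuracies = []
        for i in range(num_folds):
            # Fold i is held out for validation; the rest is training data
            train_x = np.concatenate(x_folds[:i] + x_folds[i + 1:])
            train_y = np.concatenate(y_folds[:i] + y_folds[i + 1:])
            fold_accuracies.append(
                knn_accuracy(train_x, train_y, x_folds[i], y_folds[i], k))
        mean_accuracy[k] = float(np.mean(fold_accuracies))
    return mean_accuracy


# Synthetic data: two Gaussian clusters in 2D, shuffled
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, size=(50, 2)),
                    rng.normal(3.0, 1.0, size=(50, 2))])
y = np.concatenate([np.zeros(50, dtype=np.int64), np.ones(50, dtype=np.int64)])
shuffle = rng.permutation(len(x))
x, y = x[shuffle], y[shuffle]

mean_accuracy = cross_validate(x, y, k_choices=[1, 3, 5, 7])
best_k = max(mean_accuracy, key=mean_accuracy.get)
print(mean_accuracy, best_k)
```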

[Image: https://s1.ax1x.com/2020/11/07/BIMu7Q.png]

Because the distance only sums per-pixel differences, images that differ from the original in very different ways can all have the same L2 distance to it, so pixel-wise distances are not very informative about visual similarity.
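
A tiny made-up illustration of this effect: one variant changes a single pixel by a lot, the other changes every pixel slightly, yet both end up at exactly the same L2 distance from the original. The 4×4 all-zero image is an arbitrary choice for the sketch.

```python
import numpy as np

original = np.zeros((4, 4))

# Variant 1: one pixel changed by 4 (a local, clearly visible change)
local_change = original.copy()
local_change[0, 0] += 4.0

# Variant 2: every one of the 16 pixels changed by 1 (a global, subtle change)
global_change = original + 1.0


def l2_distance(image_a, image_b):
    return np.sqrt(((image_a - image_b) ** 2).sum())


distance_local = l2_distance(original, local_change)
distance_global = l2_distance(original, global_change)
print(distance_local, distance_global)  # 4.0 4.0
```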


Linear Classification

  • Parametric Model
    [Image: https://s1.ax1x.com/2020/11/07/BIMZX8.png]

$f(x, W) = Wx + b$

For CIFAR-10, f(x, W) must be a 10×1 vector of class scores, and x is a flattened 32×32×3 image, i.e. 3072×1, so W must be 10×3072. A 10×1 bias b is often added to give each class an offset that does not depend on the input.
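
The shapes can be checked with a short sketch. The weights and the input below are random placeholders rather than trained or real values, and the function name `f` simply mirrors the formula.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, num_pixels = 10, 32 * 32 * 3   # 3072 values per flattened image

x = rng.random((num_pixels, 1))                            # flattened image, 3072x1
W = rng.normal(0.0, 0.01, size=(num_classes, num_pixels))  # weights, 10x3072
b = np.zeros((num_classes, 1))                             # bias, 10x1


def f(x, W, b):
    # One score per class
    return W @ x + b


scores = f(x, W, b)
predicted_class = int(np.argmax(scores))
print(scores.shape)  # (10, 1)
```

Each row of W can be read as a template for one class: the score is the inner product of that template with the image, plus the class bias.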

[Image: https://s1.ax1x.com/2020/11/07/BIMn0g.png]

[Image: https://s1.ax1x.com/2020/11/07/BIMMkj.png]

Geometrically, each class gets a single linear decision boundary in pixel (RGB value) space that separates it from the other classes.

But how can we tell how good a given W is?
(See the next lecture.)

  • Problems
    [Image: https://s1.ax1x.com/2020/11/07/BIMQts.png]

Because the classifier is linear, the problem is obvious: it cannot separate classes that are not linearly separable.
