CS231n Study Notes--11. Detection and Segmentation

1. Computer Vision Task


2. Semantic Segmentation

2.1 Characteristics:
a. Label each pixel in the image with a category label
b. Don’t differentiate instances, only care about pixels
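To make the two points above concrete, here is a minimal NumPy sketch, assuming the network has already produced a (C, H, W) array of per-class scores. The function name `segment` and the toy scores are illustrative only, not from the lecture.

```python
import numpy as np

def segment(scores):
    """Turn per-class scores of shape (C, H, W) into an (H, W) label map.

    Each pixel gets the index of its highest-scoring class.
    """
    return np.argmax(scores, axis=0)

# Toy scores: 3 classes on a 2x2 image (hypothetical values).
scores = np.zeros((3, 2, 2))
scores[0, 0, 0] = 1.0   # top-left pixel -> class 0
scores[1, 0, 1] = 1.0   # top-right pixel -> class 1
scores[2, 1, 0] = 1.0   # bottom-left pixel -> class 2
scores[2, 1, 1] = 1.0   # bottom-right pixel -> class 2

labels = segment(scores)
print(labels)
```

The two bottom pixels both receive label 2 whether or not they belong to the same object, which is exactly the "no instances, only pixels" property.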

2.2 Approaches:

a. Semantic Segmentation Idea: Sliding Window


b. Semantic Segmentation Idea: Fully Convolutional


2.3 Upsampling:

Max Unpooling


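This kind of upsampling works because the goal is not to produce a good-looking super-resolution image, but to preserve the spatial structure of the pixels as faithfully as possible.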
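A minimal NumPy sketch of max unpooling, assuming 2x2 pooling with stride 2 on a single-channel map with even height and width. The helper names `max_pool_2x2` and `max_unpool_2x2` and the sample numbers are illustrative. The pooling step remembers where each maximum came from, and the unpooling step writes values back to those positions and fills the rest with zeros.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on an (H, W) array (H and W even).

    Returns the pooled map and, for each pooled value, the flat index
    of its position in the original array.
    """
    H, W = x.shape
    # Rearrange so that each 2x2 block becomes a vector of 4 values.
    blocks = x.reshape(H // 2, 2, W // 2, 2).transpose(0, 2, 1, 3)
    blocks = blocks.reshape(H // 2, W // 2, 4)
    arg = blocks.argmax(axis=2)          # position inside the block, 0..3
    pooled = blocks.max(axis=2)
    rows = 2 * np.arange(H // 2)[:, None] + arg // 2
    cols = 2 * np.arange(W // 2)[None, :] + arg % 2
    return pooled, rows * W + cols

def max_unpool_2x2(y, idx, out_shape):
    """Place each value of y at its remembered position; zeros elsewhere."""
    out = np.zeros(out_shape[0] * out_shape[1], dtype=y.dtype)
    out[idx.ravel()] = y.ravel()
    return out.reshape(out_shape)

x = np.array([[1, 2, 6, 3],
              [3, 5, 2, 1],
              [1, 2, 2, 1],
              [7, 3, 4, 8]])
pooled, idx = max_pool_2x2(x)

# Later in the network, a 2x2 map is unpooled using the saved positions.
y = np.array([[1, 2],
              [3, 4]])
unpooled = max_unpool_2x2(y, idx, x.shape)
print(pooled)
print(unpooled)
```

In a real network the saved indices come from the matching pooling layer in the downsampling half, so the upsampled values land where the strongest activations originally were.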

Transpose Convolution

Diagram of the algorithm:


1D Example:
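The 1D case can be sketched in a few lines of NumPy. This is a minimal illustration, assuming stride 2 and no output cropping; `conv_transpose_1d` is a hypothetical helper name. Each input element scales a copy of the filter, the copies are placed `stride` positions apart, and overlapping entries are summed.

```python
import numpy as np

def conv_transpose_1d(x, w, stride=2):
    """1D transpose convolution: each input value weights a copy of the filter."""
    n, k = len(x), len(w)
    out = np.zeros((n - 1) * stride + k)
    for i, xi in enumerate(x):
        # Copy of the filter scaled by x[i], shifted by i * stride.
        out[i * stride : i * stride + k] += xi * w
    return out

x = np.array([1.0, 2.0])            # input [a, b]
w = np.array([10.0, 20.0, 30.0])    # filter [x, y, z]
out = conv_transpose_1d(x, w)
print(out)  # [a*x, a*y, a*z + b*x, b*y, b*z]
```

With a = 1, b = 2 and filter (10, 20, 30), the middle output is 30 + 20 = 50, the single position where the two scaled copies overlap.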


3. Classification + Localization

Diagram:


Human Pose Estimation

Goal:


Diagram:


4. Object Detection as Classification

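Problem with sliding-window search: the CNN has to be applied to a huge number of locations and scales, which is very computationally expensive.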
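A rough sense of the cost can be had by counting crops. The sketch below is only a back-of-the-envelope illustration: the image size, window sizes, and stride are made-up values, and `count_windows` is a hypothetical helper.

```python
def count_windows(img_h, img_w, win_sizes, stride):
    """Count the crops a sliding-window detector would have to classify."""
    total = 0
    for win_h, win_w in win_sizes:
        if win_h > img_h or win_w > img_w:
            continue  # this window does not fit in the image
        n_rows = (img_h - win_h) // stride + 1
        n_cols = (img_w - win_w) // stride + 1
        total += n_rows * n_cols
    return total

# Hypothetical setting: 224x224 image, two square window sizes, stride 8.
n = count_windows(224, 224, [(64, 64), (128, 128)], stride=8)
print(n)
```

Even this coarse setting gives 610 crops, each needing its own CNN forward pass; finer strides, more scales, and more aspect ratios multiply the count quickly.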


Region Proposals:


How R-CNN works:


R-CNN: Problems

  1. Ad hoc training objectives
    • Fine-tune network with softmax classifier (log loss)
    • Train post-hoc linear SVMs (hinge loss)
    • Train post-hoc bounding-box regressions (least squares)
  2. Training is slow (84h), takes a lot of disk space
  3. Inference (detection) is slow
    • 47s / image with VGG16 [Simonyan & Zisserman. ICLR15]
    • Fixed by SPP-net [He et al. ECCV14]

Fast R-CNN

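The RoIs are extracted after the feature map of the whole image has been computed, which avoids a large amount of repeated feature computation.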



Faster R-CNN: RoI Pooling
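A minimal NumPy sketch of RoI max pooling, assuming the RoI has already been projected and snapped to integer feature-map coordinates and is at least `out_size` cells wide and tall. The name `roi_max_pool` and the toy feature map are illustrative; real implementations differ in how they round the bin boundaries.

```python
import numpy as np

def roi_max_pool(feat, roi, out_size=2):
    """Max-pool one RoI of a (C, H, W) feature map to (C, out_size, out_size).

    roi = (x0, y0, x1, y1) in feature-map coordinates, end-exclusive.
    """
    x0, y0, x1, y1 = roi
    region = feat[:, y0:y1, x0:x1]
    C, h, w = region.shape
    # Split the region into a roughly equal out_size x out_size grid.
    ys = np.linspace(0, h, out_size + 1).astype(int)
    xs = np.linspace(0, w, out_size + 1).astype(int)
    out = np.zeros((C, out_size, out_size), dtype=feat.dtype)
    for i in range(out_size):
        for j in range(out_size):
            cell = region[:, ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            out[:, i, j] = cell.max(axis=(1, 2))
    return out

# Toy feature map: 1 channel, 6x6, values 0..35.
feat = np.arange(36).reshape(1, 6, 6)
pooled = roi_max_pool(feat, roi=(1, 1, 5, 4), out_size=2)
print(pooled)
```

Every RoI, whatever its size, comes out with the same fixed shape, so all of them can share one convolutional feature map and be fed to the same fully connected layers.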


A Region Proposal Network (RPN) is inserted after the convolutional layers to predict RoIs from the feature map:


Detection without Proposals: YOLO / SSD

Region localization and object classification are done together in a single pass over the image:


Object Detection: Lots of variables …


Aside: Object Detection + Captioning = Dense Captioning


Architecture:


Mask R-CNN

Adds a mask branch:



Mask R-CNN Also does pose


Results:

