deeplearning.ai - Convolutional Neural Networks

Convolutional Neural Networks
Andrew Ng

Computer Vision Problems

  • Image Classification
  • Object Detection
  • Neural Style Transfer

Vertical edge detection

  • filter (usually of odd size, e.g. 3 × 3)

    vertical edge detection filter (rows of 1, 0, −1):

      1 0 −1
      1 0 −1
      1 0 −1

  • convolution operation, denoted ∗

  • a vertical edge: bright pixels on the left and dark pixels on the right (see the numpy sketch below)
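
As a quick illustration of the bullets above, here is a minimal numpy sketch of the classic 6×6 example: a bright left half and a dark right half, convolved (valid, stride 1) with the vertical edge detection filter. The image values and the helper function are illustrative, not taken from the original notes.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation (no kernel flip), as used in the lecture."""
    n, f = image.shape[0], kernel.shape[0]
    out = np.zeros((n - f + 1, n - f + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + f, j:j + f] * kernel)
    return out

# 3x3 vertical edge detection filter
vertical_filter = np.array([[1, 0, -1],
                            [1, 0, -1],
                            [1, 0, -1]])

# 6x6 image: bright (10) on the left, dark (0) on the right
image = np.array([[10, 10, 10, 0, 0, 0]] * 6)

print(conv2d(image, vertical_filter))
# The middle columns of the 4x4 output are large (30), marking the vertical edge.
```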

Padding

  • (n × n) ∗ (f × f) → (n − f + 1) × (n − f + 1)

    • output will shrink
    • pixels in the corners are used only once, so we lose information near the edges of the image
  • padding solves both of the problems above

    • with an additional border of one pixel all around the edges
    • pad with zeros by convention (see the np.pad sketch after this list)
    • so the output will be (n + 2p − f + 1) × (n + 2p − f + 1)
  • Valid Convolution: no padding (p = 0)

  • Same Convolution: Pad so that output size is the same as the input size

  • f is usually odd

    • makes Same Convolution convenient: the required padding p = (f − 1)/2 is then an integer
    • the filter has a central pixel
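
A one-line way to try the zero padding described above, using numpy's np.pad (the default constant value is 0); the 4×4 example image is hypothetical:

```python
import numpy as np

image = np.arange(16).reshape(4, 4)            # a 4 x 4 "image"
padded = np.pad(image, 1, mode='constant')     # border of one pixel, zeros by convention
print(padded.shape)                            # (6, 6): n + 2p = 4 + 2*1
```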

Strided Convolution

  • stride: the number of positions the filter moves at each step
  • output: (⌊(n − f)/s⌋ + 1) × (⌊(n − f)/s⌋ + 1), or (⌊(n + 2p − f)/s⌋ + 1) per side with padding (see the helper sketch after this list)
  • the filter must lie entirely within the image (plus the padding region)
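
Combining the padding and stride formulas above, a small helper (a sketch, not from the notes) that computes the output side length:

```python
import math

def conv_output_size(n, f, p=0, s=1):
    """Output side length of an n x n input convolved with an f x f filter,
    padding p and stride s: floor((n + 2p - f) / s) + 1."""
    return math.floor((n + 2 * p - f) / s) + 1

print(conv_output_size(6, 3))            # valid convolution: 4
print(conv_output_size(6, 3, p=1))       # same convolution (p = (f-1)/2): 6
print(conv_output_size(7, 3, p=0, s=2))  # strided convolution: 3
```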

Cross-correlation VS. Convolution

  • in mathematics, true convolution flips the filter (a 180° rotation, i.e. mirrored both horizontally and vertically) before the element-wise products are summed
  • in ML we usually skip the flipping; strictly speaking the operation is cross-correlation, but by convention we call it convolution (see the scipy sketch below)
  • true convolution satisfies associativity
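
A scipy-based sketch of the difference: mathematical convolution equals cross-correlation with a 180°-flipped kernel, and the two differ for non-symmetric kernels. The kernel and image below are arbitrary illustrative values.

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

kernel = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
image = np.random.rand(5, 5)

# True (mathematical) convolution flips the kernel 180 degrees first;
# cross-correlation slides it as-is.
conv = convolve2d(image, kernel, mode='valid')
corr = correlate2d(image, kernel, mode='valid')
flipped = correlate2d(image, np.flip(kernel), mode='valid')

print(np.allclose(conv, flipped))  # True: convolution == correlation with a flipped kernel
print(np.allclose(conv, corr))     # False in general (only equal for symmetric kernels)
```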

Convolution on RGB images

  • height, width, channels (depth)

  • the image and the filter must have the same number of channels

  • n × n × n_c ∗ f × f × n_c → (n − f + 1) × (n − f + 1) × n_c′

    n_c: number of channels; n_c′: number of filters

  • detects n_c′ different features, one per filter (see the sketch after this list)
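
A naive sketch of the dimension rule above, assuming stride 1 and no padding; the loop-based implementation and the random data are purely illustrative:

```python
import numpy as np

def conv_volume(image, filters):
    """Convolve an n x n x n_c image with n_c' filters of shape f x f x n_c.
    Returns an (n-f+1) x (n-f+1) x n_c' volume (valid, stride 1)."""
    n, _, n_c = image.shape
    n_filters, f = filters.shape[0], filters.shape[1]
    assert filters.shape[3] == n_c, "image and filters must have the same channel count"
    out = np.zeros((n - f + 1, n - f + 1, n_filters))
    for k in range(n_filters):                      # one output channel per filter
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j, k] = np.sum(image[i:i + f, j:j + f, :] * filters[k])
    return out

image = np.random.rand(6, 6, 3)                     # RGB image
filters = np.random.rand(2, 3, 3, 3)                # n_c' = 2 filters, each 3 x 3 x 3
print(conv_volume(image, filters).shape)            # (4, 4, 2)
```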

Example of a layer

  • add a bias to each filter's output, then apply a non-linearity (activation function)
  • the relatively small number of parameters makes the layer less prone to overfitting (a parameter-count sketch follows this list)
  • the output of the previous layer is the input to this layer
  • notation for layer l: filter size f^[l], padding p^[l], stride s^[l], number of filters n_c^[l] (the original figure is omitted)
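
One reason the layer is less prone to overfitting: the parameter count depends only on the filter size and the number of filters, not on the size of the input image. A tiny sketch; the 3×3×3, 10-filter numbers are an illustrative example, counted with one bias per filter:

```python
def conv_layer_params(f, n_c_prev, n_filters):
    """Weights plus one bias per filter: (f * f * n_c_prev + 1) * n_filters."""
    return (f * f * n_c_prev + 1) * n_filters

# e.g. ten 3x3 filters over a 3-channel input: 280 parameters,
# independent of the height and width of the input image
print(conv_layer_params(f=3, n_c_prev=3, n_filters=10))  # 280
```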

A simple convolution network example


  • Convolutional Layer (Conv)

  • Pooling Layer (Pool)

  • Fully Connected Layer (FC)

Pooling layer

  • reduce the size of the representation to speed up computation and make some of the detected features more robust
  • no parameters to learn: pooling is a fixed function with no weights
  • the pooled output is eventually flattened into a column vector
Max pooling
  • break the input into different regions
  • output the maximum value of each region
  • typical hyper-parameters for max pooling: f = 2, s = 2
  • usually does not use any padding
  • a feature detected within some region is preserved in the pooled output
  • if the feature is detected anywhere inside the region covered by the filter, the output keeps a high number
  • max pooling is computed independently for each channel
Average pooling
  • output the average of each region
  • max pooling is used much more often than average pooling (a minimal max-pooling sketch follows below)
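
A minimal numpy sketch of max pooling with the typical f = 2, s = 2, computed independently per channel as noted above; the input is random illustrative data:

```python
import numpy as np

def max_pool(x, f=2, s=2):
    """Max pooling with filter size f and stride s, applied independently per channel."""
    n_h, n_w, n_c = x.shape
    out_h, out_w = (n_h - f) // s + 1, (n_w - f) // s + 1
    out = np.zeros((out_h, out_w, n_c))
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * s:i * s + f, j * s:j * s + f, :]
            out[i, j, :] = window.max(axis=(0, 1))   # max over each channel's region
    return out

x = np.random.rand(4, 4, 3)
print(max_pool(x).shape)   # (2, 2, 3): f=2, s=2 halves height and width
```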

Neural network example

digit recognition

  • f = 2, s = 2 halves the height and width of the input

  • two conventions for counting layers

    a convolutional layer and the pooling layer that follows it can be counted together as one layer, or as two separate layers

  • when counting the number of layers in a network, usually only the layers with weights are counted

  • the flattened pooling output is densely connected to every unit of the fully connected layer

  • do not invent your own hyper-parameter settings; look at what has worked in the literature

  • as the network gets deeper, the height and width decrease while the number of channels increases (a shape-tracking sketch follows below)

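The figures with the exact layer dimensions are omitted; the sketch below assumes a LeNet-5-style set of sizes (32×32×3 input, 5×5 convolutions, 2×2 max pooling) purely to show how the height and width shrink while the channel count grows:

```python
def conv_out(n, f, p=0, s=1):
    """floor((n + 2p - f) / s) + 1"""
    return (n + 2 * p - f) // s + 1

# Hypothetical LeNet-5-style dimensions (the exact figures are omitted):
n = 32                           # 32 x 32 x 3 input image
n = conv_out(n, f=5)             # CONV1, 6 filters      -> 28 x 28 x 6
n = conv_out(n, f=2, s=2)        # POOL1, max, f=2, s=2  -> 14 x 14 x 6
n = conv_out(n, f=5)             # CONV2, 16 filters     -> 10 x 10 x 16
n = conv_out(n, f=2, s=2)        # POOL2                 ->  5 x  5 x 16
print(n * n * 16)                # 400 values flattened into the FC layers
```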

Why convolution

  • parameter sharing and sparsity of connections
  • a feature detector that is useful in one part of the image is useful across the entire image (parameter sharing)
  • each output value depends on only a small subset of the inputs (sparse connections)
  • good at capturing translation invariance
  • even if the image is shifted by a few pixels, it still yields features similar to the original (see the parameter comparison below)
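
A back-of-the-envelope comparison that makes the point concrete; the 32×32×3 to 28×28×6 shapes are an illustrative example, not taken verbatim from these notes:

```python
# Parameter sharing and sparse connections keep convolutional layers small.
n_in = 32 * 32 * 3            # 3072 input values
n_out = 28 * 28 * 6           # 4704 output values

fc_params = n_in * n_out + n_out          # fully connected layer: ~14.5 million parameters
conv_params = (5 * 5 * 3 + 1) * 6         # six 5x5x3 filters plus biases: 456 parameters

print(fc_params, conv_params)
```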