Deep Learning -- Class 2

  • basics
    • deep L-layer neural network
      • forward propagation (vectorized)
        • Z[l] = W[l] A[l-1] + B[l]
        • A[l] = g[l](Z[l])
      • get the matrix dimensions right
        • layer 1 example: Z[1] = W[1] X + B[1]
          • shapes: Z[1]: (n[1], 1); W[1]: (n[1], n[0]); X: (n[0], 1); B[1]: (n[1], 1)
        • single example vs. vectorized over m examples (shape check sketched below)
          • z[l], a[l] --> (n[l], 1)
          • Z[l], A[l] --> (n[l], m)
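A minimal numpy sketch of the vectorized forward pass with these shapes; the layer sizes, ReLU/sigmoid activations, and the 0.01 scaling are illustrative assumptions, not from the notes:

```python
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

np.random.seed(0)
layer_dims = [4, 5, 3, 1]          # n[0]..n[3], chosen arbitrarily
m = 10                             # number of training examples

# parameters: W[l] is (n[l], n[l-1]), B[l] is (n[l], 1)
W = {l: np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
     for l in range(1, len(layer_dims))}
B = {l: np.zeros((layer_dims[l], 1)) for l in range(1, len(layer_dims))}

A = np.random.randn(layer_dims[0], m)   # X, examples stacked column-wise: (n[0], m)
L = len(layer_dims) - 1
for l in range(1, L + 1):
    Z = W[l] @ A + B[l]                 # B[l] broadcasts over the m columns
    assert Z.shape == (layer_dims[l], m)
    A = relu(Z) if l < L else sigmoid(Z)
print(A.shape)                          # (1, m)
```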
      • why deep representations?
        • increasing depth (more hidden layers) is often more effective than increasing width (more units per layer): a deep network can compute some functions compactly that a shallow one would need exponentially many units to match
      • building blocks
        • a forward and a backward function per layer

      • forward propagation for layer l
        • input a[l-1]; output a[l]; cache z[l]
      • backward propagation for layer l
        • input da[l]; output da[l-1], dW[l], dB[l] (per-layer sketch below)
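A sketch of the two per-layer building blocks, assuming g[l] = ReLU; the function names are mine, not from the course:

```python
import numpy as np

def layer_forward(A_prev, W, B):
    """Input a[l-1]; output a[l] plus a cache holding z[l] (and the inputs)."""
    Z = W @ A_prev + B
    A = np.maximum(0, Z)                # g[l] = ReLU here
    cache = (A_prev, W, Z)
    return A, cache

def layer_backward(dA, cache):
    """Input da[l]; output da[l-1], dW[l], dB[l], using the cached z[l]."""
    A_prev, W, Z = cache
    m = A_prev.shape[1]
    dZ = dA * (Z > 0)                   # elementwise g'[l] for ReLU
    dW = (dZ @ A_prev.T) / m
    dB = dZ.sum(axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, dB
```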

      • summary

      • hyperparameters and parameters (see the sketch after this list)
        • hyperparameters -- set by hand, tuned on the dev set
          • learning rate alpha
          • number of hidden layers L
          • number of iterations
        • parameters -- learned by gradient descent
          • W1, B1, W2, B2, ...
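A small sketch of the split in code, with made-up values: hyperparameters are chosen before training; parameters are what gradient descent updates:

```python
import numpy as np

# hyperparameters: chosen by you, tuned on the dev set
alpha = 0.01                 # learning rate
layer_dims = [4, 5, 1]       # number and size of layers
num_iterations = 1000

# parameters: initialized once, then learned by gradient descent
params = {}
for l in range(1, len(layer_dims)):
    params[f"W{l}"] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    params[f"B{l}"] = np.zeros((layer_dims[l], 1))
```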
    • machine learning
      • training / dev (hold-out cross-validation) / test sets
        • traditional split: 60/20/20
        • with large datasets: just make sure the dev and test sets are big enough in absolute terms -- not 20%, maybe 1% (10,000 of 1,000,000 examples) or even 0.1% (split sketched below)
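For instance, with m = 1,000,000 a 98/1/1 split still leaves 10,000 examples each for dev and test; a quick sketch (the feature count is arbitrary):

```python
import numpy as np

m = 1_000_000
X = np.random.randn(10, m)                 # toy dataset, examples column-wise
idx = np.random.permutation(m)             # shuffle before splitting
n_dev = n_test = m // 100                  # 1% each
dev, test, train = idx[:n_dev], idx[n_dev:n_dev + n_test], idx[n_dev + n_test:]
X_train, X_dev, X_test = X[:, train], X[:, dev], X[:, test]
print(X_train.shape[1], X_dev.shape[1], X_test.shape[1])  # 980000 10000 10000
```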
      • bias / variance
        • high bias ~ underfitting the training set; high variance ~ overfitting (large train-to-dev error gap)

      • basic recipe
        • high bias (look at training-set performance): try a bigger network or train longer
        • high variance, i.e. overfitting (look at dev-set performance): get more data or add regularization (decision logic sketched below)
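The recipe is a loop: fix bias first, then variance. A sketch of just the decision logic; the thresholds are made-up placeholders:

```python
def basic_recipe_step(train_err, dev_err, target_err=0.05, gap_tol=0.02):
    """Return the next action; target_err and gap_tol are illustrative."""
    if train_err > target_err:                 # poor training-set performance
        return "high bias: bigger network or train longer"
    if dev_err - train_err > gap_tol:          # poor dev-set performance
        return "high variance: more data or regularization"
    return "done"

print(basic_recipe_step(train_err=0.15, dev_err=0.16))  # tackle bias first
print(basic_recipe_step(train_err=0.02, dev_err=0.11))  # then variance
```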
      • regularization -- to avoid overfitting
        • L2 regularization: penalize large weights to keep W "appropriate"
          • regularize W but usually not B: W has far more entries than B, so the bias terms barely matter
        • in a neural network: add (lambda / 2m) * sum over l of ||W[l]||_F^2 (Frobenius norm) to the cost; backprop then adds (lambda / m) W[l] to each dW[l] ("weight decay") -- sketched below
        • why does it prevent overfitting? a large lambda drives W[l] toward 0, effectively simplifying the network toward a smaller, more nearly linear one
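A sketch of both pieces, assuming a cross-entropy cost and parameters stored in dicts keyed by layer:

```python
import numpy as np

def l2_cost(AL, Y, W, lambd):
    """Cross-entropy cost plus (lambda / 2m) * sum_l ||W[l]||_F^2."""
    m = Y.shape[1]
    cross_entropy = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    l2_penalty = (lambd / (2 * m)) * sum(np.sum(Wl ** 2) for Wl in W.values())
    return cross_entropy + l2_penalty

def add_weight_decay(dW, W, lambd, m):
    """Backward pass: each dW[l] gains an extra (lambda / m) * W[l] term."""
    return {l: dW[l] + (lambd / m) * W[l] for l in dW}
```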

        • dropout regularization

          • inverted dropout
            • keep_prob per layer: e.g. 0.8, 0.9, or 1 (1 = no dropout)
          • understanding
            • a unit can't rely on any single feature, so it has to spread out its weights
            • for a layer prone to overfitting, set keep_prob lower, maybe 0.5 (sketch below)
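A minimal sketch of inverted dropout on one layer's activations; dividing by keep_prob keeps the expected activation unchanged, which is why no extra scaling is needed at test time (where dropout is simply turned off):

```python
import numpy as np

def inverted_dropout(A, keep_prob=0.8):
    D = np.random.rand(*A.shape) < keep_prob   # keep mask, True w.p. keep_prob
    A = A * D                                  # shut off the dropped units
    A = A / keep_prob                          # scale up so E[A] is unchanged
    return A, D                                # cache D to reuse in backprop

A = np.random.randn(5, 10)
A_drop, D = inverted_dropout(A, keep_prob=0.8)
```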
      • other ways to avoid overfitting
        • flip the image horizontally
        • take random distortions and translations of the image (augmentation sketch below)
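These augmentations are cheap array operations; a minimal sketch assuming images stored as height x width x channels:

```python
import numpy as np

img = np.random.rand(32, 32, 3)      # toy image: height x width x channels
flipped = img[:, ::-1, :]            # reverse the width axis = horizontal flip

# a crude random translation: crop a random 28x28 window from the 32x32 image
top, left = np.random.randint(0, 5, size=2)
cropped = img[top:top + 28, left:left + 28, :]
```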

        • early stopping: stop training when dev-set error stops improving (sketch below)
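A sketch of the early-stopping loop; `train_one_step` and `dev_error` are hypothetical callables standing in for your own training and evaluation code:

```python
def early_stopping(params, train_one_step, dev_error, max_iters=1000, patience=10):
    """Stop when dev error hasn't improved for `patience` iterations."""
    best_err, best_params, since_best = float("inf"), params, 0
    for _ in range(max_iters):
        params = train_one_step(params)
        err = dev_error(params)
        if err < best_err:
            best_err, best_params, since_best = err, params, 0
        else:
            since_best += 1
            if since_best >= patience:
                break                    # dev error stopped improving
    return best_params                   # keep the best parameters seen
```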

      • normalizing training sets

        • subtract the mean: x := x - mu
        • normalize the variance: divide each feature by its standard deviation sigma
        • result: mean 0, variance 1 per feature; apply the same mu and sigma to dev/test (sketch below)
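A sketch, with the key point in the last comment: mu and sigma come from the training set and are reused unchanged on dev/test:

```python
import numpy as np

X = np.random.randn(3, 100) * 5 + 2      # features x examples, not normalized

mu = X.mean(axis=1, keepdims=True)       # per-feature mean
sigma = X.std(axis=1, keepdims=True)     # per-feature standard deviation
X_norm = (X - mu) / sigma                # now mean 0, variance 1 per feature

# reuse the *training* mu and sigma on new data:
# X_dev_norm = (X_dev - mu) / sigma
```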
      • exploding and vanishing gradients
        • weights slightly greater than 1 (or the identity): activations and gradients grow like w^L and explode in deep networks
        • weights slightly less than 1: they shrink like w^L and vanish (numeric demo below)
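A toy numeric demo of the effect: a 50-layer linear network where every layer uses the same matrix w*I, so activations scale like w^L:

```python
import numpy as np

L = 50
x = np.ones((2, 1))
for w in (1.5, 0.5):
    W = w * np.eye(2)                    # every layer uses the same W
    a = x
    for _ in range(L):
        a = W @ a                        # linear activations: a[l] = W a[l-1]
    print(w, a[0, 0])                    # ~6.4e8 for w=1.5, ~8.9e-16 for w=0.5
```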
      • weight initialization -- a partial fix for exploding/vanishing gradients (sketch below)
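A sketch of He initialization (weight variance 2/n[l-1], suited to ReLU layers); for tanh, Xavier initialization uses 1/n[l-1] instead:

```python
import numpy as np

def init_he(layer_dims):
    """He initialization: Var(w) = 2 / n[l-1], suited to ReLU layers."""
    params = {}
    for l in range(1, len(layer_dims)):
        params[f"W{l}"] = (np.random.randn(layer_dims[l], layer_dims[l - 1])
                           * np.sqrt(2.0 / layer_dims[l - 1]))
        params[f"B{l}"] = np.zeros((layer_dims[l], 1))
    return params

params = init_he([4, 5, 3, 1])
```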

      • gradient checking -- numerically verify backprop with a two-sided difference (sketch below)
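A sketch of the two-sided check on a scalar toy cost; in practice you flatten all parameters into one vector, use eps around 1e-7, and treat a relative difference below about 1e-7 as a pass:

```python
import numpy as np

def grad_check(J, dJ, theta, eps=1e-7):
    """Compare analytic gradient dJ(theta) with (J(t+eps) - J(t-eps)) / (2 eps)."""
    numeric = (J(theta + eps) - J(theta - eps)) / (2 * eps)
    analytic = dJ(theta)
    diff = abs(numeric - analytic) / max(abs(numeric) + abs(analytic), 1e-12)
    return numeric, analytic, diff

# toy check: J(theta) = theta^2, dJ/dtheta = 2 theta
print(grad_check(lambda t: t ** 2, lambda t: 2 * t, theta=3.0))
```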
