L7 CNN: Convolutional Neural Networks
Images
Grayscale image: each pixel has a single value between 0 and 1, where 0 is black and 1 is white
An image can be represented as a 1D vector or a 2D matrix and fed to a NN as input
An RGB image is represented as a 3D tensor
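A minimal sketch of these representations (the pixel values below are made up for illustration):

```python
import numpy as np

# A tiny 3x3 grayscale image: values in [0, 1], 0 = black, 1 = white.
gray = np.array([[0.0, 0.2, 0.0],
                 [0.1, 1.0, 0.1],
                 [0.0, 0.3, 0.0]])

print(gray.shape)            # (3, 3)  -- 2D matrix representation
print(gray.flatten().shape)  # (9,)    -- 1D vector representation (NN input)

# An RGB image of the same size: one 3x3 plane per color channel.
rgb = np.stack([gray, gray, gray], axis=-1)
print(rgb.shape)             # (3, 3, 3) -- 3D tensor (height, width, channels)
```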
Convolutional Layer
1D example
- A 1D image : data padded at the boundary
- A filter :
- After convolution* :
  * each output value is a dot product of the filter with the image patch beneath it
- After ReLU :
What's happening in that image? The filter finds a lonely pixel (see the sketch below).
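Since the slide's actual numbers are not reproduced here, the sketch below uses an assumed 1D image and an assumed filter [-1, 2, -1] to show how convolution (a sliding dot product) followed by ReLU can pick out a lonely pixel:

```python
import numpy as np

def conv1d_valid(x, w):
    """Slide filter w across x; each output is a dot product."""
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

# Assumed 1D image with one isolated bright pixel; padded with a zero at each end.
image = np.array([0, 0, 0, 1, 0, 0, 0], dtype=float)
padded = np.concatenate([[0.0], image, [0.0]])

# Assumed filter: responds strongly when the center pixel exceeds its neighbors.
w = np.array([-1.0, 2.0, -1.0])

z = conv1d_valid(padded, w)   # after convolution (same length as the image)
a = np.maximum(z, 0.0)        # after ReLU
print(z)  # [ 0.  0. -1.  2. -1.  0.  0.]
print(a)  # [ 0.  0.  0.  2.  0.  0.  0.]  -> only the lonely pixel survives
```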
2D example
- A 2D image :
- A filter :
- After convolution & ReLU:
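The 2D case works the same way. As a reference (this is the standard definition, not the specific slide example), with an m×m filter W the output at location (i, j), followed by ReLU, is:

$$
Z[i,j] = \sum_{u=0}^{m-1} \sum_{v=0}^{m-1} W[u,v]\, X[i+u,\, j+v],
\qquad
A[i,j] = \mathrm{ReLU}(Z[i,j]) = \max\big(0,\ Z[i,j]\big).
$$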
Max pooling layer
2D example
- Output from the convolutional layer & ReLU:
- Max pooling : returns the max of its arguments
  e.g. size 3×3 ("size 3"), e.g. stride 3
- After max pooling:
- Can use stride with filters too
- No weights in max pooling
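A minimal sketch of 2D max pooling with size 3×3 and stride 3 on a made-up input (the slide's actual numbers are not reproduced); note there are no weights to learn:

```python
import numpy as np

def max_pool2d(x, size=3, stride=3):
    """Max pooling: each output is the max over a size x size window."""
    h_out = (x.shape[0] - size) // stride + 1
    w_out = (x.shape[1] - size) // stride + 1
    out = np.empty((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out

# Made-up 6x6 output of a convolutional layer + ReLU (all values >= 0).
act = np.arange(36, dtype=float).reshape(6, 6)
print(max_pool2d(act, size=3, stride=3))
# [[14. 17.]
#  [32. 35.]]
```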
CNNs: typical architecture
A familiar pattern
| | ith data point | prediction for ith point | training loss over points 1 to n |
| --- | --- | --- | --- |
| Logistic regression | | | |
| Linear regression | | | |
| Neural networks | | | |
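The pattern shared by all three rows, written in generic notation (h, θ, and L are assumed placeholders for whatever prediction function, parameters, and per-example loss each model uses, since the slide's exact symbols are not shown):

$$
\text{data: } \big(x^{(i)}, y^{(i)}\big), \qquad
\text{prediction: } g^{(i)} = h\!\big(x^{(i)}; \theta\big), \qquad
J(\theta) = \frac{1}{n} \sum_{i=1}^{n} L\!\big(g^{(i)}, y^{(i)}\big),
$$

and in every case training means minimizing J(θ) by (stochastic) gradient descent.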
CNNs: a taste of backpropagation
Regression: 1 filter, size 3, with padding; dimension: 5×1
Forward pass:
Part of the derivative needed for SGD: (see the sketch below)
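A hedged reconstruction of this setup (notation assumed, since the slide's formulas are not reproduced): a 5×1 input x, zero padding at both ends, one size-3 filter w = (w₁, w₂, w₃), a ReLU, and some regression loss $\mathcal{L}$ on the outputs. Forward pass:

$$
z_i = w_1 x_{i-1} + w_2 x_i + w_3 x_{i+1} \quad (x_0 = x_6 = 0,\ i = 1,\dots,5),
\qquad
a_i = \mathrm{ReLU}(z_i) = \max(0, z_i).
$$

One piece of the derivative needed for SGD, by the chain rule:

$$
\frac{\partial \mathcal{L}}{\partial w_1}
= \sum_{i=1}^{5} \frac{\partial \mathcal{L}}{\partial a_i}\,
\underbrace{\mathbb{1}[z_i > 0]}_{\partial a_i / \partial z_i}\,
\underbrace{x_{i-1}}_{\partial z_i / \partial w_1}.
$$

The key point: the same weight w₁ is reused at every position i, so its derivative is a sum over positions.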
L8 State Machines and Markov Decision Processes
Markov Decision Process
- S = set of possible states {rich, poor}
- A = set of possible actions {plant, fallow}
- T(s, a, s') = transition model: the probability of ending in state s' after taking action a in state s
- R(s, a) = reward function
  e.g. R(rich, plant) = 100 bushels; R(poor, plant) = 10 bushels; R(rich, fallow) = 0 bushels; R(poor, fallow) = 0 bushels
- A discount factor γ
- A policy π: maps each state s to an action π(s)
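A sketch of this farming MDP as plain Python data; the rewards come from the bullets above, but the transition probabilities and discount factor are assumed placeholders, not the lecture's values:

```python
# States, actions, and rewards are taken from the notes above.
states = ["rich", "poor"]
actions = ["plant", "fallow"]

# R[(s, a)] = reward (bushels) for taking action a in state s.
R = {("rich", "plant"): 100, ("poor", "plant"): 10,
     ("rich", "fallow"): 0,  ("poor", "fallow"): 0}

# T[(s, a)][s'] = ASSUMED probability of ending up in s' (each row sums to 1).
T = {("rich", "plant"):  {"rich": 0.6, "poor": 0.4},
     ("rich", "fallow"): {"rich": 0.9, "poor": 0.1},
     ("poor", "plant"):  {"rich": 0.2, "poor": 0.8},
     ("poor", "fallow"): {"rich": 0.5, "poor": 0.5}}

gamma = 0.9  # assumed discount factor

# A policy maps each state to an action, e.g. always plant:
policy = {"rich": "plant", "poor": "plant"}
```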
What's the value of a policy?
- h: horizon (e.g. how many growing seasons left)
- V_π^h(s): value (expected reward) with policy π starting at s, over horizon h
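The standard finite-horizon recursion matching these definitions (a reconstruction, since the slide's formula is not reproduced; no discounting yet):

$$
V_{\pi}^{0}(s) = 0,
\qquad
V_{\pi}^{h}(s) = R\big(s, \pi(s)\big) + \sum_{s'} T\big(s, \pi(s), s'\big)\, V_{\pi}^{h-1}(s').
$$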
What's the best policy?
- Q^h(s, a): expected reward if starting at s, taking action a, and then taking the 'best' action for the h−1 steps left
- With Q, we can find an optimal policy:
  What's 'best'? For any s, pick the action that maximizes Q (see the equations below).
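Again reconstructing the standard (finite-horizon, undiscounted) formulas that match the wording above:

$$
Q^{0}(s,a) = 0,
\qquad
Q^{h}(s,a) = R(s,a) + \sum_{s'} T(s,a,s')\, \max_{a'} Q^{h-1}(s',a'),
\qquad
\pi_{h}(s) = \arg\max_{a} Q^{h}(s,a).
$$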
What if I don't stop farming?
- Problem: 100 bushels today > 100 bushels in ten years
- A solution: a discount factor γ
- Value of 1 bushel after t time steps: γ^t bushels
- Example: what's the value of 1 bushel per year, forever? (see the worked sum after this list)
- V_π(s): value (expected reward) with policy π starting at s
  → |S| linear equations in |S| unknowns (the Bellman equations below)
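Two hedged worked pieces for this section. First, the "1 bushel per year forever" example, assuming the first bushel arrives now (t = 0) and 0 < γ < 1 (γ = 0.9 is used purely as an illustration); second, the infinite-horizon Bellman equations for a fixed policy, which are the |S| linear equations in the |S| unknowns V_π(s) mentioned above:

$$
\sum_{t=0}^{\infty} \gamma^{t} \cdot 1 = \frac{1}{1-\gamma} \ \text{ bushels}
\qquad (\text{e.g. } \gamma = 0.9 \ \Rightarrow\ 10 \text{ bushels}),
$$

$$
V_{\pi}(s) = R\big(s, \pi(s)\big) + \gamma \sum_{s'} T\big(s, \pi(s), s'\big)\, V_{\pi}(s')
\qquad \text{for every } s \in S.
$$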