Machine Learning Study Notes 4

This post looks at how convolutional neural networks (L7 CNN) work, covering 1D and 2D image representations, convolutional and max-pooling layers, and a taste of backpropagation. It then introduces Markov decision processes (L8) for decision making: state machines, actions, reward functions, and computing the value of a policy.

L7 CNN: Convolutional Neural Networks

Images

Grayscale image: each pixel holds a value between 0 and 1, where 0 is black and 1 is white

An image can be represented as a 1D vector x or a 2D matrix X and used as the input to a NN

For an RGB image, use a 3D tensor

Convolutional Layer

1D example

  •  A 1D image : [\tilde{0},0,0,1,1,1,0,1,0,0,0,\tilde{0}], where \tilde{0} denotes zero-padding at the boundary
  • A filter : [-1,1,-1]=[\omega_1,\omega_2,\omega_3], with bias b
  • After convolution* : [0,-1,0,-1,0,-2,1,-1,0,0]

                *dot product: conv(x^{(1)},x^{(2)})=\sum_i x_i^{(1)}x_i^{(2)}

  • After ReLU : [0,0,0,0,0,0,1,0,0,0]

What is happening in this image? The filter finds a lonely pixel (a 1 whose neighbors are all 0), which is the only entry that survives the ReLU.
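As a quick check, here is a minimal numpy sketch of the 1D example above: slide the filter over the zero-padded image, take one dot product per position, and apply ReLU. The variable names are mine, not from the notes.

```python
import numpy as np

image  = np.array([0, 0, 1, 1, 1, 0, 1, 0, 0, 0], dtype=float)
filt   = np.array([-1, 1, -1], dtype=float)   # [w1, w2, w3], bias b = 0 here
padded = np.pad(image, 1)                     # the tilde-0 boundary extension

# one dot product per position: conv_i = filt . padded[i : i+3]
conv = np.array([filt @ padded[i:i + 3] for i in range(len(image))])
relu = np.maximum(conv, 0)

print(conv)   # [ 0. -1.  0. -1.  0. -2.  1. -1.  0.  0.]
print(relu)   # [ 0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]  <- only the lonely pixel survives
```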

2D example

  •  A 2D image :

                \begin{bmatrix} 1&0 &1 &0 &0 \\ 1& 0 &1 &0 &1 \\ 1& 1& 1& 0& 0\\ 1& 0 &1 &0 &1 \\ 1& 0& 1& 0&1 \end{bmatrix}

  • A filter : \begin{bmatrix} -1 & -1 &-1 \\ -1 & 1& -1\\ -1& -1 & -1 \end{bmatrix}
  • After convolution & ReLU: \begin{bmatrix} 0& 0 &0 &0 &0 \\ 0& 0& 0& 0 & 1\\ 0 &0 & 0 &0&0 \\ 0& 0 &0& 0& 0\\ 0&0&0 &0&0 \end{bmatrix} 
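The same idea in 2D, again as a small numpy sketch: zero-pad the image, slide the 3x3 filter, and keep only positive responses. The loop-based implementation is for clarity, not speed.

```python
import numpy as np

image = np.array([[1, 0, 1, 0, 0],
                  [1, 0, 1, 0, 1],
                  [1, 1, 1, 0, 0],
                  [1, 0, 1, 0, 1],
                  [1, 0, 1, 0, 1]], dtype=float)
filt = np.array([[-1, -1, -1],
                 [-1,  1, -1],
                 [-1, -1, -1]], dtype=float)

padded = np.pad(image, 1)            # zero padding keeps the output 5x5
out = np.zeros_like(image)
for r in range(5):
    for c in range(5):
        out[r, c] = np.sum(filt * padded[r:r + 3, c:c + 3])

print(np.maximum(out, 0))   # a single 1 at row 2, column 5 (counting from 1): the only isolated pixel
```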

Max Pooling Layer

2D example

  • Output from the convolutional layer & ReLU:

        \begin{bmatrix} 0 & 0 &0 &0 &0 &0 \\ 0 & 0 &0 &0 &1 &0 \\ 0 & 0 &0 &0 &0 &0 \\ 0 & 1 &0 &0 &0 &0 \\ 0 & 0 &0 &0 &0 &0 \\ 0 & 0 &0 &0 &0 &0 \end{bmatrix}

  • Max pooling : returns the max of its arguments

    e.g. size 3×3 ("size 3") with stride 3 (sketched after this list)

  • After max pooling:

        \begin{bmatrix} 0 &1 \\ 1& 0 \end{bmatrix}

  • Can use stride with filters too
  • No weights in max pooling (nothing is learned)
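A minimal numpy sketch of 3x3 max pooling with stride 3 applied to the 6x6 activation map above (indices in the code are 0-based):

```python
import numpy as np

# conv + ReLU output from the example above: 6x6, with 1s at (1, 4) and (3, 1)
A = np.zeros((6, 6))
A[1, 4] = 1.0
A[3, 1] = 1.0

size, stride = 3, 3
pooled = np.array([[A[r:r + size, c:c + size].max()
                    for c in range(0, A.shape[1], stride)]
                   for r in range(0, A.shape[0], stride)])
print(pooled)   # [[0. 1.]
                #  [1. 0.]]
```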

CNNs: typical architecture

input \to feature\ learning \to classification;\quad x \to NN(x;W,W_0)
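A minimal sketch of this conv, ReLU, pool, fully-connected pattern in PyTorch. The 1-channel 28×28 input, 8 filters, and 10 output classes are illustrative assumptions, not values from the notes.

```python
import torch.nn as nn

model = nn.Sequential(
    # feature learning
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),    # 28x28 -> 14x14
    # classification
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),
)
```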

A familiar pattern

For the ith data point x^{(i)}, each model makes a prediction for that point and has a training loss over points 1 to n:

  • Logistic regression: prediction LogiReg(x^{(i)};\theta,\theta_0), training loss J_{Logi}(\theta,\theta_0)
  • Linear regression: prediction LinReg(x^{(i)};\theta,\theta_0), training loss J_{Lin}(\theta,\theta_0)
  • Neural networks: prediction NN(x^{(i)};W,W_0), training loss J_{NN}(W,W_0)

CNNs: a taste of backpropagation

Regression setup: 1 filter of size 3, with padding; input x^{(j)} dimension: 5\times 1

Forward pass:

        Z_i^1 = (W^1)^{\top}X_{[i-1,i,i+1]}\ \ (Z^1: 5\times1)\\ A_i^1 = ReLU(Z_i^1)\ \ (A^1: 5\times1) \\A^2=(W^2)^{\top}A^1\ \ (1\times 1) \\L(A^2,y) = (A^2-y)^2\ \ (1\times 1)

Part of the derivative of SGD : 

\frac{\partial loss}{\partial W^1}=\frac{\partial Z^1}{\partial W^1}\cdot\frac{\partial A^1}{\partial Z^1}\cdot\frac{\partial loss}{\partial A^1}\\ (3\times1)= (3\times 5)\cdot (5\times5)\cdot (5\times 1)
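A numpy sketch of this forward pass and of the chain-rule product for \partial loss/\partial W^1, mainly as a shape check. The input, target, and weights are random placeholders; only the shapes come from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
x  = rng.normal(size=(5, 1))    # one input, 5x1
y  = 1.0                        # scalar regression target (placeholder)
W1 = rng.normal(size=(3, 1))    # one filter of size 3
W2 = rng.normal(size=(5, 1))    # second-layer weights

# forward pass
x_pad  = np.vstack([[0.0], x, [0.0]])                    # zero padding
X_cols = np.hstack([x_pad[i:i + 3] for i in range(5)])   # 3x5, column i = x_[i-1, i, i+1]
Z1 = (W1.T @ X_cols).T                                   # 5x1
A1 = np.maximum(Z1, 0)                                   # ReLU, 5x1
A2 = W2.T @ A1                                           # 1x1
loss = (A2 - y) ** 2                                     # 1x1

# backward pass: dloss/dW1 = dZ1/dW1 . dA1/dZ1 . dloss/dA1
dloss_dA1 = W2 * (2 * (A2 - y))                  # 5x1
dA1_dZ1   = np.diagflat((Z1 > 0).astype(float))  # 5x5 (ReLU derivative on the diagonal)
dZ1_dW1   = X_cols                               # 3x5
dloss_dW1 = dZ1_dW1 @ dA1_dZ1 @ dloss_dA1        # (3x5)(5x5)(5x1) = 3x1
print(dloss_dW1.shape)                           # (3, 1)
```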

L8 State Machines and Markov Decision Processes

Markov Decision Process

  • S = set of possible states     {rich,poor}
  • A = set of possible actions   {plant,fallow}
  • T:S\times A\times S\to\mathbb{R}: transition model    e.g.

         0.9=P(S_t=poor\mid S_{t-1}=rich,A_{t-1}=plant)=T(rich,plant,poor)

  • R:S\times A\to\mathbb{R}: reward function

        e.g. R(rich,plant) = 100 bushels; R(poor,plant) = 10 bushels; R(rich,fallow) = 0 bushels; R(poor,fallow) = 0 bushels

  • A discount factor :  \gamma
  • A policy \pi:S\to A
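A small sketch of this MDP as plain Python dictionaries. The notes only give T(rich,plant,poor)=0.9 explicitly; the remaining transition probabilities below are assumptions chosen to be consistent with the Q-values computed later (fallowing tends to restore the soil, planting tends to deplete it), and the discount factor and policy are just examples.

```python
S = ["rich", "poor"]       # states
A = ["plant", "fallow"]    # actions

# reward function R(s, a), in bushels
R = {("rich", "plant"): 100, ("poor", "plant"): 10,
     ("rich", "fallow"): 0,  ("poor", "fallow"): 0}

# transition model: T[(s, a)][s'] = P(s' | s, a)
T = {("rich", "plant"):  {"rich": 0.1, "poor": 0.9},   # given: T(rich, plant, poor) = 0.9
     ("rich", "fallow"): {"rich": 0.9, "poor": 0.1},   # assumed
     ("poor", "plant"):  {"rich": 0.1, "poor": 0.9},   # assumed
     ("poor", "fallow"): {"rich": 0.9, "poor": 0.1}}   # assumed

gamma = 0.9                                    # example discount factor
policy = {"rich": "plant", "poor": "fallow"}   # an example policy pi: S -> A
```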

What's the value of a policy?

  • h: horizon (e.g. how many growing seasons are left)

  •  V_{\pi}^h(s) : value (expected reward) over horizon h, following policy \pi starting at s

        V_{\pi}^0(s)=0;V_{\pi}^h(s)=R(s,\pi_h(s))+\sum_{s^{'}}T(s,\pi_h(s),s^{'})\cdot V_{\pi}^{h-1}(s^{'})
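A sketch of this recursion for a stationary policy (the notes allow the policy \pi_h to depend on the horizon; using the same policy at every step is a simplification here). The transition model is the assumed one from the earlier sketch, repeated so this runs on its own.

```python
R = {("rich", "plant"): 100, ("poor", "plant"): 10,
     ("rich", "fallow"): 0,  ("poor", "fallow"): 0}
T = {("rich", "plant"):  {"rich": 0.1, "poor": 0.9},
     ("rich", "fallow"): {"rich": 0.9, "poor": 0.1},
     ("poor", "plant"):  {"rich": 0.1, "poor": 0.9},
     ("poor", "fallow"): {"rich": 0.9, "poor": 0.1}}
policy = {"rich": "plant", "poor": "fallow"}

def V(s, h):
    """V_pi^h(s): expected reward over h steps, following policy pi from state s."""
    if h == 0:
        return 0.0
    a = policy[s]
    return R[(s, a)] + sum(p * V(s2, h - 1) for s2, p in T[(s, a)].items())

print(V("rich", 1), V("rich", 2))   # 100.0, then 100 + 0.1*100 + 0.9*0 = 110.0
```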

What's the best policy?

  • Q^h(s,a) : expected reward if starting at s, taking action a, and then taking the 'best' actions for the h-1 steps left
  • With Q, we can find an optimal policy: \pi_h^{*}(s)=argmax_aQ^{h}(s,a)

        Q^0(s,a)=0;Q^h(s,a)=R(s,a)+\sum_{s^{'}}T(s,a,s^{'})max_{a^{'}}Q^{h-1}(s^{'},a^{'})

        Q^1(rich,plant)=100;Q^1(rich,fallow)=0; \\Q^1(poor,plant)=10;Q^1(poor,fallow)=0; \\Q^2(rich,plant)=119;Q^2(rich,fallow)=91; \\Q^2(poor,plant)=29;Q^2(poor,fallow)=91;

What's best? For any s, \pi_1^{*}(s)=plant;\ \pi_2^{*}(rich)=plant;\ \pi_2^{*}(poor)=fallow
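A sketch that reproduces the Q^1 and Q^2 values and the optimal actions above, using the same assumed transition model as in the earlier sketches:

```python
R = {("rich", "plant"): 100, ("poor", "plant"): 10,
     ("rich", "fallow"): 0,  ("poor", "fallow"): 0}
T = {("rich", "plant"):  {"rich": 0.1, "poor": 0.9},
     ("rich", "fallow"): {"rich": 0.9, "poor": 0.1},
     ("poor", "plant"):  {"rich": 0.1, "poor": 0.9},
     ("poor", "fallow"): {"rich": 0.9, "poor": 0.1}}
actions = ["plant", "fallow"]

def Q(s, a, h):
    """Q^h(s, a): take action a now, then act optimally for the remaining h-1 steps."""
    if h == 0:
        return 0.0
    return R[(s, a)] + sum(p * max(Q(s2, a2, h - 1) for a2 in actions)
                           for s2, p in T[(s, a)].items())

print(Q("rich", "plant", 2), Q("rich", "fallow", 2))   # 119.0 91.0
print(Q("poor", "plant", 2), Q("poor", "fallow", 2))   # 29.0 91.0
print(max(actions, key=lambda a: Q("poor", a, 2)))     # fallow  (= pi_2*(poor))
```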

What if I don't stop farming?

  • Problem: 100 bushels today > 100 bushels in ten years
    • A solution: discount factor \gamma:0<\gamma<1
    • Value of 1 bushel after t time steps : \gamma^t bushels
    • Example: What's the value of 1 bushel per year forever?

                V=1+\gamma+\gamma^2+\dots=1+\gamma V\ \Rightarrow\ V=1/(1-\gamma)\\ E.g.\ \gamma=0.99\ \Rightarrow V=100\ bushels

  • V_{\pi}(s) : value (expected discounted reward) with policy \pi starting at s, over an infinite horizon

        V_{\pi}(s)=R(s,\pi(s))+\gamma\sum_{s^{'}}T(s,\pi(s),s^{'})\cdot V_{\pi}(s^{'})

        |S| linear equations in |S| unknowns
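Since V_\pi appears linearly, these |S| equations can be solved directly. A numpy sketch, again using the assumed transition model, the example policy, and \gamma = 0.9 from the earlier sketches:

```python
import numpy as np

S = ["rich", "poor"]
policy = {"rich": "plant", "poor": "fallow"}
gamma = 0.9

R = {("rich", "plant"): 100, ("poor", "plant"): 10,
     ("rich", "fallow"): 0,  ("poor", "fallow"): 0}
T = {("rich", "plant"):  {"rich": 0.1, "poor": 0.9},
     ("rich", "fallow"): {"rich": 0.9, "poor": 0.1},
     ("poor", "plant"):  {"rich": 0.1, "poor": 0.9},
     ("poor", "fallow"): {"rich": 0.9, "poor": 0.1}}

# solve (I - gamma * T_pi) V = R_pi, one row per state
T_pi = np.array([[T[(s, policy[s])][s2] for s2 in S] for s in S])   # |S| x |S|
R_pi = np.array([R[(s, policy[s])] for s in S])                     # length |S|
V = np.linalg.solve(np.eye(len(S)) - gamma * T_pi, R_pi)
for s, v in zip(S, V):
    print(s, round(v, 1))   # roughly: rich 529.1, poor 470.9
```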

