Andrew Ng Machine Learning, Weeks 4 and 5
Preface
NetEase Cloud Classroom (bilingual subtitles, streams smoothly): https://study.163.com/course/courseMain.htm?courseId=1004570029
Coursera: https://www.coursera.org/learn/machine-learning
I am a beginner: I watch the lectures on NetEase Cloud Classroom first, then do the assignments on Coursera. I started this blog to keep notes; all images referenced in this post are screenshots from the course.
Neural Networks (Neural Network)
1. Model Description
Tips: With only 2 features, a logistic hypothesis that adds quadratic (or higher-order) terms through the sigmoid is still computationally acceptable; but when there are many features, the number of polynomial terms explodes and logistic regression becomes far too expensive.
2. Model Representation
《1》 Basic model
Tips: Layer 1 is called the input layer, hθ(x) is the output layer, and all the computation in between forms the hidden layer. Each hidden unit is an independent sigmoid function, g(θ0x0 + θ1x1 + …), where x0 = 1 is the bias unit of the input layer; likewise a0 = 1 is the bias unit of layer 2. Each unit has its own set of weights θ.
《2》 Forward propagation
Tips: Abbreviate the linear combination inside each sigmoid as z. For example, a(2)1 = g(z(2)1), where z(2)1 = θ0x0 + θ1x1 + …. Applying this to every layer in vectorized form turns the whole computation into a sequence of matrix multiplications.
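The vectorized forward-propagation step can be sketched in Python (NumPy). The weight values and the 2-2-1 layer sizes below are made-up illustrations, not from the course:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_layer(a_prev, Theta):
    # prepend the bias unit a0 = 1, then compute a = g(z) with z = Theta @ a
    a_prev = np.insert(a_prev, 0, 1.0)
    return sigmoid(Theta @ a_prev)

# hypothetical weights for a 2-input / 2-hidden-unit / 1-output network
Theta1 = np.array([[-1.0,  2.0,  2.0],
                   [ 1.0, -2.0, -2.0]])   # layer 1 -> layer 2
Theta2 = np.array([[-0.5,  1.0,  1.0]])   # layer 2 -> output

x  = np.array([1.0, 0.0])
a2 = forward_layer(x, Theta1)    # hidden-layer activations a(2)
h  = forward_layer(a2, Theta2)   # output h_Theta(x)
```

Each layer is just one matrix-vector product followed by an element-wise sigmoid, which is why no explicit loop over units is needed.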
3. Intuition
《1》 Logical AND, OR, and NOT
Tips: From the figure above, write out the truth table for each network.
《2》 A more complex function: XNOR
Tips: The red line is x1 AND x2, the cyan line is (NOT x1) AND (NOT x2), and the green line is a1 OR a2; combining them step by step builds the more complex logical function XNOR (i.e. NOT XOR).
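This construction can be checked in Python. The weight triples below (−30/20/20 etc.) are the large-magnitude values shown in the lecture, chosen so the sigmoid saturates to roughly 0 or 1:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unit(theta, x1, x2):
    # one sigmoid unit: g(theta0 + theta1*x1 + theta2*x2)
    return sigmoid(theta[0] + theta[1] * x1 + theta[2] * x2)

AND_THETA = [-30.0,  20.0,  20.0]   # x1 AND x2
NOR_THETA = [ 10.0, -20.0, -20.0]   # (NOT x1) AND (NOT x2)
OR_THETA  = [-10.0,  20.0,  20.0]   # a1 OR a2

def xnor(x1, x2):
    a1 = unit(AND_THETA, x1, x2)    # first hidden unit (red)
    a2 = unit(NOR_THETA, x1, x2)    # second hidden unit (cyan)
    return unit(OR_THETA, a1, a2)   # output unit (green)
```

Evaluating `xnor` on all four binary inputs reproduces the XNOR truth table: the output is close to 1 exactly when x1 equals x2.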
4. Multiclass Classification
5. Cost Function
6. Backpropagation
Tips: δ denotes the "error" of a node, i.e. how far its activation deviates from what the training example requires.
Tips: For the last layer (the output layer), the error is the activation a computed by forward propagation minus the label from the training set.
Tips: Every earlier layer's δ is computed from the next layer's δ. The derivative g′(z(3)) can be worked out with calculus: it equals a .* (1 − a), where .* is element-wise multiplication and a is that layer's activation. There is no δ(1), because layer 1 is the input layer and carries no error.
Tips: From these δ terms we obtain the partial derivatives of the cost function.
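The δ computations for one training example can be sketched as follows. The small 2-2-1 network and its weight values are made-up for illustration; the formulas (output error a − y, hidden error scaled by a .* (1 − a), gradients δ·aᵀ) follow the lecture:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical 3-layer network: 2 inputs, 2 hidden units, 1 output
Theta1 = np.array([[ 0.1, 0.3, -0.2],
                   [-0.4, 0.2,  0.5]])
Theta2 = np.array([[ 0.2, -0.3, 0.1]])

x = np.array([1.0, 0.5])
y = np.array([1.0])

# forward pass (bias units prepended)
a1 = np.insert(x, 0, 1.0)
z2 = Theta1 @ a1
a2 = np.insert(sigmoid(z2), 0, 1.0)
a3 = sigmoid(Theta2 @ a2)              # = h_Theta(x)

# backward pass
delta3 = a3 - y                        # output-layer error: activation minus label
# hidden-layer error: (Theta2' @ delta3) .* g'(z2), with g'(z) = a .* (1 - a);
# drop the bias component, which has no error term of its own
delta2 = (Theta2.T @ delta3)[1:] * sigmoid(z2) * (1.0 - sigmoid(z2))
# no delta1: layer 1 is the input layer

# gradients for this one example: dJ/dTheta(l) = delta(l+1) * a(l)'
grad2 = np.outer(delta3, a2)
grad1 = np.outer(delta2, a1)
```

Note that `grad1` and `grad2` have exactly the same shapes as `Theta1` and `Theta2`, which is what a gradient-descent update requires.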
7. Backpropagation Intuition
Tips: what follows relies on the definition of the cost function.
8. Gradient Checking
Tips: An alternative way to estimate the derivative of J(θ): pick a tiny number ε and compute the two-sided difference (J(θ + ε) − J(θ − ε)) / (2ε), i.e. Δy/Δx over a small interval around θ.
Tips: Then check whether this numerical derivative approximately equals the derivative produced by backpropagation.
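The check can be sketched like this; the toy cost function below is made-up so that the exact gradient is known in closed form:

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    # approximate dJ/dtheta_i by (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        bump = np.zeros_like(theta)
        bump[i] = eps
        grad[i] = (J(theta + bump) - J(theta - bump)) / (2.0 * eps)
    return grad

# toy cost: J(theta) = theta0^2 + 3*theta1, so the exact gradient is [2*theta0, 3]
J = lambda t: t[0] ** 2 + 3.0 * t[1]
theta = np.array([1.5, -2.0])
approx = numerical_gradient(J, theta)
exact = np.array([2 * theta[0], 3.0])
```

In practice the backpropagation gradient replaces `exact` here; if the two disagree beyond a few decimal places, the backpropagation implementation has a bug. The check is slow, so it is switched off once the implementation is verified.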
9. Random Initialization
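A minimal sketch of symmetry-breaking initialization; the value ε = 0.12 is one common choice, not a requirement:

```python
import numpy as np

def rand_init(rows, cols, epsilon=0.12):
    # initialize weights uniformly in [-epsilon, epsilon]; all-zero
    # initialization would make every hidden unit compute the same function
    return np.random.uniform(-epsilon, epsilon, size=(rows, cols))

# hypothetical layer: 2 hidden units, 2 inputs + 1 bias column
Theta1 = rand_init(2, 3)
```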
10. Summary
1. Choose a network architecture suited to the problem
2. Randomly initialize the parameters
3. Run forward propagation to compute the activations a
4. Compute the cost function J(θ)
5. Run backpropagation to compute the partial derivatives of J(θ)
6. Use gradient checking to verify that the derivatives are correct
7. Use gradient descent to adjust θ and minimize J(θ)
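The steps above can be put together into one small training loop. The 2-2-1 architecture, the toy OR dataset, the learning rate, and the iteration count are all made-up stand-ins for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy data: learn logical OR (an easy task for a tiny network)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 1.])

# steps 1-2: pick a structure (2 inputs, 2 hidden units, 1 output) and
# randomly initialize the parameters near zero to break symmetry
rng = np.random.default_rng(0)
Theta1 = rng.uniform(-0.12, 0.12, (2, 3))
Theta2 = rng.uniform(-0.12, 0.12, (1, 3))

alpha = 1.0
for _ in range(5000):
    G1, G2, cost = np.zeros_like(Theta1), np.zeros_like(Theta2), 0.0
    for xi, yi in zip(X, y):
        # step 3: forward propagation
        a1 = np.insert(xi, 0, 1.0)
        z2 = Theta1 @ a1
        a2 = np.insert(sigmoid(z2), 0, 1.0)
        a3 = float(sigmoid(Theta2 @ a2)[0])
        # step 4: accumulate the (unregularized) cross-entropy cost
        cost += -yi * np.log(a3) - (1.0 - yi) * np.log(1.0 - a3)
        # step 5: backpropagation
        d3 = a3 - yi
        d2 = (Theta2.T * d3)[1:, 0] * sigmoid(z2) * (1.0 - sigmoid(z2))
        G2 += d3 * a2
        G1 += np.outer(d2, a1)
    # step 7: one gradient-descent step (step 6, gradient checking,
    # would be run once here and then switched off for speed)
    Theta1 -= alpha * G1 / len(X)
    Theta2 -= alpha * G2 / len(X)
```

After training, the rounded network outputs match the OR labels and the average cost is far below the log(2) a constant predictor would give.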
Quiz
1. Question 1
Which of the following statements are true? Check all that apply.
Answer: CD
2. Question 2
Consider the following neural network, which takes two binary-valued inputs x1, x2 ∈ {0,1} and outputs hΘ(x). Which of the following logical functions does it (approximately) compute?
Answer: B
3. Question 3
Consider the neural network given below. Which of the following equations correctly computes the activation a(3)1? Note: g(z) is the sigmoid activation function.
Answer: A
4. Question 4
You have the following neural network:
You'd like to compute the activations of the hidden layer a(2) ∈ R3. One way to do so is the following Octave code:
You want to have a vectorized implementation of this (i.e., one that does not use for loops). Which of the following implementations correctly compute a(2)? Check all that apply.
Answer: A
(B is wrong: Θ1 is 3×3, while x there is only a 3-dimensional vector.)
5. Question 5
Answer: A
6. Question 6
You are training a three-layer neural network and would like to use backpropagation to compute the gradient of the cost function. In the backpropagation algorithm, one of the steps is to update Δ(2)i,j := Δ(2)i,j + δ(3)i · (a(2))j for every i, j. Which of the following is a correct vectorization of this step?
Answer: B
7. Question 7
Suppose Theta1 is a 5x3 matrix, and Theta2 is a 4x6 matrix. You set thetaVec = [Theta1(:); Theta2(:)]. Which of the following correctly recovers Theta2?
Answer: A
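The indexing can be verified with a quick NumPy sketch; `order='F'` mimics Octave's column-major `Theta(:)` unrolling, and the example matrix contents are made-up:

```python
import numpy as np

Theta1 = np.arange(15).reshape(5, 3)          # 5x3, 15 entries
Theta2 = np.arange(100, 124).reshape(4, 6)    # 4x6, 24 entries

# unroll column-wise, matching thetaVec = [Theta1(:); Theta2(:)] in Octave
thetaVec = np.concatenate([Theta1.flatten(order='F'),
                           Theta2.flatten(order='F')])

# Theta1 occupies the first 15 entries, so Theta2 is entries 16..39
# in Octave's 1-based indexing, i.e. thetaVec[15:39] in Python
Theta2_recovered = thetaVec[15:39].reshape(4, 6, order='F')
```

This matches the Octave expression reshape(thetaVec(16:39), 4, 6).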
8. Question 8
Answer: B
9. Question 9
Which of the following statements are true? Check all that apply.
Answer: AB
10. Question 10
Which of the following statements are true? Check all that apply.
Answer: CD