Andrew Ng Course Assignments (latest version; the bugs caused by inconsistencies with earlier versions have been fixed)

Preface:

This is my first piece of writing on deep learning. I moved into this field not long ago, and there is still a lot of knowledge I need to learn and fill in. To me, deep learning is full of mathematical beauty and the fun of programming. I am writing this article, first, to carve a deeper memory trace for myself, and second, in the hope that putting it into article form will be convenient for more people who love deep learning. I have tried to make Andrew Ng's videos as easy to follow as possible; if anything is off, corrections are welcome.

===============================================================
Hello everyone. Today we start day one of Andrew Ng's course. The first programming assignment is logistic regression:

Logistic Regression with a Neural Network mindset

Welcome to your first (required) programming assignment! You will build a logistic regression classifier to recognize cats. This assignment will step you through how to do this with a Neural Network mindset, and so will also hone your intuitions about deep learning.

Welcome to the first programming exercise, where you will build a logistic regression classifier that recognizes cats.

Instructions:

  • Do not use loops (for/while) in your code, unless the instructions explicitly ask you to do so.

Note:
Do not use loops in your code unless you are explicitly asked to do so.

You will learn to:

  • Build the general architecture of a learning algorithm, including:
    • Initializing parameters
    • Calculating the cost function and its gradient
    • Using an optimization algorithm (gradient descent)
  • Gather all three functions above into a main model function, in the right order.

You will learn to:
1. Build the general architecture of a learning algorithm, including:
1.1 Initializing parameters
1.2 Computing the cost function and its gradient
1.3 Using an optimization algorithm (gradient descent)
2. Gather the three functions above into a main model function, in the right order (a rough skeleton is sketched below).
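To give a feel for how these pieces fit together before diving in, here is a minimal, hedged skeleton of the overall architecture. The function names (initialize_with_zeros, propagate, model) mirror the notebook from memory and should be treated as illustrative, not as the graded solution:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def initialize_with_zeros(dim):
    # Step 1: initialize the parameters w and b
    return np.zeros((dim, 1)), 0.0

def propagate(w, b, X, Y):
    # Step 2: compute the cost and its gradients for logistic regression
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)                               # predictions, shape (1, m)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m   # cross-entropy cost
    dw = np.dot(X, (A - Y).T) / m                                 # gradient w.r.t. w
    db = np.sum(A - Y) / m                                        # gradient w.r.t. b
    return dw, db, cost

def model(X, Y, num_iterations=2000, learning_rate=0.005):
    # Step 3: gradient descent, gathering everything into one main function
    w, b = initialize_with_zeros(X.shape[0])
    for _ in range(num_iterations):
        dw, db, cost = propagate(w, b, X, Y)
        w = w - learning_rate * dw
        b = b - learning_rate * db
    return w, b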

1 - Packages

First, let’s run the cell below to import all the packages that you will need during this assignment.

  • numpy is the fundamental package for scientific computing with Python.
  • h5py is a common package to interact with a dataset that is stored on an H5 file.
  • matplotlib is a famous library to plot graphs in Python.
  • PIL and scipy are used here to test your model with your own picture at the end.

1 - Packages:
First, let's run the cell below to import all the packages you will need during this assignment.
numpy: the fundamental package for scientific computing with Python.
h5py: a common package for interacting with datasets stored in H5 files.
matplotlib: a well-known Python plotting library.
PIL and scipy: used here at the end to test your model with your own picture.

import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset

%matplotlib inline
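For reference, load_dataset comes from the assignment's helper file lr_utils.py. A minimal sketch of what it roughly does is shown below; the H5 file paths and dataset key names ("datasets/train_catvnoncat.h5", "train_set_x", "list_classes", and so on) are taken from the standard course files and may differ slightly from your copy, so check your own lr_utils.py:

import numpy as np
import h5py

def load_dataset():
    # NOTE: file names and dataset keys are assumptions based on the standard
    # course files; adjust them to match your own lr_utils.py if they differ.
    with h5py.File("datasets/train_catvnoncat.h5", "r") as train_dataset:
        train_set_x_orig = np.array(train_dataset["train_set_x"][:])   # training images
        train_set_y_orig = np.array(train_dataset["train_set_y"][:])   # training labels
        classes = np.array(train_dataset["list_classes"][:])           # class names (bytes)
    with h5py.File("datasets/test_catvnoncat.h5", "r") as test_dataset:
        test_set_x_orig = np.array(test_dataset["test_set_x"][:])      # test images
        test_set_y_orig = np.array(test_dataset["test_set_y"][:])      # test labels
    # Reshape the labels into row vectors of shape (1, m)
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes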

2 - Overview of the Problem set

Problem Statement: You are given a dataset (“data.h5”) containing:

  • a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
  • a test set of m_test images labeled as cat or non-cat
  • each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).

You will build a simple image-recognition algorithm that can correctly classify pictures as cat or non-cat.

Let’s get more familiar with the dataset. Load the data by running the following code.

2 - Overview of the problem set:
Problem statement: you are given a dataset ("data.h5") containing:
a training set of m_train images labeled as cat (y=1) or non-cat (y=0);
a test set of m_test images labeled as cat or non-cat;
each image of shape (num_px, num_px, 3), where 3 stands for the three RGB channels; the first num_px is the height and the second is the width.

You will build a simple image-recognition algorithm that correctly classifies a picture as cat or non-cat.
Let's get more familiar with the dataset by running the following code:

# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
# Print the shapes and other basic information of these arrays for reference
print("train_set_x_orig.shape =",train_set_x_orig.shape)
print("train_set_y.shape =",train_set_y.shape)
print("test_set_x_orig.shape =",test_set_x_orig.shape)
print("test_set_y.shape =",test_set_y.shape)
print("classes =",classes)

Output:
train_set_x_orig.shape = (209, 64, 64, 3)
train_set_y.shape = (1, 209)
test_set_x_orig.shape = (50, 64, 64, 3)
test_set_y.shape = (1, 50)
classes = [b'non-cat' b'cat']

We added “_orig” at the end of image datasets (train and test) because we are going to preprocess them. After preprocessing, we will end up with train_set_x and test_set_x (the labels train_set_y and test_set_y don’t need any preprocessing).

Each line of your train_set_x_orig and test_set_x_orig is an array representing an image. You can visualize an example by running the following code. Feel free also to change the index value and re-run to see other images.

We added "_orig" at the end of the image datasets (train and test) because we are going to preprocess them. After preprocessing, we will end up with train_set_x and test_set_x (the labels train_set_y and test_set_y do not need any preprocessing).

Each row of train_set_x_orig and test_set_x_orig is an array representing an image. You can visualize an example by running the following code, and you can also change the index value and re-run to look at other images.

# Example of a picture
index = 25
# print(train_set_x_orig[index])  # extra line added by me: uncomment it to print the matrix corresponding to this image
plt.imshow(train_set_x_orig[index])
print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") +  "' picture.")

Output:
y = [1], it's a 'cat' picture.
[Image: the training example at index 25, displayed with plt.imshow]
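As a side note, the np.squeeze call in the print statement above turns the one-element label slice into a scalar-like value so it can be used to index into classes. A tiny self-contained illustration (classes_demo and label are stand-in values, not the assignment's variables):

import numpy as np

classes_demo = np.array([b'non-cat', b'cat'])            # stand-in for `classes`
label = np.array([1])                                    # what train_set_y[:, index] looks like, shape (1,)
print(label.shape)                                       # (1,)
print(np.squeeze(label))                                 # 1 (a 0-d array, usable as an index)
print(classes_demo[np.squeeze(label)].decode("utf-8"))   # 'cat'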

Many software bugs in deep learning come from having matrix/vector dimensions that don’t fit. If you can keep your matrix/vector dimensions straight you will go a long way toward eliminating many bugs.

Exercise: Find the values for:
- m_train (number of training examples)
- m_test (number of test examples)
- num_px (= height = width of a training image)
Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].

Many software bugs in deep learning come from matrix/vector dimensions that do not fit. If you can keep your matrix/vector dimensions straight, you will go a long way toward eliminating many bugs.

Exercise: find the values of

  • m_train: the number of training examples
  • m_test: the number of test examples
  • num_px: the height and width of a training image

Remember: train_set_x_orig is a numpy array of shape (m_train, num_px, num_px, 3). For instance, you can get m_train by writing train_set_x_orig.shape[0].

### START CODE HERE ### (≈ 3 lines of code)
m_train = train_set_y.shape[1]  # number of training examples
m_test = test_set_y.shape[1]  # number of test examples
num_px = train_set_x_orig.shape[1]  # height (= width) of each image
### END CODE HERE ###

print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))

Output:

Number of training examples: m_train = 209
Number of testing examples: m_test = 50
Height/Width of each image: num_px = 64
Each image is of size: (64, 64, 3)
train_set_x shape: (209, 64, 64, 3)
train_set_y shape: (1, 209)
test_set_x shape: (50, 64, 64, 3)
test_set_y shape: (1, 50)

For convenience, you should now reshape images of shape (num_px, num_px, 3) in a numpy-array of shape (num_px * num_px * 3, 1). After this, our training (and test) dataset is a numpy-array where each column represents a flattened image. There should be m_train (respectively m_test) columns.

Exercise: Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px * num_px * 3, 1).

A trick when you want to flatten a matrix X of shape (a, b, c, d) to a matrix X_flatten of shape (b * c * d, a) is to use:

X_flatten = X.reshape(X.shape[0], -1).T      # X.T is the transpose of X

For convenience, you should now reshape the images of shape (num_px, num_px, 3) into a numpy array of shape (num_px*num_px*3, 1). After this, our training (and test) dataset is a numpy array in which each column represents a flattened image. There should be m_train (respectively m_test) columns.

Exercise: reshape the training and test datasets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px*num_px*3, 1).
A trick: when you want to flatten a matrix X of shape (a, b, c, d) into a matrix X_flatten of shape (b*c*d, a), you should use:
X_flatten = X.reshape(X.shape[0], -1).T # X.T is the transpose of X

# Reshape the training and test examples

### START CODE HERE ### (≈ 2 lines of code)
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T  # flatten the training set from (209, 64, 64, 3) to (64*64*3, 209)
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T  # flatten the test set from (50, 64, 64, 3) to (64*64*3, 50)
### END CODE HERE ###

print ("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
print ("test_set_y shape: " + str(test_set_y.shape))
print ("sanity check after reshaping: " + str(train_set_x_flatten[0:5,0]))

Output:
train_set_x_flatten shape: (12288, 209)
train_set_y shape: (1, 209)
test_set_x_flatten shape: (12288, 50)
test_set_y shape: (1, 50)
sanity check after reshaping: [17 31 56 22 33]

To represent color images, the red, green and blue channels (RGB) must be specified for each pixel, and so the pixel value is actually a vector of three numbers ranging from 0 to 255.

One common preprocessing step in machine learning is to center and standardize your dataset, meaning that you subtract the mean of the whole numpy array from each example, and then divide each example by the standard deviation of the whole numpy array. But for picture datasets, it is simpler and more convenient and works almost as well to just divide every row of the dataset by 255 (the maximum value of a pixel channel).

Let’s standardize our dataset.

To represent a color image, the red, green, and blue channels (RGB) must be specified for each pixel, so a pixel value is actually a vector of three numbers ranging from 0 to 255.
A common preprocessing step in machine learning is to center and standardize the dataset, meaning that you subtract the mean of the whole numpy array from each example and then divide each example by the standard deviation of the whole numpy array. For picture datasets, however, it is simpler, more convenient, and works almost as well to just divide every row of the dataset by 255 (the maximum value of a pixel channel).

Let's standardize our dataset.

train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.
# print(train_set_x_flatten)
# print(test_set_x_flatten)
# print(train_set_x)

What you need to remember:

Common steps for pre-processing a new dataset are:

  • Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, …)
  • Reshape the datasets such that each example is now a vector of size (num_px * num_px * 3, 1)
  • “Standardize” the data
What you need to remember:

The common steps for preprocessing a new dataset are:

  • Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, …)
  • Reshape the datasets so that each example is now a vector of size (num_px * num_px * 3, 1)
  • "Standardize" the data
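To tie these steps together, here is a small sketch of a helper that performs the flatten-and-standardize steps in one place. The function name preprocess_images and its return layout are my own choices for illustration, not part of the assignment:

import numpy as np

def preprocess_images(x_orig):
    # Flatten a (m, num_px, num_px, 3) image array into shape (num_px*num_px*3, m)
    # and scale the pixel values into [0, 1].
    m = x_orig.shape[0]                  # number of examples
    x_flatten = x_orig.reshape(m, -1).T  # each column is one flattened image
    return x_flatten / 255.              # standardize by the maximum channel value

# Usage, assuming train_set_x_orig / test_set_x_orig from load_dataset above:
# train_set_x = preprocess_images(train_set_x_orig)  # shape (12288, 209)
# test_set_x = preprocess_images(test_set_x_orig)    # shape (12288, 50)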

3 - General Architecture of the learning algorithm

It’s time to design a simple algorithm to distinguish cat images from non-cat images.

You will build a Logistic Regression, using a Neural Network mindset. The following Figure explains why Logistic Regression is actually a very simple Neural Network!

[Figure: logistic regression drawn as a single-neuron neural network]

Mathematical expression of the algorithm:

For one example $x^{(i)}$:
$$z^{(i)} = w^T x^{(i)} + b \tag{1}$$
$$\hat{y}^{(i)} = a^{(i)} = \mathrm{sigmoid}(z^{(i)}) \tag{2}$$
$$\mathcal{L}(a^{(i)}, y^{(i)}) = - y^{(i)} \log(a^{(i)}) - (1-y^{(i)}) \log(1-a^{(i)}) \tag{3}$$
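To make the three formulas above concrete, here is a minimal NumPy sketch of the forward pass for a single example; the values of w, b, x, and y are toy numbers chosen only for illustration:

import numpy as np

def sigmoid(z):
    # Equation (2): the sigmoid activation
    return 1 / (1 + np.exp(-z))

w = np.array([[0.1], [0.2], [-0.3]])  # weights, shape (3, 1)
b = 0.5                               # bias
x = np.array([[1.0], [2.0], [3.0]])   # one example with 3 features, shape (3, 1)
y = 1                                 # true label

z = np.dot(w.T, x) + b                             # equation (1): z = w^T x + b
a = sigmoid(z)                                     # equation (2): predicted probability
loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))  # equation (3): log loss for one example
print(a.item(), loss.item())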
