This post was reposted, and the images from the original were lost in the process.
The original version is available at:
http://www.missshi.cn/api/view/blog/59aa08fee519f50d04000170
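In this lesson we learn how to implement Logistic regression in Python.
This is the first lesson that contains Python code, so we start with some Python programming basics.
The code in this post has been improved and bug-fixed by the author and is published on GitHub at:
https://github.com/Lite-Java/missshi_deeplearning_ai
Building basic functions with numpy
numpy is the most commonly used Python library for scientific computing. We now look at some of its common functions.
Exercise 1: implement the sigmoid function with np.exp():
Before using np.exp(), we first implement the sigmoid function with math.exp() and compare the two to highlight the advantages of np.exp().
The sigmoid function is defined as
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}$$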
import math

def basic_sigmod(x):
    """
    Compute the sigmoid of a single scalar.
    """
    s = 1.0 / (1.0 + math.exp(-x))
    return s

print(basic_sigmod(3))
# 0.9525741268224334
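The function above applies sigmoid to a single scalar, but in deep learning we usually apply it to vectors or matrices.
If we pass a vector or matrix (here a Python list) to this function, it raises an exception:
print(basic_sigmod([3, 2, 1]))
# raises TypeError: the scalar implementation cannot handle a list
With np.exp(), in contrast, a vector or matrix input produces a vector or matrix output of the same shape: the exponential is computed element by element.
import numpy as np
x = np.array([1, 2, 3])
print(np.exp(x))
# [ 2.71828183 7.3890561 20.08553692]
In addition, the arithmetic operators (+, -, *, /) on numpy arrays are also applied element by element.
For example:
x = np.array([1, 2, 3])
print(x + 3)
# [4 5 6]
Next, we implement a real sigmoid function that works on vectors and matrices: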
import numpy as np

def sigmod(x):
    """
    Sigmoid function that also works on vectors and matrices.
    """
    s = 1.0 / (1.0 + np.exp(-x))
    return s

x = np.array([1, 2, 3])
print(sigmod(x))  # call sigmod here, not np.exp
# [0.73105858 0.88079708 0.95257413]
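As a quick sanity check on the point above, the sketch below compares a vectorized sigmoid with a scalar one applied in a loop. The names scalar_sigmoid and vector_sigmoid are hypothetical and chosen only for this illustration.

```python
import math
import numpy as np

def scalar_sigmoid(x):
    # Works on one number at a time (illustrative helper).
    return 1.0 / (1.0 + math.exp(-x))

def vector_sigmoid(x):
    # Works element by element on any numpy array (illustrative helper).
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([1.0, 2.0, 3.0])
vectorized = vector_sigmoid(x)
looped = np.array([scalar_sigmoid(v) for v in x])
print(np.allclose(vectorized, looped))

# The same function also accepts a matrix and keeps its shape.
matrix = np.array([[0.0, -1.0], [1.0, 2.0]])
print(vector_sigmoid(matrix).shape)
```

Both versions should agree up to floating-point rounding; the numpy version simply avoids the explicit Python loop.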
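Exercise 2: compute the derivative of the sigmoid function
In the earlier theory lessons we saw that the derivative of the sigmoid function is
$$\sigma'(x) = \sigma(x)\,(1 - \sigma(x)), \qquad \sigma(x) = \mathrm{sigmoid}(x)$$
Next, we implement it in Python: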
def sigmoid_derivative(x):
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.
    Arguments:
    x -- A scalar or numpy array
    Return:
    ds -- Your computed gradient.
    """
    s = 1.0 / (1.0 + np.exp(-x))
    ds = s * (1 - s)
    return ds

x = np.array([1, 2, 3])
print("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))
# sigmoid_derivative(x) = [0.19661193 0.10499359 0.04517666]
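One way to gain confidence in the formula s * (1 - s) is to compare it with a numerical derivative. The sketch below uses a central finite difference; the step size eps and the tolerance are assumptions picked for this illustration, not values from the course.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([1.0, 2.0, 3.0])
eps = 1e-5  # assumed step size for the finite difference

# Central difference: (f(x + eps) - f(x - eps)) / (2 * eps)
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_derivative(x)

max_gap = np.max(np.abs(numeric - analytic))
print(max_gap < 1e-8)
```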
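Exercise 3: convert an image into a vector
numpy has two frequently used tools: X.shape and X.reshape().
X.shape gives the dimensions of the array X.
X.reshape() changes the dimensions (shape) of the array.
For example, a color image is usually stored as a three-dimensional array (with the three RGB channels). In deep learning, however, we usually need to convert it into a vector of length length*height*3.
In other words, we need to turn a three-dimensional array into a one-dimensional vector.
Next, we implement an image2vector function whose input is a three-dimensional array of shape (length, height, 3) and whose output is a column vector of shape (length*height*3, 1).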
def image2vector(image):
    """
    Argument:
    image -- a numpy array of shape (length, height, depth)
    Returns:
    v -- a vector of shape (length*height*depth, 1)
    """
    v = image.reshape((image.shape[0] * image.shape[1] * image.shape[2], 1))
    return v

image = np.array([[[0.67826139, 0.29380381],
                   [0.90714982, 0.52835647],
                   [0.4215251, 0.45017551]],
                  [[0.92814219, 0.96677647],
                   [0.85304703, 0.52351845],
                   [0.19981397, 0.27417313]],
                  [[0.60659855, 0.00533165],
                   [0.10820313, 0.49978937],
                   [0.34144279, 0.94630077]]])
print("image2vector(image) = " + str(image2vector(image)))
# [[ 0.67826139] [ 0.29380381] [ 0.90714982] [ 0.52835647] [ 0.4215251 ] [ 0.45017551] [ 0.92814219] [ 0.96677647] [ 0.85304703] [ 0.52351845] [ 0.19981397] [ 0.27417313] [ 0.60659855] [ 0.00533165] [ 0.10820313] [ 0.49978937] [ 0.34144279] [ 0.94630077]]
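As a small variation, reshape also accepts -1 for one dimension and infers its size, so the product of the three dimensions does not have to be written out. The sketch below uses a made-up integer array as a stand-in for an image and checks that the flattening can be undone.

```python
import numpy as np

# Stand-in for a tiny image of shape (length, height, depth) = (3, 3, 2).
image = np.arange(18).reshape(3, 3, 2)

# -1 asks numpy to infer length*height*depth from the array size.
v = image.reshape(-1, 1)
print(v.shape)

# Reshaping back to the original shape recovers the same array.
restored = v.reshape(image.shape)
print(np.array_equal(restored, image))
```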
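Exercise 4: normalize rows
A common technique in deep learning is to normalize the data.
After normalization, gradient descent usually converges noticeably faster.
Next, we normalize a matrix row by row, so that every row of the result has length (L2 norm) 1.
For example: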
def normalizeRows(x):
    """
    Implement a function that normalizes each row of the matrix x (to have unit length).
    Argument:
    x -- A numpy matrix of shape (n, m)
    Returns:
    x -- The normalized (by row) numpy matrix. You are allowed to modify x.
    """
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)  # length of each row, as a column vector of shape (n, 1)
    x = x / x_norm  # numpy broadcasting: divide the (n, m) matrix by the (n, 1) column vector
    return x

x = np.array([
    [0, 3, 4],
    [1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))
# normalizeRows(x) = [[0. 0.6 0.8] [0.13736056 0.82416338 0.54944226]]
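To see why keepdims=True matters for the division, the sketch below prints the intermediate shapes and confirms that every normalized row has length 1. It reuses the example matrix from above, written with floats.

```python
import numpy as np

x = np.array([[0.0, 3.0, 4.0],
              [1.0, 6.0, 4.0]])

# keepdims=True keeps the result as a (2, 1) column instead of a flat (2,) array.
x_norm = np.linalg.norm(x, axis=1, keepdims=True)
print(x_norm.shape)

# (2, 3) / (2, 1): the column is broadcast across the 3 columns of x.
normalized = x / x_norm

# Each row of the result should now have L2 norm 1.
row_lengths = np.linalg.norm(normalized, axis=1)
print(row_lengths)
```

Without keepdims=True the norms would have shape (2,), which cannot be broadcast against a (2, 3) matrix.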
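The code above relies on broadcasting, so next we look at how broadcasting is used.
Exercise 5: broadcasting and the softmax function
Broadcasting is a very powerful numpy feature that lets us compute quickly between matrices, vectors, and scalars of different shapes.
Next, we implement a softmax function. For a row vector x with m entries it is defined as
$$\mathrm{softmax}(x)_j = \frac{e^{x_j}}{\sum_{k=1}^{m} e^{x_k}}$$
and for a matrix it is applied to each row separately.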
def softmax(x):
    """Calculates the softmax for each row of the input x.
    Your code should work for a row vector and also for matrices of shape (n, m).
    Argument:
    x -- A numpy matrix of shape (n,m)
    Returns:
    s -- A numpy matrix equal to the softmax of x, of shape (n,m)
    """
    x_exp = np.exp(x)  # (n, m)
    x_sum = np.sum(x_exp, axis=1, keepdims=True)  # (n, 1)
    s = x_exp / x_sum  # (n, m), the (n, 1) column is broadcast across the columns
    return s

x = np.array([
    [9, 2, 5, 0, 0],
    [7, 5, 0, 0, 0]])
print("softmax(x) = " + str(softmax(x)))
# softmax(x) = [[ 9.80897665e-01 8.94462891e-04 1.79657674e-02 1.21052389e-04 1.21052389e-04] [ 8.78679856e-01 1.18916387e-01 8.01252314e-04 8.01252314e-04 8.01252314e-04]]
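Two properties of softmax are easy to check in code: every row sums to 1, and subtracting the row maximum before exponentiating does not change the result. The second variant, called softmax_shifted here, is a common trick against overflow for large inputs; it is not part of the original exercise and the name is hypothetical.

```python
import numpy as np

def softmax(x):
    x_exp = np.exp(x)
    return x_exp / np.sum(x_exp, axis=1, keepdims=True)

def softmax_shifted(x):
    # Subtract each row's maximum first; the ratio is mathematically unchanged.
    shifted = x - np.max(x, axis=1, keepdims=True)
    x_exp = np.exp(shifted)
    return x_exp / np.sum(x_exp, axis=1, keepdims=True)

x = np.array([[9.0, 2.0, 5.0, 0.0, 0.0],
              [7.0, 5.0, 0.0, 0.0, 0.0]])

s = softmax(x)
row_sums = np.sum(s, axis=1)
same = np.allclose(s, softmax_shifted(x))
print(row_sums, same)

# Large inputs stay finite with the shifted variant.
large = softmax_shifted(np.array([[1000.0, 1001.0]]))
print(np.all(np.isfinite(large)))
```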
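Vectorization
In deep learning we usually work with large datasets.
As a result, computation speed can become the bottleneck of the whole training process.
To keep our computations efficient, we need to vectorize them.
Next, we compare the efficiency of the dot product, outer product, and elementwise multiplication with and without vectorization.
First, the implementation with plain Python loops: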
import time
import numpy as np

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot += x1[i] * x2[i]
toc = time.process_time()
print("dot ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
tic = time.process_time()
outer = np.zeros((len(x1), len(x2)))  # we create a len(x1)*len(x2) matrix with only zeros
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i, j] = x1[i] * x2[j]
toc = time.process_time()
print("outer ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### CLASSIC ELEMENTWISE IMPLEMENTATION ###
tic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i] * x2[i]
toc = time.process_time()
print("elementwise multiplication ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(3, len(x1))  # Random 3*len(x1) numpy array
tic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i, j] * x1[j]
toc = time.process_time()
print("gdot ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
# dot ----- Computation time = 0.17002099999974263ms
# outer ----- Computation time = 0.34057500000006513ms
# elementwise multiplication ----- Computation time = 0.1940779999998199ms
# gdot ----- Computation time = 0.2362039999999066ms
Next, the vectorized implementation:
import time
import numpy as np

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]
W = np.random.rand(3, len(x1))  # same shape as W in the loop version

### VECTORIZED DOT PRODUCT OF VECTORS ###
tic = time.process_time()
dot = np.dot(x1, x2)
toc = time.process_time()
print("dot ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### VECTORIZED OUTER PRODUCT ###
tic = time.process_time()
outer = np.outer(x1, x2)
toc = time.process_time()
print("outer ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### VECTORIZED ELEMENTWISE MULTIPLICATION ###
tic = time.process_time()
mul = np.multiply(x1, x2)
toc = time.process_time()
print("elementwise multiplication ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### VECTORIZED GENERAL DOT PRODUCT ###
tic = time.process_time()
gdot = np.dot(W, x1)
toc = time.process_time()
print("gdot ----- Computation time = " + str(1000 * (toc - tic)) + "ms")
# dot ----- Computation time = 0.16546899999991815ms
# outer ----- Computation time = 0.14168100000011563ms
# elementwise multiplication ----- Computation time = 0.10738799999998605ms
# gdot ----- Computation time = 0.38393900000022185ms
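From the code above we can see that the vectorized version is much simpler.
In the run shown here, the dot product, outer product, and elementwise multiplication were also somewhat faster, while the general dot product (gdot) was not. With inputs this small, the timings are dominated by call overhead and measurement noise; as the data grows, the advantage of vectorization becomes much more pronounced.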
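To make the effect of data size visible, the sketch below repeats the dot-product comparison on much longer vectors. The vector length n and the use of time.perf_counter() are assumptions made for this illustration; the measured times depend on the machine, so no expected timings are shown.

```python
import time
import numpy as np

np.random.seed(0)
n = 200000  # assumed size, large enough for the loop cost to dominate
a = np.random.rand(n)
b = np.random.rand(n)

# Explicit Python loop.
tic = time.perf_counter()
loop_dot = 0.0
for i in range(n):
    loop_dot += a[i] * b[i]
loop_ms = 1000 * (time.perf_counter() - tic)

# Vectorized version.
tic = time.perf_counter()
vec_dot = np.dot(a, b)
vec_ms = 1000 * (time.perf_counter() - tic)

print("loop: " + str(loop_ms) + "ms, vectorized: " + str(vec_ms) + "ms")
print(np.isclose(loop_dot, vec_dot))
```

Both versions compute the same value up to rounding; only the time needed differs.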
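Exercise 1: implement the L1 loss function
We use numpy functions to implement the L1 loss.
The L1 loss is defined as
$$L_1(\hat{y}, y) = \sum_{i=1}^{m} \left| y^{(i)} - \hat{y}^{(i)} \right|$$
where $\hat{y}$ is the predicted value and $y$ is the true value.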
import numpy as np

def L1(yhat, y):
    """
    Arguments:
    yhat -- vector of size m (predicted labels)
    y -- vector of size m (true labels)
    Returns:
    loss -- the value of the L1 loss function defined above
    """
    loss = np.sum(np.abs(y - yhat))
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat, y)))
# L1 = 1.1
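Exercise 2: implement the L2 loss function
The L2 loss is defined as
$$L_2(\hat{y}, y) = \sum_{i=1}^{m} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$$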
import numpy as np

def L2(yhat, y):
    """
    Arguments:
    yhat -- vector of size m (predicted labels)
    y -- vector of size m (true labels)
    Returns:
    loss -- the value of the L2 loss function defined above
    """
    loss = np.sum(np.power((y - yhat), 2))
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat, y)))
# L2 = 0.43
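Since the L2 loss is the sum of squared differences, it can also be written as the dot product of the difference vector with itself. The sketch below computes both losses on the example data from above and checks that the two L2 forms agree; the variable names are chosen only for this illustration.

```python
import numpy as np

yhat = np.array([0.9, 0.2, 0.1, 0.4, 0.9])
y = np.array([1, 0, 0, 1, 1])

diff = y - yhat

l1 = np.sum(np.abs(diff))   # sum of absolute differences
l2_sum = np.sum(diff ** 2)  # sum of squared differences
l2_dot = np.dot(diff, diff) # same quantity as a dot product

print(l1, l2_sum, l2_dot)
```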
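Implementing Logistic regression
In the rest of this post we implement a complete Logistic regression model: initializing the parameters, computing the cost function and its gradient, optimizing with gradient descent, and finally combining these pieces into a single function.
The goal of this experiment is to train a model that decides whether an image shows a cat.
We will use the following libraries:
numpy: the most important library for scientific computing in Python
h5py: a library for working with H5 files from Python
matplotlib: a plotting library for Python
PIL: an image library for Python
scipy: a scientific computing library for Python
At the top of the program we first import these libraries: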
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage

# Show matplotlib figures inline (IPython/Jupyter only)
%matplotlib inline
Before training we first need to load the data. The loading code is as follows:
def load_dataset():
    """
    Load the dataset.
    """
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")  # open the H5 file
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # your train set labels
    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # your test set labels
    classes = np.array(test_dataset["list_classes"][:])  # the list of classes
    # reshape the training and test labels to shape (1, m)
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
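Ps: To make this lesson easier to follow, we provide the original dataset.
Visit http://www.missshi.cn/#/books and search for train_catvnoncat.h5 and test_catvnoncat.h5 to download the files. The page may load slowly on the first visit (about 10 s).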
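About the data:
In the labels, an image of a cat is labeled 1 and any other image is labeled 0.
Every image has shape (num_px, num_px, 3): height and width are equal, and 3 is the number of RGB channels.
The names train_set_x_orig and test_set_x_orig carry the suffix _orig because we will preprocess the images later; the preprocessed variables will be called train_set_x and test_set_x.
Each element of train_set_x_orig corresponds to one image, which we can display with the following code: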
index = 25
plt.imshow(train_set_x_orig[index])
print("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")
# y = [1], it's a 'cat' picture.
Next, we read the training set size, the test set size, and the image size from the image arrays:
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
print(m_train, m_test, num_px)
# 209 50 64
Next, we convert each image into a vector that becomes one column of a matrix.
The whole training set then becomes a matrix with num_px*num_px*3 rows and m_train columns.
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
Ps: X_flatten = X.reshape(X.shape[0], -1).T converts an array of shape (a, b, c, d) into a matrix of shape (b*c*d, a).
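The effect of this reshape is easiest to see on a tiny made-up array. In the sketch below, X stands in for 2 images of 3x3 pixels with 3 channels; the comparison with a naive reshape is added only to illustrate why the transpose form is used.

```python
import numpy as np

# 2 fake "images" of shape (3, 3, 3), filled with consecutive integers.
X = np.arange(2 * 3 * 3 * 3).reshape(2, 3, 3, 3)

# Flatten each image into a row, then transpose so that each image is a column.
X_flatten = X.reshape(X.shape[0], -1).T
print(X_flatten.shape)

# Column 0 holds exactly the pixels of image 0, in order.
first_column_matches = np.array_equal(X_flatten[:, 0], X[0].ravel())
print(first_column_matches)

# A naive reshape to the same target shape mixes pixels from different images.
naive = X.reshape(-1, X.shape[0])
naive_matches = np.array_equal(naive[:, 0], X[0].ravel())
print(naive_matches)
```

The naive version has the right shape but the wrong content, which is why the rows are flattened first and transposed afterwards.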
Next, we normalize the pixel values.
Since the raw pixel values lie between 0 and 255, the simplest approach is to divide them by 255.
train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.
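Next, let us look at the structure of Logistic regression.
For a single training example $x^{(i)}$, the loss is computed as
$$z^{(i)} = w^T x^{(i)} + b, \qquad a^{(i)} = \mathrm{sigmoid}(z^{(i)})$$
$$\mathcal{L}(a^{(i)}, y^{(i)}) = -y^{(i)} \log a^{(i)} - (1 - y^{(i)}) \log(1 - a^{(i)})$$
The overall cost function is
$$J = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(a^{(i)}, y^{(i)})$$
We implement Logistic regression in the following steps:
1. Define the model structure
2. Initialize the model parameters
3. Loop
3.1 Forward propagation
3.2 Backward propagation
3.3 Update the parameters
4. Combine everything into a complete model
Step1: implement the sigmoid function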
def sigmoid(z):
    """
    Compute the sigmoid of z
    Arguments:
    z -- A scalar or numpy array of any size.
    Return:
    s -- sigmoid(z)
    """
    s = 1.0 / (1.0 + np.exp(-z))
    return s
Step2: initialize the parameters
def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.
    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)
    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    w = np.zeros((dim, 1))
    b = 0
    return w, b
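Step3: forward and backward propagation
Ps: the formulas are as follows (see the earlier theory lessons for their derivation):
$$A = \mathrm{sigmoid}(w^T X + b)$$
$$J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + (1 - y^{(i)}) \log(1 - a^{(i)}) \right]$$
$$\frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T, \qquad \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (a^{(i)} - y^{(i)})$$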
def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above
    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)
    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b
    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """
    m = X.shape[1]
    # FORWARD PROPAGATION (FROM X TO COST)
    A = sigmoid(np.dot(w.T, X) + b)  # compute activation
    cost = -1.0 / m * np.sum(Y * np.log(A) + (1.0 - Y) * np.log(1.0 - A))  # compute cost
    # BACKWARD PROPAGATION (TO FIND GRAD)
    dw = 1.0 / m * np.dot(X, (A - Y).T)
    db = 1.0 / m * np.sum(A - Y)
    cost = np.squeeze(cost)
    grads = {"dw": dw,
             "db": db}
    return grads, cost
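A common way to check gradients like these is to compare them with finite differences of the cost. The sketch below does this on a tiny made-up problem; cost_and_grads is a hypothetical, compact stand-in for propagate, and the values of w, b, X, Y, and eps are assumptions chosen only for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads(w, b, X, Y):
    # Same formulas as in the forward/backward pass described above.
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    return cost, dw, db

# Tiny made-up problem: 2 features, 3 examples.
w = np.array([[0.1], [-0.2]])
b = 0.3
X = np.array([[1.0, 2.0, -1.0],
              [3.0, 0.5, -2.0]])
Y = np.array([[1.0, 0.0, 1.0]])

cost, dw, db = cost_and_grads(w, b, X, Y)

# Central finite differences, one parameter at a time.
eps = 1e-6
num_dw = np.zeros_like(w)
for k in range(w.shape[0]):
    step = np.zeros_like(w)
    step[k, 0] = eps
    c_plus = cost_and_grads(w + step, b, X, Y)[0]
    c_minus = cost_and_grads(w - step, b, X, Y)[0]
    num_dw[k, 0] = (c_plus - c_minus) / (2 * eps)
num_db = (cost_and_grads(w, b + eps, X, Y)[0]
          - cost_and_grads(w, b - eps, X, Y)[0]) / (2 * eps)

print(np.allclose(num_dw, dw, atol=1e-6), abs(num_db - db) < 1e-6)
```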
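Step4: update the parameters
The update rule, with learning rate $\alpha$, is
$$w := w - \alpha \, \frac{\partial J}{\partial w}, \qquad b := b - \alpha \, \frac{\partial J}{\partial b}$$
The complete code is as follows: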
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    """
    This function optimizes w and b by running a gradient descent algorithm
    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps
    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.
    Tips:
    You basically need to write down two steps and iterate through them:
    1) Calculate the cost and the gradient for the current parameters. Use propagate().
    2) Update the parameters using gradient descent rule for w and b.
    """
    costs = []
    for i in range(num_iterations):  # one pass per iteration; num_iterations is the number of iterations
        # Cost and gradient calculation
        grads, cost = propagate(w, b, X, Y)
        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]
        # update rule
        w = w - learning_rate * dw
        b = b - learning_rate * db
        # Record the costs
        if i % 100 == 0:
            costs.append(cost)
        # Print the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
    params = {"w": w,
              "b": b}
    grads = {"dw": dw,
             "db": db}
    return params, grads, costs
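To watch the update rule at work without the cat dataset, the sketch below runs the same loop on a tiny made-up problem with one feature. The data, the learning rate of 0.1, and the 300 iterations are assumptions chosen only for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: one feature, label 1 when the feature is positive.
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y = np.array([[0.0, 0.0, 1.0, 1.0]])
m = X.shape[1]

w = np.zeros((1, 1))
b = 0.0
learning_rate = 0.1  # assumed value for this toy problem
costs = []

for i in range(300):
    # Forward pass and cost.
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    # Backward pass.
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    # Gradient descent update.
    w = w - learning_rate * dw
    b = b - learning_rate * db
    # Record the cost every 100 iterations, as optimize() does.
    if i % 100 == 0:
        costs.append(float(cost))

print(costs)
```

With zero-initialized parameters the first recorded cost equals log(2), and the recorded costs shrink as the weight grows in the direction that separates the two classes.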
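Step5: use the trained model to predict on the test set
The prediction is computed as
$$\hat{Y} = A = \mathrm{sigmoid}(w^T X + b)$$
When the output is greater than 0.5, we predict that the image is a cat; otherwise we predict that it is not.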
def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)
    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)
    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    A = sigmoid(np.dot(w.T, X) + b)
    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions p[0,i]
        if A[0][i] > 0.5:
            Y_prediction[0][i] = 1
        else:
            Y_prediction[0][i] = 0
    return Y_prediction
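In the spirit of the vectorization section, the thresholding loop can also be written as a single array comparison. The sketch below uses made-up activations and checks that both forms give the same predictions.

```python
import numpy as np

# Made-up activations for four examples.
A = np.array([[0.9, 0.5, 0.2, 0.51]])

# Loop version, as in predict().
Y_loop = np.zeros((1, A.shape[1]))
for i in range(A.shape[1]):
    Y_loop[0, i] = 1 if A[0, i] > 0.5 else 0

# Vectorized version: compare the whole array at once, then convert booleans to floats.
Y_vec = (A > 0.5).astype(float)

print(np.array_equal(Y_loop, Y_vec))
```

Note that an activation of exactly 0.5 is classified as 0 in both versions, because the comparison is strict.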
Step6: combine the functions above into one model:
def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=False):
    """
    Builds the logistic regression model by calling the function you've implemented previously
    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to true to print the cost every 100 iterations
    Returns:
    d -- dictionary containing information about the model.
    """
    # initialize parameters with zeros
    w, b = initialize_with_zeros(X_train.shape[0])
    # Gradient descent
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]
    # Predict test/train set examples
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)
    # Print train/test accuracy
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}
    return d
Let us test the model:
d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=2000, learning_rate=0.005, print_cost=True)
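Looking at the printed output, we can see that the test accuracy reaches 70.0%.
On the training set, the accuracy reaches 99%. This indicates that the model overfits to some degree. Do not worry, we will address this problem in later lessons.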
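The accuracy printed by model() is computed as 100 minus the mean absolute difference between predictions and labels, times 100. The sketch below applies that formula to made-up 0/1 labels and compares it with simply counting matches; the arrays are assumptions for illustration only.

```python
import numpy as np

# Made-up labels and predictions for five examples.
Y_true = np.array([[1, 0, 1, 1, 0]])
Y_pred = np.array([[1, 0, 0, 1, 1]])

# Formula used in model(): each wrong 0/1 prediction contributes |difference| = 1.
accuracy = 100 - np.mean(np.abs(Y_pred - Y_true)) * 100

# Equivalent view: the percentage of positions where prediction and label match.
accuracy_by_matches = np.mean(Y_pred == Y_true) * 100

print(accuracy, accuracy_by_matches)
```

Here 3 of the 5 predictions are correct, so both expressions give 60 percent up to floating-point rounding.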
With the following code we can pick individual images and look at the prediction:
# Example of a picture that was wrongly classified.
index = 14
plt.imshow(test_set_x[:,index].reshape((num_px, num_px, 3)))
print ("y = " + str(test_set_y[0,index]) + ", you predicted that it is a \"" + classes[int(d["Y_prediction_test"][0,index])].decode("utf-8") + "\" picture.")
In addition, we can plot how the cost function changes during training:
# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()
In the earlier theory lessons we mentioned that the learning rate has a large influence on the final result. We now run an experiment to make this concrete.
learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=1500, learning_rate=i, print_cost=False)
    print('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label=str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()
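Analysis: different learning rates lead to different results. A small learning rate converges slowly, while a learning rate that is too large can cause oscillation or prevent convergence.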
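The same behaviour can be reproduced on a much simpler problem. The sketch below runs gradient descent on f(w) = w^2, whose gradient is 2w; the function, the starting point, and the three learning rates are assumptions chosen only to illustrate slow convergence, fast convergence, and divergence.

```python
def run_gd(learning_rate, steps=20, w0=1.0):
    # Gradient descent on f(w) = w**2, whose gradient is 2 * w.
    w = w0
    for _ in range(steps):
        w = w - learning_rate * 2.0 * w
    return w

slow = run_gd(0.01)     # each step multiplies w by 0.98: slow progress toward 0
good = run_gd(0.4)      # each step multiplies w by 0.2: fast convergence
diverged = run_gd(1.1)  # each step multiplies w by -1.2: oscillates and grows

print(abs(slow), abs(good), abs(diverged))
```

After the same number of steps, the well-chosen rate is closest to the minimum at 0, the small rate has moved only part of the way, and the overly large rate has moved away from it.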
What if you want to use one of your own images instead of an image from the training or test set?
## START CODE HERE ## (PUT YOUR IMAGE NAME)
my_image = "my_image.jpg"  # change this to the name of your image file
## END CODE HERE ##
# We preprocess the image to fit your algorithm.
fname = "images/" + my_image
# ndimage.imread and scipy.misc.imresize were removed from recent SciPy releases, so PIL is used here instead
pil_image = Image.open(fname).convert("RGB")  # read the image as RGB
image = np.array(pil_image)
resized = np.array(pil_image.resize((num_px, num_px)))  # resize to (num_px, num_px)
# scale to [0, 1] like the training data, then flatten into a column vector
my_image = (resized / 255.).reshape((1, num_px * num_px * 3)).T
my_predicted_image = predict(d["w"], d["b"], my_image)  # predict
plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image))].decode("utf-8") + "\" picture.")