Deep Learning Study Notes: Starting from Logistic Regression

JasonLiu1919

Article link: https://blog.csdn.net/ljp1919/article/details/78081693

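Background:

Starting from logistic regression, this article introduces a simple application of a single-layer neural network to pattern classification: recognizing cats in images with logistic regression. Along the way we build a simple, commonly used learning pipeline:
1: initialize the parameters
2: compute the cost function and its gradient
3: apply an optimization algorithm such as gradient descent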

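Preparation:

numpy is the standard library for scientific computing in Python.
h5py is the Python interface for working with H5 files.
matplotlib is a widely used plotting library for Python.
PIL and scipy are used here to test the trained model on your own images.
If all of the dependencies above are installed correctly, the imports below will run without errors.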

import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset  # data loading; lr_utils is a custom helper file

%matplotlib inline
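'''
A note on the line above: matplotlib is the best-known plotting library for Python.
It can output figures in many formats and display them interactively through several GUI toolkits.
The %matplotlib magic command either embeds matplotlib figures directly in the Notebook
or displays them with a specified GUI backend; its argument selects the display mode.
inline embeds the figures in the Notebook.
'''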

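About the data:

Basic dataset information:

The data used in this article, data.h, is stored in the H5 format and contains:

A training set with ground-truth labels: y=1 for cat, y=0 for non-cat.
A test set with its labels.
Each image has shape (num_px, num_px, 3), where 3 is the number of RGB channels. The images are square: height = num_px and width = num_px.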

Loading the data:

# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

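Here train_set_x_orig and test_set_x_orig are the raw training and test sets (they will be preprocessed later). Each entry along the first axis of train_set_x_orig and test_set_x_orig is one image array, so indexing with index selects a single image.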

Data visualization:

# Example of a picture
index = 25
plt.imshow(train_set_x_orig[index])
print("type=train_set_x_orig",type(train_set_x_orig))
print("shape of image = ",train_set_x_orig[index].shape)
print(train_set_y[:, index])  # the result is a 1-element array
print("type of ",type(train_set_y[:, index]))
print("squeeze=",np.squeeze(train_set_y[:, index]))  # extract the corresponding y value
print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") +  "' picture.")

Output:

type=train_set_x_orig <class 'numpy.ndarray'>
shape of image =  (64, 64, 3)
[1]
<class 'numpy.ndarray'>
squeeze= 1
y = [1], it's a 'cat' picture.

[Figure: the training image at index 25]

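Getting the image dimensions:

We use the following notation:
- m_train (number of training examples)
- m_test (number of test examples)
- num_px (height and width of each training image)
Keep in mind that train_set_x_orig is a numpy array of shape (m_train, num_px, num_px, 3), so m_train can be read off as train_set_x_orig.shape[0].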

### START CODE HERE ### (≈ 3 lines of code)
m_train = len(train_set_x_orig)
m_test = len(test_set_x_orig)
num_px = train_set_x_orig.shape[1]
### END CODE HERE ###

print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))

Output:

Number of training examples: m_train = 209
Number of testing examples: m_test = 50
Height/Width of each image: num_px = 64
Each image is of size: (64, 64, 3)
train_set_x shape: (209, 64, 64, 3)
train_set_y shape: (1, 209)
test_set_x shape: (50, 64, 64, 3)
test_set_y shape: (1, 50)

From this output, the values of m_train, m_test and num_px are:

m_train	209
m_test	50
num_px	64

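For convenience, we reshape each image of shape (num_px, num_px, 3) into a numpy array of shape (num_px * num_px * 3, 1). After this conversion, each column of the training and test sets is one flattened example, and the number of columns equals the number of examples.
The flattening can be done with reshape:

X_flatten = X.reshape(X.shape[0], -1).T      # X.T is the transpose of X

A concrete example: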

# Reshape the training and test examples

### START CODE HERE ### (≈ 2 lines of code)
print("train_set_x_orig shape=",train_set_x_orig.shape)
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0],-1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0],-1).T
### END CODE HERE ###

print ("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
print ("test_set_y shape: " + str(test_set_y.shape))
print ("sanity check after reshaping: " + str(train_set_x_flatten[0:5,0]))

Output:

train_set_x_orig shape= (209, 64, 64, 3)
train_set_x_flatten shape: (12288, 209)
train_set_y shape: (1, 209)
test_set_x_flatten shape: (12288, 50)
test_set_y shape: (1, 50)
sanity check after reshaping: [17 31 56 22 33]
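To see why the reshape is followed by a transpose, here is a minimal sketch on a toy array. The array `toy` and its shape are made up for illustration; only the `reshape(..., -1).T` pattern comes from the article.

```python
import numpy as np

# Hypothetical toy batch: 2 "images" of size 2x2 with 3 channels.
toy = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)

# Flatten each image into one row, then transpose so each image is a column.
toy_flatten = toy.reshape(toy.shape[0], -1).T
print(toy_flatten.shape)  # (12, 2)

# Column j holds exactly the pixels of image j.
print(np.array_equal(toy_flatten[:, 1], toy[1].ravel()))

# Pitfall: reshaping straight to (12, 2) gives the same shape
# but mixes pixels from different images within a column.
wrong = toy.reshape(-1, toy.shape[0])
print(np.array_equal(wrong[:, 0], toy[0].ravel()))
```

The second check prints False, which is why the article reshapes to (m, -1) first and only then transposes.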

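In addition, a common preprocessing step is to center and standardize the data: subtract the mean pixel value of the whole dataset from each example, then divide by the standard deviation of the whole dataset. For image data, it is enough to simply divide by the maximum pixel value (255):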

train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.
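The two preprocessing options mentioned above can be compared on made-up data. This is only a sketch: the array `X` is random and stands in for a flattened image matrix, and the variable names are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for flattened images: 12 features, 5 examples, values in 0..255.
X = rng.integers(0, 256, size=(12, 5)).astype(np.float64)

# Option 1: center and standardize with the global mean and standard deviation.
X_std = (X - X.mean()) / X.std()

# Option 2 (used in this article): divide by the maximum possible pixel value.
X_scaled = X / 255.

print(X_std.mean(), X_std.std())
print(X_scaled.min(), X_scaled.max())
```

Option 1 yields zero mean and unit variance, while option 2 simply maps every pixel into the range [0, 1].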

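Algorithm design:

Logistic regression is illustrated below:
[Figure: diagram of the logistic regression model]
Logistic regression can be seen as a very simple neural network.

Mathematical formulation of logistic regression:

For a single example $x^{(i)}$: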

$z^{(i)} = w^T x^{(i)} + b \tag{1}$

$\hat{y}^{(i)} = a^{(i)} = sigmoid(z^{(i)})\tag{2}$

$\mathcal{L}(a^{(i)}, y^{(i)}) = - y^{(i)} \log(a^{(i)}) - (1-y^{(i)} ) \log(1-a^{(i)})\tag{3}$

Cost function over the whole training set:

$J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})\tag{6}$
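As a small worked example of equations (3) and (6), the sketch below computes the per-example loss and the overall cost. The activations `A` and labels `Y` are hypothetical values picked for illustration.

```python
import numpy as np

# Hypothetical activations and labels for m = 3 examples.
A = np.array([[0.9, 0.2, 0.6]])
Y = np.array([[1, 0, 1]])

# Per-example loss, equation (3).
losses = -(Y * np.log(A) + (1 - Y) * np.log(1 - A))

# Cost over the whole set, equation (6): the mean of the losses.
J = losses.mean()
print(losses, J)
```

A confident correct prediction (0.9 for a cat) costs little, while the hesitant 0.6 contributes the largest loss of the three.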

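Key steps:

Initialize the model parameters
Learn the model parameters by minimizing the cost function
Use the learned parameters to predict on the test set and so validate the model
Analyze the results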

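Implementation:

The main steps for building a neural network:

1: Define the model structure (such as the number of input features)
2: Initialize the model parameters
3: Loop:
- Compute the current loss, i.e. forward propagation
- Compute the current gradient, i.e. backward propagation
- Update the parameters with gradient descent

Steps 1-3 are usually implemented separately and then combined into a single function such as model().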

Activation function

The activation function analyzed above is sigmoid(), so we use $sigmoid( w^T x + b) = \frac{1}{1 + e^{-(w^T x + b)}}$ to make predictions from the input.

# GRADED FUNCTION: sigmoid

def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """

    ### START CODE HERE ### (≈ 1 line of code)
    s = 1/(1+np.exp(-z))
    ### END CODE HERE ###

    return s

Testing the function above:

print ("sigmoid([0, 2]) = " + str(sigmoid(np.array([0,2]))))

Output:

sigmoid([0, 2]) = [ 0.5         0.88079708]
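One practical caveat: for very large negative inputs, np.exp(-z) in the sigmoid above overflows and NumPy emits a runtime warning, even though the result still rounds to 0. The sketch below shows one common way to avoid this. The helper name `stable_sigmoid` is introduced here and is not part of the original post; it assumes `z` is a numpy array.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that avoids overflow in exp() for large |z| (illustrative helper)."""
    z = np.asarray(z, dtype=np.float64)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so this form cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, exp(z) < 1; rewrite sigmoid(z) as exp(z) / (1 + exp(z)).
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

print(stable_sigmoid(np.array([0.0, 2.0, -1000.0, 1000.0])))
```

For the small inputs in this article the plain one-line version is fine; the split form only matters when the inputs are not scaled.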

Step 1: Parameter initialization

Because this is logistic regression, we can initialize w as an all-zero vector (using np.zeros()).

Implementation:

# GRADED FUNCTION: initialize_with_zeros

def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """

    ### START CODE HERE ### (≈ 1 line of code)
    w = np.zeros((dim,1))
    b = 0
    ### END CODE HERE ###

    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b

Test code:

dim = 2
w, b = initialize_with_zeros(dim)
print ("w = " + str(w))
print ("b = " + str(b))

Output:

w = [[ 0.]
 [ 0.]]
b = 0

Note that np.zeros() accepts a dtype argument to specify the data type, for example dtype=int.
For the image data in this article, w has shape (num_px $\times$ num_px $\times$ 3, 1) and $w^T$ has shape (1, num_px $\times$ num_px $\times$ 3).

Step 2: Forward and backward propagation

After initializing the parameters, we learn them using forward and backward propagation.

Forward propagation:

- Take the input X
- Compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, ..., a^{(m)})$
- Compute the cost function: $J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)})+(1-y^{(i)})\log(1-a^{(i)})\right]$

The following two formulas are given directly; the proof is omitted here:

$\frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T\tag{7}$

$\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)})\tag{8}$
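For readers who want the intermediate steps, equations (7) and (8) follow from the chain rule applied to equations (1) to (3). The sketch below is a standard derivation rather than something taken from the original post; $w_j$ and $x_j^{(i)}$ denote the j-th components of $w$ and $x^{(i)}$, notation introduced here for illustration.

```latex
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial a^{(i)}} &= -\frac{y^{(i)}}{a^{(i)}} + \frac{1-y^{(i)}}{1-a^{(i)}}, \qquad
\frac{\partial a^{(i)}}{\partial z^{(i)}} = a^{(i)}\,(1-a^{(i)}) \\
\frac{\partial \mathcal{L}}{\partial z^{(i)}} &= \frac{\partial \mathcal{L}}{\partial a^{(i)}}\,\frac{\partial a^{(i)}}{\partial z^{(i)}}
 = -y^{(i)}(1-a^{(i)}) + (1-y^{(i)})\,a^{(i)} = a^{(i)} - y^{(i)} \\
\frac{\partial J}{\partial w_j} &= \frac{1}{m}\sum_{i=1}^m \frac{\partial \mathcal{L}}{\partial z^{(i)}}\,\frac{\partial z^{(i)}}{\partial w_j}
 = \frac{1}{m}\sum_{i=1}^m (a^{(i)} - y^{(i)})\,x_j^{(i)} \\
\frac{\partial J}{\partial b} &= \frac{1}{m}\sum_{i=1}^m \frac{\partial \mathcal{L}}{\partial z^{(i)}}\,\frac{\partial z^{(i)}}{\partial b}
 = \frac{1}{m}\sum_{i=1}^m (a^{(i)} - y^{(i)})
\end{aligned}
```

Stacking the components $\partial J / \partial w_j$ into a column vector gives $\frac{1}{m}X(A-Y)^T$, which is equation (7); the last line is equation (8).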

Implementation:

# GRADED FUNCTION: propagate

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """

    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    ### START CODE HERE ### (≈ 2 lines of code)
    A = sigmoid(np.dot(w.T,X)+b)            # compute activation
    # compute cost; Y and A both have shape (1, m), so log(A) must be transposed
    # for the dot product to yield a (1, 1) result
    cost = -np.sum(np.dot(Y,np.log(A).T) + np.dot(1-Y,np.log(1-A).T))/m
    ### END CODE HERE ###

    # BACKWARD PROPAGATION (TO FIND GRAD)
    ### START CODE HERE ### (≈ 2 lines of code)
    dw = np.dot(X,(A-Y).T)/m
    db = np.sum(A-Y,axis = 1, keepdims=True)/m  # db has shape (1, 1), so sum along each row (axis=1)
    ### END CODE HERE ###

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost

Test code:

w, b, X, Y = np.array([[1],[2]]), 2, np.array([[1,2],[3,4]]), np.array([[1,0]])
grads, cost = propagate(w, b, X, Y)
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print ("cost = " + str(cost))

Output:

dw = [[ 0.99993216]
 [ 1.99980262]]
db = [[ 0.49993523]]
cost = 6.00006477319
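A quick way to gain confidence in equations (7) and (8) is a numerical gradient check: compare the analytic gradients with central finite differences of the cost. The sketch below is self-contained and uses the same toy inputs as the test above; the helper names `cost_and_grads` and `numerical_grads` and the step size `eps` are choices made here, not part of the original code.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_and_grads(w, b, X, Y):
    """Cost J and the analytic gradients from equations (7) and (8)."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    return cost, dw, db

def numerical_grads(w, b, X, Y, eps=1e-6):
    """Central finite differences of the cost with respect to w and b."""
    dw = np.zeros_like(w, dtype=np.float64)
    for k in range(w.shape[0]):
        step = np.zeros_like(w, dtype=np.float64)
        step[k, 0] = eps
        dw[k, 0] = (cost_and_grads(w + step, b, X, Y)[0]
                    - cost_and_grads(w - step, b, X, Y)[0]) / (2 * eps)
    db = (cost_and_grads(w, b + eps, X, Y)[0]
          - cost_and_grads(w, b - eps, X, Y)[0]) / (2 * eps)
    return dw, db

# Same toy inputs as the test above, written as floats.
w, b = np.array([[1.], [2.]]), 2.
X, Y = np.array([[1., 2.], [3., 4.]]), np.array([[1., 0.]])

cost, dw, db = cost_and_grads(w, b, X, Y)
dw_num, db_num = numerical_grads(w, b, X, Y)
print(np.max(np.abs(dw - dw_num)), abs(db - db_num))
```

Both differences should be tiny; a large gap would point to a sign or transpose error in the analytic formulas.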

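Step 3: Optimization

So far we have initialized the parameters and computed the cost function and its gradient. Now we update the parameters using gradient descent.
The goal is to learn $w$ and $b$ by minimizing the cost function $J$. For a parameter $\theta$, the update rule is $\theta = \theta - \alpha \text{ } d\theta$, where $\alpha$ is the learning rate.
Each iteration runs forward and backward propagation to obtain the cost and gradients, and then updates the model parameters w and b.

Implementation: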

# GRADED FUNCTION: optimize

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    """
    This function optimizes w and b by running a gradient descent algorithm

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps

    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.

    Tips:
    You basically need to write down two steps and iterate through them:
        1) Calculate the cost and the gradient for the current parameters. Use propagate().
        2) Update the parameters using gradient descent rule for w and b.
    """

    costs = []

    for i in range(num_iterations):


        # Cost and gradient calculation (≈ 1-4 lines of code)
        ### START CODE HERE ### 
        grads, cost = propagate(w, b, X, Y)
        ### END CODE HERE ###

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # update rule (≈ 2 lines of code)
        ### START CODE HERE ###
        w = w - learning_rate * dw
        b = b - learning_rate * db
        ### END CODE HERE ###

        # Record the costs
        if i % 100 == 0:
            costs.append(cost)

        # Print the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))

    params = {"w": w,
              "b": b}

    grads = {"dw": dw,
             "db": db}

    return params, grads, costs

Test code:

params, grads, costs = optimize(w, b, X, Y, num_iterations= 100, learning_rate = 0.009, print_cost = False)

print ("w = " + str(params["w"]))
print ("b = " + str(params["b"]))
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))

Output:

w = [[ 0.1124579 ]
 [ 0.23106775]]
b = [[ 1.55930492]]
dw = [[ 0.90158428]
 [ 1.76250842]]
db = [[ 0.43046207]]

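Step 4: Prediction

The optimize function from Step 3 gives us the model parameters w and b, which we can use to make predictions on a dataset. Prediction is done by the predict() function in two steps:

Compute $\hat{Y} = A = \sigma(w^T X + b)$
Threshold the activations: an output below 0.5 is mapped to 0 and an output of 0.5 or above is mapped to 1. The results are stored in the Y_prediction matrix.

Implementation: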

# GRADED FUNCTION: predict

def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    print("size of X=",X.shape)
    m = X.shape[1]
    Y_prediction = np.zeros((1,m))
    print("shape size of w=",w.shape)
    w = w.reshape(X.shape[0], 1)
    print("reshape size of w=",w.shape)
    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    ### START CODE HERE ### (≈ 1 line of code)
    A = sigmoid(np.dot(w.T,X)+b)
    ### END CODE HERE ###
    print("size of A=",A.shape)
    for i in range(A.shape[1]):

        # Convert probabilities A[0,i] to actual predictions p[0,i]
        ### START CODE HERE ### (≈ 4 lines of code)
        Y_prediction[0, i] = 1 if A[0,i] >= 0.5 else 0
        ### END CODE HERE ###

    assert(Y_prediction.shape == (1, m))

    return Y_prediction

Test code:

print ("predictions = " + str(predict(w, b, X)))

Output:

size of X= (2, 2)
shape size of w= (2, 1)
reshape size of w= (2, 1)
size of A= (1, 2)
predictions = [[ 1.  1.]]
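The per-example loop in predict() can also be written as a single vectorized comparison. The sketch below is one possible variant, not the author's code: `predict_vectorized` is a name introduced here, and it uses the same `>= 0.5` threshold as the loop version.

```python
import numpy as np

def predict_vectorized(w, b, X):
    """Vectorized variant of predict(): threshold the activations at 0.5."""
    A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))
    # The comparison yields booleans; cast to float to match the loop version.
    return (A >= 0.5).astype(np.float64)

# Same toy inputs as the test above, written as floats.
w, b = np.array([[1.], [2.]]), 2.
X = np.array([[1., 2.], [3., 4.]])
print(predict_vectorized(w, b, X))
```

On large inputs this avoids a Python-level loop over the examples while returning an array of the same shape (1, m).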

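Summary:

To recap the steps so far:

Initialize the parameters (w, b)
Iteratively optimize the cost function to learn the best parameters (w, b)
- Compute the cost function and its gradient
- Update the parameters using gradient descent
Use the learned parameters (w, b) to predict on the test set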

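Step 5: Building the complete logistic regression model

In Steps 1 to 4 we implemented each function separately; now we combine them into a single function to build a complete model.
The model function uses the following notation:

Y_prediction_test is the prediction on the test set
Y_prediction_train is the prediction on the training set
w, costs, grads are the outputs of optimize()

Implementation: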

# GRADED FUNCTION: model

def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    """
    Builds the logistic regression model by calling the function you've implemented previously

    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to true to print the cost every 100 iterations

    Returns:
    d -- dictionary containing information about the model.
    """

    ### START CODE HERE ###

    #1： initialize parameters with zeros (≈ 1 line of code)
    w, b = initialize_with_zeros(X_train.shape[0])

    #2： Gradient descent (≈ 1 line of code)
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)

    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]

    # Predict test/train set examples (≈ 2 lines of code)
    Y_prediction_test = predict(w,b,X_test)
    Y_prediction_train = predict(w,b,X_train)

    ### END CODE HERE ###

    # Print train/test Errors
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))


    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test, 
         "Y_prediction_train" : Y_prediction_train, 
         "w" : w, 
         "b" : b,
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}

    return d

Test code:

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = False)

Output:

size of X= (12288, 50)
shape size of w= (12288, 1)
reshape size of w= (12288, 1)
size of A= (1, 50)
size of X= (12288, 209)
shape size of w= (12288, 1)
reshape size of w= (12288, 1)
size of A= (1, 209)
train accuracy: 99.04306220095694 %
test accuracy: 70.0 %
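The accuracy expression used in model() works because predictions and labels are both 0 or 1, so the absolute difference is 1 for a mistake and 0 otherwise. A minimal sketch with made-up predictions and labels:

```python
import numpy as np

# Hypothetical predictions and labels for 5 examples, both of shape (1, m).
Y_pred = np.array([[1., 0., 1., 1., 0.]])
Y_true = np.array([[1, 0, 0, 1, 1]])

# The mean of |prediction - label| is the error rate.
acc_from_abs = 100 - np.mean(np.abs(Y_pred - Y_true)) * 100

# Equivalent, more direct form: the fraction of matching entries.
acc_direct = np.mean(Y_pred == Y_true) * 100
print(acc_from_abs, acc_direct)
```

Three of the five predictions match, so both expressions give the same accuracy.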

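The output shows that the model fits the training set very well, with accuracy close to 100%, while test accuracy is 70%. Given the small dataset, this is not bad for a simple linear classifier such as logistic regression. Note, however, that the gap between the near-100% training accuracy and the 70% test accuracy indicates overfitting, which can be mitigated with techniques such as regularization. That topic is not covered here.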
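One common way to reduce overfitting is L2 regularization, which penalizes large weights. The sketch below shows how the cost and gradient computation might change. It is an illustration under assumptions, not part of the original post: the function name `propagate_l2` and the strength parameter `lambd` are introduced here, and the value 0.1 is arbitrary.

```python
import numpy as np

def propagate_l2(w, b, X, Y, lambd=0.1):
    """Cost and gradients with an added L2 penalty (lambd / (2m)) * ||w||^2.

    lambd is a hypothetical regularization strength; it does not appear
    in the original propagate().
    """
    m = X.shape[1]
    A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))
    cross_entropy = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    cost = cross_entropy + (lambd / (2 * m)) * np.sum(w ** 2)
    # The penalty adds (lambd / m) * w to dw; the bias is left unregularized.
    dw = np.dot(X, (A - Y).T) / m + (lambd / m) * w
    db = np.sum(A - Y) / m
    return {"dw": dw, "db": db}, cost

# Toy inputs, as in the earlier propagate() test.
w, b = np.array([[1.], [2.]]), 2.
X, Y = np.array([[1., 2.], [3., 4.]]), np.array([[1., 0.]])
grads, cost = propagate_l2(w, b, X, Y, lambd=0.1)
print(grads["dw"], grads["db"], cost)
```

With lambd set to 0 the function reduces to the unregularized cost and gradients; larger values pull the weights toward zero more strongly.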

Inspecting test-set predictions:

We can look at the prediction for an individual test example:

# Example of a test picture and its prediction.
index = 1
plt.imshow(test_set_x[:,index].reshape((num_px, num_px, 3)))
print(d["Y_prediction_test"][0,index])  # this is a float
print(type(d["Y_prediction_test"][0,index]))
print ("y = " + str(test_set_y[0,index]) + ", you predicted that it is a \"" + classes[int(d["Y_prediction_test"][0,index])].decode("utf-8") +  "\" picture.")

Output:

1.0
<class 'numpy.float64'>
y = 1, you predicted that it is a "cat" picture.

[Figure: the test image at index 1]

Plotting the learning curve:

Implementation:

# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

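[Figure: learning curve, cost versus iterations (in hundreds)]
The plot shows that the cost is decreasing. Training for more iterations would raise training accuracy further, but test accuracy might drop noticeably because of overfitting.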

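Step 6: Further analysis

Choosing the learning rate:

If the learning rate is too large, the updates may overshoot the optimum and oscillate back and forth; if it is too small, convergence is slow and many iterations are needed.

Let us compare the results for several learning rates.
Code: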

learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print ("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)
    print ('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')

legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()

Output:

size of X= (12288, 50)
shape size of w= (12288, 1)
reshape size of w= (12288, 1)
size of A= (1, 50)
size of X= (12288, 209)
shape size of w= (12288, 1)
reshape size of w= (12288, 1)
size of A= (1, 209)
train accuracy: 68.42105263157895 %
test accuracy: 36.0 %

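[Figure: cost curves for learning rates 0.01, 0.001 and 0.0001]
The plot shows:

When the learning rate is too large (0.01), the cost oscillates up and down and may even diverge. In this case 0.01 still ended up at a good final cost, but that was a matter of luck.
A lower cost does not by itself mean a better model: check for overfitting, which shows up when training accuracy is much higher than test accuracy.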
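The effect of the learning rate is easiest to see on a one-dimensional problem. The sketch below runs gradient descent on f(x) = x**2, a toy function chosen here for illustration rather than the cat classifier; the function name `descend` and the three rates are assumptions made for this example.

```python
import numpy as np

def descend(alpha, steps=50, x0=1.0):
    """Run gradient descent on f(x) = x**2 (gradient 2x) and return the iterates."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - alpha * 2 * xs[-1])
    return np.array(xs)

for alpha in (1.1, 0.9, 0.01):
    xs = descend(alpha)
    print(alpha, abs(xs[-1]))
```

Each step multiplies x by (1 - 2 * alpha). With alpha = 1.1 the iterates flip sign and grow, so the run diverges. With alpha = 0.9 they still flip sign but shrink toward 0. With alpha = 0.01 they shrink steadily but are still far from 0 after 50 steps.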

Step 7: Testing with your own image

We now have a trained model, so we can feed it one of our own pictures and let it classify the image.

## START CODE HERE ## (PUT YOUR IMAGE NAME) 
my_image = "my_image2.jpg"   # change this to the name of your image file 
## END CODE HERE ##

# We preprocess the image to fit your algorithm.
# ndimage.imread and scipy.misc.imresize were removed from recent SciPy releases, so PIL is used instead.
fname = "images/" + my_image
image = np.array(Image.open(fname).convert("RGB"))
my_image = np.array(Image.fromarray(image).resize((num_px, num_px))).reshape((1, num_px*num_px*3)).T
my_predicted_image = predict(d["w"], d["b"], my_image)

plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") +  "\" picture.")

Output:

size of X= (12288, 1)
shape size of w= (12288, 1)
reshape size of w= (12288, 1)
size of A= (1, 1)
y = 1.0, your algorithm predicts a "cat" picture.

[Figure: the user-supplied test image]
