Deep Learning Notes (1), Week 2 PA: Logistic Regression with a Neural Network Mindset

Latest recommended article published on 2022-07-17 22:01:29

honghu_zero

Article link: https://blog.csdn.net/qq_36935593/article/details/79149387


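This post covers the Week 2 programming assignment of Neural Networks and Deep Learning, the first course of Deep Learning.ai. It introduces a binary classifier: logistic regression.

Below we work through a binary classification example on one dataset to understand logistic regression in detail.
(Each part of the explanation comes with its Python implementation.)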

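I. Processing the data

About the data

The dataset for this post is a set of images stored in HDF5 format (HDF5 is an efficient binary format for storing array data). It consists of two data files, train_catvnoncat.h5 and test_catvnoncat.h5, each containing two kinds of information, x and y.

Here x is a four-dimensional array of RGB image data (number of images × 64 × 64 pixels × 3 RGB channels), and y is the set of 0/1 labels marking each image as cat (1) or non-cat (0). There are 209 training images and 50 test images.
If you cannot find the data files, you can download them from my GitHub: https://github.com/honghu-zero/deep-learning-coursera

Loading the data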

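import numpy as np
import matplotlib.pyplot as plt
# load_dataset is assumed to come from the course's helper module lr_utils
from lr_utils import load_dataset
# Load the dataset (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, \
    test_set_y, classes = load_dataset()     # raw 4-D data: num x 64 x 64 x 3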

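The data is now loaded, but it is unstructured image data that logistic regression cannot consume directly. We need to "flatten" it, turning each image into a one-dimensional vector (one column per image):

# shape[0] is the number of images
# -1 tells NumPy's reshape to infer the number of columns automatically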
train_set_x_flatten = train_set_x_orig.reshape(
        train_set_x_orig.shape[0], -1).T
test_set_x_flatten = test_set_x_orig.reshape(
        test_set_x_orig.shape[0], -1).T
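
As a small sanity check on the flattening step, the sketch below uses a tiny made-up array (5 "images" of 4x4x3, a stand-in for the real num x 64 x 64 x 3 data) to show why the reshape is followed by `.T`. The array and the variable names `images`, `flat`, and `wrong` are illustrative assumptions, not part of the assignment.

```python
import numpy as np

m = 5
# tiny stand-in for the real data: m images of 4 x 4 pixels with 3 channels
images = np.arange(m * 4 * 4 * 3).reshape(m, 4, 4, 3)

# the approach used above: one row per image, then transpose to one column per image
flat = images.reshape(images.shape[0], -1).T

# a tempting shortcut with the same shape, but it interleaves pixels of different images
wrong = images.reshape(-1, images.shape[0])

same_shape = flat.shape == wrong.shape
column_is_image = np.array_equal(flat[:, 2], images[2].ravel())
shortcut_is_image = np.array_equal(wrong[:, 2], images[2].ravel())

print(flat.shape)          # (48, 5)
print(column_is_image)     # True
print(shortcut_is_image)   # False
```

Both results have shape (48, 5), so a shape check alone would not catch the difference; only the first keeps each column equal to one complete image.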


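Normalizing the data

We usually scale the data into the range 0 to 1, that is, divide by the maximum possible value of the data.

# the maximum value of RGB data is 255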
train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.

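II. Designing the logistic regression algorithm

About the model

The logistic regression model is a classification model, or classification decision function, also called a classifier. In this post, where we decide whether an image is a cat (labeled 1) or not (labeled 0), we take the feature vector $x$ as input and predict the probability that the image is a cat, so $\hat y$ is a number between 0 and 1. If the probability is greater than 0.5 we treat the image as a cat; otherwise we do not.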

def predict(w, b, X):
    """
    Compute vector "A" predicting the probabilities
    of a cat being present in the picture
    """
    m = X.shape[1]
    w = w.reshape(X.shape[0], 1)

    # sigmoid is defined in the next code block
    A = sigmoid(np.dot(w.T, X) + b)

    # threshold the probabilities: 1 if A > 0.5, otherwise 0
    Y_prediction = (A > 0.5).astype(float).reshape(1, m)

    return Y_prediction
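
To make the 0.5 threshold concrete, here is a minimal self-contained sketch that applies the same rule to three hand-picked points. The weights, bias, and inputs are made-up illustrative values, and the helper name `predict_labels` is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict_labels(w, b, X):
    # probability of class 1 for every column of X, then threshold at 0.5
    A = sigmoid(np.dot(w.T, X) + b)
    return (A > 0.5).astype(float)

# illustrative parameters: z = x1 - x2
w = np.array([[1.0], [-1.0]])
b = 0.0

# three samples as columns: z = 2, z = -2, z = 0
X = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0]])

labels = predict_labels(w, b, X)
print(labels)   # [[1. 0. 0.]]
```

The third sample lands exactly on the decision boundary (probability 0.5) and is labeled 0, because the rule uses a strict "greater than".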

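Here we need to introduce a special function, the sigmoid function:

$\sigma (z) = \frac{1}{1+e^{-z}}$
This function was first used by the Belgian mathematician Pierre François Verhulst in his study of population growth. Its graph is an S-shaped curve, and the function changes fastest at the center point $z=0$.
[Figure: graph of the sigmoid function]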

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s
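
The claim that the sigmoid changes fastest at its center can be checked numerically. The sketch below uses the standard identity $\sigma'(z) = \sigma(z)(1-\sigma(z))$; the helper name `sigmoid_slope` and the sampling grid are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_slope(z):
    # derivative of the sigmoid: sigma(z) * (1 - sigma(z))
    s = sigmoid(z)
    return s * (1 - s)

# sample the slope on a grid from -6 to 6 and find where it peaks
z = np.linspace(-6, 6, 121)
slopes = sigmoid_slope(z)
steepest_z = z[np.argmax(slopes)]

print(sigmoid(0.0))          # 0.5
print(sigmoid_slope(0.0))    # 0.25
print(steepest_z)
```

The slope peaks at $z=0$ with value 0.25 and shrinks toward 0 in both tails, which is why very large or very small inputs produce tiny gradients.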

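With this function as the prediction model, applying a linear function with parameters $w,b$ to the feature vector $x$ and passing the result through the sigmoid gives the complete logistic regression model. Its formula is:

$\hat y = \sigma(w^Tx+b),\ \text{where}\ \sigma(z) = \frac{1}{1+e^{-z}}$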

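Determining the parameters

Of course, with the model given we still need to determine the parameters $w,b$. Here we introduce two functions, the loss function and the cost function, and determine the best parameters through the feedback of the results onto the parameters (also called backpropagation).

Loss function: measures the error between the predicted result $\hat y$ and the actual result $y$. We want the loss to be as small as possible; here we use the following error measure:

$\mathcal{L} (\hat y,y) = -(y\log\hat y+(1-y)\log(1-\hat y))$
Note that this function is convex in the parameters $w,b$! Convex functions play a very important role in optimization, because any local minimum of a convex function is also a global minimum, so a general-purpose optimization algorithm can reach the optimum reliably.

However, this loss function is defined for a single $y$ only. The cost function below reflects the performance over all training samples.

$J(w,b) = -\frac{1}{m}\sum_{i=1}^m[y^{(i)}\log(\hat y^{(i)}) + (1-y^{(i)})\log(1-\hat y^{(i)})]$
Likewise, this function is also convex.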
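
To get a feel for how this loss behaves, the following sketch evaluates it for a true label $y=1$ at three made-up predictions. The function name `loss` and the sample values are illustrative assumptions.

```python
import numpy as np

def loss(y_hat, y):
    # cross-entropy loss for a single sample
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

confident_right = loss(0.9, 1)   # about 0.105
unsure = loss(0.5, 1)            # about 0.693
confident_wrong = loss(0.1, 1)   # about 2.303

print(confident_right, unsure, confident_wrong)
```

The loss grows sharply as the prediction moves away from the true label, so confident mistakes are penalized far more than hesitant ones.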

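Gradient descent

Next we use gradient descent to determine the parameters $w,b$. I will not go over the idea behind gradient descent again; for details, see my other post, "Unconstrained Optimization Problems (2)". Here is the iteration directly:

$w=w-\alpha \frac{\partial J}{\partial w}\\ b=b-\alpha\frac{\partial J}{\partial b}$
The step size $\alpha$ here is also called the learning rate. We set it to 0.5 for now (the final run further below passes 0.005 instead).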
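
The same update rule can be seen in isolation on a one-dimensional toy problem. This is only a sketch: the function $J(w) = (w-3)^2$, the learning rate 0.1, and the iteration count are made-up choices for illustration, not values from the assignment.

```python
def grad(w):
    # derivative of the toy cost J(w) = (w - 3)^2
    return 2 * (w - 3)

alpha = 0.1   # learning rate, chosen for this toy problem only
w = 0.0       # initial value

for _ in range(200):
    w = w - alpha * grad(w)   # same form as w = w - alpha * dJ/dw

print(w)   # very close to the minimizer w = 3
```

Each step moves `w` against the gradient, and the distance to the minimizer shrinks by a constant factor (0.8 here) per iteration.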

Since we are using an iteration, we naturally need to set initial values:

def initialize_with_zeros(dim):
    """
    Argument:
    dim -- the length of w
    """

    w = np.zeros(shape=(dim, 1))
    b = 0    #use broadcast

    return w, b

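To compute the gradient (the first derivative of a multivariate function), we can draw a computation graph, work through it step by step, and finally multiply the pieces together with the chain rule.
[Figure: computation graph for logistic regression]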
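
The chain-rule steps can be written out as follows. This is a sketch for a single sample, where the intermediate names $z = w^Tx+b$ and $a = \hat y = \sigma(z)$ are introduced here only for the derivation:

$$\frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1-y}{1-a},\qquad \frac{da}{dz} = a(1-a)$$

$$\frac{\partial \mathcal{L}}{\partial z} = \frac{\partial \mathcal{L}}{\partial a}\cdot\frac{da}{dz} = a - y$$

$$\frac{\partial \mathcal{L}}{\partial w} = x\,(a-y),\qquad \frac{\partial \mathcal{L}}{\partial b} = a-y$$

Averaging over all $m$ samples, with $X$ holding the samples as columns and $A$, $Y$ the row vectors of predictions and labels:

$$\frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T,\qquad \frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^m\left(a^{(i)}-y^{(i)}\right)$$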

def propagate(w, b, X, Y):
    """
    Implements the propagation explained above: the cost function and its gradient
    """

    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    A = sigmoid(np.dot(w.T, X) + b)        # compute activation
    cost = (- 1 / m) * np.sum(Y * np.log(A) + (1 - Y) * (
            np.log(1 - A)))                # compute cost

    # BACKWARD PROPAGATION (TO FIND GRAD)
    dw = (1 / m) * np.dot(X, (A - Y).T) 
    db = (1 / m) * np.sum(A - Y) 

    cost = np.squeeze(cost)

    grads = {"dw": dw,
             "db": db}

    return grads, cost
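
One way to gain confidence in hand-derived gradients like these is a finite-difference check. The sketch below recomputes the same cost and gradient formulas in a self-contained helper (named `cost_and_grads` here, a hypothetical name) and compares them with central differences on small random data; the data, seed, and step size `eps` are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_and_grads(w, b, X, Y):
    # same formulas as propagate above, returned as a plain tuple
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = (-1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dw = (1 / m) * np.dot(X, (A - Y).T)
    db = (1 / m) * np.sum(A - Y)
    return cost, dw, db

# small random problem: 3 features, 8 samples
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 8))
Y = (rng.random((1, 8)) > 0.5).astype(float)
w = rng.normal(size=(3, 1)) * 0.1
b = 0.2

_, dw, db = cost_and_grads(w, b, X, Y)

# central differences: (J(theta + eps) - J(theta - eps)) / (2 * eps)
eps = 1e-6
dw_numeric = np.zeros_like(w)
for j in range(w.shape[0]):
    step = np.zeros_like(w)
    step[j, 0] = eps
    cost_plus = cost_and_grads(w + step, b, X, Y)[0]
    cost_minus = cost_and_grads(w - step, b, X, Y)[0]
    dw_numeric[j, 0] = (cost_plus - cost_minus) / (2 * eps)

db_numeric = (cost_and_grads(w, b + eps, X, Y)[0]
              - cost_and_grads(w, b - eps, X, Y)[0]) / (2 * eps)

max_dw_error = np.max(np.abs(dw - dw_numeric))
db_error = abs(db - db_numeric)
print(max_dw_error, db_error)
```

If the analytic and numeric gradients disagree by much more than rounding error, the backward-propagation formulas are the first place to look.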

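Okay, with the gradient and cost functions written, let's write the iteration that solves for the parameters $w,b$. Note that the number of iterations (num_iterations) also has to be set here; 2000 is a reasonable starting value.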

def optimize(w, b, X, Y, num_iterations, learning_rate,
             print_cost = False):

    costs = []

    for i in range(num_iterations):
        # Cost and gradient calculation  
        grads, cost = propagate(w, b, X, Y)

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # update rule 
        w = w - learning_rate * dw   # broadcast
        b = b - learning_rate * db

        # Record the costs
        if i % 100 == 0:
            costs.append(cost)

        # Print the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w,
              "b": b}

    grads = {"dw": dw,
             "db": db}

    return params, grads, costs

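Implementing the logistic regression model

All the preparation above is done. Next we use these pieces to build the logistic regression model function that decides whether an image is a cat or not: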

def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, 
          learning_rate=0.5, print_cost=False):
    """
    Returns:
    d -- dictionary containing information about the model.
    """

    # Initialize the parameters
    w, b = initialize_with_zeros(X_train.shape[0])

    # Determine the parameters with gradient descent
    parameters, grads, costs = optimize(w, b, X_train, Y_train,
                                        num_iterations, learning_rate,
                                        print_cost)

    w = parameters["w"]
    b = parameters["b"]

    # Predict the example images in the training and test sets
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)

    # Print the prediction accuracy
    print("train accuracy: {} %".format(100 - np.mean(
            np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(
            np.abs(Y_prediction_test - Y_test)) * 100))


    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test, 
         "Y_prediction_train" : Y_prediction_train, 
         "w" : w, 
         "b" : b,
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}

    return d
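
For readers without the cat dataset at hand, the whole pipeline can be tried on synthetic data. The sketch below is a condensed, self-contained version of the same training loop; the two-cluster dataset, the seed, the learning rate 0.1, and the helper name `train` are all assumptions made for illustration and are not part of the assignment.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(X, Y, num_iterations=2000, learning_rate=0.1):
    # gradient descent on the logistic regression cost, starting from zeros
    n, m = X.shape
    w, b = np.zeros((n, 1)), 0.0
    for _ in range(num_iterations):
        A = sigmoid(np.dot(w.T, X) + b)
        dw = np.dot(X, (A - Y).T) / m
        db = np.sum(A - Y) / m
        w = w - learning_rate * dw
        b = b - learning_rate * db
    return w, b

# synthetic data: 200 points in 2-D, labels alternate 0, 1, 0, 1, ...
rng = np.random.default_rng(0)
m = 200
Y = (np.arange(m) % 2).reshape(1, m).astype(float)
# class 0 is centered at (-2, -2), class 1 at (2, 2), both with unit noise
X = rng.normal(size=(2, m)) + 4 * Y - 2

w, b = train(X, Y)

# same accuracy formula as in model above
predictions = (sigmoid(np.dot(w.T, X) + b) > 0.5).astype(float)
accuracy = 100 - np.mean(np.abs(predictions - Y)) * 100
print("train accuracy: {} %".format(accuracy))
```

Because the two clusters are well separated, the training accuracy should come out high; on real image data the gap between training and test accuracy is the more interesting number.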

A program that uses this model to classify an image (taking the test image at index 20 as an example):

d = model(train_set_x, train_set_y, test_set_x, 
          test_set_y, num_iterations = 2000,
          learning_rate = 0.005, print_cost = True)    

index = 20
plt.imshow(test_set_x[:,index].reshape((64, 64, 3)))

print ("y = " + str(test_set_y[0, index]) + 
       ", you predicted that it is a \"" + 
       classes[int(d["Y_prediction_test"][0, index])].decode("utf-8")
       +"\" picture.")

Result of the run:
[Figure: output of the run]
