In this lesson, we will learn how to implement logistic regression in Python.
This is the first coding session of the course, so we will start with some basic Python programming.
Building basic functions with numpy
numpy is the most commonly used library for scientific computing in Python. Below we will learn some of the functions it provides.
Exercise 1: implement the sigmoid function using np.exp()
Before using np.exp(), we first implement the sigmoid function with math.exp(), and compare the two to highlight the advantage of np.exp().
Here, sigmoid(x) = 1 / (1 + e^(-x)).
import math

def basic_sigmoid(x):
    """Compute the sigmoid of a single scalar."""
    s = 1.0 / (1 + 1 / math.exp(x))
    return s

print(basic_sigmoid(3))
# 0.9525741268224334
The above shows how to apply the sigmoid function to a scalar, but in deep learning applications we usually apply it to vectors or matrices.
If we call this function on a vector or matrix, an exception is thrown:
print(basic_sigmoid([3, 2, 1]))
# raises a TypeError
With np.exp, on the other hand, a vector or matrix input produces a vector or matrix output: the exponential is applied to each element.
import numpy as np

x = np.array([1, 2, 3])
print(np.exp(x))
# [ 2.71828183  7.3890561  20.08553692]
In addition, for variables of type numpy array, the arithmetic operators are overloaded to work element-wise.
For example:
x = np.array([1, 2, 3])
print(x + 3)
# [4 5 6]
Next, let's implement a real sigmoid function that works on vectors and matrices.
The requirement: for an input array x, compute sigmoid(x) = 1 / (1 + e^(-x)) element-wise.
import numpy as np

def sigmoid(x):
    """Sigmoid function, works for scalars, vectors, and matrices."""
    s = 1.0 / (1 + 1 / np.exp(x))
    return s

x = np.array([1, 2, 3])
print(sigmoid(x))
# [0.73105858 0.88079708 0.95257413]
Exercise 2: compute the derivative of the sigmoid function
From the earlier theory lessons, the derivative of the sigmoid function is sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)).
Let's implement it in Python:
def sigmoid_derivative(x):
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.

    Arguments:
    x -- A scalar or numpy array

    Return:
    ds -- Your computed gradient.
    """
    s = 1.0 / (1 + 1 / np.exp(x))
    ds = s * (1 - s)
    return ds

x = np.array([1, 2, 3])
print("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))
# sigmoid_derivative(x) = [0.19661193 0.10499359 0.04517666]
Exercise 3: convert an image to a vector
numpy arrays offer two commonly used tools: shape and reshape().
X.shape lets you inspect the dimensions of the current array.
X.reshape() changes the array's dimensions or shape.
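As a minimal sketch of how the two work together (the values here are made up for illustration):

import numpy as np

x = np.arange(6)        # array([0, 1, 2, 3, 4, 5])
print(x.shape)          # (6,)
y = x.reshape((2, 3))   # reinterpret the same 6 elements as 2 rows of 3
print(y.shape)          # (2, 3)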
For example, a color image is usually stored as a three-dimensional array (one channel each for R, G, and B). In deep learning applications, however, we usually need to convert it into a vector of length 3 * length * width.
That is, we need to turn a three-dimensional array into a one-dimensional vector.
Next, we implement an image2vector function whose input is a three-dimensional array of shape (length, height, 3) and whose output is a vector.
def image2vector(image):
    """
    Argument:
    image -- a numpy array of shape (length, height, depth)

    Returns:
    v -- a vector of shape (length*height*depth, 1)
    """
    v = image.reshape((image.shape[0] * image.shape[1] * image.shape[2], 1))
    return v

image = np.array([[[0.67826139, 0.29380381],
                   [0.90714982, 0.52835647],
                   [0.4215251 , 0.45017551]],
                  [[0.92814219, 0.96677647],
                   [0.85304703, 0.52351845],
                   [0.19981397, 0.27417313]],
                  [[0.60659855, 0.00533165],
                   [0.10820313, 0.49978937],
                   [0.34144279, 0.94630077]]])

print("image2vector(image) = " + str(image2vector(image)))
# [[0.67826139] [0.29380381] [0.90714982] [0.52835647] [0.4215251 ] [0.45017551]
#  [0.92814219] [0.96677647] [0.85304703] [0.52351845] [0.19981397] [0.27417313]
#  [0.60659855] [0.00533165] [0.10820313] [0.49978937] [0.34144279] [0.94630077]]
Exercise 4: normalize by row
A common trick in deep learning is to normalize our data.
Usually, gradient descent converges noticeably faster after the data has been normalized.
Here we normalize a matrix row by row, so that each row ends up with unit length.
For example, we divide each row by its L2 norm: x_normalized = x / ||x||, where the norm is computed per row.
def normalizeRows(x):
    """
    Implement a function that normalizes each row of the matrix x (to have unit length).

    Argument:
    x -- A numpy matrix of shape (n, m)

    Returns:
    x -- The normalized (by row) numpy matrix. You are allowed to modify x.
    """
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)  # length of each row, as a column vector
    x = x / x_norm  # divide the matrix by the column vector via numpy broadcasting
    return x

x = np.array([[0, 3, 4],
              [1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))
# normalizeRows(x) = [[0.         0.6        0.8       ]
#                     [0.13736056 0.82416338 0.54944226]]
The code above relies on broadcasting, so let's focus on how broadcasting is used.
Exercise 5: broadcasting and the softmax function
Broadcasting is a very powerful feature of numpy: it lets us compute quickly between matrices, vectors, and scalars of different dimensions.
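As a minimal sketch of the rules (the array values are made up for illustration): a row vector is stretched down across the rows, and a column vector is stretched across the columns:

import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)
row = np.array([[10, 20, 30]])     # shape (1, 3)
print(m + row)                     # the row is broadcast to both rows
# [[11 22 33]
#  [14 25 36]]

col = np.array([[1],
                [2]])              # shape (2, 1)
print(m / col)                     # the column is broadcast across the columns
# [[1.  2.  3. ]
#  [2.  2.5 3. ]]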
Next, we implement a softmax function, defined row-wise as softmax(x)_i = e^(x_i) / sum_j e^(x_j):
def softmax(x):
    """
    Calculates the softmax for each row of the input x.
    Your code should work for a row vector and also for matrices of shape (n, m).

    Argument:
    x -- A numpy matrix of shape (n, m)

    Returns:
    s -- A numpy matrix equal to the softmax of x, of shape (n, m)
    """
    x_exp = np.exp(x)                             # (n, m)
    x_sum = np.sum(x_exp, axis=1, keepdims=True)  # (n, 1)
    s = x_exp / x_sum                             # (n, m), thanks to broadcasting
    return s

x = np.array([[9, 2, 5, 0, 0],
              [7, 5, 0, 0, 0]])
print("softmax(x) = " + str(softmax(x)))
# softmax(x) = [[9.80897665e-01 8.94462891e-04 1.79657674e-02 1.21052389e-04 1.21052389e-04]
#               [8.78679856e-01 1.18916387e-01 8.01252314e-04 8.01252314e-04 8.01252314e-04]]
Vectorization
In deep learning we usually work with very large datasets.
Computation speed can therefore become the bottleneck of the whole training process.
To keep our computations efficient, we need to vectorize them.
Let's compare the efficiency of the dot product, the outer product, and element-wise multiplication with and without vectorization.
First, the naive loop-based implementations:
import time
import numpy as np

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot += x1[i] * x2[i]
toc = time.process_time()
print("dot ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
tic = time.process_time()
outer = np.zeros((len(x1), len(x2)))  # we create a len(x1)*len(x2) matrix with only zeros
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i, j] = x1[i] * x2[j]
toc = time.process_time()
print("outer ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### CLASSIC ELEMENTWISE IMPLEMENTATION ###
tic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i] * x2[i]
toc = time.process_time()
print("elementwise multiplication ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(3, len(x1))  # Random 3*len(x1) numpy array
tic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i, j] * x1[j]
toc = time.process_time()
print("gdot ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

# dot ----- Computation time = 0.17002099999974263ms
# outer ----- Computation time = 0.34057500000006513ms
# elementwise multiplication ----- Computation time = 0.1940779999998199ms
# gdot ----- Computation time = 0.2362039999999066ms
Next, the vectorized implementations:
x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

### VECTORIZED DOT PRODUCT OF VECTORS ###
tic = time.process_time()
dot = np.dot(x1, x2)
toc = time.process_time()
print("dot ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### VECTORIZED OUTER PRODUCT ###
tic = time.process_time()
outer = np.outer(x1, x2)
toc = time.process_time()
print("outer ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### VECTORIZED ELEMENTWISE MULTIPLICATION ###
tic = time.process_time()
mul = np.multiply(x1, x2)
toc = time.process_time()
print("elementwise multiplication ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

### VECTORIZED GENERAL DOT PRODUCT ###
tic = time.process_time()
dot = np.dot(W, x1)
toc = time.process_time()
print("gdot ----- Computation time = " + str(1000 * (toc - tic)) + "ms")

# dot ----- Computation time = 0.16546899999991815ms
# outer ----- Computation time = 0.14168100000011563ms
# elementwise multiplication ----- Computation time = 0.10738799999998605ms
# gdot ----- Computation time = 0.38393900000022185ms
From the results above, the vectorized code is clearly much simpler.
The running time also drops somewhat. The reduction is small here only because the data is small; as the amount of data grows, the speedup becomes more and more pronounced.
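To see this, here is a quick sketch at a larger scale (one million elements; the exact timings depend on your machine):

import time
import numpy as np

n = 10 ** 6
a = np.random.rand(n)
b = np.random.rand(n)

tic = time.process_time()
dot_loop = 0
for i in range(n):       # explicit Python loop
    dot_loop += a[i] * b[i]
toc = time.process_time()
print("loop dot: " + str(1000 * (toc - tic)) + "ms")

tic = time.process_time()
dot_vec = np.dot(a, b)   # vectorized
toc = time.process_time()
print("np.dot:   " + str(1000 * (toc - tic)) + "ms")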
Exercise 1: implement the L1 loss function
We implement the L1 loss function using numpy.
The L1 loss is defined as L1(yhat, y) = sum(|y - yhat|),
where yhat denotes the predicted value and y the true value.
import numpy as np

def L1(yhat, y):
    """
    Arguments:
    yhat -- vector of size m (predicted labels)
    y -- vector of size m (true labels)

    Returns:
    loss -- the value of the L1 loss function defined above
    """
    loss = np.sum(np.abs(y - yhat))
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat, y)))
# L1 = 1.1
Exercise 2: implement the L2 loss function
The L2 loss is defined as L2(yhat, y) = sum((y - yhat)^2):
import numpy as np

def L2(yhat, y):
    """
    Arguments:
    yhat -- vector of size m (predicted labels)
    y -- vector of size m (true labels)

    Returns:
    loss -- the value of the L2 loss function defined above
    """
    loss = np.sum(np.power((y - yhat), 2))
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat, y)))
# L2 = 0.43
Implementing logistic regression
In what follows we implement a complete logistic regression model: initializing the parameters, computing the cost function and its gradients, and optimizing with gradient descent, then assembling the pieces into a single function.
The task is to train a classifier that decides whether an image contains a cat.
We will use the following libraries:
numpy: the fundamental library for scientific computing in Python
h5py: a library for interacting with H5 files
matplotlib: Python's plotting library
PIL: an imaging library for Python
scipy: a library for scientific computing
At the top of the program, we first import them:
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage

# display matplotlib figures inline (Jupyter magic)
%matplotlib inline
Before training, we first need to load the data:
def load_dataset():
    """Load the cat/non-cat dataset."""
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")  # read the H5 file
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])    # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])    # your train set labels
    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])       # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])       # your test set labels
    classes = np.array(test_dataset["list_classes"][:])             # the list of classes
    # reshape the train and test labels to row vectors
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
Notes on the data:
In the training labels, a cat is marked 1, everything else 0.
Every image has shape (num_px, num_px, 3): the width and height are equal, and 3 means RGB.
train_set_x_orig and test_set_x_orig carry the _orig suffix because we will preprocess the images shortly; the preprocessed variables will be named train_set_x and test_set_x.
Each element of train_set_x_orig corresponds to one image, which we can display with the following code:
index = 25
plt.imshow(train_set_x_orig[index])
print("y = " + str(train_set_y[:, index]) + ", it's a '" +
      classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")
# y = [1], it's a 'cat' picture.
Next, we compute the sizes of the training set and the test set, and the image size, from the data:
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
print(m_train, m_test, num_px)
# 209, 50, 64
Next, we convert each image into a vector, i.e. one column of a matrix.
The whole training set then becomes a single matrix with num_px * num_px * 3 rows and m_train columns.
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
Ps: X_flatten = X.reshape(X.shape[0], -1).T converts a matrix of shape (a, b, c, d) into a matrix of shape (b*c*d, a).
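A quick sketch to verify the shapes, using this dataset's dimensions (209 images of 64x64x3):

import numpy as np

X = np.random.rand(209, 64, 64, 3)       # (a, b, c, d)
X_flatten = X.reshape(X.shape[0], -1).T  # -1 lets numpy infer b*c*d
print(X_flatten.shape)                   # (12288, 209), i.e. (b*c*d, a)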
Next, we normalize the pixel values.
Since raw pixel values lie between 0 and 255, the simplest approach is to divide by 255:
train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.
Now let's look at the structure of logistic regression:
For each training example x, the activation is a = sigmoid(w^T x + b), and the loss is L(a, y) = -(y * log(a) + (1 - y) * log(1 - a)).
The overall cost is the average over all m examples: J = (1/m) * sum_i L(a_i, y_i).
We will implement logistic regression in the following steps:
1. Define the model structure
2. Initialize the model parameters
3. Loop:
3.1 forward propagation
3.2 backward propagation
3.3 parameter update
4. Assemble everything into a complete model
Step 1: implement the sigmoid function
def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """
    s = 1.0 / (1 + 1 / np.exp(z))
    return s
Step 2: initialize the parameters
def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    w = np.zeros((dim, 1))
    b = 0
    return w, b
Step 3: forward and backward propagation
Ps: the formulas used below are A = sigmoid(w^T X + b), cost = -(1/m) * sum(Y * log(A) + (1 - Y) * log(1 - A)), dw = (1/m) * X (A - Y)^T, and db = (1/m) * sum(A - Y). (See the earlier theory lessons for their derivation.)
def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """
    m = X.shape[1]
    # FORWARD PROPAGATION (FROM X TO COST)
    A = sigmoid(np.dot(w.T, X) + b)  # compute activation
    cost = -1.0 / m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))  # compute cost
    # BACKWARD PROPAGATION (TO FIND GRAD)
    dw = 1.0 / m * np.dot(X, (A - Y).T)
    db = 1.0 / m * np.sum(A - Y)
    cost = np.squeeze(cost)
    grads = {"dw": dw,
             "db": db}
    return grads, cost
Step 4: update the parameters
The update rule is w := w - learning_rate * dw and b := b - learning_rate * db.
The complete code:
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    """
    This function optimizes w and b by running a gradient descent algorithm

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps

    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.

    Tips:
    You basically need to write down two steps and iterate through them:
    1) Calculate the cost and the gradient for the current parameters. Use propagate().
    2) Update the parameters using gradient descent rule for w and b.
    """
    costs = []
    for i in range(num_iterations):  # one gradient step per pass; num_iterations is the iteration count
        # Cost and gradient calculation
        grads, cost = propagate(w, b, X, Y)
        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]
        # update rule
        w = w - learning_rate * dw
        b = b - learning_rate * db
        # Record the costs
        if i % 100 == 0:
            costs.append(cost)
        # Print the cost every 100 training examples
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
    params = {"w": w,
              "b": b}
    grads = {"dw": dw,
             "db": db}
    return params, grads, costs
Step 5: use the trained model to predict on the test set
The prediction is computed as a = sigmoid(w^T X + b).
When an activation is greater than 0.5, we predict that the image is a cat; otherwise, not a cat.
def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)
    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    A = sigmoid(np.dot(w.T, X) + b)
    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions p[0,i]
        if A[0][i] > 0.5:
            Y_prediction[0][i] = 1
        else:
            Y_prediction[0][i] = 0
    return Y_prediction
Step 6: assemble the pieces above into one model:
def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=False):
    """
    Builds the logistic regression model by calling the function you've implemented previously

    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to true to print the cost every 100 iterations

    Returns:
    d -- dictionary containing information about the model.
    """
    # initialize parameters with zeros
    w, b = initialize_with_zeros(X_train.shape[0])
    # Gradient descent
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]
    # Predict test/train set examples
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)
    # Print train/test Errors
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}
    return d
Let's test the model:
d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=2000, learning_rate=0.005, print_cost=True)
Looking at the printed output, we can see that the test accuracy reaches 70.0%,
while the training accuracy reaches 99%. This indicates that our model is overfitting somewhat; don't worry, we will address this in later lessons.
With the following code we can pick out individual images and inspect our predictions:
# Example of a picture that was wrongly classified.
index = 14
plt.imshow(test_set_x[:, index].reshape((num_px, num_px, 3)))
print("y = " + str(test_set_y[0, index]) + ", you predicted that it is a \"" +
      classes[int(d["Y_prediction_test"][0, index])].decode("utf-8") + "\" picture.")
We can also plot how the cost evolves during training:
# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate = " + str(d["learning_rate"]))
plt.show()
In the earlier theory lessons we mentioned that the learning rate has a large impact on the final result. Let's run an experiment to get an intuitive feel for this.
learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y,
                           num_iterations=1500, learning_rate=i, print_cost=False)
    print('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label=str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations')

legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()
Analysis: different learning rates lead to different results. A learning rate that is too small converges slowly, while one that is too large may oscillate or fail to converge.
What if you want to try an image of your own, rather than one from the training or test set? Here is how:
## START CODE HERE ## (PUT YOUR IMAGE NAME)
my_image = "my_image.jpg"  # change this to the name of your image file
## END CODE HERE ##

# We preprocess the image to fit your algorithm.
fname = "images/" + my_image
image = np.array(ndimage.imread(fname, flatten=False))  # read the image
my_image = scipy.misc.imresize(image, size=(num_px, num_px)).reshape((1, num_px * num_px * 3)).T  # resize the image
my_predicted_image = predict(d["w"], d["b"], my_image)  # predict
plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" +
      classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") + "\" picture.")
For more details, see the original article: http://www.missshi.cn/api/view/blog/59aa08fee519f50d04000170