Andrew Ng's Deep Learning: Assignment 2_1

I've actually already finished Part 1, Neural Networks and Deep Learning.

The NetEase Cloud Classroom version of the course doesn't include the assignments, so I downloaded them to practice on my own. The long stretches of English are starting to give me a headache, but I've decided to stumble through them anyway...

This isn't all of the assignment code; I've left out the parts that don't need explanation.

1 - Building basic functions with numpy

 1.1 - sigmoid function, np.exp()

The first exercise is about math.exp(), and its purpose is to contrast it with np.exp(): the math library only accepts a real number (a scalar) as input, while numpy also accepts vectors and matrices, which is why numpy is so useful for deep learning.

import numpy as np

# example of np.exp
x = np.array([1, 2, 3])
print(np.exp(x)) # result is (exp(1), exp(2), exp(3))
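
As a minimal sketch of that contrast (using only the standard math module and numpy), math.exp() rejects an array input while np.exp() applies the exponential element-wise:

import math
import numpy as np

x = np.array([1, 2, 3])
print(np.exp(x))    # element-wise: approximately [2.718, 7.389, 20.086]
print(math.exp(1))  # math.exp only accepts a single real number: 2.718281828459045
# math.exp(x)       # would raise a TypeError, because x is an array rather than a scalar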

Exercise: implement the sigmoid function using numpy.
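
The function to implement is the standard logistic sigmoid, applied element-wise when x is an array:

sigmoid(x) = 1 / (1 + e^(-x))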

# GRADED FUNCTION: sigmoid

import numpy as np # this means you can access numpy functions by writing np.function() instead of numpy.function()

def sigmoid(x):
    """
    Compute the sigmoid of x

    Arguments:
    x -- A scalar or numpy array of any size

    Return:
    s -- sigmoid(x)
    """
    
    ### START CODE HERE ### (≈ 1 line of code)
    s = 1 / (1 + np.exp(-x))
    ### END CODE HERE ###
    
    return s

x = np.array([1, 2, 3])
print(sigmoid(x))

1.2 - Sigmoid gradient 

Exercise: implement the function sigmoid_derivative() to compute the gradient of the sigmoid function with respect to its input x.

Formula: sigmoid_derivative(x) = σ'(x) = σ(x) (1 - σ(x))
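For example, σ(1) ≈ 0.7311, so σ'(1) ≈ 0.7311 × (1 - 0.7311) ≈ 0.1966; that is the first value the test cell below should print.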

# GRADED FUNCTION: sigmoid_derivative

def sigmoid_derivative(x):
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.
    
    Arguments:
    x -- A scalar or numpy array

    Return:
    ds -- Your computed gradient.
    """
    
    ### START CODE HERE ### (≈ 2 lines of code)
    s = sigmoid(x)
    ds = s * (1 - s)
    ### END CODE HERE ###
    
    return ds

x = np.array([1, 2, 3])
print ("sigmoid_derivative(x) = " + str(sigmoid_derivative(x)))

1.3 - Reshaping arrays

Two common numpy functions used in deep learning are np.shape and np.reshape().
X.shape is used to get the shape (dimensions) of a matrix/vector X.
X.reshape(...) is used to reshape X into some other dimensions.

Exercise: implement image2vector(), which takes an input of shape (length, height, 3) and returns a vector of shape (length*height*3, 1).

For example, to reshape an array v of shape (a, b, c) into a vector of shape (a*b, c), you would do:
v = v.reshape((v.shape[0] * v.shape[1], v.shape[2]))  # v.shape[0] = a; v.shape[1] = b; v.shape[2] = c
Please don't hard-code the image dimensions as constants. Instead, look up the quantities you need with image.shape[0], etc.
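
A minimal sketch of that reshape on a made-up array (the shape (2, 3, 4) here is just for illustration):

import numpy as np

v = np.random.rand(2, 3, 4)                            # shape (a, b, c) = (2, 3, 4)
v = v.reshape((v.shape[0] * v.shape[1], v.shape[2]))   # collapse the first two axes
print(v.shape)                                         # (6, 4)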

# GRADED FUNCTION: image2vector
def image2vector(image):
    """
    Argument:
    image -- a numpy array of shape (length, height, depth)
    
    Returns:
    v -- a vector of shape (length*height*depth, 1)
    """
    
    ### START CODE HERE ### (≈ 1 line of code)
    v = image.reshape(image.shape[0] * image.shape[1] * image.shape[2], 1)
    ### END CODE HERE ###
    
    return v

# This is a 3 by 3 by 2 array, typically images will be (num_px_x, num_px_y,3) where 3 represents the RGB values
image = np.array([[[ 0.67826139,  0.29380381],
        [ 0.90714982,  0.52835647],
        [ 0.4215251 ,  0.45017551]],

       [[ 0.92814219,  0.96677647],
        [ 0.85304703,  0.52351845],
        [ 0.19981397,  0.27417313]],

       [[ 0.60659855,  0.00533165],
        [ 0.10820313,  0.49978937],
        [ 0.34144279,  0.94630077]]])

print ("image2vector(image) = " + str(image2vector(image)))

Output: the 18 values above rearranged into a column vector of shape (3*3*2, 1) = (18, 1).

1.4 - Normalizing rows

Another common technique we use in machine learning and deep learning is to normalize our data. It often leads to better performance because gradient descent converges faster after normalization.

Exercise: implement normalizeRows() to normalize the rows of a matrix. After applying this function to an input matrix x, each row of x should be a vector of unit length (meaning length 1).
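
For instance, for the test matrix used below, the first row is [0, 3, 4], whose norm is √(0² + 3² + 4²) = 5, so after normalization that row becomes [0, 3/5, 4/5] = [0, 0.6, 0.8].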

# GRADED FUNCTION: normalizeRows

def normalizeRows(x):
    """
    Implement a function that normalizes each row of the matrix x (to have unit length).
    
    Argument:
    x -- A numpy matrix of shape (n, m)
    
    Returns:
    x -- The normalized (by row) numpy matrix. You are allowed to modify x.
    """
    
    ### START CODE HERE ### (≈ 2 lines of code)
    # Compute x_norm as the norm 2 of x. Use np.linalg.norm(..., ord = 2, axis = ..., keepdims = True)
    x_norm = np.linalg.norm(x, axis = 1, keepdims = True)  # compute the L2 norm of each row of x
    
    # Divide x by its norm.
    x = x / x_norm    
    ### END CODE HERE ###

    return x

x = np.array([
    [0, 3, 4],
    [1, 6, 4]])
print("normalizeRows(x) = " + str(normalizeRows(x)))

Note: in normalizeRows(), you can try printing the shapes of x_norm and x, and then re-run the evaluation. You will find that they have different shapes. This is normal: since x_norm takes the norm of each row of x, it has the same number of rows as x but only one column. So how does dividing x by x_norm work? This is called broadcasting, and we'll talk about it next!
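
A minimal sketch of those shapes, just re-running the steps inside normalizeRows() by hand on the same x:

import numpy as np

x = np.array([[0, 3, 4],
              [1, 6, 4]])
x_norm = np.linalg.norm(x, axis=1, keepdims=True)
print(x.shape)       # (2, 3)
print(x_norm.shape)  # (2, 1) -- one norm per row
print(x / x_norm)    # broadcasting stretches x_norm across the 3 columns of each row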

1.5 - Broadcasting and the softmax function

Exercise: implement a softmax function using numpy. You can think of softmax as a normalizing function used when your algorithm needs to classify two or more classes.
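
The row-wise definition being implemented: for each row x = (x_1, ..., x_n),

softmax(x)_j = e^(x_j) / (e^(x_1) + ... + e^(x_n))

so every row of the output sums to 1.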

# GRADED FUNCTION: softmax

def softmax(x):
    """Calculates the softmax for each row of the input x.

    Your code should work for a row vector and also for matrices of shape (n, m).

    Argument:
    x -- A numpy matrix of shape (n,m)

    Returns:
    s -- A numpy matrix equal to the softmax of x, of shape (n,m)
    """
    
    ### START CODE HERE ### (≈ 3 lines of code)
    # Apply exp() element-wise to x. Use np.exp(...).
    x_exp = np.exp(x)

    # Create a vector x_sum that sums each row of x_exp. Use np.sum(..., axis = 1, keepdims = True).
    x_sum = np.sum(x_exp, axis = 1, keepdims = True)
    
    # Compute softmax(x) by dividing x_exp by x_sum. It should automatically use numpy broadcasting.
    s = x_exp / x_sum

    ### END CODE HERE ###
    
    return s

x = np.array([
    [9, 2, 5, 0, 0],
    [7, 5, 0, 0 ,0]])
print("softmax(x) = " + str(softmax(x)))

Note:
If you print the shapes of x_exp, x_sum and s above and re-run the cell, you will see that x_sum has shape (2, 1), while x_exp and s have shape (2, 5). x_exp / x_sum works thanks to python broadcasting.

What to remember:

  • np.exp(x) works for any np.array x and applies the exponential function to every coordinate
  • the sigmoid function and its gradient
  • image2vector is commonly used in deep learning
  • np.reshape is widely used. In the future, you'll see that keeping your matrix/vector dimensions straight goes a long way toward eliminating many bugs.
  • numpy has efficient built-in functions
  • broadcasting is extremely useful

2 - Vectorization

Vectorization makes computation much more efficient. The cells below implement a dot product, an outer product, an element-wise product, and a general matrix-vector dot product first with plain Python loops, and then with numpy's vectorized functions.

import time

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
tic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot += x1[i]*x2[i]     # multiply corresponding entries and add them up: 9*9 + 2*2 + ...
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
tic = time.process_time()
outer = np.zeros((len(x1),len(x2))) # we create a len(x1)*len(x2) matrix with only zeros 
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i,j] = x1[i]*x2[j]  # fill every pair (i, j) with x1[i] times x2[j]
toc = time.process_time()
print ("outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### CLASSIC ELEMENTWISE IMPLEMENTATION ###
tic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i]*x2[i]  # same looping pattern as the dot product above, but without the final sum
toc = time.process_time()
print ("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(3,len(x1))  # random 3 x len(x1) numpy array (3 rows, same number of columns as x1)
tic = time.process_time()
gdot = np.zeros(W.shape[0])  # W.shape[0] = 3, so gdot starts as a zero vector of length 3
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i,j]*x1[j]
toc = time.process_time()
print ("gdot = " + str(gdot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

(The printed results and timings of the loop versions are not reproduced here.) The same operations, vectorized:

x1 = [9, 2, 5, 0, 0, 7, 5, 0, 0, 0, 9, 2, 5, 0, 0]
x2 = [9, 2, 2, 9, 0, 9, 2, 5, 0, 0, 9, 2, 5, 0, 0]

### VECTORIZED DOT PRODUCT OF VECTORS ###
tic = time.process_time()
dot = np.dot(x1,x2)  # vector dot product: x1[0]*x2[0] + x1[1]*x2[1] + ...
toc = time.process_time()
print ("dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### VECTORIZED OUTER PRODUCT ###
tic = time.process_time()
outer = np.outer(x1,x2)  # outer product: row i of the result is x1[i] times every element of x2 (column vector x1 times row vector x2)
toc = time.process_time()
print ("outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### VECTORIZED ELEMENTWISE MULTIPLICATION ###
tic = time.process_time()
mul = np.multiply(x1,x2)  # element-wise multiplication
toc = time.process_time()
print ("elementwise multiplication = " + str(mul) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

### VECTORIZED GENERAL DOT PRODUCT ###
tic = time.process_time()
dot = np.dot(W,x1)
toc = time.process_time()
print ("gdot = " + str(dot) + "\n ----- Computation time = " + str(1000*(toc - tic)) + "ms")

The results are exactly the same as in the previous cell, but the code is much simpler. Vectorization is clearly both cleaner and more efficient.

For more on numpy's linear-algebra routines, see the official documentation: https://docs.scipy.org/doc/numpy/reference/routines.linalg.html

2.1 - Implement the L1 and L2 loss functions

Exercise: implement the numpy vectorized version of the L1 loss. You may find abs(x) (the absolute value of x) useful.

Note: the loss is used to evaluate the performance of your model. The bigger the loss, the more your predictions differ from the true values.

Formula: L1(ŷ, y) = Σ_i |y_i - ŷ_i| (summed over the m examples)

# GRADED FUNCTION: L1

def L1(yhat, y):
    """
    Arguments:
    yhat -- vector of size m (predicted labels)
    y -- vector of size m (true labels)
    
    Returns:
    loss -- the value of the L1 loss function defined above
    """
    
    ### START CODE HERE ### (≈ 1 line of code)
    loss = np.sum(np.abs(y - yhat))
    ### END CODE HERE ###
    
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L1 = " + str(L1(yhat,y)))

Exercise: implement the numpy vectorized version of the L2 loss. There are several ways to do it, but you may find np.dot() useful: for 1-D arrays it computes the inner product, and for 2-D arrays it performs matrix multiplication.

Official documentation: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html#numpy.dot
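
The L2 loss being implemented is L2(ŷ, y) = Σ_i (y_i - ŷ_i)². For the test values used below, that is 0.1² + 0.2² + 0.1² + 0.6² + 0.1² = 0.01 + 0.04 + 0.01 + 0.36 + 0.01 = 0.43, so the cell should print L2 = 0.43.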

# GRADED FUNCTION: L2

def L2(yhat, y):
    """
    Arguments:
    yhat -- vector of size m (predicted labels)
    y -- vector of size m (true labels)
    
    Returns:
    loss -- the value of the L2 loss function defined above
    """
    
    ### START CODE HERE ### (≈ 1 line of code)
    loss = np.dot((y - yhat),(y - yhat).T)  # inner product of (y - yhat) with itself, i.e. the sum of squared differences
    ### END CODE HERE ###
    
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
print("L2 = " + str(L2(yhat,y)))

I don't quite understand why a transpose is added here; isn't the result the same whether you take the inner product directly or treat it as matrix multiplication? (In fact, for a 1-D numpy array .T is a no-op, so np.dot(y - yhat, y - yhat) gives exactly the same result; the transpose would only matter if yhat and y were 2-D row/column vectors.)
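
A minimal check of this, reusing the same yhat and y as above:

import numpy as np

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])
diff = y - yhat
print(diff.T.shape == diff.shape)                   # True: .T does nothing to a 1-D array
print(np.dot(diff, diff.T) == np.dot(diff, diff))   # True: both are the same inner product (the L2 loss)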
