Stanford CS230 Deep Learning (Part 1)

学吧学吧终成学霸

Article link: https://blog.csdn.net/weixin_44750583/article/details/104748488

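Stanford CS230 works well as an introductory deep learning course, and I have recently been following the lecture videos and doing the programming assignments. This post first lists the resources I use, then gives my understanding of the first lesson and the code for its programming assignment.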

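All materials are listed below:

1. Course links:

Bilibili lecture version: Stanford CS230(吴恩达深度学习 Deep Learning | Autumn 2018)(中英双字幕)
(10 videos in total, one per week, but each lecture corresponds to many of the companion videos below)
Coursera companion videos: Neural Networks and Deep Learning
(Note: the content is the same as the item below, except that a short quiz follows each video. The videos may fail to play unless you change a setting that requires root privileges; if that is too much trouble, use the Bilibili companion videos below.)
Bilibili companion videos: 【中文字幕】深度学习_吴恩达_DeepLearning.ai

2. Official slides and notes by others online:

3. Programming assignments on GitHub: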

Lesson 1: Logistic Regression from a Neural Network Perspective

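Logistic regression is a classification algorithm from the family of generalized linear models. It applies a nonlinear transform so that the output lies in $(0, 1)$, and this value can be interpreted as the probability that $y$ is a positive example given $x$: $P(y=1|X)=\sigma(W^TX+b)$, where $\sigma(z)=1/(1+e^{-z})$ is the sigmoid function. Choosing a threshold, for example 0.5, then gives the classification result.
From the neural network point of view, the input layer is $X$, the linear transform $W^TX+b$ gives the hidden layer, and a sigmoid activation then gives the output layer. This is the forward propagation of the network.
The result is evaluated with the log-likelihood loss, also called the cross-entropy loss. For binary classification, averaged over $n$ samples: $L(\hat y,y)=-{1\over n}\sum_{i=1}^n\big[y_i\log \hat y_i + (1-y_i)\log (1-\hat y_i)\big]$
Our goal is to learn the parameters of the network, here $W$ and $b$, by iteratively updating them so that the loss function is minimized (which in turn should lower the error rate). Each forward pass is therefore compared with the labels to obtain a loss, and the effect of each parameter on that loss is captured by the gradient of the loss with respect to that parameter. The gradient cannot be read off in a single step, because the model is not a linear function but a composition of functions, so it is obtained with the chain rule, differentiating one layer at a time. Intuitively, the error is passed from the output layer back to the hidden layer, and from the hidden layer back to the input layer. This is backward propagation (backpropagation) of the error.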
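
To make the chain rule concrete: writing $Z=W^TX+b$ and $A=\sigma(Z)$, the derivative of the loss with respect to $Z$ simplifies to $(A-Y)/n$, which gives $\partial L/\partial W={1\over n}X(A-Y)^T$ and $\partial L/\partial b={1\over n}\sum_{i=1}^n(a_i-y_i)$. The sketch below is not part of the course assignment; it compares these formulas against central finite differences on small random data. The helper names (`cost_fn`, `analytic_grads`, `numeric_grads`) and the random inputs are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_fn(W, b, X, Y):
    """Mean cross-entropy of a logistic-regression model."""
    A = sigmoid(W.T @ X + b)
    return float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))

def analytic_grads(W, b, X, Y):
    """Gradients from the chain rule, using dL/dZ = (A - Y) / n."""
    n = X.shape[1]
    dZ = sigmoid(W.T @ X + b) - Y
    return X @ dZ.T / n, float(np.sum(dZ) / n)

def numeric_grads(W, b, X, Y, eps=1e-6):
    """Central finite differences, perturbing one parameter at a time."""
    dW = np.zeros_like(W)
    for j in range(W.shape[0]):
        step = np.zeros_like(W)
        step[j, 0] = eps
        dW[j, 0] = (cost_fn(W + step, b, X, Y) - cost_fn(W - step, b, X, Y)) / (2 * eps)
    db = (cost_fn(W, b + eps, X, Y) - cost_fn(W, b - eps, X, Y)) / (2 * eps)
    return dW, db

# Small random problem (illustrative data only): 3 features, 5 samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))
Y = rng.integers(0, 2, size=(1, 5)).astype(float)
W = rng.normal(size=(3, 1)) * 0.1
b = 0.0

dW_a, db_a = analytic_grads(W, b, X, Y)
dW_n, db_n = numeric_grads(W, b, X, Y)
max_diff = max(np.max(np.abs(dW_a - dW_n)), abs(db_a - db_n))
print("max abs difference:", max_diff)
```

If the two sets of gradients agree to within a small tolerance, the chain-rule formulas are very likely implemented correctly.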


import numpy as np
import matplotlib.pyplot as plt
import h5py
from skimage.transform import resize 


# Load the dataset
def load_dataset():
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels

    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes


# Get the training and test sets
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()


# Show one image
index = 2
plt.imshow(train_set_x_orig[index])
print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") +  "' picture.")


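# Flatten each image into a column vector
m_train = train_set_x_orig.shape[0]  # 209 training images
m_test = test_set_x_orig.shape[0]    # 50 test images
train_set_x_flatten = train_set_x_orig.reshape((m_train, -1)).T  # (12288, 209)
test_set_x_flatten = test_set_x_orig.reshape((m_test, -1)).T     # (12288, 50)

# Normalize: scale pixel values to [0, 1]
train_set_x = train_set_x_flatten / 255
test_set_x = test_set_x_flatten / 255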
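
The flattening step above reshapes to (number of images, -1) and then transposes. A direct reshape to (-1, number of images) produces the same shape but mixes pixels from different images. The following is a minimal sketch on a tiny made-up array (the `images` array is a stand-in for illustration, not the cat dataset):

```python
import numpy as np

# Hypothetical stand-in for the image array: 3 "images" of shape (2, 2, 3).
m = 3
images = np.arange(m * 2 * 2 * 3).reshape(m, 2, 2, 3)

# Flatten each image into a row first, then transpose so each image is a column.
flat_ok = images.reshape(m, -1).T   # shape (12, 3)

# Same shape, but the columns no longer correspond to individual images.
flat_bad = images.reshape(-1, m)    # shape (12, 3)

print(flat_ok.shape, flat_bad.shape)
print(np.array_equal(flat_ok[:, 0], images[0].ravel()))   # True
print(np.array_equal(flat_bad[:, 0], images[0].ravel()))  # False
```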



# Helper functions used later
def sigmoid(x):
    s = 1 / (1 + np.exp(-x))
    return s

def initalize(length_of_features):
    W = np.zeros((length_of_features,1))
    b = 0
    return W, b

def acc(Y, pred_Y):
    n = Y.shape[1]
    right = np.sum(Y-pred_Y==0)
    return right/n
  


# Forward and backward propagation
def propagation(W, b, X, Y):
    n = X.shape[1]
    Z = np.dot(W.T, X) + b
    A = sigmoid(Z)
    cost = - np.sum(Y * np.log(A) + (1-Y) * np.log(1-A)) / n
    dW = (1/n) * np.dot(X, (A-Y).T)
    db = (1/n) * np.sum(A-Y)
    return dW, db, cost
    
# Test
w, b, X, Y = np.array([[1],[2]]), 2, np.array([[1,2],[3,4]]), np.array([[1,0]])
# propagation(w, b, X, Y)
# (array([[0.99993216],
#        [1.99980262]]),
# 0.49993523062470574,
# 6.000064773192205)
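
One practical caveat with the cost above: when `Z` is large, `sigmoid(Z)` rounds to exactly 0 or 1 in floating point and `np.log` returns `-inf`. A common workaround, shown here as a sketch rather than as part of the assignment, computes the same cross-entropy directly from `Z` using the identity $-[y\log\sigma(z)+(1-y)\log(1-\sigma(z))]=\max(z,0)-yz+\log(1+e^{-|z|})$. The function name `stable_cost` is an assumption for illustration.

```python
import numpy as np

def stable_cost(W, b, X, Y):
    """Mean cross-entropy computed from Z directly (hypothetical helper)."""
    Z = np.dot(W.T, X) + b
    # Per-sample loss: max(z, 0) - y*z + log(1 + exp(-|z|))
    losses = np.maximum(Z, 0) - Y * Z + np.log1p(np.exp(-np.abs(Z)))
    return float(np.mean(losses))

# Same toy inputs as the test above, written as floats.
w, b = np.array([[1.0], [2.0]]), 2.0
X, Y = np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([[1.0, 0.0]])
print(stable_cost(w, b, X, Y))        # agrees with the 6.000064... cost above

# Scaling the inputs by 100 pushes Z into the range where sigmoid(Z) rounds to 1;
# this formulation still returns a finite value.
print(stable_cost(w, b, 100 * X, Y))
```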


# Optimize the parameters iteratively (gradient descent)
def optimize(W, b, X, Y, num_iterations, learning_rate, verbose=False):
    costs = []
    for i in range(num_iterations):
        dW, db, cost = propagation(W, b, X, Y)
        W = W - learning_rate * dW
        b = b - learning_rate * db
        costs.append(cost)
        if verbose and i % 100 == 0:
            print('Cost after iteration %d: %f' %(i, cost))
    return W, b, costs
    

# Test
optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009, verbose=True)
# array([[0.1124579 ],
#         [0.23106775]]),
#  1.5593049248448891,


# Output the predicted labels
def predict(W, b, X):
    prob = sigmoid(np.dot(W.T, X) + b)
    # Label 1 where the predicted probability exceeds 0.5, otherwise 0
    y_pred = (prob > 0.5).astype(float)
    return y_pred

# predict(w, b, X)
# array([[1., 1.]])



# Put everything together into the final model; later code only needs to call this
def model(X_train, Y_train, X_test, Y_test, num_iterations, learning_rate, verbose=False):
    W, b = initalize(X_train.shape[0])
    W_final, b_final, costs = optimize(W, b, X_train, Y_train, num_iterations, learning_rate, verbose)
    
    pred_train_y = predict(W_final, b_final, X_train)
    pred_test_y = predict(W_final, b_final, X_test)
    
    print('Train accuracy:%f' %acc(Y_train, pred_train_y))
    print('Test accuracy:%f' %acc(Y_test, pred_test_y))
    
    output = {'costs': costs,
              'W': W_final,
              'b': b_final,
              'pred_train_y': pred_train_y,
              'pred_test_y': pred_test_y
              }
    return output


# Train on the training set, then check the model's performance on the test set
d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=2000, learning_rate=0.005, verbose=True)


# Show a misclassified sample
index = 46
plt.imshow(test_set_x[:,index].reshape((64,64,3)))
print("y = %d means this is a cat, but pred_y = %d" %(test_set_y[0,index],int(d['pred_test_y'][0,index])))



# Plot the learning curve
plt.plot(np.squeeze(d['costs']))


# Compare the performance of different learning rates
learning_rates = [0.01, 0.001, 0.005]
models = {}
for i in learning_rates:
    print ("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, verbose = False)
    print ('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(i))

plt.ylabel('cost')
plt.xlabel('iterations')

legend = plt.legend(loc='upper right')
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()


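# Try it on your own image
my_image = "myimage2.jpg"
# Read the image data
image = plt.imread(my_image)
plt.imshow(image)
# Resize to 64x64 and flatten to a column; for a uint8 image, resize also rescales pixel values to [0, 1]
image = resize(image, output_shape=(64,64)).reshape((1, 64*64*3)).T
predicted_image = predict(d['W'], d["b"], image)
# Show the resized image and print the prediction
plt.imshow(image.reshape(64,64,3))
print(predicted_image)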