Deep Learning Notes (18): Course 2, Week 3 Assignment

酸菜火锅bb

Categories: 吴恩达深度学习笔记 (Andrew Ng Deep Learning Notes), 神经网络 (Neural Networks), 笔记 (Notes)

Article link: https://blog.csdn.net/weixin_43197820/article/details/105624677

General workflow for building a neural network with TensorFlow

Create a graph containing Tensors (Variables, Placeholders, constants, …) and Operations (tf.matmul, tf.add, …).
Create a session.
Initialize the session.
Run the session to execute the graph (pass the optimizer and cost operations to run, together with a feed_dict that supplies the placeholder values).
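The build-then-run pattern in the steps above can be illustrated without TensorFlow. The following is a toy sketch in plain Python, not TensorFlow's implementation: the names `Node`, `placeholder`, `constant`, and `run` are hypothetical and exist only to show that building the graph computes nothing until it is run with a feed dictionary.

```python
class Node:
    """A node in a toy computation graph (hypothetical, for illustration only)."""

    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op          # "placeholder", "constant", "add", or "mul"
        self.inputs = inputs  # upstream nodes
        self.value = value    # only used by constants
        self.name = name

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder(name):
    return Node("placeholder", name=name)


def constant(value):
    return Node("constant", value=value)


def run(node, feed_dict=None):
    """Evaluate a node, reading placeholder values from feed_dict."""
    feed_dict = feed_dict or {}
    if node.op == "placeholder":
        return feed_dict[node]
    if node.op == "constant":
        return node.value
    left, right = (run(n, feed_dict) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Step 1: build the graph. Nothing is computed yet.
x = placeholder("x")
y = x * constant(2) + constant(3)

# Steps 2 to 4: "run" the graph with a concrete value for the placeholder.
result = run(y, feed_dict={x: 5})
print(result)  # 13
```

The same graph can be run again with a different feed dictionary, which is the role the placeholders play in the assignment code.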

Some basic code snippets (written in English because of some odd CSDN bugs)

Create a constant:

y_hat = tf.constant(36, name='y_hat')            # Define y_hat constant. Set to 36.
y = tf.constant(39, name='y')                    # Define y. Set to 39

Matrix multiplication / addition / random initialization:

You might find the following functions helpful: 
- tf.matmul(..., ...) to do a matrix multiplication
- tf.add(..., ...) to do an addition
- np.random.randn(...) to initialize randomly

Create a placeholder:

x = tf.placeholder(tf.int64, name = 'x')

Two ways to create and run a session:

sess = tf.Session()
# Run the variables initialization (if needed), run the operations
result = sess.run(..., feed_dict = {...})
sess.close() # Close the session

with tf.Session() as sess: 
    # run the variables initialization (if needed), run the operations
    result = sess.run(..., feed_dict = {...})
    # This takes care of closing the session for you :)

Compute the sigmoid function:

sigmoid = tf.sigmoid(x)
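
For reference, the sigmoid computed by the operation above is 1 / (1 + exp(-z)). A minimal NumPy sketch of the same function, written here only for illustration (it is not part of the assignment code):

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic sigmoid: 1 / (1 + exp(-z))."""
    z = np.asarray(z, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-z))


print(sigmoid(0.0))  # 0.5
values = sigmoid(np.array([-2.0, 0.0, 2.0]))
```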

Compute the ReLU function:

tf.nn.relu(...)

Compute the cost of a sigmoid output layer:

tf.nn.sigmoid_cross_entropy_with_logits(logits = ...,  labels = ...)
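
To make clear what this call computes, here is a NumPy sketch of the per-element sigmoid cross-entropy, using the numerically stable form max(x, 0) - x * z + log(1 + exp(-|x|)) for logits x and labels z. The function below is a stand-in written for illustration and the sample values are arbitrary.

```python
import numpy as np


def sigmoid_cross_entropy_with_logits(logits, labels):
    """Per-element loss -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)), in stable form."""
    x = np.asarray(logits, dtype=np.float64)
    z = np.asarray(labels, dtype=np.float64)
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))


logits = np.array([0.2, 0.4, 0.7, 0.9])   # raw scores, before the sigmoid
labels = np.array([0.0, 0.0, 1.0, 1.0])
losses = sigmoid_cross_entropy_with_logits(logits, labels)
print(losses)
```

Note that the function takes the logits (the values before the sigmoid), so the sigmoid should not be applied twice.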

Compute the cost of a softmax output layer:

tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels)
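
The softmax cross-entropy expects logits and labels of shape (number of examples, number of classes), which is why the assignment code transposes Z3 and Y before calling it. A NumPy sketch of the per-example loss, written for illustration with arbitrary sample values:

```python
import numpy as np


def softmax_cross_entropy_with_logits(logits, labels):
    """logits, labels: shape (num_examples, num_classes). Returns shape (num_examples,)."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    # Subtract the row maximum before exponentiating, for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_softmax).sum(axis=1)


logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
losses = softmax_cross_entropy_with_logits(logits, labels)
cost = losses.mean()   # the role tf.reduce_mean plays in compute_cost
print(losses, cost)
```

The second example has uniform logits over three classes, so its loss is log(3).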

One-hot encoding:

one_hot_operation = tf.one_hot(labels,depth = C,axis = 0)
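
With axis = 0 the classes run along the rows, so the result has shape (C, number of examples), one column per example. A NumPy sketch of the same layout (the function name `one_hot_matrix` is used here only for illustration):

```python
import numpy as np


def one_hot_matrix(labels, C):
    """Return a (C, m) matrix whose column j is the one-hot encoding of labels[j]."""
    labels = np.asarray(labels)
    # Compare a column of class indices against a row of labels via broadcasting.
    return (np.arange(C).reshape(-1, 1) == labels.reshape(1, -1)).astype(np.float32)


one_hot = one_hot_matrix([1, 2, 3, 0, 2, 1], C=4)
print(one_hot.shape)  # (4, 6)
```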

Create a tensor of ones:

ones_op = tf.ones(shape)

Initialize W, b:

W1 = tf.get_variable("W1", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
b1 = tf.get_variable("b1", [25,1], initializer = tf.zeros_initializer())
W2 = tf.get_variable("W2", [12,25],initializer = tf.contrib.layers.xavier_initializer(seed = 1))
b2 = tf.get_variable("b2", [12,1], initializer = tf.zeros_initializer())
W3 = tf.get_variable("W3", [6,12],initializer = tf.contrib.layers.xavier_initializer(seed = 1))
b3 = tf.get_variable("b3", [6,1], initializer = tf.zeros_initializer())
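
As a rough picture of what Xavier initialization does, the NumPy sketch below draws weights uniformly from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). This assumes the uniform variant of the initializer; the helper name `xavier_uniform` is hypothetical, and the values will not match TensorFlow's because the random number generators differ.

```python
import numpy as np


def xavier_uniform(shape, rng):
    """Sample a weight matrix uniformly in [-limit, limit], limit = sqrt(6 / (fan_in + fan_out))."""
    dim_a, dim_b = shape
    # The limit depends only on the sum of the two dimensions,
    # so it does not matter which one is treated as fan_in.
    limit = np.sqrt(6.0 / (dim_a + dim_b))
    return rng.uniform(-limit, limit, size=shape)


rng = np.random.default_rng(1)
W1 = xavier_uniform((25, 12288), rng)
b1 = np.zeros((25, 1))            # biases start at zero, as with tf.zeros_initializer()
print(W1.shape, b1.shape)  # (25, 12288) (25, 1)
```

Scaling the range by the layer sizes keeps the variance of the activations roughly constant from layer to layer, which is the motivation for using it instead of a fixed-scale random initialization.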

Assignment code (predicting hand signs with softmax and TensorFlow; it is fairly straightforward, so I only paste the code)

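import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf                          # TensorFlow 1.x API (tf.placeholder, tf.Session, tf.contrib)
from tensorflow.python.framework import ops
from tf_utils import random_mini_batches         # assumed: helper module shipped with the course assignment

# GRADED FUNCTION: create_placeholders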

def create_placeholders(n_x, n_y):
    """
    Creates the placeholders for the tensorflow session.
    
    Arguments:
    n_x -- scalar, size of an image vector (num_px * num_px * 3 = 64 * 64 * 3 = 12288)
    n_y -- scalar, number of classes (from 0 to 5, so -> 6)
    
    Returns:
    X -- placeholder for the data input, of shape [n_x, None] and dtype "float"
    Y -- placeholder for the input labels, of shape [n_y, None] and dtype "float"
    
    Tips:
    - You will use None because it lets us be flexible about the number of examples fed to the placeholders.
      In fact, the number of examples differs between training and testing.
    """

    ### START CODE HERE ### (approx. 2 lines)  
    X = tf.placeholder(shape = [n_x , None] , dtype = tf.float32 )
    Y = tf.placeholder(shape = [n_y , None] , dtype = tf.float32 )
    ### END CODE HERE ###  
    
    return X, Y

# GRADED FUNCTION: initialize_parameters

def initialize_parameters():
    """
    Initializes parameters to build a neural network with tensorflow. The shapes are:
                        W1 : [25, 12288]
                        b1 : [25, 1]
                        W2 : [12, 25]
                        b2 : [12, 1]
                        W3 : [6, 12]
                        b3 : [6, 1]
    
    Returns:
    parameters -- a dictionary of tensors containing W1, b1, W2, b2, W3, b3
    """
    
    tf.set_random_seed(1)                   # so that your "random" numbers match ours
        
    ### START CODE HERE ### (approx. 6 lines of code)  
    W1 = tf.get_variable("W1", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b1 = tf.get_variable("b1", [25,1], initializer = tf.zeros_initializer())
    W2 = tf.get_variable("W2", [12,25],initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b2 = tf.get_variable("b2", [12,1], initializer = tf.zeros_initializer())
    W3 = tf.get_variable("W3", [6,12],initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b3 = tf.get_variable("b3", [6,1], initializer = tf.zeros_initializer())
    
    ### END CODE HERE ### 

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2,
                  "W3": W3,
                  "b3": b3}
    
    return parameters

# GRADED FUNCTION: forward_propagation

def forward_propagation(X, parameters):
    """
    Implements the forward propagation for the model: LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX
    
    Arguments:
    X -- input dataset placeholder, of shape (input size, number of examples)
    parameters -- python dictionary containing your parameters "W1", "b1", "W2", "b2", "W3", "b3"
                  the shapes are given in initialize_parameters

    Returns:
    Z3 -- the output of the last LINEAR unit
    """
    
    # Retrieve the parameters from the dictionary "parameters" 
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    W3 = parameters['W3']
    b3 = parameters['b3']
    
    ### START CODE HERE ### (approx. 5 lines)              # Numpy Equivalents:  
    Z1 = tf.matmul(W1,X)+b1                                         # Z1 = np.dot(W1, X) + b1  
    A1 = tf.nn.relu(Z1)                                             # A1 = relu(Z1)  
    Z2 = tf.matmul(W2, A1) + b2                                     # Z2 = np.dot(W2, A1) + b2
    A2 = tf.nn.relu(Z2)                                             # A2 = relu(Z2)
    Z3 = tf.matmul(W3, A2) + b3                                     # Z3 = np.dot(W3, A2) + b3
    ### END CODE HERE ###  
    
    return Z3
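
The "Numpy Equivalents" comments in the function above can be checked directly. The following NumPy sketch runs the same LINEAR -> RELU -> LINEAR -> RELU -> LINEAR pass on random data to confirm the shapes; the random parameters and the batch of 4 examples are made up for illustration.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0)


def forward_propagation_np(X, parameters):
    """NumPy version of the forward pass; returns Z3, the output of the last LINEAR unit."""
    Z1 = parameters["W1"] @ X + parameters["b1"]   # b1 has shape (25, 1) and broadcasts over examples
    A1 = relu(Z1)
    Z2 = parameters["W2"] @ A1 + parameters["b2"]
    A2 = relu(Z2)
    Z3 = parameters["W3"] @ A2 + parameters["b3"]
    return Z3


rng = np.random.default_rng(0)
shapes = {"W1": (25, 12288), "b1": (25, 1),
          "W2": (12, 25), "b2": (12, 1),
          "W3": (6, 12), "b3": (6, 1)}
parameters = {name: rng.standard_normal(shape) * 0.01 for name, shape in shapes.items()}

X = rng.standard_normal((12288, 4))   # 4 flattened 64x64x3 images, one per column
Z3 = forward_propagation_np(X, parameters)
print(Z3.shape)  # (6, 4)
```

The softmax itself is not applied here, for the same reason as in the TensorFlow version: the cost function takes the logits Z3 directly.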

# GRADED FUNCTION: compute_cost 

def compute_cost(Z3, Y):
    """
    Computes the cost
    
    Arguments:
    Z3 -- output of forward propagation (output of the last LINEAR unit), of shape (6, number of examples)
    Y -- "true" labels vector placeholder, same shape as Z3
    
    Returns:
    cost -- Tensor of the cost function
    """
    
    # to fit the tensorflow requirement for tf.nn.softmax_cross_entropy_with_logits(...,...)
    logits = tf.transpose(Z3)
    labels = tf.transpose(Y)
    
    ### START CODE HERE ### (1 line of code)  
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels))
    ### END CODE HERE ###  
    
    return cost

def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,
          num_epochs = 1500, minibatch_size = 32, print_cost = True):
    """
    Implements a three-layer tensorflow neural network: LINEAR->RELU->LINEAR->RELU->LINEAR->SOFTMAX.
    
    Arguments:
    X_train -- training set, of shape (input size = 12288, number of training examples = 1080)
    Y_train -- training labels, of shape (output size = 6, number of training examples = 1080)
    X_test -- test set, of shape (input size = 12288, number of test examples = 120)
    Y_test -- test labels, of shape (output size = 6, number of test examples = 120)
    learning_rate -- learning rate of the optimization
    num_epochs -- number of epochs of the optimization loop
    minibatch_size -- size of a minibatch
    print_cost -- True to print the cost every 100 epochs
    
    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    
    ops.reset_default_graph()                         # to be able to rerun the model without overwriting tf variables
    tf.set_random_seed(1)                             # to keep consistent results
    seed = 3                                          # to keep consistent results
    (n_x, m) = X_train.shape                          # (n_x: input size, m : number of examples in the train set)
    n_y = Y_train.shape[0]                            # n_y : output size
    costs = []                                        # To keep track of the cost
    
    # Create Placeholders of shape (n_x, n_y)
    ### START CODE HERE ### (1 line)  
    X ,Y = create_placeholders(n_x, n_y)
    ### END CODE HERE ###  
  
    # Initialize parameters  
    ### START CODE HERE ### (1 line)  
    parameters = initialize_parameters()
    ### END CODE HERE ###  
      
    # Forward propagation: Build the forward propagation in the tensorflow graph  
    ### START CODE HERE ### (1 line)  
    Z3 = forward_propagation(X, parameters)
    ### END CODE HERE ###  
      
    # Cost function: Add cost function to tensorflow graph  
    ### START CODE HERE ### (1 line)  
    cost = compute_cost(Z3, Y)
    ### END CODE HERE ###  
      
    # Backpropagation: Define the tensorflow optimizer. Use an AdamOptimizer.  
    ### START CODE HERE ### (1 line)  
    optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)
    ### END CODE HERE ###  
      
    # Initialize all the variables  
    init = tf.global_variables_initializer()  
  
    # Start the session to compute the tensorflow graph  
    with tf.Session() as sess:  
          
        # Run the initialization  
        sess.run(init)  
          
        # Do the training loop  
        for epoch in range(num_epochs):  
  
            epoch_cost = 0.                       # Defines a cost related to an epoch  
            num_minibatches = int(m / minibatch_size) # number of minibatches of size minibatch_size in the train set  
            seed = seed + 1  
            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)  
  
            for minibatch in minibatches:  
  
                # Select a minibatch  
                (minibatch_X, minibatch_Y) = minibatch  
                  
                # IMPORTANT: The line that runs the graph on a minibatch.  
                # Run the session to execute the "optimizer" and the "cost"; the feed_dict should contain a minibatch for (X,Y).
                ### START CODE HERE ### (1 line)  
                _ , minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})  
                ### END CODE HERE ###  
                  
                epoch_cost += minibatch_cost / num_minibatches  
  
            # Print the cost every 100 epochs and record it every 5 epochs
            if print_cost and epoch % 100 == 0:
                print("Cost after epoch %i: %f" % (epoch, epoch_cost))
            if print_cost and epoch % 5 == 0:
                costs.append(epoch_cost)
                  
        # plot the cost  
        plt.plot(np.squeeze(costs))  
        plt.ylabel('cost')  
        plt.xlabel('epochs (per fives)')
        plt.title("Learning rate =" + str(learning_rate))  
        plt.show()  
  
        # let's save the parameters in a variable
        parameters = sess.run(parameters)  
        print ("Parameters have been trained!")  
  
        # Calculate the correct predictions  
        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))  
  
        # Calculate accuracy on the test set  
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))  
  
        print ("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))  
        print ("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))  
          
        return parameters
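
The training loop above relies on `random_mini_batches`, whose implementation is not shown in this post. A minimal sketch of what such a helper might look like, assuming it shuffles the example columns and then slices them into consecutive batches (the name `random_mini_batches_sketch` and its details are hypothetical, not the course helper itself):

```python
import numpy as np


def random_mini_batches_sketch(X, Y, mini_batch_size=32, seed=0):
    """Shuffle the columns of X and Y together, then split them into mini-batches."""
    m = X.shape[1]                          # examples are stored as columns
    rng = np.random.default_rng(seed)
    permutation = rng.permutation(m)
    X_shuffled = X[:, permutation]
    Y_shuffled = Y[:, permutation]          # same permutation keeps inputs and labels paired
    return [
        (X_shuffled[:, start:start + mini_batch_size],
         Y_shuffled[:, start:start + mini_batch_size])
        for start in range(0, m, mini_batch_size)
    ]


X = np.arange(20).reshape(2, 10)            # 10 examples with 2 features each
Y = np.arange(10).reshape(1, 10)
batches = random_mini_batches_sketch(X, Y, mini_batch_size=4, seed=3)
print([batch_X.shape[1] for batch_X, _ in batches])  # [4, 4, 2]
```

Under this scheme a training set of 1080 examples with a batch size of 32 yields 33 full batches plus one partial batch of 24 examples.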

Expected Output:
Train Accuracy: 0.9990741
Test Accuracy: 0.725
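
These accuracy figures come from comparing the arg-max class of Z3 with the arg-max of the one-hot labels, column by column. A NumPy sketch of the same computation on made-up data:

```python
import numpy as np


def accuracy(Z3, Y):
    """Fraction of columns where the predicted class equals the true class."""
    predictions = np.argmax(Z3, axis=0)     # classes run along axis 0
    targets = np.argmax(Y, axis=0)
    return float(np.mean(predictions == targets))


Z3 = np.array([[2.0, 0.1, 0.3],
               [0.5, 1.5, 0.2],
               [0.1, 0.2, 0.9]])
Y = np.array([[1, 0, 0],
              [0, 1, 1],
              [0, 0, 0]])
acc = accuracy(Z3, Y)
print(acc)
```

Here two of the three columns are classified correctly, so the accuracy is 2/3.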

Amazing, your algorithm can recognize a sign representing a figure between 0 and 5 with 72.5% accuracy.

Insights:

Your model seems big enough to fit the training set well. However, given the difference between train and test accuracy, you could try to add L2 or dropout regularization to reduce overfitting.
Think about the session as a block of code to train the model. Each time you run the session on a minibatch, it trains the parameters. In total you have run the session a large number of times (1500 epochs) until you obtained well trained parameters.

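Finally, we tried predicting a thumbs-up image. Although "You deserve a thumb up", the model's answer was 3. This is expected: a model cannot recognize something it has never seen. If it were alive, the only things it would know in its world are the hand signs 0 to 5. We will address this problem later.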