Neural Network Study Notes 1: 《深入浅出神经网络与深度学习》

This article covers the basics of neural networks: the feedforward computation, the principle behind updating the weights and biases with backpropagation, and how gradient descent is applied to neural networks. It walks through the calculations with examples and provides Python code implementing a neural network, making it a useful reference for deep learning beginners.

Preface

While studying deep learning I came across an excellent book, 《深入浅出神经网络与深度学习》. After reading it carefully, several topics I had never quite understood finally clicked, so in the spirit of recording what I learned I am writing down my understanding of neural networks here. I hope it is of some help to you.

1. Feedforward Computation in a Neural Network

To understand a neural network you need to know both its forward computation and the backpropagation algorithm that updates the weights and biases. The feedforward computation produces the output for a given input; it is the simpler of the two, since you just evaluate each layer of neurons in turn. The example below illustrates it.
For lack of time I have not typeset the diagrams and formulas; my handwritten notes are shown below:
[Figure: handwritten notes working through the feedforward calculation for an example network]
Just repeat the same computation steps layer by layer until you reach the output.
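Since the worked example exists only as handwritten notes, here is a minimal numerical sketch of a feedforward pass through a hypothetical 2-3-1 network; the input, weights, and biases are made-up values used purely for illustration, not the numbers from the notes.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical 2-3-1 network: two inputs, three hidden neurons, one output.
# All numbers below are illustrative.
x  = np.array([[0.5], [0.1]])          # input column vector, shape (2, 1)
W1 = np.array([[0.1, 0.4],
               [0.2, 0.5],
               [0.3, 0.6]])            # hidden-layer weights, shape (3, 2)
b1 = np.array([[0.1], [0.1], [0.1]])   # hidden-layer biases,  shape (3, 1)
W2 = np.array([[0.7, 0.8, 0.9]])       # output-layer weights, shape (1, 3)
b2 = np.array([[0.2]])                 # output-layer bias,    shape (1, 1)

# Each layer computes a = sigmoid(W a_prev + b), starting from the input.
a1 = sigmoid(W1 @ x + b1)    # hidden-layer activations
a2 = sigmoid(W2 @ a1 + b2)   # network output
print(a2)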

2. Updating the Weights and Biases with Backpropagation

[Figure: handwritten derivation of the backpropagation updates via the chain rule]
At its core this is just an application of the chain rule; it is fairly simple, and the derivation should not be hard to work through carefully.
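Written out, the chain rule yields the standard backpropagation equations (notation as in the book: δ^l is the error of layer l, z^l the weighted input, a^l the activation, σ the sigmoid, ⊙ the elementwise product); the backprop method in the code below implements exactly these:

\[
  \delta^{L} = \nabla_a C \odot \sigma'(z^{L}), \qquad
  \delta^{l} = \bigl((w^{l+1})^{\mathsf T}\,\delta^{l+1}\bigr) \odot \sigma'(z^{l}),
\]
\[
  \frac{\partial C}{\partial b^{l}_{j}} = \delta^{l}_{j}, \qquad
  \frac{\partial C}{\partial w^{l}_{jk}} = a^{l-1}_{k}\,\delta^{l}_{j}.
\]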

3. Updating the Weights and Biases with Gradient Descent

Gradient descent, also known as the method of steepest descent, is a standard technique in optimization; because the idea behind it is so simple, it is used in a wide range of settings.
First, a simple example to illustrate how gradient descent proceeds:
Below are the steps of the steepest descent algorithm (that is, gradient descent); if the figure is hard to follow, skip ahead to the example that comes after it.
[Figure: steps of the steepest descent algorithm, followed by a worked iteration on a simple example]
Iterating step by step in this way, the point at which the function attains its minimum is reached very quickly.
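To make the iteration concrete, here is a minimal sketch of steepest descent on the made-up one-variable function f(x) = x^2; the starting point and the step size are illustrative choices, not the values from the handwritten example.

# Steepest descent on f(x) = x**2, whose gradient is f'(x) = 2*x.
def grad(x):
    return 2 * x

x = 3.0        # illustrative starting point
alpha = 0.1    # fixed step size (learning rate)
for _ in range(25):
    x = x - alpha * grad(x)   # step against the gradient
print(x)       # close to 0, the minimizer of f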
When gradient descent is applied to a neural network, the cost function C is viewed as a function C(w, b) of all the weights w and biases b. The weights and biases are given random initial values drawn from a standard normal distribution (one possible initialization scheme, not the only one), and backpropagation lets us compute the partial derivative of the cost with respect to any weight or bias. Plugging these derivatives into the gradient descent strategy above then updates the weights and biases (unlike the example above, it is enough to pick a suitable learning rate η and keep it fixed).
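In symbols, every gradient descent step changes each weight and bias as follows, with η the fixed learning rate; the update_mini_batch method in the code below applies exactly this rule, using the gradient averaged over a mini-batch:

\[
  w_k \;\to\; w_k' = w_k - \eta\,\frac{\partial C}{\partial w_k}, \qquad
  b_l \;\to\; b_l' = b_l - \eta\,\frac{\partial C}{\partial b_l}.
\]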

4. Implementing the Neural Network in Python

Part of the code below comes from the book 《深入浅出神经网络与深度学习》.


import random
import numpy as np

class Network(object):

    def __init__(self, sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network.  For example, if the list
        was [2, 3, 1] then it would be a three-layer network, with the
        first layer containing 2 neurons, the second layer 3 neurons,
        and the third layer 1 neuron.  The biases and weights for the
        network are initialized randomly, using a Gaussian
        distribution with mean 0, and variance 1.  Note that the first
        layer is assumed to be an input layer, and by convention we
        won't set any biases for those neurons, since biases are only
        ever used in computing the outputs from later layers."""
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a

    def SGD(self, training_data, epochs, mini_batch_size, eta,
            test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent.  The ``training_data`` is a list of tuples
        ``(x, y)`` representing the training inputs and the desired
        outputs.  The other non-optional parameters are
        self-explanatory.  If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out.  This is useful for
        tracking progress, but slows things down substantially."""

        training_data = list(training_data)
        n = len(training_data)

        if test_data:
            test_data = list(test_data)
            n_test = len(test_data)

        for j in range(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in range(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print("Epoch {} : {} / {}".format(j,self.evaluate(test_data),n_test));
            else:
                print("Epoch {} complete".format(j))

    def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]

    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x] # list to store all the activations, layer by layer
        zs = [] # list to store all the z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        delta = self.cost_derivative(activations[-1], y) * \
            sigmoid_prime(zs[-1])
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Note that the variable l in the loop below is used a little
        # differently to the notation in Chapter 2 of the book.  Here,
        # l = 1 means the last layer of neurons, l = 2 is the
        # second-last layer, and so on.  It's a renumbering of the
        # scheme in the book, used here to take advantage of the fact
        # that Python can use negative indices in lists.
        for l in range(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)

    def evaluate(self, test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result. Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]
        return sum(int(x == y) for (x, y) in test_results)

    def cost_derivative(self, output_activations, y):
        """Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)

#### Miscellaneous functions
def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))

The code above is the main body of the neural network implementation. Next comes a short test script to check how it performs.

import numpy as np
import network

def file2matrix(filename):
    """Read a tab-separated data file whose last column is the class label.
    Returns an (n, 3) matrix of the raw columns and a list of integer labels."""
    with open(filename) as fr:
        lines = fr.readlines()
    returnMat = np.zeros((len(lines), 3))      # matrix to return
    classLabelVector = []                      # label of each sample
    for index, line in enumerate(lines):
        listFromLine = line.strip().split('\t')
        returnMat[index, :] = listFromLine[0:3]
        classLabelVector.append(int(listFromLine[-1]))
    return returnMat, classLabelVector

def vectorize(label):
    """Turn a 0/1 label into a one-hot (2, 1) column vector for training."""
    e = np.zeros((2, 1))
    e[label] = 1.0
    return e

# Read the data and convert its format.
# Test set: inputs as (2, 1) column vectors; labels stay as plain integers,
# because Network.evaluate() compares the label with argmax of the output.
dataMat, dataLabel = file2matrix("test_data.txt")
dataMat = np.delete(dataMat, -1, axis=1)            # drop the label column
test_inputs = [np.reshape(x, (2, 1)) for x in dataMat]
test_data = list(zip(test_inputs, dataLabel))

# Training set: inputs as (2, 1) column vectors; labels one-hot encoded so the
# quadratic cost can compare them with the (2, 1) output activations.
dataMat, dataLabel = file2matrix("ex2data2.txt")
dataMat = np.delete(dataMat, -1, axis=1)
training_inputs = [np.reshape(x, (2, 1)) for x in dataMat]
training_data = list(zip(training_inputs, [vectorize(y) for y in dataLabel]))

# Two output neurons (one per class) so that evaluate()'s argmax is meaningful.
net = network.Network([2, 3, 2])
net.SGD(training_data, 30, 100, 0.05, test_data=test_data)
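When test_data is supplied, SGD prints a line of the form "Epoch j : correct / n_test" after every epoch, so the number of correctly classified test samples can be tracked as training proceeds.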