A First Date with ANN: What Is an ANN?

An artificial neural network (ANN) is a computing system inspired by the biological neural networks that constitute animal brains.

Biological Neuron

  • A neuron, also known as a nerve cell, communicates with other cells via specialized connections called synapses.
  • A neuron consists of a cell body (soma), dendrites, and a single axon.
  • Most neurons receive signals via the dendrites and soma, and send signals out along the axon.
  • At the majority of synapses, signals cross from the axon of one neuron to a dendrite of another.
  • Synapses can also connect an axon to another axon, or a dendrite to another dendrite.

Working States of a Biological Neuron

Neurons have two normal working states:

  • Excited state: when an afferent nerve impulse raises the cell membrane potential above the action-potential threshold, the cell enters the excited state, produces a nerve impulse, and outputs it along the axon.
  • Inhibitory state: when an afferent nerve impulse lowers the cell membrane potential below the action-potential threshold, the cell enters the inhibitory state and produces no nerve-impulse output.
  • Learning and forgetting: synaptic transmission can be strengthened or weakened, thanks to the plasticity of neuronal structures.

Mathematical Model of the Neuron

  • In 1943, McCulloch and Pitts proposed the M-P model.
    (McCulloch: neurologist and anatomist; Pitts: mathematician)

  • Unit j's output activation is a_j, given by formula (1), where a_i is the output activation of unit i and w_{i,j} is the weight on the link from unit i to unit j:

    a_j = g( Σ_i w_{i,j} · a_i )        (1)

    where g is the unit's activation function.
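A minimal Python sketch of such a unit (our own illustration, not code from the original post): an M-P style neuron that fires only when the weighted input sum reaches a threshold.

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts style unit: output 1 iff the weighted sum
    of the inputs reaches the threshold, else output 0."""
    total = sum(w * a for w, a in zip(weights, inputs))
    return 1 if total >= threshold else 0

# An AND gate realized with a single M-P unit:
print(mp_neuron([1, 1], [1, 1], threshold=2))  # 1
print(mp_neuron([1, 0], [1, 1], threshold=2))  # 0
```

The threshold plays the role of the action-potential threshold described above: the unit is "excited" only when the aggregate input crosses it.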

Artificial Neuron


  • Artificial neurons loosely model the neurons in a biological brain. An ANN is a system built by interconnecting such "artificial neurons": a collection of connected units or nodes.
  • An ANN tries to simulate the learning process of the human brain. Each connection, like a synapse in a biological brain, can transmit a signal from one artificial neuron to another.
  • An artificial neuron that receives a signal can process it and then signal the artificial neurons connected to it.
  • In common ANN implementations, the signal at a connection between artificial neurons is a real number, and the output of each artificial neuron is computed by some non-linear function of the sum of its inputs.
  • The connections between artificial neurons are called "edges". Each edge has a weight that is adjusted as learning proceeds; the weight increases or decreases the strength of the signal at the connection (modeling the strengthening and weakening of synaptic transmission).
  • Artificial neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold (modeling the excited and inhibitory states).

Artificial Neural Networks

  • Typically, artificial neurons are aggregated into layers. Neurons in different layers connect to one another to form an artificial neural network.
  • Signals travel from the first layer (the input layer) to the last layer (the output layer), possibly after traversing the layers multiple times (when feedback connections exist).
  • The original goal of the ANN approach was to solve problems in the same way a human brain would.
  • Over time, however, attention moved to performing specific tasks, leading to deviations from biology.
  • ANNs have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social-network filtering, playing board and video games, and medical diagnosis.
  • ANN itself is thus not an algorithm, but a framework within which many different machine learning algorithms work together to process complex data inputs.

Perceptron

  • The perceptron, proposed by Frank Rosenblatt in 1957, is a binary linear classifier.
  • A network with all inputs connected directly to the outputs is called a single-layer neural network, or a perceptron.
  • A perceptron has only one set of input units (the input layer) and one output unit (the output layer).
  • Disadvantage: a perceptron can only solve linearly separable classification problems.
  • Remedy: increase the number of layers of the perceptron so that it can solve more complex problems.
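As a toy illustration (our own sketch, not from the original post), a perceptron trained with the classic perceptron update rule learns the linearly separable AND function:

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Perceptron y = step(w·x + b), trained with the rule
    w += lr * (target - y) * x on each misclassified sample."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            y = 1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0
            w[0] += lr * (target - y) * x1
            w[1] += lr * (target - y) * x2
            b += lr * (target - y)
    return lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
f = train_perceptron(AND)
print([f(x1, x2) for (x1, x2), _ in AND])  # [0, 0, 0, 1]
```

Running the same procedure on XOR never converges, no matter how many epochs are used, because no single line separates the two XOR classes; this is exactly the limitation noted above.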

1985: Multilayer Neural Network (MNN)


  • In addition to the input layer and the output layer, intermediate layers, called hidden layers, are introduced; there can be several of them.
  • The output of each layer's units is the input of the next layer's units.
  • The hidden layers, as the name suggests, do not deal directly with the external environment; their number can range from zero to several.
  • When counting the layers of a NN, the input layer is not counted; only the hidden layers and the output layer are counted. This count is called the depth of the NN.
  • When the number of layers reaches a certain size, the network is called a deep neural network (DNN).
  • An MNN can solve nonlinearly separable problems.
  • A NN with multiple output units can solve multi-class classification problems.
  • Suppose each sample has n features and there are m categories C1, …, Cm. The network is then designed with n input units and m output units.
  • When a sample belongs to class Ci, the desired output value of the i-th output unit is 1, and the desired output of all other output units is 0.
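A minimal sketch of this layout (layer sizes and weights here are illustrative, not from the original post): a network with 4 input units, one hidden layer, and 3 output units, where each layer's output feeds the next layer, and the desired output for class C_i is a one-hot vector.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot(class_index, m):
    """Desired output for class C_i: 1 at position i, 0 elsewhere."""
    t = np.zeros(m)
    t[class_index] = 1.0
    return t

def forward(x, layers):
    """Each layer's output is the next layer's input."""
    for W, b in layers:
        x = np.tanh(W @ x + b)   # nonlinear activation at every layer
    return x

# 4 input units -> 5 hidden units -> 3 output units (depth 2: hidden + output)
layers = [(rng.standard_normal((5, 4)), np.zeros(5)),
          (rng.standard_normal((3, 5)), np.zeros(3))]
x = rng.standard_normal(4)
print(forward(x, layers).shape)   # (3,) — one output value per class
print(one_hot(1, 3))              # [0. 1. 0.]
```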

Shallow vs. Deep Neural Network 浅层与深层神经网络

  • There is no universally agreed-upon depth threshold dividing shallow neural networks from deep neural networks.
  • Most researchers agree, however, that a deep neural network has more than two hidden layers, and that a network with more than ten hidden layers is a very deep neural network.

Activation Function

  • The activation function is also called the transfer function or output-transformation function.

Why is an activation function needed?

  • The activation function of early artificial neurons was used to simulate the biological action potential (threshold): if the membrane potential exceeds it, the cell is in the excited state and outputs a signal; otherwise the cell is in the suppressed state and outputs nothing. Hence the step function was adopted.
  • Later, when single-layer NNs were extended to multi-layer NNs, it was found that each layer's output is a linear function of the previous layer's input. No matter how many layers the NN has, its output is then just a linear combination of its inputs, and a linear model cannot solve nonlinear problems.
  • Therefore, nonlinear activation functions are introduced into the neurons to turn linearity into nonlinearity. The NN can then approximate any nonlinear function, so it can be used to solve nonlinear problems.
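The collapse of stacked linear layers into a single linear map can be checked numerically (a quick sketch; the shapes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
W1 = rng.standard_normal((5, 3))   # layer 1: 3 -> 5, no activation
W2 = rng.standard_normal((2, 5))   # layer 2: 5 -> 2, no activation
x = rng.standard_normal(3)

# Two linear layers applied in sequence...
two_layer = W2 @ (W1 @ x)
# ...equal a single linear layer with the combined weight matrix:
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layer, one_layer))  # True
```

Inserting a nonlinear function between the two layers breaks this collapse, which is precisely why nonlinear activations give multi-layer networks their extra expressive power.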
Five commonly used activation functions

(1) Linear: threshold function (i.e., the step or jump function)
(2) Non-linear: ReLU, logistic sigmoid, tanh sigmoid, softmax
The S-shaped (sigmoid) functions include the logistic sigmoid and the hyperbolic tangent (tanh), also called the bipolar sigmoid.
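These functions can be written out directly (NumPy versions, as a reference sketch):

```python
import numpy as np

def step(x):
    """Threshold (jump) function: 1 if x >= 0, else 0."""
    return np.where(x >= 0, 1.0, 0.0)

def relu(x):
    """Rectified linear unit: max(0, x)."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """Logistic S-shaped function, range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Hyperbolic tangent (bipolar sigmoid), range (-1, 1)."""
    return np.tanh(x)

def softmax(x):
    """Normalized exponential; outputs sum to 1."""
    e = np.exp(x - np.max(x))   # shift by the max for numerical stability
    return e / e.sum()
```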

Logistic-Sigmoid Function

The logistic sigmoid is σ(x) = 1 / (1 + e^(−x)); it squashes any real input into the interval (0, 1).

Softmax Function

  • The softmax function, also called the normalized exponential function, makes each output value lie in (0, 1), with all elements summing to 1.
  • It is applied to the output layer (not the hidden layers) of a multi-class NN, normalizing the outputs into a probability distribution.
  • Its general effect is to normalize the output vector, highlighting the maximum component while suppressing components that are far below the maximum.
  • For example, the input vector [1, 2, 3, 4, 1, 2, 3] maps to the softmax values [0.024, 0.064, 0.175, 0.475, 0.024, 0.064, 0.175]. The item with the largest weight in the output vector corresponds to the maximum value "4" in the input vector.
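The stated values can be reproduced with a few lines of NumPy (a quick check of the example):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 1, 2, 3], dtype=float)
s = np.exp(x) / np.exp(x).sum()   # softmax of the example vector
print(np.round(s, 3))
# [0.024 0.064 0.175 0.475 0.024 0.064 0.175]
```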
  • Sigmoid is a special case of softmax: when the number of classes is 2, sigmoid is softmax.
  • Sigmoid is used for binary classification, while softmax is used for multi-class classification.
  • When the number of classes is 2, a fully connected NN without hidden layers becomes logistic regression.

A fully connected layer, informally, is one in which every unit of the previous layer is connected to every unit of the next layer.

  • The multiple classes of softmax are mutually exclusive: an input can be assigned to only one class.
  • Multiple logistic regressions can also implement multi-class classification, but then the output categories are not mutually exclusive: "apple", for instance, belongs both to the "fruit" category and to the "food" category.
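The contrast can be sketched numerically (our own illustration; the scores are arbitrary): softmax outputs sum to 1 and pick one exclusive class, while independent sigmoid outputs can each be high at the same time, which is what allows non-exclusive labels such as "fruit" and "food".

```python
import numpy as np

z = np.array([2.0, 1.0, 3.0])        # raw scores for 3 labels

# Softmax: mutually exclusive classes — a probability distribution
p = np.exp(z) / np.exp(z).sum()
print(p.sum())                        # 1.0

# Independent sigmoids: non-exclusive labels — each score in (0, 1),
# and several can exceed 0.5 simultaneously
q = 1.0 / (1.0 + np.exp(-z))
print(q)                              # all three labels can be "on" at once
```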

History of Artificial Neural Networks


  • At that time, however, neural network research was in a trough period (1969–1982), so the BP algorithm drew little attention.
  • Interest in the BP algorithm was regenerated only when neural network research entered its second climax in the 1980s (1983–1990).
  • In 1985, multilayer ANNs appeared, breaking through the limitations of the early perceptron.
  • In 1986, Rumelhart (an American psychologist) and Geoffrey Hinton, among others, independently re-proposed the learning algorithm for MNNs: the BP algorithm. (Geoffrey Hinton, Yann LeCun, and Yoshua Bengio are regarded as the three founding figures of modern AI.)

A Brief History of Neural Networks

Year  Who                                  Event
1943  Warren McCulloch & Walter Pitts      Pioneering work on neural networks
1958  Frank Rosenblatt                     The "perceptron" model
1959  Hubel & Wiesel                       Visual cortex
1969  Marvin Lee Minsky & Seymour Papert   Pointed out the perceptron's limitations
1982  John Hopfield                        Hopfield network
1986  Rumelhart & McClelland               Backpropagation
1995  Vapnik                               SVM
2006  Geoffrey Hinton                      Deep learning
2009  Fei-Fei Li                           ImageNet
2012  Andrew Ng                            Google Brain