C1-Introduction

Some Concepts

  • The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones. --> deep learning
  • A computer can reason automatically about statements written in formal languages using logical inference rules. --> knowledge base
  • AI systems need the ability to acquire their own knowledge by extracting patterns from raw data. --> machine learning --> the representation of the data (its features) has an enormous effect on the performance of ML (see the logistic-regression sketch after this list)
    • eg1.logistic regression
    • eg2.naive Bayes
    • representation learning (by ML) --> separate the factors of variation that explain the observed data --> DL solves this by building representations out of simpler representations
      Fig1
      • eg1.autoencoder: the combination of an encoder function (converts the input data into a different representation) and a decoder function (converts the new representation back into the original format); see the autoencoder sketch after this list
  • Deep learning
    Fig2
    • eg1.feedforward deep network
    • eg2.multilayer perceptron
    • two perspectives:
      • learning the right representation for the data
      • depth allows the computer to learn a multi-step computer program
    • measuring the depth of a model (two ways):
      • the depth of the computational graph: the number of sequential instructions
      • the depth of the graph describing how concepts relate to each other (usually used in deep probabilistic models)
        Fig3
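
The note above says the choice of representation can make or break a simple learner like logistic regression. Here is a minimal sketch of that point, assuming NumPy and scikit-learn are available; the data and coordinate systems are invented for illustration, echoing the book's Cartesian-versus-polar figure:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Two classes separated only by distance from the origin: class 1 inside
# a circle of radius 1, class 0 in a ring between radii 1.5 and 2.5.
n = 500
r = np.concatenate([rng.uniform(0, 1, n), rng.uniform(1.5, 2.5, n)])
theta = rng.uniform(0, 2 * np.pi, 2 * n)
y = np.concatenate([np.ones(n), np.zeros(n)])

cartesian = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
polar = np.column_stack([r, theta])

# The same linear model fails in Cartesian coordinates (no straight line
# separates a disk from the ring around it)...
print(LogisticRegression().fit(cartesian, y).score(cartesian, y))  # ~0.5
# ...and succeeds in polar coordinates, where a threshold on r suffices.
print(LogisticRegression().fit(polar, y).score(polar, y))          # 1.0
```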
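
And a minimal autoencoder sketch matching eg1 above: a linear encoder/decoder pair in plain NumPy, trained by gradient descent to squeeze data with three underlying factors of variation through a three-dimensional code. All sizes and hyperparameters are illustrative assumptions, not from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 3 underlying factors of variation embedded in 8 dimensions,
# so a 3-dimensional code can in principle reconstruct the data exactly.
Z = rng.normal(size=(200, 3))
X = Z @ rng.normal(size=(3, 8))

# Linear autoencoder: We maps input -> code, Wd maps code -> reconstruction,
# trained by full-batch gradient descent on mean squared reconstruction error.
We = rng.normal(scale=0.3, size=(8, 3))
Wd = rng.normal(scale=0.3, size=(3, 8))
lr = 0.05

for step in range(2000):
    H = X @ We                        # encoder: input -> new representation
    R = H @ Wd                        # decoder: representation -> reconstruction
    err = (R - X) / len(X)            # gradient of 0.5 * mean squared error
    gWd = H.T @ err                   # gradient w.r.t. decoder weights
    gWe = X.T @ (err @ Wd.T)          # backpropagate through the decoder
    Wd -= lr * gWd
    We -= lr * gWe

print(np.mean((R - X) ** 2))          # near 0: the code preserves the data
```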

Summary

Fig4
Fig5

Challenges

  • how to get informal knowledge (knowledge about the world) into a computer
  • many of the factors of variation influence every single piece of data we observe

Organization of the Book

Fig6

Historical Trends in Deep Learning

  • Deep learning dates back to the 1940s; it only appears to be new.
  • Known as cybernetics in the 1940s-1960s.
  • Known as connectionism in the 1980s-1990s.
  • Known as deep learning since 2006.
    Fig7
  • The neural perspective on DL:
    • the brain provides a proof by example that intelligent behavior is possible, and a conceptually straightforward path to building intelligence is to reverse engineer the computational principles behind the brain and duplicate its functionality.
    • it would be deeply interesting to understand the brain and the principles that underlie human intelligence.
  • deep learning also appeals to a more general principle: learning multiple levels of composition.
  • the earliest predecessors were simple linear models motivated from a neuroscientific perspective
  • the weights of these early classifiers were set by hand
  • In the 1950s, the perceptron became the first model that could learn the weights defining the categories given examples of inputs from each category.
  • the adaptive linear element (ADALINE), proposed around the same time
  • the training algorithm for ADALINE is stochastic gradient descent (SGD)
  • the perceptron and ADALINE are linear models, so they cannot learn the XOR function (see the ADALINE sketch after this list)
  • Diminished role of neuroscience --> we do not have enough information about the brain to use it as a guide.
  • Neocognitron (1980) is the basis of the modern convolutional network (1998).
  • Most modern NNs are based on a model neuron called the rectified linear unit; its basic concept dates back to the cognitron (1975).
  • viewpoints on the rectified linear unit:
    • Nair and Hinton (2010) and Glorot (2011a) --> motivated by neuroscience
    • Jarrett (2009) --> engineering-oriented
  • connectionism or parallel distributed processing (1986 and 1995)
    • the central idea: a large number of simple computational units can achieve intelligent behavior when networked together.
    • distributed representation (1986)
    • successful use of back-propagation to train deep neural networks with internal representations, and the popularization of the back-propagation algorithm (1986a and 1987)
    • some of the fundamental mathematical difficulties in modeling long sequences were identified (1991 and 1994)
    • the long short-term memory (LSTM) network was introduced to address these difficulties (1997)
    • Kernel machines (1992, 1995 and 1999) and graphical models (1998) became popular
    • the Canadian Institute for Advanced Research (CIFAR) helped keep NN research alive (1998b and 2001)
  • In 2006, it was shown that a deep belief network can be efficiently trained using a strategy called greedy layer-wise pretraining.
    • greedy layer-wise pretraining was then used to train many other kinds of deep network (2007); see the pretraining sketch after this list
    • the term deep learning emphasizes depth (2007, 2011, 2014a and 2014)
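
The bullets above mention ADALINE, its SGD training rule, and the XOR limitation; the sketch below ties them together in NumPy. It trains a single linear unit by stochastic gradient descent on squared error (the ADALINE rule) and shows it mastering AND, which is linearly separable, while stalling near chance on XOR. The hyperparameters and the 0.5 decision threshold are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_adaline(X, y, lr=0.05, epochs=100):
    """ADALINE: a single linear unit trained by stochastic gradient
    descent on squared error, one randomly chosen example at a time."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            err = (X[i] @ w + b) - y[i]   # prediction error on one example
            w -= lr * err * X[i]          # SGD step on 0.5 * err**2
            b -= lr * err
    return w, b

def accuracy(X, y, w, b):
    return np.mean((X @ w + b > 0.5) == y)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1], dtype=float)
y_xor = np.array([0, 1, 1, 0], dtype=float)

w, b = train_adaline(X, y_and)
print(accuracy(X, y_and, w, b))   # 1.0: AND is linearly separable
w, b = train_adaline(X, y_xor)
print(accuracy(X, y_xor, w, b))   # ~0.5: no line separates the XOR classes
```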
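
The 2006 deep belief network pretrains with restricted Boltzmann machines; the sketch below illustrates the same greedy layer-wise idea using the simpler stacked-autoencoder variant (closer to the 2007 follow-up work). Everything here (sizes, learning rate, data) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_layer(X, n_hidden, lr=0.5, epochs=1000):
    """Fit one autoencoder layer (sigmoid encoder, linear decoder) by
    full-batch gradient descent; keep the encoder, discard the decoder."""
    W  = rng.normal(scale=0.1, size=(X.shape[1], n_hidden))
    b  = np.zeros(n_hidden)
    Wd = rng.normal(scale=0.1, size=(n_hidden, X.shape[1]))
    for _ in range(epochs):
        H = sigmoid(X @ W + b)              # encode
        R = H @ Wd                          # decode
        err = (R - X) / len(X)              # gradient of 0.5 * MSE
        dpre = (err @ Wd.T) * H * (1 - H)   # backprop through the sigmoid
        Wd -= lr * (H.T @ err)
        W  -= lr * (X.T @ dpre)
        b  -= lr * dpre.sum(axis=0)
    return W, b

# Greedy layer-wise pretraining: layer 1 learns to model the raw data,
# then layer 2 learns to model layer 1's representation, and so on.
X = rng.normal(size=(200, 16))
W1, b1 = pretrain_layer(X, 8)
H1 = sigmoid(X @ W1 + b1)       # representation produced by layer 1
W2, b2 = pretrain_layer(H1, 4)  # layer 2 trained on that representation

# (W1, b1, W2, b2) would then initialize a deep network that is
# fine-tuned end-to-end on the supervised task of interest.
```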

Increasing Dataset Sizes

  • 1950s, the first experiments with ANNs were conducted; 1990s, they were used in commercial applications
    Fig8

Increasing Model Sizes

Fig9
Fig10

Increasing Accuracy, Complexity and Real-World Impact

  • 1986a, the earliest deep models recognized individual objects in tightly cropped, extremely small images.
  • 2012, modern object recognition networks handle high-resolution, uncropped photographs --> error dropped from 26.1% to 15.3%, and later down to 3.6%
    Fig11
  • 2010, 2010b, 2011 and 2012a, error rates in speech recognition dropped suddenly with DL
  • 2013, DL had successes in pedestrian detection and image segmentation
  • 2012, DL achieved superhuman performance in traffic sign classification.
  • 2014d, NNs can output an entire sequence of characters transcribed from an image.
  • 2013, it was previously believed that this required labeling of the individual elements of the sequence.
  • 2014 and 2015, RNNs --> machine translation
  • 2015, another extension of DL is reinforcement learning.
  • many other applications, such as medicine (2014), …