Deep Learning for Computer Vision (1)

Lecture 1 Introduction to Deep Learning

1. Definition of Computer Vision and Deep Learning

Computer Vision: Building artificial systems that process, perceive, and reason about visual data.

Deep Learning: Hierarchical learning algorithms with many “layers”, (very) loosely inspired by the brain.

2. Brief History of Computer Vision

Early Exploration of the Visual Cortex, Hubel and Wiesel, 1959

In their experiments on the cat visual cortex, begun in 1958, Hubel and Wiesel observed that neurons respond to moving edge stimuli. They defined simple and complex cells and discovered the functional column structure of the visual cortex. This research is often regarded as the beginning of computer vision for two reasons:

  1. It was the first work to emphasize oriented edges, which were widely used in later computer vision architectures.
  2. They found that the cat's visual cortex processes information hierarchically. First, simple cells respond to the orientation of light stimuli; then, through transmission between neurons, increasingly complex cells respond to increasingly complex and abstract information.

Stage of Visual Representation, David Marr, 1970s

Marr held that vision is the use of effective symbols to describe images of the external world. Its core task is to infer the structure of the external world from the structure of the image. Vision begins with images, passes through a series of processing and transformation stages, and finally arrives at recognition of external reality. He described three stages of representation:

  1. Primal Sketch (2-D sketch): The primal sketch is obtained from the input image. It captures the locations where image intensity changes sharply, together with their geometric distribution and organizational structure.
  2. 2.5-D Sketch: It describes the surface normals, approximate depth, and discontinuity contours of visible surfaces in an observer-centered coordinate system.
  3. 3-D model representation: It describes the spatial organization of shapes using hierarchical representations built from surface and volumetric primitives in an object-centered coordinate system.

Recognition via Edge Detection, John Canny, David Lowe, 1980s

The Canny algorithm is a classic edge detection algorithm, proposed by John F. Canny in 1986. In 1987, David Lowe proposed a more complex, related theory built on edge detection. The steps of the Canny algorithm are as follows:

  1. Gaussian blur
  2. Calculate the gradient magnitude and direction
  3. Non-maximum suppression
  4. Double threshold to separate strong edges from weak edges
  5. Connect weak edges to strong edges, discarding isolated weak edges
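
The five steps above can be sketched in plain NumPy. This is a minimal illustration, not Canny's original implementation: the 5x5 Gaussian kernel, the Sobel operator for the gradient, and the `low`/`high` thresholds (taken relative to the maximum gradient magnitude) are all assumptions, and `filter2d` and `canny_sketch` are hypothetical helper names.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def filter2d(img, kernel):
    """Cross-correlate img with kernel; edge padding keeps the output size."""
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def canny_sketch(img, low=0.1, high=0.3):
    # 1. Gaussian blur
    smooth = filter2d(img.astype(float), gaussian_kernel())
    # 2. Gradient magnitude and direction (Sobel operator, an assumption)
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    gx = filter2d(smooth, sobel_x)
    gy = filter2d(smooth, sobel_x.T)
    mag = np.hypot(gx, gy)
    if mag.max() > 0:
        mag = mag / mag.max()
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180
    # 3. Non-maximum suppression: keep a pixel only if it is at least as
    #    large as both neighbours along the (quantized) gradient direction.
    h, w = mag.shape
    offsets = {0: (0, 1), 1: (1, 1), 2: (1, 0), 3: (1, -1)}  # 0, 45, 90, 135 deg
    sector = np.round(angle / 45).astype(int) % 4
    thin = np.zeros_like(mag)
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            dr, dc = offsets[sector[r, c]]
            if mag[r, c] >= mag[r + dr, c + dc] and mag[r, c] >= mag[r - dr, c - dc]:
                thin[r, c] = mag[r, c]
    # 4. Double threshold
    strong = thin >= high
    weak = (thin >= low) & ~strong
    # 5. Connect weak edges: keep weak pixels reachable from a strong pixel
    edges = strong.copy()
    stack = list(zip(*np.nonzero(strong)))
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and weak[rr, cc] and not edges[rr, cc]:
                    edges[rr, cc] = True
                    stack.append((rr, cc))
    return edges

# Usage: a synthetic image with one vertical step edge between columns 9 and 10.
img = np.zeros((20, 20))
img[:, 10:] = 1.0
edges = canny_sketch(img)
```

On this synthetic image the detected edge pixels should lie next to the step, with the flat regions on either side left empty. In practice a library routine would normally be used instead of a hand-written loop.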

Recognition via Matching, David Lowe, 1999

In 1999, David Lowe proposed a different approach: recognition through matching (SIFT). His idea was to compute a feature vector at each key point in the image. The feature vector is a real-valued encoding of local appearance, designed so that invariance across different images is built into the vector itself. Even if the image changes slightly (for example in brightness, rotation, or viewing angle), the feature vectors can still be matched and used for image recognition.
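
The matching step can be illustrated with a small sketch. It does not compute SIFT descriptors; it assumes descriptors already exist as rows of a NumPy array (here random 128-dimensional vectors, chosen only for illustration) and matches them by nearest neighbour in Euclidean distance. The distance-ratio check and its value of 0.8 are assumptions, and `match_descriptors` is a hypothetical helper name.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i] is matched to desc_b[j].

    A match is accepted only if the nearest neighbour is clearly closer
    than the second nearest (distance-ratio check). desc_b needs >= 2 rows.
    """
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        best, second = order[0], order[1]
        if row[best] < ratio * row[second]:
            matches.append((i, int(best)))
    return matches

# Usage: the second "image" holds the same descriptors, shuffled and
# slightly perturbed, standing in for a small change in the image.
rng = np.random.default_rng(0)
desc_a = rng.random((5, 128))
perm = np.array([3, 0, 4, 1, 2])            # desc_b[j] comes from desc_a[perm[j]]
desc_b = desc_a[perm] + 0.01 * rng.standard_normal((5, 128))
matches = match_descriptors(desc_a, desc_b)
```

Because the perturbation is small compared with the distance between unrelated descriptors, each descriptor in `desc_a` should be paired with its perturbed copy in `desc_b`.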

Large Scale Visual Recognition Challenge (from 2010) and AlexNet: Deep Learning Goes Mainstream, Krizhevsky, 2012

In 2012, AlexNet sharply reduced the error rate on the challenge's image recognition task, which convinced many researchers that deep learning would become the mainstream of computer vision research.

3. Brief History of Deep Learning

Perceptron, Frank Rosenblatt, 1958

Rosenblatt proposed one of the earliest algorithms that could learn from data; it could learn to recognize letters of the alphabet. Today we recognize it as a linear classifier.
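
As a rough illustration of a perceptron as a linear classifier, the sketch below uses the classic mistake-driven update rule on a toy problem (the logical AND of two inputs). The data set, the `{-1, +1}` label encoding, the epoch limit, and the function names are assumptions made for this example, not details from Rosenblatt's work.

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    """X: (n, d) inputs; y: labels in {-1, +1}. Returns weights w and bias b."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            # A point is misclassified (or on the boundary) when yi * score <= 0.
            if yi * (xi @ w + b) <= 0:
                w += yi * xi   # move the decision boundary toward the point
                b += yi
                mistakes += 1
        if mistakes == 0:      # every point is classified correctly: stop
            break
    return w, b

def predict(X, w, b):
    # Linear classifier: the sign of an affine function of the input.
    return np.where(X @ w + b > 0, 1, -1)

# Usage: learn the logical AND of two binary inputs.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
predictions = predict(X, w, b)
```

AND is linearly separable, so the update rule stops making mistakes after a few passes over the data. A problem such as XOR is not linearly separable, and a single perceptron cannot fit it.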

Backprop, Rumelhart, Hinton, 1985

Rumelhart and colleagues proposed the backpropagation algorithm for computing gradients in neural networks and successfully trained perceptrons with multiple layers.
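
To make the idea of computing gradients by backpropagation concrete, here is a minimal sketch for a two-layer network. The architecture (tanh hidden layer, linear output, squared-error loss), the layer sizes, and all names are assumptions for illustration. The analytic gradients from the chain rule are compared against central finite differences as a sanity check.

```python
import numpy as np

def forward_backward(x, y, W1, W2):
    """Return the loss and its gradients with respect to W1 and W2."""
    # Forward pass
    h = np.tanh(W1 @ x)                  # hidden activations
    y_hat = W2 @ h                       # linear output layer
    loss = 0.5 * np.sum((y_hat - y) ** 2)
    # Backward pass: apply the chain rule from the loss back to each weight
    d_y_hat = y_hat - y                  # dL/dy_hat
    dW2 = np.outer(d_y_hat, h)           # dL/dW2
    d_h = W2.T @ d_y_hat                 # dL/dh
    d_pre = d_h * (1.0 - h ** 2)         # tanh'(z) = 1 - tanh(z)^2
    dW1 = np.outer(d_pre, x)             # dL/dW1
    return loss, dW1, dW2

def numerical_grad(loss_fn, W, eps=1e-6):
    """Central finite differences; perturbs W in place and restores it."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        plus = loss_fn()
        W[idx] = old - eps
        minus = loss_fn()
        W[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
x = rng.standard_normal(3)
y = rng.standard_normal(2)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))

loss, dW1, dW2 = forward_backward(x, y, W1, W2)
num_dW1 = numerical_grad(lambda: forward_backward(x, y, W1, W2)[0], W1)
num_dW2 = numerical_grad(lambda: forward_backward(x, y, W1, W2)[0], W2)
```

The backward pass costs about as much as one forward pass, while the finite-difference check needs two forward passes per weight. That difference in cost is why backpropagation makes training multi-layer networks practical.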

Convolutional networks: LeNet, LeCun et al, 1998

LeNet applied the backprop algorithm to a Neocognitron-like architecture. It learned to recognize handwritten digits and was deployed by NEC in a commercial system for processing handwritten checks.
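
The basic operation inside a convolutional layer is sliding a small filter over the image. The sketch below shows only that operation, under simplifying assumptions: a single channel, no padding, stride 1, and a fixed hand-written kernel, whereas in a network like LeNet the kernel values are learned with backprop. As in most deep learning libraries, the kernel is not flipped, so strictly speaking this is cross-correlation. `conv2d_valid` is a hypothetical helper name.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Usage: a 4x4 ramp image and a 2x2 diagonal-difference kernel.
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])
feature_map = conv2d_valid(image, kernel)   # shape (3, 3)
```

Each output value here is `image[r, c] - image[r + 1, c + 1]`, so the same small set of weights is reused at every image location.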

ConvNets are everywhere, 2012 to Present

Nowadays convolutional networks are widely used in computer vision:

  1. Image Classification
  2. Image Retrieval
  3. Object Detection
  4. Video Classification
  5. Pose Recognition

4. Summary

In 2012, deep learning broke into computer vision with a striking performance. To understand what happened that year, it helps to take a 50-year perspective. In my opinion, the breakthrough can be attributed to the way algorithms, data, and computing evolved over the last 50 years. It was the improvement of algorithms, the massive increase in data, and the development of GPUs that enabled new applications, led by convolutional networks, to change the world.
