Deep Learning for Computer Vision (1)

风带走了时间

Last modified 2022-06-01 14:11:15

Category: Computer Vision. Tags: deep learning, computer vision

First published 2022-06-01 14:03:58

Original link: https://www.bilibili.com/video/BV1AV411q7vi?spm_id_from=333.880.my_history.page.click

Lecture 1 Introduction to Deep Learning

1. Definition of Computer Vision and Deep Learning

Computer Vision: Building artificial systems that process, perceive, and reason about visual data.

Deep Learning: Hierarchical learning algorithms with many “layers”, (very) loosely inspired by the brain.

2. Brief History of Computer Vision

Early Exploration of the Visual Cortex, Hubel and Wiesel, 1959

In their cat visual cortex experiments, begun in 1958, Hubel and Wiesel first observed that neurons were sensitive to moving edge stimuli. They defined simple and complex cells and discovered the functional column structure of the visual cortex. This research is considered the beginning of computer vision for two reasons:

It was the first work to emphasize oriented edges, which were widely used in later computer vision architectures.
They found that the cat's visual cortex processes information hierarchically. First, simple cells respond to light at a particular orientation; then, through successive neural connections, increasingly complex cells respond to increasingly complex and abstract information.

Stages of Visual Representation, David Marr, 1970s

Marr held that vision is the use of effective symbols to describe images of the external world. Its core task is to infer the structure of the external world from the structure of the image. Vision begins with images, passes through a series of processing and transformation stages, and finally arrives at recognition of external reality.

Primal Sketch (2-D sketch): The primal sketch is obtained from the input image. It captures the locations where image intensity changes sharply, together with their geometric distribution and organizational structure.
2.5-D Sketch: It captures the normal direction, approximate depth, and discontinuous contours of visible surfaces in an observer-centered coordinate system.
3-D Model Representation: It captures the spatial organization of shapes, described hierarchically in terms of surface and volumetric primitives in an object-centered coordinate system.

Recognition via Edge Detection, John Canny, David Lowe, 1980s

The Canny algorithm is a classical edge detection algorithm, proposed by John F. Canny in 1986. In 1987, David Lowe proposed a more complex related theory built on edge detection. The steps of the Canny algorithm are as follows:

Apply a Gaussian blur
Compute the gradient magnitude and direction
Apply non-maximum suppression
Apply a double threshold to separate strong edges from weak edges
Connect weak edges to strong edges
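The five steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Canny's reference implementation: the 3x3 Gaussian and Sobel kernels, the edge-replication border handling, the thresholds, and the names `filter3` and `canny` are all choices made for this example, and the pure-Python loops are only practical for tiny images.

```python
import math

GAUSS = [[1 / 16, 2 / 16, 1 / 16], [2 / 16, 4 / 16, 2 / 16], [1 / 16, 2 / 16, 1 / 16]]
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def filter3(img, k):
    """Apply a 3x3 filter (cross-correlation) with edge replication."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += k[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out

def canny(img, low, high):
    h, w = len(img), len(img[0])
    # Step 1: Gaussian blur
    blurred = filter3(img, GAUSS)
    # Step 2: gradient magnitude and direction
    gx = filter3(blurred, SOBEL_X)
    gy = filter3(blurred, SOBEL_Y)
    mag = [[math.hypot(gx[y][x], gy[y][x]) for x in range(w)] for y in range(h)]
    # Step 3: non-maximum suppression along the (quantized) gradient direction
    thin = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            angle = math.degrees(math.atan2(gy[y][x], gx[y][x])) % 180
            if angle < 22.5 or angle >= 157.5:
                dy, dx = 0, 1
            elif angle < 67.5:
                dy, dx = 1, 1
            elif angle < 112.5:
                dy, dx = 1, 0
            else:
                dy, dx = 1, -1
            m = mag[y][x]
            # Strict comparison on one side breaks ties between equal neighbors
            if m >= mag[y + dy][x + dx] and m > mag[y - dy][x - dx]:
                thin[y][x] = m
    # Step 4: double threshold
    strong = [[thin[y][x] >= high for x in range(w)] for y in range(h)]
    weak = [[low <= thin[y][x] < high for x in range(w)] for y in range(h)]
    # Step 5: keep weak pixels that are 8-connected to a strong pixel
    edges = [row[:] for row in strong]
    stack = [(y, x) for y in range(h) for x in range(w) if strong[y][x]]
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and weak[yy][xx] and not edges[yy][xx]:
                    edges[yy][xx] = True
                    stack.append((yy, xx))
    return edges

# Toy 8x8 image: dark left half, bright right half (a single vertical edge)
image = [[0] * 4 + [100] * 4 for _ in range(8)]
edges = canny(image, low=50, high=150)
for row in edges:
    print("".join("#" if e else "." for e in row))
```

On this toy image the detector should mark a one-pixel-wide vertical line near the boundary between the two halves. The border rows stay empty because this sketch skips non-maximum suppression on the outermost pixels.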

Recognition via Matching, David Lowe, 1999

In 1999, David Lowe proposed a different approach: recognition via matching (SIFT). The idea is to compute a feature vector at each key point in the image. Each feature vector is a real-valued encoding of local appearance, constructed so that invariance to certain image changes is built into it. Even if the underlying image changes slightly (for example in brightness, rotation, or viewing angle), the feature vectors can still be used for image recognition through matching.
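The matching step can be illustrated with a small sketch. It is not SIFT itself: key point detection and descriptor construction are omitted, the descriptor values are made up, and the names `normalize`, `distance`, and `match` are hypothetical. The ratio check between the nearest and second-nearest neighbor is a commonly used heuristic, included here as an assumption. L2 normalization stands in for one kind of invariance: a uniform brightness scaling of the descriptor no longer changes the result.

```python
import math

def normalize(v):
    """Scale a descriptor to unit length so uniform brightness scaling cancels out."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def match(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b.

    A match is kept only if the nearest neighbor is clearly closer
    than the second nearest (assumed ratio threshold).
    """
    a_norm = [normalize(d) for d in desc_a]
    b_norm = [normalize(d) for d in desc_b]
    matches = []
    for i, a in enumerate(a_norm):
        ranked = sorted((distance(a, b), j) for j, b in enumerate(b_norm))
        (d1, j1), (d2, _) = ranked[0], ranked[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches

# Made-up descriptors for three key points in image A
desc_a = [[1.0, 0.0, 2.0, 1.0], [0.0, 3.0, 1.0, 0.0], [2.0, 2.0, 0.0, 1.0]]
# The same key points in a brighter image B: roughly 1.5x values, small noise, different order
desc_b = [[3.0, 3.1, 0.0, 1.5], [1.5, 0.1, 3.0, 1.5], [0.0, 4.5, 1.6, 0.0]]

matches = match(desc_a, desc_b)
print(matches)
```

Each pair `(i, j)` in the result says that key point `i` in image A corresponds to key point `j` in image B, despite the brightness change and the reordering.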

Large Scale Visual Recognition Challenge (from 2010) and AlexNet: Deep Learning Goes Mainstream, Krizhevsky, 2012

In 2012, AlexNet greatly reduced the image recognition error rate in this challenge, which made researchers realize that deep learning would become the mainstream of computer vision research.

3. Brief History of Deep Learning

Perceptron, Frank Rosenblatt, 1958

Rosenblatt proposed one of the earliest algorithms that could learn from data; it could learn to recognize letters of the alphabet. Today we recognize it as a linear classifier.
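As a minimal sketch of what "linear classifier that learns from data" means, here is the perceptron learning rule in plain Python. The toy data (the logical AND function), the learning rate, the epoch limit, and the names `train_perceptron` and `predict` are assumptions for illustration; Rosenblatt's original system was hardware and worked on letter images.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron learning rule. Labels must be +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified, or exactly on the boundary
                # Move the decision boundary toward the correct side
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly
            break
    return w, b

def predict(w, b, x):
    """Linear classifier: the sign of w . x + b."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Toy, linearly separable data: logical AND
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [-1, -1, -1, 1]

w, b = train_perceptron(X, Y)
predictions = [predict(w, b, x) for x in X]
print(w, b, predictions)
```

Because the decision is just the sign of a weighted sum, a single perceptron can only separate classes with a straight line (or hyperplane), which is why it is described as a linear classifier.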

Backprop, Rumelhart, Hinton, 1985

Rumelhart and his co-authors presented the backpropagation algorithm for computing gradients in neural networks and used it to successfully train perceptrons with multiple layers.
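To make "computing gradients in a neural network" concrete, here is a small sketch of backpropagation for a two-layer network in plain Python. The architecture (a tanh hidden layer, a linear output, squared-error loss), the weight values, and the function names are assumptions chosen for illustration. The last lines compare one backpropagated gradient entry with a finite-difference estimate as a sanity check.

```python
import math

def forward(params, x):
    """Two-layer network: tanh hidden layer followed by a linear output."""
    W1, b1, w2, b2 = params
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(w2, h)) + b2
    return h, y

def loss(params, x, t):
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2

def backward(params, x, t):
    """Apply the chain rule from the loss back to every parameter."""
    W1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - t                                   # dL/dy
    dw2 = [dy * hi for hi in h]                  # dL/dw2
    db2 = dy                                     # dL/db2
    dh = [dy * w for w in w2]                    # dL/dh
    dz = [dhi * (1.0 - hi * hi) for dhi, hi in zip(dh, h)]  # tanh'(z) = 1 - tanh(z)^2
    dW1 = [[dzi * xj for xj in x] for dzi in dz]  # dL/dW1
    db1 = dz                                     # dL/db1
    return dW1, db1, dw2, db2

# Hypothetical weights and a single training example
params = ([[0.5, -0.3], [0.1, 0.8]], [0.0, 0.1], [0.7, -0.4], 0.2)
x, t = [1.0, 2.0], 1.0
dW1, db1, dw2, db2 = backward(params, x, t)

def loss_with_w01(value):
    """Loss as a function of the single weight W1[0][1]."""
    W1 = [row[:] for row in params[0]]
    W1[0][1] = value
    return loss((W1, params[1], params[2], params[3]), x, t)

# Central finite difference for the same weight
eps = 1e-6
w01 = params[0][0][1]
numeric = (loss_with_w01(w01 + eps) - loss_with_w01(w01 - eps)) / (2 * eps)
print(dW1[0][1], numeric)  # the two values should agree closely
```

Training then consists of repeatedly subtracting a small multiple of these gradients from the parameters, which is how networks with multiple layers can be fitted to data.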

Convolutional networks: LeNet, LeCun et al, 1998

LeNet applied the backpropagation algorithm to a Neocognitron-like architecture. It learned to recognize handwritten digits and was deployed in a commercial system by NEC to process handwritten checks.
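The two building blocks of such an architecture, convolution and spatial subsampling, can be sketched in a few lines of plain Python. This is an illustration only, not LeNet: the single hand-written edge filter, the 2x2 average pooling, and the names `conv2d` and `avg_pool2x2` are assumptions, and in a trained network the filter values would be learned by backpropagation rather than fixed by hand.

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel over the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

def avg_pool2x2(fmap):
    """Subsample a feature map by averaging non-overlapping 2x2 blocks."""
    return [[(fmap[y][x] + fmap[y][x + 1] + fmap[y + 1][x] + fmap[y + 1][x + 1]) / 4.0
             for x in range(0, len(fmap[0]) - 1, 2)]
            for y in range(0, len(fmap) - 1, 2)]

# Toy 6x6 image with a vertical edge between columns 2 and 3
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-written vertical-edge filter (a learned filter would replace this)
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = conv2d(image, kernel)   # 4x4 map, large where the edge is
pooled = avg_pool2x2(feature_map)     # 2x2 map after subsampling
print(feature_map)
print(pooled)
```

The same small filter is reused at every image position, so the layer needs far fewer parameters than a fully connected layer over the same image.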

ConvNets are everywhere, 2012 to Present

Nowadays convolutional networks are widely used in computer vision:

Image Classification
Image Retrieval
Object Detection
Video Classification
Pose Recognition

4. Summary

In 2012, deep learning broke into computer vision with striking results. To understand what happened that year, it helps to take a 50-year perspective. In my opinion, the breakthrough is attributable to algorithms, data, and computing, all of which had evolved over the preceding 50 years. It is the improvement of algorithms, the massive increase in data, and the development of GPUs that enabled new applications, represented by convolutional networks, to change the world.
