[CNN][CV] 9 Neural Networks You Should Know: Understanding CNNs - Important Papers and Networks
Reference: this post is a summary based on the referenced article.
AlexNet (2012)
The first CNN to win the ImageNet (ILSVRC) competition, in 2012.
ZF Net (2013)
"Visualizing and Understanding Convolutional Networks" by Matt Zeiler and Rob Fergus.
Winner of ImageNet in 2013. Introduced the inspiring idea of using a deconvnet to visualize feature maps and the input structures that most excite a given feature map, building intuition about what each layer learns.
VGG Net (2014)
Input is a 224*224 RGB image; uses only 3*3 filters throughout. Very deep (16-19 weight layers).
Works well on both image classification and localization.
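A quick back-of-the-envelope sketch of why stacking 3*3 filters is attractive: two stacked 3*3 conv layers cover the same 5*5 receptive field as one 5*5 layer, but with fewer weights and an extra non-linearity. The channel count below is illustrative, not from the paper.

```python
# Sketch: parameter count of stacked 3x3 convs vs. one 5x5 conv.
# C = 64 is an assumed, illustrative channel count.
def conv_params(k, c_in, c_out):
    """Weights of a single k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

C = 64
stacked_3x3 = 2 * conv_params(3, C, C)   # two stacked 3x3 layers
single_5x5 = conv_params(5, C, C)        # one 5x5 layer, same receptive field
print(stacked_3x3, single_5x5)           # 73728 102400
```

The stacked version uses fewer parameters while adding an extra ReLU between the two layers, which is part of why very deep stacks of small filters work so well.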
GoogLeNet (2015)
Departs from the usual approach of simply stacking conv and pooling layers on top of each other in a sequential structure; the design explicitly accounts for compute and memory cost.
-Inception module
Applies several filter sizes in parallel with "same" padding and concatenates the resulting feature maps along the depth dimension.
From the naïve Inception module to the full Inception module: add 1*1 filters before the expensive convolutions.
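The depth-concatenation idea can be sketched in a few lines. The branch shapes below are assumed for illustration, not the exact GoogLeNet configuration; the point is that "same" padding keeps spatial sizes equal so only the channel counts add up.

```python
import numpy as np

# Sketch of the Inception concatenation (assumed branch widths): parallel
# branches preserve the spatial size, then outputs are stacked along depth.
H, W = 28, 28
branch_1x1 = np.zeros((H, W, 64))    # stand-in for a 1x1-conv branch output
branch_3x3 = np.zeros((H, W, 128))   # stand-in for a 3x3-conv branch output
branch_5x5 = np.zeros((H, W, 32))    # stand-in for a 5x5-conv branch output
branch_pool = np.zeros((H, W, 32))   # stand-in for a pooling branch output

# Depth concatenation: spatial dims must match, channels add up (64+128+32+32).
out = np.concatenate([branch_1x1, branch_3x3, branch_5x5, branch_pool], axis=-1)
print(out.shape)  # (28, 28, 256)
```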
-1*1 filter:
A method of dimensionality reduction, e.g.
$100 \times 100 \times 60 \xrightarrow{20\ 1 \times 1\ \text{filters}} 100 \times 100 \times 20$
This can be thought of as a “pooling of features” because we are reducing the depth of the volume, similar to how we reduce the dimensions of height and width with normal max pooling layers.
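Because a 1*1 convolution acts independently at each pixel, it is just a linear map over the channel dimension, which makes the 60 → 20 reduction above easy to sketch with a single matrix multiply:

```python
import numpy as np

# Sketch: a 1x1 convolution is a per-pixel linear map over channels,
# so reducing depth 60 -> 20 is one matrix multiply per pixel.
rng = np.random.default_rng(0)
x = rng.standard_normal((100, 100, 60))  # H x W x C_in feature volume
w = rng.standard_normal((60, 20))        # 20 filters, each of shape 1x1x60

y = x @ w                                # applied independently at every pixel
print(y.shape)  # (100, 100, 20)
```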
ResNet (2015)
152 layers.
Residual Block: instead of learning a direct mapping H(x), each block learns a residual F(x) and outputs F(x) + x through an identity shortcut connection.
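A minimal sketch of the residual idea, using 1-D features and a toy two-layer residual function F (real ResNet blocks use convolutions and batch norm; the shapes and scaling here are assumptions for illustration):

```python
import numpy as np

# Sketch of a residual block: the output is F(x) + x, so the layers only
# have to learn the residual F. With F near zero the block is near-identity,
# which is what makes very deep stacks trainable.
def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    f = relu(x @ w1) @ w2  # the residual mapping F(x)
    return relu(f + x)     # identity shortcut, then non-linearity

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))
w1 = rng.standard_normal((16, 16)) * 0.01  # small weights: F(x) ~ 0
w2 = rng.standard_normal((16, 16)) * 0.01
y = residual_block(x, w1, w2)
print(y.shape)  # (4, 16)
```

With zero weights the block reduces exactly to relu(x), illustrating why adding more residual blocks cannot easily hurt the network: they can default to (near-)identity.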
Region-Based CNNs (R-CNNs)
Solves the problem of object detection. The process splits into two general components: the region proposal step (Selective Search) and the classification step.
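The two-stage flow can be sketched as plain control flow. Everything named here (`propose_regions`, `classify_region`, the box format) is a hypothetical stand-in, not a real API; the point is only the proposal-then-classify structure.

```python
# High-level sketch of the R-CNN pipeline. `propose_regions` (e.g. Selective
# Search) and `classify_region` are hypothetical stand-ins for illustration.
def detect_objects(image, propose_regions, classify_region, threshold=0.5):
    """Stage 1: class-agnostic region proposals. Stage 2: classify each region."""
    detections = []
    for box in propose_regions(image):              # stage 1: candidate boxes
        label, score = classify_region(image, box)  # stage 2: CNN features + classifier
        if label != "background" and score >= threshold:
            detections.append((box, label, score))
    return detections

# Toy usage with stub stages (boxes are (x, y, w, h) tuples):
boxes = lambda img: [(0, 0, 10, 10), (5, 5, 20, 20)]
clf = lambda img, box: ("cat", 0.9) if box[2] > 15 else ("background", 0.8)
print(detect_objects("img", boxes, clf))  # [((5, 5, 20, 20), 'cat', 0.9)]
```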
Generative Adversarial Networks (2014)
For example, let’s consider a trained CNN that works well on ImageNet data. Let’s take an example image and apply a perturbation, or a slight modification, so that the prediction error is maximized. Thus, the object category of the prediction changes, while the image itself looks the same when compared to the image without the perturbation. From the highest level, adversarial examples are basically the images that fool ConvNets.
GAN: Let's think of two models, a generative model and a discriminative model. The discriminative model has the task of determining whether a given image looks natural (an image from the dataset) or looks like it has been artificially created. The task of the generator is to create images that the discriminator mistakes for natural ones. This can be thought of as a zero-sum or minimax two-player game. The analogy used in the paper is that the generative model is like "a team of counterfeiters, trying to produce and use fake currency" while the discriminative model is like "the police, trying to detect the counterfeit currency". The generator is trying to fool the discriminator while the discriminator is trying to not get fooled by the generator. As the models train, both improve until a point where the "counterfeits are indistinguishable from the genuine articles".
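The zero-sum game above is exactly the minimax objective of the original GAN paper:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

The discriminator $D$ maximizes this value (correctly scoring real samples near 1 and generated samples near 0), while the generator $G$ minimizes it (pushing $D(G(z))$ toward 1).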
Generating Image Descriptions (2014)
Combines CNNs with RNNs.
Given images and their text descriptions (weak labels).
-Alignment Model
Aims to learn to align the visual and textual data.
Uses an R-CNN to detect objects in the image and embeds them into a 500-dimensional space. A bidirectional RNN embeds the words of the description into the same multimodal space. Similarity is computed via inner products.
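The alignment scoring step is easy to sketch: once regions and words live in the same 500-dimensional space, every region-word similarity is a single inner product. The counts (5 regions, 8 words) are assumed for illustration.

```python
import numpy as np

# Sketch of the alignment model's scoring: image regions and words embedded
# in one multimodal space, compared by inner products. Sizes are illustrative.
rng = np.random.default_rng(0)
regions = rng.standard_normal((5, 500))  # 5 detected regions, 500-d each
words = rng.standard_normal((8, 500))    # 8 words embedded in the same space

scores = regions @ words.T                    # (5, 8) grid of inner products
best_region_per_word = scores.argmax(axis=0)  # best-matching region per word
print(scores.shape)  # (5, 8)
```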
-Generation Model
Learns from the aligned dataset created by the alignment model to generate descriptions.
Spatial Transformer Networks (2015)
The network learns transformations of the feature volumes and applies the transformation (a warp) between layers.
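The sampling step of a spatial transformer can be sketched as follows. This is a simplified nearest-neighbour version with assumed shapes (the paper uses differentiable bilinear sampling): a learned 2*3 affine matrix maps each output coordinate back to a source coordinate in the input feature map.

```python
import numpy as np

# Sketch of spatial-transformer sampling (nearest-neighbour, single-channel;
# the real module uses differentiable bilinear sampling).
def affine_warp(feat, theta):
    """Warp an H x W feature map with a 2x3 affine matrix theta."""
    H, W = feat.shape
    out = np.zeros_like(feat)
    for i in range(H):
        for j in range(W):
            # map output pixel (i, j) back to a source location in the input
            src = theta @ np.array([i, j, 1.0])
            si, sj = int(round(src[0])), int(round(src[1]))
            if 0 <= si < H and 0 <= sj < W:  # outside the input -> zero
                out[i, j] = feat[si, sj]
    return out

feat = np.arange(16, dtype=float).reshape(4, 4)
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # no-op transform
shift = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])      # sample one row down
print(np.allclose(affine_warp(feat, identity), feat))  # True
```

Because theta is produced by a small "localisation network" conditioned on the input, the network can learn to undo translations, rotations, and scalings of its own feature maps.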