How Convolutional Neural Networks Interpret and Predict Images

Latest recommended article published 2024-07-22 17:32:39

weixin_26752765

Tags: neural networks, deep learning, computer vision, machine learning, python

Original article: https://medium.com/@aseem.kash/a-comprehensive-guide-to-convolution-neural-networks-4bc10584cbac

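Summary (generated automatically by CSDN): This article looks closely at how the key layers in a convolutional neural network (CNN) work, with the goal of explaining intuitively how deep learning can be used to classify images. It introduces the objective of a CNN, such as telling apart cats and dogs of different colors, sizes, and breeds, and stresses that the algorithm must be spatially invariant. It also explains how a computer reads and understands an image through its pixel values.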

Deep Learning Basics

The aim of this article is to provide an intuitive understanding of the inner workings of the key layers in a convolutional neural network. The idea is to go beyond simply stating facts and to explore how image manipulation actually works.

The Objective

Our aim is to design a deep learning framework capable of classifying cat and dog images like those shown below. Let us start by thinking about the challenges such an algorithm must overcome.

It should be able to detect cats and dogs of different colors, sizes, shapes, and breeds. It must be able to detect and classify animals even in pictures where the dog or cat is not entirely visible. It must be sensitive to the presence of more than one dog in the image. Most importantly, the algorithm must be spatially invariant: it must be able to recognise a dog located anywhere in the image, including any corner.
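To make the idea of spatial invariance concrete, here is a minimal toy sketch. It is not the article's method: the helper names `place` and `best_match`, the 2x2 pattern, and the 5x5 image size are all assumptions chosen for illustration. The same small pattern is pasted into two different corners of an image. Comparing the two images pixel by pixel says they are different, while sliding the pattern over each image and keeping the best overlap score gives the same score in both cases.

```python
def place(pattern, size, top, left):
    """Return a size x size image of zeros with `pattern` pasted at (top, left)."""
    image = [[0] * size for _ in range(size)]
    for r, row in enumerate(pattern):
        for c, value in enumerate(row):
            image[top + r][left + c] = value
    return image


def best_match(image, pattern):
    """Slide `pattern` over `image`; return (highest overlap score, its position)."""
    ph, pw = len(pattern), len(pattern[0])
    best_score, best_pos = None, None
    for top in range(len(image) - ph + 1):
        for left in range(len(image[0]) - pw + 1):
            # Overlap score: sum of element-wise products inside the window.
            score = sum(
                image[top + r][left + c] * pattern[r][c]
                for r in range(ph)
                for c in range(pw)
            )
            if best_score is None or score > best_score:
                best_score, best_pos = score, (top, left)
    return best_score, best_pos


# Hypothetical 2x2 "object" placed in opposite corners of a 5x5 image.
pattern = [[1, 0], [0, 1]]
top_left = place(pattern, 5, 0, 0)
bottom_right = place(pattern, 5, 3, 3)

print(top_left == bottom_right)           # False: the raw pixel grids differ
print(best_match(top_left, pattern))      # (2, (0, 0))
print(best_match(bottom_right, pattern))  # (2, (3, 3))
```

The best score is 2 in both images even though the object moved; only the reported position changes. A classifier that relies on the score rather than the position would therefore treat both images alike, which is the behaviour the requirement above asks for.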

How a Computer Reads Images

Images are composed of pixels whose values range from 0 to 255 and represent brightness. 0 is black, 255 is white, and every value in between is a shade of grey. The more pixels an image has, the finer the detail it can show and the better its quality.
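As a small illustration of the description above, the sketch below stores a greyscale image as a grid of brightness values. The 4x4 image and the helper names `describe` and `normalize` are made up for this example. Rescaling the values to the range 0 to 1 is a common preprocessing step and an assumption here, not something the article states.

```python
# A hypothetical 4x4 greyscale image: each number is one pixel's brightness,
# from 0 (black) on the left to 255 (white) on the right.
image = [
    [0, 64, 128, 255],
    [0, 64, 128, 255],
    [0, 64, 128, 255],
    [0, 64, 128, 255],
]


def describe(image):
    """Return (height, width, darkest, brightest) for a list-of-rows image."""
    height = len(image)
    width = len(image[0])
    flat = [pixel for row in image for pixel in row]
    return height, width, min(flat), max(flat)


def normalize(image):
    """Rescale 0..255 brightness values to floats between 0 and 1."""
    return [[pixel / 255 for pixel in row] for row in image]


print(describe(image))         # (4, 4, 0, 255)
print(normalize(image)[0][3])  # 1.0
```

To the computer, the image is nothing more than this grid of numbers; a larger grid simply means more pixels describing the same scene.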