De-mystifying Machine Learning

The terms “artificial intelligence”, “machine learning”, and “neural networks” are thrown around loosely nowadays. With all of the hype built around them in the news, machine learning is often praised as a silver bullet: a magical technique that can solve ANY problem, no matter how complicated.

Well, as you’d probably expect, that’s not completely true. What is true is that modern machine learning (ML) techniques have been able to perform tasks that previously seemed impossible. Not only that, but compared to the solutions we already had for certain tasks, modern ML techniques have proven to be better in almost every way: they are faster to develop, more robust, and often run faster. However, these techniques come with their own limitations that make them better suited to certain applications. So while the field is constantly evolving, with hundreds of researchers around the world expanding the state of the art, it’s important to understand the fundamentals of how it works so that we can apply it to the problems at which it excels.

Define the terms

Let’s first define the three most commonly used “buzzword” terms in this industry:

Artificial Intelligence — The theory and development of computer systems able to perform tasks that normally require human intelligence. [1]

Machine Learning — The study of computer algorithms that improve automatically through experience. [2]

Neural Network/Deep Learning — A machine learning technique, very loosely modeled on the structure of the human brain, that is effective at learning complex patterns in data.

So, machine learning is a subset of artificial intelligence, and neural networks are a subset of machine learning. Much of the news and many of the advancements in artificial intelligence and machine learning have been due to neural networks, which is why the terms are often used interchangeably. Over the past couple of years, researchers have shown that neural nets are capable of highly complicated, nuanced, and diverse tasks. For example, they excel at image-based tasks such as object detection and classification, human pose detection, and human mood detection. They have been used for audio tasks, such as speech-to-text transcription and music generation. All modern language translation services apply neural nets to extract the meaning from phrases and convert those phrases into different languages. Some recent notable advancements include beating the world champions at the board game Go (a game that is notoriously hard because it requires long-term strategy) and generating paragraphs of text on provided topics that are almost indistinguishable from human-written text.

Because of their outstanding performance, all of the big tech companies have been investing heavily in applying neural nets in their products. Google uses them for its search engine, translation, ad targeting, photo tagging, map generation, and much more. All of the big social media companies use them to recommend content to their users and to understand user sentiment. Self-driving car companies apply them to process data about the surrounding environment so the car can make safe decisions. This list is by no means comprehensive.

Example: Neural network performing object detection [3]

Since most people are referring to deep learning and neural networks when they talk about machine learning, we will focus only on those in this article.

How does a Neural Net Work?

Neural nets can process complex data such as images, audio clips, and videos. We humans perceive this data through our senses as colors and sounds. To computers, however, images are just collections of brightness values. As a result, to process and understand the contents of these rich data sources, we need to apply mathematical techniques. Neural nets are really just big, complicated math functions, like y = mx + b or y = e^x or y = sin(x). They take in a collection of numbers representing the data and output another collection of numbers describing the answer you “taught” them to give.
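
To make that concrete, here is a minimal sketch (Python with NumPy; the pixel values are made up) of what an image looks like to a computer: an array of brightness values that can be flattened into a plain list of numbers before being handed to any math function.

```python
import numpy as np

# A tiny 4x4 grayscale "image": each entry is a pixel brightness in [0, 1].
image = np.array([
    [0.0, 0.1, 0.9, 0.0],
    [0.2, 0.8, 0.7, 0.1],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.2, 0.1, 0.0],
])

# To a computer this is just a collection of 16 numbers...
x = image.flatten()

# ...which any math function can consume, e.g. y = sin(x) applied elementwise.
y = np.sin(x)
print(x.shape, y.shape)  # (16,) (16,)
```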

As we mentioned previously, neural nets are loosely based on the structure of the brain (consider the brain a useful analogy, not an exact representation of how they work). Neural nets consist of multiple layers of “neurons”, where each layer is multiple “neurons” wide. Each neuron represents a relatively simple mathematical operation, think mx + b (where x is the number input to the neuron), followed by a non-linear function, like sin(x) (the actual function used depends on the specific task you’re doing). The output of each neuron is fed into each of the neurons in the next layer. When these layers of neurons are stacked very deep (this is where the term “deep learning” comes from), the result is a function capable of describing very complicated relationships in the data.
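
Below is a minimal NumPy sketch of that structure, under some simplifying assumptions: two small dense layers, tanh as the non-linear function, and randomly chosen weights (a trained network would have learned these instead).

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny network: 16 inputs -> 8 hidden neurons -> 2 outputs.
W1, b1 = rng.normal(size=(8, 16)), np.zeros(8)   # the "m" and "b" of layer 1
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)    # the "m" and "b" of layer 2

def forward(x):
    # Each hidden neuron computes a weighted sum plus bias (mx + b),
    # then applies a non-linear function (tanh here).
    h = np.tanh(W1 @ x + b1)
    # The output of every hidden neuron feeds into every output neuron.
    return W2 @ h + b2

x = rng.random(16)   # e.g. a flattened 4x4 "image"
print(forward(x))    # two numbers, e.g. scores for "dog" / "not dog"
```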

Neural net architecture: circles represent neurons, lines represent data flow (source: Wikimedia Commons)

So we’ve covered what a neural net IS, but we haven’t talked about what it means to “train” one. How does a neural net “learn”? Remember how we mentioned that each neuron applies mx + b to its inputs? The “m” and “b” in that formula are actually learnable parameters. In other words, we tune the values of “m” and “b” in each neuron to change what the neural network does. Does this sound familiar? It is actually the same as the process of linear regression in statistics! In linear regression, we try to find a best fit line for our data by finding the correct parameters “m” and “b”. Neural nets are just doing this at a massive scale. Instead of finding 2 parameters for a line, we are finding millions or even billions of parameters for a very complicated function. So neural nets are, in a sense, best fit lines for your data. It’s a little weird because instead of converting a single input to a single output like in linear regression, neural nets convert data like images into labels describing what the image contains. However, they are fundamentally the same thing.
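
For comparison, here is what finding “m” and “b” looks like for plain linear regression in NumPy; a neural net plays the same game, just with millions of parameters instead of two.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy data generated from the line y = 2x + 1.
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(scale=0.5, size=x.size)

# Find the best-fit parameters "m" and "b" for a straight line.
m, b = np.polyfit(x, y, 1)
print(m, b)  # close to 2 and 1
```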

Because neural nets are so complicated, we can’t find the m’s and b’s (called weights) the same way we would for linear regression, so other methods had to be devised. The structure of neural nets described above is very intentional: they are built so that the whole neural net is differentiable (the whole neural net has a derivative, crazy right?). The details of what exactly that means are unimportant. The consequence is that we can use a technique called “gradient descent” to select the correct weights. Gradient descent has proven to be an extremely successful method for finding those weights. In fact, the gradient descent technique is the reason that neural nets have exploded in popularity. They would largely be useless without it.

The idea behind gradient descent is very simple and is actually similar to how humans learn (going back to the brain analogy). Let’s talk about a specific task: determining whether or not an image contains a dog. We feed a bunch of images (some containing dogs, some not) into the neural net and get an output for each image. At the start of training, the weights are set randomly, so the output of the neural net is meaningless. After feeding a couple of images through, we compare the output of the neural net to the correct output for each image. We make this comparison using something called a loss function, which tells the algorithm how “wrong” the neural net was. Through some mathematical magic (really just multi-variable calculus), we then calculate how to change each weight (remember, just a lot of m’s and b’s) based on the loss function so that the neural net gets closer to the right answer. We apply the changes calculated from each image and then try again on a new set of images. As we repeat this process, the neural net gets better and better at identifying dogs. After a few thousand cycles, the neural net becomes very good at the task! So just like humans, the neural net “practices”, trying again and again and getting better every time.
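
Here is a minimal sketch of that training loop. To keep it readable, it fits the single “m” and “b” of linear regression with a mean-squared-error loss; an image classifier follows exactly the same recipe, just with far more weights and a library to compute the gradients automatically.

```python
import numpy as np

rng = np.random.default_rng(2)

# Data from y = 2x + 1 with noise; these are the "correct answers".
x = np.linspace(0, 10, 100)
y_true = 2 * x + 1 + rng.normal(scale=0.5, size=x.size)

m, b = rng.normal(), rng.normal()  # start with random weights
lr = 0.01                          # learning rate: how big each step is

for step in range(2000):
    y_pred = m * x + b                  # forward pass
    error = y_pred - y_true
    loss = np.mean(error ** 2)          # loss function: how "wrong" we are
    grad_m = 2 * np.mean(error * x)     # d(loss)/dm
    grad_b = 2 * np.mean(error)         # d(loss)/db
    m -= lr * grad_m                    # nudge weights toward lower loss
    b -= lr * grad_b

print(m, b)  # close to 2 and 1
```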

That’s it! Those are the fundamental concepts of how a neural net learns.

What Conclusions Can We Draw from This?

Now that you have a basic understanding of what a neural net is, let’s discuss some important points about applying neural nets to your tasks.

Neural nets rely on data — lots of data

When you’re doing linear regression, you need lots of data points to get a good best fit line. If your dataset is too small, you run the risk of getting a bad best fit line, resulting in poor estimates.

Orange: best fit line with 2 data points. Blue: best fit line with 50 data points (including the orange ones)

The same is true for neural nets, except on a massive scale. Neural nets need thousands of examples to learn from. The more diverse and varied your dataset is, the better the neural network will perform on new, unseen examples (a property called generalization).
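
The figure above is easy to reproduce with a quick NumPy experiment (the line y = 2x + 1 and the noise level are made up for illustration): with 2 points the fitted slope is at the mercy of the noise, while 50 points pin it down.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_line(n):
    # Sample n noisy points from y = 2x + 1, then fit a line to them.
    x = rng.uniform(0, 10, n)
    y = 2 * x + 1 + rng.normal(scale=2.0, size=n)
    return np.polyfit(x, y, 1)

print("2 points:", fit_line(2))    # can land far from (2, 1)
print("50 points:", fit_line(50))  # reliably close to (2, 1)
```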

Over the past couple of years, massive datasets have become available for a large variety of problems. However, the data requirement is a limiting factor when applying neural networks to brand-new tasks, as it is often not feasible to collect that many examples AND record the correct output for each one. Luckily, in recent years, there have been significant advances in reducing the amount of data needed to train them. One popular technique is called transfer learning, where a neural network is trained to do a certain task and then “fine-tuned” to do another, similar task with less data available.
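
As an illustration of the idea (not code from the original article), here is roughly what transfer learning looks like in Keras, assuming TensorFlow is installed: start from a network pretrained on ImageNet, freeze its learned weights, and train only a small new “head” on your much smaller dataset.

```python
import tensorflow as tf

# Start from a network already trained on ImageNet (millions of images).
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pretrained weights

# Add a small new head for our task (e.g. dog vs. not-dog).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),  # one output score
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))
# model.fit(...) on a few hundred labeled images, instead of millions.
```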

Neural Nets require powerful computers to run

As we mentioned earlier, modern neural nets have millions or billions of weights and perform millions of multiplication and addition operations to calculate their outputs. This makes it difficult, sometimes impossible, to run them on older or less powerful computers. There is a significant research effort dedicated to making neural networks smaller and running them on small devices.
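
A quick back-of-the-envelope calculation shows why. Even a modest fully-connected network racks up weights quickly; the layer sizes below are invented purely for illustration.

```python
# Each fully-connected layer with n_in inputs and n_out outputs has
# n_in * n_out weights plus n_out biases.
layers = [(224 * 224 * 3, 1024),  # flattened 224x224 RGB image -> 1024 neurons
          (1024, 1024),
          (1024, 10)]

total = sum(n_in * n_out + n_out for n_in, n_out in layers)
print(f"{total:,} parameters")  # roughly 155 million

# Each weight implies about one multiply-add per prediction, which is
# why older or low-power hardware struggles with large networks.
```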

Wrapping it up

Remember how we said neural nets are not a catch-all solution to all of your problems? Most of the time, one of the two conclusions above is a limiting factor on their usefulness. As time goes on, these limits will diminish, but they will never completely go away. Before you settle on deep learning for a new problem, consider other solutions first. There are often other machine learning/statistics tools that will provide satisfactory results while requiring less data or computational power. However, if you do have access to a large dataset and plenty of computational power, deep learning has the potential to build an unparalleled solution.

Hopefully, this article helped to de-mystify some of the concepts behind the buzzwords that are thrown around and to clarify when these techniques should be considered.

Sources:

[1] “artificial intelligence.” Oxford Reference, https://www.oxfordreference.com/view/10.1093/oi/authority.20110803095426960. Accessed 27 Jul. 2020.

[2] Mitchell, Tom M. Machine Learning. New York, NY: McGraw Hill, 1997.

[3] https://commons.wikimedia.org/wiki/File:Detected-with-YOLO--Schreibtisch-mit-Objekten.jpg

Translated from: https://medium.com/swlh/de-mystifying-machine-learning-482049ee8c02
