The Good, the Bad, and the Ugly: Supervised, Unsupervised, and Reinforcement Learning

Hello dear reader! In the post you’re about to read, I will cover in a very simple manner what the three main types of learning in Machine Learning are: Supervised, Unsupervised, and Reinforcement Learning.

As there are millions of posts out there on the differences between these three, where each can be used, and all the other typical topics, I will try to go a fair bit further: explore them in a novel manner, give my opinions from an industry and commercial perspective, and throw in a little bit of humour, while also neatly explaining what each of them is about.

Let's go!

The Good: Supervised Learning

Image: icons from FlatIcon and DLpng.

The guy that everybody likes. Thanks to him, your voice assistant can call you an Uber to pick you up at night. He can rank the visitors to your website so that you can easily see who is more likely to buy those pretty sunglasses you sell and target them with marketing campaigns. He makes it so that you can easily reply to emails by pressing tab to autocomplete sentences. He predicts house prices so that real estate companies can adjust their offers and maximise their profits.

Supervised learning is about making predictions from data. He might seem very clever from the awesome tasks he can pull off; however, if he can do all of this, it is because he has learned using data that carries a golden, and sometimes hard-to-get, piece of information: labels.

No, not the labels that go in your clothes. These labels are much more valuable. They are the pieces of information that tell Supervised Learning algorithms the exact variable that they will try to predict later on.

For example, if we wanted supervised learning to predict house prices, we would need to train it using data that comes with the characteristics of such houses (square meters, rooms, floors, bathrooms, and all that) and, most importantly, the variable which we want to later predict: the house price.
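To make that concrete, here is a minimal sketch of what training such a supervised model could look like with scikit-learn. The feature names and the tiny dataset are invented purely for illustration; a real project would use a much larger, properly prepared dataset.

```python
# A minimal supervised-learning sketch: predicting house prices from labelled data.
# The features and numbers below are invented purely for illustration.
from sklearn.ensemble import RandomForestRegressor

# Each row: [square meters, rooms, floors, bathrooms]
X_train = [
    [80, 3, 1, 1],
    [120, 4, 2, 2],
    [45, 1, 1, 1],
    [200, 5, 2, 3],
]
# The labels: the house prices we want the model to learn to predict.
y_train = [180_000, 320_000, 110_000, 560_000]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Predict the price of a new, unseen house from its features.
new_house = [[95, 3, 1, 2]]
print(model.predict(new_house))
```

The key point is that `fit` receives both the features and the labels (the prices); without those labels, there is nothing for the model to learn to predict.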

Our great Supervised Learning is used to make predictions like:

  • Predicting house prices using house features like the ones we mentioned above.
  • Predicting if a transaction is fraudulent or not using characteristics such as time-stamp, vendor, amount spent, and previous transactions (see the sketch after this list).
  • Predicting future sales using previous sales, trends, and characteristics of the time period which we are interested in.
  • Predicting the next word you are going to type using the previous words you have typed.
  • And many, many more…
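As a hedged illustration of the fraud-detection item above, the same pattern applies to classification: the labels are just "fraudulent" / "legitimate" flags attached to past transactions. The feature encoding below (hour of day, vendor id, amount, recent transaction count) is invented for the sake of the example.

```python
# A minimal fraud-detection sketch: supervised classification on labelled transactions.
# Feature encoding (hour of day, vendor id, amount, transactions in last 24h) is made up.
from sklearn.linear_model import LogisticRegression

X_train = [
    [14, 3, 25.0, 2],
    [2, 7, 980.0, 9],
    [11, 3, 40.0, 1],
    [3, 9, 1500.0, 12],
]
y_train = [0, 1, 0, 1]  # labels: 0 = legitimate, 1 = fraudulent

clf = LogisticRegression()
clf.fit(X_train, y_train)

# Probability that a new transaction is fraudulent.
print(clf.predict_proba([[4, 7, 1200.0, 8]])[0][1])
```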

These predictions, however, can only happen if supervised learning is given, during the training phase, the same kind of information that it will later try to predict, which in the previous four examples is: house prices, transactions already identified as fraudulent or non-fraudulent, recorded sales from previous time periods, and a large text corpus of words in the autocomplete example.

The fact that this information is sometimes costly, unavailable, or outright impossible to obtain is the main drawback of supervised learning.

At the moment, most of the economic value provided by Machine Learning models comes from this family of learning. Saniye Alabeyi, Senior Director Analyst at Gartner, states:

“Through 2022, supervised learning will remain the type of ML utilised most by enterprise IT leaders”

This is so because it provides value in many relevant business scenarios, ranging from fraud detection to sales forecasting or inventory optimisation.

Yet, supervised learning might not always be the best fit for certain problems. This is where its bad little brother comes along: Unsupervised Learning.

The Bad: Unsupervised Learning

Image: icons from FlatIcon and DLpng.

Remember the main problem with Supervised Learning? The costly and valuable labels? Well, unsupervised learning comes along to sort of solve that problem.

His main skill is that he can segment, group, and cluster data, all without needing those annoying labels. He knows how to group customers by their purchase behaviour, separate houses depending on their characteristics, or detect anomalies within a group of data. He is also capable of reducing the dimensionality of our data.
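On the dimensionality-reduction point, a minimal sketch with PCA from scikit-learn could look like the following. The random matrix stands in for a real feature table; the thing to notice is that no labels appear anywhere.

```python
# A minimal dimensionality-reduction sketch with PCA (unsupervised: no labels needed).
# The toy feature matrix is invented for illustration.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 10)  # 100 samples described by 10 features

pca = PCA(n_components=2)    # compress the 10 features into 2 components
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (100, 2)
print(pca.explained_variance_ratio_)  # variance captured by each component
```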

The main takeaway is that it can group and segment data by finding patterns that are common to the different groups, without needing this data to have a specific label.

In our housing example, we would throw our whole dataset (without the house prices) at unsupervised learning, and it would tell us something like: ‘Hey, here you have 5 main groups of houses’:

  • Houses with a garden and a small pool that have room for more than 5 people and that are usually in good neighbourhoods.
  • Small flats with space for a couple, an American kitchen, and a small balcony.
  • Wide ground-floor spaces with very few rooms that look like shops.
  • Apartments of more than 500 square meters with many rooms, more than 4 bathrooms, a dining hall, and a chimney.
  • Top-floor flats for a couple in buildings without an elevator: don’t buy this for grandpa and grandma!

As cool and easy as this might look, it is not as straightforward as it seems.

The output or response of the unsupervised algorithm is not actually a series of texts like the previous ones, but rather the data with its characteristics and the 5 different groups. It is up to us to look at the different groups and extract from them the common characteristics of the houses, which is what allows us to write the previous descriptions.
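A minimal sketch of that workflow, assuming a k-means clustering (scikit-learn's KMeans) and an invented feature matrix, shows why the interpretation step is on us: the algorithm returns plain group ids, and we have to inspect each group ourselves to describe it.

```python
# A minimal unsupervised sketch: clustering houses without price labels,
# then inspecting each group to describe it ourselves. The data is invented.
import numpy as np
from sklearn.cluster import KMeans

# Each row: [square meters, rooms, bathrooms, floor]
X = np.array([
    [80, 3, 1, 2], [85, 3, 1, 3], [45, 1, 1, 5],
    [50, 2, 1, 4], [300, 6, 4, 1], [520, 8, 5, 0],
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
groups = kmeans.fit_predict(X)
print(groups)  # e.g. [0 0 1 1 2 2] -- just group ids, no descriptions

# It is up to us to look at each group and characterise it.
for g in range(3):
    print(f"group {g} mean features:", X[groups == g].mean(axis=0))
```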

We have to analyse the output ourselves and extract valuable insights from it. Unsupervised learning has a narrower reach in industrial and commercial applications than supervised learning at the moment, but it is still able to provide decent commercial value, and it is increasingly in demand.

Lastly, we have the promising one, the ugly cousin who has racked up some amazing achievements but is still seen by many as the black sheep: Reinforcement Learning.

The Ugly: Reinforcement Learning

Image: icons from FlatIcon and DLpng.

Reinforcement learning is a different kind of guy. He works with pretty much no previous data, and still manages to pull off some amazing feats, like beating human experts at chess and video games, or teaching robots how to move in different environments.

He learns to do all this by using a punishment-reward system, a final goal, and a policy. Knowing the goal he wants to achieve, he acts according to the policy that he has learned, and gets rewarded positively or negatively. Then, the policy is updated depending on this reward.

Take the example of chess. The chess-playing reinforcement learning model starts with a very basic policy and the goal of checkmating the opposing king while defending his own king from checks. This large, ultimate goal can be divided into smaller, more short-term goals, like not having his pieces captured, capturing as many opposing pieces as possible, or having control over the centre of the board.

As the algorithm plays more and more games and gets punished and rewarded for certain actions, it will reinforce those actions that have led it to obtain a reward (hence the name Reinforcement Learning).
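To give a concrete, if toy, picture of that punishment-reward loop, here is a minimal tabular Q-learning sketch on an invented five-state corridor. This is not how chess-playing systems are actually built; it only illustrates the core idea of updating a policy's values from rewards.

```python
# A minimal reinforcement-learning sketch: tabular Q-learning on a tiny corridor.
# States 0..4, actions: 0 = left, 1 = right. Reaching state 4 gives reward +1.
# The environment and rewards are invented purely to illustrate the update rule.
import random

n_states, n_actions = 5, 2
Q = [[0.0] * n_actions for _ in range(n_states)]  # learned value of each action in each state
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(200):
    state = 0
    while state != 4:
        # Explore sometimes, otherwise act greedily according to the current Q-values.
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
        reward = 1.0 if next_state == 4 else 0.0  # the reward signal
        # Reinforce the action depending on the reward it produced.
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print(Q)  # the right-moving actions end up with higher values
```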

In the game domain, these kinds of algorithms have excelled. Let's try to discover why.

It is said that to become an expert on a certain topic you have to dedicate about 1000 hours to it. People who excel at video games, chess, or any other task of that nature, aside from having a natural ability for it, dedicate an insane amount of time in order to achieve mastery. Playing game after game of chess, day after day, will inevitably make you better.

However, there is a limit to the number of games of chess you can play in a day. Each day only has 24 hours, and you can only play at a certain speed: first of all, you have to think while you are playing, and then you have to eat, sleep, and take care of yourself.

Software using reinforcement learning doesn’t have to sleep or eat, and it can play much, much faster than a human. By doing this, and by having reinforcement learning systems play against themselves, they can reach the outstanding levels of performance that have allowed them to beat the best human players in many games.

As cool as reinforcement learning might seem, it still has limited practical applications: because of what we have previously discussed, RL is best used in areas that can be fully simulated, like games, which makes its reach in the business arena quite limited. However, it is a family of learning for which research is rapidly growing, and it is very promising for tasks like robotics or autonomous vehicles.

Conclusion

In this post, we have seen what each of the three main members of the machine learning family can do best. There are other members of this family, like semi-supervised learning or self-supervised learning, but we will speak about those in the future.

Supervised, unsupervised, and reinforcement learning can and should be used to complete different kinds of tasks. There is no silver bullet that solves every problem, and problems of different natures have to be faced with different tools.

Despite the difference in their commercial and industrial uses, all of these three branches are very relevant for building efficient and high-value Artificial Intelligence applications, and they are increasingly being used simultaneously for solving incredibly complex tasks and tackling new challenges.

That is it! As always, I hope you enjoyed the post. If you did, feel free to follow me on Twitter at @jaimezorno. Also, you can take a look at my other posts on Data Science and Machine Learning here, and subscribe to my newsletter to get awesome goodies and notifications about new posts!

Translated from: https://towardsdatascience.com/the-good-the-bad-and-the-ugly-supervised-unsupervised-and-reinforcement-learning-2ccf814c6bab
