What Are the Practical Benefits of Machine Learning?

weixin_26746401

Tags: machine learning, python, artificial intelligence, java

Original article: https://towardsdatascience.com/what-are-the-practical-benefits-of-machine-learning-c9820dbdd67c

Some of my previous introductory posts on machine learning and data science were a bit technical. The purpose of this post, however, is to explain some practical use-cases of ML purely from the perspective of a non-technical layperson with no previous exposure to it. To satisfy your curiosity, I will also mention the specific ML algorithms generally applicable to each use-case, in case you want to learn more about them.

What types of problems does ML help us with? Irrespective of the specific domain, what answers or actionable insights does it offer? Instead of the ‘how’, our focus here will be more on the ‘what’ and the ‘why’.

What is This? A or B?

This family of ML algorithms predicts which one of only two possible categories an observation belongs to. There is no third option. Suppose management wants to predict which of your existing customers will churn. The answer can only be whether a specific customer will churn or not. Other practical examples include:

Is this email spam or not?
Will this customer default or not?
Are these symptoms indicative of a specific disease or not?
Will this customer go through with a purchase or not?
Is this an image of a boy or a girl?

This task is formally known as Binary Classification; relevant algorithms include:

Logistic Regression
Support Vector Machine
k-Nearest Neighbors
Classification Decision Tree
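
If you are curious what one of these looks like in code, here is a minimal sketch of logistic regression applied to the churn question above. It uses only the Python standard library and plain gradient descent; the toy data, the two features (support tickets last month, years as a customer), and the function names are all my own assumptions for illustration, not a production recipe.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Estimated probability that the observation belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(X, y):
            error = predict_proba(features, weights, bias) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


# Hypothetical toy data: [support tickets last month, years as a customer]
X = [[5, 1], [4, 1], [6, 2], [5, 0], [0, 5], [1, 4], [0, 6], [1, 5]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = churned, 0 = stayed

weights, bias = train_logistic_regression(X, y)

# Only two answers are possible: churn (True) or not (False).
will_churn = predict_proba([6, 1], weights, bias) >= 0.5
```

The model outputs a probability, and the 0.5 cut-off turns it into one of the two categories; in practice you would likely reach for a library implementation and tune that threshold to your use-case.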

What is This? A or B or C or D (Or Something Else)?

This is an extension of binary classification in which the number of potential categories can be more than two. Suppose you are working on a face recognition model; the person in a given picture can be any of the individuals in your database. The number of possible answers is limited only by the categories present in the data used during model development. Other practical examples include:

Optical Character Recognition: which character is this?
Which animal is in this image?
Which genre does this movie belong to?
Sentiment Analysis: what is the feeling associated with this tweet?
Whose voice is it in this audio recording?

This task is formally known as Multi-Class Classification; relevant algorithms include:

Random Forests
Classification Decision Tree
XGBoost
k-Nearest Neighbors
Artificial Neural Networks
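
To make the idea concrete, here is a small sketch of k-Nearest Neighbors choosing among three categories, loosely following the "which animal is this?" example. Real image models work on pixels; the two numeric features (weight in kg, height in cm), the data points, and the function name below are hypothetical stand-ins I chose to keep the example short.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: (weight in kg, height in cm)
animals = [
    (4.0, 25.0), (5.0, 23.0), (3.5, 24.0),           # cats
    (30.0, 60.0), (25.0, 55.0), (35.0, 65.0),        # dogs
    (500.0, 160.0), (450.0, 150.0), (550.0, 165.0),  # horses
]
labels = ["cat"] * 3 + ["dog"] * 3 + ["horse"] * 3

prediction = knn_predict(animals, labels, (28.0, 58.0))
```

Nothing in the function is specific to three classes: adding a fourth animal to the training data is enough for it to become a possible answer, which is exactly the sense in which the answers are limited by the data used during development.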

How Much or How Many of Something To Expect?

This family of ML algorithms predicts a quantity as a continuous number (i.e., the prediction can be any of an unlimited number of possible outcomes). There are no fixed categories to predict. Take predicting sales volume for the next quarter: the prediction can be 1,000 units, 10,000 units, 1,200 units, or any other non-negative number.

The output of these algorithms can be any real number (positive, negative, zero, or fractional); however, your specific use-case determines whether negatives or fractions can be expected and accepted. For example, a sales forecast cannot be negative.

Other practical use-cases of this class of algorithms include:

What will be tomorrow’s temperature?
How many prospects can we sign up as customers in the next quarter?
What will be our energy consumption next month?
How long will it take for an event to occur?

This task is formally known as Regression; relevant algorithms include:

Linear Regression
Regression Decision Tree
XGBoost
Artificial Neural Networks
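
As an illustration of the sales example, here is a minimal sketch of simple linear regression fitted with the ordinary least-squares formulas. The quarterly figures are made up and deliberately lie on a perfect line so the result is easy to check by hand; the function names are my own. Note how the forecast is clipped at zero, since a raw regression output can be negative but a sales forecast cannot.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_sales(quarter, slope, intercept):
    """Predict units sold, clipping at zero because sales cannot be negative."""
    return max(0.0, slope * quarter + intercept)


# Hypothetical history: units sold in quarters 1 to 5
quarters = [1, 2, 3, 4, 5]
units_sold = [1000, 1100, 1200, 1300, 1400]

slope, intercept = fit_line(quarters, units_sold)
next_quarter = forecast_sales(6, slope, intercept)
```

With this toy history the fitted line is 900 + 100 per quarter, so the forecast for quarter 6 is 1,500 units. Real data is noisy, and the fitted line would then only approximate the history.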

Is This Data Normal or Abnormal?

Often, we are more interested in whether a specific observation is atypical, abnormal, or anomalous, or merely normal and usual. We may have historical observations already labeled as abnormal or not. Alternatively, no such historical labeling may exist, in which case an ML algorithm is used to detect outliers.

Typical use-cases include:

Is this purchase materially different from the customer’s past purchases?
Is this traffic pattern from a computer network typical?
Are these outputs from a piece of industrial equipment atypical?

This task is formally known as Outlier or Anomaly Detection; relevant algorithms include:

Isolation Forest
Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
Z-Scores (not technically an ML algorithm, but a statistical test to identify outliers)
One-Class Support Vector Machine
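
The simplest item on that list, the z-score, fits in a few lines. The sketch below flags a customer's purchase as unusual when it lies more than a chosen number of standard deviations from the mean of their purchases. The purchase amounts and the threshold of 2.0 are assumptions of mine; a sensible threshold depends on your data.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Hypothetical purchase amounts for one customer
purchases = [42, 38, 45, 40, 39, 41, 43, 37, 44, 400]

unusual = find_outliers(purchases)
```

One caveat worth knowing: a large outlier inflates the standard deviation it is measured against, so in small samples the z-score of even an extreme value stays modest. That is part of why the other algorithms on the list exist.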

How Can We Organize this Data?

Are there underlying characteristics that can be used to organize data into specific groups (also known as clusters or segments)? These characteristics are not known to us in advance, and often even the number of potential clusters is unknown. Clustering your data may assist with further analysis or with developing cluster-specific strategies.

For example, we may segment our customers into distinct groups based on their age, gender, purchase history, etc., to devise segment-specific sales, marketing, or promotion strategies.

Other practical use-cases of this class of algorithms include:

Which of our subscribers like similar movies or songs?
How can we categorize several text documents or audio recordings?
How can we better segment our products or services?
Which model of a specific machine is more prone to breakdowns?

This task is formally known as Clustering; relevant algorithms include:

k-Means Clustering
Mean-Shift Clustering
Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
Agglomerative Hierarchical Clustering
Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH)
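
Here is a bare-bones sketch of k-means applied to the customer segmentation example. Notice that no labels are supplied: the algorithm only sees the points and a starting guess for the cluster centers. The customer data (age, purchases per month), the starting centroids, and the function name are hypothetical; library implementations pick starting points more carefully and usually scale the features first.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid  # keep the old centroid if a cluster is empty
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (age, purchases per month)
customers = [(22, 9), (25, 10), (24, 8), (58, 2), (61, 1), (60, 3)]

# Assumed starting guess: two clusters seeded from the first and last customer.
centroids, segments = k_means(customers, [(22, 9), (60, 3)])
```

On this toy data the two segments come out as younger, frequent buyers and older, occasional buyers. The number of clusters is fixed by how many starting centroids you pass in, which reflects the point above that the right number of clusters is often something you have to discover.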

What To Do Next?

This is where ML gets really interesting: the algorithm not only predicts but also tells us what to do given its prediction. This family of ML algorithms might not yet be mature enough for all use-cases; however, substantial progress has been made recently thanks to advanced deep learning algorithms and the greater processing power available to us.

These algorithms rely on trial and error and multiple feedback loops, and are not as heavily dependent on data as other algorithms. They are mostly applicable in automated systems, where the machine itself usually takes the recommended action.

This approach is formally known as Reinforcement Learning and is often implemented through deep neural networks.

Some practical applications of reinforcement learning include:

What should a robot in an industrial setting do next, given its current situation?
Should we adjust the temperature or leave it untouched?
How should a self-driving car react (accelerate, decelerate, apply brakes, etc.) given the hazard ahead?
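
To show the trial-and-error loop in miniature, here is a sketch of tabular Q-learning for the temperature question above. It is a deliberately tiny stand-in: it uses a lookup table rather than a deep neural network, and the five temperature levels, the target level, the reward, and all parameter values are assumptions I made up for the example. The agent is given no data set; it tries random actions, observes a reward, and updates its estimates.

```python
import random

TEMPS = range(5)      # hypothetical discretized temperature levels 0..4
TARGET = 2            # assumed comfortable level
ACTIONS = (-1, 0, 1)  # cool, leave untouched, heat


def step(temp, action):
    """Apply an action and return the new temperature and the reward (feedback)."""
    new_temp = min(max(temp + action, 0), 4)
    reward = -abs(new_temp - TARGET)  # the closer to target, the better
    return new_temp, reward


def train(episodes=500, steps=20, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values by trial and error with the Q-learning update rule."""
    rng = random.Random(seed)
    q = {(t, a): 0.0 for t in TEMPS for a in ACTIONS}
    for _ in range(episodes):
        temp = rng.choice(list(TEMPS))
        for _ in range(steps):
            action = rng.choice(ACTIONS)  # pure random exploration, for simplicity
            new_temp, reward = step(temp, action)
            best_next = max(q[(new_temp, a)] for a in ACTIONS)
            q[(temp, action)] += alpha * (reward + gamma * best_next - q[(temp, action)])
            temp = new_temp
    return q


def best_action(q, temp):
    """The action the learned table recommends at a given temperature."""
    return max(ACTIONS, key=lambda a: q[(temp, a)])


q_table = train()
policy = {t: best_action(q_table, t) for t in TEMPS}
```

After training, the learned policy heats when the temperature is below the target, cools when it is above, and leaves it untouched at the target, even though nobody told it that rule.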

Conclusion

There you have it: a practical, no-nonsense introduction, in plain language, to the functional scenarios where ML assists us.

Feel free to comment or reach out to me if you would like to discuss anything further related to machine learning, data analytics, risk scoring, and financial analysis.

Till next time, code on!