Machine Learning Process

weixin_26752765

Original article: https://towardsdatascience.com/machine-learning-process-7beab5c4f31b

If you would like to get a general introduction to Machine Learning before this, check out this article:

Now that we understand what Machine Learning is, let us look at how it is applied to solve a problem.

This is the basic process used to apply machine learning to any problem:

Data Gathering

The first step in solving any machine learning problem is to gather relevant data. The data can come from different sources and in different formats, such as plain text, categorical, or numerical values. Data gathering is important because the outcome of this step directly affects the nature of our problem.

In most cases, data is not handed to us ready-made on a silver platter; the data we have decided is relevant is usually not available right away. We may well have to perform some sort of exercise or controlled experiment to gather data that we can work with. We must also make sure that the data we collect comes from legitimate and legal processes, so that all the parties involved are well aware of what is being collected.

Let us, for the purpose of this article, assume that we have gathered data about cars and that we are trying to predict the price of a new car with the help of machine learning.

Data Preprocessing

Now that we have gathered data that is relevant to the problem at hand, we must bring it to a homogeneous state. The present form of our data could include datasets of various types: maybe a table made up of a thousand rows and multiple columns of car data, or maybe pictures of cars from different angles. It is always advisable to keep things simple and work with data of one particular type. That is, before we start working on our algorithm, we should decide whether we want to work with image data, text data, or, if we are feeling a little too adventurous, video data!

*Types of Data.* Photo by Author.

Like every computer program, Machine Learning algorithms only understand 1s and 0s. So, in order to run any such algorithm, we first have to convert the data into a machine-readable format. It simply won't understand if we put on a slideshow of our pictures! We can go with any type of data (numerical, image, video, or text), but we will have to configure it so that it is machine-understandable. We make sure this happens by encoding the data: a process in which we take all the types of data and represent them numerically.
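To make the idea of encoding a little more concrete, here is a minimal sketch in plain Python. It uses one-hot encoding, which is only one of several possible encoding schemes and is not named in the article itself. The car records, the column names, and the helper `one_hot_encode` are all made up for illustration.

```python
# Hypothetical car records; the column names and values are invented for this example.
cars = [
    {"engine_power": 120, "color": "red", "car_type": "sedan"},
    {"engine_power": 300, "color": "blue", "car_type": "suv"},
    {"engine_power": 90, "color": "red", "car_type": "hatchback"},
]


def one_hot_encode(records, column):
    """Replace a categorical column with one 0/1 column per distinct value."""
    # Sort the distinct values so the new columns come out in a stable order.
    categories = sorted({row[column] for row in records})
    encoded = []
    for row in records:
        # Copy every column except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Encode both text columns so that every remaining value is a number.
encoded_cars = one_hot_encode(one_hot_encode(cars, "color"), "car_type")
print(encoded_cars[0])
```

After this step every value in `encoded_cars` is numeric, which is the property the paragraph above asks for. In practice a library routine would usually do this work; the hand-written helper is only meant to show what happens to the data.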

For a simple and comprehensible introduction to Data Preprocessing and all the steps involved, check out this article:

Train and Test Data

Before we start building a Machine Learning model, we first have to identify our features and decide on our goal. Features are the attributes of our data that tell us about the different entities in it. For instance, we could have a huge dataset about cars and want to predict the price of a new car using machine learning. With the cars being the entities, the features in this case might be the engine power, mileage, top speed, color, seating capacity, type of car, and so on. The goal, or target variable, would be the price of the car.
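As a rough sketch of what separating features from the target can look like, the plain-Python snippet below splits a tiny car table into a feature matrix and a target list. The records, their values, and the names `TARGET`, `X`, and `y` are assumptions made for this example and do not come from the article.

```python
# Hypothetical car dataset; every value here is invented for illustration.
cars = [
    {"engine_power": 120, "mileage": 18.0, "seating_capacity": 5, "price": 20000},
    {"engine_power": 300, "mileage": 10.5, "seating_capacity": 7, "price": 55000},
    {"engine_power": 90, "mileage": 22.0, "seating_capacity": 4, "price": 12000},
]

# The target variable is the column we want the model to predict.
TARGET = "price"

# Every other column is treated as a feature.
feature_names = [name for name in cars[0] if name != TARGET]

# X holds one row of feature values per car; y holds the matching prices.
X = [[row[name] for name in feature_names] for row in cars]
y = [row[TARGET] for row in cars]

print(feature_names)
print(X[0], y[0])
```

Keeping the features and the target in separate structures with matching row order is a common convention, since most learning routines expect the inputs and the values to predict as two distinct arguments.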
