P1 [Machine Learning] Introduction to Machine Learning

张小怪的碗

Column: Machine Learning [Andrew Ng] course notes. Tags: machine learning, supervised learning, unsupervised learning

Article link: https://blog.csdn.net/Rosamund233/article/details/120236258

1. What is Machine Learning

2. Supervised Learning

2.1 Interesting real-life examples

2.2 Definition of supervised learning

3. Unsupervised Learning

3.1 Definition of unsupervised learning

3.2 Examples of clustering algorithms

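1. What is Machine Learning

Machine learning has two well-known definitions. The earlier one comes from Arthur Samuel, who had a computer play tens of thousands of checkers games against itself until it learned how to play. The other comes from Tom Mitchell: let E denote the experience gained during learning, T the target task, and P the measure used to evaluate performance. A program is learning if its performance on T, as observed through P, improves with E.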

Two definitions of Machine Learning are offered. Arthur Samuel described it as: "the field of study that gives computers the ability to learn without being explicitly programmed." This is an older, informal definition.

Tom Mitchell provides a more modern definition: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

Example: playing checkers.

E = the experience of playing many games of checkers.

T = the task of playing checkers.

P = the probability that the program will win the next game.
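To make the roles of E, T, and P concrete, here is a minimal toy sketch. It is not the checkers program; the task, the names (`HIDDEN_THRESHOLD`, `learn_threshold`, `accuracy`), and all numbers are hypothetical and chosen only for illustration. T is deciding whether an integer lies at or above a hidden threshold, E is a stream of labeled examples, and P is accuracy on a fixed test range.

```python
# Toy illustration of Mitchell's E / T / P. Everything here is hypothetical.
HIDDEN_THRESHOLD = 62  # used only to generate labels; the learner never reads it


def true_label(x):
    """T: decide whether x is at or above the hidden threshold."""
    return x >= HIDDEN_THRESHOLD


def learn_threshold(examples):
    """Place the boundary midway between the largest negative and the smallest positive example."""
    largest_negative = max(x for x, label in examples if not label)
    smallest_positive = min(x for x, label in examples if label)
    return (largest_negative + smallest_positive) / 2


def accuracy(threshold):
    """P: fraction of the integers 0..99 that the learned threshold classifies correctly."""
    test_points = range(100)
    correct = sum((x >= threshold) == true_label(x) for x in test_points)
    return correct / len(test_points)


# E: a fixed stream of labeled examples, revealed two at a time.
stream = [(x, true_label(x)) for x in [10, 90, 30, 80, 50, 70, 60, 65]]

scores = []
for n in (2, 4, 6, 8):
    learned = learn_threshold(stream[:n])
    scores.append(accuracy(learned))
    print(f"examples seen: {n}, learned threshold: {learned}, accuracy: {scores[-1]:.2f}")
```

On this particular stream the accuracy rises at every step (0.88, 0.93, 0.98, 0.99). That is a property of the chosen examples, not a guarantee of this simple learner.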

In general, any machine learning problem can be assigned to one of two broad classifications:

Supervised learning and Unsupervised learning.

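Supervised learning and unsupervised learning are the two broad categories of machine learning, and also the two most commonly used. In supervised learning, the machine learns from data whose correct results are already known and then produces predictions. In unsupervised learning, the correct results are not known, and the machine has to find the patterns in the data by itself.

2. Supervised Learning

2.1 Interesting real-life examples

[Continuous-value example] Take house price prediction. In the figure, each red X marks a known data point: the price of a house of a given size. The learning algorithm fits a model to these data, which may be a straight line or a curve. With the fitted model we can predict the approximate price for a given size.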
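The straight-line case can be sketched with ordinary least squares. This is only one possible learning algorithm for the example above, and the sizes and prices below are made-up numbers, not data from the course; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: house size (square meters) and price (thousands).
sizes = [50, 80, 100, 120, 150]
prices = [150, 200, 260, 300, 370]

slope, intercept = fit_line(sizes, prices)

# Predict the price of a house whose size was not in the training data.
new_size = 110
predicted_price = slope * new_size + intercept
print(f"price = {slope:.3f} * size + {intercept:.2f}")
print(f"predicted price for size {new_size}: {predicted_price:.1f}")
```

Fitting a curve instead of a line would follow the same pattern with extra input terms such as size squared.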

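[Discrete-value example] Take the problem of deciding in a hospital whether a tumor is malignant. In the figure, the blue O markers for smaller tumors and the red X markers for larger tumors record whether tumors in past cases were benign or malignant. This example has only two classes, although real problems often have more than two. The principle is the same either way: a learning algorithm builds a model from a large amount of past data and uses it to predict the class for a new size.

The input can also have more than one feature. In the next figure, for example, both age and size may affect whether a tumor is benign or malignant. The black line is the classification model fitted by the learning algorithm. Given size and age, we can then predict the probability that a tumor is benign or malignant.

2.2 Definition of supervised learning

With the examples above in mind, the following definition of supervised learning is easier to follow. Supervised learning covers two kinds of problems, regression and classification, which deal with continuous-valued and discrete-valued outputs respectively.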
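A two-feature classifier of this kind can be sketched with logistic regression trained by gradient descent. The notes do not say which algorithm produced the line in the figure, so this choice is an assumption, and the patient records, function names, and hyperparameters below are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical past cases: (tumor size in cm, patient age in years, 1 = malignant, 0 = benign).
cases = [
    (1.0, 30, 0), (1.5, 35, 0), (2.0, 40, 0), (1.2, 50, 0),
    (4.0, 60, 1), (4.5, 55, 1), (5.0, 70, 1), (3.8, 65, 1),
]

mean_size = sum(c[0] for c in cases) / len(cases)
mean_age = sum(c[1] for c in cases) / len(cases)


def features(size, age):
    """Center both inputs and express age in decades so the features have similar scale."""
    return size - mean_size, (age - mean_age) / 10


def train(cases, learning_rate=0.1, steps=2000):
    """Batch gradient descent on the average logistic loss."""
    w_size = w_age = bias = 0.0
    n = len(cases)
    for _ in range(steps):
        g_size = g_age = g_bias = 0.0
        for size, age, label in cases:
            x_size, x_age = features(size, age)
            error = sigmoid(w_size * x_size + w_age * x_age + bias) - label
            g_size += error * x_size
            g_age += error * x_age
            g_bias += error
        w_size -= learning_rate * g_size / n
        w_age -= learning_rate * g_age / n
        bias -= learning_rate * g_bias / n
    return w_size, w_age, bias


def malignant_probability(size, age, params):
    """Predicted probability that a tumor with this size and age is malignant."""
    w_size, w_age, bias = params
    x_size, x_age = features(size, age)
    return sigmoid(w_size * x_size + w_age * x_age + bias)


params = train(cases)
p_large = malignant_probability(4.5, 65, params)
p_small = malignant_probability(1.5, 35, params)
print(f"size 4.5 cm, age 65: P(malignant) = {p_large:.3f}")
print(f"size 1.5 cm, age 35: P(malignant) = {p_small:.3f}")
```

The set of points where the predicted probability equals 0.5 is a straight line in the size/age plane, which plays the role of the black line in the figure.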

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

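3. Unsupervised Learning

3.1 Definition of unsupervised learning

As in the figure, given a data set, an unsupervised learning algorithm can determine that the data contain two distinct clusters and split the data into two groups accordingly. This process is clustering, a typical form of unsupervised learning.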

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results.
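The two-cluster situation described above can be sketched with k-means, one common clustering algorithm. The notes do not name a specific algorithm, so this is an assumed choice; the points are made up, and initializing the centroids from the first `k` points is a simplification that works for this data but is not a robust strategy in general.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iterations=10):
    """Plain k-means: alternate between assigning points and recomputing centroids."""
    centroids = [points[i] for i in range(k)]  # simplistic initialization (assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return labels, centroids


# Hypothetical unlabeled data: two loose groups, listed in mixed order.
points = [(1, 1), (8, 8), (1.5, 2), (9, 9.5), (1, 0.5), (8.5, 9), (2, 1.5), (9, 8)]

labels, centroids = kmeans(points, k=2)
print("labels:", labels)
print("centroids:", centroids)
```

No labels are supplied at any point: the grouping comes only from distances between the data points, which matches the statement that there is no feedback based on prediction results.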