Machine Learning Study Notes 1.1.1.4.2: Linear regression model, part 2

Linear regression model, part 2

Let's look in this video at the process of how supervised learning works. A supervised learning algorithm takes a dataset as input, but what exactly does it do with it, and what does it output? Let's find out. Recall that a training set in supervised learning includes both the input features, such as the size of the house, and the output targets, such as the price of the house. The output targets are the right answers that the model will learn from.

To train the model, you feed the training set, both the input features and the output targets, to your learning algorithm. Your supervised learning algorithm will then produce some function. We'll write this function as lowercase f, where f stands for function. Historically, this function used to be called a hypothesis, but I'm just going to call it a function f in this class. The job of f is to take a new input x and output an estimate or a prediction, which I'm going to call y-hat, written as the variable y with a little hat symbol on top.

In machine learning, the convention is that y-hat is the estimate or the prediction for y. The function f is called the model. X is called the input or the input feature, and the output of the model is the prediction, y-hat. The model's prediction is the estimated value of y. When the symbol is just the letter y, it refers to the target, which is the actual true value in the training set. In contrast, y-hat is an estimate; it may or may not equal the actual true value. For example, if you're helping your client sell their house, the true price of the house is unknown until it sells. Your model f, given the size, outputs a price, which is the estimate, that is, a prediction of what the true price will be. Now, when we design a learning algorithm, a key question is: how are we going to represent the function f? In other words, what is the math formula we're going to use to compute f?

For now, let's stick with f being a straight line. Your function can be written as f_w,b(x) = w * x + b. I'll define w and b soon, but for now, just know that w and b are numbers, and the values chosen for w and b will determine the prediction y-hat based on the input feature x. This f_w,b(x) means f is a function that takes x as input, and depending on the values of w and b, f will output some value as the prediction y-hat. As an alternative to writing f_w,b(x), I'll sometimes just write f(x), without explicitly including w and b in the subscript; it's just simpler notation that means exactly the same thing as f_w,b(x). Let's plot the training set on a graph where the input feature x is on the horizontal axis and the output target y is on the vertical axis.
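As a quick sketch of this model in code, the function f_w,b can be written as an ordinary Python function. (The input size and the parameter values below are made-up numbers for illustration only.)

```python
def f_wb(x, w, b):
    """Linear model: return the prediction y-hat = w * x + b for input feature x."""
    return w * x + b

# Hypothetical example: a house of size 1.2 (in 1000s of square feet),
# with parameters w = 200 and b = 100 (prices in $1000s)
y_hat = f_wb(1.2, 200.0, 100.0)
print(y_hat)  # prints 340.0
```

With these made-up parameter values, the model predicts a price of 340, that is, $340,000, for a 1,200 square-foot house.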

Remember, the algorithm learns from this data and generates a best-fit line, like maybe this one here. This straight line is the linear function f_w,b(x) = w * x + b, or more simply, dropping the w and b, f(x) = wx + b. Here's what this function is doing: it's making predictions for the value of y using a straight-line function of x. You may ask, why are we choosing a linear function, where linear function is just a fancy term for a straight line, instead of some non-linear function like a curve or a parabola? Well, sometimes you'll want to fit more complex non-linear functions as well, like a curve. But since this linear function is relatively simple and easy to work with, let's use a line as a foundation that will eventually help you get to more complex models that are non-linear. This particular model has a name: it's called linear regression. More specifically, this is linear regression with one variable, where the phrase one variable means that there's a single input variable or feature x, namely the size of the house.
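To make the "predictions for the value of y" idea concrete, here's a minimal sketch of applying the straight-line model to an entire training set at once with NumPy. The training data and the parameter values are made-up for illustration.

```python
import numpy as np

# Hypothetical training set: sizes in 1000s of sqft, prices in $1000s
x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

w, b = 200.0, 100.0  # example parameter values

# The straight-line model f(x) = w*x + b applied to every example at once
y_hat = w * x_train + b
print(y_hat)  # with these values, the line passes through both training points
```

Because NumPy broadcasts the scalar parameters over the array, this one line computes y-hat for every training example, which is exactly what plotting the fitted line over the data does visually.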

Another name for a linear model with one input variable is univariate linear regression, where uni means one in Latin and variate means variable. Univariate is just a fancy way of saying one variable. In a later video, you'll also see a variation of regression where you'll want to make a prediction based not just on the size of a house, but on a bunch of other things you may know about the house, such as the number of bedrooms and other features.

By the way, when you're done with this video, there is another optional lab. You don't need to write any code; just review it, run the code, and see what it does. It will show you how to define a straight-line function in Python. The lab will let you choose the values of w and b to try to fit the training data. You don't have to do the lab if you don't want to, but I hope you play with it when you're done watching this video. That's linear regression. In order to make this work, one of the most important things you have to do is construct a cost function. The idea of a cost function is one of the most universal and important ideas in machine learning, used both in linear regression and in training many of the most advanced AI models in the world. Let's go on to the next video and take a look at how you can construct a cost function.
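In the spirit of the optional lab described above, here's a sketch of trying a few values of w and b by hand and seeing how close each resulting line comes to the training targets. The data and the candidate parameter values are made-up for illustration.

```python
import numpy as np

# Hypothetical training set: sizes in 1000s of sqft, prices in $1000s
x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

# Candidate (w, b) pairs to try by hand, as the lab has you do interactively
for w, b in [(100.0, 100.0), (150.0, 100.0), (200.0, 100.0)]:
    y_hat = w * x_train + b
    worst_miss = np.max(np.abs(y_hat - y_train))  # largest prediction error
    print(f"w={w}, b={b}: predictions={y_hat}, worst miss={worst_miss}")
```

Here the pair (200.0, 100.0) fits both points exactly. The cost function introduced in the next video turns this kind of eyeballing into a single number that the learning algorithm can minimize automatically.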
