Optional Lab: Model Representation (Linear Regression with One Variable)

gravity_w

Last modified 2023-12-18 11:40:44


Category: Machine Learning. Tags: linear regression, algorithms, regression, machine learning, notes, python, numpy

First published 2023-12-17 19:02:18

Article link: https://blog.csdn.net/gravity_wy/article/details/135044376


Goals

In this lab you will learn to implement the model $f_{w,b}$ for linear regression with one variable.

Tools

In this lab you will make use of:

Numpy, a popular library for scientific computing
Matplotlib, a popular library for plotting data

import numpy as np
import matplotlib.pyplot as plt
plt.style.use('./deeplearning.mplstyle')

plt.style.use() applies one of matplotlib's built-in style sheets or a custom one, which makes it easy to restyle the generated figures.

# List all available style sheets
print(plt.style.available)

Problem Statement

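The task is house price prediction, using a data set of two points: (1.0, 300) and (2.0, 500).
x is the size (in 1000 square feet, sqft) and y is the price (in 1000s of dollars).
From these two points, find a suitable linear regression model and use it to predict the price of a 1200 sqft house.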

Create the x and y variables as one-dimensional NumPy arrays and print them with f-strings.

# x_train is the input variable (size in 1000 square feet)
# y_train is the target (price in 1000s of dollars)
x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])
print(f"x_train = {x_train}")
print(f"y_train = {y_train}")

Output:

x_train = [1. 2.]
y_train = [300. 500.]

Number of training examples m

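Use m to denote the number of training examples.
NumPy arrays have a .shape attribute: x_train.shape returns a Python tuple, and x_train.shape[0] is the length of the array, which is the number of examples.
Alternatively, the len() function returns the same length.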

# m is the number of training examples
print(f"x_train.shape: {x_train.shape}")
m = x_train.shape[0]
print(f"Number of training examples is: {m}")

# m is the number of training examples
m = len(x_train)
print(f"Number of training examples is: {m}")

Output:

x_train.shape: (2,)
Number of training examples is: 2

Number of training examples is: 2
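
To make the difference between .shape and len() concrete, here is a small sketch. The 2-D array x_2d is a hypothetical example that does not appear in the lab; it is only there to show that len() reports the first dimension, while .shape reports all of them.

```python
import numpy as np

x_train = np.array([1.0, 2.0])          # 1-D array from the lab
x_2d = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])      # hypothetical 2-D array, not part of the lab

# .shape is a tuple with one entry per dimension
shape_1d = x_train.shape
shape_2d = x_2d.shape

# len() only reports the size of the first dimension
m_from_shape = x_train.shape[0]
m_from_len = len(x_train)
rows_2d = len(x_2d)

print(shape_1d, shape_2d)
print(m_from_shape, m_from_len, rows_2d)
```

For a 1-D array such as x_train the two approaches agree, which is why the lab can use either one.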

Training example x_i, y_i

Use $(x^{(i)}, y^{(i)})$ to denote the $i^{th}$ training example. Since Python is zero-indexed, $(x^{(0)}, y^{(0)})$ is (1.0, 300.0) and $(x^{(1)}, y^{(1)})$ is (2.0, 500.0).
To get a single element of an array, index it; for example, x_train[0] is the 0th x value.

i = 0 # Change this to 1 to see (x^1, y^1)

x_i = x_train[i]
y_i = y_train[i]
print(f"(x^({i}), y^({i})) = ({x_i}, {y_i})")

Output:

(x^(0), y^(0)) = (1.0, 300.0)
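
Rather than changing i by hand, the same indexing can be applied to every example in a loop. The sketch below is an illustration, not part of the lab; the list name examples is made up here.

```python
import numpy as np

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

# Collect every (x^(i), y^(i)) pair, using zero-based indices as in the lab
examples = []
for i in range(x_train.shape[0]):
    examples.append((float(x_train[i]), float(y_train[i])))
    print(f"(x^({i}), y^({i})) = ({x_train[i]}, {y_train[i]})")
```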

Plotting the data

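The scatter() function from matplotlib can be used to draw a scatter plot of the two points.
s: marker size; the default is rcParams['lines.markersize'] ** 2 (36 with matplotlib's default settings); it can also be an array giving one size per point
c: marker color: b-blue g-green r-red c-cyan m-magenta y-yellow k-black w-white
marker: marker shape; common values: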

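Marker	Shape	Marker	Shape
.	point	*	star
,	pixel	h	hexagon (type 1)
o	circle	H	hexagon (type 2)
v	triangle down	+	plus
^	triangle up	x	x
<	triangle left	D	diamond
>	triangle right	d	thin diamond
s	square	|	vertical line
p	pentagon	_	horizontal line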

# Plot the data points
plt.scatter(x_train, y_train, marker='x', c='r')
# Set the title
plt.title("Housing Prices")
# Set the y-axis label
plt.ylabel('Price (in 1000s of dollars)')
# Set the x-axis label
plt.xlabel('Size (1000 sqft)')
plt.show()

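[Figure: scatter plot of the two training points]
When the mouse hovers over the figure, the x and y coordinates of the cursor are shown in the top-right corner in real time.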

Model function

As described in lecture, the model function for linear regression (which is a function that maps from x to y) is represented as

$f_{w,b}(x^{(i)}) = wx^{(i)} + b$

The formula above is how you can represent straight lines - different values of $w$ and $b$ give you different straight lines on the plot.

Let's start with $w = 100$ and $b = 100$.

w = 100
b = 100
print(f"w: {w}")
print(f"b: {b}")

Output:

w: 100
b: 100

Now, let's compute the value of $f_{w,b}(x^{(i)})$ for your two data points. You can explicitly write this out for each data point as:

for $x^{(0)}$ , f_wb = w * x[0] + b

for $x^{(1)}$ , f_wb = w * x[1] + b

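For a large number of data points, writing each one out is repetitive and redundant, so a for loop can be used to compute the function's output values.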

Note: The argument description (ndarray (m,)) describes a NumPy n-dimensional array of shape (m,). (scalar) describes an argument without dimensions, just a magnitude. np.zeros(n) will return a one-dimensional NumPy array with $n$ entries.

def compute_model_output(x, w, b):
    """
    Computes the prediction of a linear model
    Args:
      x (ndarray (m,)): Data, m examples 
      w,b (scalar)    : model parameters  
    Returns
      f_wb (ndarray (m,)): model predictions
    """
    m = x.shape[0]
    f_wb = np.zeros(m)
    for i in range(m):
        f_wb[i] = w * x[i] + b
        
    return f_wb
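
As an aside, the same predictions can be computed without an explicit loop, because NumPy applies scalar arithmetic to every element of an array. The function name compute_model_output_vectorized is hypothetical and not part of the lab; this is a sketch of an alternative, not a replacement for the loop version above.

```python
import numpy as np

def compute_model_output_vectorized(x, w, b):
    """Hypothetical vectorized counterpart of compute_model_output."""
    # w and b are scalars, so NumPy applies them to every element of x
    return w * np.asarray(x, dtype=float) + b

x_train = np.array([1.0, 2.0])
f_wb_vec = compute_model_output_vectorized(x_train, 100, 100)
print(f_wb_vec)
```

With $w = 100$ and $b = 100$ this gives 200 for the first example and 300 for the second, the same values the loop produces.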

Now let’s call the compute_model_output function and plot the output.

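The plot() function from matplotlib can be used to draw a line plot through the two points.
The call has the form plt.plot(x, y, "format string", keyword = value). The format string can contain up to three parts: color, marker and line style.
For example, in plt.plot(x, y, "ob:"), "b" is blue, "o" is a circle marker and ":" is a dotted line.
Keyword arguments can also be used to control properties, such as color = "blue", linewidth = 20, marker = "o", markersize = 50, markerfacecolor = "red", markeredgewidth = 6, markeredgecolor = "grey", linestyle = "solid" or linestyle = "-", label = "our"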

Parameter	Line style	Parameter	Line style
: dotted	dotted line	-- dashed	dashed line
-. dashdot	dash-dot line	- solid	solid line

tmp_f_wb = compute_model_output(x_train, w, b)

# Plot our model prediction
plt.plot(x_train, tmp_f_wb, c='b',label='Our Prediction')

# Plot the data points
plt.scatter(x_train, y_train, marker='x', c='r',label='Actual Values')

# Set the title
plt.title("Housing Prices")
# Set the y-axis label
plt.ylabel('Price (in 1000s of dollars)')
# Set the x-axis label
plt.xlabel('Size (1000 sqft)')
plt.legend() # Add a legend; its position is chosen automatically
plt.show()

[Figure: the model's prediction line plotted against the two training points]

As you can see, setting $w = 100$ and $b = 100$ does not result in a line that fits our data.
Finding suitable values of $w$ and $b$ requires a cost function.
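
The mismatch can also be checked numerically by comparing predictions with targets. The helper residuals below is hypothetical (it is not defined in the lab) and only subtracts the targets from the predictions; it is a simple check, not a cost function.

```python
import numpy as np

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

def residuals(x, y, w, b):
    """Hypothetical helper: prediction minus target for each example."""
    return (w * x + b) - y

res_first = residuals(x_train, y_train, 100, 100)   # the starting guess
res_second = residuals(x_train, y_train, 200, 100)  # the values used in the Prediction section

print(res_first)
print(res_second)
```

The starting guess is off by 100 and 200 (thousand dollars) on the two examples, while $w = 200$ and $b = 100$ leave no error on either.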

Prediction

Now that we have a model, we can use it to make our original prediction. Let's predict the price of a house with 1200 sqft. Since the units of $x$ are in 1000s of sqft, $x$ is 1.2. The code below uses $w = 200$ and $b = 100$, which reproduce both training points.

w = 200
b = 100
x_i = 1.2
cost_1200sqft = w * x_i + b

print(f"${cost_1200sqft:.0f} thousand dollars")

Output:

$340 thousand dollars
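
As a check on the values $w = 200$ and $b = 100$ used above: with only two training points, the line through both can be worked out by hand. This is an aside that works for this tiny data set, not the general fitting procedure.

$$w = \frac{y^{(1)} - y^{(0)}}{x^{(1)} - x^{(0)}} = \frac{500 - 300}{2.0 - 1.0} = 200, \qquad b = y^{(0)} - w x^{(0)} = 300 - 200 \cdot 1.0 = 100$$

$$f_{w,b}(1.2) = 200 \cdot 1.2 + 100 = 340$$

The result is 340 thousand dollars, which matches the printed output.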

Congratulations!

In this lab you have learned:

Linear regression builds a model which establishes a relationship between features and targets.
In the example above, the feature was house size and the target was house price.
For simple linear regression, the model has two parameters $w$ and $b$ whose values are 'fit' using training data.
Once a model's parameters have been determined, the model can be used to make predictions on novel data.
