Programming Exercise 1: Linear Regression with one variable

This post shows how to understand the relationship between food truck profits and city population by visualizing the data, then performs linear regression in Python, using gradient descent to iteratively optimize the model parameters and help a restaurant chain choose which city to expand to. You will learn how to compute the cost function, monitor the model's convergence, and finally plot the fitted prediction line against the training data points.

Linear Regression with one variable

Introduction

In this part of the exercise, you will implement linear regression with one variable to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities, and you have data on profits and populations for those cities. You would like to use this data to help you select which city to expand to next. The file ex1data1.txt contains the dataset for our linear regression problem. The first column is the population of a city and the second column is the profit of a food truck in that city. A negative value for profit indicates a loss.

Code

Plotting the Data

Before starting on any task, it is often useful to understand the data by visualizing it. For this dataset, you can use a scatter plot to visualize the data, since it has only two properties to plot (profit and population). (Many other problems that you will encounter in real life are multi-dimensional and can’t be plotted on a 2-d plot.)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


path = 'ex1data1.txt'
data = pd.read_csv(path,header=None,names=['Population','Profit'])
data.head()
# print(data.head())

   Population   Profit
0      6.1101  17.5920
1      5.5277   9.1302
2      8.5186  13.6620
3      7.0032  11.8540
4      5.8598   6.8233

data.describe()
# print(data.describe())
       Population     Profit
count   97.000000  97.000000
mean     8.159800   5.839135
std      3.869884   5.510262
min      5.026900  -2.680700
25%      5.707700   1.986900
50%      6.589400   4.562300
75%      8.578100   7.046700
max     22.203000  24.147000

data.plot(kind='scatter',x='Population',y='Profit',figsize=(12,8))
# plt.show()

Figure 1: Scatter plot of training data

Gradient Descent

In this part, you will fit the linear regression parameters θ to our dataset using gradient descent.

  • Update Equations
    The objective of linear regression is to minimize the cost function
    $$J(\theta_0, \theta_1) = \dfrac{1}{2m}\displaystyle\sum_{i=1}^{m}\left(\hat{y}_i - y_i\right)^2 = \dfrac{1}{2m}\displaystyle\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
    where the hypothesis $h_\theta(x)$ is given by the linear model
    $$h_\theta(x) = \theta^T x = \theta_0 + \theta_1 x_1$$
    Recall that the parameters of your model are the $\theta_j$ values. These are the values you will adjust to minimize the cost $J(\theta)$. One way to do this is to use the batch gradient descent algorithm. In batch gradient descent, each iteration performs the update
    $$\theta_j := \theta_j - \alpha\dfrac{1}{m}\displaystyle\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x^{(i)}_j$$
    (simultaneously update $\theta_j$ for all $j$).
    With each step of gradient descent, your parameters $\theta_j$ come closer to the optimal values that achieve the lowest cost $J(\theta)$.
    Note: We store each example as a row in the X matrix in Python. To take the intercept term ($\theta_0$) into account, we add an additional first column to X and set it to all ones. This allows us to treat $\theta_0$ as simply another 'feature'.
  • Implementation
    In the following lines, we add another dimension to our data to accommodate the $\theta_0$ intercept term. We also initialize the parameters to 0 and the learning rate alpha to 0.01.
  • Computing the cost $J(\theta)$
    As you perform gradient descent to minimize the cost function $J(\theta)$, it is helpful to monitor the convergence by computing the cost. In this section, you will implement a function to calculate $J(\theta)$ so you can check the convergence of your gradient descent implementation.
def computeCost(X,y,theta):
    # J(theta) = 1/(2m) * sum((h_theta(x^(i)) - y^(i))^2),
    # computed with vectorized matrix operations (X, y, theta are np.matrix)
    inner = np.power(((X*theta.T)-y),2)
    return np.sum(inner)/(2*len(X))

# Add a column of ones to the training set so we can compute the cost and gradient with a vectorized solution
data.insert(0,'Ones',1)
# Set X(training data) and y (target variable)
cols = data.shape[1]
X = data.iloc[:,0:cols-1]
y = data.iloc[:,cols-1:cols]
X.head()
y.head()
# print(X.head())
# print(y.head())

# The cost function expects numpy matrices, so convert X and y
# Initialize X, y, and theta
X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix(np.array([0,0]))
'''
Check the matrix dimensions of theta, X, and y, and the cost with theta initialized to zeros:
print(theta)
print(X.shape,theta.shape,y.shape)
print(computeCost(X,y,theta))
'''
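With theta initialized to zeros, the original exercise states that the cost on this dataset should come out to approximately 32.07, so a quick sanity check is possible (a minimal sketch, assuming X, y, and theta are defined as above):

# Sanity check: the cost with theta = [0, 0] should be roughly 32.07,
# the value quoted in the original exercise for ex1data1.txt
print(computeCost(X, y, theta))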

The X.head() and y.head() calls above (made before the conversion to numpy matrices, since a matrix has no head() method) produce:

   Ones  Population
0     1      6.1101
1     1      5.5277
2     1      8.5186
3     1      7.0032
4     1      5.8598

    Profit
0  17.5920
1   9.1302
2  13.6620
3  11.8540
4   6.8233

  • Batch gradient descent
    $$\theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta)$$
def gradientDescent(X,y,theta,alpha,iters):
    temp = np.zeros(theta.shape)              # buffer for the simultaneous update
    parameters = int(theta.ravel().shape[1])  # number of parameters (here 2)
    cost = np.zeros(iters)                    # cost history, one entry per iteration

    for i in range(iters):
        error = (X*theta.T)-y                 # residuals h_theta(x) - y, shape (m,1)

        for j in range(parameters):
            term = np.multiply(error,X[:,j])  # elementwise product with feature j
            temp[0,j] = theta[0,j]-((alpha/len(X))*np.sum(term))

        theta = temp                          # apply the update to all parameters at once
        cost[i] = computeCost(X,y,theta)

    return theta,cost
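The double loop above updates one parameter at a time, mirroring the update formula. The same update can also be written as a single matrix product; below is a minimal vectorized sketch (the name gradientDescentVectorized is ours, not part of the exercise), equivalent under the same np.matrix setup:

def gradientDescentVectorized(X, y, theta, alpha, iters):
    # Vectorized form of the same update:
    # theta := theta - (alpha/m) * ((X*theta.T - y).T * X)
    cost = np.zeros(iters)
    m = len(X)
    for i in range(iters):
        error = (X * theta.T) - y                    # (m,1) residuals
        theta = theta - (alpha / m) * (error.T * X)  # (1,2) simultaneous update
        cost[i] = computeCost(X, y, theta)
    return theta, cost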
  • Debugging

# Initialize the learning rate alpha and the number of iterations to perform
alpha = 0.01
iters = 1000
g,cost = gradientDescent(X,y,theta,alpha,iters)
# print(g)
computeCost(X,y,g)
# print(computeCost(X,y,g))

# Plot the fitted linear model along with the data
x = np.linspace(data.Population.min(),data.Population.max(),100)
f = g[0,0]+(g[0,1]*x)

fig,ax = plt.subplots(figsize=(12,8))
ax.plot(x,f,'r',label='Prediction')
ax.scatter(data.Population,data.Profit,label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
plt.show()

fig,ax =plt.subplots(figsize=(12,8))
ax.plot(np.arange(iters),cost,'r')
ax.set_xlabel('Iterations')
ax.set_ylabel('Cost')
ax.set_title('Error vs. Training Epoch')
plt.show()

Figure 2: Linear regression fit (Predicted Profit vs. Population Size)

Figure 3: Cost over training iterations (Error vs. Training Epoch)
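With the learned parameters, the model can also make point predictions. The original exercise predicts profits for populations of 35,000 and 70,000; a short sketch using the g returned above (recall that in this dataset population is in units of 10,000 people and profit in units of $10,000):

# Predict profit for populations of 35,000 and 70,000
# (features and targets are in units of 10,000s, as in the original exercise)
predict1 = g[0, 0] + g[0, 1] * 3.5
predict2 = g[0, 0] + g[0, 1] * 7.0
print('For population = 35,000, predicted profit = $%.2f' % (predict1 * 10000))
print('For population = 70,000, predicted profit = $%.2f' % (predict2 * 10000))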
