Andrew Ng's Machine Learning, Programming Exercise 1: Linear Regression (Python)

Part 1: Return a 5x5 identity matrix

import numpy as np

def warmupExercise():
    E5 = np.eye(5)
    print('This is a 5x5 identity matrix:')
    print(E5)

warmupExercise()

Part 2: Linear regression

1. One variable. The setup: you are a restaurant owner who wants to expand into other cities. The file ex1data1.txt holds the existing data: the first column is a city's population, the second is the profit earned there.
Import the packages

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Read in the data and take a first look at it

data = pd.read_csv('ex1data1.txt',names=['Population','Profit'])
data.describe()
data.plot(x='Population',y='Profit',kind='scatter')
plt.show()


data.describe()

Define the cost function:

J(θ) = 1/(2m) · Σᵢ₌₁..m (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾)², where the hypothesis is hθ(x) = θᵀx = θ₀ + θ₁x₁.

# Cost function
# With np.matrix, `*` is matrix multiplication and np.multiply is element-wise.
# With ndarray, `dot` or `@` is matrix multiplication and `*` is element-wise.
def computeCost(X, y, theta):
    inner = np.power((X * theta.T - y), 2)
    return np.sum(inner) / (2 * len(X))
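The `*` vs `multiply` distinction noted in the comments is easy to get wrong, so here is a minimal sketch (made-up 2x2 values) showing both types side by side:

```python
import numpy as np

# np.matrix: * is the matrix product, np.multiply is element-wise
A = np.matrix([[1, 2], [3, 4]])
B = np.matrix([[5, 6], [7, 8]])
print(A * B)              # matrix product: [[19, 22], [43, 50]]
print(np.multiply(A, B))  # element-wise:   [[5, 12], [21, 32]]

# ndarray: @ is the matrix product, * is element-wise
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
print(a @ b)  # matrix product
print(a * b)  # element-wise
```

The two types therefore agree only when the corresponding operators are paired correctly, which is why `computeCost` above relies on `X` and `theta` being `np.matrix`.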

To allow a direct matrix product with θ, add a column of ones (the intercept term x₀)

# add the x0 column
data.insert(0, 'Ones', 1)

Split the data: columns 0 and 1 are the features X, column 2 is the target y

cols = data.shape[1]
print(cols)
X = data.iloc[:,0:cols-1]
y = data.iloc[:,cols-1:cols]

(output: 3)

X.head()


y.head()

Convert to the matrix type

X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix([0, 0])
X.shape, y.shape, theta.shape

(output: ((97, 2), (97, 1), (1, 2)))
Compute the initial cost

computeCost(X,y,theta)

(output: about 32.07)
Set up gradient descent. The update rule is:

θⱼ := θⱼ − α · (1/m) · Σᵢ₌₁..m (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾) · xⱼ⁽ⁱ⁾ (update all θⱼ simultaneously)

def gradientDescent(X, y, theta, alpha, epoch):
    """return theta, cost"""

    temp = np.matrix(np.zeros(theta.shape))  # temporary θ matrix, shape (1, 2)
    parameters = int(theta.flatten().shape[1])  # number of parameters θ
    cost = np.zeros(epoch)  # ndarray holding the cost of each epoch
    m = X.shape[0]  # number of samples m

    for i in range(epoch):
        # vectorized: one step updates all parameters at once
        temp = theta - (alpha / m) * (X * theta.T - y).T * X

        # the same gradient step without vectorization:
        # error = (X * theta.T) - y  # (97, 1)
        # for j in range(parameters):
        #     term = np.multiply(error, X[:, j])  # (97, 1)
        #     temp[0, j] = theta[0, j] - ((alpha / m) * np.sum(term))  # (1, 1)

        theta = temp
        cost[i] = computeCost(X, y, theta)

    return theta, cost
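The vectorized line and the commented-out loop compute exactly the same step. A minimal check on tiny made-up data (not from the exercise), comparing one step of each form:

```python
import numpy as np

X = np.matrix([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.matrix([[2.0], [4.0], [6.0]])
theta = np.matrix([[0.0, 0.0]])
alpha = 0.1
m = X.shape[0]

# one vectorized step
theta_vec = theta - (alpha / m) * (X * theta.T - y).T * X

# the same step as a loop over parameters
theta_loop = np.matrix(np.zeros(theta.shape))
error = X * theta.T - y
for j in range(theta.shape[1]):
    term = np.multiply(error, X[:, j])
    theta_loop[0, j] = theta[0, j] - (alpha / m) * np.sum(term)

print(theta_vec)
print(theta_loop)  # identical to theta_vec
```

Either form works; the vectorized one is just shorter and faster since NumPy does the summation internally.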

Set the learning rate and the number of iterations

alpha = 0.01
epoch = 1000
final_theta,cost = gradientDescent(X,y,theta,alpha,epoch)

Compute the final cost

computeCost(X ,y ,final_theta)

Plot the fitted line together with the data to see how well it fits.

np.linspace() returns evenly spaced numbers over a specified interval.

x = np.linspace(data.Population.min(), data.Population.max(), 100)
f = final_theta[0, 0] + (final_theta[0, 1] * x)  # predicted values
fig, ax = plt.subplots(figsize=(6,4))
ax.plot(x, f, 'r', label='Prediction')
ax.scatter(data['Population'], data.Profit, label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
plt.show()
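Beyond the plot, the fitted θ gives point predictions. The original exercise asks for the profit at populations of 35,000 and 70,000 (the data is in units of 10,000s); the θ values below are the approximate ones the exercise reports, used here purely for illustration:

```python
import numpy as np

# approximate fitted parameters reported by the exercise (illustrative values)
theta = np.matrix([[-3.63, 1.17]])

predict1 = np.matrix([1, 3.5]) * theta.T  # population = 35,000
predict2 = np.matrix([1, 7.0]) * theta.T  # population = 70,000
print(predict1[0, 0] * 10000)  # predicted profit in dollars
print(predict2[0, 0] * 10000)
```

Note the leading 1 in each input row: it multiplies the intercept θ₀, matching the 'Ones' column added earlier.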

Plot the cost over the iterations

fig, ax = plt.subplots(figsize=(8,4))
ax.plot(np.arange(epoch),cost,'r')
ax.set_xlabel('Iterations')
ax.set_ylabel('Cost')
ax.set_title('Error vs. Training Epoch')
plt.show()

2. Multiple variables: in ex1data2.txt the first column is the size of the house, the second the number of bedrooms, and the third the price. The goal is to predict house prices.

path = 'ex1data2.txt'
data2 = pd.read_csv(path, names=['Size', 'Bedrooms','Price'])
data2.head()

Preprocessing step: feature normalization

data2 = (data2 - data2.mean())/data2.std()
data2.head()
# add ones column
data2.insert(0, 'Ones', 1)

# set X (training data) and y (target variable)
cols = data2.shape[1]
X2 = data2.iloc[:,0:cols-1]
y2 = data2.iloc[:,cols-1:cols]

# convert to matrices and initialize theta
X2 = np.matrix(X2.values)
y2 = np.matrix(y2.values)
theta2 = np.matrix(np.array([0,0,0]))

# perform linear regression on the data set
g2, cost2 = gradientDescent(X2, y2, theta2, alpha, epoch)

# get the cost (error) of the model
computeCost(X2, y2, g2), g2
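The normalization used above subtracts each column's mean and divides by its standard deviation, so every feature ends up with mean ≈ 0 and std = 1. A minimal sketch on made-up house data:

```python
import pandas as pd

df = pd.DataFrame({'Size': [2104, 1600, 2400], 'Bedrooms': [3, 3, 4]})
norm = (df - df.mean()) / df.std()

print(norm.mean())  # approximately 0 for each column
print(norm.std())   # 1 for each column
```

Putting 'Size' (thousands) and 'Bedrooms' (single digits) on the same scale is what lets a single learning rate work for both parameters in gradient descent.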

Plot the cost function

fig, ax = plt.subplots(figsize=(12,8))
ax.plot(np.arange(epoch),cost2,'r')
ax.set_xlabel('Iterations')
ax.set_ylabel('Cost')
ax.set_title('Error vs. Training Epoch')
plt.show()

Linear regression with scikit-learn's built-in implementation

from sklearn import linear_model
model = linear_model.LinearRegression()
model.fit(X, y)
x = np.array(X[:, 1].A1)
f = model.predict(X).flatten()

fig, ax = plt.subplots(figsize=(8,5))
ax.plot(x, f, 'r', label='Prediction')
ax.scatter(data.Population, data.Profit, label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
plt.show()

The direct (closed-form) solution

# normal equation
def normalEqn(X, y):
    theta = np.linalg.inv(X.T @ X) @ X.T @ y
    return theta

final_theta2 = normalEqn(X, y)
final_theta2
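Unlike gradient descent, the normal equation θ = (XᵀX)⁻¹Xᵀy yields the exact least-squares solution in one step, with no learning rate or iterations. A minimal sketch on made-up data where the true parameters are known:

```python
import numpy as np

# y = 1 + 2x exactly, so the normal equation should recover [1, 2]
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([[1.0], [3.0], [5.0]])

theta = np.linalg.inv(X.T @ X) @ X.T @ y
print(theta)  # approximately [[1.], [2.]]
```

It also needs no feature normalization, but inverting XᵀX costs O(n³), so for a very large number of features gradient descent remains the practical choice.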

