Andrew Ng Machine Learning, Assignment 1 (Single-Variable Linear Regression)

Latest recommended article published on 2024-07-16 15:12:22

不摆就是好孩子


Tags: machine learning, python, artificial intelligence

Article link: https://blog.csdn.net/weixin_47100795/article/details/133656194


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

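'''
1. Linear regression: problem description
In this part of the exercise, you will implement linear regression with one variable to predict profits for a food truck.
Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet.
The chain already has trucks in various cities, and you have profit and population data from those cities.
You would like to use this data to help decide which city to expand to next.
'''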


'''
Check whether the data loaded correctly: look at the first 5 rows, the last 5 rows, and the summary statistics.
print(data.head())     
print(data.tail())
print(data.describe())
'''



data = pd.read_csv('E:/Code/machine learning/ex1data1.txt', sep=',', names=['population', 'profit'])

# Draw a scatter plot of the data
# 1. Extract the DataFrame columns as arrays into x_axis and y_axis.
# 2. Call scatter(): population on the x-axis, profit on the y-axis; c sets the marker color and s the marker size.
x_axis = data['population'].values
y_axis = data['profit'].values
plt.scatter(x_axis, y_axis, c='b', s=20)
plt.xlabel('Population')
plt.ylabel('Profit')
# plt.show()


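# Assemble the data into matrices X and Y. X is m x 2: the first column is a newly added column named 'one'
# whose values are all 1, and the second column is population. Y is m x 1 with the single column profit.
# m is the number of samples in the data set.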

data.insert(loc=0, column='one', value=1)
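col = data.shape[1]
# .values already has shape (m, 2) and (m, 1), so no hard-coded reshape(97, ...) is needed
x_matrix = np.matrix(data.iloc[:, :-1].values)
y_matrix = np.matrix(data.iloc[:, col-1:col].values)
theta = np.matrix([0.0, 0.0])  # 1 x 2 row vector: [theta_0, theta_1]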

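# Cost function: J(theta) = sum((X * theta.T - y)^2) / (2m)
def cost_function(X, y, theta):
    m = X.shape[0]  # number of samples, taken from X rather than the global data
    squared_error = np.power(((X * theta.T) - y), 2)
    cost = np.sum(squared_error)/(2*m)
    return cost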

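# Gradient descent (also records the cost per iteration for the cost plot)
def gradientDescent(X, y, theta, alpha, epoch):
    '''
    Batch gradient descent.

    :param X: input matrix (ones, population)
    :param y: profit matrix (profit)
    :param theta: parameters, a 1 x n row vector
    :param alpha: learning rate
    :param epoch: number of iterations
    :return: (theta after every iteration, final theta, cost after every iteration)
    '''
    # number of samples
    m = X.shape[0]
    # number of parameters
    parameters_num = int(theta.flatten().shape[1])
    # cost after each iteration
    cost = np.zeros(epoch)
    # theta after each iteration
    counterTheta = np.zeros((epoch, parameters_num))
    for i in range(epoch):
        # update all parameters simultaneously from the current theta
        theta = theta - (alpha/m) * ((X * theta.T) - y).T * X
        counterTheta[i] = np.asarray(theta).ravel()
        # use the arguments X and y, not the global matrices
        cost[i] = cost_function(X, y, theta)
    return counterTheta, theta, cost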
# Set the learning rate and the number of iterations
alpha = 0.01
epoch = 3000
counterTheta, final_theta, cost = gradientDescent(x_matrix, y_matrix, theta, alpha, epoch)
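# In a script the bare call discards its result, so print the final cost
print('final cost:', cost_function(x_matrix, y_matrix, final_theta))
# Draw the fitted line and the cost curve
x = np.linspace(start=data.population.min(), stop=data.population.max(), num=100)
f = final_theta[0,0] + final_theta[0,1] * x    # final_theta[0,0] is row 1, column 1 (intercept); final_theta[0,1] is row 1, column 2 (slope)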
figure, ax = plt.subplots(nrows=1, ncols=2)
# Linear regression plot
ax[0].plot(x, f, 'r', label='Prediction')
ax[0].scatter(data.population, data.profit, label='Training Data')
ax[0].set_xlabel('Population')
ax[0].set_ylabel('Profit')
ax[0].legend(loc=2)
ax[0].set_title('Prediction Profit vs. Population Size')
# Cost vs. iteration plot
ax[1].plot(np.arange(epoch), cost, 'r')
ax[1].set_xlabel('Iteration')
ax[1].set_ylabel('Cost')
ax[1].set_title('Cost vs. Iteration')
plt.show()
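For reference, the two functions in the script can be written in matrix form. This is a summary of what the code computes, using the script's convention that theta is a 1 x 2 row vector; the symbols m, X, y, and alpha match the variables of the same names, and nothing beyond the code is assumed.

```latex
% Cost computed by cost_function(X, y, theta)
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( x^{(i)} \theta^{T} - y^{(i)} \right)^{2}
          = \frac{1}{2m} \left\lVert X \theta^{T} - y \right\rVert^{2}

% One iteration of gradientDescent: all components of theta are updated at once
\theta := \theta - \frac{\alpha}{m} \left( X \theta^{T} - y \right)^{T} X
```

Here x^{(i)} is the i-th row of X, that is (1, population_i), so the first component of theta acts as the intercept and the second as the slope.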
Results:
1. Scatter plot

2. Linear regression fit and cost vs. iteration plot
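One way to sanity-check a fit like this is to compare the gradient descent result with the closed-form least-squares solution. The sketch below is not part of the original post. It assumes synthetic data (the real ex1data1.txt is not reproduced here), uses plain ndarrays instead of np.matrix, and introduces a hypothetical helper, batch_gradient_descent. The learning rate and iteration count were picked for this synthetic data only.

```python
import numpy as np

# Synthetic stand-in data (NOT the ex1data1.txt values): y = 2 + 3x plus small noise.
rng = np.random.default_rng(0)
m = 50
x = rng.uniform(5.0, 20.0, size=m)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5, size=m)

# Design matrix with a leading column of ones, as in the post.
X = np.column_stack([np.ones(m), x])


def batch_gradient_descent(X, y, alpha, epochs):
    """Hypothetical helper: batch gradient descent for least squares on ndarrays."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Gradient of J(theta) = ||X @ theta - y||^2 / (2m)
        gradient = X.T @ (X @ theta - y) / len(y)
        theta = theta - alpha * gradient
    return theta


theta_gd = batch_gradient_descent(X, y, alpha=0.004, epochs=100000)

# Closed-form least-squares solution for comparison.
theta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)

print("gradient descent:", theta_gd)
print("least squares:   ", theta_exact)
```

If the two vectors disagree noticeably, the usual suspects are a learning rate that is too large (the cost diverges) or too few iterations (the intercept converges slowly when the feature is not scaled).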






# if __name__ == '__main__':
#     print(cost_function(x_matrix,y_matrix,theta))
#     print(np.zeros((2,3)))
#     # Run the model and make predictions
#     alpha = 0.01
#     epoch = 3800
#     counterTheta, final_theta, cost = gradientDescent(x_matrix, y_matrix, theta, alpha, epoch)
#     cost_function(x_matrix, y_matrix, final_theta)
#
#     predict1 = [1, 3.5] * final_theta.T
#     print("predict1:", predict1)
#     predict2 = [1, 7] * final_theta.T
#     print("predict2:", predict2)
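The commented-out block above predicts profit for the inputs 3.5 and 7 by multiplying [1, x] with the fitted parameters. A minimal sketch of the same idea as a reusable function follows. The name predict_profit and the parameter values in example_theta are hypothetical and chosen only for illustration; they are not the values fitted in the post. In the course handout the population and profit columns are, as far as I recall, given in units of 10,000, so treat the interpretation of 3.5 as 35,000 people as an assumption.

```python
import numpy as np


def predict_profit(theta, population):
    """Return theta_0 + theta_1 * population for a scalar or array input.

    theta may be a list, an ndarray, or a 1 x 2 np.matrix such as final_theta.
    """
    theta = np.asarray(theta, dtype=float).ravel()
    population = np.atleast_1d(np.asarray(population, dtype=float))
    # Prepend the column of ones so the intercept is handled by the dot product.
    features = np.column_stack([np.ones_like(population), population])
    return features @ theta


# Hypothetical parameters for illustration only, not the post's fitted values.
example_theta = np.array([-4.0, 1.2])

predictions = predict_profit(example_theta, [3.5, 7.0])
print(predictions)
```

With the script's own result, the call would be predict_profit(final_theta, [3.5, 7.0]).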
