Ex1_Machine Learning_Andrew Ng Course Assignment (Python): Linear Regression

Usage notes:

This article is a set of study notes for Andrew Ng's Machine Learning course on Coursera.

  • The first part reviews the knowledge and key notes for the corresponding course week, and introduces the libraries used in the code.
  • The second part covers the implementation details of the self-defined functions used in the code.
  • The third part contains the concrete code corresponding to the course exercises.

0. Preconditions

This section introduces the libraries used, as well as some essential points to know before implementing the exercises.

A. Notes

  • Machine Learning

    "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

    In other words, machine learning studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills, and reorganize existing knowledge structures so as to continually improve their own performance.

  • Supervised Learning

    In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output. In other words, every sample in the data set comes with a corresponding label.

    Supervised learning problems are categorized into "regression" and "classification" problems.

    • Regression: In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. (e.g. the house price problem)
    • Classification: In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories. (e.g. the tumor problem)

  • Unsupervised Learning

    Unsupervised learning allows us to approach problems with little or no idea what our results should look like: the data set carries no labels at all. We can derive structure from data where we don't necessarily know the effect of the variables, by clustering the data based on relationships among the variables in the data. With unsupervised learning there is no feedback based on the prediction results.

    • Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.
    • Non-clustering: The "Cocktail Party Algorithm" allows you to find structure in a chaotic environment (e.g. identifying individual voices and music from a mesh of sounds at a cocktail party).

  • Linear Regression with One Variable
    • Model Representation

      Given training samples $(x^{(i)}, y^{(i)})$, where $i = 1, 2, \dots, m$, $x$ denotes the feature and $y$ denotes the output target. A supervised learning algorithm feeds the training set to a learning algorithm, which produces a hypothesis $h$ that maps inputs $x$ to predicted outputs $y$.

    • Hypothesis

      The hypothesis is a mapping from input $x$ to output $y$: $h(x) = \theta_0 + \theta_1 x$, where $\theta_0$ and $\theta_1$ are the model parameters.

    • Cost Function

      We can measure the accuracy of our hypothesis function by using a cost function. This takes an average difference (actually a fancier version of an average) of all the results of the hypothesis with inputs from x's and the actual output y's. This function is otherwise called the "squared error function", or "mean squared error":

      $$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

      Besides the surface plot, the cost function can also be represented as a contour plot, where each ellipse collects the parameter pairs $(\theta_0, \theta_1)$ that share the same cost.

    • Gradient Descent

      Gradient descent repeatedly moves the parameters in the direction of steepest descent of the cost function until convergence:

      $$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$

      where all $\theta_j$ are updated simultaneously on each iteration.

  • Linear Regression with Multiple Variables
    • Feature Scaling

      Purpose: feature scaling keeps all features on a similar scale, which speeds up gradient descent and helps it converge to the global minimum faster.

      Method: subtract the mean, then divide by the standard deviation (or by the range):

      $$x_j := \frac{x_j - \mu_j}{s_j}$$

      where $\mu_j$ is the mean of feature $j$ and $s_j$ is its standard deviation (or range).

    • Learning Rate $\alpha$

      $\alpha$ is the learning rate: if $\alpha$ is too small, gradient descent converges slowly; if $\alpha$ is too large, gradient descent may fail to converge or even diverge. A practical way to choose $\alpha$ is to try several values and compare the resulting cost curves, as in the sketch below.
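      A minimal sketch of such a comparison, assuming the data matrices X, y and the helper module func built in the implementation sections below:

      # Compare cost curves for several candidate learning rates
      for lr in (0.001, 0.01, 0.03):
          _, cost_hist = func.gradientDescent(X, y, np.matrix([0, 0]), lr, 1500)
          plt.plot(np.arange(1500), cost_hist, label='alpha = %g' % lr)
      plt.xlabel('Iteration')
      plt.ylabel('Cost')
      plt.legend()
      plt.show()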

    • Normal Equation

      Formula:

      $$\theta = (X^T X)^{-1} X^T y$$

      Condition for the solution to exist: $X^T X$ must be invertible; if it is not, the pseudoinverse (generalized inverse) can be computed instead.

  • Gradient Descent vs Normal Equation

    Gradient descent requires choosing a learning rate $\alpha$ and many iterations, but works well even when the number of features $n$ is large. The normal equation needs no learning rate and no iterations, but must compute $(X^T X)^{-1}$, which costs roughly $O(n^3)$ and becomes slow for large $n$.

B. Libs introduction

# Programming exercise 1 for week 2

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import ex1_function as func  # Self-defined helper functions (see section 00)

00. Self-created Functions

This section contains the self-defined functions used by the exercises; they live in the module ex1_function, imported above as func.

  • computeCost(X, y, theta): compute the cost

    $$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

    # Compute the cost of using theta as the parameters for linear regression
    def computeCost(X, y, theta):
        inner = np.power(np.dot(X, np.transpose(theta)) - y, 2)
        return np.sum(inner) / (2 * len(X))
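    A quick sanity check, assuming the matrices X and y built in section 2.2 (the expected value comes from the course exercise): on ex1data1 with θ = [0, 0], the initial cost should be about 32.07.

    # Expected initial cost on ex1data1 with theta = [0, 0]: about 32.07
    print(computeCost(X, y, np.matrix([0, 0])))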
    
  • gradientDescent(X, y, theta, alpha, iters): gradient descent implementation

    The function applies the batch update rule from the notes above; a vectorized version is used, with the equivalent non-vectorized loop kept as a comment.

    # Use gradient descent to fit the parameters
    def gradientDescent(X, y, theta, alpha, iters):
        row = X.shape[0]
        cost = np.zeros(iters)
        parameters = int(theta.flatten().shape[1])
        temp = np.matrix(np.zeros(theta.shape))
        for i in range(iters):
            # Vectorized update of all parameters at once
            temp = theta - (alpha / row) * (X * theta.T - y).T * X
            # Equivalent non-vectorized update:
            # error = np.dot(X, theta.T) - y
            # for j in range(parameters):
            #     term = np.multiply(error, X[:, j])
            #     temp[0, j] = theta[0, j] - ((alpha / row) * np.sum(term))
            theta = temp
            cost[i] = computeCost(X, y, theta)
        return theta, cost
    
  • featureNormalize(df): feature normalization

    (Note: as written, every column is normalized, including the label; whether the label should be excluded is still open to consideration.)

    # Feature normalization: subtract the mean, divide by the standard deviation
    def featureNormalize(df):
        col = df.shape[1]
        mean = np.mean(df.iloc[:, 0: col])  # column-wise means
        std = np.std(df.iloc[:, 0: col])    # column-wise standard deviations
        df.iloc[:, 0: col] -= mean
        df.iloc[:, 0: col] /= std
        return df
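    One practical caveat (an addition beyond the exercise): the mean and standard deviation should normally be kept so that new examples can be normalized with the same statistics. A minimal sketch of such a variant (the name featureNormalizeStats is hypothetical):

    # Hypothetical variant that also returns the normalization statistics
    def featureNormalizeStats(df):
        mean = df.mean()
        std = df.std(ddof=0)  # population std, matching np.std used above
        return (df - mean) / std, mean, std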
    
  • normalEquation(X, y): solve for theta with the normal equation

    $$\theta = (X^T X)^{-1} X^T y$$

    # Normal equation: closed-form solution for theta (pinv handles singular X^T X)
    def normalEquation(X, y):
        # Equivalent: np.linalg.pinv(np.transpose(X).dot(X)).dot(np.transpose(X)).dot(y)
        theta = np.linalg.pinv(X.T @ X) @ X.T @ y
        return theta
    

1. Simple function

Implement a simple function using Python: return a 5×5 identity matrix.

# 1. Simple function

A = np.eye(5)
print(A)

2. Linear Regression with one variable

Implement linear regression with one variable using the given data.

# 2. Linear regression with one variable

path_data1 = 'ex1data1.txt'
df_data1 = pd.read_csv(path_data1, names=['Population', 'Profit'])
print(df_data1.describe())  # Summary statistics of the data
print(df_data1.head(10))    # First n rows (default n=5)
print(df_data1.info())      # Index, dtypes and memory information

2.1 Plotting data

# 2.1 Plot the data

df_data1.plot(kind='scatter', x='Population', y='Profit', figsize=(8, 5),
              title='Predictions on Profit based on Population')
plt.show()  # Display the scatter plot

2.2 Gradient Descent

# 2.2 Gradient Descent

# Insert a column of ones before the first column to simplify the vectorized computation
df_data1.insert(0, 'ONE', 1)
# Rows and columns; inputs X; targets y; parameters theta; learning rate; iteration count
row = df_data1.shape[0]
col = df_data1.shape[1]
X = np.matrix(df_data1.iloc[:, 0: col - 1])
y = np.matrix(df_data1.iloc[:, col - 1: col])
theta = np.matrix([0, 0])
alpha = 0.01
iters = 1500
# Run gradient descent
res_theta, res_cost = func.gradientDescent(X, y, theta, alpha, iters)
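The learned parameters can be checked against the values expected by the course exercise (on ex1data1, θ ≈ [-3.63, 1.17] with a final cost of about 4.48):

# Inspect the result; expected on ex1data1: theta ≈ [-3.63, 1.17], cost ≈ 4.48
print(res_theta)
print(res_cost[-1])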

2.3 Debugging

No code.

2.4 Visualizing

# 2.4 Visualization

# Figure showing the linear regression prediction
x_data = np.linspace(df_data1.Population.min(), df_data1.Population.max(), 100)
hypo = res_theta[0, 0] + (res_theta[0, 1] * x_data)  # Hypothesis h(x) = theta0 + theta1 * x
fig, fig_prediction = plt.subplots(figsize=(8, 5))
fig_prediction.plot(x_data, hypo, 'r', label='Prediction')
fig_prediction.scatter(df_data1['Population'], df_data1['Profit'], label='Training data')
fig_prediction.legend(loc=2)  # loc=2 places the legend in the upper-left corner
fig_prediction.set_xlabel('Population')
fig_prediction.set_ylabel('Profit')
fig_prediction.set_title('Predictions on Profit based on Population data')

# Figure showing how the cost changes over the iterations
fig, fig_cost = plt.subplots(figsize=(8, 5))
x_cost = np.arange(iters)  # np.arange() returns evenly spaced values
fig_cost.plot(x_cost, res_cost, 'r')
fig_cost.set_xlabel('Iteration')
fig_cost.set_ylabel('Cost')
fig_cost.set_title('Cost at each iteration during training')

2.5 Optional lib: scikit-learn

Use an additional library, scikit-learn, to perform the linear regression.

# 2.5 Optional lib: scikit-learn

from sklearn import linear_model

# Model fitting
model = linear_model.LinearRegression()
model.fit(X, y)
# Visualization
x = np.array(X[:, 1].A1)  # Population column; .A1 flattens the matrix to a 1-D array
f = model.predict(X).flatten()
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x, f, 'r', label='Prediction')
ax.scatter(df_data1['Population'], df_data1['Profit'], label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
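For comparison, the fitted coefficients can be printed; since the closed-form least-squares fit is exact, they should be close to (though not identical to) the gradient-descent result res_theta:

# Intercept and coefficients of the scikit-learn fit
print(model.intercept_, model.coef_)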

3. Linear Regression with multiple variables

Implement linear regression with multiple variables using the given data.

# 3. Linear regression with multiple variables

path_data2 = 'ex1data2.txt'
df_data2 = pd.read_csv(path_data2, names=['Size', 'Bedrooms', 'Price'])

3.1 Feature Normalization

# 3.1 Feature Normalization

df_data2 = func.featureNormalize(df_data2)

3.2 Gradient Descent

# 3.2 Gradient Descent

df_data2.insert(0, 'ONE', 1)
row2 = df_data2.shape[0]
col2 = df_data2.shape[1]
X2 = np.matrix(df_data2.iloc[:, 0: col2 - 1])
y2 = np.matrix(df_data2.iloc[:, col2 - 1: col2])
theta2 = np.matrix([0, 0, 0])
alpha2 = 0.03  # learning rate; with normalized features, values up to about 1 still converge
iters2 = 1000
res_theta2, res_cost2 = func.gradientDescent(X2, y2, theta2, alpha2, iters2)

# Visualization
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(np.arange(iters2), res_cost2, 'r')
ax.set_xlabel('Iterations')
ax.set_ylabel('Cost')
ax.set_title('Cost at each iteration during training')
plt.show()

3.3 Normal Equation

# 3.3 Normal Equation

res = func.normalEquation(X2, y2)
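As a sanity check, the closed-form solution can be compared with the gradient-descent parameters; note that both live in the normalized space here, since featureNormalize was applied to every column including Price:

# Compare: theta from gradient descent vs. the normal equation
print(res_theta2)  # shape (1, 3), from gradient descent
print(res.T)       # transpose of the (3, 1) closed-form solution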