Andrew Ng Machine Learning: Code and Key Points Summary, ex1 (1. Univariate Linear Regression)

AsteriaJoJo

Tags: machine learning, python

Article link: https://blog.csdn.net/qq_41462598/article/details/104499498

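I am a beginner just starting out in machine learning. This post is mainly for my own practice and review, and most of the code follows published solutions to Andrew Ng's machine learning assignments. The write-up may be a little disorganized, so corrections are welcome (ง •_•)ง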

1. Imports

import pandas as pd
import seaborn as sns
sns.set(context="notebook", style="whitegrid", palette="dark")
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

2. Reading the data

df=pd.read_csv("code/ex1-linear regression/ex1data1.txt",names=["Population","Profit"])

Replace the path with your own.

df.head(5)  # view the first five rows

df.info()  # view column types and non-null counts

df.plot(kind="scatter",x='Population', y='Profit', figsize=(12,8))
plt.show()  # draw the scatter plot

Read the data again, add the intercept column, and split it into X and y:

data = pd.read_csv('code/ex1-linear regression/ex1data1.txt', names=['population', 'profit'])  # read the data and assign column names
data.insert(0, 'Ones', 1)  # column of ones for the intercept term
data.head()  # look at the first 5 rows
cols = data.shape[1]
X = data.iloc[:,0:cols-1]  # X: all rows, every column except the last
y = data.iloc[:,cols-1:cols]  # y: all rows, last column only
X = np.matrix(X.values)
y = np.matrix(y.values)
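
The slicing above can be checked without the data file. The following is a minimal sketch on a tiny in-memory CSV; the three sample rows and the names `csv_text`, `sample`, `X_demo` and `y_demo` are made up for illustration and are not taken from ex1data1.txt.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample rows (population, profit); not from ex1data1.txt.
csv_text = "6.0,17.5\n5.5,9.0\n8.5,13.5\n"

sample = pd.read_csv(io.StringIO(csv_text), names=["population", "profit"])
sample.insert(0, "Ones", 1)  # intercept column goes first

n_cols = sample.shape[1]
X_demo = sample.iloc[:, 0:n_cols - 1].values       # columns: Ones, population
y_demo = sample.iloc[:, n_cols - 1:n_cols].values  # column: profit

print(X_demo.shape, y_demo.shape)  # (3, 2) (3, 1)
```

Slicing with `n_cols - 1:n_cols` rather than a single index keeps y two-dimensional, which is what the matrix arithmetic later in the post expects.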

3. Feature normalization

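When an optimization problem is solved with gradient descent, normalizing (standardizing) the features first speeds up gradient descent, that is, the model converges in fewer iterations.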

def normalize_feature(df):
    """Standardize each column: subtract its mean, then divide by its standard deviation."""
    return df.apply(lambda column: (column - column.mean()) / column.std())  # feature scaling
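
The post defines this function but does not call it in the single-variable exercise, so here is a small usage sketch. The DataFrame `toy` and its column names are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import pandas as pd

def normalize_feature(df):
    """Standardize each column to zero mean and unit (sample) standard deviation."""
    return df.apply(lambda column: (column - column.mean()) / column.std())

# Hypothetical values, not from the exercise data.
toy = pd.DataFrame({"size": [1.0, 2.0, 3.0], "rooms": [10.0, 20.0, 30.0]})
scaled = normalize_feature(toy)

print(scaled)         # both columns become -1.0, 0.0, 1.0
print(scaled.mean())  # 0 for every column
print(scaled.std())   # 1 for every column (pandas uses ddof=1 by default)
```

If this is used together with an intercept column, it is probably safest to normalize before inserting the column of ones: a constant column has zero standard deviation, so standardizing it would produce NaN.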

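About the pandas apply() function used above:
apply is one of the most flexible functions in pandas. Its signature in pandas 1.x is shown below; older releases also accepted broadcast and reduce, which were removed in pandas 1.0, and newer releases add further keyword arguments.

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)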

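The most important argument is the first one, func, which is itself a function, comparable to a function pointer in C/C++.

You write this function yourself, and what it receives depends on axis. With axis=1, for example, each row is passed to your function as a Series; inside the function you compute over the elements of that Series and return a single result. apply iterates over every row of the DataFrame automatically and combines the results into a Series, which it returns.
For example: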

import numpy as np
import pandas as pd

if __name__ == '__main__':
    f = lambda x : x.max() - x.min()
    df = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=['utah', 'ohio', 'texas', 'oregon'])  # columns sets the column labels, index sets the row labels
    print(df)

    t1 = df.apply(f)  # df.apply(function, axis=0); axis=0 is the default and passes each column to the function as a Series
    print(t1)

    t2 = df.apply(f, axis=1)  # axis=1 passes each row to the function as a Series
    print(t2)

Output:

               b         d         e
utah    1.950737  0.318299  0.387724
ohio    1.584464 -0.082965  0.984757
texas   0.477283 -2.774454 -0.532181
oregon -0.851359 -0.654882  1.026698


b    2.802096
d    3.092753
e    1.558879
dtype: float64

utah      1.632438
ohio      1.667428
texas     3.251737
oregon    1.878057
dtype: float64

4. Computing the cost function

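theta = np.matrix(np.array([0,0]))  # initial parameters, shape (1, 2)
def computeCost(X, y, theta):
    # squared error of each prediction X * theta.T against y
    inner = np.power(((X * theta.T) - y), 2)
    return np.sum(inner) / (2 * len(X))  # sum of squared errors divided by 2m
computeCost(X, y, theta)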
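
Read off the code above, the cost is J(theta) = sum((X * theta.T - y)^2) / (2m), where m is the number of training rows. As a sanity check, here is a sketch of the same formula on hand-checkable numbers. It assumes plain NumPy arrays with theta as a column vector instead of np.matrix, and the names `compute_cost`, `X_toy` and `y_toy` are illustrative, not from the original post.

```python
import numpy as np

def compute_cost(X, y, theta):
    """Sum of squared residuals divided by 2m; theta is a column vector here."""
    residual = X @ theta - y
    return float(np.sum(residual ** 2)) / (2 * len(X))

# Hypothetical data lying exactly on the line y = x.
X_toy = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column is the intercept term
y_toy = np.array([[1.0], [2.0], [3.0]])

cost_zero = compute_cost(X_toy, y_toy, np.zeros((2, 1)))         # residuals -1, -2, -3
cost_fit = compute_cost(X_toy, y_toy, np.array([[0.0], [1.0]]))  # perfect fit

print(cost_zero)  # (1 + 4 + 9) / 6 = 2.333...
print(cost_fit)   # 0.0
```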

5. Batch gradient descent

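def gradientDescent(X,y,theta,alpha,iters):
    temp=np.matrix(np.zeros(theta.shape))  # holds the updated parameters
    parameters=int(theta.ravel().shape[1])  # number of parameters
    cost=np.zeros(iters)  # cost after each iteration
    for i in range(iters):
        error=(X*theta.T)-y  # computed once per iteration, so all parameters update simultaneously
        for j in range(parameters):
            term = np.multiply(error, X[:,j])
            temp[0,j] = theta[0,j] - ((alpha / len(X)) * np.sum(term))
        theta = temp  # apply the update only after every parameter has been computed
        cost[i] = computeCost(X, y, theta)  # record the cost once per iteration
    return theta,cost
theta1, cost = gradientDescent(X, y, theta, alpha=0.01, iters=1000)
print(theta1)
computeCost(X,y,theta1)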

Output: (shown as an image in the original post)
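
The inner loop over parameters can also be written as one matrix product. The following is a sketch of that vectorized variant, not the code used in this post: it assumes plain NumPy arrays with theta as a column vector, and the function name `gradient_descent` and the noise-free toy data (generated from y = 1 + 2x) are hypothetical. The result is compared with NumPy's least-squares solver as a cross-check.

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, iters):
    """Batch gradient descent for linear regression with a vectorized update."""
    m = len(X)
    theta = theta.astype(float).copy()  # do not modify the caller's array
    cost = np.zeros(iters)
    for i in range(iters):
        error = X @ theta - y                 # shape (m, 1)
        theta -= (alpha / m) * (X.T @ error)  # updates every parameter at once
        cost[i] = np.sum((X @ theta - y) ** 2) / (2 * m)
    return theta, cost

# Hypothetical noise-free data generated from y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
X_toy = np.column_stack([np.ones_like(x), x])
y_toy = (1.0 + 2.0 * x).reshape(-1, 1)

theta_gd, cost_hist = gradient_descent(X_toy, y_toy, np.zeros((2, 1)), alpha=0.1, iters=2000)
theta_ls, *_ = np.linalg.lstsq(X_toy, y_toy, rcond=None)

print(theta_gd.ravel())  # close to [1, 2]
print(theta_ls.ravel())  # least-squares solution for comparison
```

Because the error vector is computed before any parameter changes, the single matrix product gives the same simultaneous update as the per-parameter loop.
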
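About np.ravel() and flatten():
Both do the same job: they flatten a multi-dimensional array to one dimension. The difference is whether you get a copy or a view. ndarray.flatten() always returns a copy, so changes to the result do not affect the original array. numpy.ravel() returns a view whenever possible (somewhat like a reference in C/C++), so changes to the result can affect the original array; it falls back to a copy only when a view cannot be made, for example when the data is not contiguous in memory.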
The linked article covers the differences between the two in more detail: https://www.jianshu.com/p/dcb5b5e15ac9
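
The copy-versus-view behavior is easy to see in a short experiment. This is an illustrative sketch; the array `a` and the variable names are arbitrary.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

flat_copy = a.flatten()  # always a copy
flat_copy[0] = 100
print(a[0, 0])           # 0, the original is untouched

flat_view = a.ravel()    # a view here, because `a` is contiguous
flat_view[0] = 100
print(a[0, 0])           # 100, the write went through to `a`

print(np.shares_memory(a, flat_view))    # True
print(np.shares_memory(a, a.T.ravel()))  # False: the transpose is not C-contiguous, so ravel copies
```

In gradientDescent above, ravel() is only used to count the parameters, so the distinction does not affect the result there.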

6. Visualization

pic_x=np.linspace(data.population.min(),data.population.max(),100)  # column names are lowercase in data
pic_y=theta1[0,0]+(theta1[0,1]*pic_x)  # fitted line: intercept + slope * x
pic,ax=plt.subplots(figsize=(12,8))
ax.plot(pic_x,pic_y,'r',label="prediction")
ax.scatter(data.population,data.profit,label="training data")
ax.legend(loc=2)  # show the labels set above
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit & Population Size')
plt.show()

fig, ax = plt.subplots(figsize=(12,8))
ax.plot(np.arange(len(cost)), cost, 'r')  # one cost value per iteration; iters is not defined outside gradientDescent
ax.set_xlabel('Iterations')
ax.set_ylabel('Cost')
ax.set_title('Error vs. Training Epoch')
plt.show()