Deep Learning and Image Recognition: Principles and Practice, Study Notes Day_01

Latest recommended article published on 2024-11-09 11:25:14

努力卷


Tags: deep learning, machine learning, algorithms

Article link: https://blog.csdn.net/qq786558544/article/details/120809301


1. Linear Regression

1.1 Simple Linear Regression

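Simple (univariate) linear regression describes the case where there is exactly one independent variable and one dependent variable. The model can be written as y = a*x + b. Learning a simple linear model means using the training data set to find the most suitable a and b; in other words, a and b are the parameters of the model.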
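As a minimal sketch of what the model computes, the snippet below evaluates y = a*x + b for one hand-picked pair of parameters. The function name `linear_model` and the values a = 2.0 and b = 1.0 are assumptions chosen for illustration only; they do not come from the book or the post.

```python
import numpy as np


def linear_model(x, a, b):
    """Evaluate the simple linear model y = a * x + b element-wise."""
    return a * np.asarray(x, dtype=float) + b


# Hypothetical parameters, chosen only for illustration.
a, b = 2.0, 1.0
x = np.array([0.0, 1.0, 2.0, 3.0])
y = linear_model(x, a, b)
print(y)  # [1. 3. 5. 7.]
```

Here a acts as the slope (each unit increase in x adds a to y) and b as the intercept (the value of y at x = 0).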

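How do we judge whether a model is good or bad?

We look at the gap between the predicted values and the true values.

We measure this gap with the mean squared error (MSE), the average of the squared differences: $\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)}-y_{\mathrm{predict}}^{(i)}\right)^2$, where $y^{(i)}$ is the true value of the i-th sample and $y_{\mathrm{predict}}^{(i)}$ is the model's prediction for it.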
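The mean squared error can be sketched in a few lines of NumPy. The arrays `good_fit` and `poor_fit` below are made-up predictions, used only to show that a closer fit yields a smaller error; they are not results from the post.

```python
import numpy as np


def mean_squared_error(y_true, y_predict):
    """Average of the squared differences between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_predict = np.asarray(y_predict, dtype=float)
    return np.mean((y_true - y_predict) ** 2)


y_true = np.array([2.0, 5.0, 7.0])
good_fit = np.array([2.0, 5.0, 8.0])   # illustrative: off by 1 on one sample
poor_fit = np.array([4.0, 3.0, 10.0])  # illustrative: off by 2, 2 and 3

mse_good = mean_squared_error(y_true, good_fit)  # (0 + 0 + 1) / 3
mse_poor = mean_squared_error(y_true, poor_fit)  # (4 + 4 + 9) / 3
print(mse_good, mse_poor)
```

Squaring the differences keeps positive and negative errors from cancelling each other and penalizes large errors more heavily than small ones.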

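The optimal parameters a and b can be found with the least squares method (also called the method of least squares):

$a=\frac{\sum_{i=1}^{n}\left(x^{(i)}-\bar{x}\right)\left(y^{(i)}-\bar{y}\right)}{\sum_{i=1}^{n}\left(x^{(i)}-\bar{x}\right)^2}$

$b=\bar{y}-a\,\bar{x}$

Here $\bar{x}$ and $\bar{y}$ are the means of the x and y values in the training data.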
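These expressions follow from setting the partial derivatives of the summed squared error with respect to a and b to zero. As a hedged, vectorized sketch of the same formulas, the snippet below computes a and b with NumPy array operations and cross-checks them against `np.polyfit`. The helper name `fit_line` and the sample data (points lying exactly on y = 2x + 1) are assumptions made for illustration.

```python
import numpy as np


def fit_line(x, y):
    """Closed-form least squares estimate of slope a and intercept b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of co-deviations divided by sum of squared x deviations.
    a = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: forces the line through the point (x_mean, y_mean).
    b = y_mean - a * x_mean
    return a, b


# Illustrative data lying exactly on the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
a, b = fit_line(x, y)

# np.polyfit with degree 1 returns [slope, intercept]; used as a cross-check.
a_ref, b_ref = np.polyfit(x, y, 1)
print(a, b)
```

Because the data has no noise, the fit should recover a slope of 2 and an intercept of 1, and both values should agree with the `np.polyfit` result up to floating-point error.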

Implementation of the simple linear regression algorithm:

import numpy as np
import matplotlib.pyplot as plt



class SimpleLinearRegressionSelf:
    def __init__(self):
        self.a_ = None
        self.b_ = None

    def fix(self,x_train, y_train):
        assert x_train.ndim == 1, \
            "Simple linear regression only handles 1-D vectors, not matrices"
        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)
        # Denominator of the slope formula
        denominator = 0.0
        # Numerator of the slope formula
        numerator = 0.0
        # Accumulate both sums over the training samples
        for x_i, y_i in zip(x_train, y_train):
            numerator += (x_i - x_mean) * (y_i - y_mean)
            denominator += (x_i - x_mean) ** 2
        self.a_ = numerator / denominator
        self.b_ = y_mean - self.a_ * x_mean

        return self

    # x_test_group is a 1-D collection of input values
    def predict(self, x_test_group):
        # This uses a list comprehension, a compact form of a for loop
        '''
        For example:
        a = [1,2,3,4,5,6,7,8,9,10]
        b = [x**2 for x in a]
        print(b)

        Output:
        [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
        Here the input sequence is a, the expression is x**2, and the loop clause is for x in a

        '''
        # Iterate over x_test_group and collect the predictions into an array
        return np.array([self._predict(x_test) for x_test in x_test_group])

    def _predict(self,x_test):
        return self.a_ * x_test + self.b_ # Prediction for a single input x_test

    def mean_squared_error(self,y_true,y_predict):
        return np.sum((y_true - y_predict) **2) / len(y_true)

    # Goodness-of-fit measure: R squared = 1 - MSE / Var(y_true)
    def r_square(self, y_true, y_predict):
        # np.var() computes the variance
        return 1 - (self.mean_squared_error(y_true,y_predict) / np.var(y_true))


if __name__ == '__main__':
    x = np.array([1, 2, 4, 6, 8])
    y = np.array([2, 5, 7, 8, 9])
    lr = SimpleLinearRegressionSelf()
    lr.fix(x, y)
    print(lr.predict([7]))
    print(lr.r_square([8, 9], lr.predict([6, 8])))
    # x_mean = x.mean()
    # y_mean = y.mean()
    # # Denominator
    # denominator = 0.0
    # # Numerator
    # numerator = 0.0
    # for x_i,y_i in zip(x,y):
    #     numerator += (x_i- x_mean) *(y_i- y_mean)
    #     denominator += (x_i- x_mean)**2
    # a = numerator / denominator
    # b = y_mean- a*x_mean
    #
    # y_pred = x*a+b
    # plt.scatter(x,y,color='b')
    # plt.plot(x,y_pred,color='r')
    # plt.xlabel('Pipe length', fontsize = 15)
    # plt.ylabel('Charge', fontsize = 15)
    # plt.show()

Execution result: