Common machine learning algorithms, implemented in Python

Latest recommended article published on 2023-01-03 15:48:51

程序员那些破事儿

Tags: machine learning, python, algorithms

Article link: https://blog.csdn.net/wjq6940/article/details/78356282

Least squares

Every deep learning algorithm starts from the following mathematical formula (I have converted it to Python code):

 
Python

    # y = mx + b
    # m is slope, b is y-intercept
    def compute_error_for_line_given_points(b, m, coordinates):
        totalError = 0
        for i in range(0, len(coordinates)):
            x = coordinates[i][0]
            y = coordinates[i][1]
            totalError += (y - (m * x + b)) ** 2
        return totalError / float(len(coordinates))

    # example
    compute_error_for_line_given_points(1, 2, [[3, 6], [6, 9], [12, 18]])

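The method of least squares was first published by Adrien-Marie Legendre in 1805 (1805, Legendre). This Parisian mathematician was also known for his work on measuring the meter. He was obsessed with predicting the future positions of comets, and he persistently searched for a method that could calculate a comet's trajectory from its previously observed positions.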

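He tried many methods and, after a great deal of trial and error, finally found one that matched the results. Legendre's method first guesses the future position of the comet, then squares the errors, and finally adjusts the guess to reduce the sum of the squared errors. This is also the basic idea behind linear regression.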

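Readers can run the code above in a Jupyter notebook to get a better feel for the algorithm. m is the coefficient (the slope), b is the constant term of the prediction (the y-intercept), and coordinates are the positions of the comet. The goal is to find the m and b that make the error as small as possible.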
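To make the search for m and b concrete, here is a minimal sketch that scores a few hand-picked candidate lines against the example points from the snippet above and keeps the one with the smallest mean squared error. The helper name `mean_squared_error` and the candidate list are illustrative assumptions, not part of the original article.

```python
# Hypothetical helper: the same error measure as
# compute_error_for_line_given_points, written more compactly.
def mean_squared_error(b, m, coordinates):
    return sum((y - (m * x + b)) ** 2 for x, y in coordinates) / len(coordinates)

# Example points from the least-squares snippet.
points = [[3, 6], [6, 9], [12, 18]]

# A few candidate (b, m) pairs chosen by hand, purely for illustration.
candidates = [(1, 2), (0, 1.5), (2, 1.3), (3, 1)]

# Score every candidate and keep the one with the smallest error.
errors = {(b, m): mean_squared_error(b, m, points) for b, m in candidates}
best_b, best_m = min(errors, key=errors.get)

for (b, m), err in errors.items():
    print("b=%s, m=%s -> error %.3f" % (b, m, err))
print("Best candidate: b=%s, m=%s" % (best_b, best_m))
```

Trying candidates by hand like this is roughly the kind of trial and error described above, and it only ever finds the best line among the guesses supplied.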

This is the core idea of deep learning: take inputs and desired outputs, then search for the correlation between the two.

Gradient descent

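Legendre's approach of manually trying values to reduce the error was very time-consuming. About a century later, in 1909, the Dutch Nobel laureate Peter Debye formalized a method that simplifies this process.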

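Suppose Legendre's algorithm had only one parameter to consider, which we call X. The Y axis represents the error for each value of X. Legendre was searching for the X that gives the lowest error. In the figure, we can see that the error Y reaches its minimum when X = 1.1.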

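Peter Debye noticed that the slope to the left of the lowest point is negative, while on the other side it is positive. So if you know the slope at any given X, you know in which direction to move to reach the minimum of Y.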

This is the basic idea of gradient descent. Almost every deep learning model uses it.

To implement this algorithm, suppose the error function is Error = x^5 - 2x^3 - 2. To get the slope at any given X, we take its derivative, which is 5x^4 - 6x^2.
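As a sanity check on that derivative, the sketch below compares 5x^4 - 6x^2 with a central finite-difference estimate and looks at the sign of the slope on either side of the minimum. The function names, the step size h, and the sample points are assumptions chosen for illustration.

```python
import math

def error(x):
    # The error function assumed in the text.
    return x**5 - 2 * x**3 - 2

def slope(x):
    # Analytic derivative of the error function.
    return 5 * x**4 - 6 * x**2

def numerical_slope(x, h=1e-6):
    # Central finite difference: (f(x + h) - f(x - h)) / (2h).
    return (error(x + h) - error(x - h)) / (2 * h)

for x in (0.5, 1.0, 1.2):
    print("x=%.1f analytic=%.6f numerical=%.6f" % (x, slope(x), numerical_slope(x)))

# For x > 0 the slope vanishes where 5x^4 = 6x^2, i.e. at x = sqrt(6/5).
x_min = math.sqrt(6 / 5)
print("slope is zero near x = %.4f" % x_min)
```

The slope is negative at x = 1.0 and positive at x = 1.2, and it vanishes at sqrt(6/5), roughly 1.095, which is consistent with the minimum near X = 1.1 mentioned above.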

If you need to brush up on derivatives, watch the Khan Academy videos.

Below we implement Debye's algorithm in Python:

Python

    current_x = 0.5  # the algorithm starts at x=0.5
    learning_rate = 0.01  # step size multiplier
    num_iterations = 60  # the number of times to train the function

    # the derivative of the error function (x**4 = the power of 4 or x^4)
    def slope_at_given_x_value(x):
        return 5 * x**4 - 6 * x**2

    # Move X to the right or left depending on the slope of the error function
    for i in range(num_iterations):
        previous_x = current_x
        current_x += -learning_rate * slope_at_given_x_value(previous_x)
        print(previous_x)

    print("The local minimum occurs at %f" % current_x)

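The trick here is learning_rate. We approach the lowest point by moving in the direction opposite to the slope. In addition, the closer we get to the minimum, the smaller the slope becomes, so each step shrinks as the slope approaches zero.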
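To see how much learning_rate matters, the following sketch reruns the same descent from x = 0.5 for 60 iterations with three different rates and reports how far each run ends from the point where the slope is zero. The `descend` helper and the particular rates are assumptions made for this illustration.

```python
import math

def slope_at_given_x_value(x):
    return 5 * x**4 - 6 * x**2

def descend(start_x, learning_rate, num_iterations):
    # Plain gradient descent: step against the slope on every iteration.
    x = start_x
    for _ in range(num_iterations):
        x -= learning_rate * slope_at_given_x_value(x)
    return x

x_min = math.sqrt(6 / 5)  # where the slope is zero for x > 0

results = {}
for rate in (0.001, 0.01, 0.1):
    results[rate] = descend(0.5, rate, 60)
    print("learning_rate=%s -> x=%.4f (distance to minimum %.4f)"
          % (rate, results[rate], abs(results[rate] - x_min)))
```

With the smallest rate the run has barely moved after 60 iterations, while the larger rates end up much closer. Rates that are too large can overshoot the minimum instead of settling on it, so this is something to tune rather than simply maximize.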

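num_iterations is your estimate of how many iterations are needed before reaching the minimum. Experimenting with this parameter is a good way to build your intuition for gradient descent.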
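Instead of guessing num_iterations up front, one common variation is to stop once the updates become very small. The sketch below is an assumption-laden illustration of that idea: the `tolerance` and `max_iterations` parameters and the function name are hypothetical and do not appear in the original code.

```python
def slope_at_given_x_value(x):
    return 5 * x**4 - 6 * x**2

def descend_until_converged(start_x, learning_rate, tolerance=1e-6, max_iterations=10000):
    # Stop once the update is smaller than `tolerance`, or after
    # `max_iterations` steps, whichever comes first.
    x = start_x
    for iteration in range(1, max_iterations + 1):
        step = learning_rate * slope_at_given_x_value(x)
        x -= step
        if abs(step) < tolerance:
            return x, iteration
    return x, max_iterations

x_found, iterations_used = descend_until_converged(0.5, 0.01)
print("Stopped at x=%.5f after %d iterations" % (x_found, iterations_used))
```

The iteration count it reports gives a rough sense of how large num_iterations needs to be for a given learning rate and starting point.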

Linear regression

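Combining least squares with gradient descent gives you a complete linear regression. In the 1950s and 1960s, a group of experimental economists implemented these ideas on early computers. The logic was implemented on physical punch cards, which were truly handmade software programs. It took several days just to prepare the punch cards, and up to 24 hours to run one regression analysis through the computer.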

Here is an example of linear regression implemented in Python (so we don't have to do it on punch cards):

 
Python

    # Price of wheat/kg and the average price of bread
    wheat_and_bread = [[0.5, 5], [0.6, 5.5], [0.8, 6], [1.1, 6.8], [1.4, 7]]

    def step_gradient(b_current, m_current, points, learningRate):
        b_gradient = 0
        m_gradient = 0
        N = float(len(points))
        for i in range(0, len(points)):
            x = points[i][0]
            y = points[i][1]
            b_gradient += -(2/N) * (y - ((m_current * x) + b_current))
            m_gradient += -(2/N) * x * (y - ((m_current * x) + b_current))
        new_b = b_current - (learningRate * b_gradient)
        new_m = m_current - (learningRate * m_gradient)
        return [new_b, new_m]

    def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
        b = starting_b
        m = starting_m
        for i in range(num_iterations):
            b, m = step_gradient(b, m, points, learning_rate)
        return [b, m]

    gradient_descent_runner(wheat_and_bread, 1, 1, 0.01, 100)
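As a cross-check on the gradient descent result, the same line can be fitted in closed form with the standard least-squares formulas m = Sxy / Sxx and b = mean(y) - m * mean(x). This is a swapped-in technique shown only for comparison, not the method used in the code above, and the function name `fit_line_closed_form` is hypothetical.

```python
def fit_line_closed_form(points):
    # Ordinary least squares for a single input variable.
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sxy: sum of products of deviations; Sxx: sum of squared x deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    m = s_xy / s_xx
    b = mean_y - m * mean_x
    return b, m

# Same data as in the gradient descent example.
wheat_and_bread = [[0.5, 5], [0.6, 5.5], [0.8, 6], [1.1, 6.8], [1.4, 7]]
b, m = fit_line_closed_form(wheat_and_bread)
print("b=%.3f, m=%.3f" % (b, m))
```

A gradient descent run with only 100 iterations and a learning rate of 0.01 is unlikely to have fully converged to these values. Running it for more iterations should move b and m toward them.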