Cost Function
$$ J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 $$
m : number of training examples.
hθ(x) : hypothesis, hθ(x) = θ0 + θ1x.
y : output (target) value.
Parameters: θ0, θ1, the values to be learned.
The MATLAB implementation:
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y
    m = length(y); % number of training examples
    J = sum((X * theta - y) .^ 2) / (2 * m);
end
The Python implementation:
import numpy as np

def computeCost(X, y, theta):
    """Cost function for linear regression.

    Parameters
    ----------
    X : np.ndarray, shape (m, 2), e.g. (49, 2)
    y : np.ndarray, shape (m, 1), e.g. (49, 1)
    theta : np.ndarray, shape (2, 1)

    Returns
    -------
    J : float, cost
    """
    m = len(y)
    # np.dot(A, B) is the matrix product; A * B and A ** 2 are element-wise
    # (the opposite of MATLAB's *, ^ vs .*, .^ convention).
    J = np.sum((np.dot(X, theta) - y.reshape(m, 1)) ** 2) / (2.0 * m)
    return J
Note that NumPy's conventions are the opposite of MATLAB's: * and ** on arrays are element-wise, and matrix multiplication requires np.dot, whereas in MATLAB * and ^ are matrix operations and .* and .^ are element-wise.
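As a quick sanity check, here is a hypothetical usage sketch (the data and numbers below are made up for illustration): a column of ones is prepended to X so that θ0 acts as the intercept, and a perfect fit drives the cost to zero.

# Hypothetical toy data with y = 1 + 2x exactly.
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0]).reshape(-1, 1)
X = np.column_stack([np.ones(len(x)), x])  # prepend the intercept column

print(computeCost(X, y, np.array([[1.0], [2.0]])))  # perfect fit -> 0.0
print(computeCost(X, y, np.zeros((2, 1))))          # (9 + 25 + 49) / 6 ≈ 13.83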
Gradient Descent
$$ \text{repeat until convergence: } \left\{ \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) \right\} \quad (j = 0, 1) $$
α : learning rate.
∂/∂θj : partial derivative of J with respect to θj.
Note that θ0 and θ1 are updated simultaneously: both partial derivatives are evaluated at the current θ before either parameter is overwritten, which the vectorized implementations below do automatically.
The MATLAB implementation:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha
    m = length(y);                   % number of training examples
    J_history = zeros(num_iters, 1); % cost after each iteration
    for iter = 1:num_iters
        % Vectorized update: theta := theta - (alpha / m) * X' * (X*theta - y)
        theta = theta - X' * (X * theta - y) * (alpha / m);
        J_history(iter) = computeCost(X, y, theta);
    end
end
The Python implementation:
def gradientDescent(X, y, theta, alpha, num_iters):
    """Gradient descent for (multivariate) linear regression.

    Parameters
    ----------
    X : np.ndarray, shape (m, 2), e.g. (49, 2)
    y : np.ndarray, shape (m, 1), e.g. (49, 1)
    theta : np.ndarray, shape (2, 1), initial parameters
    alpha : float, learning rate
    num_iters : int, number of iterations

    Returns
    -------
    tuple (J_history, theta)
        J_history : np.ndarray, shape (num_iters, 1), cost after each step
        theta : np.ndarray, shape (2, 1), parameters after the final step
    """
    m = len(y)
    y = y.reshape(m, 1)
    J_history = np.zeros((num_iters, 1))
    for n_iter in range(num_iters):
        # Vectorized update: theta := theta - (alpha / m) * X^T (X*theta - y)
        theta = theta - np.dot(X.T, np.dot(X, theta) - y) * alpha / m
        J_history[n_iter, 0] = computeCost(X, y, theta)
    return J_history, theta
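A minimal sketch of how the two functions fit together, reusing the hypothetical toy data from the computeCost example above (alpha = 0.1 and num_iters = 1500 are illustrative, untuned choices):

theta0 = np.zeros((2, 1))
J_history, theta = gradientDescent(X, y, theta0, alpha=0.1, num_iters=1500)
print(theta.ravel())     # approaches [1, 2] on the toy data
print(J_history[-1, 0])  # cost decreases toward 0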
Gradient Descent Formulas for Linear Regression
$$ \theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) $$
$$ \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_1^{(i)} $$
Similarly, for a general parameter θj:
$$ \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} $$
In (multivariate) linear regression the code operates on matrices directly, so the implementation above carries over unchanged.
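Explicitly, stacking the per-parameter rules gives the single matrix update that both code listings implement (this restatement follows directly from the component-wise formulas above):

$$ \theta := \theta - \frac{\alpha}{m} X^{T} (X\theta - y) $$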