Locally weighted linear regression

1. Guide

  [Figure: three panels showing fits to the same dataset — a linear fit (left), a quadratic fit (middle), and a 5th-order polynomial fit (right)]

  The leftmost figure shows the result of fitting y = θ0 + θ1x1 to a dataset. We see that the data doesn’t really lie on a straight line, so the fit is not very good. This is called underfitting: with only one feature, the model is too simple to capture the structure in the data.

  So we add an extra feature x1^2 and fit y = θ0 + θ1x1 + θ2x2, where x2 = x1^2. The middle figure shows a better fit.

  The rightmost figure is the result of fitting a 5th-order polynomial. Even though the curve passes through every data point, we would not expect it to predict well on new inputs. This is called overfitting: there are too many features relative to the amount of data.

  As discussed previously, and as shown in the example above, the choice of features is important to ensuring good performance of a learning algorithm.
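
  To make the comparison concrete, here is a minimal sketch (in Python with numpy/matplotlib, the same tools as the code examples at the end of this post) that fits polynomials of degree 1, 2, and 5, mirroring the three figures. The dataset behind the original figures isn’t available, so the data below is synthetic and purely illustrative.

```
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the dataset in the figures (assumption: the
# original data is not available, so we generate a small noisy curve).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 3, 12))
y = 1.0 + 0.8 * x + 0.4 * np.sin(2 * x) + rng.normal(0, 0.1, len(x))

xs = np.linspace(x.min(), x.max(), 200)
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, degree, title in zip(axes, [1, 2, 5],
                             ["degree 1 (underfit)", "degree 2", "degree 5 (overfit)"]):
    coeffs = np.polyfit(x, y, degree)        # least-squares polynomial fit
    ax.scatter(x, y)
    ax.plot(xs, np.polyval(coeffs, xs), 'r')
    ax.set_title(title)
plt.show()
```

  The degree-1 fit misses the curvature, while the degree-5 fit chases the noise.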

 

2. LWR

  Locally weighted linear regression (LWR), assuming there is sufficient training data, makes the choice of features less critical.

  In the original linear regression algorithm, to make a prediction at a query point x (i.e., to evaluate h(x)), we would:

  a. Fit θ to minimize ∑i(y(i) − θT x(i))2.

  b. Output θT x.

  In contrast, the locally weighted linear regression algorithm does the following:

  a. Fit θ to minimize ∑iw(i)(y(i) − θT x(i))2.

  b. Output θT x.

  Here, the w(i)’s are non-negative valued weights. Intuitively, if w(i) is large for a particular value of i, then in picking θ, we’ll try hard to make (y(i) − θT x(i))2 small. If w(i) is small, then the (y(i) − θT x(i))2 error term will be pretty much ignored in the fit.
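
  As a rough sketch of what steps (a) and (b) amount to in code: the weighted objective has the closed-form solution θ = (XᵀWX)⁻¹XᵀWy, where W is the diagonal matrix of the w(i)’s. The X, y, and w values below are made-up placeholders for illustration, not data from the post.

```
import numpy as np

# Step (a): fit theta by weighted least squares via the normal equations
# theta = (X^T W X)^{-1} X^T W y. Values here are illustrative only.
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]])  # rows are [1, x1]
y = np.array([1.1, 2.0, 2.8, 4.2])
w = np.array([0.9, 0.7, 0.2, 0.05])   # large w(i) -> that example's error matters more

W = np.diag(w)
theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # step (a)
x_query = np.array([1.0, 1.0])
print(theta, x_query @ theta)                        # step (b): output theta^T x
```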

  A fairly standard choice for the weights is:

                          w(i) = exp(−(x(i) − x)2/(2τ2))

  ps. If x is vector-valued, this is generalized to w(i) = exp(−(x(i) − x)T (x(i) − x)/(2τ2)), or w(i) = exp(−(x(i) − x)T Σ−1 (x(i) − x)/2), for an appropriate choice of τ or Σ.

  Note that the weights depend on the particular point x at which we’re trying to evaluate h(x). Moreover, if |x(i) − x| is small, then w(i) is close to 1; and if |x(i) − x| is large, then w(i) is small, so the (y(i) − θT x(i))2 error term for that example is pretty much ignored in the fit. Hence, θ is chosen giving a much higher “weight” to the (errors on) training examples close to the query point x. (Note also that while the formula for the weights takes a form that is cosmetically similar to the density of a Gaussian distribution, the w(i)’s do not directly have anything to do with Gaussians, and in particular the w(i) are not random variables, normally distributed or otherwise.)

  The parameter τ controls how quickly the weight of a training example falls off with distance of its x(i) from the query point x; τ is called the bandwidth parameter.
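
  The small sketch below (illustrative numbers, not from the post) prints the weight of each training input for a few bandwidths, showing how τ controls the falloff.

```
import numpy as np

# How w(i) = exp(-(x(i) - x)^2 / (2 tau^2)) falls off with distance from the
# query point, for several bandwidths tau. Values are illustrative only.
x_query = 0.0
x_train = np.array([0.0, 0.5, 1.0, 2.0, 4.0])

for tau in [0.3, 1.0, 3.0]:
    w = np.exp(-(x_train - x_query) ** 2 / (2 * tau ** 2))
    print(f"tau={tau}:", np.round(w, 3))
# Small tau -> weights drop to ~0 quickly (a very local fit);
# large tau -> distant points still get substantial weight (closer to plain OLS).
```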

  Locally weighted linear regression is the first example we’re seeing of a non-parametric algorithm. The (unweighted) linear regression algorithm we saw earlier is known as a parametric learning algorithm, because it has a fixed, finite number of parameters (the θi’s), which are fit to the data. Once we’ve fit the θi’s and stored them away, we no longer need to keep the training data around to make future predictions. In contrast, to make predictions using locally weighted linear regression, we need to keep the entire training set around. The term “non-parametric” (roughly) refers to the fact that the amount of stuff we need to keep in order to represent the hypothesis h grows linearly with the size of the training set (we have to store the whole dataset).

    

Reposted from: https://www.cnblogs.com/ustccjw/archive/2013/04/13/3017815.html

Below are Python code examples for standard linear regression, locally weighted linear regression, and ridge regression; pick whichever fits your needs.

Standard linear regression:

```
import numpy as np
import matplotlib.pyplot as plt

# Load data
data = np.loadtxt('linearRegression_data.txt', delimiter=',')
x = data[:, 0]
y = data[:, 1]

# Build linear regression model via the normal equations
X = np.vstack((np.ones(len(x)), x)).T
theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

# Plot data and model
plt.scatter(x, y)
plt.plot(x, X.dot(theta), 'r')
plt.show()
```

Locally weighted linear regression:

```
import numpy as np
import matplotlib.pyplot as plt

# Load data
data = np.loadtxt('linearRegression_data.txt', delimiter=',')
x = data[:, 0]
y = data[:, 1]

# Design matrix with a bias column
X = np.vstack((np.ones(len(x)), x)).T

# Build locally weighted linear regression model
def lwlr(test_point, X, y, k=1.0):
    """Weighted least-squares fit centered on test_point; returns the prediction there."""
    m = X.shape[0]
    weights = np.eye(m)
    for j in range(m):
        diff = test_point - X[j, :]
        weights[j, j] = np.exp(diff.dot(diff) / (-2.0 * k ** 2))
    xTwx = X.T.dot(weights).dot(X)
    if np.linalg.det(xTwx) == 0.0:
        print("This matrix is singular, cannot do inverse")
        return None
    theta = np.linalg.inv(xTwx).dot(X.T).dot(weights).dot(y)
    return test_point.dot(theta)

# Plot data and model (sort by x so the prediction curve is drawn left to right)
y_predict = np.array([lwlr(X[i], X, y, k=0.03) for i in range(len(x))])
order = np.argsort(x)
plt.scatter(x, y)
plt.plot(x[order], y_predict[order], 'r')
plt.show()
```

Ridge regression:

```
import numpy as np
import matplotlib.pyplot as plt

# Load data
data = np.loadtxt('linearRegression_data.txt', delimiter=',')
x = data[:, 0]
y = data[:, 1]

# Build ridge regression model: theta = (X^T X + alpha I)^{-1} X^T y
def ridge_regression(X, y, alpha):
    XtX = X.T.dot(X)
    return np.linalg.inv(XtX + alpha * np.eye(X.shape[1])).dot(X.T).dot(y)

# Plot data and model
X = np.vstack((np.ones(len(x)), x)).T
theta = ridge_regression(X, y, alpha=0.5)
plt.scatter(x, y)
plt.plot(x, X.dot(theta), 'r')
plt.show()
```