迭代加权最小二乘

最新推荐文章于 2022-11-16 23:18:57 发布

bebr

最新推荐文章于 2022-11-16 23:18:57 发布

阅读量1.2w

点赞数 3

分类专栏：机器学习

机器学习专栏收录该内容

74 篇文章 5 订阅

订阅专栏

实现IRLS算法用于logistic regression，并且画出来scatter。

先回顾一下IRLS算法，IRLS是iterative reweighted least squares，和OSL相比起来，多了两个单词iterative 和 reweighted。先说一下iterative

为什么要iterative呢？书上(Pattern Recognition and Machine learning by Bishop)的原话是"For logistic regression, there is no longer a closed-form solution, due to the nonlinearity of the logistic sigmoid function", 也就是说这里不像OLS那样一步到位，而是一种online renew的方式去求w。具体的公式如下(略大。。)：

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

R是什么呢？R是一个n by n 的matrix(n is the size of your dataset)， what is more，R is a diagonal matrix，对角元素Rii = yi*(1-yi).

注意这里的y是每次更新出来的output，y=W*X+W0。这就是我要强调的第二点，reweighted。之所以是reweighted，是因为对角元素在每次更新的时候都是要变化的。

好了，废话不多说了，下面开始码了：

首先定义一个irls的方法：

maxiter是最大的循环次数，当然如果前后两次的W差值低于我们的threshold，就会break这个loop

def IRLS(y, X, maxiter, w_init=1, d=0.0001, tolerance=0.001):
n, p = X.shape
# 生成数组，其中repeat函数生成n个d，reshape函数将一维数组变成二维数组，
# 一行n列的数组
delta = array(repeat(d, n)).reshape(1, n)
# w是n个1的数组
w = repeat(1, n)
# W是对角线上为w的对角矩阵
W = diag(w)
z = inv(W).dot(y)
B = dot(inv(X.T.dot(W).dot(X)),
(X.T.dot(W).dot(z)))
for _ in range(maxiter):
_B = B
_w = abs(y - X.dot(B)).T
# w = float(1) / maximum(delta, _w)
tmpx = X.dot(B)

tmpxx = tmpx * (1 - tmpx)
tmpxxx = tmpxx.reshape(1, 99)
W = diag(tmpxxx[0])
z = X.dot(B) - inv(W).dot(X.dot(B) - y)
B = dot(inv(X.T.dot(W).dot(X)),
(X.T.dot(W).dot(z)))
tol = sum(abs(B - _B))
print("Tolerance = %s" % tol)
if tol < tolerance:
return B
return B
这样我们就可以获得了W，也就是上面的B（上面代码中的W其实是weight矩阵），这里的B是beta矩阵的意思。
接下来我们获得intercept，很简单：
abs_error = sum(Y - X.dot(B)) / Y.shape[0]
(abs不是absolute的意思，只是我随意起的，没有任何意思)
好了，接下来看疗效(画scatter)：
借鉴并且修改了一下python machine learning 这本书上的源码（btw，这本书的作者，一个德国人在MSU上学，add我linkedin了。。）
如下所示：
def plot_decision_regions(XX, YY, B, abs_error,
test_idx=None, resolution=0.02):
# setup marker generator and color map
markers = ('s', 'o', 'x', '^', 'v')
colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
cmap = ListedColormap(colors[:len(np.unique(YY))])
y = []
# tmpy = YY[:,1].split('\n')
for elem in np.transpose(YY)[0, :]:
y.append(elem)
# plot the decision surface
# X = np.mat(XX)

# ytt=YY[:,np.newaxis]
# y = np.transpose(ytt)
# y = np.mat(YY)
# x1_min =min(X[:, 0])-1
x1_min, x1_max = min(X[:, 0]) - 1, max(X[:, 0]) + 1
x2_min, x2_max = min(X[:, 1]) - 1, max(X[:, 1]) + 1
xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution), np.arange(x2_min, x2_max, resolution))

# Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
myx = np.array([xx1.ravel(), xx2.ravel()]).T
print(myx.shape)
Z = myx.dot(B) + abs_error

myztmp = []
for tmp in Z:
if tmp > 0.5:
myztmp.append(1)
else:
myztmp.append(0)
myztmp = np.mat(myztmp)
Z = myztmp.reshape(xx1.shape)
plt.contourf(xx1, xx2, Z, alpha=0.4, cmap=cmap)
plt.xlim(xx1.min(), xx1.max())
plt.ylim(xx2.min(), xx2.max())
# plot all samples
for idx, cl in enumerate(np.unique(y)):
x_ = XX[y == cl, 0]
y_ = XX[y == cl, 1]
plt.scatter(x_, y_, alpha=0.8, c=cmap(idx), marker=markers[idx], label=cl)
# plt.scatter(1, -4.5, alpha=0.8, c=cmap(idx), marker=markers[idx], label=cl)
因为他的这个类本身是有一个argument是classifier，然而我只是定义了一个def，好尴尬。。那么我只能修改其中的Z，
Z = myx.dot(B) + abs_error
本身是人家sklearn库里面LogisticRegression这个class的predict方法的，被我annotation了。。
# Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
然后诡异的是，这样生成的Z居然不是integer，而是double，所以还是自己搞了一套，强行转化，原理当然是sigmoid方程啦，这里不再赘述。
myztmp = []
for tmp in Z:
if tmp > 0.5:
myztmp.append(1)
else:
myztmp.append(0)
好了，看看图吧：
注意看红色的点，蓝色的圈分别代表两个类，linear 的boundary也有了，嗯，还说的过去吧~~~