目标函数:
求导并进行向量化:
编程实现
生成具有两个特征的测试用例
#生成具有两个特征的测试用例
In [87]: import numpy as np
...: import matplotlib.pyplot as plt
In [88]: X = np.empty((100,2))
...: X[:,0] = np.random.uniform(0., 100., size=100)
...: X[:,1] = 0.75 * X[:,0] + 3. + np.random.normal(0,10.,size=100)
demean均值化0
#X减去一个向量,向量是矩阵的每一列即每个特征的均值
In [92]: def demean(X):
...: return X - np.mean(X,axis=0)
梯度上升法(定义函数)
In [99]: def f(w,X):
...: return np.sum((X.dot(w)**2)) / len(X)
In [100]: def df_math(w,X):
...: return X.T.dot(X.dot(w)) * 2. / len(X)
In [101]: def df_debug(w,X,epsilon=0.0001):
...: res = np.empty(len(w))
...: for i in range(len(w)):
...: w_1 = w.copy()
...: w_1[i] += epsilon
...: w_2 = w.copy()
...: w_2[i] -= epsilon
...: res[i] = (f(w_1, X) - f(w_2, X,)) / (2*epsilon)
...: return res
#方向向量w单位化
In [102]: def direction(w):
...: return w/np.linalg.norm(w)
...: def gradient_ascent(df, X, initial_w, eta, n_iters=1e4, epsilon=1e-8):
...: w = direction(initial_w)
...: cur_iter = 0
...: while cur_iter < n_iters:
...: gradient = df(w, X)
...: last_w = w
...: w = w + eta * gradient
...: w = direction(initial_w)
...: if (abs(f(w, X) - f(last_w, X)) < epsilon):
...: break
...: cur_iter += 1
...: return w
调用定义的函数
In [103]: initial_w = np.random.random(X.shape[1]) #注意不能从0向量开始
In [105]: eta = 0.001
In [106]: #注意3:不能使用StandardScaler标准化数据,标准化后方差就为1,就不存在求最大方差
In [107]: w = gradient_ascent(df_debug,X_demean,initial_w,eta)
Out[107]: array([0.49282994, 0.87012565])
In [108]: w = gradient_ascent(df_math,X_demean,initial_w,eta)
Out[108]: array([0.49282994, 0.87012565])
绘制求得的轴w
In [113]: plt.scatter(X_demean[:,0],X_demean[:,1])
...: plt.plot([0,w[0]*30],[0,w[1]*30],color='r')