I typed up the logistic regression algorithm in Octave following Stanford's machine learning course, then ran into all sorts of problems porting it to Python... I felt my grasp of vectorization still wasn't thorough enough, so I re-derived the whole thing.
The dataset is from the course assignment: the training data contains 100 examples, with two features (the scores on two exams) and one label (admitted or not).
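Each row of the file is comma-separated in the form exam1_score,exam2_score,label. For illustration only (made-up values in the same format, not quoted from the actual file):
34.62,78.05,0
60.18,86.31,1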
import numpy as np

# Load the dataset (path is wherever the course data file lives)
data = np.loadtxt(path, dtype=float, encoding='utf-8', delimiter=',')
X = data[:, 0:2]  # 2-D array, 100x2
y = data[:, 2]  # 1-D array of 100 elements
y needs to be converted into a 100x1 column vector; there are several ways to do this:
#y = np.c_[data[:, 2]]  # np.c_ stacks the 1-D slice as a column: 100x1
y = data[:, 2].reshape(-1, 1)  # 100x1
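For completeness, np.newaxis gives another equivalent spelling:
y = data[:, 2][:, np.newaxis]  # also 100x1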
Logistic regression hypothesis: $h_\theta(x) = g(\theta^T x)$, where $g$ is the sigmoid function $g(z) = \frac{1}{1 + e^{-z}}$.
# Sigmoid function, i.e. h(x; theta) once applied to X.dot(theta)
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
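Since np.exp is element-wise, sigmoid accepts whole arrays at once; this is what the vectorized formulas below rely on:
print(sigmoid(0))  # 0.5
print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # [4.54e-05, 0.5, 0.9999546]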
Shapes to keep in mind: X is 100x3 (after adding the intercept column x0 = 1), theta is 1x3, and X * theta' (Octave notation; X.dot(theta.T) in NumPy) is 100x1.
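The intercept column and the starting theta never appear explicitly in the snippets here, so this is a minimal sketch of that setup (initializing to zeros is an assumption, matching the course exercise; initial_theta is the name used in the minimize call further down):
m = y.size
X = np.c_[np.ones(m), X]  # prepend the x0 = 1 column: 100x2 -> 100x3
initial_theta = np.zeros(X.shape[1])  # 1-D array of three zeros (assumed starting point)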
Cost function:
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$
Vectorized, with $h = g(X\theta)$:
$J(\theta) = -\frac{1}{m} \left( y^T \log(h) + (1 - y)^T \log(1 - h) \right)$
# Cost function
def costFunction(theta, X, y):
    m = y.size  # 100
    h = sigmoid(X.dot(theta.reshape(-1, 1)))  # 100x1
    J = -1 * (1 / m) * (np.log(h).T.dot(y) + np.log(1 - h).T.dot(1 - y))  # 1x1
    if np.isnan(J[0]):  # log(0) terms can produce nan; report inf instead
        return np.inf
    return J[0]
Each dot product here is just an element-wise multiplication of the corresponding rows of the two column vectors followed by a sum, which is exactly the summation in the unvectorized formula.
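A quick sanity check of the vectorized cost, assuming the zero initialization sketched above: with theta = 0 we get h = 0.5 for every example, so the cost must come out to ln 2 ≈ 0.693 no matter what the data is:
print(costFunction(initial_theta, X, y))  # ~0.6931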
Gradient, i.e. the partial derivative of the cost with respect to $\theta_j$:
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
Vectorized:
$\nabla J(\theta) = \frac{1}{m} (h - y)^T X$
# Compute the gradient
def compute_grad(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta.reshape(-1, 1)))  # 100x1
    grad = (1 / m) * ((h - y).T.dot(X))  # should be a 1x3 row vector: one value per theta
    # At first this printed as a 100x3 matrix instead: y was still the 1-D array
    # from loadtxt, so h - y broadcast to 100x100. That is why y had to be
    # reshaped into a 100x1 column vector at the start.
    return grad  # 1x3
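That broadcasting pitfall in isolation:
h = np.zeros((100, 1))  # column vector, like sigmoid(X.dot(theta.reshape(-1, 1)))
y_row = np.zeros(100)  # 1-D array, the shape np.loadtxt hands back
print((h - y_row).shape)  # (100, 100): silent broadcasting, not an element-wise subtraction
print((h - y_row.reshape(-1, 1)).shape)  # (100, 1): what we actually want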
Gradient step: the goal is to find the minimum of costFunction and the theta at which it is attained. Rather than hand-rolling gradient descent, we use the minimize function from scipy.optimize (with no bounds or constraints it defaults to the BFGS method, which is why the result below contains a hess_inv field).
from scipy.optimize import minimize  # unconstrained minimization of the cost function
res = minimize(costFunction, initial_theta, args=(X, y),
               jac=compute_grad, options={'maxiter': 400})
#test1LogisticRegression.py:43: RuntimeWarning: divide by zero encountered in log
#  J = -1 * (1 / m) * (np.log(h).T.dot(y) + np.log(1 - h).T.dot(1 - y))#1x1
The warning is expected: when h saturates to exactly 0 or 1 during the search, log() returns -inf, the subsequent multiplication by 0 yields nan, and the np.isnan guard in costFunction turns that into np.inf so the optimizer keeps going. Printing res gives:
# fun: 0.20349770158950983
# hess_inv: array([[ 2.85339493e+03, -2.32908823e+01, -2.27416470e+01],
# [-2.32908823e+01, 2.04489131e-01, 1.72969525e-01],
# [-2.27416470e+01, 1.72969525e-01, 1.96170322e-01]])
# jac: array([-2.68557634e-09, 4.36433479e-07, -1.39671758e-06])
# message: 'Optimization terminated successfully.'
# nfev: 34
# nit: 25
# njev: 30
# status: 0
# success: True
# x: array([-25.16131634, 0.2062316 , 0.20147143])
The final theta:
[-25.16131634, 0.2062316, 0.20147143]
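These three values define the decision boundary $\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0$, i.e. the line $x_2 = -(\theta_0 + \theta_1 x_1) / \theta_2$ in score space; a student whose scores put $\theta^T x$ above zero is predicted to be admitted.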
Prediction function:
def predict(theta, X, threshold=0.5):
    p = sigmoid(X.dot(theta.T)) >= threshold
    return p.astype('int')
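A usage sketch: measuring training accuracy with the fitted res.x (variable names follow the code above; the exact percentage depends on the fit):
p = predict(res.x, X)  # 1-D array of 0/1 predictions
accuracy = 100 * np.mean(p == y.ravel())  # compare against the labels flattened back to 1-D
print('Train accuracy: {:.1f}%'.format(accuracy))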
Predict for a student with scores [30, 43]:
pre = sigmoid(np.array([1, 30, 43]).dot(res.x.T))
3.326083521580104e-05
#approximately zero, so this student is predicted not to be admitted
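For contrast, the higher-scoring example used in the course exercise, scores [45, 85], lands on the other side of the boundary; with the theta above, $\theta^T x \approx 1.24$:
pre = sigmoid(np.array([1, 45, 85]).dot(res.x.T))  # ~0.776, predicted to be admitted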