I typed up the logistic regression algorithm in Octave following Stanford's machine learning course, then ran into all sorts of problems porting it to Python... I felt my grasp of vectorization still wasn't thorough enough, so I re-derived the whole thing.
The dataset is from the course assignment: the training data contains 100 examples, with two features (the scores on two exams) and one label (admitted or not).
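Each row of the file is comma-separated in the form exam1_score,exam2_score,label. For illustration only (made-up values in the same format, not quoted from the actual file):
34.62,78.05,0
60.18,86.31,1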
import numpy as np

# Load the dataset (path is wherever the course data file lives)
data = np.loadtxt(path, dtype=float, encoding='utf-8', delimiter=',')
X = data[:, 0:2]  # 2-D array, 100x2
y = data[:, 2]  # 1-D array of 100 elements
y needs to be converted into a 100x1 column vector; there are several ways to do this:
#y = np.c_[data[:, 2]]  # np.c_ stacks the 1-D slice as a column: 100x1
y = data[:, 2].reshape(-1, 1)  # 100x1
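For completeness, np.newaxis gives another equivalent spelling:
y = data[:, 2][:, np.newaxis]  # also 100x1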
Logistic regression hypothesis: $h_\theta(x) = g(\theta^T x)$, where $g$ is the sigmoid function $g(z) = \frac{1}{1 + e^{-z}}$.
# Sigmoid function, i.e. h(x; theta) once applied to X.dot(theta)
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
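Since np.exp is element-wise, sigmoid accepts whole arrays at once; this is what the vectorized formulas below rely on:
print(sigmoid(0))  # 0.5
print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # [4.54e-05, 0.5, 0.9999546]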
Shapes to keep in mind: X is 100x3 (after adding the intercept column x0 = 1), theta is 1x3, and X * theta' (Octave notation; X.dot(theta.T) in NumPy) is 100x1.
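The intercept column and the starting theta never appear explicitly in the snippets here, so this is a minimal sketch of that setup (initializing to zeros is an assumption, matching the course exercise; initial_theta is the name used in the minimize call further down):
m = y.size
X = np.c_[np.ones(m), X]  # prepend the x0 = 1 column: 100x2 -> 100x3
initial_theta = np.zeros(X.shape[1])  # 1-D array of three zeros (assumed starting point)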
Cost function:
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$
Vectorized, with $h = g(X\theta)$:
$J(\theta) = -\frac{1}{m} \left( y^T \log(h) + (1 - y)^T \log(1 - h) \right)$
# Cost function
def costFunction(theta, X, y):
    m = y.size  # 100
    h = sigmoid(X.dot(theta.reshape(-1, 1)))  # 100x1
    J = -1 * (1 / m) * (np.log(h).T.dot(y) + np.log(1 - h).T.dot(1 - y))  # 1x1
    if np.isnan(J[0]):  # log(0) terms can produce nan; report inf instead
        return np.inf
    return J[0]
Each dot product here is just an element-wise multiplication of the corresponding rows of the two column vectors followed by a sum, which is exactly the summation in the unvectorized formula.
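A quick sanity check of the vectorized cost, assuming the zero initialization sketched above: with theta = 0 we get h = 0.5 for every example, so the cost must come out to ln 2 ≈ 0.693 no matter what the data is:
print(costFunction(initial_theta, X, y))  # ~0.6931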
Gradient, i.e. the partial derivative of the cost with respect to $\theta_j$:
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
Vectorized:
$\nabla J(\theta) = \frac{1}{m} (h - y)^T X$
# Compute the gradient
def compute_grad(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta.reshape(-1, 1)))  # 100x1
    grad = (1 / m) * ((h - y).T.dot(X))  # should be a 1x3 row vector: one value per theta
    # At first this printed as a 100x3 matrix instead: y was still the 1-D array
    # from loadtxt, so h - y broadcast to 100x100. That is why y had to be
    # reshaped into a 100x1 column vector at the start.
    return grad  # 1x3
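That broadcasting pitfall in isolation:
h = np.zeros((100, 1))  # column vector, like sigmoid(X.dot(theta.reshape(-1, 1)))
y_row = np.zeros(100)  # 1-D array, the shape np.loadtxt hands back
print((h - y_row).shape)  # (100, 100): silent broadcasting, not an element-wise subtraction
print((h - y_row.reshape(-1, 1)).shape)  # (100, 1): what we actually want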
Gradient step: the goal is to find the minimum of costFunction and the theta at which it is attained. Rather than hand-rolling gradient descent, we use the minimize function from scipy.optimize (with no bounds or constraints it defaults to the BFGS method, which is why the result below contains a hess_inv field).
from scipy.optimize import minimize  # unconstrained minimization of the cost function
res = minimize(costFunction, initial_theta, args=(X, y),
               jac=compute_grad, options={'maxiter': 400})
#test1LogisticRegression.py:43: RuntimeWarning: divide by zero encountered in log
#  J = -1 * (1 / m) * (np.log(h).T.dot(y) + np.log(1 - h).T.dot(1 - y))#1x1
The warning is expected: when h saturates to exactly 0 or 1 during the search, log() returns -inf, the subsequent multiplication by 0 yields nan, and the np.isnan guard in costFunction turns that into np.inf so the optimizer keeps going. Printing res gives:
# fun: 0.20349770158950983
# hess_inv: array([[ 2.85339493e+03, -2.32908823e+01, -2.27416470e+01],
# [-2.32908823e+01, 2.04489131e-01, 1.72969525e-01],
# [-2.27416470e+01, 1.72969525e-01, 1.96170322e-01]])
# jac: array([-2.68557634e-09, 4.36433479e-07, -1.39671758e-06])
# message: 'Optimization terminated successfully.'
# nfev: 34
# nit: 25
# njev: 30
# status: 0
# success: True
# x: array([-25.16131634, 0.2062316 , 0.20147143])
The final theta:
[-25.16131634, 0.2062316, 0.20147143]
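These three values define the decision boundary $\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0$, i.e. the line $x_2 = -(\theta_0 + \theta_1 x_1) / \theta_2$ in score space; a student whose scores put $\theta^T x$ above zero is predicted to be admitted.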
Prediction function:
def predict(theta, X, threshold=0.5):
    p = sigmoid(X.dot(theta.T)) >= threshold
    return p.astype('int')
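A usage sketch: measuring training accuracy with the fitted res.x (variable names follow the code above; the exact percentage depends on the fit):
p = predict(res.x, X)  # 1-D array of 0/1 predictions
accuracy = 100 * np.mean(p == y.ravel())  # compare against the labels flattened back to 1-D
print('Train accuracy: {:.1f}%'.format(accuracy))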
Predict for a student with scores [30, 43]:
pre = sigmoid(np.array([1, 30, 43]).dot(res.x.T))
3.326083521580104e-05
#approximately zero, so this student is predicted not to be admitted
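For contrast, the higher-scoring example used in the course exercise, scores [45, 85], lands on the other side of the boundary; with the theta above, $\theta^T x \approx 1.24$:
pre = sigmoid(np.array([1, 45, 85]).dot(res.x.T))  # ~0.776, predicted to be admitted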