-
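Introduction
Suppose you are the head of a university department and want to estimate each applicant's chance of admission from their scores on two exams. You have historical data on previous applicants that can serve as the training set for logistic regression. Each training example contains the applicant's scores on the two exams and the admission decision.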
-
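Plotting the data
The horizontal and vertical axes are the applicant's scores on the two exams; admitted and non-admitted examples are drawn with different markers.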
# PLOTDATA Plots the data points X and y into a new figure
#   PLOTDATA(X, y) plots the data points with + for the positive examples
#   and o for the negative examples. X is assumed to be an Mx2 matrix.
from matplotlib import pyplot as plt
import numpy as np


def plotData(X, y):
    # Split the scores by label: 0 = not admitted, 1 = admitted
    exam1_0 = []
    exam2_0 = []
    exam1_1 = []
    exam2_1 = []
    for i in range(len(y)):
        if y[i] == 0:
            exam1_0.append(X[i][0])
            exam2_0.append(X[i][1])
        elif y[i] == 1:
            exam1_1.append(X[i][0])
            exam2_1.append(X[i][1])
    plt.title("Training data")
    plt.xlim(30, 100)
    plt.ylim(30, 100)
    plt.xticks(np.arange(30, 101, 10))
    plt.yticks(np.arange(30, 101, 10))
    plt.xlabel("Exam 1 score")
    plt.ylabel("Exam 2 score")
    # The legend labels below follow the order of these two scatter calls
    plt.scatter(exam1_0, exam2_0, s=50, c='y', marker='o')
    plt.scatter(exam1_1, exam2_1, s=50, c='b', marker='+')
    plt.legend(scatterpoints=1, labels=['Not admitted', 'Admitted'], loc=1)
    plt.show()
[Figure: scatter plot of the training data, Exam 1 score versus Exam 2 score.]
-
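The sigmoid function
The hypothesis of logistic regression is defined as:
$$h_\theta(x) = g(\theta^T x)$$
where g is the sigmoid function, defined as:
$$g(z) = \frac{1}{1 + e^{-z}}$$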
# SIGMOID Compute sigmoid function
#   g = SIGMOID(z) computes the sigmoid of z.
import numpy as np


def sigmoid(z):
    # np.exp works element-wise on arrays and also accepts plain scalars
    g = 1 / (1 + np.exp(-z))
    return g
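For very negative inputs, `exp(-z)` overflows in floating point and NumPy emits a runtime warning, even though the result still rounds to 0. The following is a minimal sketch of one common workaround; the name `stable_sigmoid` is hypothetical and not part of the original code.

```python
import numpy as np


def stable_sigmoid(z):
    """Sigmoid that never exponentiates a positive number (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    # exp(-|z|) always lies in [0, 1], so it cannot overflow
    e = np.exp(-np.abs(z))
    # For z >= 0 use 1 / (1 + exp(-z)); for z < 0 use exp(z) / (1 + exp(z))
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))


# The three results are 0.0, 0.5 and 1.0
print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))
```

Both branches are algebraically the same function as the definition above; only the order of operations changes.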
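Running gradient descent for 1,000,000 iterations gives the output below. The cost decreases steadily and approaches the expected cost, although at about 0.2247 it has not yet fully reached the expected value of roughly 0.203: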
Running Gradient Descent ...
After 0 steps, the cost function: [0.69829069]
After 100000 steps, the cost function: [0.38738841]
After 200000 steps, the cost function: [0.31655389]
After 300000 steps, the cost function: [0.28368669]
After 400000 steps, the cost function: [0.2646348]
After 500000 steps, the cost function: [0.25216993]
After 600000 steps, the cost function: [0.24337911]
After 700000 steps, the cost function: [0.23685629]
After 800000 steps, the cost function: [0.23183607]
After 900000 steps, the cost function: [0.22786442]
Cost at theta found by gradient descent: 0.224654
Expected cost (approx): 0.203
-
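Cost function and gradient
The cost function of logistic regression is defined as:
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right) - \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]$$
The gradient of the cost function is a vector of the same length as $\theta$, whose j-th element (j = 0, 1, ..., n) is defined as:
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
Note that although this gradient looks identical to the linear regression gradient, the formula is actually different, because linear regression and logistic regression define $h_\theta(x)$ differently.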
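For readers who want to see where the gradient expression comes from, here is a short derivation sketch. It assumes the standard definitions $h_\theta(x) = g(\theta^T x)$ and $g(z) = 1/(1+e^{-z})$, and relies on the identity $g'(z) = g(z)(1-g(z))$.

```latex
\begin{aligned}
g'(z) &= \frac{e^{-z}}{(1+e^{-z})^2} = g(z)\bigl(1-g(z)\bigr), \\
\frac{\partial}{\partial\theta_j}\log h_\theta(x) &= \bigl(1-h_\theta(x)\bigr)\,x_j, \qquad
\frac{\partial}{\partial\theta_j}\log\bigl(1-h_\theta(x)\bigr) = -h_\theta(x)\,x_j, \\
\frac{\partial J(\theta)}{\partial\theta_j}
 &= \frac{1}{m}\sum_{i=1}^{m}\Bigl[-y^{(i)}\bigl(1-h_\theta(x^{(i)})\bigr)
    + \bigl(1-y^{(i)}\bigr)h_\theta(x^{(i)})\Bigr]x_j^{(i)} \\
 &= \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)})-y^{(i)}\bigr)\,x_j^{(i)}.
\end{aligned}
```

The last step uses $-y(1-h) + (1-y)h = h - y$.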
# COSTFUNCTION Compute cost and gradient for logistic regression
#   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
#   parameter for logistic regression and the gradient of the cost
#   w.r.t. the parameters.
import numpy as np
import math
from sigmoid import sigmoid


def costFunction(theta, X, y):
    m = len(y)  # number of training examples
    J = 0
    grad = np.zeros(shape=(len(theta), 1))
    # Cost: average negative log-likelihood over the m examples
    for i in range(m):
        h = sigmoid(X[i].dot(theta))
        # math.log is the natural logarithm (base e)
        J += -y[i]*math.log(h) - (1-y[i])*math.log(1-h)
    J = J/m
    # Gradient: one partial derivative per parameter
    for i in range(len(theta)):
        for j in range(m):
            grad[i] += (sigmoid(X[j].dot(theta))-y[j]) * X[j][i]
        grad[i] = grad[i]/m
    return J, grad
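The same cost and gradient can be computed without Python loops. The sketch below is one possible vectorized variant, checked against a central-difference approximation of the gradient. The names `cost_function_vectorized` and `numerical_gradient` are hypothetical, the data is made up for illustration, and the code assumes `theta` and `y` are 1-D arrays, which may differ from the shapes used in the original code.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost_function_vectorized(theta, X, y):
    """Vectorized cost and gradient (hypothetical name).

    Assumes X has shape (m, n) with a leading column of ones,
    theta has shape (n,) and y has shape (m,).
    """
    m = y.size
    h = sigmoid(X @ theta)                       # predictions, shape (m,)
    J = np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))
    grad = X.T @ (h - y) / m                     # shape (n,)
    return J, grad


def numerical_gradient(theta, X, y, eps=1e-6):
    """Central-difference approximation, used only as a sanity check."""
    approx = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        J_plus, _ = cost_function_vectorized(theta + step, X, y)
        J_minus, _ = cost_function_vectorized(theta - step, X, y)
        approx[j] = (J_plus - J_minus) / (2 * eps)
    return approx


# Tiny made-up data set, for illustration only (not the admissions data)
X = np.array([[1.0, 0.5, 1.5],
              [1.0, 1.0, 1.0],
              [1.0, 1.5, 0.5],
              [1.0, 3.0, 2.5]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = np.zeros(3)

J, grad = cost_function_vectorized(theta, X, y)
print(J)  # log(2), about 0.6931, because every prediction is 0.5 at theta = 0
print(np.max(np.abs(grad - numerical_gradient(theta, X, y))))  # should be tiny
```

A gradient check like this is a cheap way to catch sign or indexing mistakes before running a long optimization.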
-
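Gradient descent
Update all $\theta_j$ simultaneously:
$$\theta_j := \theta_j - \frac{\alpha}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
(This looks exactly the same as the update formula for linear regression.)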
from costFunction import costFunction
import numpy as np


def gradientDescent(X, y, theta, alpha, num_iters):
    # Note: theta is updated in place
    J_history = np.zeros(shape=(num_iters, 1))
    for i in range(num_iters):
        _, grad = costFunction(theta, X, y)
        # grad was computed from the old theta, so this is a simultaneous update
        for j in range(len(theta)):
            theta[j] -= alpha*grad[j]
        # Save the cost J in every iteration
        cost, _ = costFunction(theta, X, y)
        J_history[i] = cost
        if i % 100000 == 0:
            print("After %d steps, the cost function:" % i, J_history[i])
            # print("the parameters:", theta)
    return theta, J_history[-1]
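The update rule can also be written as a single vector operation. The sketch below does that and, as an assumption that goes beyond the original code, standardizes the features first. Raw scores in the 30 to 100 range tend to force a small learning rate and many iterations, and scaling usually lets gradient descent take much larger steps. All names and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost_and_grad(theta, X, y):
    h = sigmoid(X @ theta)
    J = np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))
    return J, X.T @ (h - y) / y.size


def gradient_descent_vectorized(X, y, theta, alpha, num_iters):
    theta = theta.copy()               # leave the caller's array untouched
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        # Record the cost at the current theta, then update
        J_history[i], grad = cost_and_grad(theta, X, y)
        theta -= alpha * grad          # simultaneous update of all theta_j
    return theta, J_history


# Synthetic, noisy "exam scores", for illustration only
rng = np.random.default_rng(0)
scores = rng.uniform(30, 100, size=(100, 2))
noise = rng.normal(0, 10, size=100)
y = (scores.sum(axis=1) + noise > 130).astype(float)

# Standardize each feature, then prepend the intercept column
mu, sigma = scores.mean(axis=0), scores.std(axis=0)
X = np.column_stack([np.ones(len(scores)), (scores - mu) / sigma])

theta, J_history = gradient_descent_vectorized(
    X, y, np.zeros(3), alpha=0.5, num_iters=500)
print(J_history[0], J_history[-1])     # the cost starts at log(2) and decreases
```

Parameters learned on standardized features apply to standardized inputs, so new examples need the same `mu` and `sigma` applied before prediction.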
-
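Evaluating logistic regression
One way to evaluate the learned parameters is to check how accurately the model predicts the training set. The predict function produces a "1" or "0" prediction for each example in a given data set from the learned parameter vector; the training accuracy of the classifier is then the percentage of predictions that agree with the labels.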
# PREDICT Predict whether the label is 0 or 1 using learned logistic
# regression parameters theta
#   p = PREDICT(theta, X) computes the predictions for X using a
#   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)
from sigmoid import sigmoid
import numpy as np


def predict(theta, X):
    m = len(X)
    p = np.zeros(shape=(m, 1))
    for i in range(m):
        # >= matches the threshold documented in the header comment
        if sigmoid(X[i].dot(theta)) >= 0.5:
            p[i] = 1
        else:
            p[i] = 0
    return p
The results are shown below. After 1,000,000 training iterations, the training accuracy matches the expected accuracy:
Train Accuracy: 89.000000
Expected accuracy (approx): 89.0
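The accuracy figure above is the percentage of predictions that agree with the labels, a step the code shown does not include. A minimal sketch of that computation follows, with a vectorized prediction for comparison; the names `predict_vectorized` and `accuracy_percent`, the parameters and the data are all made up for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict_vectorized(theta, X):
    """Return 1 where sigmoid(X @ theta) >= 0.5, else 0 (hypothetical name)."""
    return (sigmoid(X @ theta) >= 0.5).astype(int)


def accuracy_percent(p, y):
    """Percentage of predictions that agree with the labels."""
    return np.mean(p == y) * 100


# Made-up parameters and examples, for illustration only
theta = np.array([-3.0, 1.0, 1.0])
X = np.array([[1.0, 1.0, 1.0],    # z = -1.0 -> predict 0
              [1.0, 2.0, 2.0],    # z =  1.0 -> predict 1
              [1.0, 0.5, 3.0],    # z =  0.5 -> predict 1
              [1.0, 1.0, 1.5]])   # z = -0.5 -> predict 0
y = np.array([0, 1, 0, 0])

p = predict_vectorized(theta, X)
print("Train Accuracy: %f" % accuracy_percent(p, y))  # 3 of 4 correct -> 75.000000
```

If `p` has shape (m, 1), as returned by `predict`, while `y` has shape (m,), flatten both before comparing; otherwise NumPy broadcasts `p == y` to an (m, m) array and the mean no longer measures accuracy.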