Exercise 2: Logistic Regression
Required libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt
We use scipy's optimize module for training.
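To illustrate the calling convention before applying it to logistic regression, here is a minimal, self-contained sketch with a toy objective (f, df, and the target value are hypothetical placeholders, not part of the exercise):
import numpy as np
import scipy.optimize as opt

def f(x, a):
    # toy objective: squared distance of x from the point (a, a)
    return np.sum((x - a) ** 2)

def df(x, a):
    # its analytic gradient, supplied through jac=
    return 2 * (x - a)

res = opt.minimize(fun=f, x0=np.zeros(2), args=(3.0,), jac=df)
print(res['x'])   # approximately [3. 3.]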
1. Logistic Regression
1.1 Plotting
Read the data and plot it:
def read_file(file):
    data = pd.read_csv(file, header=None)
    data = np.array(data)
    return data

def plotData(X, y):
    plt.figure(figsize=(6, 4), dpi=150)
    X1 = X[y == 0, :]  # negative examples (not admitted)
    X2 = X[y == 1, :]  # positive examples (admitted)
    plt.plot(X1[:, 0], X1[:, 1], 'yo')
    plt.plot(X2[:, 0], X2[:, 1], 'k+')
    plt.xlabel('Exam 1 score')
    plt.ylabel('Exam 2 score')
    # legend order must match the plotting order: negatives first
    plt.legend(['Not admitted', 'Admitted'], loc='upper right')
    plt.show()
## Load Data
data = read_file('ex2data1.txt')
X = data[:, 0:2]   # exam scores
y = data[:, 2]     # admission label (0/1)
## ==================== Part 1: Plotting ====================
print('Plotting data with + indicating (y = 1) examples and o indicating (y = 0) examples.')
plotData(X, y)
print('Program paused. Press enter to continue.')
input()
The resulting plot:
1.2 Sigmoid function
Unlike the linear regression in ex1, logistic regression applies the sigmoid function to the output of the linear model.
The logistic regression hypothesis is:
$h_\theta(x) = g(\theta^T x)$
where the sigmoid function is:
$g(z) = \frac{1}{1 + e^{-z}}$
def sigmoid(x):
    # maps any real input into (0, 1)
    return 1 / (np.exp(-x) + 1)
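A quick sanity check, not part of the original exercise: sigmoid(0) should be exactly 0.5, and large-magnitude inputs should saturate toward 0 and 1.
print(sigmoid(0))                     # 0.5
print(sigmoid(np.array([-10, 10])))   # [4.54e-05  9.9995e-01]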
1.3 Cost function and gradient
The logistic regression cost function is:
$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right) - (1 - y^{(i)})\log\left(1 - h_\theta(x^{(i)})\right)\right]$
and its gradient is:
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$
def costfunction(initial_theta, X, y):
    m, n = X.shape
    initial_theta = initial_theta.reshape((n, 1))
    # vectorized cross-entropy cost
    cost = (-y.T.dot(np.log(sigmoid(X.dot(initial_theta))))
            - (1 - y).T.dot(np.log(1 - sigmoid(X.dot(initial_theta))))) / m
    return float(cost)

def gradient(initial_theta, X, y):
    m, n = X.shape
    initial_theta = initial_theta.reshape((n, 1))
    # vectorized gradient: X^T (h - y) / m
    grad = X.T.dot(sigmoid(X.dot(initial_theta)) - y) / m
    return grad.flatten()
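Not part of the original exercise, but a central-difference check is a cheap way to confirm that the analytic gradient matches the cost before handing both to the optimizer; numerical_gradient is a helper introduced here purely for illustration.
def numerical_gradient(theta, X, y, eps=1e-5):
    # perturb one component of theta at a time and difference the cost
    grad = np.zeros(theta.size)
    for j in range(theta.size):
        step = np.zeros(theta.size)
        step[j] = eps
        grad[j] = (costfunction(theta + step, X, y)
                   - costfunction(theta - step, X, y)) / (2 * eps)
    return grad
# compare against the analytic version, e.g.
# numerical_gradient(theta, X, y) vs gradient(theta, X, y)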
## ============ Part 2: Compute Cost and Gradient ============
m, n = X.shape
X = np.c_[np.ones(m), X]          # add intercept column
initial_theta = np.zeros((n + 1, 1))
y = y.reshape((m, 1))
cost, grad = costfunction(initial_theta, X, y), gradient(initial_theta, X, y)
print('Cost at initial theta (zeros): %f' % cost)
print('Expected cost (approx): 0.693')
print('Gradient at initial theta (zeros): ')
print('%f %f %f' % (grad[0], grad[1], grad[2]))
print('Expected gradients (approx): -0.1000 -12.0092 -11.2628')
# evaluate cost and gradient at a non-trivial test theta
theta1 = np.array([[-24], [0.2], [0.2]], dtype='float64')
cost, grad = costfunction(theta1, X, y), gradient(theta1, X, y)
print('Cost at test theta: %f' % cost)
print('Expected cost (approx): 0.218')
print('Gradient at test theta: ')
print('%f %f %f' % (grad[0], grad[1], grad[2]))
print('Expected gradients (approx): 0.043 2.566 2.647')
print('Program paused. Press enter to continue.')
input()
Output of the checks above:
1.4 Optimize
Train with scipy's optimize module to obtain the final theta.
## ============= Part 3: Optimizing using scipy.optimize =============
initial_theta = np.zeros(n + 1)
# SLSQP plays the role of MATLAB's fminunc here; the analytic gradient is passed via jac
result = opt.minimize(fun=costfunction, x0=initial_theta, args=(X, y), method='SLSQP', jac=gradient)
print('Cost at theta found by scipy.optimize.minimize: %f' % result['fun'])
print('Expected cost (approx): 0.203')
print('theta:')
print('%f %f %f' % (result['x'][0], result['x'][1], result['x'][2]))
print('Expected theta (approx):')
print(' -25.161 0.206 0.201')
print('Program paused. Press enter to continue.')
input()
Training output:
1.5 Predict
Probabilities greater than 0.5 are predicted as 1, otherwise 0.
def predict(theta, X):
    n = np.size(theta, 0)                 # number of parameters
    rst = sigmoid(X.dot(theta.reshape(n, 1)))
    rst = rst > 0.5                       # threshold at 0.5
    return rst
## ============== Part 4: Predict and Accuracies ==============
# admission probability for a student with Exam 1 = 45 and Exam 2 = 85
prob = sigmoid(np.array([1, 45, 85], dtype='float64').dot(result['x']))
print('For a student with scores 45 and 85, we predict an admission ' \
'probability of %.3f' % prob)
print('Expected value: 0.775 +/- 0.002\n')
p = predict(result['x'], X)
print('Train Accuracy: %.1f%%' % (np.mean(p == y) * 100))
print('Expected accuracy (approx): 89.0%\n')
Prediction output:
Decision boundary visualization:
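The original post does not include the code behind this figure; a minimal sketch of how the linear boundary $\theta^T x = 0$ can be drawn over the data, assuming result from Part 3 (X here already contains the intercept column):
theta = result['x']
plt.figure(figsize=(6, 4), dpi=150)
neg, pos = y.flatten() == 0, y.flatten() == 1
plt.plot(X[neg, 1], X[neg, 2], 'yo')
plt.plot(X[pos, 1], X[pos, 2], 'k+')
# theta0 + theta1*x1 + theta2*x2 = 0  =>  x2 = -(theta0 + theta1*x1) / theta2
x1 = np.array([X[:, 1].min() - 2, X[:, 1].max() + 2])
plt.plot(x1, -(theta[0] + theta[1] * x1) / theta[2], 'b-')
plt.xlabel('Exam 1 score')
plt.ylabel('Exam 2 score')
plt.legend(['Not admitted', 'Admitted', 'Decision boundary'], loc='upper right')
plt.show()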
2. Regularized logistic regression
2.1 Plotting
Read the data and plot it, as sketched below.
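The loading and feature-mapping code is not shown in the original; a minimal sketch, assuming the standard ex2data2.txt file and the degree-6 mapFeature helper from the exercise (this also defines the X, y, m, n used in the following parts):
def mapFeature(x1, x2, degree=6):
    # all polynomial terms x1^(i-j) * x2^j up to the given degree,
    # with a leading column of ones as the intercept
    out = [np.ones(x1.size)]
    for i in range(1, degree + 1):
        for j in range(i + 1):
            out.append((x1 ** (i - j)) * (x2 ** j))
    return np.stack(out, axis=1)

data = read_file('ex2data2.txt')
plotData(data[:, 0:2], data[:, 2])       # plotData's legend text refers to the first dataset
X = mapFeature(data[:, 0], data[:, 1])   # 118 x 28 after mapping
y = data[:, 2]
m, n = X.shape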
2.2 Cost function and gradient
The regularized cost function is:
$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right) - (1 - y^{(i)})\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$
and its gradient is:
$\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j \quad (j \ge 1)$
Note that $\theta_0$ is not regularized.
def costfunction(initial_theta, X, y, lamb):
    m, n = X.shape
    initial_theta = initial_theta.reshape((n, 1))
    y = y.reshape((m, 1))
    theta_reg = initial_theta[1:, :]      # theta[0] is not regularized
    cost = (-y.T.dot(np.log(sigmoid(X.dot(initial_theta))))
            - (1 - y).T.dot(np.log(1 - sigmoid(X.dot(initial_theta))))) / m \
           + lamb / (2 * m) * theta_reg.T.dot(theta_reg)
    return float(cost)

def gradient(initial_theta, X, y, lamb):
    m, n = X.shape
    y = y.reshape((m, 1))
    initial_theta = initial_theta.reshape((n, 1))
    reg = lamb / m * initial_theta
    reg[0] = 0                            # do not regularize theta[0]
    grad = X.T.dot(sigmoid(X.dot(initial_theta)) - y) / m + reg
    return grad.flatten()
2.3 Optimize
initial_theta = np.ones(n)
lamb = 1
# sanity-check the cost and gradient at theta = ones before optimizing
cost = costfunction(initial_theta, X, y, lamb)
grad = gradient(initial_theta, X, y, lamb)
result = opt.minimize(fun=costfunction, x0=initial_theta, args=(X, y, lamb), method='SLSQP', jac=gradient)
p = predict(result['x'], X)
print('Train Accuracy: %.1f%%' % (np.mean(p.flatten() == y) * 100))
print('Expected accuracy (approx): 83.1%\n')
The run here produced a final accuracy of 82.2%, slightly below the expected 83.1%; a gap of this size typically comes from the choice of optimizer or from accidentally regularizing $\theta_0$.