Goal: build a classifier, i.e. solve for the three parameters $\theta_0, \theta_1, \theta_2$.
Then set a threshold and use it to decide the admission outcome.
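Modules to implement
- `sigmoid`: maps a value to a probability
- `model`: returns the predicted value
- `cost`: computes the loss for the given parameters
- `gradient`: computes the gradient direction for each parameter
- `descent`: updates the parameters
- `accuracy`: computes the accuracy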
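The thresholding step and the `accuracy` module can be sketched as follows. This is a minimal illustration, not the author's implementation: the helper name `predict`, the signature of `accuracy`, the threshold of 0.5, and the sample numbers are all assumptions made for the example.

```python
import numpy as np

def predict(probabilities, threshold=0.5):
    # Hypothetical helper: label a sample 1 (admitted) when its predicted
    # probability reaches the threshold, otherwise 0 (not admitted).
    return (np.asarray(probabilities) >= threshold).astype(int)

def accuracy(probabilities, y, threshold=0.5):
    # Fraction of samples whose thresholded prediction equals the true label.
    return float(np.mean(predict(probabilities, threshold) == np.asarray(y)))

probs = np.array([0.9, 0.4, 0.65, 0.2])   # made-up model outputs
labels = np.array([1, 0, 0, 0])           # made-up true labels

print(predict(probs))            # [1 0 1 0]
print(accuracy(probs, labels))   # 0.75
```

Raising the threshold makes the classifier stricter about predicting an admission; 0.5 is only the usual default.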
The sigmoid function
$$
g(z) = \frac{1}{1+e^{-z}}
$$
- $g:\mathbb{R} \to [0, 1]$
- $g(0)=0.5$
- $g(-\infty) = 0$
- $g(+\infty) = 1$
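import numpy as np

def sigmoid(z):
    # Map any real value (or array of values) into the interval (0, 1).
    return 1 / (1 + np.exp(-z))
sigmoid(0)  # evaluates to 0.5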
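The listed properties of $g$ can be checked numerically. The sketch below redefines `sigmoid` so that it runs on its own; the test points $\pm 30$ are an arbitrary stand-in for $\pm\infty$.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.array([-30.0, 0.0, 30.0])
g = sigmoid(z)

# g(0) is exactly 0.5, and the two tails approach 0 and 1.
assert g[1] == 0.5
assert g[0] < 1e-12
assert 1 - g[2] < 1e-12

# The function is increasing and satisfies g(-z) = 1 - g(z).
assert np.all(np.diff(g) > 0)
assert np.allclose(sigmoid(-z), 1 - g)
```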
The model function
$$
\begin{pmatrix}\theta_{0} & \theta_{1} & \theta_{2}\end{pmatrix} \times \begin{pmatrix}1\\ x_{1}\\ x_{2}\end{pmatrix} = \theta_{0}+\theta_{1}x_{1}+\theta_{2}x_{2}
$$
def model(X, theta):
    # X has shape (n, 3) with a leading column of ones; theta has shape (1, 3).
    return sigmoid(np.dot(X, theta.T))
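A small shape check may help here. It assumes `theta` is stored as a row vector of shape `(1, 3)`, which is what `theta.T` in `model` suggests, and that `X` carries a leading column of ones for $\theta_0$. The sample values are made up.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def model(X, theta):
    return sigmoid(np.dot(X, theta.T))

# Two made-up samples with features (x1, x2); the leading 1 multiplies theta_0.
X = np.array([[1.0, 2.0, 3.0],
              [1.0, -1.0, 0.5]])
theta = np.zeros((1, 3))   # assumed shape (1, 3)

h = model(X, theta)
print(h.shape)     # (2, 1): one probability per sample
print(h.ravel())   # [0.5 0.5]: all-zero parameters give sigmoid(0) for every sample
```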
Loss function
Take the negative of the log-likelihood:
$$
D(h_\theta(x), y) = -y\log(h_\theta(x)) - (1-y)\log(1-h_\theta(x))
$$
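For reference, this per-sample loss can be derived in three steps, assuming $h_\theta(x)$ is read as the probability that $y = 1$ (the usual interpretation for logistic regression, not stated explicitly in the text):

$$
\begin{aligned}
P(y \mid x;\theta) &= h_\theta(x)^{y}\,\bigl(1-h_\theta(x)\bigr)^{1-y} \\
\log P(y \mid x;\theta) &= y\log(h_\theta(x)) + (1-y)\log(1-h_\theta(x)) \\
D(h_\theta(x), y) &= -\log P(y \mid x;\theta)
\end{aligned}
$$

For $y = 1$ only the first term remains, so the loss is $-\log(h_\theta(x))$; for $y = 0$ only the second remains, giving $-\log(1-h_\theta(x))$.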
Average the loss over all $n$ samples:
$$
J(\theta)=\frac{1}{n}\sum_{i=1}^{n} D(h_\theta(x_i), y_i)
$$
def cost(x, y, theta):
    # Average negative log-likelihood; y must be a column vector of shape (n, 1).
    left = np.multiply(-y, np.log(model(x, theta)))
    right = np.multiply(1 - y, np.log(1 - model(x, theta)))
    return np.sum(left - right) / (len(x))
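A quick sanity check for `cost`: with all parameters at zero every prediction is 0.5, so the average loss should be $\log 2$ whatever the labels are. The sketch below repeats the functions so that it is self-contained; the data is made up, and `y` is assumed to be a column vector of shape `(n, 1)`. A flat `y` of shape `(n,)` would broadcast against the `(n, 1)` model output into an `(n, n)` array and give a wrong result.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def model(X, theta):
    return sigmoid(np.dot(X, theta.T))

def cost(x, y, theta):
    left = np.multiply(-y, np.log(model(x, theta)))
    right = np.multiply(1 - y, np.log(1 - model(x, theta)))
    return np.sum(left - right) / len(x)

X = np.array([[1.0, 2.0, 3.0],
              [1.0, -1.0, 0.5],
              [1.0, 0.0, -2.0]])      # made-up samples, leading column of ones
y = np.array([[1.0], [0.0], [1.0]])   # column vector, same shape as model(X, theta)
theta = np.zeros((1, 3))

print(cost(X, y, theta))   # about 0.693, i.e. log(2)
```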
Computing the gradient
$$
\frac{\partial J}{\partial \theta_j}=-\frac{1}{n}\sum_{i=1}^{n} (y_i - h_\theta (x_i))x_{ij}
$$
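The gradient formula follows from the chain rule, taking $h_\theta(x_i) = g(\theta^T x_i)$ as in `model` and using the sigmoid derivative $g'(z) = g(z)(1-g(z))$. A sketch of the per-sample step:

$$
\begin{aligned}
\frac{\partial h_\theta(x_i)}{\partial \theta_j} &= h_\theta(x_i)\bigl(1-h_\theta(x_i)\bigr)\,x_{ij} \\
\frac{\partial D(h_\theta(x_i), y_i)}{\partial \theta_j} &= \left(-\frac{y_i}{h_\theta(x_i)} + \frac{1-y_i}{1-h_\theta(x_i)}\right) h_\theta(x_i)\bigl(1-h_\theta(x_i)\bigr)\,x_{ij} \\
&= \bigl(h_\theta(x_i) - y_i\bigr)\,x_{ij}
\end{aligned}
$$

Averaging over the samples gives the expression above. The code computes the error as $h_\theta(x_i) - y_i$, which absorbs the leading minus sign.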
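def gradient(x, y, theta):
    grad = np.zeros(theta.shape)
    # Prediction error h(x_i) - y_i for every sample, flattened to shape (n,).
    error = (model(x, theta) - y).ravel()
    for j in range(len(theta.ravel())):
        # Partial derivative of the average loss with respect to theta_j.
        term = np.multiply(error, x[:, j])
        grad[0, j] = np.sum(term) / len(x)
    return grad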
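The loop over parameters can also be written as a single matrix product, and the `descent` module from the list at the top can then be sketched as plain batch gradient descent. Everything below is an illustration under assumptions: the name `gradient_vectorized`, the `descent` signature, the learning rate `alpha=0.1`, the iteration count, and the synthetic data are not taken from the text.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def model(X, theta):
    return sigmoid(np.dot(X, theta.T))

def cost(X, y, theta):
    h = model(X, theta)
    return np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h)) / len(X)

def gradient(X, y, theta):
    # Loop version from the text, kept here for comparison.
    grad = np.zeros(theta.shape)
    error = (model(X, theta) - y).ravel()
    for j in range(len(theta.ravel())):
        grad[0, j] = np.sum(np.multiply(error, X[:, j])) / len(X)
    return grad

def gradient_vectorized(X, y, theta):
    # Same quantity in one product: (1/n) * error^T X, shape (1, n_params).
    error = model(X, theta) - y
    return np.dot(error.T, X) / len(X)

def descent(X, y, theta, alpha=0.1, n_iters=500):
    # Hypothetical batch gradient descent: step against the gradient each iteration.
    for _ in range(n_iters):
        theta = theta - alpha * gradient_vectorized(X, y, theta)
    return theta

# Synthetic, noisy two-feature data (made up for the example).
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 2))
noise = rng.normal(size=(100, 1))
X = np.hstack([np.ones((100, 1)), features])
y = (features[:, [0]] + features[:, [1]] + noise > 0).astype(float)

theta0 = np.zeros((1, 3))
assert np.allclose(gradient(X, y, theta0), gradient_vectorized(X, y, theta0))

theta = descent(X, y, theta0)
print(cost(X, y, theta0), cost(X, y, theta))   # the second value should be smaller
```

The vectorized form avoids the Python-level loop, which matters little for three parameters but more as the number of features grows.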