# Machine Learning in Action - Logistic Regression

Logistic regression is an optimization algorithm. Its main idea is to fit a regression formula for the classification boundary from the existing data, and then use that boundary to classify. Given a set of data points that we want to separate with a line (the best-fit line), finding that line is exactly the goal of logistic regression.

To produce a class probability between 0 and 1, logistic regression feeds a weighted sum of the inputs through the sigmoid function:

$\sigma(z)=\frac{1}{1+e^{-z}}$
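A minimal sketch of the sigmoid in Python (the function name and test values are illustrative):

```python
import numpy as np

def sigmoid(z):
    """Squash any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))            # 0.5: exactly on the decision boundary
print(sigmoid(10.0) > 0.99)    # large positive z gives probability near 1
print(sigmoid(-10.0) < 0.01)   # large negative z gives probability near 0
```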

# 2. Determining the Best Regression Coefficients

Writing the input features as a vector $x$ and the coefficients as a vector $w$, the weighted sum is $z=w^Tx$.

## 2.1 Formula Derivation

Let $h_z(x)=\sigma(w^Tx)$ denote the predicted probability of the positive class:

$p(y=1|x,w)=h_z(x)$
$p(y=0|x,w)=1-h_z(x)$

These two cases combine into a single expression:

$p(y|x,w)=h_z(x)^y(1-h_z(x))^{1-y}$

Assuming the $n$ samples are independent, the likelihood of the data is:

$L(w)=\prod_{i=1}^np(y^i|x^i,w)=\prod_{i=1}^nh_z(x^i)^{y^i}(1-h_z(x^i))^{1-y^i}$

Taking the logarithm turns the product into a sum, giving the log-likelihood:

$l(w)=\log L(w)=\sum_{i=1}^n\left[y^i\log h_z(x^i)+(1-y^i)\log(1-h_z(x^i))\right]$
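The exponent trick in $p(y|x,w)$ simply selects the right case. A quick check, with an arbitrary example probability $h$:

```python
h = 0.7  # an arbitrary example value of h_z(x)
for y in (0, 1):
    combined = h ** y * (1 - h) ** (1 - y)   # the single-expression form
    direct = h if y == 1 else 1 - h          # the case-by-case form
    print(y, combined == direct)             # both cases agree
```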

## 2.2 Gradient Ascent

Gradient ascent finds a local maximum by repeatedly stepping in the direction of the gradient. For example, to maximize

$f(x)=-x^2+2x+1$

we iterate

$x_{i+1}=x_i+\alpha\frac{\partial f(x_i)}{\partial x_i}$

where $\alpha$ is the step size.
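A tiny sketch of this iteration on $f(x)=-x^2+2x+1$, whose derivative $f'(x)=-2x+2$ vanishes at the maximum $x=1$ (the step size and starting point are chosen arbitrarily):

```python
alpha = 0.1   # step size, chosen arbitrarily
x = -4.0      # arbitrary starting point
for _ in range(200):
    x = x + alpha * (-2 * x + 2)   # x <- x + alpha * f'(x)
print(round(x, 6))                 # converges to 1.0, the maximizer
```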

To maximize $l(w)$ we need its gradient. For a single sample (dropping the index $i$), writing $g(w^Tx)=h_z(x)$ for the sigmoid, the chain rule gives:

$\frac{\partial l(w)}{\partial w_j}=\frac{\partial l(w)}{\partial g(w^Tx)}\cdot\frac{\partial g(w^Tx)}{\partial w^Tx}\cdot\frac{\partial w^Tx}{\partial w_j}$

1. $\frac{\partial l(w)}{\partial g(w^Tx)}=y\cdot\frac{1}{g(w^Tx)}+(1-y)\cdot\frac{1}{1-g(w^Tx)}\cdot(-1)$
2. $\frac{\partial g(w^Tx)}{\partial w^Tx}=(-1)(1+e^{-w^Tx})^{-2}\cdot e^{-w^Tx}\cdot(-1)=g(w^Tx)\cdot\frac{e^{-w^Tx}}{1+e^{-w^Tx}}=g(w^Tx)(1-g(w^Tx))$
3. $\frac{\partial w^Tx}{\partial w_j}=\frac{\partial(w_0x_0+w_1x_1+\dots+w_mx_m)}{\partial w_j}=x_j$

Multiplying the three factors:

$\frac{\partial l(w)}{\partial w_j}=\left(y\cdot\frac{1}{g(w^Tx)}-(1-y)\cdot\frac{1}{1-g(w^Tx)}\right)g(w^Tx)(1-g(w^Tx))x_j=\left(y(1-g(w^Tx))-(1-y)g(w^Tx)\right)x_j=(y-g(w^Tx))x_j$

So the gradient ascent update for each weight, applied per sample $i$, is:

$w_j=w_j+\alpha(y^i-h_w(x^i))x_j^i$

where $h_w(x)=g(w^Tx)$.
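The derived expression $(y-g(w^Tx))x_j$ can be sanity-checked against a numerical derivative of the single-sample log-likelihood (all values below are arbitrary test inputs):

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def loglik(w, x, y):
    h = g(w @ x)
    return y * np.log(h) + (1 - y) * np.log(1 - h)

w = np.array([0.3, -0.5, 0.8])   # arbitrary weights
x = np.array([1.0, 2.0, -1.0])   # arbitrary sample
y = 1.0

analytic = (y - g(w @ x)) * x    # the derived gradient (y - g(w^T x)) x

# Central finite differences along each coordinate axis.
eps = 1e-6
numeric = np.array([(loglik(w + eps * e, x, y) - loglik(w - eps * e, x, y)) / (2 * eps)
                    for e in np.eye(3)])
print(np.allclose(analytic, numeric, atol=1e-6))  # True
```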

# 3. Code

Batch gradient ascent over the full data set, using NumPy:

```python
from numpy import exp, mat, ones, shape

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def calcgrand(dataMatin, labelMatin, cyc_num=500):
    dataMat = mat(dataMatin)                 # m x n feature matrix
    labelMat = mat(labelMatin).transpose()   # m x 1 label column vector
    n = shape(dataMat)[1]
    weight = ones((n, 1))
    alpha = 0.001                            # learning rate
    for i in range(cyc_num):
        h = sigmoid(dataMat * weight)        # predictions for all samples at once
        weight = weight + alpha * dataMat.transpose() * (labelMat - h)  # w += a*X^T(y-h)
    return weight
```
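A quick way to exercise the batch update is a tiny 1-D separable data set; the data, starting weights, and iteration count below are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Eight 1-D points with a bias column x_0 = 1; negatives below 0, positives above.
X = np.hstack([np.ones((8, 1)),
               np.array([[-3.0], [-2.0], [-1.5], [-1.0],
                         [1.0], [1.5], [2.0], [3.0]])])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

w = np.ones(2)
alpha = 0.01
for _ in range(500):
    h = sigmoid(X @ w)
    w = w + alpha * X.T @ (y - h)    # full-batch update: w += alpha * X^T (y - h)

pred = (sigmoid(X @ w) > 0.5).astype(float)
print((pred == y).all())             # all eight points classified correctly
```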

Stochastic gradient ascent, updating on one randomly chosen sample at a time with a decaying step size:

```python
from numpy import array, exp, ones, random, shape

def stocalcgrand1(dataMatin, labelMatin, numiter=150):
    dataArr = array(dataMatin)
    m, n = shape(dataArr)
    weight = ones(n)
    for i in range(numiter):
        dataIndex = list(range(m))   # a list, so chosen samples can be removed
        for j in range(m):
            alpha = 4 / (1.0 + j + i) + 0.01   # step size decays but never reaches 0
            randIndex = int(random.uniform(0, len(dataIndex)))
            h = sigmoid(sum(dataArr[dataIndex[randIndex]] * weight))
            error = labelMatin[dataIndex[randIndex]] - h
            weight = weight + alpha * error * dataArr[dataIndex[randIndex]]
            del dataIndex[randIndex]  # sample without replacement within each pass
    return weight
```
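A self-contained sketch exercising the same stochastic update on synthetic data; the cluster locations, random seed, and iteration counts are made-up illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two well-separated 2-D clusters, plus a bias column x_0 = 1.
pos = rng.normal(loc=2.0, scale=1.0, size=(50, 2))
neg = rng.normal(loc=-2.0, scale=1.0, size=(50, 2))
X = np.hstack([np.ones((100, 1)), np.vstack([pos, neg])])
y = np.array([1.0] * 50 + [0.0] * 50)

w = np.ones(3)
for i in range(150):
    dataIndex = list(range(100))
    for j in range(100):
        alpha = 4 / (1.0 + i + j) + 0.01              # decaying step size
        k = dataIndex.pop(int(rng.uniform(0, len(dataIndex))))
        h = sigmoid(X[k] @ w)
        w = w + alpha * (y[k] - h) * X[k]             # one-sample update

accuracy = ((sigmoid(X @ w) > 0.5).astype(float) == y).mean()
print(accuracy)   # close to 1.0 on this separable data
```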