- Logistic regression can be viewed either as a regression algorithm or as a classification algorithm
- It is usually used as a classification algorithm and, by itself, only solves binary classification problems
Logistic function (sigmoid function)
$\sigma(t) = \frac{1}{1+e^{-t}}$
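As a quick illustration, a minimal NumPy sketch of the sigmoid; it maps any real input into the open interval (0, 1):

```python
import numpy as np

def sigmoid(t):
    # squashes any real number into the open interval (0, 1)
    return 1 / (1 + np.exp(-t))

print(sigmoid(np.array([-10.0, -1.0, 0.0, 1.0, 10.0])))
# approximately [4.54e-05, 0.269, 0.5, 0.731, 0.99995]
```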
Binary classification
$p(y=1|x,w) = \frac{1}{1+e^{-(w^Tx+b)}}$
$p(y=0|x,w) = \frac{e^{-(w^Tx+b)}}{1+e^{-(w^Tx+b)}} = 1-p(y=1|x,w)$
The two expressions can be combined into one:
$p(y|x,w) = p(y=1|x,w)^y\,[1-p(y=1|x,w)]^{1-y}$
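Maximizing this likelihood over the training set is equivalent to minimizing its negative log; writing $\hat{p}$ for $p(y=1|x,w)$ gives the per-sample loss below.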
Loss function
$cost = \begin{cases} -\log(\hat{p}) & \text{if } y=1 \\ -\log(1-\hat{p}) & \text{if } y=0 \end{cases}$
This can also be written as:
$cost = -y\log(\hat{p}) - (1-y)\log(1-\hat{p})$
Gradient descent
- Cost function
$J(\theta) = \frac{1}{m}\sum_{i=1}^{m} cost = -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log(\sigma(X_b^{(i)}\theta))+(1-y^{(i)})\log(1-\sigma(X_b^{(i)}\theta))\right)$
- Gradient
$\frac{\partial J(\theta)}{\partial\theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(\sigma(X_b^{(i)}\theta)-y^{(i)}\right)X_j^{(i)}$
$\nabla J(\theta) = \frac{1}{m}{X_b}^T\left(\sigma(X_b\theta)-y\right)$
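As a sanity check on the vectorized gradient, a minimal sketch (the helper names here are only for illustration) comparing it against a central-difference numerical gradient on random data:

```python
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def J(theta, X_b, y):
    # cross-entropy cost
    p = sigmoid(X_b.dot(theta))
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def dJ(theta, X_b, y):
    # vectorized gradient: (1/m) * X_b^T (sigma(X_b theta) - y)
    return X_b.T.dot(sigmoid(X_b.dot(theta)) - y) / len(y)

def dJ_numeric(theta, X_b, y, eps=1e-6):
    # central finite differences, one coordinate at a time
    grad = np.empty_like(theta)
    for j in range(len(theta)):
        t_plus, t_minus = theta.copy(), theta.copy()
        t_plus[j] += eps
        t_minus[j] -= eps
        grad[j] = (J(t_plus, X_b, y) - J(t_minus, X_b, y)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X_b = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 3))])
y = rng.integers(0, 2, size=20)
theta = rng.normal(size=4)
print(np.allclose(dJ(theta, X_b, y), dJ_numeric(theta, X_b, y)))  # should print True
```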
Code implementation
- Custom implementation
Trained with batch gradient descent:
# coding=utf-8
import numpy as np
from sklearn.metrics import accuracy_score


class LogisticRegression:
    def __init__(self):
        self.coef_ = None
        self.intercept_ = None
        self.theta_ = None

    def _sigma(self, t):
        # sigmoid function
        return 1 / (1 + np.exp(-t))

    def _J(self, theta, X_b, y_train):
        # cross-entropy cost J(theta)
        y_hat = self._sigma(X_b.dot(theta))
        return -np.sum(y_train * np.log(y_hat) + (1 - y_train) * np.log(1 - y_hat)) / len(y_train)

    def _dJ(self, theta, X_b, y_train):
        # vectorized gradient: (1/m) * X_b^T (sigma(X_b theta) - y)
        return X_b.T.dot(self._sigma(X_b.dot(theta)) - y_train) / len(y_train)

    def fit(self, X_train, y_train, alpha=0.01, cycle_index=1e4, interv=1e-8):
        # batch gradient descent: alpha is the learning rate, cycle_index the maximum
        # number of iterations, interv the convergence threshold on the cost
        start_index = 0
        X_b = np.hstack([np.ones((len(X_train), 1)), X_train])
        theta = np.zeros(X_b.shape[1])
        while start_index < cycle_index:
            last_theta = theta
            theta = theta - alpha * self._dJ(theta, X_b, y_train)
            if abs(self._J(theta, X_b, y_train) - self._J(last_theta, X_b, y_train)) < interv:
                break
            start_index += 1
        self.theta_ = theta
        self.coef_ = theta[1:]
        self.intercept_ = theta[0]
        return self

    def _predict(self, X_test):
        # predicted probability of class 1
        X_b = np.hstack([np.ones((len(X_test), 1)), X_test])
        return self._sigma(X_b.dot(self.theta_))

    def predict(self, X_test):
        # threshold the probability at 0.5 to get the class label
        proba = self._predict(X_test)
        return np.array(proba >= 0.5, dtype=int)

    def score(self, X_test, y_test):
        return accuracy_score(y_test, self.predict(X_test))

    def __repr__(self):
        return 'This Is LogisticRegression'
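A quick usage sketch of the class above; the two-class subset of iris and the split parameters are only illustrative choices:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

# keep only two of the three iris classes so the binary classifier applies
iris = datasets.load_iris()
X, y = iris.data, iris.target
X, y = X[y < 2], y[y < 2]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=666)

log_reg = LogisticRegression()          # the custom class defined above
log_reg.fit(X_train, y_train)
print(log_reg.score(X_test, y_test))    # accuracy on the held-out split
print(log_reg.coef_, log_reg.intercept_)
```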
- scikit-learn implementation
from sklearn.linear_model import LogisticRegression
"""
multi_class options:
auto: picks 'ovr' for binary problems (or the liblinear solver), otherwise 'multinomial'
ovr: one-vs-rest (OvR) multiclass
multinomial: a single multinomial (softmax) model over all classes
"""
log_reg = LogisticRegression(solver='lbfgs', multi_class='auto')
log_reg.fit(X_train, y_train)
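Continuing with the fitted model, a brief sketch of the usual prediction calls (assuming X_test and y_test come from the same train/test split):

```python
print(log_reg.predict(X_test[:5]))        # hard class labels
print(log_reg.predict_proba(X_test[:5]))  # per-class probabilities, each row sums to 1
print(log_reg.score(X_test, y_test))      # mean accuracy
```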
Handling multiclass problems
OvR
One vs Rest: each round separates one class from all the others, so n classes require training n classifiers
from sklearn.multiclass import OneVsRestClassifier
ovr = OneVsRestClassifier(log_reg)  # wrap any binary classifier, e.g. the log_reg above
ovr.fit(X_train, y_train)
ovr.score(X_test, y_test)
OvO
One vs One: trains a classifier for every pair of classes (n(n-1)/2 classifiers in total) and predicts the class that wins the most pairwise votes
from sklearn.multiclass import OneVsOneClassifier
ovo = OneVsOneClassifier(log_reg)  # wrap any binary classifier, e.g. the log_reg above
ovo.fit(X_train, y_train)
ovo.score(X_test, y_test)
Model regularization in logistic regression
- L1 regularization
$C \cdot J(\theta) + L_1$
- L2 regularization
$C \cdot J(\theta) + L_2$
"""
sk-leran逻辑回归的默认参数如下,penalty代表正则项方式,默认是l2正则化
"""
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
intercept_scaling=1, max_iter=100, multi_class='warn',
n_jobs=None, penalty='l2', random_state=None, solver='warn',
tol=0.0001, verbose=0, warm_start=False)
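A brief sketch of switching the regularization settings; penalty='l1' needs a solver that supports it (e.g. liblinear or saga), and C is the inverse regularization strength, so a smaller C means a stronger penalty:

```python
from sklearn.linear_model import LogisticRegression

# L2 (the default): smaller C = stronger regularization
log_reg_l2 = LogisticRegression(penalty='l2', C=0.1, solver='lbfgs')

# L1: requires a solver that supports it, e.g. 'liblinear' or 'saga'
log_reg_l1 = LogisticRegression(penalty='l1', C=0.1, solver='liblinear')
```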