Logistic Regression


Logistic regression, sometimes called the logistic model or logit model, analyzes the relationship between multiple independent variables and a categorical dependent variable, and estimates the probability of occurrence of an event by fitting data to a logistic curve. There are two models of logistic regression: binary logistic regression and multinomial logistic regression. Binary logistic regression is typically used when the dependent variable is dichotomous and the independent variables are either continuous or categorical. When the dependent variable is not dichotomous and comprises more than two categories, multinomial logistic regression can be employed.[1]

Definition

An explanation of logistic regression can begin with an explanation of the standard logistic function. The logistic function is useful because it can take an input with any value from negative to positive infinity, whereas the output always takes values between zero and one and hence is interpretable as a probability. The logistic function σ(t) is defined as follows:[2]

$$\sigma(t) = \frac{e^t}{e^t + 1} = \frac{1}{1 + e^{-t}}$$

A graph of the logistic function on the t-interval (-6, 6) is shown below:

[Figure: the standard logistic function σ(t), from Wikipedia]
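As a quick numerical check (a minimal sketch using NumPy; the sample points are arbitrary), evaluating σ(t) at a few values on this interval shows the output staying strictly between 0 and 1:

import numpy as np

def sigmoid(t):
    # sigma(t) = 1 / (1 + e^{-t}); maps any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-t))

t = np.linspace(-6, 6, 7)   # the interval shown in the figure
print(sigmoid(t))           # values rise from ~0.0025 to ~0.9975
print(sigmoid(0.0))         # 0.5 at t = 0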

In linear regression,

$$h_\theta(x) = \theta^T x$$

When using logistic regression, we want the output to lie between 0 and 1, so

$$h_\theta(x) = \sigma(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$
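A small sketch of the hypothesis for a single example (the values of θ and x below are made up purely for illustration):

import numpy as np

theta = np.array([0.5, -1.0, 2.0])   # illustrative parameter vector
x = np.array([1.0, 2.0, 0.5])        # illustrative feature vector

score = theta.dot(x)                  # theta^T x, an unbounded real number
h = 1.0 / (1.0 + np.exp(-score))      # h_theta(x) = sigma(theta^T x), in (0, 1)
print(score, h)                       # -0.5 and ~0.378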

For a single training example with label y ∈ {0, 1}, the probability that the model assigns to the observed label is:

$$\mathrm{prob}(h_\theta(x), y) = \begin{cases} h_\theta(x) & y = 1 \\ 1 - h_\theta(x) & y = 0 \end{cases}$$

which can be written as a single expression:

$$\mathrm{prob}(h_\theta(x), y) = y\, h_\theta(x) + (1 - y)\big(1 - h_\theta(x)\big)$$

Then the objective of logistic regression is to maximize the likelihood of the training set, i.e. the product of these probabilities over all m examples (the example index on x and y is omitted inside the products and sums for brevity):

$$\max_\theta \; \prod_{i=1}^{m} \mathrm{prob}(h_\theta(x), y) = \prod_{i=1}^{m} \big( y\, h_\theta(x) + (1 - y)(1 - h_\theta(x)) \big)$$

Taking the logarithm, negating, and averaging over the m examples gives the equivalent minimization problem:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \log\big( y\, h_\theta(x) + (1 - y)(1 - h_\theta(x)) \big), \qquad \min_\theta J(\theta)$$

since when y = 1,

$$\log\big( y\, h_\theta(x) + (1 - y)(1 - h_\theta(x)) \big) = \log h_\theta(x)$$

when y = 0,

$$\log\big( y\, h_\theta(x) + (1 - y)(1 - h_\theta(x)) \big) = \log\big(1 - h_\theta(x)\big)$$

so

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Big( y \log h_\theta(x) + (1 - y)\log\big(1 - h_\theta(x)\big) \Big), \qquad \min_\theta J(\theta)$$
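A minimal sketch of this cost on a toy dataset (the data, θ, and the np.clip guard against log(0) are illustrative choices, not part of the derivation):

import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def cost(theta, X, y):
    # J(theta) = -1/m * sum( y*log(h) + (1-y)*log(1-h) )
    h = sigmoid(X.dot(theta))
    h = np.clip(h, 1e-12, 1 - 1e-12)   # numerical safety for the logs
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])  # first column acts as a bias term
y = np.array([1, 0, 1])
theta = np.zeros(2)

print(cost(theta, X, y))   # log(2) ~ 0.693 when h = 0.5 everywhere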

Since

$$\sigma'(t) = \sigma(t)\big(1 - \sigma(t)\big)$$

$$\big(1 - \sigma(t)\big)' = \sigma(t)\big(\sigma(t) - 1\big)$$

so
$$\frac{\partial J(\theta)}{\partial \theta_j} = -\frac{1}{m} \sum_{i=1}^{m} \left( y \, \frac{1}{h_\theta(x)} \, \frac{\partial h_\theta(x)}{\partial \theta_j} + (1 - y) \, \frac{1}{1 - h_\theta(x)} \, \frac{\partial \big(1 - h_\theta(x)\big)}{\partial \theta_j} \right)$$

then
$$\frac{\partial J(\theta)}{\partial \theta_j} = -\frac{1}{m} \sum_{i=1}^{m} \left( y \, \frac{1}{h_\theta(x)} \, h_\theta(x)\big(1 - h_\theta(x)\big)\, x_j + (1 - y) \, \frac{1}{1 - h_\theta(x)} \, h_\theta(x)\big(h_\theta(x) - 1\big)\, x_j \right)$$

then
$$\frac{\partial J(\theta)}{\partial \theta_j} = -\frac{1}{m} \sum_{i=1}^{m} \Big( y\big(1 - h_\theta(x)\big)\, x_j - (1 - y)\, h_\theta(x)\, x_j \Big)$$

then
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \big( h_\theta(x) - y \big)\, x_j$$
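A small sketch that checks this closed-form gradient against a finite-difference approximation of J(θ) (data, θ, and step size are illustrative):

import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def cost(theta, X, y):
    h = np.clip(sigmoid(X.dot(theta)), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    # dJ/dtheta_j = 1/m * sum( (h_theta(x) - y) * x_j )
    return X.T.dot(sigmoid(X.dot(theta)) - y) / len(y)

X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])
y = np.array([1.0, 0.0, 1.0])
theta = np.array([0.1, -0.2])

eps = 1e-6
numeric = np.array([
    (cost(theta + eps * e, X, y) - cost(theta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(len(theta))
])
print(gradient(theta, X, y))   # analytic gradient
print(numeric)                 # the two should agree closely

The full implementation below uses exactly this gradient in its gradient-descent update.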

#!/usr/bin/env python
# -*- coding: utf-8 -*-
################################
## Author: Vincent.Y
################################

import numpy as np


def sigmoid(X):
    return 1.0 / (1.0 + np.exp(-X))


class LogisticRegression(object):
    """
    solver : {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'sgd', 'ls'}
        only 'sgd' (batch gradient descent on the logistic loss) and
        'ls' (a regularized least-squares shortcut) are implemented here
    alpha : float
        L1 regularization strength
    lam : float
        L2 regularization strength
    lr : float
        learning rate for gradient descent
    max_iter : int
        maximum number of iterations for the solver to converge
    bias : bool
        if True, a constant column is appended to X
    """

    def __init__(self, solver="sgd", alpha=0, lam=1, lr=0.2, max_iter=200, bias=False):
        self.solver = solver
        self.coef_ = None
        self.bias = bias
        self.lam = lam
        self.alpha = alpha
        if self.solver == "sgd":
            self.lr = lr
            self.max_iter = max_iter

    def gradient_descent(self, X, y):
        # Batch gradient descent: dJ/dtheta_j = mean((h - y) * x_j),
        # plus the L2 (lam) and L1 (alpha) regularization terms.
        n_features = X.shape[1]
        for i in range(self.max_iter):
            pred = sigmoid(X.dot(self.coef_))
            for j in range(n_features):
                # do not regularize the bias column (appended last)
                is_bias = self.bias and j == n_features - 1
                grad = np.mean((pred - y) * X[:, j])
                if not is_bias:
                    grad += 2 * self.lam * self.coef_[j]         # L2 term
                    grad += self.alpha * np.sign(self.coef_[j])  # L1 subgradient
                self.coef_[j] = self.coef_[j] - self.lr * grad
        return self.coef_

    def fit(self, X, y):
        if self.bias:
            X = np.hstack([X, np.ones((X.shape[0], 1))])

        if self.solver == "ls":
            # Regularized least-squares (ridge) closed-form solution.
            G = self.lam * np.eye(X.shape[1])
            G[-1, -1] = 0  # don't regularize bias
            self.coef_ = np.dot(np.linalg.inv(np.dot(X.T, X) + np.dot(G.T, G)), np.dot(X.T, y))
        else:
            self.coef_ = np.zeros(X.shape[1])
            self.coef_ = self.gradient_descent(X, y)

    def predict_proba(self, X):
        if self.bias:
            X = np.hstack([X, np.ones((X.shape[0], 1))])
        return sigmoid(X.dot(self.coef_))

    def predict(self, X):
        if self.bias:
            X = np.hstack([X, np.ones((X.shape[0], 1))])
        return np.array([1 if p > 0.5 else 0 for p in sigmoid(X.dot(self.coef_))])


if __name__ == "__main__":
    x = np.array([1, 2, 3])
    x = x.reshape(-1, 1)
    y = np.array([0, 0, 1])

    model = LogisticRegression(bias=True, solver="sgd", max_iter=100, lam=0)
    model.fit(x, y)
    print(model.coef_)
    print(model.predict(x))
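For comparison, the same toy data can be fit with scikit-learn's LogisticRegression, assuming scikit-learn is available; its solvers are the ones named in the docstring above:

import numpy as np
from sklearn.linear_model import LogisticRegression as SkLogisticRegression

x = np.array([1, 2, 3]).reshape(-1, 1)
y = np.array([0, 0, 1])

clf = SkLogisticRegression(solver="lbfgs")
clf.fit(x, y)
print(clf.coef_, clf.intercept_)
print(clf.predict(x))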

[1] http://synapse.koreamed.org/Synapse/Data/PDFData/0006JKAN/jkan-43-154.pdf
[2] https://en.wikipedia.org/wiki/Logistic_regression
[3] https://github.com/muyinanhai/ml-learn
