Machine Learning - Logistic Regression

Section I: Brief Glimpse Into Logistic Regression

Logistic regression is a classification model that is easy to implement yet performs very well on linearly separable classes, and it is one of the most widely used classification algorithms in industry. Like the perceptron and Adaline, logistic regression is a linear model for binary classification that can also be extended to multiclass classification, for example via the one-vs-rest (OvR) technique.
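As a quick illustration of the OvR idea (a minimal sketch, not part of the original walkthrough; it assumes only scikit-learn), OneVsRestClassifier fits one binary logistic regression per class and predicts with the most confident one:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X,y=load_iris(return_X_y=True)

#Fit one binary classifier per class; prediction picks the most confident one
ovr=OneVsRestClassifier(LogisticRegression())
ovr.fit(X,y)
print(len(ovr.estimators_))  #3: one binary logistic regression per Iris class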

Section II: Model Logistic Regression Via a Self-Coded Implementation and Sklearn

Step 1: Logistic sigmoid function

import matplotlib.pyplot as plt
import numpy as np

plt.rcParams['figure.dpi']=200
plt.rcParams['savefig.dpi']=200
font = {'family': 'Times New Roman',
        'weight': 'light'}
plt.rc("font", **font)

def sigmoid(z):
    #Logistic sigmoid: maps any real z into the open interval (0, 1)
    return 1.0/(1.0+np.exp(-z))

z=np.arange(-7,7,0.1)
phi_z=sigmoid(z)
plt.plot(z,phi_z)
plt.axvline(0.0,color='k')
plt.ylim(-0.1,1.1)
plt.xlabel('z')
plt.ylabel(r'$\phi (z)$')
plt.yticks([0.0,0.5,1.0])
ax=plt.gca()
ax.yaxis.grid(True)
plt.savefig('./fig1.png')
plt.show()

[Figure 1: the logistic sigmoid function φ(z)]
As the figure shows, the sigmoid function accepts any real number from negative to positive infinity and converts it into a value between 0 and 1. The curve crosses the y-axis at (0, 0.5).
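A quick numeric check of these properties, using the sigmoid defined above:

print(sigmoid(np.array([-100.0,0.0,100.0])))
#Approximately [0, 0.5, 1]: large negative z maps near 0, z=0 maps to exactly 0.5,
#and large positive z maps near 1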

Step 2: Logistic cost function formulated via the maximum log-likelihood function

import matplotlib.pyplot as plt
import numpy as np
#sigmoid as defined in Step 1, packaged here in a local module
from LogisticRegression.sigmoid import sigmoid

plt.rcParams['figure.dpi']=200
plt.rcParams['savefig.dpi']=200
font = {'family': 'Times New Roman',
        'weight': 'light'}
plt.rc("font", **font)

#Logistic regression cost function
#For a single training sample:
#J(phi(z), y; w) = -y*log(phi(z)) - (1-y)*log(1-phi(z))

def cost_1(z):
    return -np.log(sigmoid(z))

def cost_0(z):
    return -np.log(1-sigmoid(z))

z=np.arange(-10,10,0.1)
phi_z=sigmoid(z)

c1=[cost_1(x) for x in z]
plt.plot(phi_z,c1,label='J(w) if y=1')
c0=[cost_0(x) for x in z]
plt.plot(phi_z,c0,linestyle='--',label='J(w) if y=0')
plt.ylim(0.0,5.1)
plt.xlim([0,1])
plt.xlabel(r'$\phi$(z)')
plt.ylabel('J(w)')
plt.legend(loc='upper left')
plt.savefig('./fig2.png')
plt.show()

[Figure 2: the cost J(w) as a function of φ(z) for y=1 (solid) and y=0 (dashed)]
Summary
Two observations follow from the figure. First, when the predicted class agrees with the true class, the cost approaches 0. Second, when the prediction is completely at odds with the true class, the cost function penalizes it much more heavily. Interestingly, the sigmoid output is a value in (0, 1) and can be interpreted as the probability that the sample belongs to the positive class. As before, the model can also be extended to multiclass applications via the OvR technique.
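A small worked example (continuing from the functions above, with z values chosen only for illustration) makes this asymmetry concrete:

#Penalty for a true label y=1 at two predicted probabilities
z_good=2.2    #phi(z) is about 0.90, close to the true label
z_bad=-2.2    #phi(z) is about 0.10, far from the true label
print(cost_1(z_good))   #about 0.11: small penalty for a confident correct prediction
print(cost_1(z_bad))    #about 2.30: far larger penalty for a confident wrong one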

Step 3: Logistic regression implementation

Part 1: Logistic Regression implementation

import numpy as np

class LogisticRegressionGD(object):
    """Logistic regression classifier trained with batch gradient descent."""
    def __init__(self,eta=0.05,n_iter=100,random_state=1):
        self.eta=eta                    #learning rate
        self.n_iter=n_iter              #number of passes over the training set
        self.random_state=random_state  #seed for reproducible weight initialization

    def fit(self,X,y):
        rgen=np.random.RandomState(self.random_state)
        #Initialize the weights to small random numbers; w_[0] is the bias unit
        self.w_=rgen.normal(loc=0.0,scale=0.01,
                            size=1+X.shape[1])
        self.cost_=[]

        for i in range(self.n_iter):
            net_input=self.net_input(X)
            output=self.activation(net_input)
            errors=(y-output)
            #Gradient descent step on the logistic (negative log-likelihood) cost
            self.w_[1:]+=self.eta*X.T.dot(errors)
            self.w_[0]+=self.eta*errors.sum()

            #J(w)=-y*log(phi(z))-(1-y)*log(1-phi(z)), summed over all samples
            cost=(-y.dot(np.log(output))-(1-y).dot(np.log(1-output)))
            self.cost_.append(cost)
        return self

    def net_input(self,X):
        return np.dot(X,self.w_[1:])+self.w_[0]

    def activation(self,z):
        #Logistic sigmoid; z is clipped to avoid overflow in np.exp
        return 1.0/(1.0+np.exp(-np.clip(z,-250,250)))

    def predict(self,X):
        #Thresholding z at 0 is equivalent to thresholding phi(z) at 0.5
        return np.where(self.net_input(X)>=0.0,1,0)
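Note that predict thresholds the net input z at 0 rather than the sigmoid output: since phi(0)=0.5 and the sigmoid is monotonically increasing, z>=0 is exactly equivalent to phi(z)>=0.5, so the activation can be skipped at prediction time. A minimal check of this equivalence (the weights below are hypothetical, chosen only for illustration):

model=LogisticRegressionGD()
model.w_=np.array([0.0,1.0,-1.0])  #hypothetical weights: bias, then one weight per feature
X_demo=np.array([[2.0,1.0],[1.0,2.0]])
z=model.net_input(X_demo)
print((z>=0.0)==(model.activation(z)>=0.5))  #[ True  True]: the two rules always agree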

Part 2: Usage

import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
#Local modules: the LogisticRegressionGD class from Part 1 and a decision-region plotting helper
from LogisticRegression import logistic_regression
from LogisticRegression.visualize import plot_decision_regions

plt.rcParams['figure.dpi']=200
plt.rcParams['savefig.dpi']=200
font = {'family': 'Times New Roman',
        'weight': 'light'}
plt.rc("font", **font)

##Section 1: Load data and split it into train/test dataset
iris=datasets.load_iris()
X=iris.data[:,[2,3]]
y=iris.target

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=1,stratify=y)
X_train_01_subset=X_train[(y_train==0)|(y_train==1)]
y_train_01_subset=y_train[(y_train==0)|(y_train==1)]

lrgd=logistic_regression.LogisticRegressionGD(eta=0.05,n_iter=1000,
                                              random_state=1)
lrgd.fit(X_train_01_subset,y_train_01_subset)

plot_decision_regions(X=X_train_01_subset,
                      y=y_train_01_subset,
                      classifier=lrgd)
plt.xlabel('petal length [cm]')
plt.ylabel('petal width [cm]')
plt.legend(loc='upper left')
plt.savefig('./fig3.png')
plt.show()

[Figure 3: decision regions of the self-coded LogisticRegressionGD on the two-class Iris subset]
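As a quick sanity check (not in the original), training accuracy on this two-class subset can be computed directly; Setosa and Versicolor are linearly separable in these two features, so it should reach 1.0:

import numpy as np

print(np.mean(lrgd.predict(X_train_01_subset)==y_train_01_subset))  #expected: 1.0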

Step 4: Train a Logistic Regression model with Sklearn

import matplotlib.pyplot as plt
from sklearn import datasets
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
#Local plotting helper that additionally highlights the test samples via test_idx
from LogisticRegression.visualize_test_idx import plot_decision_regions

plt.rcParams['figure.dpi']=200
plt.rcParams['savefig.dpi']=200
font = {'family': 'Times New Roman',
        'weight': 'light'}
plt.rc("font", **font)

##Section 1: Load data and split it into train/test dataset
iris=datasets.load_iris()
X=iris.data[:,[2,3]]
y=iris.target
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=1,stratify=y)

#Section 2: Standardize the features
sc=StandardScaler()
sc.fit(X_train)
X_train_std=sc.transform(X_train)
X_test_std=sc.transform(X_test)

#Section 3: Train Logistic Regression model
lr=LogisticRegression(C=100,random_state=1)
lr.fit(X_train_std,y_train)
X_combined_std=np.vstack((X_train_std,X_test_std))
y_combined=np.hstack((y_train,y_test))

plot_decision_regions(X=X_combined_std,
                      y=y_combined,
                      classifier=lr,
                      test_idx=range(105,150))
plt.xlabel('petal length [standardized]')
plt.ylabel('petal width [standardized]')
plt.legend(loc='upper left')
plt.savefig('./fig4.png')
plt.show()

[Figure 4: decision regions of sklearn's LogisticRegression on the standardized Iris data, with test samples highlighted]
Summary
As the figure shows, sklearn's LogisticRegression model separates the three Iris classes effectively.

To make the usage of the model's predict and predict_proba methods easier to follow, an example is given below:

#Section 4: Predict the probability and class type
print("The probability belonging to each class: \n",lr.predict_proba(X_test_std[:3]))
print("Class type: \n",lr.predict_proba(X_test_std[:3,:]).argmax(axis=1))
print("Class type: \n",lr.predict(X_test_std[:3,:]))

The output is as follows:

The probability belonging to each class: 
 [[3.17983737e-08 1.44886616e-01 8.55113353e-01]
 [8.33962295e-01 1.66037705e-01 4.55557009e-12]
 [8.48762934e-01 1.51237066e-01 4.63166788e-13]]
Class type: 
 [2 0 0]
Class type: 
 [2 0 0]
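Each row of predict_proba is a probability distribution over the three classes and therefore sums to 1; continuing from the session above:

print(lr.predict_proba(X_test_std[:3,:]).sum(axis=1))  #[1. 1. 1.]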
Section III: Tackle Overfitting Via Regularization

Overfitting is a common problem in machine learning: a model that learns the training data well can still generalize poorly to unseen test data. Concretely, "overfitting" corresponds to high variance, while "underfitting" corresponds to high bias, i.e. the model is not expressive enough to capture the information hidden in the training data.
Overfitting can be curbed with L1 or L2 regularization. L1 regularization induces sparsity, driving many weights to exactly zero and reacting less strongly to large values, whereas L2 regularization yields dense solutions and is more sensitive to large weights, penalizing them more heavily.
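Before the full regularization-path example below, here is a minimal sketch of the L1/L2 contrast (assuming scikit-learn; penalty='l1' needs a compatible solver such as 'liblinear', and the C value is illustrative):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X,y=load_iris(return_X_y=True)
X_std=StandardScaler().fit_transform(X)

#Strong regularization (small C) makes the contrast visible
l1=LogisticRegression(penalty='l1',solver='liblinear',C=0.1).fit(X_std,y)
l2=LogisticRegression(penalty='l2',solver='liblinear',C=0.1).fit(X_std,y)

#L1 typically drives some coefficients to exactly zero; L2 keeps them small but nonzero
print('L1 zero coefficients:',(l1.coef_==0).sum())
print('L2 zero coefficients:',(l2.coef_==0).sum())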

import matplotlib.pyplot as plt
from sklearn import datasets
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

plt.rcParams['figure.dpi']=200
plt.rcParams['savefig.dpi']=200
font = {'family': 'Times New Roman',
        'weight': 'light'}
plt.rc("font", **font)

##Section 1: Load data and split it into train/test dataset
iris=datasets.load_iris()
X=iris.data[:,[2,3]]
y=iris.target
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=1,stratify=y)

#Section 2: Standardize the features
sc=StandardScaler()
sc.fit(X_train)
X_train_std=sc.transform(X_train)
X_test_std=sc.transform(X_test)

#Section 3: The effect of regularization strength on the weight parameters
weights,params=[],[]
for c in np.arange(-5,5):
    lr=LogisticRegression(C=10.**c,random_state=1)
    lr.fit(X_train_std,y_train)
    weights.append(lr.coef_[1])
    params.append(10.**c)

weights=np.array(weights)
plt.plot(params,weights[:,0],label='petal length')
plt.plot(params,weights[:,1],label='petal width')
plt.ylabel('weight coefficient')
plt.xlabel('C')
plt.legend(loc='upper left')
plt.xscale('log')
plt.savefig('./fig5.png')
plt.show()

Note that the weights collected here are those of class 1 (lr.coef_[1]): in lr.coef_, each row corresponds to one class and each column to one feature, so only the weight parameters of that single class are gathered.

lr.coef_
Out[3]: 
array([[-4.55059393e-04, -4.37654048e-04],
       [ 9.45879351e-05,  5.76462665e-05],
       [ 3.60471456e-04,  3.80007780e-04]])

[Figure 5: weight coefficients of class 1 versus the regularization parameter C (log scale)]
As the figure shows, the smaller the parameter C, the more the weight coefficients shrink toward zero; conversely, as C grows the weights grow large, which can hurt generalization. C is the inverse of the regularization parameter λ, so a smaller C means a stronger regularization penalty.
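To make the shrinkage concrete, the following continues from the session above (the two C values are illustrative):

for c in (1e-3,1e3):
    lr=LogisticRegression(C=c,random_state=1)
    lr.fit(X_train_std,y_train)
    #With C=1e-3 (strong regularization) the coefficients sit near zero;
    #with C=1e3 (weak regularization) they are orders of magnitude larger
    print('C=%g, max |coef|: %.4f'%(c,np.abs(lr.coef_).max()))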

References
Sebastian Raschka, Vahid Mirjalili. Python Machine Learning, 2nd Edition. Nanjing: Southeast University Press, 2018.
