Machine Learning Notes: Logistic Regression

Link to this article: https://blog.csdn.net/u011094454/article/details/76358946

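Logistic Regression

Pros: computationally inexpensive; easy to understand and implement.

Cons: prone to underfitting; classification accuracy may be low.

Works with: numeric and nominal data.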

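To implement Logistic regression we need the Sigmoid function, which resembles a step function: sigmoid(z) = 1/(1+exp(-z)).

The range of the sigmoid function is (0, 1). Plotted over a sufficiently wide interval of z, it looks approximately like a step function that jumps from 0 to 1 at z = 0.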
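
The two properties above can be checked numerically. The following is a minimal sketch using NumPy; the sample points in `zs` are arbitrary values chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# A few illustrative inputs, from strongly negative to strongly positive
zs = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
values = sigmoid(zs)

# Outputs rise from near 0 to near 1 and pass through 0.5 at z = 0
print(values)
```

Far from zero the output is already very close to 0 or 1, which is why the curve looks like a step when viewed over a wide interval.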

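To implement Logistic regression, multiply each feature by a regression coefficient, add up all the products, and pass the sum to the Sigmoid function, which yields a value between 0 and 1. Any input whose output is greater than 0.5 is assigned to class 1; an output less than 0.5 is assigned to class 0.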
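
As a rough illustration of this decision rule, here is a small sketch. The helper name `classify` and the coefficient values are hypothetical, and sending an output of exactly 0.5 to class 0 is an assumption, since the text does not cover that case.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def classify(x, weights):
    """Return 1 if sigmoid(w . x) > 0.5, otherwise 0 (hypothetical helper)."""
    prob = sigmoid(np.dot(x, weights))
    return 1 if prob > 0.5 else 0

weights = np.array([-1.0, 2.0, 0.5])  # illustrative coefficients, not fitted
x_a = np.array([1.0, 1.0, 1.0])       # weighted sum = -1 + 2 + 0.5 = 1.5
x_b = np.array([1.0, 0.0, 1.0])       # weighted sum = -1 + 0 + 0.5 = -0.5

print(classify(x_a, weights), classify(x_b, weights))
```

Because sigmoid(0) = 0.5, comparing the output with 0.5 is the same as checking the sign of the weighted sum.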

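The input to the Sigmoid function is denoted z and is given by:

z = w0x0 + w1x1 + w2x2 + w3x3 + w4x4 + ...... + wnxn

Written as a product of vectors, this simplifies to:

z = wTx

where x is the input vector to the classifier, w is the vector of best-fit coefficients we want to find, and wT is the transpose of w.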
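
The equivalence of the two forms can be seen in a short sketch. The numbers are made up for illustration, and treating x0 = 1.0 as a constant input is an assumption here.

```python
import numpy as np

w = np.array([0.5, -1.0, 2.0])   # illustrative coefficients w0, w1, w2
x = np.array([1.0, 3.0, 0.25])   # x0 = 1.0 assumed to be a constant input

# Term-by-term sum: w0*x0 + w1*x1 + w2*x2
z_sum = sum(w_i * x_i for w_i, x_i in zip(w, x))

# The same quantity as a vector product
z_dot = w @ x

print(z_sum, z_dot)
```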

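Combined with the iteration formula of the gradient ascent algorithm:

w := w + a*xT*(y-h(z))

Many readers wonder where the term after a comes from. In short, xT*(y-h(z)) is the gradient of the log-likelihood with respect to w, where h(z) = sigmoid(z) is the predicted value and y is the true label. Each update therefore moves w in the direction in which the log-likelihood increases fastest, toward its maximum, and the step size a controls how far each update moves.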
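
One way to gain confidence in the term xT*(y-h(z)) without working through the proof is to compare it with a finite-difference gradient. The sketch below assumes the usual Bernoulli log-likelihood as the objective (the notes do not write it out), and the data are random values generated only for the check.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    """Sum over samples of y*log(h) + (1-y)*log(1-h), with h = sigmoid(X w)."""
    h = sigmoid(X @ w)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))                    # 20 random samples, 3 features
y = rng.integers(0, 2, size=20).astype(float)   # random 0/1 labels
w = rng.normal(size=3)                          # an arbitrary point to test

# Analytic gradient: the term that follows a in the update formula
analytic = X.T @ (y - sigmoid(X @ w))

# Central finite differences, one coordinate at a time
eps = 1e-6
numeric = np.zeros_like(w)
for k in range(w.size):
    step = np.zeros_like(w)
    step[k] = eps
    numeric[k] = (log_likelihood(w + step, X, y)
                  - log_likelihood(w - step, X, y)) / (2 * eps)

# Largest disagreement between the two gradients
print(np.max(np.abs(analytic - numeric)))
```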

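For the detailed proof, see the logistic regression lectures in Andrew Ng's video course, in particular the part on the logistic regression loss function. For the derivation steps, the solution of the least squares method is a useful comparison.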

Python 3.6 implementation of the gradient ascent optimization algorithm for Logistic regression:

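import numpy as np

def sigmoid(z):
    #Sigmoid function: 1/(1+exp(-z))
    return 1.0/(1.0+np.exp(-z))

#Gradient ascent optimization for Logistic regression
def gradAscent(dataMatIn,classLabels):
    #Convert the data set to an m*n array and the labels to an m*1 column
    #(plain arrays and @ are used because np.mat was removed in NumPy 2.0)
    dataMatrix = np.asarray(dataMatIn,dtype=float)
    labelMat = np.asarray(classLabels,dtype=float).reshape(-1,1)
    #m samples, n features
    m,n = np.shape(dataMatrix)
    #Step size a
    alpha = 0.001
    #Number of iterations
    maxCycles = 500
    weights = np.ones((n,1))
    for k in range(maxCycles):
        #dataMatrix @ weights computes wTx for every sample; it is the argument of sigmoid
        h = sigmoid(dataMatrix @ weights)
        #Difference between the true labels and the predicted values
        error = labelMat - h
        #Gradient ascent: adjust the coefficients along dataMatrix^T * error
        weights = weights + alpha*(dataMatrix.T @ error)
    return weights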
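
To see the batch update at work, the following sketch runs the same loop on synthetic data and records the log-likelihood at every iteration. The two Gaussian clusters, the random seed, and the `history` list are assumptions made for illustration; they are not the data set used in these notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m = 100
# Two synthetic clusters; the leading column of ones is the constant input x0
features = np.vstack([rng.normal(-1.0, 1.0, size=(m // 2, 2)),
                      rng.normal(+1.0, 1.0, size=(m // 2, 2))])
X = np.hstack([np.ones((m, 1)), features])
y = np.concatenate([np.zeros(m // 2), np.ones(m // 2)]).reshape(-1, 1)

alpha, max_cycles = 0.001, 500       # same step size and iteration count as above
weights = np.ones((X.shape[1], 1))
history = []                         # log-likelihood before each update
for _ in range(max_cycles):
    h = sigmoid(X @ weights)
    history.append(float(np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))))
    weights = weights + alpha * (X.T @ (y - h))

# The log-likelihood should rise from the first iteration to the last
print(history[0], history[-1])
```

With a step size this small relative to the data, the log-likelihood should not decrease from one iteration to the next, which is a quick sanity check that the sign of the update is right.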

Plotting the decision boundary:

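#Plot the data set and the best-fit line found by Logistic regression
def plotBestFit(weights):
    import numpy as np
    import matplotlib.pyplot as plt
    #Flatten so that an (n,1) matrix or array and a 1-D array of weights all work
    weights = np.asarray(weights,dtype=float).ravel()
    #loadDataSet() is not shown in these notes; it is expected to return the
    #feature rows [x0,x1,x2] and the list of class labels
    dataMat,labelMat = loadDataSet()
    dataArr = np.array(dataMat)
    n = np.shape(dataArr)[0]
    xcord1 = []; ycord1 = []
    xcord2 = []; ycord2 = []
    #Split the points by class so each class gets its own marker
    for i in range(n):
        if int(labelMat[i])==1:
            xcord1.append(dataArr[i,1])
            ycord1.append(dataArr[i,2])
        else:
            xcord2.append(dataArr[i,1])
            ycord2.append(dataArr[i,2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xcord1,ycord1,s=30,c='red',marker='s')
    ax.scatter(xcord2,ycord2,s=30,c='green')
    x = np.arange(-3.0,3.0,0.1)
    #The boundary is where z = 0: weights[0]*x0+weights[1]*x1+weights[2]*x2 = 0 with x0 = 1
    #Solving this linear equation for x2 gives the line below
    y = (-weights[0]-weights[1]*x)/weights[2]
    ax.plot(x,y)
    plt.xlabel('X1');plt.ylabel('X2')
    plt.show()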
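
The line drawn by plotBestFit is the set of points where z = 0, that is, where the sigmoid output is exactly 0.5. A small sketch, using made-up weights rather than a fitted result, confirms that points computed with the same formula satisfy this.

```python
import numpy as np

weights = np.array([4.0, 0.5, -0.6])   # illustrative values, not a fitted result

# Same formula as in the plotting code: solve w0 + w1*x1 + w2*x2 = 0 for x2
x1 = np.arange(-3.0, 3.0, 0.1)
x2 = (-weights[0] - weights[1] * x1) / weights[2]

# Plug the points back in; x0 is the constant 1.0
z = weights[0] * 1.0 + weights[1] * x1 + weights[2] * x2

# Largest deviation of z from 0 along the line
print(np.max(np.abs(z)))
```

Points on one side of the line have z > 0 and go to class 1; points on the other side have z < 0 and go to class 0.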

Stochastic gradient ascent algorithm:

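import numpy as np

def stocGradAscent(dataMatrix,classLabels,numIter):
    dataMatrix = np.array(dataMatrix)
    #m samples, n features
    m,n = np.shape(dataMatrix)
    #Initialize an array of length n with every element 1.0
    weights = np.ones(n)
    #Make numIter passes over the data
    for j in range(numIter):
        #Pick samples in random order, the same approach as in Chapter 3; this avoids periodic fluctuations
        dataIndex = list(range(m))
        for i in range(m):
            #Adjust alpha dynamically to damp fluctuations, including high-frequency ones
            #alpha shrinks as i grows but never reaches 0 because of the constant 0.01;
            #it is not strictly decreasing, since it rises again when i resets at each new pass
            alpha = 4/(1.0+j+i)+0.01
            #Draw a position in dataIndex, then look up the sample it refers to
            randIndex = int(np.random.uniform(0,len(dataIndex)))
            sampleIndex = dataIndex[randIndex]
            #sigmoid(z) = 1/(1+exp(-z)), as defined earlier in these notes
            h = sigmoid(np.dot(dataMatrix[sampleIndex],weights))
            error = classLabels[sampleIndex] - h
            weights = weights+alpha*error*dataMatrix[sampleIndex]
            #Remove the used sample so it is not drawn again in this pass
            del(dataIndex[randIndex])
    return weights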
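
The behaviour of the step size described in the comments can be made concrete with a short sketch. The helper name `step_size` and the choice of 5 samples per pass are hypothetical, picked only to keep the printed list short.

```python
def step_size(j, i):
    """Step size at pass j, inner step i: 4/(1+j+i) + 0.01."""
    return 4 / (1.0 + j + i) + 0.01

m = 5  # illustrative number of samples per pass
schedule = [step_size(j, i) for j in range(3) for i in range(m)]

# Within a pass the values fall; at the start of the next pass they jump back up
print([round(a, 3) for a in schedule])
```

The constant 0.01 keeps the step size bounded away from zero, so later samples still have some influence on the weights.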