[Machine Learning] 8. Nonlinear Regression: Logistic Regression

1. Probability

    1.1 Definition: a probability measures how likely an event is to occur.

    1.2 Range: 0 <= P <= 1

    1.3 Ways to estimate it:

        from personal belief (subjective probability)

        from historical data

        from simulated data

    1.4 Conditional probability:

        P(A|B) = P(A∩B) / P(B)
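A quick numeric sanity check of the formula (the event counts below are made up purely for illustration):

```python
# Conditional probability P(A|B) = P(A∩B) / P(B).
# Hypothetical counts: out of 100 trials, B occurred 40 times,
# and A and B occurred together 10 times.
p_b = 40 / 100
p_a_and_b = 10 / 100
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 0.25
```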


2. Logistic Regression

    2.1 Example

    2.2 Basic model

        Test data: X(x0, x1, x2, ..., xn)

        Parameters to learn: Θ(θ0, θ1, θ2, ..., θn)

        z = θ0*x0 + θ1*x1 + θ2*x2 + ... + θn*xn

        Vector form:

            z = Θ^T * X

        To handle binary labels, the sigmoid function is introduced to smooth the curve:

            g(z) = 1 / (1 + e^(-z))

        Prediction function:

            hθ(x) = g(Θ^T x) = 1 / (1 + e^(-Θ^T x))

        Expressed as probabilities:

        Positive case (y=1):

            P(y=1 | x; θ) = hθ(x)

        Negative case (y=0):

            P(y=0 | x; θ) = 1 - hθ(x)
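The sigmoid and the prediction function above can be sketched directly (the function names and the parameter values are my own):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z)): squashes any real z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def predict_prob(theta, x):
    # hθ(x) = g(Θ^T x): the estimated P(y=1 | x; θ)
    return sigmoid(np.dot(theta, x))

theta = np.array([0.0, 1.0])   # hypothetical learned parameters
x = np.array([1.0, 0.0])       # x0 = 1 (intercept term), x1 = 0
print(predict_prob(theta, x))  # 0.5 — z = 0 maps to probability 0.5
```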


    2.3 Cost function

        Linear regression:

            hθ(x) = Θ^T x

            J(θ) = (1/2m) * Σ_{i=1..m} (hθ(x^(i)) - y^(i))^2

        Find the θ0, θ1, ... that minimize the expression above.

    Logistic regression:

        Cost function:

            J(θ) = -(1/m) * Σ_{i=1..m} [ y^(i)*log(hθ(x^(i))) + (1-y^(i))*log(1-hθ(x^(i))) ]
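The logistic cost can be sketched as follows (vectorized with NumPy; the function names and the tiny dataset are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    # J(θ) = -(1/m) * Σ [ y*log(hθ(x)) + (1-y)*log(1-hθ(x)) ]
    m = len(y)
    h = sigmoid(X.dot(theta))
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m

# Tiny made-up dataset: column 0 is the intercept term x0 = 1
X = np.array([[1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
print(logistic_cost(np.zeros(2), X, y))  # log(2) ≈ 0.6931, since every h = 0.5
```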


    2.4 Solution: gradient descent

        Minimize J(θ) by stepping along the negative gradient:

            θj := θj - α * ∂J(θ)/∂θj

        Working out the partial derivative gives the update rule:

            θj := θj - α * (1/m) * Σ_{i=1..m} (hθ(x^(i)) - y^(i)) * xj^(i)

        Update all θj simultaneously, and repeat until convergence.
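The update rule above, applied to all θj at once, might look like this (a vectorized sketch; the dataset and learning rate are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_step(theta, X, y, alpha):
    # θ := θ - α * (1/m) * X^T (hθ(X) - y): every θj is updated simultaneously
    m = len(y)
    h = sigmoid(X.dot(theta))
    gradient = X.T.dot(h - y) / m
    return theta - alpha * gradient

# Two made-up samples: column 0 is the intercept term x0 = 1
X = np.array([[1.0, 0.0], [1.0, 2.0]])
y = np.array([0.0, 1.0])
theta = np.zeros(2)
for _ in range(1000):  # repeat until (approximate) convergence
    theta = gradient_step(theta, X, y, alpha=0.5)
print(theta)
```

After training, the fitted θ separates the two samples: hθ(x) < 0.5 for the y=0 sample and > 0.5 for the y=1 sample.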


3. Python implementation (the demo below applies gradient descent to a squared-error cost on synthetic linear data)

# -*- coding: utf-8 -*-
import numpy as np
import random

## Gradient descent
def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTran = np.transpose(x)                 # transpose of x
    for i in range(0, numIterations):
        hypothesis = np.dot(x, theta)       # predictions for all samples
        loss = hypothesis - y
        cost = np.sum(loss**2) / (2*m)      # squared-error cost
        print("Iteration %d | cost: %f" % (i, cost))
        gradient = np.dot(xTran, loss) / m  # gradient of the cost w.r.t. theta
        theta = theta - alpha * gradient
    return theta
        
## Generate some synthetic data
def genData(numPoints, bias, variance):
    x = np.zeros(shape=(numPoints, 2))  # numPoints rows, 2 columns, all zeros
    y = np.zeros(shape=numPoints)       # one target value per sample
    for i in range(0, numPoints):
        x[i][0] = 1                     # intercept term
        x[i][1] = i
        y[i] = (i + bias) + random.uniform(0, 1) * variance  # noisy linear target
    return x, y

x, y = genData(100, 25, 10)

m, n = np.shape(x)

numIterations = 100000
alpha = 0.0005  # learning rate
theta = np.ones(n)
theta = gradientDescent(x, y, theta, alpha, m, numIterations)
print(theta)
