Watermelon Book 3.3: Logistic Regression

This is the first hands-on exercise in the Watermelon Book. The book's treatment of the theory feels rather stiff and hard to follow, so I will mostly describe logistic regression the way it is taught in Andrew Ng's deep learning course.

Logistic regression on watermelon dataset 3.0α

The data are as follows:

ID   Density  Sugar content  Good melon
0    0.697    0.460          1
1    0.774    0.376          1
2    0.634    0.264          1
3    0.608    0.318          1
4    0.556    0.215          1
5    0.403    0.237          1
6    0.481    0.149          1
7    0.437    0.211          1
8    0.666    0.091          0
9    0.243    0.267          0
10   0.245    0.057          0
11   0.343    0.099          0
12   0.639    0.161          0
13   0.657    0.198          0
14   0.360    0.370          0
15   0.593    0.042          0
16   0.719    0.103          0

Importing the data

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

df = pd.DataFrame({'midu':[0.697,0.774,0.634,0.608,0.556,0.403,0.481,0.437, 0.666,0.243,0.245,0.343,0.639,0.657,0.360,0.593,0.719],
                   'tang':[0.460,0.376,0.264,0.318,0.215,0.237,0.149,0.211, 0.091,0.267,0.057,0.099,0.161,0.198,0.370,0.042,0.103],
                   'hao': [1,1,1,1,1,1,1,1, 0,0,0,0,0,0,0,0,0]})
X = df[['midu','tang']].values
y = df['hao'].values
# Transpose so that each column is one example, matching the (n_features, m)
# and (1, m) layout used by the rest of the code.
X = X.T                   # shape (2, 17)
y = y.reshape([1, 17])    # shape (1, 17)
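A quick shape check (a minimal sketch using the variables defined above) confirms the one-example-per-column layout expected by the rest of the code:

# sanity check: 2 features by 17 examples
print(X.shape)   # (2, 17)
print(y.shape)   # (1, 17)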

Random parameter initialisation and the sigmoid activation function

'''
initialize parameters
'''
def init_params():
    # shapes chosen to match X: w is (n_features, 1), b is a scalar
    w = np.random.randn(2, 1)
    b = 0
    return w, b
'''
sigmoid activation function
'''
def sigmoid(Z):
    s = 1 / (1 + np.exp(-Z))
    return s
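As a small sanity check (a sketch using only the helpers defined above), sigmoid should map 0 to 0.5 and squash large inputs toward 0 or 1:

w, b = init_params()
print(w.shape, b)                          # (2, 1) 0
print(sigmoid(np.array([-10, 0, 10])))     # roughly [4.5e-05, 0.5, 0.99995]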

Propagation and parameter optimisation (gradient descent)

Forward propagation:

  • You get X
  • $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, \ldots, a^{(m-1)}, a^{(m)})$
  • cost function:
    $J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)}) + (1-y^{(i)})\log(1-a^{(i)})\right]$

Differentiating the cost function with respect to w and b gives:

$$\frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T \tag{7}$$
$$\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(a^{(i)}-y^{(i)}\right) \tag{8}$$
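A one-step sketch of where (7) and (8) come from: for each example, writing $z^{(i)} = w^T x^{(i)} + b$ and $a^{(i)} = \sigma(z^{(i)})$, the chain rule gives

$$\frac{\partial J}{\partial z^{(i)}} = \frac{1}{m}\left(a^{(i)} - y^{(i)}\right),\qquad
\frac{\partial z^{(i)}}{\partial w} = x^{(i)},\qquad
\frac{\partial z^{(i)}}{\partial b} = 1,$$

so summing over the examples (and stacking the $x^{(i)}$ as the columns of $X$) yields exactly (7) and (8).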

'''
forward prop, compute cost, and back prop
'''
def f_prop(w, b, X, y):
    m = X.shape[1]                       # number of examples
    A = sigmoid(np.dot(w.T, X) + b)      # activations, shape (1, m)
    # cross-entropy cost
    cost = -1/m * np.sum(y*np.log(A) + (1-y)*np.log(1-A))
    # gradients of the cost w.r.t. w and b
    dw = 1/m * np.dot(X, (A-y).T)
    db = 1/m * np.sum(A - y)
    # pack the gradients into a dictionary
    grads = {'dw': dw, 'db': db}
    return cost, grads
'''
optimize parameters -- gradient descent
'''
def op_params(w, b, X, y, num_iter, learning_rate):
    costs = []

    for i in range(num_iter):
        cost, grads = f_prop(w, b, X, y)
        dw = grads['dw']
        db = grads['db']
        # gradient descent update
        w = w - learning_rate * dw
        b = b - learning_rate * db
        costs.append(cost)

    params = {'w': w, 'b': b}

    return costs, params
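Optionally, a small finite-difference check (a sketch, not part of the original exercise, reusing f_prop from above) can confirm that the analytic gradient of equation (7) matches a numerical estimate:

def grad_check(w, b, X, y, eps=1e-7):
    # compare the analytic dw from f_prop with a central-difference estimate
    _, grads = f_prop(w, b, X, y)
    num_dw = np.zeros_like(w)
    for j in range(w.shape[0]):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[j, 0] += eps
        w_minus[j, 0] -= eps
        cost_plus, _ = f_prop(w_plus, b, X, y)
        cost_minus, _ = f_prop(w_minus, b, X, y)
        num_dw[j, 0] = (cost_plus - cost_minus) / (2 * eps)
    # largest absolute gap; should be tiny (roughly 1e-8 or smaller)
    return np.max(np.abs(num_dw - grads['dw']))

w0, b0 = init_params()
print(grad_check(w0, b0, X, y))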

Prediction function

'''
predict
'''
def predict(params, X):
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))

    w = params['w']
    b = params['b']
    A = sigmoid(np.dot(w.T, X) + b)
    # above 0.5 counts as a good melon, otherwise not
    for i in range(m):
        if A[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0
    return Y_prediction

Building the model and obtaining the parameters

%matplotlib notebook 
import matplotlib.pyplot as plt
'''
put all the pieces together
'''
w, b = init_params()
cost, grads = f_prop(w, b, X, y)
costs, params = op_params(w, b, X, y, 1000, 0.9)
plt.plot(costs)

Cost curve over 1000 iterations with a learning rate of 0.9:
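As an optional visual check (a sketch, assuming the params dictionary returned above), the learned decision boundary $w^T x + b = 0$ can be drawn over the data:

w_fit, b_fit = params['w'], params['b']
plt.figure()
plt.scatter(X[0, y[0] == 1], X[1, y[0] == 1], label='good melon')
plt.scatter(X[0, y[0] == 0], X[1, y[0] == 0], label='bad melon')
x1 = np.linspace(X[0].min(), X[0].max(), 100)
x2 = -(w_fit[0, 0] * x1 + b_fit) / w_fit[1, 0]   # points where w^T x + b = 0
plt.plot(x1, x2, 'k--', label='decision boundary')
plt.xlabel('density')
plt.ylabel('sugar content')
plt.legend()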

Prediction on the training set

'''
make predictions
'''
Y_prediction = predict(params, X)

count = 0
for i in range(X.shape[1]):
    if Y_prediction[0, i] == y[0, i]:
        count += 1

accuracy = count / X.shape[1]
print('accuracy:', accuracy)

accuracy: 0.5882352941176471
Clearly the model underfits, but given the tiny dataset, and with the cost function already close to its limit, it is hard to find better parameters, so we leave it at that.
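As an optional cross-check (a sketch, not part of the original exercise), scikit-learn's LogisticRegression can be fit on the same data to see what an off-the-shelf solver achieves:

from sklearn.linear_model import LogisticRegression

# scikit-learn expects (n_samples, n_features) and (n_samples,) shapes,
# so the transposed arrays are mapped back before fitting
clf = LogisticRegression()
clf.fit(X.T, y.ravel())
print('sklearn accuracy:', clf.score(X.T, y.ravel()))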
