Implementing Andrew Ng's Machine Learning Exercise 2 (Logistic Regression) in Python - data2

Continuing from the previous post: Implementing Andrew Ng's Machine Learning Exercise 2 (Logistic Regression) in Python - data1.
Suppose a part has two test measurements, and we decide from the two scores whether it is a qualified product. We use logistic regression with added polynomial features to fit a nonlinear decision boundary.
Reference: https://blog.csdn.net/Cowry5/article/details/80247569

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline

2 Regularized logistic regression

2.1 visualizing the data

data2 = pd.read_csv('D:/Python/exercise/samples/AndrewNg/ex2/ex2data2.csv', names = ['x1','x2','y'])
len(data2)
118
positive = data2[data2['y'] == 1]
negative = data2[data2['y'] == 0]
positive_x1 = positive['x1']
positive_x2 = positive['x2']
negative_x1 = negative['x1']
negative_x2 = negative['x2']
plt.figure(figsize = (10,10))
plt.scatter(x = positive_x1, y = positive_x2, marker = 'x', color = 'r', label = 'positive')
plt.scatter(x = negative_x1, y = negative_x2, color = 'lightgreen', label = 'negative')
plt.legend()

[Figure: scatter plot of the training data; positive samples plotted as red crosses, negative samples in light green]

2.2 feature mapping

Generate polynomial terms up to a specified degree from the two input features:

def polynomial_generator(x1, x2, power):
    # Map the two features to all terms x2**j * x1**(i-j) of total degree i <= power;
    # the column key 'ab' means x2**a * x1**b, and '00' is the all-ones intercept column
    data = {}
    for i in range(power + 1):
        for j in range(i + 1):
            data[f'{j}{i-j}'] = (x2**j) * (x1**(i-j))
    return data
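As a quick illustration (an assumed toy example, not from the exercise data), mapping the single point (x1, x2) = (2, 3) with power = 2 shows which columns the generator produces:

demo = pd.DataFrame(polynomial_generator(x1 = np.array([2.0]), x2 = np.array([3.0]), power = 2))
print(demo)
# expected output, roughly:
#     00   01   10   02   11   20
# 0  1.0  2.0  3.0  4.0  6.0  9.0
# i.e. 1, x1, x2, x1^2, x1*x2, x2^2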

Use a degree-6 polynomial:

data = polynomial_generator(x1 = data2['x1'].values, x2 = data2['x2'].values, power = 6)
data_trans = pd.DataFrame(data)
data_trans.shape
(118, 28)
data_trans.head()
      00        01       10        02        11        20  ...        51        60
0    1.0  0.051267  0.69956  0.002628  0.035864  0.489384  ...  0.008589  0.117206
1    1.0 -0.092742  0.68494  0.008601 -0.063523  0.469143  ... -0.013981  0.103256
2    1.0 -0.213710  0.69225  0.045672 -0.147941  0.479210  ... -0.033973  0.110047
3    1.0 -0.375000  0.50219  0.140625 -0.188321  0.252195  ... -0.011978  0.016040
4    1.0 -0.513250  0.46564  0.263426 -0.238990  0.216821  ... -0.011235  0.010193

5 rows × 28 columns

2.3 cost function and gradient

# Define the sigmoid, cost, and gradient functions
def sigmoid(z):
    g = 1 / (1 + np.exp(-z))
    return g
def J_func(theta, x, y):
    cost = -y * np.log(sigmoid(x.dot(theta.T))) - (1-y) * np.log(1-sigmoid(x.dot(theta.T)))
    J = cost.mean()
    return J

Add a regularization term to $J(\theta)$:
$$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m} y^{(i)}\ln h_{\theta}(x^{(i)}) + (1-y^{(i)})\ln\bigl(1-h_{\theta}(x^{(i)})\bigr)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

def J_func_reg(theta, x, y, c=1):
    # c is the regularization parameter lambda; theta[0] (the intercept term) is not regularized
    _theta = theta[1:]
    reg = (c/(2*len(x))) * (_theta.dot(_theta.T))
    return J_func(theta, x, y) + reg
def gradient(theta, x, y):
    # unregularized gradient: (1/m) * X^T (h_theta(X) - y)
    gra = x.T.dot(sigmoid(x.dot(theta.T)) - y) / len(x)
    return gra

The gradient descent update changes (since $\theta_0$ is not regularized):
$$\text{repeat}\ \{$$
$$\theta_0 := \theta_0 - \alpha\frac{1}{m}\sum_{i=1}^{m}\bigl[h_{\theta}(x^{(i)}) - y^{(i)}\bigr]x_0^{(i)}$$
$$\theta_j := \theta_j - \alpha\left\{\frac{1}{m}\sum_{i=1}^{m}\bigl[h_{\theta}(x^{(i)}) - y^{(i)}\bigr]x_j^{(i)} + \frac{\lambda}{m}\theta_j\right\}$$
$$\}$$
The gradient term becomes:
$$\frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_{\theta}(x^{(i)}) - y^{(i)}\bigr)x_j^{(i)} + \frac{\lambda}{m}\theta_j \qquad (j = 1,2,3,\dots,n)$$

def gradient_reg(theta, x, y, c=1):
    # regularized gradient: add (c/m) * theta_j for j >= 1, leave theta_0 untouched
    reg = (c/len(x)) * theta
    reg[0] = 0
    return gradient(theta, x, y) + reg
X = data_trans.values
Y = data2.iloc[:,-1].values
theta = np.zeros(28)
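As a quick sanity check (added here, not part of the original post): with theta all zeros every prediction is 0.5, so the regularized cost should equal ln 2 ≈ 0.693, and the analytic gradient can be compared against a finite-difference approximation of J_func_reg.

print(J_func_reg(theta, X, Y, 1))   # expect about 0.693 (= ln 2)

# finite-difference check of gradient_reg against J_func_reg
eps = 1e-5
numeric = np.zeros_like(theta)
for k in range(len(theta)):
    step = np.zeros_like(theta)
    step[k] = eps
    numeric[k] = (J_func_reg(theta + step, X, Y, 1) - J_func_reg(theta - step, X, Y, 1)) / (2 * eps)
analytic = gradient_reg(theta, X, Y, 1)
print(np.max(np.abs(numeric - analytic)))   # should be very small, e.g. below 1e-7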
2.3.1 learning parameters using fmin_tnc
import scipy.optimize as opt

Try two values of $\lambda$ and compare the results:

lambda_01 = 0.1
lambda_02 = 10
result = opt.fmin_tnc(func = J_func_reg, x0 = theta, fprime = gradient_reg, args = (X, Y, lambda_01)) # the third item of args becomes the fourth parameter (c) of func and fprime
result_2 = opt.fmin_tnc(func = J_func_reg, x0 = theta, fprime = gradient_reg, args = (X, Y, lambda_02))
result = result[0]
result_2 = result_2[0]
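One simple way to compare the two fits (a small added check, not in the original post; the predict helper below is ours) is training-set accuracy with each set of learned parameters:

def predict(theta, x):
    # predict 1 when h_theta(x) >= 0.5, i.e. when theta^T x >= 0
    return (sigmoid(x.dot(theta)) >= 0.5).astype(int)

print('lambda = 0.1:', (predict(result, X) == Y).mean())
print('lambda = 10 :', (predict(result_2, X) == Y).mean())
# the lambda = 0.1 fit is expected to classify more training points correctly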

2.4 plotting the decision boundary

a = np.arange(-1, 1.2, 0.01)
b = np.arange(-1, 1.2, 0.01)
xs, ys = np.meshgrid(a, b)

Flatten xs and ys with ravel(), use polynomial_generator to build a dict whose "columns" are the polynomial feature combinations and whose "rows" are the grid points, then reshape the result back to the grid shape:

c_dict = polynomial_generator(xs.ravel(), ys.ravel(), 6)
c = pd.DataFrame(c_dict)
z = c.values.dot(result.T).reshape(xs.shape)
z_2 = c.values.dot(result_2.T).reshape(xs.shape)
fig, ax = plt.subplots(2, 1, figsize = (10, 20), sharex = True, sharey = True)
ax[0].contour(xs, ys, z, 0)
ax[0].scatter(x = positive_x1, y = positive_x2, marker = 'x', color = 'r')
ax[0].scatter(x = negative_x1, y = negative_x2, color = 'lightgreen')

ax[1].contour(xs, ys, z_2, 0)
ax[1].scatter(x = positive_x1, y = positive_x2, marker = 'x', color = 'r')
ax[1].scatter(x = negative_x1, y = negative_x2, color = 'lightgreen')

ax[0].set_title(f'lambda = {lambda_01}', fontsize = 30)
ax[1].set_title(f'lambda = {lambda_02}', fontsize = 30)
Text(0.5, 1.0, 'lambda = 10')

[Figure: decision boundaries plotted over the training data; top panel lambda = 0.1, bottom panel lambda = 10]

Judging from the results, $\lambda = 0.1$ gives a good fit, while $\lambda = 10$ clearly suffers from high bias (underfitting).
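To see this numerically (a sketch using the functions defined above, including the predict helper; not part of the original exercise), one could sweep several values of $\lambda$ and compare training accuracy:

for lam in [0, 0.1, 1, 10, 100]:
    res = opt.fmin_tnc(func = J_func_reg, x0 = np.zeros(28), fprime = gradient_reg, args = (X, Y, lam))[0]
    acc = (predict(res, X) == Y).mean()
    print(f'lambda = {lam}: training accuracy = {acc:.3f}')
# larger lambda should drive the training accuracy down as the boundary is over-smoothed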

