Python: drawing a logistic regression decision boundary with plt.tricontour (contour plots over an irregular space)

Preface

I have recently been working on a logistic regression assignment that requires drawing the decision boundary. The idea behind the plot is as follows.
For logistic regression, the decision boundary lies where $\theta^T X = 0$, with $\theta = [\theta_0, \theta_1, \theta_2, \cdots, \theta_n]$ and $X = [X_0, X_1, X_2, \cdots, X_n]$. We substitute the trained $\theta$, evaluate $\theta^T X$ on a grid, and call plt.contour(xx, yy, zz, levels=[0]) to draw the zero level.
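The reason the boundary sits at $\theta^T X = 0$ is that the logistic function equals 0.5 exactly there, so the predicted class flips when $\theta^T X$ changes sign. A minimal sketch, using made-up parameters (the `theta` below is hypothetical, not the trained one):

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    """Return 1 if the predicted probability is at least 0.5, else 0.

    theta and x are plain lists of equal length; x[0] is the constant 1.
    """
    z = sum(t * xi for t, xi in zip(theta, x))  # theta^T x
    return 1 if sigmoid(z) >= 0.5 else 0

# Hypothetical parameters for illustration only: the boundary is x1 + x2 = 1.
theta = [-1.0, 1.0, 1.0]
print(sigmoid(0.0))                     # 0.5
print(predict(theta, [1.0, 0.2, 0.3]))  # 0  (theta^T x = -0.5)
print(predict(theta, [1.0, 0.9, 0.6]))  # 1  (theta^T x = 0.5)
```

Drawing the zero level of $\theta^T X$ is therefore the same as drawing the 0.5 level of the predicted probability.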
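In this exercise the decision boundary of the given data is not linear, so a polynomial transformation is needed. poly_feat returns every polynomial combination of the two features up to degree five; the degree-five terms, for example, are $x_1^5, x_1^4 x_2, x_1^3 x_2^2, x_1^2 x_2^3, x_1 x_2^4, x_2^5$.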

# Polynomial feature transformation
from sklearn.preprocessing import PolynomialFeatures

poly_feat = PolynomialFeatures(degree=5, include_bias=True)
# X[:, 1:] drops the existing column of ones; include_bias=True adds it back
X_poly = poly_feat.fit_transform(X[:, 1:])

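With a degree-five polynomial transformation, an X with only three features (the first of which is the constant 1) becomes a sample with 21 features, which makes it possible to draw a non-linear decision boundary.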
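The count of 21 can be checked by enumerating every monomial in two variables with total degree at most five. The sketch below uses only the standard library; `monomial_exponents` is a helper written for this illustration, not part of scikit-learn.

```python
from itertools import combinations_with_replacement
from math import comb

def monomial_exponents(n_features, degree):
    """List exponent tuples of all monomials with total degree <= degree."""
    exps = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(n_features), d):
            # combo lists which feature each factor comes from, e.g. (0, 0, 1) is x1^2 * x2
            exps.append(tuple(combo.count(i) for i in range(n_features)))
    return exps

exps = monomial_exponents(2, 5)
print(len(exps))       # 21
print(comb(2 + 5, 5))  # 21, the closed form C(n + d, d)
print(exps[:6])        # [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
```

The first tuple, (0, 0), is the bias term, so the 21 features are 1, x1, x2, x1^2, x1*x2, x2^2 and so on up to x2^5.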

import numpy as np
from matplotlib import pyplot as plt

epochs = 1000000
lr = 0.01
lamb = 0  # regularization strength
degree = 5
poly_feat = PolynomialFeatures(degree, include_bias=True)
theta = np.zeros((X_poly.shape[1], 1))
final_theta = batch_gradient_descent(X_poly, y, theta, epoch=epochs, lr=lr, lamb=lamb)
test1 = np.array(data['Test 1'])  # feature 1, unordered
test2 = np.array(data['Test 2'])  # feature 2, unordered
Test1, Test2 = np.meshgrid(test1, test2)
score_mesh = np.zeros((test2.size, test1.size))  # rows follow y (test2), columns follow x (test1)
# build the score mesh by iterating over every pair of feature values
for idx1, t1 in enumerate(test1):
    for idx2, t2 in enumerate(test2):
        poly = poly_feat.fit_transform(np.array([t1, t2]).reshape(1, -1))  # construct polynomial features
        score_mesh[idx2, idx1] = (poly @ final_theta).item()
cs = plt.contour(test1, test2, score_mesh, levels=[0])  # draw only the level theta^T X = 0
# label the contour (note: .collections is deprecated since Matplotlib 3.8)
cs.collections[0].set_label('lamb = ' + str(lamb))
# scatter plot of the data
positive = data[data['Accepted'].isin([1])]
negative = data[data['Accepted'].isin([0])]

plt.scatter(positive['Test 1'], positive['Test 2'], s=20, c='c', marker='o', label='Accepted')
plt.scatter(negative['Test 1'], negative['Test 2'], s=30, c='m', marker='x', label='Not Accepted')
plt.legend()
plt.xlabel('Test 1 Score')
plt.ylabel('Test 2 Score')

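Here test1 and test2 are both unordered sequences, as shown in the figure below.
[Figure: values of test1 and test2]
The resulting decision boundary is shown in the figure below.
[Figure: decision boundary drawn by plt.contour on the unordered data]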

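As the figure shows, this decision boundary is a mess: several lines appear at level 0. The cause is that the X and Y passed to the contour plot do not form a regular space. plt.contour treats neighbouring array entries as neighbouring grid points, so when X and Y are neither increasing nor decreasing the grid cells fold over one another and the zero level is traced many times. There are several ways to fix this, described in the sections that follow.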
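Whether a coordinate array is safe to hand to plt.contour can be checked before plotting. A small sketch, assuming NumPy is available; `is_strictly_monotonic` is a helper written for this illustration and the `raw` values are made up:

```python
import numpy as np

def is_strictly_monotonic(a):
    """True if the 1-D array is strictly increasing or strictly decreasing."""
    d = np.diff(np.asarray(a, dtype=float))
    return bool(np.all(d > 0) or np.all(d < 0))

raw = np.array([0.05, -0.09, -0.21, 0.18, -0.51])  # made-up unordered scores
grid = np.linspace(-1, 1, 5)

print(is_strictly_monotonic(raw))           # False
print(is_strictly_monotonic(grid))          # True
print(is_strictly_monotonic(np.sort(raw)))  # True
```

Real data may contain repeated values, in which case the sorted array is only non-decreasing; np.unique would be one way to obtain a strictly increasing axis.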

Solution 1

Build a regular grid: construct the score grid from X and Y sequences that are increasing (or decreasing).

import numpy as np
from matplotlib import pyplot as plt

epochs = 1000000
lr = 0.01
degree = 5
for lambs in [0]:  # regularization strengths to plot
    poly_feat = PolynomialFeatures(degree, include_bias=True)
    X_poly = poly_feat.fit_transform(X[:, 1:])
    theta = np.zeros((X_poly.shape[1], 1))
    final_theta = batch_gradient_descent(X_poly, y, theta, epoch=epochs, lr=lr, lamb=lambs)
    xk = np.linspace(-1, 1, test1.size)  # construct an ordered sequence with np.linspace
    yk = xk
    xx, yy = np.meshgrid(xk, yk)
    score_mesh = np.zeros((yk.size, xk.size))  # rows follow y, columns follow x
    for idx1, t1 in enumerate(xk):
        for idx2, t2 in enumerate(yk):
            poly = poly_feat.fit_transform(np.array([t1, t2]).reshape(1, -1))
            score_mesh[idx2, idx1] = (poly @ final_theta).item()
    cs = plt.contour(xx, yy, score_mesh, levels=[0])
    # label the contour (note: .collections is deprecated since Matplotlib 3.8)
    cs.collections[0].set_label('lamb = ' + str(lambs))

positive = data[data['Accepted'].isin([1])]
negative = data[data['Accepted'].isin([0])]

plt.scatter(positive['Test 1'], positive['Test 2'], s=20, c='c', marker='o', label='Accepted')
plt.scatter(negative['Test 1'], negative['Test 2'], s=30, c='m', marker='x', label='Not Accepted')
plt.legend()
plt.xlabel('Test 1 Score')
plt.ylabel('Test 2 Score')

The resulting contour is shown in the figure below:

[Figure: decision boundary drawn on the regular grid]
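One detail in the code above is easy to get wrong: the score array is indexed as [idx2, idx1], that is, rows follow y and columns follow x, which is the layout np.meshgrid produces by default. The sketch below shows the layout on a small grid. The `score` function is a hypothetical stand-in for poly @ final_theta, chosen so that its zero level is the unit circle.

```python
import numpy as np

def score(x, y):
    """Hypothetical stand-in for poly @ final_theta: zero on the unit circle."""
    return 1.0 - x**2 - y**2

xk = np.linspace(-1.5, 1.5, 4)  # 4 x-values
yk = np.linspace(-1.0, 1.0, 3)  # 3 y-values
xx, yy = np.meshgrid(xk, yk)    # default indexing='xy'
zz = score(xx, yy)              # evaluated on the whole grid at once

print(zz.shape)                                       # (3, 4): rows follow y, columns follow x
print(np.isclose(zz[2, 1], score(xk[1], yk[2])))      # True
```

When the score can be evaluated on whole arrays like this, the double loop is not needed; with PolynomialFeatures the same effect should be obtainable by transforming the flattened grid points in one call and reshaping the result to xx.shape.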

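Solution 2

Keep using the irregular data test1 and test2, but evaluate the score only at the sample points and let plt.tricontour interpolate over an irregular triangular network to obtain the contour: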

import numpy as np
from matplotlib import pyplot as plt

epochs = 1000000
lr = 0.01
degree = 5
for lambs in [0]:  # regularization strengths to plot
    poly_feat = PolynomialFeatures(degree, include_bias=True)
    X_poly = poly_feat.fit_transform(X[:, 1:])
    theta = np.zeros((X_poly.shape[1], 1))
    final_theta = batch_gradient_descent(X_poly, y, theta, epoch=epochs, lr=lr, lamb=lambs)
    test1 = np.array(data['Test 1'])  # feature 1, unordered
    test2 = np.array(data['Test 2'])  # feature 2, unordered
    # score at each sample point (test1[i], test2[i]); no grid is built
    score_mesh_flat = poly_feat.fit_transform(np.stack([test1, test2], axis=1)) @ final_theta
    cs = plt.tricontour(test1, test2, score_mesh_flat.flatten(), levels=[0])
    # label the contour (note: .collections is deprecated since Matplotlib 3.8)
    cs.collections[0].set_label('lamb = ' + str(lambs))

positive = data[data['Accepted'].isin([1])]
negative = data[data['Accepted'].isin([0])]

plt.scatter(positive['Test 1'], positive['Test 2'], s=20, c='c', marker='o', label='Accepted')
plt.scatter(negative['Test 1'], negative['Test 2'], s=30, c='m', marker='x', label='Not Accepted')
plt.legend()
plt.xlabel('Test 1 Score')
plt.ylabel('Test 2 Score')

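Because the code above depends on the assignment's data and on batch_gradient_descent, here is a self-contained sketch of the same idea on synthetic data. Everything in it is an assumption made for illustration: the points are random, and the score is a made-up function whose zero level is the unit circle rather than a trained model.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(-1.5, 1.5, 200)  # scattered, unordered sample locations
y = rng.uniform(-1.5, 1.5, 200)
z = 1.0 - x**2 - y**2            # hypothetical score, zero on the unit circle

fig, ax = plt.subplots()
# tricontour triangulates the scattered points and interpolates within each triangle
cs = ax.tricontour(x, y, z, levels=[0])
inside = z > 0
ax.scatter(x[inside], y[inside], s=20, c='c', marker='o', label='score > 0')
ax.scatter(x[~inside], y[~inside], s=30, c='m', marker='x', label='score <= 0')
ax.set_aspect('equal')
ax.legend()
# fig.savefig('tricontour_demo.png') or plt.show() would display the result
```

The x, y and z arguments are flat arrays of the same length, one entry per sample point, which is why no meshgrid and no sorting are needed.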

Solution 3

Keep using the original data and plt.contour, but sort the data first (test1.sort(); test2.sort()).

import numpy as np
from matplotlib import pyplot as plt

epochs = 1000000
lr = 0.01
degree = 5
for lambs in [0]:  # regularization strengths to plot
    poly_feat = PolynomialFeatures(degree, include_bias=True)
    X_poly = poly_feat.fit_transform(X[:, 1:])
    theta = np.zeros((X_poly.shape[1], 1))
    final_theta = batch_gradient_descent(X_poly, y, theta, epoch=epochs, lr=lr, lamb=lambs)
    test1 = np.array(data['Test 1'])
    test2 = np.array(data['Test 2'])
    # sort the unordered sequences in place; each axis is sorted independently
    test1.sort()
    test2.sort()
    Test1, Test2 = np.meshgrid(test1, test2)
    score_mesh = np.zeros((test2.size, test1.size))  # rows follow y, columns follow x
    for idx1, t1 in enumerate(test1):
        for idx2, t2 in enumerate(test2):
            poly = poly_feat.fit_transform(np.array([t1, t2]).reshape(1, -1))
            score_mesh[idx2, idx1] = (poly @ final_theta).item()
    cs = plt.contour(Test1, Test2, score_mesh, levels=[0])
    # label the contour (note: .collections is deprecated since Matplotlib 3.8)
    cs.collections[0].set_label('lamb = ' + str(lambs))

positive = data[data['Accepted'].isin([1])]
negative = data[data['Accepted'].isin([0])]

plt.scatter(positive['Test 1'], positive['Test 2'], s=20, c='c', marker='o', label='Accepted')
plt.scatter(negative['Test 1'], negative['Test 2'], s=30, c='m', marker='x', label='Not Accepted')
plt.legend()
plt.xlabel('Test 1 Score')
plt.ylabel('Test 2 Score')

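Sorting the axes before building the grid works because the scores are computed afterwards from the sorted values. If a score grid has already been computed on unordered axes, it does not have to be recomputed: its rows and columns can be reordered with np.argsort. A sketch with made-up coordinates and the same hypothetical circle-shaped score as a stand-in for the model:

```python
import numpy as np

x = np.array([0.3, -0.8, 0.9, -0.1])  # made-up unordered x-coordinates
y = np.array([0.5, -0.4, 0.0])        # made-up unordered y-coordinates
xx, yy = np.meshgrid(x, y)
zz = 1.0 - xx**2 - yy**2              # hypothetical score, already computed on the unordered grid

ix, iy = np.argsort(x), np.argsort(y)
x_sorted, y_sorted = x[ix], y[iy]
zz_sorted = zz[np.ix_(iy, ix)]        # reorder rows by y and columns by x

# The reordered grid equals the score evaluated directly on the sorted axes.
xs, ys = np.meshgrid(x_sorted, y_sorted)
print(np.allclose(zz_sorted, 1.0 - xs**2 - ys**2))  # True
```

After this step x_sorted, y_sorted and zz_sorted can be passed to plt.contour in the same way as in the code above.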

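Principle and reflections

Both calls draw contours over an irregular space, so why do plt.contour and plt.tricontour give such different results?
The reason is that plt.contour uses a contouring algorithm designed for regular grids, which requires X and Y to be monotonically increasing or decreasing, whereas plt.tricontour uses a contouring algorithm designed for irregular triangular networks.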
