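Exercise 3.3 from "Machine Learning" (《机器学习》) by Zhou Zhihua: implement logistic regression in code and report the results on watermelon dataset 3.0α.

The data are as follows:

The Python code is as follows: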

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Jan 30 10:05:01 2018

@author: llw
"""
#logistic regression
import numpy as np
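# sklearn.cross_validation was removed from scikit-learn; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split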
from sklearn.metrics import classification_report
density=np.array([0.697,0.774,0.634,0.608,0.556,0.430,0.481,0.437,0.666,0.243,0.245,0.343,0.639,0.657,0.360,0.593,0.719]).reshape(-1,1)
sugar_rate=np.array([0.460,0.376,0.264,0.318,0.215,0.237,0.149,0.211,0.091,0.267,0.057,0.099,0.161,0.198,0.370,0.042,0.103]).reshape(-1,1)
xtrain=np.hstack((density,sugar_rate))
xtrain=np.hstack((np.ones([density.shape[0],1]),xtrain))
ytrain=np.array([1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0]).reshape(-1,1)
xtrain,xtest,ytrain,ytest=train_test_split(xtrain,ytrain,test_size=0.25,random_state=33)
def sigmoid(z):
    return 1/(1+np.exp(-z))
#print(sigmoid(density))
def logit_regression(theta,x,y,iteration=100,learning_rate=0.1,lbd=0.01):
    for i in range(iteration):
        theta=theta-learning_rate/y.shape[0]*(np.dot(x.transpose(),(sigmoid(np.dot(x,theta))-y))+lbd*theta)
        cost=-1/y.shape[0]*(np.dot(y.transpose(),np.log(sigmoid(np.dot(x,theta))))+np.dot((1-y).transpose(),np.log(1-sigmoid(np.dot(x,theta)))))+lbd/(2*y.shape[0])*np.dot(theta.transpose(),theta)
        # cost is a 1x1 array; .item() extracts the scalar for %f formatting
        print('---------Iteration %d,cost is %f-------------'%(i,cost.item()))
    return theta
def predict(theta,x):
    pre=np.zeros([x.shape[0],1])
    for idx,valu in enumerate(np.dot(x,theta)):
        if sigmoid(valu)>=0.5:
            pre[idx]=1
        else:
            pre[idx]=0
    return pre
                
theta_init=np.random.rand(3,1)
# train first, then predict with the learned theta
theta=logit_regression(theta_init,xtrain,ytrain,learning_rate=1)
pre=predict(theta,xtest)
print('predictions are',pre)
print('ground truth is',ytest)
print('theta is ',theta)
print('the accuracy is',np.mean(pre==ytest))
print(classification_report(ytest,pre,target_names=['Bad','Good']))
The output is as follows:


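The update step in `logit_regression` above relies on the gradient of the regularized cross-entropy cost being (X^T (sigmoid(Xθ) - y) + λθ) / m. One way to gain confidence in that expression is to compare it with a finite-difference estimate. The following is a minimal sketch under stated assumptions: the helper names `cost`, `gradient`, and `numerical_gradient` and the toy data are hypothetical and not part of the original script, and the random toy data merely stands in for the watermelon set.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, x, y, lbd=0.01):
    """Regularized cross-entropy, same form as the cost printed by the script."""
    m = y.shape[0]
    h = sigmoid(x @ theta)
    data_term = -(y.T @ np.log(h) + (1 - y).T @ np.log(1 - h)) / m
    reg_term = lbd / (2 * m) * (theta.T @ theta)
    return (data_term + reg_term).item()

def gradient(theta, x, y, lbd=0.01):
    """Analytic gradient used in the update step (before the learning rate)."""
    m = y.shape[0]
    return (x.T @ (sigmoid(x @ theta) - y) + lbd * theta) / m

def numerical_gradient(theta, x, y, lbd=0.01, eps=1e-6):
    """Central finite differences, one coordinate of theta at a time."""
    grad = np.zeros_like(theta)
    for j in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (cost(theta + step, x, y, lbd) - cost(theta - step, x, y, lbd)) / (2 * eps)
    return grad

# Hypothetical toy data (assumption: NOT the watermelon data set)
rng = np.random.default_rng(0)
x_demo = np.hstack((np.ones((8, 1)), rng.random((8, 2))))  # bias column + 2 features
y_demo = rng.integers(0, 2, size=(8, 1)).astype(float)
theta_demo = rng.random((3, 1))

max_diff = np.max(np.abs(gradient(theta_demo, x_demo, y_demo)
                         - numerical_gradient(theta_demo, x_demo, y_demo)))
print('max difference between analytic and numerical gradient:', max_diff)
```

If the two gradients agree to within a small tolerance, the update rule and the printed cost are consistent with each other.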
Below is a simple logistic regression example in Python:

```python
import numpy as np
import matplotlib.pyplot as plt

# Generate data
np.random.seed(0)
X = np.random.randn(100, 2)
y = np.random.randint(0, 2, 100)

# Logistic regression model
class LogisticRegression:
    def __init__(self, lr=0.01, num_iter=100000, fit_intercept=True, verbose=False):
        self.lr = lr
        self.num_iter = num_iter
        self.fit_intercept = fit_intercept
        self.verbose = verbose

    def __add_intercept(self, X):
        intercept = np.ones((X.shape[0], 1))
        return np.concatenate((intercept, X), axis=1)

    def __sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def __loss(self, h, y):
        return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

    def fit(self, X, y):
        if self.fit_intercept:
            X = self.__add_intercept(X)

        # Initialize the weights
        self.theta = np.zeros(X.shape[1])

        for i in range(self.num_iter):
            z = np.dot(X, self.theta)
            h = self.__sigmoid(z)
            gradient = np.dot(X.T, (h - y)) / y.size
            self.theta -= self.lr * gradient

            if self.verbose and i % 10000 == 0:
                z = np.dot(X, self.theta)
                h = self.__sigmoid(z)
                print(f'Loss: {self.__loss(h, y)}')

    def predict_prob(self, X):
        if self.fit_intercept:
            X = self.__add_intercept(X)
        return self.__sigmoid(np.dot(X, self.theta))

    def predict(self, X, threshold=0.5):
        return self.predict_prob(X) >= threshold

# Train the model
model = LogisticRegression(lr=0.1, num_iter=300000)
model.fit(X, y)

# Predict
y_pred = model.predict(X)

# Plot the decision boundary
plt.scatter(X[:, 0], X[:, 1], c=y_pred)
x1_min, x1_max = X[:, 0].min(), X[:, 0].max()
x2_min, x2_max = X[:, 1].min(), X[:, 1].max()
xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
grid = np.c_[xx1.ravel(), xx2.ravel()]
probs = model.predict_prob(grid).reshape(xx1.shape)
plt.contour(xx1, xx2, probs, [0.5], linewidths=1, colors='red')
plt.show()
```

The code above uses NumPy and Matplotlib. It first generates random data `X` and labels `y`, then defines a `LogisticRegression` class that implements the model. Within the class, `__add_intercept` adds the intercept term, `__sigmoid` computes the sigmoid function, `__loss` computes the loss, `fit` trains the model, `predict_prob` predicts probabilities, and `predict` predicts labels. Finally, the trained model is used to plot the decision boundary.
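Because the model is linear in its inputs, the 0.5-probability contour drawn above is the straight line θ0 + θ1·x1 + θ2·x2 = 0, so it can also be computed in closed form instead of evaluating a grid. The following is a minimal sketch, assuming the intercept-first ordering of `theta` produced by `__add_intercept` and θ2 ≠ 0; the helper name `boundary_x2` and the example weights are hypothetical, not taken from a trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def boundary_x2(theta, x1):
    """Solve theta[0] + theta[1]*x1 + theta[2]*x2 = 0 for x2 (needs theta[2] != 0)."""
    theta = np.asarray(theta, dtype=float).ravel()
    if np.isclose(theta[2], 0.0):
        raise ValueError("theta[2] is ~0: the boundary is vertical in the (x1, x2) plane")
    return -(theta[0] + theta[1] * np.asarray(x1, dtype=float)) / theta[2]

# Hypothetical weights [intercept, w1, w2], for illustration only
theta_demo = np.array([-1.0, 2.0, 4.0])
x1_demo = np.linspace(0.0, 1.0, 5)
x2_demo = boundary_x2(theta_demo, x1_demo)

# Sanity check: every point on the boundary should get probability 0.5
points = np.column_stack((np.ones_like(x1_demo), x1_demo, x2_demo))
probs = sigmoid(points @ theta_demo)
print(probs)
```

With a fitted model, the resulting `(x1, x2)` pairs could be passed to `plt.plot` to draw the same line that the contour call approximates.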