Study Notes on 《机器学习100天》 (100 Days of Machine Learning): Day 4-6, Logistic Regression

#HereWeGo

Article link: https://blog.csdn.net/qq_41929011/article/details/88899630

100-Days-Of-ML-Code
Chinese edition: 《机器学习100天》 (100 Days of Machine Learning)
GitHub: https://github.com/MLEveryday/100-Days-Of-ML-Code

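Dataset | Social Network
A sample of the dataset is shown in the figure below:
[Figure: sample rows of the dataset]
The dataset contains information about users of a social network: user ID, gender, age, and estimated salary. A car company has just launched a new luxury SUV, and we want to predict which users will buy it. The last column records whether each user bought the SUV. We will build a model that predicts the purchase from two variables, age and estimated salary, so the feature matrix consists of those two columns. In other words, we look for a relationship between a user's age and estimated salary on the one hand and the decision to buy the SUV on the other.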

Step 1: Data Preprocessing

(1) Import the libraries

# Importing the Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

(2) Import the dataset

# Importing the dataset
dataset = pd.read_csv('D:/PycharmProjects/DataSet/Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
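
The positional indices above can be hard to read without the table in view. As a rough illustration, the sketch below builds a tiny stand-in DataFrame and applies the same `iloc` selection. The column names and values are assumptions made up for this example; only the column order (user ID, gender, age, estimated salary, purchase flag) follows the description in the text.

```python
import pandas as pd

# Hypothetical stand-in for the first rows of the CSV file.
# Column names and values are assumptions for illustration only.
dataset = pd.DataFrame({
    "User ID": [1, 2, 3],
    "Gender": ["Male", "Female", "Female"],
    "Age": [19, 35, 26],
    "EstimatedSalary": [19000, 20000, 43000],
    "Purchased": [0, 0, 1],
})

# Positional selection: columns 2 and 3 are the features,
# column 4 (the last one) is the label.
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values

print(X.shape)  # (3, 2)
print(y.shape)  # (3,)
```

Selecting by position keeps the code short, but it silently picks the wrong columns if the file layout changes, so checking `dataset.columns` once is a cheap safeguard.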

(3) Split the dataset into a training set and a test set

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
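
To see what `test_size = 0.25` and `random_state = 0` do, here is a small sketch on synthetic placeholder data (100 samples, not the article's dataset). It only checks the sizes of the two parts and that the split is reproducible.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic placeholder data: 100 samples with 2 features each.
X = np.arange(200).reshape(100, 2)
y = np.arange(100) % 2

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# A quarter of the samples is held out for testing.
print(X_train.shape, X_test.shape)  # (75, 2) (25, 2)

# Repeating the call with the same random_state gives the same split.
X_train2, X_test2, _, _ = train_test_split(
    X, y, test_size=0.25, random_state=0)
print(np.array_equal(X_test, X_test2))  # True
```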

(4) Feature scaling

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
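
Note that the scaler is fitted on the training set only and then reused for the test set. The sketch below, on made-up (age, salary) values, shows that `StandardScaler` amounts to subtracting the training mean and dividing by the training standard deviation for each column.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up (age, salary) values, chosen only for illustration.
X_train = np.array([[20.0, 20000.0], [30.0, 40000.0], [40.0, 90000.0]])
X_test = np.array([[25.0, 50000.0]])

sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)  # learns mean and std from the training set
X_test_scaled = sc.transform(X_test)        # reuses the training statistics

# The same result by hand: (x - mean) / std, using the population std (ddof=0).
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
manual_test = (X_test - mean) / std

print(np.allclose(X_test_scaled, manual_test))  # True
```

Fitting the scaler on the test set as well would leak information about the test data into preprocessing, which is why only `transform` is called there.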

Step 2: Fit logistic regression to the training set

# Fitting Logistic Regression to the Training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(X_train, y_train)
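
For readers wondering what the fitted model computes: in the binary case, logistic regression passes a linear score z = w·x + b through the sigmoid function 1 / (1 + exp(-z)) to obtain the probability of class 1. The following sketch checks this against scikit-learn on synthetic data generated with `make_classification`; the data is an assumption for illustration, not the social network dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic two-feature binary problem standing in for the real data.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

classifier = LogisticRegression()
classifier.fit(X, y)

# Linear score followed by the sigmoid function.
z = X @ classifier.coef_.ravel() + classifier.intercept_[0]
p_manual = 1.0 / (1.0 + np.exp(-z))

# Probability of class 1 as reported by scikit-learn.
p_sklearn = classifier.predict_proba(X)[:, 1]

print(np.allclose(p_manual, p_sklearn))  # True
```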

Step 3: Predict the test set results

# Predicting the Test set results
y_pred = classifier.predict(X_test)
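
As a small aside, `predict` returns hard class labels. For a binary model this corresponds to thresholding the class-1 probability at 0.5, which the sketch below illustrates on synthetic data (an assumption for the example, not the article's dataset).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data with labels 0 and 1.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=1)
classifier = LogisticRegression().fit(X, y)

y_pred = classifier.predict(X)              # hard labels (0 or 1)
proba = classifier.predict_proba(X)[:, 1]   # probability of class 1

# Thresholding the probability at 0.5 reproduces the hard labels.
y_manual = (proba > 0.5).astype(int)
print(np.array_equal(y_pred, y_manual))
```

If false positives and false negatives have different costs, one can apply a different threshold to `proba` instead of calling `predict`.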

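Step 4: Evaluate the predictions

With the test set predicted, we now evaluate how well the logistic regression model has learned. We use a confusion matrix, which counts the model's correct and incorrect predictions for each class.
(1) Build the confusion matrix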

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
cm = confusion_matrix(y_test, y_pred)
print(cm)  # print confusion_matrix
print(classification_report(y_test, y_pred))   # print classification report

The results are as follows:
[Figure: printed confusion matrix and classification report]
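
To make the layout of the confusion matrix concrete, here is a sketch with hypothetical labels (these are not the results of the model above). In scikit-learn, rows are the true classes and columns the predicted classes, so for a binary problem the flattened matrix reads TN, FP, FN, TP.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration only.
y_test = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 0, 1, 0, 1, 0, 1, 0, 1, 0])

cm = confusion_matrix(y_test, y_pred)
print(cm)
# [[5 1]
#  [1 3]]

# Rows are true classes, columns are predicted classes.
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # share of correct predictions
precision = tp / (tp + fp)        # how many predicted buyers really bought
recall = tp / (tp + fn)           # how many real buyers were found

print(accuracy, precision, recall)  # 0.8 0.75 0.75
```

These are the same quantities that `classification_report` prints per class, alongside the F1 score and the support.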

(2) Visualization

from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start=X_set[:, 0].min()-1, stop=X_set[:, 0].max()+1, step=0.01),
                     np.arange(start=X_set[:
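
The listing above is cut off in the source, so the rest of the author's plotting code is not available. The following is a minimal, self-contained sketch of one common way to draw the decision regions of a two-feature classifier. It is not the author's code: the data is synthetic, the second grid axis is assumed to mirror the first, and the colors and labels are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the scaled training set (an assumption).
X_raw, y_set = make_classification(n_samples=200, n_features=2, n_informative=2,
                                   n_redundant=0, random_state=0)
X_set = StandardScaler().fit_transform(X_raw)
classifier = LogisticRegression().fit(X_set, y_set)

# Grid covering the data range with a margin of 1 on each side.
X1, X2 = np.meshgrid(
    np.arange(start=X_set[:, 0].min() - 1, stop=X_set[:, 0].max() + 1, step=0.01),
    np.arange(start=X_set[:, 1].min() - 1, stop=X_set[:, 1].max() + 1, step=0.01),
)

# Predict a class for every grid point, then reshape back to the grid.
grid_points = np.column_stack([X1.ravel(), X2.ravel()])
Z = classifier.predict(grid_points).reshape(X1.shape)

# Shade the two predicted regions and overlay the data points.
colors = ("red", "green")
fig, ax = plt.subplots()
ax.contourf(X1, X2, Z, alpha=0.5, cmap=ListedColormap(colors))
for label, color in zip(np.unique(y_set), colors):
    mask = y_set == label
    ax.scatter(X_set[mask, 0], X_set[mask, 1], color=color, label=str(label))
ax.set_xlim(X1.min(), X1.max())
ax.set_ylim(X2.min(), X2.max())
ax.set_title("Logistic Regression (synthetic data)")
ax.set_xlabel("Feature 1 (scaled)")
ax.set_ylabel("Feature 2 (scaled)")
ax.legend()
```

Calling `plt.show()` afterwards displays the figure. Because logistic regression is a linear classifier, the boundary between the two shaded regions is a straight line. To plot the test set instead, one would pass the scaled test data as `X_set` and `y_set`.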
