sklearn Naive Bayes Classifier: Learning sklearn with Naive Bayes

Note how the code builds the horizontal axis of the plot from stratified subsets of the training set:

train_sizes 

---------------------

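# ================================================
# Predicting diabetes with Naive Bayes
# Compared against logistic regression
# 2019-03-02
# ================================================
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Columns 0-7 are the features, column 8 is the label
df = pd.read_csv('./pima-indians-diabetes.data', header=None)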
# print(df)
y = df[8]
X = df[[0, 1, 2, 3, 4, 5, 6, 7]]
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=11)
print(len(X_train))

lr = LogisticRegression()
nb = GaussianNB()

lr_scores = []
nb_scores = []

print('======== Gradually increasing the number of training samples ========')
train_sizes = range(10, len(X_train), 10)

for train_size in train_sizes:
    X_slice, _, y_slice, _ = train_test_split(
        X_train, y_train, train_size=train_size, stratify=y_train, random_state=11)
    nb.fit(X_slice, y_slice)
    nb_scores.append(nb.score(X_test, y_test))
    lr.fit(X_slice, y_slice)
    lr_scores.append(lr.score(X_test, y_test))
    
plt.plot(train_sizes, nb_scores, label='Naive Bayes')
plt.plot(train_sizes, lr_scores, linestyle='--', label='Logistic Regression')
plt.title("Naive Bayes and Logistic Regression Accuracies")
plt.xlabel("Number of training instances")
plt.ylabel("Test set accuracy")
plt.legend()

[Figure: Naive Bayes and Logistic Regression Accuracies (test set accuracy vs. number of training instances)]
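To see what the stratified slicing in the loop actually does, here is a minimal sketch on synthetic data. The arrays `X_demo` and `y_demo` and the helper `stratified_slice` are made up for illustration (they are not the Pima data or part of the original code); the only library call is `train_test_split`, used the same way as in the loop above. It should show that every subset keeps roughly the same share of positive labels as the full training set, so the points along the horizontal axis stay comparable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the training set (an assumption, not the Pima data):
# 570 samples, 8 features, roughly 35% positive labels.
rng = np.random.RandomState(11)
X_demo = rng.normal(size=(570, 8))
y_demo = (rng.uniform(size=570) < 0.35).astype(int)


def stratified_slice(X, y, train_size, seed=11):
    """Hypothetical helper: draw train_size rows while keeping class proportions."""
    X_slice, _, y_slice, _ = train_test_split(
        X, y, train_size=train_size, stratify=y, random_state=seed)
    return X_slice, y_slice


overall_rate = y_demo.mean()
slice_rates = {}
# Same pattern as the original loop: subsets of 10, 20, 30, ... samples
for train_size in range(10, len(X_demo), 10):
    _, y_slice = stratified_slice(X_demo, y_demo, train_size)
    slice_rates[train_size] = y_slice.mean()

for n in (10, 100, 560):
    print(f"n={n:3d}  positive rate={slice_rates[n]:.3f}  (overall {overall_rate:.3f})")
```

Without `stratify`, a subset of 10 or 20 samples could easily end up with very few positives, which would make the left end of the accuracy curve noisier for reasons unrelated to the classifiers themselves.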