scikit-learn for Beginners: A Walkthrough with Detailed Comments

hao240643983

Tags: machine learning, python

Article link: https://blog.csdn.net/hao240643983/article/details/103963194

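# Jupyter/IPython magic: render plots inline (remove this line when running as a plain script)
%matplotlib inline
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)
# Each row of X holds the intensities of the 64 pixels of one 8x8 image.
# For each sample in X, the matching entry of y is the digit that was written.
plt.imshow(X[0].reshape(8, 8), cmap='gray')  # reshape the first sample to 8x8 and draw it in grayscale
plt.axis('off')  # hide the axes
print('The digit in the image is {}'.format(y[0]))  # formatted print of the first label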
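from sklearn.model_selection import train_test_split

# Split the data into a training set and a test set. Passing stratify=y keeps the class
# distribution of both sets the same as that of the full dataset.
# Note: stratify takes the array of class labels, not a fraction, so stratify=0.8 would be invalid.
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)
print(y_train)  # inspect the training labels; y_train only exists after the split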
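To check the effect of stratify described above, the following minimal sketch compares the per-class proportions of the full dataset with those of the two splits. The helper `class_proportions` and the variable `max_gap` are names introduced here for illustration only; they are not part of scikit-learn.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)


def class_proportions(labels, n_classes=10):
    """Return the fraction of samples that belong to each digit class (hypothetical helper)."""
    counts = np.bincount(labels, minlength=n_classes)
    return counts / counts.sum()


full = class_proportions(y)
train = class_proportions(y_train)
test = class_proportions(y_test)

# Largest difference between a split's class share and the full dataset's class share.
max_gap = max(np.abs(full - train).max(), np.abs(full - test).max())
print('Largest difference in class proportion: {:.4f}'.format(max_gap))
```

With a stratified split the gap should be very small, since each class is divided between the two sets in nearly the same ratio.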
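from sklearn.linear_model import LogisticRegression

# Train a logistic regression classifier and compute its accuracy score.
clf = LogisticRegression(solver='lbfgs', multi_class='ovr', max_iter=5000, random_state=42)
# solver selects the optimization algorithm.
# multi_class selects the multiclass strategy: 'ovr' (one-vs-rest) or 'multinomial'.
# 'ovr' was the default in older releases; since scikit-learn 0.22 the default is 'auto',
# and the parameter is deprecated as of scikit-learn 1.5.
# max_iter caps the number of solver iterations. It behaves as in other libraries,
# so it is not covered further here.
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print('Accuracy score of the {} is {:.2f}'.format(clf.__class__.__name__, accuracy))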

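# fit: trains the model by estimating its coefficients from the training data
# predict(X): returns the predicted class labels for X
# score: evaluates the model; for logistic regression it returns the mean accuracy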
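
The relationship between the three methods summarized above can be checked directly. This is a minimal sketch, not the original author's code: it assumes the default multiclass handling (the `multi_class` argument is left out to avoid deprecation warnings on recent scikit-learn versions), and `manual_accuracy` and `score_accuracy` are names introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# Assumption: default multiclass handling instead of multi_class='ovr'.
clf = LogisticRegression(max_iter=5000, random_state=42)
clf.fit(X_train, y_train)  # fit: learn the coefficients from the training data

y_pred = clf.predict(X_test)  # predict: one predicted label per test sample

# score should equal the fraction of test samples whose prediction matches the true label.
manual_accuracy = np.mean(y_pred == y_test)
score_accuracy = clf.score(X_test, y_test)

print('Accuracy computed from predict: {:.4f}'.format(manual_accuracy))
print('Accuracy returned by score:     {:.4f}'.format(score_accuracy))
```

Both printed values should agree, because for classifiers `score` is the mean accuracy of `predict(X)` against the true labels.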