机器学习——fashion_mnist

机器学习——fashion_mnist

数据

本实验使用Fashion-MNIST数据集,包括t-shirt(T恤),trouser(牛仔裤),pullover(套衫)等在内的10个类别的图像共计70000张。

#读取训练集、测试集的数据和标签
train_images, train_labels = load_mnist(r"E:\code\jupyter\fashion-mnist",kind='train')
test_images, test_labels = load_mnist(r'E:\code\jupyter\fashion-mnist',kind='t10k')
train_images = train_images / 255.0
test_images = test_images / 255.0

print("Train set: {} images".format(len(train_images)))
print("Val set: {} images\n".format(len(test_images)))

在这里插入图片描述

#特征值归一化处理
scaler = StandardScaler()
scaler.fit(train_images_pca)
train_images_pca = scaler.transform(train_images_pca)
scaler.fit(test_images_pca)
test_images_pca = scaler.transform(test_images_pca)
train_images = np.reshape(train_images, (train_images.shape[0], -1))
test_images = np.reshape(test_images, (test_images.shape[0], -1))
# 画出特征个数和所携带信息数的曲线图
explained_variance_ratio = []
for i in range(1,300): 
    pca = PCA(n_components=i).fit(train_images)
    explained_variance_ratio.append(pca.explained_variance_ratio_.sum())
plt.plot(range(1,300),explained_variance_ratio)
plt.show()

在这里插入图片描述

PCA

#选择维度200,pca
pca = PCA(n_components = 200)
pca.fit(train_images)
train_images_pca = pca.transform(train_images)
test_images_pca = pca.transform(test_images)
#划分训练集和测试集
train_x, test_x, train_y, test_y = train_test_split(train_images_pca, train_labels, random_state=1, train_size = 0.8)

model+predicted

#使用svm、mlp和randomforest三种model进行训练
svc = svm.SVC(kernel = 'rbf', C = 1)
svc.fit(train_x, train_y)

mlp = MLPClassifier(solver = 'lbfgs', activation = 'tanh', alpha = 1e-5, hidden_layer_sizes = [100, 100])
mlp.fit(train_x, train_y)

rdf = RandomForestClassifier(n_estimators = 500, max_depth = 12, random_state = 2)
rdf.fit(train_x, train_y)
result = pd.DataFrame()
result['index'] = ['score_train', 'score_test', 'score_train_pca', 'score_test_pca']
result['svc'] = [svc.score(train_x, train_y), svc.score(test_x, test_y), svc.score(train_images_pca, train_labels), svc.score(test_images_pca, test_labels)]
result['mlp'] = [mlp.score(train_x, train_y), mlp.score(test_x, test_y), mlp.score(train_images_pca, train_labels), mlp.score(test_images_pca, test_labels)]
result['rdf'] = [rdf.score(train_x, train_y), rdf.score(test_x, test_y), rdf.score(train_images_pca, train_labels), rdf.score(test_images_pca, test_labels)]
result

在这里插入图片描述
svc在train上自验结果最低,但是test结果最好;mlp在train上可能存在过拟合现象;rdf整体表现一般,但是运行速度快

评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

G-Jarvey

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值