How to Write PyTorch Code for K-Fold Cross-Validation Training on MNIST

Preface

        I recently learned about K-fold cross-validation and could not wait to test whether it works as well as my teacher claimed, hence this post.

        Environment for this post: sklearn, PyTorch, Jupyter Notebook


Introduction to k-fold cross-validation

[Figure: schematic of 5-fold cross-validation, with one of the five parts held out as the test set in each run]

        Five-fold cross-validation: split the data into 5 equal parts; in each run, hold one part out as the test set and train on the remaining four. Repeat the experiment 5 times and average the results. As the figure above shows, run 1 uses part 1 as the test set and the rest for training, run 2 uses part 2 as the test set, and so on. A toy sketch of this protocol follows below.
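To make the protocol concrete, here is a toy sketch. Note that train_and_eval is a hypothetical placeholder standing in for "train on the train indices, return accuracy on the test indices"; the real PyTorch version appears later in this post.

import numpy as np

def train_and_eval(train_idx, test_idx):
    # hypothetical placeholder: train a model on train_idx and
    # return its accuracy on test_idx
    return np.random.rand()

indices = np.arange(100)              # pretend we have 100 samples
folds = np.array_split(indices, 5)    # split into 5 equal parts
scores = []
for i in range(5):
    test_idx = folds[i]               # fold i is the test set
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    scores.append(train_and_eval(train_idx, test_idx))
print(np.mean(scores))                # report the average over the 5 runs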


Baseline model demo

        Let's start with the simplest possible model: plain and unadorned, with no tricks added.

(1) Import packages


#---------------------------------Torch Modules --------------------------------------------------------
from __future__ import print_function
import os
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
from torch.nn import init
from torch.utils import data
from torchvision import datasets, transforms, models

os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

(2) Set some initial values


###-----------------------------------variables-----------------------------------------------
# for normalization
mean = [0.5]
std = [0.5]
# batch size
BATCH_SIZE = 128
Iterations = 1        # number of epochs
learning_rate = 0.01

(3) Load the dataset


##-----------------------------------Commands to download and prepare the MNIST dataset ------------------------------------
train_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean, std)
        ])

test_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean, std)
        ])

    
train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('./mnist', train=True, download=True,
                       transform=train_transform),
        batch_size=BATCH_SIZE, shuffle=True) # train dataset

test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('./mnist', train=False, 
                         transform=test_transform),
        batch_size=BATCH_SIZE, shuffle=False) # test dataset

        The transforms pipeline preprocesses the data (here, converting to tensors and normalizing) and can also be used for data augmentation, e.g. rotation, cropping, and so on; a purely illustrative example follows.
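For illustration only (these augmentations are not used in the rest of this post), an augmented training pipeline could look like this:

# Hypothetical augmented pipeline: random rotation and crop before the
# usual ToTensor + Normalize preprocessing.
augment_transform = transforms.Compose([
    transforms.RandomRotation(10),         # rotate by up to +/-10 degrees
    transforms.RandomCrop(28, padding=2),  # pad to 32x32, then crop back to 28x28
    transforms.ToTensor(),
    transforms.Normalize(mean, std),
])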

(4) Visualize the dataset


# visualization
def show_images(imgs, num_rows, num_cols, titles=None, scale=1.5):
    """Plot a list of images."""
    figsize = (num_cols * scale, num_rows * scale)
    _, axes = plt.subplots(num_rows, num_cols, figsize=figsize)
    axes = axes.flatten()
    for i, (ax, img) in enumerate(zip(axes, imgs)):
        if torch.is_tensor(img):
            # tensor image
            ax.imshow(img.numpy())
        else:
            # PIL image
            ax.imshow(img)
        ax.axes.get_xaxis().set_visible(False)
        ax.axes.get_yaxis().set_visible(False)
        if titles:
            ax.set_title(titles[i])
    return axes

mnist_train = torchvision.datasets.MNIST(root="../data", train=True,
                                         transform=train_transform,
                                         download=True)
X, y = next(iter(data.DataLoader(mnist_train, batch_size=18)))
show_images(X.reshape(18, 28, 28), 2, 9)

[Output: a 2 x 9 grid of sample MNIST digits]

(5) Define the model, optimizer, and loss function


model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

def init_weights(m):
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)
model.apply(init_weights)

## loss function
criterion = torch.nn.CrossEntropyLoss() # pytorch's cross-entropy loss

# defining which parameters to train: all of the model's parameters
optimizer = torch.optim.SGD(model.parameters(), learning_rate)

(6) Training function


# defining the training function
# train the baseline classifier on clean data
# note: the data loader is passed in explicitly so the k-fold section below
# can reuse this function with per-fold loaders
def train(model, optimizer, criterion, epoch, train_loader):
    model.train() # set up for training
    for batch_idx, (data, target) in enumerate(train_loader): # data is the image batch, target the labels 0-9
        data = data.view(-1, 28*28) # flatten each image to a 784-dim vector
        optimizer.zero_grad() # reset gradients to zero
        output = model(data) # forward pass
        loss = criterion(output, target) # loss computation
        loss.backward() # back propagation, handled by pytorch's autograd
        optimizer.step() # update the weights
        if batch_idx % 100 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))

(7) Test function


# to evaluate the model
## validation / test accuracy
def test(model, criterion, val_loader, epoch, train=False):
    model.eval()
    test_loss = 0
    correct = 0

    with torch.no_grad():
        for batch_idx, (data, target) in enumerate(val_loader):
            data = data.view(-1, 28*28)
            output = model(data)
            test_loss += criterion(output, target).item() * len(data) # sum up batch loss (criterion returns the batch mean)
            pred = output.max(1, keepdim=True)[1] # index of the max logit
            correct += pred.eq(target.view_as(pred)).sum().item() # count correct predictions

    test_loss /= len(val_loader.sampler) # average loss per sample
    name = 'Train' if train else 'Test'
    print('\n{} set: Average loss: {:.4f}, Accuracy: {}/{} ({:.4f}%)\n'.format(
        name, test_loss, correct, len(val_loader.sampler),
        100. * correct / len(val_loader.sampler)))
    return 100. * correct / len(val_loader.sampler)

(8) Train, test, and evaluate the baseline model


test_acc = torch.zeros([Iterations])
train_acc = torch.zeros([Iterations])
## training the baseline model
for i in range(Iterations):
    train(model, optimizer, criterion, i, train_loader)
    train_acc[i] = test(model, criterion, train_loader, i, train=True) # evaluate on the training set
    test_acc[i] = test(model, criterion, test_loader, i) # evaluate on the test set
    torch.save(model, 'perceptron.pt')

[Output: per-epoch training logs; final test accuracy around 83%]

        We can see that the accuracy is not very high, only about 83%, and you could even say the model is not robust. Now for the main event: if we want to use k-fold cross-validation, what do we need to change?

  •         Step 1: modify the dataset
  •         Step 2: set the value of k
  •         Step 3: retrain

 [Key point] Implementing k-fold cross-validation

        NOTE: you only need to append the code below at the end of the notebook. It is wrapped in a function, and the per-fold loaders are passed explicitly into train and test, so it does not conflict with the baseline code above.

(1) [The tricky part: building the datasets for each fold]


#!pip install scikit-learn -i https://pypi.mirrors.ustc.edu.cn/simple
from sklearn.model_selection import KFold

train_init = datasets.MNIST('./mnist', train=True,
                            transform=train_transform)

test_init = datasets.MNIST('./mnist', train=False,
                           transform=test_transform)

# the dataset for k-fold cross validation
dataFold = torch.utils.data.ConcatDataset([train_init, test_init])

def train_fold_mnist(k_split_value):
    different_k_acc = []
    kf = KFold(n_splits=k_split_value, shuffle=True, random_state=0) # init KFold
    for train_index, test_index in kf.split(dataFold): # split
        # get train / val subsets by index
        train_fold = torch.utils.data.Subset(dataFold, train_index)
        test_fold = torch.utils.data.Subset(dataFold, test_index)

        # wrap as DataLoader
        train_loader = torch.utils.data.DataLoader(dataset=train_fold, batch_size=BATCH_SIZE, shuffle=True)
        test_loader = torch.utils.data.DataLoader(dataset=test_fold, batch_size=BATCH_SIZE, shuffle=False)

        # re-initialize the model and optimizer for every fold, so that
        # no information leaks from one fold into the next
        model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
        model.apply(init_weights)
        optimizer = torch.optim.SGD(model.parameters(), learning_rate)

        # train the model on this fold
        test_acc = torch.zeros([Iterations])
        train_acc = torch.zeros([Iterations])
        for i in range(Iterations):
            train(model, optimizer, criterion, i, train_loader)
            train_acc[i] = test(model, criterion, train_loader, i, train=True) # accuracy on the fold's train split
            test_acc[i] = test(model, criterion, test_loader, i) # accuracy on the fold's held-out split
        # collect the per-epoch test accuracies for this fold
        different_k_acc.append(test_acc.numpy())
    return different_k_acc

        What are the key points the code above addresses?

        First, merging the MNIST training set and test set into one dataset; there are quite a few pitfalls here.

train_init = datasets.MNIST('./mnist', train=True,
                            transform=train_transform)

test_init = datasets.MNIST('./mnist', train=False,
                           transform=test_transform)

# the dataset for k-fold cross validation
dataFold = torch.utils.data.ConcatDataset([train_init, test_init])

        Second, using sklearn's KFold to split the dataset by index, then converting each split back into a PyTorch DataLoader.

kf = KFold(n_splits=k_split_value, shuffle=True, random_state=0) # init KFold
for train_index, test_index in kf.split(dataFold): # split
    # get train / val subsets by index
    train_fold = torch.utils.data.Subset(dataFold, train_index)
    test_fold = torch.utils.data.Subset(dataFold, test_index)

    # wrap as DataLoader
    train_loader = torch.utils.data.DataLoader(dataset=train_fold, batch_size=BATCH_SIZE, shuffle=True)
    test_loader = torch.utils.data.DataLoader(dataset=test_fold, batch_size=BATCH_SIZE, shuffle=False)

About KFold:

KFold provides train/test indices to split the data: it divides the dataset into k folds (without shuffling by default).

Parameters:

n_splits: int, default 5. The number of folds to split into.
shuffle: bool, default False. Whether to shuffle the data before splitting. True shuffles, False does not.
random_state: int, default None. Only takes effect when shuffle=True. If random_state is None, each run of the code produces a different split; if it is set to an integer, every run produces the same split, which keeps experiments reproducible. Pick any integer you like (random_state=42 is a common choice) and keep it fixed once chosen.

Example:

from sklearn.model_selection import KFold
import numpy as np
X = np.arange(24).reshape(12, 2)
y = np.random.choice([1, 2], 12, p=[0.4, 0.6])
kf = KFold(n_splits=5, shuffle=False)  # initialize KFold
for train_index, test_index in kf.split(X):  # call split to generate the indices
    print('train_index:%s , test_index: %s ' % (train_index, test_index))

[Output]

train_index:[ 3 4 5 6 7 8 9 10 11] , test_index: [0 1 2]
train_index:[ 0 1 2 6 7 8 9 10 11] , test_index: [3 4 5]
train_index:[ 0 1 2 3 4 5 8 9 10 11] , test_index: [6 7]
train_index:[ 0 1 2 3 4 5 6 7 10 11] , test_index: [8 9]
train_index:[0 1 2 3 4 5 6 7 8 9] , test_index: [10 11]

Note:

With shuffle=False, every run produces the same split.

With shuffle=True, every run produces a different split.

With shuffle=True and an integer random_state, every run produces the same split.
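To see the random_state behavior concretely, here is a quick check (reusing X from the example above):

# With shuffle=True and a fixed random_state, two independently created
# KFold objects produce identical splits.
kf1 = KFold(n_splits=5, shuffle=True, random_state=42)
kf2 = KFold(n_splits=5, shuffle=True, random_state=42)
for (tr1, te1), (tr2, te2) in zip(kf1.split(X), kf2.split(X)):
    assert (tr1 == tr2).all() and (te1 == te2).all()
print('identical splits across runs')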

(2) Train with k ranging over [2, 10]


testAcc_compare_map = {}
for k_split_value in range(2, 10+1):
    print('now k_split_value is:', k_split_value)
    testAcc_compare_map[k_split_value] = train_fold_mnist(k_split_value)

        testAcc_compare_map stores the training results for each value of k. From this dict we can later compute statistics such as the RMSE and compare how robust the results are across different values of k, as sketched below.
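As one possible way to do that comparison, here is a minimal sketch, assuming testAcc_compare_map was built as above; the mean and standard deviation of the per-fold accuracies serve as a robustness summary:

# Minimal sketch: summarize testAcc_compare_map by computing, for each k,
# the mean and standard deviation of the per-fold test accuracies.
ks, acc_mean, acc_std = [], [], []
for k, fold_accs in sorted(testAcc_compare_map.items()):
    accs = np.concatenate([np.asarray(a).ravel() for a in fold_accs])
    ks.append(k)
    acc_mean.append(accs.mean())
    acc_std.append(accs.std())

plt.errorbar(ks, acc_mean, yerr=acc_std, marker='o')
plt.xlabel('k (number of folds)')
plt.ylabel('test accuracy (%)')
plt.show()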


[Extra] A small exercise

Here is a small assignment that may help you internalize k-fold cross-validation:

1. Perform a k-fold cross-validation with k = 1-10; plot (a) avg train log RMSE vs. k, and (b) avg log RMSE vs. k. [20 points]

2. What happens when you increase the value of k? Explain the behavior of the two losses with increasing k. [20 points]

[Supplementary material]

About RMSE:
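As a reminder, the standard definition (not specific to this post) of the root-mean-square error over $n$ predictions $\hat{y}_i$ with targets $y_i$ is:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$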

Experimental results: how the accuracy changes as k increases (k=1 corresponds to the baseline above; here k = [2, 10]):

[Figure: test accuracy for k = 2 through 10]

