Kaggle Digit Recognizer: A CNN Solution with the PyTorch Framework

I. Problem Requirements

Train a classification model on the features and labels of all training-set samples, feed the test-set samples' features into the trained model to obtain predictions, and upload the predictions to the competition site for scoring. The higher the accuracy score, the better.

II. Data Description

Download the three CSV files from the competition site.

train.csv: the training set, 42001 rows by 785 columns. The first row is a header: the first column is the sample label, and the remaining 784 columns are sample features. The training set contains 42000 samples; each sample has 784 features, i.e., a flattened 28*28 grayscale image.

test.csv: the test set (no labels), 28001 rows by 784 columns. The first row is a header; each column is one feature of a sample, 784 features in all. The test set contains 28000 samples.

sample_submission.csv: the submission template, 28001 rows by 2 columns. The first row is a header; the first column is the index of each test-set sample, and the second column is the label to fill in, initially all 0. The prediction for each test-set sample must be written over these zeros; once filled in, the file can be submitted as-is.
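The label/feature split described above can be illustrated without the real files. A hedged sketch, using a tiny in-memory stand-in for train.csv (4 pixel columns instead of 784) and the same iloc slicing the code below uses:

```python
import io
import pandas as pd

# Tiny stand-in for train.csv: same layout (label first, pixels after),
# shrunk to 4 pixel columns for illustration.
fake_train = io.StringIO(
    "label,pixel0,pixel1,pixel2,pixel3\n"
    "5,0,128,255,0\n"
    "0,12,0,0,99\n"
)
train = pd.read_csv(fake_train)
labels = train.iloc[:, 0].values     # first column: digit labels
features = train.iloc[:, 1:].values  # remaining columns: flattened pixels

print(labels.shape, features.shape)  # with the real file: (42000,) and (42000, 784)
```

With the actual train.csv, the same two lines produce a (42000,) label array and a (42000, 784) feature array.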

III. Complete Code

import torch
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
import pandas as pd
from tqdm import tqdm

class CSVTrainDataset(Dataset):
    def __init__(self, train_path):
        train_data = pd.read_csv(train_path)
        self.labels = train_data.iloc[:,0].values
        self.features = train_data.iloc[:,1:].values

    def __len__(self):
        # return the number of samples in the dataset
        return len(self.labels)

    def __getitem__(self, idx):
        sample = torch.from_numpy(self.features[idx]).float(), torch.tensor(self.labels[idx]).long()
        return sample


class CSVTestDataset(Dataset):
    def __init__(self, test_path):
        self.features = pd.read_csv(test_path).values

    def __len__(self):
        return self.features.shape[0]

    def __getitem__(self, idx):
        sample = self.features[idx]
        sample =torch.from_numpy(sample).float()
        return sample


class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=1)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1)
        self.dropout1 = nn.Dropout2d(0.25)  # channel-wise dropout on 4-D conv activations
        self.dropout2 = nn.Dropout(0.5)     # plain dropout: acts on 2-D (N, features) activations after fc1
        self.fc1 = nn.Linear(in_features=1600, out_features=128)
        self.fc2 = nn.Linear(in_features=128, out_features=10)

    def forward(self, x):
        x = self.conv1(x)
        x = nn.functional.relu(x)
        x = nn.functional.max_pool2d(x, 2)
        x = self.conv2(x)
        x = nn.functional.relu(x)
        x = nn.functional.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x,1)
        x = self.fc1(x)
        x = nn.functional.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = nn.functional.log_softmax(x,dim =1)
        return output


show = 0
train_dataset = CSVTrainDataset(train_path=r"..\data\train.csv")
test_dataset = CSVTestDataset(test_path=r"..\data\test.csv")
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
model = CNN()
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.NLLLoss()


num_epochs = 5
for epoch in range(num_epochs):
    print("training")
    for i, (images, labels) in tqdm(enumerate(train_loader), total=len(train_loader)):
        images = images.reshape(-1, 1, 28, 28)
        if show == 1 and epoch == 1:
            if i == 3:
                images_subset = images[25:35].squeeze()
                labels_subset = labels[25:35]
                fig, axes = plt.subplots(2, 5, figsize=(12, 6))
                fig.suptitle('training set display')
                for j in range(10):  # j, not i: avoid shadowing the batch index
                    row, col = divmod(j, 5)
                    ax = axes[row, col]
                    ax.imshow(images_subset[j], cmap='gray')
                    ax.title.set_text(f'Label: {labels_subset[j].item()}')
                    ax.axis('off')
                plt.show()
        model.train()
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
    print(f'Epoch[{epoch+1}/{num_epochs}], Loss:{loss.item():.4f}')  # loss of the epoch's last batch

with torch.no_grad():
    test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
    model.eval()
    print("predicting")
    for i, images in enumerate(test_loader):
        images = images.reshape(-1, 1, 28, 28)
        pred = model(images)
        pred = pred.argmax(dim=1)
        if show == 1:
            if i == 3:
                images_test_subset = images[25:35].squeeze()
                labels_test_subset = pred[25:35]
                fig, axes = plt.subplots(2, 5, figsize=(12, 6))
                fig.suptitle('test set display')
                for j in range(10):  # j, not i: i is needed below to accumulate results
                    row, col = divmod(j, 5)
                    ax = axes[row, col]
                    ax.imshow(images_test_subset[j], cmap='gray')
                    ax.title.set_text(f'Label: {labels_test_subset[j].item()}')
                    ax.axis('off')
                plt.show()

        if i == 0:
            result = pred
        else:
            result = torch.cat((result, pred), dim=0)
    result = result[0:28000]
    print("take a look at the result")
    print(result)
    numpy_array = result.numpy()
    csv_file_path = r"..\data\sample_submission.csv"
    df = pd.read_csv(csv_file_path)
    df.iloc[:, 1] = numpy_array.flatten()
    df.to_csv(csv_file_path, index=False)
    print("saved successfully")

The folders must be laid out so that the relative paths in the code resolve: a data directory holding train.csv, test.csv, and sample_submission.csv, with main.py in a sibling directory (the code refers to the data as ..\data\...). Only with this layout will the code run as-is.

Note: copy the code into a new main.py file and it can be run directly. The results (predicted classes for the test set) are written into the corresponding rows of sample_submission.csv. Once the program prints "saved successfully", it has finished; submit sample_submission.csv on the Kaggle site to receive a score and ranking.

IV. Approach Overview

1. Data preprocessing

1) Build a batched data loader

To speed up training, multiple samples can be grouped into one batch; the hyperparameter batch_size sets how many samples each batch contains.

For example, with

batch_size = 64

the loader yields (64, 28, 28) tensors, enabling parallel computation.

If convolutions follow, a channel dimension must be added, giving (64, 1, 28, 28).

The images here are single-channel grayscale; an RGB image (three channels) would instead yield (64, 3, 28, 28).

2) Reshaping the samples

Each raw sample is a flat 1-D array. To use the 2-D convolutions common in computer vision, the 1-D array must be reshaped into a 2-D array, i.e., restored to image form.

For example: the data changes from (64, 784) to (64, 28, 28).
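The two steps above (batching, then restoring the image shape with a channel dimension) can be sketched with random stand-in data; a TensorDataset takes the place of the CSV-backed dataset here:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the training features: 200 flattened 28*28 samples.
features = torch.randn(200, 784)
labels = torch.randint(0, 10, (200,))
loader = DataLoader(TensorDataset(features, labels), batch_size=64, shuffle=True)

images, targets = next(iter(loader))   # one batch of 64 flat samples
print(images.shape)                    # torch.Size([64, 784])

# Restore the 28x28 image shape and add the single grayscale channel.
images = images.reshape(-1, 1, 28, 28)
print(images.shape)                    # torch.Size([64, 1, 28, 28])
```

The -1 lets reshape infer the batch dimension, which matters for the final, smaller batch.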

2. Building the convolutional neural network model

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=1)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1)
        self.dropout1 = nn.Dropout2d(0.25)  # channel-wise dropout on 4-D conv activations
        self.dropout2 = nn.Dropout(0.5)     # plain dropout on 2-D activations after fc1
        self.fc1 = nn.Linear(in_features=1600, out_features=128)
        self.fc2 = nn.Linear(in_features=128, out_features=10)

    def forward(self, x):
        x = self.conv1(x)
        x = nn.functional.relu(x)
        x = nn.functional.max_pool2d(x, 2)
        x = self.conv2(x)
        x = nn.functional.relu(x)
        x = nn.functional.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x,1)
        x = self.fc1(x)
        x = nn.functional.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = nn.functional.log_softmax(x,dim =1)
        return output

This is a simple CNN with two convolutional layers. The forward pass runs through: 2-D convolution, activation, 2-D max pooling, 2-D convolution, activation, 2-D max pooling, dropout, flatten, a fully connected layer, activation, dropout, a second fully connected layer, and finally log-softmax.
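The layer sizes (in particular fc1's in_features=1600 = 64*5*5) can be sanity-checked by pushing a dummy batch through the model. A compact restatement of the same architecture, with dropout2 as plain nn.Dropout since it acts on 2-D activations:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):  # same architecture as above, written compactly
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(1600, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # (N,32,26,26) -> (N,32,13,13)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # (N,64,11,11) -> (N,64,5,5)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)                     # (N, 64*5*5) = (N, 1600)
        x = self.dropout2(F.relu(self.fc1(x)))
        return F.log_softmax(self.fc2(x), dim=1)

model = CNN().eval()
out = model(torch.randn(64, 1, 28, 28))
print(out.shape)  # torch.Size([64, 10])
```

Each output row is a vector of log-probabilities, so exponentiating a row sums to 1.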

V. Stepping Through the Data Flow

show = 1

Setting show to 1 displays a few images from the training set and test set during the run; setting it to 0 disables the display.

train_dataset = CSVTrainDataset(train_path=r"..\data\train.csv")
test_dataset = CSVTestDataset(test_path=r"..\data\test.csv")

Load the data from the CSV files.

Key data at this point:

test_dataset{CSVTestDataset:28000}

    features{ndarray:(28000,784)}

train_dataset{CSVTrainDataset:42000}

    features{ndarray:(42000,784)}

    labels{ndarray:(42000,)}

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

Instantiate the data loader.

Key data at this point:

test_dataset{CSVTestDataset:28000}

    features{ndarray:(28000,784)}

train_dataset{CSVTrainDataset:42000}

    features{ndarray:(42000,784)}

    labels{ndarray:(42000,)}

train_loader{DataLoader:657}

model = CNN()
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.NLLLoss()

Instantiate the model, the optimizer, and the loss function.

num_epochs = 5

Set the number of training epochs; each epoch trains on the full training set.

for i, (images, labels) in tqdm(enumerate(train_loader), total=len(train_loader)):

Fetch a batch of features and labels from the training-set data loader.

Key data at this point:

train_loader{DataLoader:657}

images{Tensor:(64,784)}

labels{Tensor:(64,)}

images = images.reshape(-1, 1, 28, 28)

Reshape each sample's features from a flat vector back into a 2-D image and add the channel dimension.

Key data at this point:

train_loader{DataLoader:657}

images{Tensor:(64,1,28,28)}

labels{Tensor:(64,)}

images_subset = images[25:35].squeeze()
labels_subset = labels[25:35]
fig, axes = plt.subplots(2, 5, figsize=(12, 6))
fig.suptitle('training set display')
for j in range(10):
    row, col = divmod(j, 5)
    ax = axes[row, col]
    ax.imshow(images_subset[j], cmap='gray')
    ax.title.set_text(f'Label: {labels_subset[j].item()}')
    ax.axis('off')
plt.show()

Display a few images from the training set.

        model.train()
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

Run the training step.

    def forward(self, x):
        x = self.conv1(x)
        x = nn.functional.relu(x)
        x = nn.functional.max_pool2d(x, 2)
        x = self.conv2(x)
        x = nn.functional.relu(x)
        x = nn.functional.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x,1)
        x = self.fc1(x)
        x = nn.functional.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = nn.functional.log_softmax(x,dim =1)
        return output

Step into the model.

Key data at this point:

x{Tensor:(64,1,28,28)}

x = self.conv1(x)

x{Tensor:(64,32,26,26)}

x = nn.functional.relu(x)
x = nn.functional.max_pool2d(x, 2)

x{Tensor:(64,32,13,13)}

x = self.conv2(x)

x{Tensor:(64,64,11,11)}

x = nn.functional.relu(x)
x = nn.functional.max_pool2d(x, 2)
x = self.dropout1(x)

x{Tensor:(64,64,5,5)}

x = torch.flatten(x,1)

x{Tensor:(64,1600)}

x = self.fc1(x)

x{Tensor:(64,128)}

x = nn.functional.relu(x)
x = self.dropout2(x)
x = self.fc2(x)

x{Tensor:(64,10)}

output = nn.functional.log_softmax(x,dim =1)

output{Tensor:(64,10)}

After several epochs, model training is complete.
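A side note on the loss setup: the model ends in log_softmax and the criterion is NLLLoss, a pairing that computes the same value as feeding raw logits to CrossEntropyLoss. A quick check on random logits (dummy data, not the MNIST tensors):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8, 10)            # raw scores: 8 samples, 10 classes
targets = torch.randint(0, 10, (8,))

# Route used in this post: log_softmax inside the model + NLLLoss criterion.
loss_a = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)
# Equivalent single-step route: raw logits + CrossEntropyLoss.
loss_b = nn.CrossEntropyLoss()(logits, targets)

print(torch.allclose(loss_a, loss_b))  # True
```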

test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

Key data at this point:

test_loader{DataLoader:438}

for i, images in enumerate(test_loader):

Fetch a batch of features from the test-set data loader.

Key data at this point:

test_loader{DataLoader:438}

images{Tensor:(64,784)}

images = images.reshape(-1, 1, 28, 28)

Reshape the features from flat vectors back into 2-D images and add the channel dimension.

Key data at this point:

test_loader{DataLoader:438}

images{Tensor:(64,1,28,28)}

pred = model(images)

Classify with the trained model.

Key data at this point:

test_loader{DataLoader:438}

pred{Tensor:(64,10)}

pred = pred.argmax(dim=1)

Take the class with the highest score as the prediction for each test sample.

Key data at this point:
pred{Tensor:(64,)}
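A small illustration of the argmax step on two hypothetical rows of class log-probabilities:

```python
import torch

# Two made-up rows of log-probabilities over 10 digit classes.
pred = torch.tensor([
    [-9.0, -0.1, -8.0, -7.5, -6.0, -5.0, -4.0, -3.0, -2.5, -2.0],
    [-0.2, -9.0, -8.0, -7.5, -6.0, -5.0, -4.0, -3.0, -2.5, -2.0],
])
classes = pred.argmax(dim=1)  # index of the largest score in each row
print(classes)                # tensor([1, 0])
```

argmax works on the log-probabilities directly because log-softmax is monotonic; the largest log-probability and the largest probability are at the same index.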

images_test_subset = images[25:35].squeeze()
labels_test_subset = pred[25:35]
fig, axes = plt.subplots(2, 5, figsize=(12, 6))
fig.suptitle('test set display')
for j in range(10):
    row, col = divmod(j, 5)
    ax = axes[row, col]
    ax.imshow(images_test_subset[j], cmap='gray')
    ax.title.set_text(f'Label: {labels_test_subset[j].item()}')
    ax.axis('off')
plt.show()

Display a few of the test-set predictions as images.

        if i == 0:
            result = pred
        else:
            result = torch.cat((result, pred), dim=0)

Concatenate the predictions of every batch as the test loader is traversed.

Key data at this point:
result{Tensor:(28000,)}
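The if i == 0 / torch.cat pattern above grows the result one batch at a time. An equivalent idiom collects the batches in a list and concatenates once at the end; sketched here with dummy zero batches matching the 438-batch split from the trace (437 full batches of 64, one final batch of 32):

```python
import torch

# Hypothetical per-batch predictions: 437 batches of 64 plus one of 32 = 28000.
batches = [torch.zeros(64, dtype=torch.long)] * 437 + [torch.zeros(32, dtype=torch.long)]

preds = []                        # append each batch's predictions...
for pred in batches:
    preds.append(pred)
result = torch.cat(preds, dim=0)  # ...and concatenate once at the end

print(result.shape)  # torch.Size([28000])
```

Concatenating once avoids repeatedly reallocating the growing tensor.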

result = result[0:28000]

Keep only the first 28000 entries, as a safeguard: since the final batch contains 32 samples (437*64 + 32 = 28000), the concatenated result already has exactly 28000 rows, so the slice changes nothing.

Key data at this point:
result{Tensor:(28000,)}

This is the final classification result; next it is saved into the submission-template CSV file.

    print("take a look at the result")
    print(result)
    numpy_array = result.numpy()
    csv_file_path = r"..\data\sample_submission.csv"
    df = pd.read_csv(csv_file_path)
    df.iloc[:, 1] = numpy_array.flatten()
    df.to_csv(csv_file_path, index=False)
    print("saved successfully")

The filled-in sample_submission.csv then begins:

ImageId,Label
1,2
2,0
3,9
4,9
5,3
6,7
7,0
8,3
9,0
10,3
11,5
12,7
13,4
14,0
15,4
16,3
17,3
18,1

Submission complete; the score was 0.98935 (with num_epochs = 100).
