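Inspiration: Professor Hung-yi Lee's machine learning course.
The idea is to replace the last fully connected layer of AlexNet so that the network can tell Pokémon from Digimon.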
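Dataset preparation and preprocessing
- Download images of Pokémon and Digimon and split them into a training set and a test set.
- Once the images are collected, use transforms to convert them to tensors and normalize them. I load the dataset with the ImageFolder class from PyTorch's torchvision module.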
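The split into a training set and a test set can be scripted. The following is a minimal sketch using only the standard library. The helper name `split_into_train_test`, the source folder layout (one subfolder per class), and the 80/20 ratio are all assumptions for illustration, not the split actually used in this post.

```python
import random
import shutil
from pathlib import Path


def split_into_train_test(source_dir, output_dir, test_fraction=0.2, seed=0):
    """Copy source_dir/<class>/* into output_dir/train/<class> and output_dir/test/<class>.

    Hypothetical helper: the folder names and the default ratio are assumptions.
    Returns the number of train/test images per class.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for class_dir in sorted(p for p in Path(source_dir).iterdir() if p.is_dir()):
        images = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(images)
        n_test = int(len(images) * test_fraction)
        for index, image in enumerate(images):
            # The first n_test shuffled images go to test, the rest to train
            split = "test" if index < n_test else "train"
            target = Path(output_dir) / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(image, target / image.name)
        counts[class_dir.name] = {"train": len(images) - n_test, "test": n_test}
    return counts
```

Calling `split_into_train_test("raw_images", "Dataset")` on a hypothetical `raw_images` folder with one subfolder per class would produce `Dataset/train` and `Dataset/test`, each with one subfolder per class, which is the layout ImageFolder expects.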
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import Dataset

# Resize, crop to 224x224, convert to a tensor, then normalize each channel
transform = transforms.Compose(
    [transforms.Resize(256),
     transforms.CenterCrop(224),
     transforms.ToTensor(),
     transforms.Normalize(mean=[0.485, 0.456, 0.406],
                          std=[0.229, 0.224, 0.225])])
# I train on a GPU, so pin_memory=True tells the DataLoader to load batches into
# pinned (page-locked) memory, which speeds up the transfer to the GPU
trainset = torchvision.datasets.ImageFolder(root='Dataset/train', transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True, num_workers=4, pin_memory=True)
testset = torchvision.datasets.ImageFolder(root='Dataset/test', transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False, num_workers=4, pin_memory=True)
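To make the normalization step concrete, here is a small worked example in plain Python rather than PyTorch. It applies the same arithmetic to a single RGB pixel: ToTensor scales 0..255 to 0..1, then Normalize computes (x - mean) / std per channel. The mean and std values are the ones from the transform; the function name `normalize_pixel` is only for illustration.

```python
# Per-channel statistics passed to transforms.Normalize in the post
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb_uint8):
    """Scale one RGB pixel to 0..1, then standardize each channel."""
    scaled = [value / 255.0 for value in rgb_uint8]  # what ToTensor does to 8-bit images
    return [(x - m) / s for x, m, s in zip(scaled, MEAN, STD)]


white = normalize_pixel([255, 255, 255])
black = normalize_pixel([0, 0, 0])
print([round(v, 3) for v in white])
print([round(v, 3) for v in black])
```

After this step the inputs are no longer in the 0..1 range: dark pixels map to negative values and bright pixels to values above 2. That is the input range the pretrained weights were trained with.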
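Building the model
Load the pretrained AlexNet model from torchvision and replace its last fully connected layer, which outputs 1000 classes, with one that outputs 2 classes.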
import torchvision.models as models
from torch import nn, optim
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)  # load the ImageNet-pretrained weights
num_features = model.classifier[-1].in_features
model.classifier[-1] = torch.nn.Linear(num_features, 2)
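A quick bit of arithmetic shows how small the new output layer is. This sketch assumes the 4096 input features of the last classifier layer in torchvision's AlexNet; the helper `linear_param_count` is only for illustration.

```python
def linear_param_count(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


IN_FEATURES = 4096  # assumed in_features of AlexNet's last classifier layer
original_head = linear_param_count(IN_FEATURES, 1000)  # 1000-class ImageNet head
new_head = linear_param_count(IN_FEATURES, 2)          # 2-class head used here
print(original_head, new_head)
```

Only the new layer starts from random initialization. Every other layer keeps its pretrained weights, and because the optimizer receives `model.parameters()`, all layers are still updated during training.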
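Training the model
The hyperparameters are 20 epochs and a learning rate of 0.001.
The model is trained with the stochastic gradient descent (SGD) optimizer and the cross-entropy loss.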
model = model.to(device)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
n_epochs = 20
for epoch in range(n_epochs):
    running_loss = 0.0
    model.train()
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        # ToTensor already yields float32 tensors, so only the device transfer is needed
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        # On the last batch of the epoch, print the mean loss over all batches
        if i == len(trainloader) - 1:
            print('[%d, %5d] loss: %.5f' % (epoch, i, running_loss / len(trainloader)))
# torch.save(model.state_dict(), "saveModel.pt")
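To show what the loss and the optimizer compute, here is a minimal plain-Python sketch of both: cross-entropy from raw logits, and one SGD-with-momentum update in the form PyTorch uses when dampening is 0 and Nesterov is off. The function names and the toy objective f(w) = w**2 are assumptions for illustration; the real training loop works on tensors and lets autograd supply the gradients.

```python
import math


def cross_entropy(logits, label):
    """Negative log-softmax of the true class, computed stably via log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """One update: velocity accumulates gradients, the parameter moves against it."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


# Two equal logits mean the model is undecided between the two classes,
# so the loss equals log(2)
undecided_loss = cross_entropy([0.0, 0.0], label=0)

# Toy objective f(w) = w**2 with gradient 2*w, starting from w = 1.0
w, v = 1.0, 0.0
for _ in range(2):
    w, v = sgd_momentum_step(w, grad=2.0 * w, velocity=v)
print(undecided_loss, w, v)
```

A useful sanity check follows from this: at the start of training, a 2-class model that has learned nothing yet should report a loss near log(2), roughly 0.693.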
Testing the model
correct = 0
total = 0
model.eval()
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        # The index of the largest logit is the predicted class
        _, predicted = torch.max(outputs, 1)
        print("predicte_mon", predicted)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy : %d %%' % (100 * correct / total))
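Accuracy output (my test set contains only 30 images, so the accuracy comes out very high and should be read as a rough estimate):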
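With only 30 test images, each image moves the accuracy by about 3.3 percentage points, so the reported number is coarse. One way to see how coarse is a confidence interval. The following is a minimal sketch of the Wilson score interval at roughly 95% confidence. The input of 30 correct out of 30 is a hypothetical example, not a result reported in this post.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for an accuracy of correct/total."""
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical input: every one of 30 test images classified correctly
low, high = wilson_interval(correct=30, total=30)
print(low, high)
```

Even a perfect score on 30 images leaves a lower bound noticeably below 100%, so a larger test set would be needed to distinguish a very good model from a merely good one.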