I came across this training example in Morvan Zhou's (莫烦Python) course, but it only exists in a TensorFlow version, so I tried to reimplement it in PyTorch. For project background see: 机器学习实战 | 莫烦Python (mofanpy.com)
GitHub source (TF version): https://github.com/MorvanZhou/train-classifier-from-scratch
I won't repeat what the course already explains; it's best to watch those two lessons before reading this post.
The data_processing module is unchanged, copied verbatim:
import pandas as pd
from urllib.request import urlretrieve


def load_data(download=True):
    # download data from: http://archive.ics.uci.edu/ml/datasets/Car+Evaluation
    if download:
        data_path, _ = urlretrieve("http://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data", "car.csv")
        print("Downloaded to car.csv")
    # use pandas to view the data structure
    col_names = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]
    data = pd.read_csv("car.csv", names=col_names)
    return data


def convert2onehot(data):
    # convert data to one-hot representation
    return pd.get_dummies(data, prefix=data.columns)


if __name__ == "__main__":
    data = load_data(download=False)
    new_data = convert2onehot(data)
    print(data.head())
    print("\nNum of data: ", len(data), "\n")  # 1728
    # view data values
    for name in data.keys():
        print(name, pd.unique(data[name]))
    print("\n", new_data.head(3))
    new_data.to_csv("car_onehot.csv", index=False)
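To see what `convert2onehot` produces, here is a minimal sketch of `pd.get_dummies` on a toy two-column frame (the column names mimic the car dataset; the values are hypothetical, not the real data):

```python
import pandas as pd

# Toy frame with two categorical columns, like a slice of the car data.
df = pd.DataFrame({
    "buying": ["vhigh", "low"],
    "class": ["unacc", "acc"],
})

# get_dummies expands each categorical column into one 0/1 column per
# category, named "{prefix}_{category}" (categories sorted lexically).
onehot = pd.get_dummies(df, prefix=df.columns)
print(list(onehot.columns))
# ['buying_low', 'buying_vhigh', 'class_acc', 'class_unacc']
```

On the full dataset this yields 25 columns: 21 for the six feature columns plus 4 for the `class` column, which is why the network below takes 21 inputs and emits 4 outputs.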
Next we build the network. Since there are only fully connected layers and no convolutions, nn.Linear is all we need:
from torch import nn


class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer = nn.Sequential(
            nn.Linear(21, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, 4)
        )

    def forward(self, x):
        # x = x.view(x.size(0), -1)
        x = self.layer(x)
        return x
In a typical convolutional network you would need x = x.view(x.size(0), -1) to flatten the convolutional feature maps into a vector before feeding the FC layers; here that is unnecessary. The car features one-hot encode into 21 values, so the input layer has 21 neurons, followed by two hidden layers of 128 neurons each, and an output layer of 4 neurons for the four possible classes.
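A quick shape check confirms the wiring (this is the same stack of layers as the class above, rebuilt standalone with nn.Sequential; the batch of 8 random inputs is just a placeholder):

```python
import torch
from torch import nn

# Same architecture as the CNN class above (which is really a plain MLP):
# 21 one-hot input features -> 128 -> 128 -> 4 class scores.
model = nn.Sequential(
    nn.Linear(21, 128),
    nn.ReLU(inplace=True),
    nn.Linear(128, 128),
    nn.ReLU(inplace=True),
    nn.Linear(128, 4),
)

x = torch.randn(8, 21)   # hypothetical batch of 8 samples
out = model(x)
print(out.shape)         # torch.Size([8, 4])
```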
Next, the model module, where training happens.
import data_processing
import numpy as np
data = data_processing.load_data(download=False)
new_data = data_processing.convert2onehot(data)
new_data = new_data.values.astype(np.float32)
np.random.shuffle(new_data)
sep = int(0.7*len(new_data))
train_data = new_data[:sep]
test_data = new_data[sep:]
sep splits the data 70/30: 70% for training, 30% for testing.
Define the hyperparameters. We train with plain SGD, so the optimizer itself only needs a learning rate:
batch_size = 64
learning_rate = 0.1
num_epochs = 100
Wrap the data with the DataLoader class:
from torch.utils.data import DataLoader
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True)
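One detail worth knowing: when DataLoader is handed a plain numpy array like this, its default collate function stacks the rows into a single float tensor per batch, so each batch can be sliced directly into features and labels. A small sketch with hypothetical random data of the same 25-column shape:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader

# Hypothetical stand-in for new_data: 100 rows, 25 columns
# (21 one-hot features + 4 one-hot class columns).
fake = np.random.rand(100, 25).astype(np.float32)
loader = DataLoader(fake, batch_size=64, shuffle=True)

batch = next(iter(loader))          # already a torch.Tensor, no conversion needed
print(type(batch), batch.shape)     # <class 'torch.Tensor'> torch.Size([64, 25])
img, label = batch[:, :21], batch[:, 21:]
```

Because the batch is already a tensor, wrapping it again with torch.tensor(...) in the training loop is unnecessary.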
Define the loss function. There is a pitfall here: the obvious choice would be
criterion = nn.CrossEntropyLoss()
but writing loss = criterion(out, label) then raises an error, because CrossEntropyLoss (at least in the PyTorch version used here) cannot accept one-hot labels; it expects integer class indices. For example, with four classes 0, 1, 2, 3, the corresponding one-hot labels would be [1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1]. So the loss is instead computed manually from log_softmax:
N = label.size(0)
out = model_train(img)
log_prob = torch.nn.functional.log_softmax(out, dim=1)
loss = -torch.sum(log_prob * label) / N
print_loss = loss.data.item()
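An alternative (a sketch, using hypothetical random logits) is to convert the one-hot labels back to class indices with argmax, after which nn.CrossEntropyLoss works directly; for one-hot labels it gives the same value as the manual log_softmax loss above:

```python
import torch
from torch import nn

out = torch.randn(5, 4)                               # hypothetical logits, batch of 5
onehot = torch.eye(4)[torch.tensor([0, 2, 1, 3, 2])]  # one-hot labels

# Recover integer class indices 0..3 from the one-hot rows.
idx = onehot.argmax(dim=1)
ce = nn.CrossEntropyLoss()(out, idx)

# The manual loss used in this post: mean over the batch of
# -log p(true class), written via element-wise product with the one-hot.
manual = -(torch.log_softmax(out, dim=1) * onehot).sum() / out.size(0)
print(torch.allclose(ce, manual))  # True
```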
The optimizer is straightforward:
optimizer = optim.SGD(model_train.parameters(), lr=learning_rate)
Then the usual three lines:
optimizer.zero_grad()
loss.backward()
optimizer.step()
Finally, visualize the training process with the visdom library:
import visdom

vis = visdom.Visdom()
plot_data = {'X': [], 'Y': []}
plot_data['Y'].append(print_loss)
plot_data['X'].append(epoch)
vis.line(X=np.array(plot_data['X']), Y=np.array(plot_data['Y']),
         opts={'title': 'loss over time',
               'legend': ['loss'],
               'xlabel': 'epoch',
               'ylabel': 'loss'})
The resulting loss curve from visdom is not reproduced here.
The complete code:
import numpy as np
from torch import nn, optim
import torch
import data_processing
from torch.utils.data import DataLoader
import cnn
import visdom

vis = visdom.Visdom()

data = data_processing.load_data(download=False)
new_data = data_processing.convert2onehot(data)
new_data = new_data.values.astype(np.float32)
np.random.shuffle(new_data)
sep = int(0.7 * len(new_data))
train_data = new_data[:sep]
test_data = new_data[sep:]

batch_size = 64
learning_rate = 0.1
num_epochs = 100

train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True)

model_train = cnn.CNN()
model_train = model_train.cuda()
optimizer = optim.SGD(model_train.parameters(), lr=learning_rate)

plot_data = {'X': [], 'Y': []}
for epoch in range(num_epochs):
    for data in train_loader:
        # DataLoader's default collate already turns the numpy rows into a
        # float tensor, so the batch is sliced directly: first 21 columns
        # are the one-hot features, the last 4 the one-hot label.
        img, label = data[:, :21], data[:, 21:]
        img, label = img.cuda(), label.cuda()
        N = label.size(0)
        out = model_train(img)
        log_prob = torch.nn.functional.log_softmax(out, dim=1)
        loss = -torch.sum(log_prob * label) / N
        print_loss = loss.item()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    plot_data['Y'].append(print_loss)
    plot_data['X'].append(epoch)
    if epoch % 50 == 0:
        print('epoch: {}, loss: {:.4}'.format(epoch, print_loss))

# plot once after training finishes
vis.line(X=np.array(plot_data['X']), Y=np.array(plot_data['Y']),
         opts={'title': 'loss over time',
               'legend': ['loss'],
               'xlabel': 'epoch',
               'ylabel': 'loss'})
The test module will follow in the next post.