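Classification
We use deep learning to train a quadrant classifier: given the coordinates of a point, it decides which quadrant the point lies in. For example, the point (1, -1) lies in the fourth quadrant.
This post does not go into detail on loss functions and evaluation methods; a later post will cover them in depth.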
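Data acquisition
As before, we generate the data randomly. To be closer to practice, this time we separate a training set from a test set and wrap the data in a class.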
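import numpy as np
import matplotlib.pyplot as plt
import torch.nn.functional as F
import torch
from torch.utils import data
from torch import nn

class MyData:
    def __init__(self, _features=None, _labels=None):
        # None defaults avoid sharing one mutable list between instances
        self.features = [] if _features is None else _features
        self.labels = [] if _labels is None else _labels

    def quadrant(self, x, y):  # Decide the quadrant of a point, encoded as a one-hot vector
        # Points on an axis are assigned by the first branch that matches
        if x >= 0 and y >= 0:
            return [1,0,0,0]  # first quadrant
        if x >= 0 and y <= 0:
            return [0,0,0,1]  # fourth quadrant
        if x <= 0 and y >= 0:
            return [0,1,0,0]  # second quadrant
        return [0,0,1,0]      # third quadrant

    def get_data(self, size):
        # Sample `size` points from a standard normal distribution
        self.features = torch.normal(0, 1, (size, 2))
        self.labels = torch.tensor([self.quadrant(x[0], x[1]) for x in self.features]).reshape(size,4).to(torch.float32)
        # Change a tensor's dtype with the .to(torch.float32) method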
Batch data iterator
def load_array(data_arrays, batch_size, is_train=True):
    """Construct a PyTorch data iterator."""
    dataset = data.TensorDataset(*data_arrays)
    return data.DataLoader(dataset, batch_size, shuffle=is_train)

# Define the two datasets
train_set = MyData()
test_set = MyData()
# Generate the data
train_set.get_data(40000)
test_set.get_data(10000)
data_iter = load_array((train_set.features, train_set.labels), batch_size=100)
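To see what such an iterator yields, the sketch below builds a small synthetic dataset in the same spirit and pulls one batch from it. The names `features`, `class_index`, `labels`, and `loader` are illustrative and not part of the original code, and the vectorized labelling is an assumed stand-in for the `quadrant` method, not the author's implementation.

```python
import torch
import torch.nn.functional as F
from torch.utils import data

torch.manual_seed(0)

# Random 2-D points, sampled the same way as in get_data
features = torch.normal(0, 1, (1000, 2))
x, y = features[:, 0], features[:, 1]

# Class index per point: 0 -> Q1, 1 -> Q2, 2 -> Q3, 3 -> Q4 (hypothetical helper logic)
class_index = torch.zeros(len(features), dtype=torch.long)
class_index[(x < 0) & (y >= 0)] = 1
class_index[(x < 0) & (y < 0)] = 2
class_index[(x >= 0) & (y < 0)] = 3

# One-hot float labels, matching the format produced by MyData
labels = F.one_hot(class_index, num_classes=4).to(torch.float32)

loader = data.DataLoader(data.TensorDataset(features, labels),
                         batch_size=100, shuffle=True)
X_batch, y_batch = next(iter(loader))
print(X_batch.shape, y_batch.shape)  # torch.Size([100, 2]) torch.Size([100, 4])
```

Each batch pairs 100 points with their 100 one-hot labels, which is the shape the training loop expects.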
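Training process
The procedure is similar to the regression case; pay attention to the number of layers in net, the number of input features, and so on.
Note that the code below uses the softmax function and the torch.max function: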
1. The softmax function
For a sequence $z_1, z_2, \dots, z_k$, its softmax is defined as
$\sigma(z_i)=\frac{e^{z_i}}{\sum_{1\leq j \leq k} e^{z_j}}$
With this definition the softmax values sum to 1. In a multi-class problem, we can take the position of the largest softmax value as the predicted class. For example:
>>> A = torch.tensor([[1,2,3,4],[2,3,4,5]]).to(torch.float)
>>> F.softmax(A,dim = 0) # dim=0: softmax down each column
tensor([[0.2689, 0.2689, 0.2689, 0.2689],
[0.7311, 0.7311, 0.7311, 0.7311]])
>>> F.softmax(A,dim = 1) # dim=1: softmax along each row
tensor([[0.0321, 0.0871, 0.2369, 0.6439],
[0.0321, 0.0871, 0.2369, 0.6439]])
>>> B = torch.tensor([1,2,3,4]).to(torch.float) # a 1-D tensor is a special case
>>> F.softmax(B) # emits a warning
<stdin>:1: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
tensor([0.0321, 0.0871, 0.2369, 0.6439])
>>> F.softmax(B,dim = 0) # always specify dim=0 explicitly
tensor([0.0321, 0.0871, 0.2369, 0.6439])
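To connect the formula with the library call, here is a minimal sketch that computes softmax directly from the definition and compares it with `F.softmax`. The helper `manual_softmax` is a hypothetical name introduced for illustration; subtracting the maximum before exponentiating is a common stability trick and is an addition to the plain formula, although it does not change the result.

```python
import torch
import torch.nn.functional as F

def manual_softmax(z: torch.Tensor, dim: int) -> torch.Tensor:
    """Softmax from the definition: exp(z_i) / sum_j exp(z_j) along `dim`."""
    # Shifting by the max leaves the ratio unchanged but avoids overflow in exp
    shifted = z - z.max(dim=dim, keepdim=True).values
    exp_z = torch.exp(shifted)
    return exp_z / exp_z.sum(dim=dim, keepdim=True)

A = torch.tensor([[1., 2., 3., 4.], [2., 3., 4., 5.]])
row_softmax = manual_softmax(A, dim=1)

print(torch.allclose(row_softmax, F.softmax(A, dim=1)))  # True
print(row_softmax.sum(dim=1))  # every row sums to 1
```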
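1.1 log_softmax
log_softmax takes the log of the softmax result: F.log_softmax(A, dim=1) is mathematically equivalent to torch.log(F.softmax(A, dim=1)), and as with softmax, dim should be given explicitly.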
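The equivalence can be checked numerically with a short sketch. The second half uses a deliberately extreme input (the tensor `big`, chosen here only for illustration) to show why the fused call is usually preferred: the two-step version can underflow to log(0).

```python
import torch
import torch.nn.functional as F

A = torch.tensor([[1., 2., 3., 4.], [2., 3., 4., 5.]])
direct = F.log_softmax(A, dim=1)
two_step = torch.log(F.softmax(A, dim=1))
print(torch.allclose(direct, two_step))  # True

# With a very large gap between logits, softmax underflows to exactly 0
big = torch.tensor([[1000., 0.]])
print(F.log_softmax(big, dim=1))         # both entries are finite
print(torch.log(F.softmax(big, dim=1)))  # second entry is -inf
```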
2. The max function
max returns the maximum value; torch adds a few extra behaviours on top of that.
>>> A = torch.randn(2,3)
>>> A
tensor([[ 0.5118, 0.2640, -1.3412],
[-0.6643, -0.3805, -0.2402]])
>>> torch.max(A) # max over all elements
tensor(0.5118)
>>> torch.max(A,dim = 0) # max of each column, together with its position
torch.return_types.max(
values=tensor([ 0.5118, 0.2640, -0.2402]),
indices=tensor([0, 0, 1]))
>>> torch.max(A,dim = 1) # max of each row, together with its position
torch.return_types.max(
values=tensor([ 0.5118, -0.2402]),
indices=tensor([0, 2]))
Index the result with [0] to get the max values, or with [1] to get the indices. For example:
>>> torch.max(A,dim = 1)
torch.return_types.max(
values=tensor([ 0.5118, -0.2402]),
indices=tensor([0, 2]))
>>> torch.max(A,dim = 1)[0]
tensor([ 0.5118, -0.2402])
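Besides positional indexing, the returned object can be unpacked or read through its named fields, and `torch.argmax` gives the indices directly. A small sketch, reusing the values printed above as a fixed tensor:

```python
import torch

A = torch.tensor([[0.5118, 0.2640, -1.3412],
                  [-0.6643, -0.3805, -0.2402]])

result = torch.max(A, dim=1)
values, indices = result               # tuple-style unpacking
print(result.values, result.indices)   # named-field access

# argmax returns only the positions, i.e. the same as torch.max(A, dim=1)[1]
print(torch.equal(indices, torch.argmax(A, dim=1)))  # True
```

For classification, only the indices matter, since they are the predicted class labels.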
NN code
net = nn.Sequential(nn.Linear(2, 18), nn.ReLU(),
                    nn.Linear(18, 18), nn.Tanh(), nn.Linear(18, 4))
loss = nn.MSELoss()
trainer = torch.optim.SGD(net.parameters(), lr=0.03)
for epoch in range(5):
    for X, y in data_iter:
        predict = net(X)
        _loss = loss(predict, y)
        trainer.zero_grad()
        _loss.backward()
        trainer.step()
    # Report the loss on the full training set after each epoch
    print('epoch {},loss = {}'.format(epoch+1,loss(net(train_set.features),train_set.labels).data ))

# Evaluate on the test set
prediction = net(test_set.features)
# Normalize over the 4 classes of each sample (dim=1), not across samples (dim=0)
prediction = F.softmax(prediction,dim = 1)
prediction = torch.max(prediction,1)[1]        # predicted class index
true_label = torch.max(test_set.labels, 1)[1]  # true class index from the one-hot label
print("prediction = ",prediction)
print("true_label = ",true_label)
from sklearn.metrics import accuracy_score
print(accuracy_score(y_true = true_label, y_pred = prediction)) # print the accuracy
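The evaluation step can also be written with torch alone. The sketch below uses random tensors as stand-ins for the network output and the true classes (`logits` and `true_label` here are illustrative, not the values from the run above). Because softmax preserves the ordering within each row, taking the arg max of the raw outputs should select the same class as taking it after a row-wise softmax.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(8, 4)              # stand-in for net(test_set.features)
true_label = torch.randint(0, 4, (8,))  # stand-in for the true class indices

pred_from_logits = logits.argmax(dim=1)
pred_from_probs = F.softmax(logits, dim=1).argmax(dim=1)

# Row-wise softmax keeps the ordering, so both routes pick the same class
print(torch.equal(pred_from_logits, pred_from_probs))

# Accuracy: fraction of samples whose predicted class matches the true one
accuracy = (pred_from_logits == true_label).float().mean().item()
print(accuracy)
```

This gives the same number as `accuracy_score` without needing scikit-learn.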
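Training results
Because the task is simple, training finishes quickly: after only five epochs the model reaches almost 98% accuracy on the test set.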
epoch 1,loss = 0.07902326434850693
epoch 2,loss = 0.06842370331287384
epoch 3,loss = 0.061164308339357376
epoch 4,loss = 0.05610521882772446
epoch 5,loss = 0.052224207669496536
prediction = tensor([2, 0, 0, ..., 0, 1, 3])
true_label = tensor([3, 0, 0, ..., 0, 1, 3])
0.9788
Introduction to loss functions
1. MSE (mean squared error) loss
Single sample: with predicted value $\hat{y}$ and true value $y$, $MSE(\hat{y},y)=(\hat{y}-y)^2$
Multiple samples: $MSE=\frac{1}{N}\sum_{i=1}^N(\hat{y}_i-y_i)^2$
Example: in a multi-class problem with predicted probabilities $[0.3,0.3,0.4]$ and true label $[0,0,1]$, $MSE=\frac{(0.3-0)^2+(0.3-0)^2+(0.4-1)^2}{3}=0.18$
>>> loss = nn.MSELoss()
>>> A = torch.tensor([0.3,0.3,0.4]).to(torch.float)
>>> B = torch.tensor([0,0,1]).to(torch.float)
>>> loss(A,B)
tensor(0.1800)
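The same number can be reproduced by hand, which makes the averaging over N explicit. This is a minimal sketch; the variable names `pred`, `target`, and `manual` are illustrative.

```python
import torch
from torch import nn

pred = torch.tensor([0.3, 0.3, 0.4])
target = torch.tensor([0., 0., 1.])

# Squared differences averaged over the N = 3 entries
manual = ((pred - target) ** 2).mean()
print(torch.allclose(manual, nn.MSELoss()(pred, target)))  # True

# reduction='sum' skips the division by N: 0.09 + 0.09 + 0.36
total = nn.MSELoss(reduction='sum')(pred, target)
print(total)
```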
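2. Cross-entropy loss
The computation in nn.CrossEntropyLoss() differs slightly from the familiar $-\sum_i p_i\log(q_i)$, where $p$ is the true distribution and $q$ the predicted one: it takes raw, unnormalized scores as input, applies log_softmax to them internally, and by default averages the result over the batch.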
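A short sketch of that difference, assuming class-index targets and the default mean reduction: the built-in loss matches a manual log_softmax followed by picking the log-probability of the true class. The tensors `logits` and `target` are made-up illustrative values.

```python
import torch
import torch.nn.functional as F
from torch import nn

logits = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.5, 0.3]])  # raw scores, not probabilities
target = torch.tensor([0, 1])             # true class index for each row

builtin = nn.CrossEntropyLoss()(logits, target)

# Manual version: log_softmax, pick the true class, negate, average over the batch
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(target)), target].mean()

print(torch.allclose(builtin, manual))  # True
```

One practical consequence is that a network trained with this loss should output raw scores; adding a softmax layer before the loss would apply softmax twice.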