Deep Learning Lab 1: PyTorch Practice and Feedforward Neural Networks

吴大炮

Last modified: 2024-05-22 14:20:42

First published: 2021-09-13 20:47:38

Article link: https://blog.csdn.net/weixin_44645198/article/details/120110641

1. Basic PyTorch operations

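1. Use Tensor to initialize a 1×3 matrix M and a 2×1 matrix N, and subtract one from the other in three different ways. Report the results, analyze how the three ways differ (if an error occurs, analyze its cause), and state what happens during the computation.
2. (a) Use Tensor to create two random matrices P and Q of sizes 3×2 and 4×2, drawn from a normal distribution with mean 0 and standard deviation 0.01; (b) reshape the matrix Q to obtain its transpose Q^T; (c) compute the inner product of P and Q^T.
3. Given y3 = y1 + y2 = x^2 + x^3 with x = 1, use what you have learned about Tensor to compute the gradient of y3 with respect to x, i.e. dy3/dx. Gradient tracking must be interrupted while x^3 is computed; observe the result and explain it. Hint: torch.no_grad() can be used, for example:
with torch.no_grad():
    y2 = x ** 3
Objective: understand the concept of a tensor, become fluent with basic PyTorch operations and the broadcasting mechanism, and learn how to compute and track gradients with the PyTorch framework.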

1 Subtraction

The snippets in this section assume that import torch has already been run in the notebook.

M = torch.rand(1,3)
N = torch.rand(2,1)
print("Tensor M:\n",M)
print("Tensor N:\n",N)
# Three different forms
# Form 1: the - operator
print("Form 1, M-N:\n",M-N)
# Form 2: torch.sub
print("Form 2, torch.sub(M,N):\n",torch.sub(M,N))
# Form 3: in-place operation
N.sub_(M)
print("Form 3, in-place:\n",N)

Result

Tensor M:
 tensor([[0.5758, 0.6711, 0.9729]])
Tensor N:
 tensor([[0.1538],
        [0.5465]])
Form 1, M-N:
 tensor([[0.4220, 0.5173, 0.8191],
        [0.0293, 0.1246, 0.4264]])
Form 2, torch.sub(M,N):
 tensor([[0.4220, 0.5173, 0.8191],
        [0.0293, 0.1246, 0.4264]])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-3-8d25d635ba34> in <module>
      9 print("Form 2, torch.sub(M,N):\n",torch.sub(M,N))
     10 # Form 3: in-place operation
---> 11 N.sub_(M)
     12 print("Form 3, in-place:\n",N)

RuntimeError: output with shape [2, 1] doesn't match the broadcast shape [2, 3]
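
The task asks why the three forms behave differently. The first two forms allocate a new tensor, so M (1×3) and N (2×1) are both broadcast to 2×3 before the element-wise subtraction. The in-place form has to write the result back into N, whose shape [2, 1] cannot hold a [2, 3] result, hence the RuntimeError. Note also that N.sub_(M) would compute N - M rather than M - N. The following is a minimal pure-Python sketch of the shape rule only (no PyTorch needed); the helper names broadcast_shape and can_write_in_place are hypothetical and not part of any library.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    # Pad the shorter shape with leading 1s so both have the same length.
    n = max(len(a), len(b))
    a = (1,) * (n - len(a)) + tuple(a)
    b = (1,) * (n - len(b)) + tuple(b)
    out = []
    for da, db in zip(a, b):
        # Two sizes are compatible when they are equal or one of them is 1.
        if da == db or da == 1 or db == 1:
            out.append(da if db == 1 else db)
        else:
            raise ValueError(f"cannot broadcast {a} with {b}")
    return tuple(out)


def can_write_in_place(target, other):
    """An in-place op stores its result in `target`, so the broadcast
    shape must be exactly the shape of `target`."""
    return broadcast_shape(target, other) == tuple(target)


m_shape, n_shape = (1, 3), (2, 1)
print(broadcast_shape(m_shape, n_shape))     # (2, 3)
print(can_write_in_place(n_shape, m_shape))  # False: N cannot hold a 2x3 result
print(can_write_in_place(m_shape, n_shape))  # False: neither can M
```

Under this rule an in-place subtraction only works when the left operand already has the broadcast shape, for example a 2×3 tensor minus M or minus N.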

2. Generating tensors and computing with them

P = torch.normal(0.0,0.01,(3,2))
print("Tensor P:\n",P)
Q = torch.normal(0.0,0.01,(4,2))
print("Tensor Q:\n",Q)
# Transpose Q
Qt = Q.t()
print("Transpose of Q:\n",Qt)
# Matrix product of P and Q^T (the "inner product" asked for in the task)
print("Inner product result:\n",torch.matmul(P,Qt))

Result

Tensor P:
 tensor([[-0.0144,  0.0003],
        [-0.0019,  0.0103],
        [ 0.0060,  0.0057]])
Tensor Q:
 tensor([[-0.0015, -0.0031],
        [-0.0064, -0.0060],
        [-0.0004, -0.0045],
        [ 0.0125,  0.0054]])
Transpose of Q:
 tensor([[-0.0015, -0.0064, -0.0004,  0.0125],
        [-0.0031, -0.0060, -0.0045,  0.0054]])
Inner product result:
 tensor([[ 2.0668e-05,  9.0359e-05,  3.9774e-06, -1.7820e-04],
        [-2.9422e-05, -4.9168e-05, -4.5161e-05,  3.0705e-05],
        [-2.6908e-05, -7.2489e-05, -2.7620e-05,  1.0531e-04]])
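
To make the shapes explicit: P is 3×2 and Q^T is 2×4, so the product is 3×4, and entry (i, j) is the dot product of row i of P with row j of Q. The sketch below shows this with small integer matrices in plain Python; the values and the helper names matmul and transpose are made up for illustration and are not the tensors printed above.

```python
def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Matrix product of a (r x k) and b (k x c)."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


P = [[1, 2], [3, 4], [5, 6]]            # 3 x 2
Q = [[1, 0], [0, 1], [1, 1], [2, -1]]   # 4 x 2
out = matmul(P, transpose(Q))           # (3 x 2) @ (2 x 4) -> 3 x 4
for row in out:
    print(row)
# [1, 2, 3, 0]
# [3, 4, 7, 2]
# [5, 6, 11, 4]
```

Multiplying P by Q directly would fail the inner-dimension check (2 versus 4), which is why the transpose is needed.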

3. Gradient with tracking interrupted for x^3

x = torch.tensor(1.0,requires_grad = True)
y1 = x**2
with torch.no_grad():
    y2 = x**3
    
y3 = y1+y2
y3.backward()
print(x.grad)

Result

tensor(2.)
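
Why the result is 2 and not 5: y2 is computed inside torch.no_grad(), so autograd records no operation for it and treats its value as a constant. Only y1 = x^2 contributes to the gradient, giving dy3/dx = 2x = 2 at x = 1. With full tracking the result would be 2x + 3x^2 = 5. The sketch below checks both numbers with a central finite difference in plain Python; it mimics the effect of no_grad by freezing the value of x^3 and is not how autograd works internally.

```python
def central_diff(f, x, h=1e-6):
    """Numerical derivative of f at x using a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 1.0
y2_frozen = x0 ** 3  # value computed once and then held constant, like under no_grad

grad_with_no_grad = central_diff(lambda x: x ** 2 + y2_frozen, x0)
grad_fully_tracked = central_diff(lambda x: x ** 2 + x ** 3, x0)

print(round(grad_with_no_grad, 4))   # 2.0
print(round(grad_fully_tracked, 4))  # 5.0
```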

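1.2 Logistic regression experiment
Logistic regression and linear regression are both generalized linear models. Logistic regression assumes that the dependent variable y follows a Bernoulli distribution, whereas linear regression assumes that y follows a Gaussian distribution. The two therefore have much in common: remove the sigmoid mapping and logistic regression reduces to linear regression. Logistic regression is built on linear regression, but the sigmoid function introduces a nonlinearity, which is what lets it handle 0/1 classification.
Logistic regression is a classification method for two-class problems. The basic idea is: a. find a suitable hypothesis function, i.e. the classification function, to predict the outcome for the input data; b. construct a cost function, i.e. the loss function, that measures the deviation between the predicted output and the true class of the training data; c. minimize the cost function to obtain the optimal model parameters.
1. Implement logistic regression from scratch (using only Tensor and NumPy), train and test it on a synthetic dataset, and analyze the results in terms of loss and training/test accuracy.
2. Implement logistic regression with torch.nn, train and test it on a synthetic dataset, and analyze the results in terms of loss and training/test accuracy.
Objective: implement logistic regression both from scratch and with torch.nn, to better understand the regression procedure and become familiar with the PyTorch framework.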
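
As a small illustration of steps a and b above, the following pure-Python sketch evaluates a sigmoid hypothesis and the binary cross-entropy cost for single values. The branch in sigmoid and the eps clamp in bce are assumptions added here for numerical safety; they are not part of the lab's implementation.

```python
import math


def sigmoid(z):
    """Map a real score to (0, 1) without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return e / (1.0 + e)


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


print(sigmoid(0.0))                 # 0.5
print(round(bce(0.5, 1), 4))        # 0.6931, i.e. log(2)
print(bce(0.9, 1) < bce(0.6, 1))    # True: a more confident correct prediction costs less
```

Step c then amounts to moving the parameters in the direction that lowers the average of this cost over the training set.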

2. Logistic regression from scratch and with torch.nn

2.1 From-scratch implementation

import torch
import numpy as np
import matplotlib.pyplot as plt

Generate the dataset

num_inputs = 2  
n_data = torch.ones(1000, num_inputs)
x1 = torch.normal(2 * n_data, 1)   # class 1: features centered at (2, 2)
y1 = torch.ones(1000)
x2 = torch.normal(-2 * n_data, 1)  # class 0: features centered at (-2, -2)
y2 = torch.zeros(1000)  
# Split point between the training and test sets
train_index = 800

Concatenate the training set row-wise

trainfeatures = torch.cat((x1[:train_index], x2[:train_index]), 0).type(torch.FloatTensor)  
trainlabels = torch.cat((y1[:train_index], y2[:train_index]), 0).type(torch.FloatTensor)

Test set

testfeatures = torch.cat((x1[train_index:], x2[train_index:]), 0).type(torch.FloatTensor)  
testlabels = torch.cat((y1[train_index:], y2[train_index:]), 0).type(torch.FloatTensor) 
print(len(trainfeatures))

Visualize the training set

plt.scatter(trainfeatures.data.numpy()[:, 0], 
            trainfeatures.data.numpy()[:, 1],
            c=trainlabels.data.numpy(),
            s=5, lw=0, cmap='RdYlGn' ) 
plt.show()

Result

[Figure: scatter plot of the training set; image not preserved]

Read mini-batches of samples

def data_iter(batch_size,features,labels):
    num_examples=len(features)
    indices=list(range(num_examples))
    np.random.shuffle(indices)
    for i in range(0,num_examples,batch_size):
        j=torch.LongTensor(indices[i:min(i+batch_size,num_examples)])
        yield features.index_select(0,j),labels.index_select(0,j)
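
The generator above shuffles the sample indices once per pass and then slices them into consecutive batches, so every sample is visited exactly once per epoch and the last batch may be smaller than batch_size. A minimal pure-Python sketch of just the index logic follows; batch_indices and its seed parameter are hypothetical names introduced for illustration.

```python
import random


def batch_indices(num_examples, batch_size, seed=0):
    """Yield shuffled index lists that together cover range(num_examples)."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)  # fixed seed only to keep the demo repeatable
    for start in range(0, num_examples, batch_size):
        # Slicing past the end is safe, so no explicit min() is needed.
        yield indices[start:start + batch_size]


batches = list(batch_indices(10, 4))
sizes = [len(b) for b in batches]
print(sizes)  # [4, 4, 2]
```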

num_inputs = 2

Initialize the model parameters

w=torch.tensor(np.random.normal(0,0.01,(num_inputs,1)),dtype=torch.float32)
b=torch.zeros(1,dtype=torch.float32)
w.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)

Logistic regression

def logits(X, w, b):  
    y = torch.mm(X, w) + b 
    return 1/(1+torch.exp(-y))  # sigmoid of the linear score

Hand-written binary cross-entropy loss

def logits_loss(y_hat, y):  
    y = y.view(y_hat.size())  
    return  -y.mul(torch.log(y_hat))-(1-y).mul(torch.log(1-y_hat)) 

# Optimizer
def sgd(params, lr, batch_size):  
    for param in params:  
        param.data -= lr * param.grad / batch_size # update through param.data so autograd does not track the step

Test-set accuracy

def evaluate_accuracy():  
    acc_sum,n,test_l_sum = 0.0,0 ,0 
    with torch.no_grad():  # evaluation needs no gradient tracking
        for X,y in data_iter(batch_size, testfeatures, testlabels):  
            y_hat = net(X, w, b)  
            # Compute the loss on the probabilities, before thresholding;
            # thresholded 0/1 values would feed log(0) into the loss
            test_l_sum += loss(y_hat,y).sum().item()  
            y_pred = torch.squeeze(torch.where(y_hat>0.5,torch.tensor(1.0),torch.tensor(0.0)))  
            acc_sum += (y_pred==y).float().sum().item()
            n+=y.shape[0]  
    return acc_sum/n,test_l_sum/n

Start training

lr = 0.0005  
num_epochs = 300
net = logits  
loss = logits_loss
batch_size = 50  
test_acc,train_acc= [],[]
train_loss,test_loss =[],[] 
for epoch in range(num_epochs): # train for num_epochs epochs in total  
    train_l_sum, train_acc_sum,n = 0.0,0.0,0  
    # Each epoch uses every training sample exactly once  
    for X, y in data_iter(batch_size, trainfeatures, trainlabels): # X and y are the features and labels of one mini-batch 
        y_hat = net(X, w, b)  
        l = loss(y_hat, y).sum() # l is the loss over the mini-batch X, y  
        l.backward() # gradient of the mini-batch loss with respect to the parameters  
        sgd([w, b], lr, batch_size) # mini-batch stochastic gradient descent step  
        w.grad.data.zero_() # zero the gradient  
        b.grad.data.zero_() # zero the gradient  
        # Accumulate the loss for this epoch  
        train_l_sum += l.item()  
        # Accumulate the number of correct training predictions  
        y_hat = torch.squeeze(torch.where(y_hat>0.5,torch.tensor(1.0),torch.tensor(0.0)))  
        train_acc_sum += (y_hat==y).sum().item()  
        # Number of samples seen in this epoch 
        n+= y.shape[0]  
        #train_l = loss(net(trainfeatures, w, b), trainlabels)  
    # Compute accuracy and loss on the test set  
    test_a,test_l = evaluate_accuracy()
    test_acc.append(test_a)
    test_loss.append(test_l)
    train_acc.append(train_acc_sum/n)
    train_loss.append(train_l_sum/n)
    print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
          % (epoch + 1, train_loss[epoch], train_acc[epoch], test_acc[epoch]))

Define a plotting helper that is reused in the rest of the post

import matplotlib.pyplot as plt
def Draw_Curve(*args,xlabel = "epoch",ylabel = "loss"):#
    for i in args:
        x = np.linspace(0,len(i[0]),len(i[0]))  
        plt.plot(x,i[0],label=i[1],linewidth=1.5)  
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.legend()
    plt.show()
Draw_Curve([train_loss,"train_loss"])
Draw_Curve([train_acc,"train_acc"],[test_acc,"test_acc"],ylabel = "acc")

[Figure: training loss curve and train/test accuracy curves; image not preserved]

2.2 torch.nn implementation

Only the parts that differ are shown here.

import torch
import numpy as np
import matplotlib.pyplot as plt
import torch.utils.data as Data
from torch.nn import init
from torch import nn

Load the data

batch_size = 50  
# Combine the training features and labels  
dataset = Data.TensorDataset(trainfeatures, trainlabels)  
# Wrap the dataset in a DataLoader  
train_data_iter = Data.DataLoader(dataset=dataset, # a TensorDataset  
                            batch_size=batch_size, # mini-batch size  
                            shuffle=True, # shuffle the data (usually wanted for the training set)  
                            num_workers=0, # worker subprocesses for loading; 0 loads in the main process (the safe choice on Windows)  
                           )  
# Combine the test features and labels  
dataset = Data.TensorDataset(testfeatures, testlabels)  
# Wrap the dataset in a DataLoader  
test_data_iter = Data.DataLoader(dataset=dataset, # a TensorDataset  
                                 batch_size=batch_size, # mini-batch size  
                                 shuffle=True, # shuffling is optional for the test set  
                                 num_workers=0, # same as above  
                                )

Define the model with nn.Module

class LogisticRegression(nn.Module):  
    def __init__(self,n_features):  
        super(LogisticRegression, self).__init__()  
        self.lr = nn.Linear(n_features, 1)  
        self.sm = nn.Sigmoid()  
    
    def forward(self, x): 
        x = self.lr(x)  
        x = self.sm(x)  
        return x

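Instantiate the model

logistic_model = LogisticRegression(num_inputs)

Define the loss function

criterion = nn.BCELoss()

Define the optimizer

optimizer = torch.optim.SGD(logistic_model.parameters(), lr=1e-3)

Initialize the parameters

init.normal_(logistic_model.lr.weight, mean=0, std=0.01)  
init.constant_(logistic_model.lr.bias, val=0) # alternatively, set the data directly: logistic_model.lr.bias.data.fill_(0)  
print(logistic_model.lr.weight)  
print(logistic_model.lr.bias)

The evaluation function is the same as before, except that its forward pass must call logistic_model(X) instead of net(X, w, b).

Start training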

num_epochs = 300 
test_acc,train_acc= [],[]
train_loss,test_loss =[],[] 
for epoch in range( num_epochs):  
    train_l_sum, train_acc_sum,n = 0.0,0.0,0  
    for X, y in train_data_iter:  
        y_hat = logistic_model(X)  
        l = criterion(y_hat, y.view(-1, 1))  
        optimizer.zero_grad() # zero the gradients; equivalent to logistic_model.zero_grad()  
        l.backward()  
        # update model parameters  
        optimizer.step()  
        # Accumulate the loss for this epoch; BCELoss returns the batch mean,  
        # so weight it by the batch size to make train_l_sum/n a per-sample loss  
        train_l_sum += l.item() * y.shape[0]  
        # Accumulate the number of correct training predictions  
        y_hat = torch.squeeze(torch.where(y_hat>0.5,torch.tensor(1.0),torch.tensor(0.0)))  
        train_acc_sum += (y_hat==y).sum().item()  
        # Number of samples seen in this epoch  
        n+= y.shape[0]  
    # Compute accuracy and loss on the test set  
    test_a,test_l = evaluate_accuracy()
    test_acc.append(test_a)
    test_loss.append(test_l)
    train_acc.append(train_acc_sum/n)
    train_loss.append(train_l_sum/n)
    print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
          % (epoch + 1, train_loss[epoch], train_acc[epoch], test_acc[epoch]))

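The training results are plotted the same way as before.
[Figure: training curves for the torch.nn implementation; image not preserved]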

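3. Softmax regression from scratch and with torch.nn

3.1 From-scratch implementation

Note: only the parts of the code that differ are shown.
The softmax regression experiment uses the Fashion-MNIST dataset.

Import the dataset
Download and load the data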

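import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms
from torch import nn

batch_size = 256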

mnist_train = torchvision.datasets.FashionMNIST(root="../data", train=True, download=True, transform=transforms.ToTensor())
mnist_test = torchvision.datasets.FashionMNIST(root="../data", train=False, download=True, transform=transforms.ToTensor())

train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=0)
test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=0)

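The loss is the cross-entropy loss; the optimizer is defined alongside it

# Cross-entropy loss: pick the predicted probability of the true class and take -log
def cross_entropy(y_hat,y):
    return - torch.log(y_hat.gather(1,y.view(-1,1)))
def sgd(params, lr, batch_size):
    for param in params:
        param.data -= lr * param.grad / batch_size # update through param.data so autograd does not track the step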

Initialize the model parameters

# Initialize the model parameters
num_inputs = 784  # each input is a 28x28-pixel image, so the input vector has length 28*28=784
num_outputs = 10  # there are 10 image classes

W = torch.tensor(np.random.normal(0,0.01,(num_inputs,num_outputs)),dtype=torch.float) # weights, 784x10
b = torch.zeros(num_outputs,dtype=torch.float) # bias, one entry per class (length 10)
# Track gradients for the parameters
W.requires_grad_(requires_grad=True)
b.requires_grad_(requires_grad=True)

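Implement the softmax operation

# Softmax over each row
def softmax(X):
    X_exp = X.exp() # exponentiate every element
    partition = X_exp.sum(dim=1, keepdim=True) # sum the exponentials within each row
    return X_exp / partition # divide each element by its row sum, so every row is non-negative and sums to 1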
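
One caveat with exponentiating raw scores is overflow when a score is large. A common remedy, not used in the lab code above, is to subtract the row maximum before exponentiating; this leaves the result unchanged because the common factor cancels in the ratio. A minimal pure-Python sketch for a single row:

```python
import math


def softmax(row):
    """Softmax of one row of scores, shifted by the row max for stability."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]  # largest exponent is exp(0) = 1
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1000.0, 1001.0, 1002.0])  # math.exp(1000.0) alone would overflow
ref = softmax([0.0, 1.0, 2.0])             # same scores shifted by a constant
print([round(p, 4) for p in probs])        # [0.09, 0.2447, 0.6652]
print(probs == ref)                        # True
```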

Model definition

# Model definition: flatten each image, apply the linear layer, then softmax
def net(X):
    return softmax(torch.mm(X.view((-1,num_inputs)),W)+b)

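Define the accuracy evaluation function

This lab's task is classification, so model quality is judged by classification accuracy. The function evaluate_accuracy() compares the model's predictions with the true labels and computes the accuracy. It is called once per epoch: after that epoch's training has updated w and b, the whole test set is fed through net with the latest parameters. A prediction counts as correct when the index of the largest output matches the true label and as wrong otherwise; the accuracy is the number of correct predictions divided by the size of the test set.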

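# Compute classification accuracy and per-sample loss over a data iterator
def evaluate_accuracy(data_iter,net):
    acc_sum,n,test_l_sum= 0.0,0,0.0
    with torch.no_grad():  # evaluation needs no gradient tracking
        for X,y in data_iter:
            y_hat = net(X)  # run the forward pass once and reuse it
            acc_sum += (y_hat.argmax(dim = 1) == y).float().sum().item()
            test_l_sum += loss(y_hat,y).sum().item()
            n += y.shape[0]
    return acc_sum/n,test_l_sum/n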
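
The accuracy rule used here (a prediction is correct when the index of the largest output equals the label) can be sketched in plain Python on a toy batch. The scores and labels below are made up for illustration, and the helper names argmax and accuracy are hypothetical.

```python
def argmax(row):
    """Index of the largest value in a row (first one on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(scores, labels):
    """Fraction of rows whose argmax equals the true label."""
    correct = sum(1 for row, y in zip(scores, labels) if argmax(row) == y)
    return correct / len(labels)


scores = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
    [0.2, 0.5, 0.3],  # predicts class 1
]
labels = [1, 0, 1, 1]
print(accuracy(scores, labels))  # 0.75
```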

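Model training

The procedure is the same as before: take mini-batches of batch_size samples from the training set, feed them through the network, compute the loss, backpropagate to obtain the gradients, and accumulate the loss. Within each batch, count the predictions whose argmax matches the true label. At the end of each epoch, dividing the accumulated loss and the correct count by the number of training samples gives that epoch's loss and training accuracy. Test accuracy comes from calling evaluate_accuracy. Note that the per-epoch test accuracy uses w and b as updated by that epoch's training, whereas the training accuracy and loss are accumulated during training, i.e. each batch is scored with the parameters before that batch's update.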

# Model training
num_epochs,lr = 50, 0.1
test_acc,train_acc= [],[]
train_loss,test_loss =[],[] 
loss  = cross_entropy
params  = [W,b]
for epoch in range(num_epochs):
    train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
    for X,y in train_iter:
        y_hat = net(X)
        l = loss(y_hat,y).sum()
        # The gradients are zeroed at the end of each iteration:
        # they start out empty, so nothing needs clearing before the first backward()
        l.backward()
        sgd(params, lr, batch_size)
        W.grad.data.zero_()
        b.grad.data.zero_()
        train_l_sum += l.item()
        train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
        n += y.shape[0]
    test_a,test_l = evaluate_accuracy(test_iter, net)
    test_acc.append(test_a)
    test_loss.append(test_l)
    train_acc.append(train_acc_sum/n)
    train_loss.append(train_l_sum/n)
    print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
          % (epoch + 1, train_loss[epoch], train_acc[epoch], test_acc[epoch]))

Running this prints the training log; an excerpt:

epoch 1, loss 0.7844, train acc 0.751, test acc 0.790
epoch 2, loss 0.5701, train acc 0.813, test acc 0.807
epoch 3, loss 0.5257, train acc 0.826, test acc 0.821
epoch 4, loss 0.5009, train acc 0.833, test acc 0.826
epoch 5, loss 0.4856, train acc 0.838, test acc 0.828
epoch 6, loss 0.4742, train acc 0.840, test acc 0.830
epoch 7, loss 0.4657, train acc 0.843, test acc 0.832
……

3.2 Softmax regression with torch.nn

1. Define the model

# Build the model
net = torch.nn.Sequential(nn.Flatten(),
                          nn.Linear(784,10))

2. Initialize the model parameters

def init_weights(m): # initialize the weights of every linear layer
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight,std = 0.01)
net.apply(init_weights)

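3. Define the loss function and optimizer
Unlike the earlier loss function and network, CrossEntropyLoss combines the softmax function and the cross-entropy loss in one step, which is numerically more stable than implementing the two separately. This is why the model above ends with a linear layer and has no softmax of its own.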

loss = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(),lr = 0.1)
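
To see why fusing softmax with the cross-entropy is more stable, note that the loss for the true class t can be written directly on the raw scores z as log(sum_j exp(z_j)) - z_t, which can be evaluated with the log-sum-exp trick and never forms a probability that might underflow to 0. The sketch below is a pure-Python illustration of that identity for one sample, under the assumption that this mirrors what a fused loss does in spirit; it is not PyTorch's actual implementation.

```python
import math


def cross_entropy_from_logits(logits, target):
    """-log softmax(logits)[target], computed without forming the softmax."""
    m = max(logits)
    # log-sum-exp: shift by the max so every exponent is <= 0
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]


print(round(cross_entropy_from_logits([0.0, 0.0], 0), 4))  # 0.6931, i.e. log(2)
print(cross_entropy_from_logits([1000.0, 0.0], 1))         # 1000.0
```

In the second call, a separate softmax would assign the true class a probability that underflows to 0, and taking its logarithm would then fail or give infinity, while the fused form returns the finite value 1000.0.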

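4. Train the model
The loss and training-accuracy bookkeeping are the same as in the earlier implementation and are not repeated here. The main change is that the parameters are updated by the built-in optimizer from torch.optim.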

# Model training
num_epochs = 50
lr = 0.1
test_acc,train_acc= [],[]
train_loss,test_loss =[],[] 
for epoch in range(num_epochs):
    train_l_sum, train_acc_sum, n = 0.0, 0.0, 0
    for X,y in train_iter:
        y_hat = net(X)
        l = loss(y_hat,y)  # CrossEntropyLoss already returns the batch mean
        optimizer.zero_grad()
        l.backward()
        optimizer.step()
        # Weight the batch mean by the batch size so train_l_sum/n is a per-sample loss
        train_l_sum += l.item() * y.shape[0]
        train_acc_sum += (y_hat.argmax(dim=1) == y).sum().item()
        n += y.shape[0]

    # Note: evaluate_accuracy adds up one loss value per batch, so with a
    # mean-reduced loss test_l is not a per-sample value
    test_a,test_l = evaluate_accuracy(test_iter, net)
    test_acc.append(test_a)
    test_loss.append(test_l)
    train_acc.append(train_acc_sum/n)
    train_loss.append(train_l_sum/n)
    print('epoch %d, loss %.4f, train acc %.3f, test acc %.3f'
          % (epoch + 1, train_loss[epoch], train_acc[epoch], test_acc[epoch]))

Plotting uses the helper I defined earlier, which has proven quite handy.

Draw_Curve([train_loss,"train_loss"],[test_loss,"test_loss"])

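3.4 Feedforward neural network experiments

Implement regression, binary classification, and multi-class classification, each from scratch and with torch.nn.

These are much like the previous sections; the differences are in the datasets and the model definitions.

3.4.1 Regression

Custom dataset
As required, the dataset has 10000 samples, split into a training set of 7000 and a test set of 3000, and each sample has 500 features.

# Regression dataset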
num_train,num_test,num_inputs = 7000,3000,500
true_w,true_b = torch.ones(num_inputs,1)*0.0056,0.028
features = torch.randn((num_train+num_test,num_inputs))
labels = torch.matmul(features,true_w)+true_b
labels += torch.tensor(np.random.normal(0,0.01,size=labels.size()),dtype = torch.float)
train_reg_features,test_reg_features = features[:num_train,:],features[num_train:,:]
train_reg_labels,test_reg_labels = labels[:num_train],labels[num_train:]

Define and initialize the model parameters

num_inputs,num_outputs,num_hiddens = 500,1,128

W1 = torch.tensor(np.random.normal(0,0.01,(num_inputs,num_hiddens)),dtype = torch.float)
b1 = torch.zeros(num_hiddens,dtype = torch.float)  # one bias per hidden unit
W2 = torch.tensor(np.random.normal(0,0.01,(num_hiddens,1)),dtype = torch.float)
b2 = torch.zeros(1,dtype = torch.float)

params = [W1,b1,W2,b2]

for param in params:   
    param.requires_grad_(requires_grad = True)

Define the activation function; ReLU is used here

def relu(x):
    return torch.max(input = x,other = torch.tensor(0.0))

Define the loss function and the optimization algorithm

def squared_loss(y_hat,y):
    return (y_hat-y.view(y_hat.size()))**2/2

Define the stochastic gradient descent function

def SGD(params,lr):
    for param in params:
        param.data -= lr*param.grad

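Define the model

H is the output of the hidden layer after the activation function; the function then returns the output layer's result, obtained by a fully connected layer applied to H.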

def net(X):
    X = X.view((-1,num_inputs))
    #print(X.shape)
    H = relu(torch.matmul(X,W1)+b1)
    return torch.matmul(H,W2)+b2
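
To trace the forward pass by hand, the sketch below runs the same two steps (affine map plus ReLU, then a second affine map) on a tiny network in plain Python. The sizes (2 inputs, 3 hidden units, 1 output) and all weight values are made up for illustration and are unrelated to the 500-128-1 network above.

```python
def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in v]


def affine(x, W, b):
    """x @ W + b for a vector x (n_in), matrix W (n_in x n_out), bias b (n_out)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def mlp(x, W1, b1, W2, b2):
    h = relu(affine(x, W1, b1))  # hidden layer after activation
    return affine(h, W2, b2)     # output layer, no activation for regression


x = [1.0, -2.0]
W1 = [[1.0, -1.0, 0.5],
      [0.5, 1.0, -1.0]]          # 2 x 3
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0], [1.0], [1.0]]       # 3 x 1
b2 = [0.5]

print(affine(x, W1, b1))         # [0.0, -3.0, 2.5] before ReLU
print(mlp(x, W1, b1, W2, b2))    # [3.0]
```

ReLU zeroes the negative hidden unit, so only the third unit (2.5) reaches the output, which adds the bias 0.5.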

Model training

loss=  torch.nn.MSELoss()
#loss = squared_loss
num_epochs, lr = 50, 0.1
batch_size = 64
train_loss,test_loss= [],[]
for epoch in range(num_epochs):
    train_l_sum,n = 0,0
    for X,y in data_iter(batch_size,train_reg_features,train_reg_labels):
        y_hat = net(X)
        l = loss(y_hat,y).sum()    
        # The gradients are zeroed at the end of each iteration:
        # they start out empty, so nothing needs clearing before the first backward()
        l.backward()
        SGD(params, lr)
        for param in params:
            param.grad.data.zero_()
    # Record plain floats (not graph-attached tensors) for logging and plotting
    with torch.no_grad():
        train_loss.append(loss(net(train_reg_features),train_reg_labels).item())
        test_loss.append(loss(net(test_reg_features),test_reg_labels).item())
    print('epoch%d,train_loss%f,test_loss%f'%(epoch+1,train_loss[epoch],test_loss[epoch]))

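3.4.2 Binary classification

Custom dataset and data loading:
As required by the lab, two datasets are generated. The first class has mean 0.1 and standard deviation 1 with label 0, with 7000 training and 3000 test samples. The second class has mean -0.1 and standard deviation 1 with label 1, also with 7000 training and 3000 test samples. torch.cat() then merges the two classes' training sets and test sets respectively, giving a training set of 14000 samples and a test set of 6000.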

mean = 0.1
std = 1
features_binary_cla_0 = torch.normal(mean,std,(10000,200),dtype = torch.float)
features_binary_cla_1 = torch.normal(-mean,std,(10000,200),dtype = torch.float)
labels_binary_cla_0 = torch.zeros(10000,dtype = torch.int64)
labels_binary_cla_1 = torch.ones(10000,dtype = torch.int64)

# Split the label-0 class into training and test sets
train_features_binary_cla_0 = features_binary_cla_0[:7000]
train_labels_binary_cla_0 = labels_binary_cla_0[:7000]
test_features_binary_cla_0 = features_binary_cla_0[7000:]
test_labels_binary_cla_0 = labels_binary_cla_0[7000:]

# Split the label-1 class into training and test sets
train_features_binary_cla_1 = features_binary_cla_1[:7000]
train_labels_binary_cla_1 = labels_binary_cla_1[:7000] 
test_features_binary_cla_1 = features_binary_cla_1[7000:]
test_labels_binary_cla_1 = labels_binary_cla_1[7000:]

# Concatenate the two classes into single feature and label tensors
train_features_binary_cla = torch.cat((train_features_binary_cla_0,train_features_binary_cla_1),0)
train_labels_binary_cla = torch.cat((train_labels_binary_cla_0,train_labels_binary_cla_1),0)

test_features_binary_cla = torch.cat((test_features_binary_cla_0,test_features_binary_cla_1),0)
test_labels_binary_cla = torch.cat((test_labels_binary_cla_0,test_labels_binary_cla_1),0)

print(train_features_binary_cla.shape)

batch_size = 128
train_binary_dataset = Data.TensorDataset(train_features_binary_cla,train_labels_binary_cla)
train_binary_loader = Data.DataLoader(dataset = train_binary_dataset,batch_size = batch_size,shuffle = True)

test_binary_dataset = Data.TensorDataset(test_features_binary_cla,test_labels_binary_cla)
test_binary_loader = Data.DataLoader(dataset = test_binary_dataset,batch_size = batch_size,shuffle = True)

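For the model definition, only the input and output dimensions change, to 200 and 2; the other hyperparameters can stay the same.
Everything else follows the same training procedure as before.

3.4.3 Multi-class classification works the same way, with the dataset changed to the MNIST handwritten digits dataset

3.5 torch.nn implementation

Build the model with torch.nn

# Define the FlattenLayer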
class FlattenLayer(torch.nn.Module):
    def __init__(self):
        super(FlattenLayer,self).__init__()
    def forward(self,x):
        return x.view(x.shape[0],-1)
# Model definition and parameter initialization
num_inputs ,num_outputs,num_hiddens = 500,1,128

net = nn.Sequential(FlattenLayer(),
                   nn.Linear(num_inputs,num_hiddens),
                   #nn.ReLU(),
                    #nn.LeakyReLU(),
                   nn.Tanh(),
                   nn.Linear(num_hiddens,num_outputs))

for params in net.parameters():
    nn.init.normal_(params,mean = 0,std=0.1)