Building a classification network with PyTorch.

I have been working with deep learning for a while, and some of it is still hard to grasp. I want to walk through the whole pipeline of model building, training, testing, and deployment; this post starts with model building. If anything here is wrong, corrections are welcome.

I plan to use the PyTorch framework to complete the hand-sign recognition exercise from Andrew Ng's course.


Based on my current understanding, the implementation follows these steps:

1. Data collection. The dataset is provided with the assignment; it can be downloaded from the Baidu Cloud link.

import h5py
import numpy as np

def load_dataset():
    train_dataset = h5py.File('datasets/train_signs.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels
    test_dataset = h5py.File('datasets/test_signs.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels
    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

This little function is taken straight from the assignment materials.

Training data: train_set_x_orig has shape (1080, 64, 64, 3) and train_set_y_orig has shape (1, 1080).

Test data: test_set_x_orig has shape (120, 64, 64, 3) and test_set_y_orig has shape (1, 120).
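As a quick sanity check, these shapes can be printed right after loading (a minimal sketch; it assumes load_dataset above is in scope and the .h5 files are under datasets/):

train_x, train_y, test_x, test_y, classes = load_dataset()
print(train_x.shape, train_y.shape)   # (1080, 64, 64, 3) (1, 1080)
print(test_x.shape, test_y.shape)     # (120, 64, 64, 3) (1, 120)
print(classes)                        # the six sign classes, 0..5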

2. Data loading.

Training proceeds in mini-batches: each step loads a fixed amount of data, because the hardware may not be able to hold the whole dataset at once.

import math
import numpy as np

def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    """
    Creates a list of random minibatches from (X, Y)
    Arguments:
    X -- input data, of shape (m, Hi, Wi, Ci)
    Y -- true "label" vector, of shape (m, n_y)
    mini_batch_size -- size of the mini-batches, integer
    seed -- only used for grading, so that your "random" minibatches match the reference
    Returns:
    mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y)
    """
    m = X.shape[0]                  # number of training examples
    mini_batches = []
    np.random.seed(seed)
    # Step 1: Shuffle (X, Y)
    permutation = list(np.random.permutation(m))
    shuffled_X = X[permutation, :, :, :]
    shuffled_Y = Y[permutation, :]
    # Step 2: Partition (shuffled_X, shuffled_Y). Minus the end case.
    num_complete_minibatches = math.floor(m/mini_batch_size) # number of mini-batches of size mini_batch_size in your partitioning
    for k in range(0, num_complete_minibatches):
        mini_batch_X = shuffled_X[k * mini_batch_size : k * mini_batch_size + mini_batch_size, :, :, :]
        mini_batch_Y = shuffled_Y[k * mini_batch_size : k * mini_batch_size + mini_batch_size, :]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)
    # Handling the end case (last mini-batch < mini_batch_size)
    if m % mini_batch_size != 0:
        mini_batch_X = shuffled_X[num_complete_minibatches * mini_batch_size : m, :, :, :]
        mini_batch_Y = shuffled_Y[num_complete_minibatches * mini_batch_size : m, :]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)
    return mini_batches

This helper is likewise lifted from the assignment materials.

I later found that PyTorch ships its own Dataset class for data handling; subclassing it gives you data-and-label loading, so I wrote a MyData class of my own.

import glob
import os

import cv2
import torch
from torch.utils.data import Dataset


class MyData(Dataset):
    def __init__(self, traindata, transform=None, train_val="train"):
        super(MyData, self).__init__()
        self.data = traindata
        # expects a layout like traindata/<label>/<image>.jpg
        self.imagenames = glob.glob(self.data + "/*/*.jpg")
        self.data_transform = transform
        self.train_val = train_val

    def __len__(self):
        return len(self.imagenames)

    def __getitem__(self, item):
        img_path = self.imagenames[item]
        img = cv2.imread(img_path)   # BGR, HWC, uint8
        # the label is the name of the image's parent directory
        label = int(os.path.basename(os.path.dirname(img_path)))

        if self.data_transform is not None:
            try:
                img = self.data_transform[self.train_val](img)
            except Exception:
                print('can not load image: {}'.format(img_path))

        # the transform may already have produced a tensor
        if not torch.is_tensor(img):
            img = torch.from_numpy(img)
        return img, label
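Because __getitem__ indexes the transform with self.train_val, the transform argument is expected to be a dict keyed by "train" / "val". A minimal sketch of such a dict (the single ToTensor step is just an illustration, not what the post prescribes):

from torchvision import transforms

# cv2.imread returns a HWC uint8 BGR array; ToTensor converts it to a
# CHW float32 tensor scaled to [0, 1]
data_transforms = {
    "train": transforms.Compose([transforms.ToTensor()]),
    "val": transforms.Compose([transforms.ToTensor()]),
}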

Here is how the class is used: first instantiate it, then use torch.utils.data.DataLoader to load the data in batches (the datasets/train path below is a placeholder for your own image folder). Notes on the parameters follow.

Note: num_workers is how many worker processes load the data; 0 means everything runs in the main process. pin_memory pins (page-locks) host memory; when True, batches are loaded into page-locked memory, which speeds up host-to-CUDA transfers. The error I hit here (screenshot omitted) was caused by running out of memory on my machine.

Setting num_workers=0 and pin_memory=False made it go away. After closing every other program running locally, I could raise num_workers again and set pin_memory=True.

Workaround: num_workers = 0 (with pin_memory=False).

from tqdm import tqdm

# "datasets/train" is a placeholder; point it at your own image folder
dataset = MyData("datasets/train", transform=data_transforms, train_val="train")
traindataloader = torch.utils.data.DataLoader(dataset,
                                              batch_size=8,
                                              num_workers=0,
                                              shuffle=True,
                                              pin_memory=True)

nb = len(traindataloader)
pbar = tqdm(enumerate(traindataloader), total=nb)
for i, data in pbar:
    images, label = data

3. Model construction

The model class is implemented by subclassing torch.nn.Module. To train on a GPU, move the model onto the GPU:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
gesR = GesRecognition().to(device)

4. Model training

1. Pick an optimizer, i.e. decide which optimization algorithm to use.

optimizer = optim.SGD(gesR.parameters(), lr=0.001)
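Plain SGD with a flat learning rate is the simplest choice; SGD with momentum, or Adam, are common drop-in alternatives (a sketch of the alternatives, not what this post's training run used):

# SGD with momentum
optimizer = optim.SGD(gesR.parameters(), lr=0.001, momentum=0.9)
# or Adam, which adapts per-parameter learning rates
# optimizer = optim.Adam(gesR.parameters(), lr=0.001)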

2. Set the number of training iterations and put the model into training mode.

gesR.train()

As for why train() is used during training and eval() during testing, I referred to this post: https://blog.csdn.net/weixin_44760744/article/details/108929528. The two modes only change the behavior of modules such as BatchNorm and dropout, and the model in this post uses neither.
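Although this model has neither module, the difference is easy to see on one that does. A self-contained sketch with a dropout layer (not part of the gesture model):

import torch
import torch.nn as nn

layer = nn.Dropout(p=0.5)
x = torch.ones(1, 4)

layer.train()      # training mode: randomly zeroes elements, scales the rest by 1/(1-p)
print(layer(x))    # e.g. tensor([[2., 0., 2., 0.]])

layer.eval()       # eval mode: dropout becomes the identity
print(layer(x))    # tensor([[1., 1., 1., 1.]])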

Before each forward/backward pass, zero the gradients. PyTorch accumulates gradients in each parameter's .grad buffer across backward() calls, so without this step every update would mix in gradients from earlier batches.

optimizer.zero_grad()
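For context, the canonical ordering of one full training step (a sketch; gesR, criterion, optimizer, and the batch are as defined elsewhere in this post):

optimizer.zero_grad()                                    # clear gradients from the previous step
output = gesR(train_data)                                # forward pass
loss = criterion(output, train_label.squeeze().long())   # compute the loss
loss.backward()                                          # accumulate fresh gradients
optimizer.step()                                         # update the parameters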

3. Choosing the loss function

Classification tasks generally use the cross-entropy loss: loss = criterion(output, train_label.squeeze().long()). I used to think the network output and the labels merely had to agree in dimension, which produced a string of errors.

One such error is fixed by train_label = train_label.long(): the targets must be an integer (LongTensor) class-index tensor.

With batch_size set to 8 and six classes, output.shape is [8, 6],

and train_label.shape is [8].

The reason is that nn.CrossEntropyLoss expects raw logits of shape (N, C) plus integer class indices of shape (N,); internally it applies log-softmax and picks out the target entries (equivalent to cross-entropy against one-hot labels), so you never one-hot encode the labels yourself.
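The shape contract is easy to verify in isolation (a runnable sketch with batch size 8 and 6 classes, matching the numbers above):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
output = torch.randn(8, 6)           # raw logits, shape (N, C) = (8, 6)
labels = torch.randint(0, 6, (8,))   # class indices, shape (N,) = (8,)
loss = criterion(output, labels)
print(loss.item())                   # a single scalar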


4. Backpropagation: loss.backward()

5. Optimizer update: optimizer.step()

6. Saving the model: torch.save()
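torch.save(gesR, path), as used in the listing below, pickles the whole module, so loading it later requires the GesRecognition class definition to be importable. Saving only the state_dict is the more portable pattern (a sketch; the Gesfy_state.pkl filename is made up for illustration):

# whole-module save, as in the listing below
torch.save(gesR, "Gesfy.pkl")
model = torch.load("Gesfy.pkl")

# state_dict save: parameters only, re-create the module to load them
torch.save(gesR.state_dict(), "Gesfy_state.pkl")
model = GesRecognition()
model.load_state_dict(torch.load("Gesfy_state.pkl"))
model.eval()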

 

The full program follows. I put together a simple classification model; after training for a morning, it reached 90% accuracy on the test set.

import numpy as np
import torch
import torch.nn as nn
import h5py
import math
from torch import optim
import torch.nn.functional as F
import matplotlib.pyplot as plt


def load_dataset():
    train_dataset = h5py.File('datasets/train_signs.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels
    test_dataset = h5py.File('datasets/test_signs.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels
    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes


def random_mini_batches(X, Y, mini_batch_size = 64, seed = 0):
    """
    Creates a list of random minibatches from (X, Y)
    Arguments:
    X -- input data, of shape (m, Hi, Wi, Ci)
    Y -- true "label" vector, of shape (m, n_y)
    mini_batch_size -- size of the mini-batches, integer
    seed -- only used for grading, so that your "random" minibatches match the reference
    Returns:
    mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y)
    """
    m = X.shape[0]                  # number of training examples
    mini_batches = []
    np.random.seed(seed)
    # Step 1: Shuffle (X, Y)
    permutation = list(np.random.permutation(m))
    shuffled_X = X[permutation, :, :, :]
    shuffled_Y = Y[permutation, :]
    # Step 2: Partition (shuffled_X, shuffled_Y). Minus the end case.
    num_complete_minibatches = math.floor(m/mini_batch_size) # number of mini-batches of size mini_batch_size in your partitioning
    for k in range(0, num_complete_minibatches):
        mini_batch_X = shuffled_X[k * mini_batch_size : k * mini_batch_size + mini_batch_size,:,:,:]
        mini_batch_Y = shuffled_Y[k * mini_batch_size : k * mini_batch_size + mini_batch_size,:]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)
    # Handling the end case (last mini-batch < mini_batch_size)
    if m % mini_batch_size != 0:
        mini_batch_X = shuffled_X[num_complete_minibatches * mini_batch_size : m,:,:,:]
        mini_batch_Y = shuffled_Y[num_complete_minibatches * mini_batch_size : m,:]
        mini_batch = (mini_batch_X, mini_batch_Y)
        mini_batches.append(mini_batch)
    return mini_batches


class GesRecognition(nn.Module):
    def __init__(self):
        super(GesRecognition, self).__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=5, stride=1, padding=2)
        self.relu1 = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, stride=2)
        self.relu2 = nn.ReLU()
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)
        self.fc = nn.Linear(784, 6)   # 16 * 7 * 7 = 784

    def forward(self, x):
        x = self.conv1(x)      # (N, 8, 64, 64)
        x = self.relu1(x)
        x = self.maxpool1(x)   # (N, 8, 32, 32)
        x = self.conv2(x)      # (N, 16, 15, 15)
        x = self.relu2(x)
        x = self.maxpool2(x)   # (N, 16, 7, 7)

        # flatten the feature maps into a vector for the fully connected layer
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x


def train():
    if torch.cuda.is_available():
        torch.cuda.set_device(0)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes = load_dataset()
    gesR = GesRecognition().to(device)

    train_set_x_orig = train_set_x_orig/255.0   # scale pixels to [0, 1]
    criterion = nn.CrossEntropyLoss()
    mini_batch_size = 8
    optimizer = optim.SGD(gesR.parameters(), lr=0.001)
    for num in range(1000):
        gesR.train()
        minbatch = random_mini_batches(train_set_x_orig, train_set_y_orig.T, mini_batch_size, seed=num)
        for batch in range(len(minbatch)):
            # NHWC numpy array -> NCHW float tensor
            train_data = torch.from_numpy(np.transpose(minbatch[batch][0], (0, 3, 1, 2)))
            train_data = train_data.float().to(device)
            train_label = torch.from_numpy(minbatch[batch][1]).to(device)
            optimizer.zero_grad()
            output = gesR(train_data)

            loss = criterion(output, train_label.squeeze().long())
            if num % 10 == 0:
                print("loss is :", loss.item())
            loss.backward()
            optimizer.step()
            if loss.item() < 0.1:
                torch.save(gesR, "Gesfy.pkl")
        if num % 10 == 0:
            # quick accuracy check on one test mini-batch of 10 images
            gesR.eval()
            acc = 0
            minbatchtest = random_mini_batches(test_set_x_orig/255.0, test_set_y_orig.T, 10, seed=num)
            test_data = torch.from_numpy(np.transpose(minbatchtest[0][0], (0, 3, 1, 2)))
            test_data = test_data.float().to(device)
            test_label = torch.from_numpy(minbatchtest[0][1]).to(device)
            with torch.no_grad():
                out = gesR(test_data)
            out = out.max(1)[1]   # predicted class indices
            for i in range(10):
                if out[i] == test_label.squeeze().long()[i]:
                    acc = acc + 1
            acc = acc / 10.0
            print("test acc on 10 samples:", acc)


def test():
    if torch.cuda.is_available():
        torch.cuda.set_device(0)
    train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes = load_dataset()
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = torch.load("Gesfy.pkl").to(device)
    model.eval()
    print(model)
    test_set_x_orig = test_set_x_orig/255.0
    count = 0
    for i in range(test_set_x_orig.shape[0]):
        plt.imshow(test_set_x_orig[i])
        plt.pause(5)
        print("label", test_set_y_orig.squeeze()[i])
        # keep the batch dimension with i:i+1, then NHWC -> NCHW
        test_data = torch.from_numpy(np.transpose(test_set_x_orig[i:i+1], (0, 3, 1, 2)))
        test_data = test_data.float().to(device)
        with torch.no_grad():
            out = model(test_data)
        out = out.max(1)[1].to("cpu")
        print("out is :", out)
        plt.show()

        if out == test_set_y_orig.squeeze()[i]:
            count = count + 1
    count = count / test_set_x_orig.shape[0]

    print("right rate is :", count)


if __name__ == "__main__":
    train()
