Implementing a Simple Deep Learning Model with the MindSpore API
Importing the APIs
import mindspore
from mindspore import nn
from mindspore.dataset import vision, transforms
from mindspore.dataset import MnistDataset
Processing the Dataset
Downloading the Dataset
# Download data from open datasets
from download import download

url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/" \
      "notebook/datasets/MNIST_Data.zip"
path = download(url, "./", kind="zip", replace=True)
Downloading data from https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/datasets/MNIST_Data.zip (10.3 MB)

file_sizes: 100%|███████████████████████████| 10.8M/10.8M [00:00<00:00, 144MB/s]
Extracting zip file...
Successfully downloaded / unzipped to ./
After the download completes, the dataset directory has the following structure:
MNIST_Data
├── train
│   ├── train-images-idx3-ubyte (60,000 training images)
│   └── train-labels-idx1-ubyte (60,000 training labels)
└── test
    ├── t10k-images-idx3-ubyte (10,000 test images)
    └── t10k-labels-idx1-ubyte (10,000 test labels)
Dataset Operations
# Create the dataset objects
train_dataset = MnistDataset('MNIST_Data/train')
test_dataset = MnistDataset('MNIST_Data/test')
Print the column names in the dataset; they are needed when preprocessing it.
print(train_dataset.get_col_names())
['image', 'label']
Use the map function to transform the image data and the labels, then use the batch function to pack the dataset into batches.
def datapipe(dataset, batch_size):
    image_transforms = [
        vision.Rescale(1.0 / 255.0, 0),
        vision.Normalize(mean=(0.1307,), std=(0.3081,)),
        vision.HWC2CHW()
    ]
    label_transform = transforms.TypeCast(mindspore.int32)

    dataset = dataset.map(image_transforms, 'image')
    dataset = dataset.map(label_transform, 'label')
    dataset = dataset.batch(batch_size)
    return dataset
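To make the three image transforms more concrete, the NumPy sketch below reproduces their arithmetic on a single MNIST-sized image. It is an illustration only, not MindSpore's implementation; the random input image and the helper name `preprocess` are assumptions made for this example.

```python
import numpy as np

def preprocess(image_hwc):
    """Hypothetical helper: mimic Rescale -> Normalize -> HWC2CHW on one uint8 image."""
    # Rescale(1/255, 0): output = input * rescale + shift, mapping [0, 255] to [0, 1]
    x = image_hwc.astype(np.float32) * (1.0 / 255.0) + 0.0
    # Normalize(mean, std): subtract the mean and divide by the standard deviation
    x = (x - 0.1307) / 0.3081
    # HWC2CHW: move the channel axis to the front
    return x.transpose(2, 0, 1)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28, 1), dtype=np.uint8)  # fake image, (H, W, C)
out = preprocess(image)
print(out.shape, out.dtype)  # (1, 28, 28) float32
```

After these steps a black pixel (0) maps to about -0.42 and a white pixel (255) to about 2.82, and the channel axis comes first, which is the layout the [N, C, H, W] batches in this tutorial use.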
Pack the processed datasets into batches of size 64.
# Map vision transforms and batch dataset
train_dataset = datapipe(train_dataset, 64)
test_dataset = datapipe(test_dataset, 64)
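A quick way to check the batching is to work out how many batches each dataset should produce. The sketch below assumes that the final, partially filled batch is kept rather than dropped, which is consistent with the 938 steps per epoch reported in the training log later in this tutorial.

```python
import math

batch_size = 64
train_samples, test_samples = 60000, 10000  # sizes from the directory listing

# With the last partial batch kept, the batch count is the ceiling of samples / batch_size
train_batches = math.ceil(train_samples / batch_size)
test_batches = math.ceil(test_samples / batch_size)

# Number of samples left over for the final training batch
last_train_batch = train_samples - (train_batches - 1) * batch_size

print(train_batches, test_batches, last_train_batch)  # 938 157 32
```

The same numbers should be returned by get_dataset_size() on the batched datasets.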
Use create_tuple_iterator or create_dict_iterator to iterate over the dataset and inspect the shape and dtype of the data and labels.
# create_tuple_iterator
for image, label in test_dataset.create_tuple_iterator():
    print(f"Shape of image [N, C, H, W]: {image.shape} {image.dtype}")
    print(f"Shape of label: {label.shape} {label.dtype}")
    break
Shape of image [N, C, H, W]: (64, 1, 28, 28) Float32
Shape of label: (64,) Int32
# create_dict_iterator
for data in test_dataset.create_dict_iterator():
    print(f"Shape of image [N, C, H, W]: {data['image'].shape} {data['image'].dtype}")
    print(f"Shape of label: {data['label'].shape} {data['label'].dtype}")
    break
Shape of image [N, C, H, W]: (64, 1, 28, 28) Float32
Shape of label: (64,) Int32
Building the Network
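The nn.Cell class in mindspore.nn is the base class for building all networks and is also the basic unit of a network. To define a custom network, subclass nn.Cell and override the __init__ and construct methods: __init__ defines all the network layers, and construct describes how the data (Tensor) is transformed.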
# Define model
class Network(nn.Cell):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.dense_relu_sequential = nn.SequentialCell(
            nn.Dense(28*28, 512),
            nn.ReLU(),
            nn.Dense(512, 512),
            nn.ReLU(),
            nn.Dense(512, 10)
        )

    def construct(self, x):
        x = self.flatten(x)
        logits = self.dense_relu_sequential(x)
        return logits

model = Network()
print(model)
Network<
  (flatten): Flatten<>
  (dense_relu_sequential): SequentialCell<
    (0): Dense<input_channels=784, output_channels=512, has_bias=True>
    (1): ReLU<>
    (2): Dense<input_channels=512, output_channels=512, has_bias=True>
    (3): ReLU<>
    (4): Dense<input_channels=512, output_channels=10, has_bias=True>
    >
  >
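As a sanity check on the printed structure, the number of trainable values can be counted by hand. The sketch below uses only the layer sizes shown in the output and assumes each Dense layer holds one weight matrix and one bias vector (has_bias=True); the helper name `dense_params` is made up for this example.

```python
# (inputs, outputs) of each Dense layer, taken from the printed network
layers = [(28 * 28, 512), (512, 512), (512, 10)]

def dense_params(n_in, n_out):
    """Hypothetical helper: a weight matrix of n_in * n_out values plus n_out biases."""
    return n_in * n_out + n_out

per_layer = [dense_params(n_in, n_out) for n_in, n_out in layers]
print(per_layer, sum(per_layer))  # [401920, 262656, 5130] 669706
```

The Flatten and ReLU cells contribute no parameters, so the whole network has roughly 670 thousand trainable values.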
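Training the Model
Model training takes three steps:
Forward computation: the model predicts the results (logits), and the prediction loss (loss) is computed against the correct labels (label).
Backpropagation: automatic differentiation computes the gradients (gradients) of the loss with respect to the model parameters (parameters).
Parameter optimization: the gradients are used to update the parameters.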
# Instantiate loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = nn.SGD(model.trainable_params(), 1e-2)

# 1. Define forward function
def forward_fn(data, label):
    logits = model(data)
    loss = loss_fn(logits, label)
    return loss, logits

# 2. Get gradient function (value_and_grad)
grad_fn = mindspore.value_and_grad(forward_fn, None, optimizer.parameters, has_aux=True)

# 3. Define function of one-step training
def train_step(data, label):
    (loss, _), grads = grad_fn(data, label)
    optimizer(grads)
    return loss

def train(model, dataset):
    size = dataset.get_dataset_size()
    model.set_train()
    for batch, (data, label) in enumerate(dataset.create_tuple_iterator()):
        loss = train_step(data, label)

        if batch % 100 == 0:
            loss, current = loss.asnumpy(), batch
            print(f"loss: {loss:>7f}  [{current:>3d}/{size:>3d}]")
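The three steps can also be written out by hand for a model small enough to differentiate manually. The NumPy sketch below trains a single linear layer with softmax cross-entropy and plain gradient descent on toy data. It is only meant to show what value_and_grad and the optimizer automate; the data, shapes, and variable names are assumptions, and nothing here is MindSpore code.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4)).astype(np.float32)  # toy data: 8 samples, 4 features
y = rng.integers(0, 3, size=8)                  # toy labels: 3 classes
W = np.zeros((4, 3), dtype=np.float32)
b = np.zeros(3, dtype=np.float32)
lr = 1e-2  # same learning rate as the SGD optimizer above

def forward(x, y, W, b):
    logits = x @ W + b
    z = logits - logits.max(axis=1, keepdims=True)  # subtract the max for numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(len(y)), y]).mean()
    return loss, probs

losses = []
for step in range(100):
    # 1. Forward computation: logits -> softmax -> cross-entropy loss
    loss, probs = forward(x, y, W, b)
    # 2. Backpropagation: d(loss)/d(logits) = (probs - one_hot(y)) / N
    grad_logits = probs.copy()
    grad_logits[np.arange(len(y)), y] -= 1.0
    grad_logits /= len(y)
    grad_W = x.T @ grad_logits
    grad_b = grad_logits.sum(axis=0)
    # 3. Parameter optimization: one gradient-descent update
    W -= lr * grad_W
    b -= lr * grad_b
    losses.append(loss)

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

With zero-initialized weights the first loss equals ln(3), the cross-entropy of a uniform guess over three classes, and it decreases from there. In the MindSpore code the gradient formulas in step 2 never have to be written by hand, because grad_fn derives them from forward_fn.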
After training, define a test function to evaluate the model's performance.
def test(model, dataset, loss_fn):
    num_batches = dataset.get_dataset_size()
    model.set_train(False)
    total, test_loss, correct = 0, 0, 0
    for data, label in dataset.create_tuple_iterator():
        pred = model(data)
        total += len(data)
        test_loss += loss_fn(pred, label).asnumpy()
        correct += (pred.argmax(1) == label).asnumpy().sum()
    test_loss /= num_batches
    correct /= total
    print(f"Test: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
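The accuracy bookkeeping in the test function can be traced on a tiny hand-made example. The NumPy sketch below uses two made-up batches of logits and labels (the second batch is deliberately smaller) and repeats the same argmax-and-compare logic; all values are invented for illustration.

```python
import numpy as np

# Two toy batches of (logits, labels) for a 3-class problem
batches = [
    (np.array([[2.0, 0.1, 0.3], [0.2, 1.5, 0.1], [0.3, 0.2, 0.9]]), np.array([0, 1, 1])),
    (np.array([[0.1, 0.2, 3.0]]), np.array([2])),
]

total, correct = 0, 0
for logits, labels in batches:
    total += len(labels)
    # The predicted class is the index of the largest logit in each row
    correct += (logits.argmax(1) == labels).sum()

accuracy = correct / total
print(f"{100 * accuracy:.1f}%")  # 75.0%
```

Accuracy is divided by the number of samples, so a smaller final batch is weighted correctly. The average loss in the test function, by contrast, is divided by the number of batches, so a partial final batch counts as much as a full one; with 157 test batches this is a small approximation.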
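Iterating over the Dataset
Training requires multiple passes over the dataset; one complete pass is called an epoch. In each epoch, the model is trained on the whole training set and then evaluated on the test set. Printing the loss and the prediction accuracy (Accuracy) for each epoch shows the average test loss falling and the accuracy rising from one epoch to the next, even though the per-batch training loss fluctuates.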
epochs = 3
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(model, train_dataset)
    test(model, test_dataset, loss_fn)
print("Done!")
Epoch 1
-------------------------------
loss: 0.330054  [  0/938]
loss: 0.434025  [100/938]
loss: 0.129600  [200/938]
loss: 0.154334  [300/938]
loss: 0.092072  [400/938]
loss: 0.262538  [500/938]
loss: 0.366513  [600/938]
loss: 0.292299  [700/938]
loss: 0.239407  [800/938]
loss: 0.169028  [900/938]
Test:
 Accuracy: 92.8%, Avg loss: 0.247737

Epoch 2
-------------------------------
loss: 0.315685  [  0/938]
loss: 0.133240  [100/938]
loss: 0.300934  [200/938]
loss: 0.149830  [300/938]
loss: 0.346082  [400/938]
loss: 0.350824  [500/938]
loss: 0.221809  [600/938]
loss: 0.218873  [700/938]
loss: 0.385571  [800/938]
loss: 0.327509  [900/938]
Test:
 Accuracy: 94.1%, Avg loss: 0.207035

Epoch 3
-------------------------------
loss: 0.117105  [  0/938]
loss: 0.073975  [100/938]
loss: 0.065642  [200/938]
loss: 0.154363  [300/938]
loss: 0.189739  [400/938]
loss: 0.325832  [500/938]
loss: 0.178350  [600/938]
loss: 0.094985  [700/938]
loss: 0.225935  [800/938]
loss: 0.167045  [900/938]
Test:
 Accuracy: 94.6%, Avg loss: 0.181408

Done!
Saving the Model
# Save checkpoint
mindspore.save_checkpoint(model, "model.ckpt")
print("Saved Model to model.ckpt")
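Loading the Model
Loading the saved weights takes two steps:
Re-instantiate the model object to construct the model.
Load the model parameters from the checkpoint and load them into the model.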
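# Instantiate a randomly initialized model
model = Network()
# Load the checkpoint into a parameter dictionary
param_dict = mindspore.load_checkpoint("model.ckpt")
# Copy the parameters into the model; without this call the model keeps its random weights
mindspore.load_param_into_net(model, param_dict)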
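The reason loading takes two steps can be pictured with plain NumPy. In the sketch below, a "model" is just a dictionary of named arrays, and np.savez / np.load stand in for the checkpoint file; this is an analogy under those assumptions, not how MindSpore stores checkpoints. Reading the file (the role of load_checkpoint) only produces a dictionary; the model changes only when the values are copied into it (the role of mindspore.load_param_into_net).

```python
import os
import tempfile
import numpy as np

def init_params(seed):
    """Hypothetical stand-in for constructing a model: named, randomly initialized arrays."""
    rng = np.random.default_rng(seed)
    return {"dense_weight": rng.normal(size=(4, 3)), "dense_bias": rng.normal(size=3)}

trained = init_params(seed=1)  # pretend these values came out of training
path = os.path.join(tempfile.mkdtemp(), "model.npz")
np.savez(path, **trained)      # save: parameter name -> array

fresh = init_params(seed=2)    # step 1: rebuild the model, which starts with random weights
with np.load(path) as ckpt:    # step 2a: read the file into a dictionary
    loaded = {name: ckpt[name] for name in ckpt.files}

same_before = np.allclose(fresh["dense_weight"], trained["dense_weight"])
for name in fresh:             # step 2b: copy the loaded values into the model
    fresh[name] = loaded[name]
same_after = np.allclose(fresh["dense_weight"], trained["dense_weight"])

print(same_before, same_after)  # False True
```

Stopping after step 2a leaves the freshly built model untouched, which in a real network shows up as near-random predictions.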
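Use the loaded model for inference. With the parameters loaded correctly, most predictions should match the labels; predictions that are almost all wrong indicate that the model is still running on randomly initialized weights.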
model.set_train(False)
for data, label in test_dataset:
    pred = model(data)
    predicted = pred.argmax(1)
    print(f'Predicted: "{predicted[:10]}", Actual: "{label[:10]}"')
    break
Predicted: "[8 9 9 8 4 4 4 9 9 9]", Actual: "[0 7 1 5 0 7 3 8 1 7]"
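Summary
The overall deep learning workflow is as follows:
Download and process the dataset -> build the network -> train the model -> save and load the model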
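- Data processing
  - Use the map function to apply the transforms and the batch function to pack the data into fixed-size batches (both are wrapped in the datapipe helper).
  - Use create_tuple_iterator or create_dict_iterator to iterate over the dataset and inspect the shape and dtype of the data and labels.
- Network construction
  - Subclass nn.Cell and override the __init__ and construct methods to define a custom network; the construct method contains the data transformation process.
- Model training
  - Training procedure
    - Forward computation: the model predicts the results (logits), and the prediction loss (loss) is computed against the correct labels (label).
    - Backpropagation: automatic differentiation computes the gradients (gradients) of the loss with respect to the model parameters (parameters).
    - Parameter optimization: the gradients are used to update the parameters.
  - Evaluating model performance
    - Define a test function to evaluate performance.
    - Iterate over the dataset multiple times.
- Saving and loading the model
  - Save: save_checkpoint()
  - Load: load_checkpoint(), then load_param_into_net() to copy the parameters into the model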
import time
print(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime()), 'Mindstorm')
2024-06-19 10:57:35 Mindstorm