昇思 (MindSpore) 25-Day Learning Camp, Day 6 | Model Training

A typical training workflow has four steps.

First, construct the dataset.

Second, define the neural network.

Third, choose the hyperparameters, the loss function, and the optimizer.

Finally, train and evaluate the model.

We load the data.

%%capture captured_output
# The environment comes preinstalled with mindspore==2.2.14; to switch MindSpore versions, change the version number below
!pip uninstall mindspore -y
!pip install -i https://pypi.mirrors.ustc.edu.cn/simple mindspore==2.2.14
import mindspore
from mindspore import nn
from mindspore.dataset import vision, transforms
from mindspore.dataset import MnistDataset

# Download data from open datasets
from download import download

url = "https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/" \
      "notebook/datasets/MNIST_Data.zip"
path = download(url, "./", kind="zip", replace=True)


def datapipe(path, batch_size):
    image_transforms = [
        vision.Rescale(1.0 / 255.0, 0),
        vision.Normalize(mean=(0.1307,), std=(0.3081,)),
        vision.HWC2CHW()
    ]
    label_transform = transforms.TypeCast(mindspore.int32)

    dataset = MnistDataset(path)
    dataset = dataset.map(image_transforms, 'image')
    dataset = dataset.map(label_transform, 'label')
    dataset = dataset.batch(batch_size)
    return dataset

train_dataset = datapipe('MNIST_Data/train', batch_size=64)
test_dataset = datapipe('MNIST_Data/test', batch_size=64)

The most important parts of the code above are the rescale and normalization steps, which prepare the raw data for training. Next, we define the neural network, which I described in the previous post.
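Concretely, Rescale maps raw pixel values from [0, 255] into [0, 1], and Normalize then standardizes them with MNIST's mean and std. A minimal NumPy sketch of what the two transforms do to a single pixel value (for illustration only, not how the pipeline runs internally):

import numpy as np

pixel = 128                                  # a raw grayscale value in [0, 255]
rescaled = pixel * (1.0 / 255.0) + 0         # what vision.Rescale(1.0 / 255.0, 0) computes
normalized = (rescaled - 0.1307) / 0.3081    # what vision.Normalize(...) computes
print(rescaled, normalized)                  # ~0.502, ~1.205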

class Network(nn.Cell):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.dense_relu_sequential = nn.SequentialCell(
            nn.Dense(28*28, 512),
            nn.ReLU(),
            nn.Dense(512, 512),
            nn.ReLU(),
            nn.Dense(512, 10)
        )

    def construct(self, x):
        x = self.flatten(x)
        logits = self.dense_relu_sequential(x)
        return logits

model = Network()
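As a quick sanity check (my own addition, not part of the original notebook), we can push a dummy batch through the network and confirm the shape of the logits:

import numpy as np

dummy = mindspore.Tensor(np.zeros((64, 1, 28, 28), dtype=np.float32))
print(model(dummy).shape)  # (64, 10): one logit per MNIST digit class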

Here is the part we control by hand. Hyperparameters are values we choose manually, and we judge whether a choice makes the model better or worse by training and testing it again and again.

epochs = 3
batch_size = 64
learning_rate = 1e-2

epochs: how many passes the model makes over the full training set.

batch_size: how many samples are fed to the model at once.

learning_rate: how large each update step is. Learning too fast can overshoot and give a worse result.
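To make these numbers concrete: MNIST's training split has 60,000 images, so these settings imply the step counts below (simple arithmetic I added, not from the original post):

import math

steps_per_epoch = math.ceil(60000 / batch_size)  # 60,000 MNIST training images -> 938 batches
total_updates = steps_per_epoch * epochs         # 3 epochs -> 2814 parameter updates
print(steps_per_epoch, total_updates)            # 938 2814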

loss_fn = nn.CrossEntropyLoss()

This is a common loss function for classification. It applies a log-softmax to the raw logits and then takes the negative log-probability of the correct class; working with logarithms like this is more convenient and numerically stable than multiplying raw probabilities.
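To see what CrossEntropyLoss actually computes, here is a minimal NumPy sketch for a single sample (an illustration I added, not the MindSpore implementation):

import numpy as np

logits = np.array([2.0, 1.0, 0.1])              # raw network outputs for 3 classes
label = 0                                       # index of the correct class
probs = np.exp(logits) / np.exp(logits).sum()   # softmax turns logits into probabilities
loss = -np.log(probs[label])                    # negative log-probability of the true class
print(loss)                                     # ~0.417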

optimizer = nn.SGD(model.trainable_params(), learning_rate=learning_rate)

The line above creates the optimizer, the machinery that turns gradients into parameter updates; different optimizers use different tricks to reach good accuracy with less computation. SGD (stochastic gradient descent) estimates the gradient from a mini-batch of samples instead of the whole dataset.
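The update rule itself is tiny. A NumPy sketch of a single SGD step (illustration only; the real optimizer applies this to every trainable parameter):

import numpy as np

w = np.array([0.5, -0.3])        # current parameter values
grad = np.array([0.2, -0.1])     # gradient of the loss with respect to w
learning_rate = 1e-2
w = w - learning_rate * grad     # step against the gradient
print(w)                         # [ 0.498 -0.299]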

Here is the training process.

# define a forward function that runs the model and computes the loss
def forward_fn(data, label):
    logits = model(data)
    loss = loss_fn(logits, label)
    return loss, logits

# get a function that returns the loss together with the gradients of the trainable parameters
grad_fn = mindspore.value_and_grad(forward_fn, None, optimizer.parameters, has_aux=True)

# define a single training step: compute the gradients, then let the optimizer apply them
def train_step(data, label):
    (loss, _), grads = grad_fn(data, label)
    optimizer(grads)
    return loss

# loop over every batch so the model sees all of the chosen training data once per epoch
def train_loop(model, dataset):
    size = dataset.get_dataset_size()
    model.set_train()
    for batch, (data, label) in enumerate(dataset.create_tuple_iterator()):
        loss = train_step(data, label)

        if batch % 100 == 0:
            loss, current = loss.asnumpy(), batch
            print(f"loss: {loss:>7f} [{current:>3d}/{size:>3d}]")
# finally, the test loop is similar
def test_loop(model, dataset, loss_fn):
    num_batches = dataset.get_dataset_size()
    model.set_train(False)
    total, test_loss, correct = 0, 0, 0
    for data, label in dataset.create_tuple_iterator():
        pred = model(data)
        total += len(data)
        test_loss += loss_fn(pred, label).asnumpy()  # loss_fn returns the batch's mean loss; accumulate it
        correct += (pred.argmax(1) == label).asnumpy().sum()  # count the correct predictions
    test_loss /= num_batches  # average loss per batch
    correct /= total          # fraction of correct predictions
    print(f"Test: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f}\n")

What does the code above do?

The model predicts, we compute the loss and its gradients, the optimizer updates the parameters, and then we test. What we can revise are the hyperparameters. It is like changing the temperature to cook a good soup: we cannot know in advance whether more salt or less oil will make it taste better, so we just try!
