I've recently been watching Mu Li's Dive into Deep Learning videos on Bilibili; these are my study notes.
See the Jupyter notebook version of this post for a cleaner, nicer layout!
Multilayer Perceptron
A multilayer perceptron adds one or more fully connected hidden layers to a single-layer neural network. Since stacking several affine transformations of fully connected layers still yields an affine transformation, a nonlinear unit, the activation function, is applied after each hidden layer. Three activation functions, the ReLU, sigmoid, and tanh (hyperbolic tangent) functions, are introduced below:
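Concretely, with a single hidden layer the model computes H = φ(XW1 + b1) and O = HW2 + b2, where φ is the activation function applied element-wise. If φ were the identity, substituting H into O would give O = X(W1W2) + (b1W2 + b2), again a single affine transformation, which is why the nonlinearity is essential.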
The ReLU (rectified linear unit) function
ReLU(x) = max(x, 0).
%matplotlib inline
import d2lzh as d2l
from mxnet import autograd, nd

def xyplot(x_vals, y_vals, name):
    """Plot the function named `name` against x, given NDArrays x_vals and y_vals."""
    d2l.set_figsize(figsize=(5, 2.5))
    d2l.plt.plot(x_vals.asnumpy(), y_vals.asnumpy())
    d2l.plt.xlabel('x')
    d2l.plt.ylabel(name + '(x)')

x = nd.arange(-8.0, 8.0, 0.1)
x.attach_grad()  # allocate gradient storage so we can differentiate later
with autograd.record():
    y = x.relu()
xyplot(x, y, 'relu')

Plot the derivative of the ReLU function
y.backward()
xyplot(x, x.grad, 'grad of relu')
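As a quick sanity check (my own addition, not from the original notebook): the ReLU gradient should be 1 where x > 0 and 0 elsewhere (MXNet defines it as 0 at x = 0), so we can compare it with the autograd result element-wise:
manual_grad = x > 0                        # NDArray of 1.0 where x > 0, else 0.0
print(nd.abs(manual_grad - x.grad).sum())  # expect (near) 0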

The sigmoid function
sigmoid(x) = 1/(1+exp(-x))
with autograd.record():
    y = x.sigmoid()
xyplot(x, y, 'sigmoid')

Plot the derivative of the sigmoid function
y.backward()
xyplot(x, x.grad, 'grad of sigmoid')
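For reference, the curve above has the closed form sigmoid'(x) = sigmoid(x)(1 - sigmoid(x)): it peaks at 0.25 when x = 0 and approaches 0 for large |x|.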

The tanh (hyperbolic tangent) function
tanh(x) = (1-exp(-2x))/(1+exp(-2x))
with autograd.record():
    y = x.tanh()
xyplot(x, y, 'tanh')

Plot the derivative of the tanh function
y.backward()
xyplot(x, x.grad, 'grad of tanh')
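For reference, tanh'(x) = 1 - tanh(x)^2, which equals 1 at x = 0 and decays to 0 as |x| grows.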

Implementing a Multilayer Perceptron from Scratch
%matplotlib inline
from mxnet import nd
import d2lzh as d2l
from mxnet.gluon import loss as gloss
Read the dataset; we will classify the Fashion-MNIST dataset
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
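To see what the iterators yield, we can peek at one batch (a quick check I added; shapes assumed for Fashion-MNIST with batch_size = 256):
for X, y in train_iter:
    print(X.shape, y.shape)  # expect (256, 1, 28, 28) and (256,)
    break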
Define the model parameters
num_inputs, num_hiddens, num_outputs = 784, 256, 10
W1 = nd.random.normal(scale=0.01, shape=(num_inputs, num_hiddens))
b1 = nd.zeros(num_hiddens)
W2 = nd.random.normal(scale=0.01, shape=(num_hiddens, num_outputs))
b2 = nd.zeros(num_outputs)
params = [W1, b1, W2, b2]
for param in params:
    param.attach_grad()
Define the loss, using Gluon's function that combines the softmax computation and the cross-entropy loss
loss = gloss.SoftmaxCrossEntropyLoss()
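The fused function is preferred over computing softmax and then cross-entropy separately because exponentiating large logits overflows; a log-sum-exp style computation avoids this. A sketch of the stable computation (my own illustration, not Gluon's actual implementation):
def stable_softmax_xent(logits, label):
    # subtract the per-row max so exp() cannot overflow
    m = logits.max(axis=1, keepdims=True)
    log_sum_exp = (logits - m).exp().sum(axis=1, keepdims=True).log() + m
    # cross-entropy = log(sum of exp(logits)) - logit of the true class
    return log_sum_exp.reshape((-1,)) - nd.pick(logits, label)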
Define the activation function
def relu(X):
    return nd.maximum(X, 0)
Define the model: a multilayer perceptron with a single hidden layer
def net(X):
    X = X.reshape((-1, num_inputs))  # flatten each image into a 784-dimensional vector
    H = relu(nd.dot(X, W1) + b1)     # hidden layer
    return nd.dot(H, W2) + b2        # output layer (logits; softmax is inside the loss)
Train the model
num_epochs, lr = 5, 0.5
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, params, lr)
epoch 1, loss 0.8149, train acc 0.697, test acc 0.824
epoch 2, loss 0.4855, train acc 0.820, test acc 0.846
epoch 3, loss 0.4256, train acc 0.843, test acc 0.856
epoch 4, loss 0.3952, train acc 0.854, test acc 0.864
epoch 5, loss 0.3745, train acc 0.863, test acc 0.871
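For intuition, the loop inside d2l.train_ch3 is essentially the following (a minimal sketch of minibatch SGD inferred from the function's arguments, not d2lzh's exact code):
from mxnet import autograd

def train_sketch(net, train_iter, loss, num_epochs, batch_size, params, lr):
    for epoch in range(num_epochs):
        for X, y in train_iter:
            with autograd.record():
                l = loss(net(X), y)  # vector of per-example losses
            l.backward()             # same as l.sum().backward()
            for param in params:     # minibatch SGD update
                param[:] = param - lr * param.grad / batch_size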
Concise Implementation of a Multilayer Perceptron
%matplotlib inline
import d2lzh as d2l
from mxnet import gluon, init
from mxnet.gluon import loss as gloss, nn
Define the model
net = nn.Sequential()
net.add(nn.Dense(256, activation='relu'),  # hidden layer with 256 units
        nn.Dense(10))                      # output layer
net.initialize(init.Normal(sigma=0.01))
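Note that nn.Dense(256) does not specify its input dimension: Gluon defers parameter allocation until the first forward pass, when the shape can be inferred. A quick check (my addition; input assumed to flatten to 784 dimensions):
from mxnet import nd
net(nd.random.normal(shape=(2, 784)))  # first forward pass triggers shape inference
print(net[0].weight.data().shape)      # expect (256, 784): Gluon stores weights as (out, in)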
Train the model
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
loss = gloss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.5})
num_epochs = 5
# params and lr are None here: parameter updates are delegated to the Gluon trainer
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, trainer)
epoch 1, loss 0.8009, train acc 0.699, test acc 0.824
epoch 2, loss 0.4920, train acc 0.817, test acc 0.846
epoch 3, loss 0.4289, train acc 0.842, test acc 0.846
epoch 4, loss 0.3952, train acc 0.854, test acc 0.867
epoch 5, loss 0.3725, train acc 0.861, test acc 0.865
