Single-Layer Perceptron
An explanation of the perceptron's principles: https://www.cnblogs.com/turingbrain/p/7355265.html
Training the perceptron
Convergence theorem
Problems with the perceptron
It cannot fit the XOR problem; a single perceptron can only produce linear decision boundaries.
Summary
Multilayer Perceptron
A two-layer perceptron can handle the XOR problem (see the sketch after this list).
A linear activation function f(x) = x is pointless for stacking: two linear layers compose into (XW1 + b1)W2 + b2 = X(W1W2) + (b1W2 + b2), which is still a single linear layer.
Nonlinear activation functions:
sigmoid: maps inputs into (0, 1)
tanh: maps inputs into (-1, 1)
ReLU: max(0, x); the most commonly used, because it is simple and has no exponential to compute
Multiple hidden layers: the final (output) layer needs no activation function. Activations exist to keep the stacked layers from collapsing into a single linear layer; the output itself does not need that protection.
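A minimal sketch of why two layers suffice for XOR, using hand-picked (not learned) weights: on binary inputs, relu(x1 + x2) - 2·relu(x1 + x2 - 1) equals XOR(x1, x2).
import torch
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
W1 = torch.tensor([[1., 1.], [1., 1.]])  # both hidden units compute x1 + x2
b1 = torch.tensor([0., -1.])             # second unit fires only when both inputs are 1
H = torch.relu(X @ W1 + b1)
W2 = torch.tensor([[1.], [-2.]])         # output = h1 - 2 * h2
print((H @ W2).flatten())                # tensor([0., 1., 1., 0.]) -- exactly XOR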
Summary
Implementation from scratch
1. Load the data
import torch
from torch import nn
from d2l import torch as d2l
batch_size = 256
# Iterators over (image, label) minibatches of Fashion-MNIST
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
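A quick check of what the iterators yield (one minibatch of Fashion-MNIST):
X, y = next(iter(train_iter))
print(X.shape, y.shape)  # torch.Size([256, 1, 28, 28]) torch.Size([256])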
2. Initialize model parameters
Wrapping tensors in nn.Parameter is optional here; it merely declares them as model parameters.
num_inputs, num_outputs, num_hiddens = 784, 10, 256
# Small random init (scale 0.01) breaks the symmetry between hidden units
W1 = nn.Parameter(
    torch.randn(num_inputs, num_hiddens, requires_grad=True) * 0.01)
b1 = nn.Parameter(torch.zeros(num_hiddens, requires_grad=True))
W2 = nn.Parameter(
    torch.randn(num_hiddens, num_outputs, requires_grad=True) * 0.01)
b2 = nn.Parameter(torch.zeros(num_outputs, requires_grad=True))
params = [W1, b1, W2, b2]
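A quick sanity check on the shapes of the 784 → 256 → 10 setup:
print([p.shape for p in params])
# [torch.Size([784, 256]), torch.Size([256]), torch.Size([256, 10]), torch.Size([10])]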
3. Activation function
def relu(X):
    # Element-wise max(x, 0)
    a = torch.zeros_like(X)
    return torch.max(X, a)
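For example:
print(relu(torch.tensor([-2.0, 0.0, 3.0])))  # tensor([0., 0., 3.])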
4. Model
def net(X):
    X = X.reshape((-1, num_inputs))  # flatten each 28x28 image to a 784-vector
    H = relu(X @ W1 + b1)  # "@" denotes matrix multiplication
    return H @ W2 + b2  # raw logits; softmax is folded into the loss below
5. Loss function
loss = nn.CrossEntropyLoss()
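nn.CrossEntropyLoss applies log-softmax internally before the negative log-likelihood, which is why net returns raw logits. A small check with made-up numbers:
logits = torch.tensor([[2.0, 0.5, -1.0]])
label = torch.tensor([0])
manual = -torch.log_softmax(logits, dim=1)[0, 0]
print(torch.isclose(loss(logits, label), manual))  # tensor(True)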
6. Training
num_epochs, lr = 10, 0.1
updater = torch.optim.SGD(params, lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
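d2l.train_ch3 hides the training loop; roughly (a sketch, not the actual d2l source), each epoch does:
for X, y in train_iter:
    l = loss(net(X), y)   # forward pass and loss
    updater.zero_grad()   # clear old gradients
    l.backward()          # backpropagate
    updater.step()        # SGD parameter update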
Concise implementation
import torch
from torch import nn
from d2l import torch as d2l
net = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                    nn.Linear(256, 10))
def init_weights(m):
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)
net.apply(init_weights)
batch_size, lr, num_epochs = 256, 0.1, 10
loss = nn.CrossEntropyLoss()
trainer = torch.optim.SGD(net.parameters(), lr=lr)
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
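After training, test accuracy can be checked by hand (a minimal sketch; train_ch3 already reports it each epoch):
correct = total = 0
with torch.no_grad():  # no gradients needed for evaluation
    for X, y in test_iter:
        correct += (net(X).argmax(dim=1) == y).sum().item()
        total += y.numel()
print(f'test acc: {correct / total:.3f}')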
Questions
1. Counting layers: a weight matrix plus its activation (W + σ) counts as one layer, so the network above (one hidden layer plus the output layer) is a two-layer MLP.
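Under that convention only the nn.Linear modules carry weights (nn.Flatten and nn.ReLU have none), which a quick inspection of the concise net confirms:
for name, m in net.named_modules():
    if isinstance(m, nn.Linear):
        print(name, m)  # two Linear modules -> a two-layer network
print(sum(p.numel() for p in net.parameters()))  # 784*256 + 256 + 256*10 + 10 = 203530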