Project Training (5)
This post continues the previous one and analyzes the neural network used in the project.
Implementing the NN
Basic structure
The neural network in the project is implemented by hand, without third-party libraries such as PyTorch or TensorFlow.
It is a fully connected network with a single hidden layer.
The constructor takes three size parameters: input_size is the number of neurons in the input layer, hidden_size the number in the hidden layer, and output_size the number in the output layer; file_name names the file that stores trained parameters.

def __init__(self, input_size, hidden_size, output_size, file_name):
    I, H, O = input_size, hidden_size, output_size
    if not os.path.exists(file_name):
        # No saved parameters: initialize weights with small random
        # values and biases with zeros
        W1 = 0.01 * np.random.randn(I, H)
        b1 = np.zeros(H)
        W2 = 0.01 * np.random.randn(H, O)
        b2 = np.zeros(O)
    else:
        # Load previously trained parameters
        with open(file_name, 'rb') as f:
            W1, b1, W2, b2 = pickle.load(f)

    # Build the layers
    self.layers = [
        Affine(W1, b1),
        # Sigmoid(),
        Affine(W2, b2)
    ]
    self.loss_layer = SoftmaxWithLoss()

    # Collect all weights and gradients into lists
    self.params, self.grads = [], []
    for layer in self.layers:
        self.params += layer.params
        self.grads += layer.grads
Internally, the constructor first checks whether trained parameters already exist on disk; if they do, it loads and uses them. Otherwise it initializes the biases as zero vectors (np.zeros()) and the weights as small random values (0.01 * np.random.randn()). Starting from small random weights makes learning easier to get going.
We implement three methods for TwoLayerNet: predict() for inference, forward() for the forward pass, and backward() for backpropagation.
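The Affine and SoftmaxWithLoss layers used above are not shown in this post. As a reference, here is a minimal sketch of an Affine (fully connected) layer that follows the interface the network assumes (a params list, a grads list, and forward()/backward() methods); the exact implementation in the project may differ.

```python
import numpy as np

class Affine:
    """Fully connected layer: y = x @ W + b."""
    def __init__(self, W, b):
        self.params = [W, b]
        self.grads = [np.zeros_like(W), np.zeros_like(b)]
        self.x = None

    def forward(self, x):
        W, b = self.params
        self.x = x  # cache the input for the backward pass
        return np.dot(x, W) + b

    def backward(self, dout):
        W, b = self.params
        dx = np.dot(dout, W.T)           # gradient w.r.t. the input
        dW = np.dot(self.x.T, dout)      # gradient w.r.t. the weights
        db = np.sum(dout, axis=0)        # gradient w.r.t. the bias
        # Write into the existing arrays so the optimizer sees them
        self.grads[0][...] = dW
        self.grads[1][...] = db
        return dx
```

Writing gradients with `grads[0][...] = dW` keeps the array objects in `self.grads` stable, which matters because the network aggregated references to them in its own `grads` list.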
def predict(self, x):
    for layer in self.layers:
        x = layer.forward(x)
    return x

def forward(self, x, t):
    score = self.predict(x)
    loss = self.loss_layer.forward(score, t)
    return loss

def backward(self, dout=1):
    dout = self.loss_layer.backward(dout)
    for layer in reversed(self.layers):
        dout = layer.backward(dout)
    return dout
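forward() and backward() delegate the loss computation to loss_layer. A minimal SoftmaxWithLoss sketch, assuming integer class labels (this is one common way to implement it; the project's version may differ in details):

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)

class SoftmaxWithLoss:
    """Softmax followed by cross-entropy loss (labels as class indices)."""
    def __init__(self):
        self.params, self.grads = [], []
        self.y = None  # softmax output
        self.t = None  # labels

    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        n = x.shape[0]
        # Average negative log-likelihood of the correct classes
        return -np.sum(np.log(self.y[np.arange(n), t] + 1e-7)) / n

    def backward(self, dout=1):
        n = self.t.shape[0]
        dx = self.y.copy()
        dx[np.arange(n), self.t] -= 1  # softmax + cross-entropy gradient: y - t
        return dx * dout / n
```

Combining softmax and cross-entropy in one layer gives the simple gradient y - t, which is why the network's score layer needs no separate activation.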
Training

model = TwoLayerNet(input_size=inputDim, hidden_size=hidden_size,
                    output_size=outputSize, file_name='two_layer_net.pkl')
optimizer = SGD(lr=learning_rate)
trainer = Trainer(model, optimizer)
trainer.fit(x, t, max_epoch, batch_size, eval_interval=10)

The Trainer is constructed with the neural network (the model) and the optimizer; calling fit() starts training.
The fit() function
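The SGD optimizer only needs an update(params, grads) method. A minimal sketch of plain stochastic gradient descent (the project's SGD presumably looks like this, but the exact class is not shown in the post):

```python
import numpy as np

class SGD:
    """Vanilla stochastic gradient descent: p <- p - lr * grad."""
    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params, grads):
        for p, g in zip(params, grads):
            p -= self.lr * g  # in-place update, so the layers see the change

# Usage: one update step on a single parameter array
params = [np.array([1.0, 2.0])]
grads = [np.array([0.5, 0.5])]
SGD(lr=0.1).update(params, grads)
```

The in-place `p -= ...` is essential: the same arrays are referenced from the layers' params lists, so updating them in place updates the network.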
def fit(self, x, t, max_epoch=10, batch_size=32, max_grad=None, eval_interval=20):
    data_size = len(x)
    max_iters = data_size // batch_size
    self.eval_interval = eval_interval
    model, optimizer = self.model, self.optimizer
    total_loss = 0
    loss_count = 0

    start_time = time.time()
    for epoch in range(max_epoch):
        # Shuffle the training data each epoch
        idx = np.random.permutation(np.arange(data_size))
        x = x[idx]
        t = t[idx]

        for iters in range(max_iters):
            batch_x = x[iters * batch_size:(iters + 1) * batch_size]
            batch_t = t[iters * batch_size:(iters + 1) * batch_size]

            # Compute gradients and update the parameters
            loss = model.forward(batch_x, batch_t)
            model.backward()
            params, grads = remove_duplicate(model.params, model.grads)  # merge shared weights into one
            if max_grad is not None:
                clip_grads(grads, max_grad)
            optimizer.update(params, grads)
            total_loss += loss
            loss_count += 1

            # Evaluation
            if (eval_interval is not None) and (iters % eval_interval) == 0:
                avg_loss = total_loss / loss_count
                elapsed_time = time.time() - start_time
                print('| epoch %d | iter %d / %d | time %d[s] | loss %.2f'
                      % (self.current_epoch + 1, iters + 1, max_iters, elapsed_time, avg_loss))
                self.loss_list.append(float(avg_loss))
                total_loss, loss_count = 0, 0

        self.current_epoch += 1
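fit() calls clip_grads() when max_grad is set. A minimal sketch of gradient clipping by global L2 norm, which is how the helper typically works in this codebase (the exact implementation is not shown in the post):

```python
import numpy as np

def clip_grads(grads, max_norm):
    """Scale all gradients in place if their combined L2 norm exceeds max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    rate = max_norm / (total_norm + 1e-6)  # small epsilon avoids division by zero
    if rate < 1:
        for g in grads:
            g *= rate  # in-place, so the optimizer sees the clipped values
```

Clipping by the global norm preserves the direction of the overall gradient while bounding its magnitude, which keeps a single bad batch from blowing up the parameters.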
Results
[Figure: training in progress]
[Figure: prediction results (trained on 200 samples, predicted on 100)]
The model currently reaches about 70% accuracy, which is a reasonable result, but still far from the real target. Next we will take a series of measures aimed at improving accuracy.
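For reference, the accuracy figure above can be computed by comparing the argmax of the scores returned by predict() against the true labels. A small helper along these lines (hypothetical name, not from the project code):

```python
import numpy as np

def accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    return float(np.mean(scores.argmax(axis=1) == labels))
```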
References
Deep Learning from Scratch 2: Natural Language Processing (深度学习进阶:自然语言处理), Koki Saito (斋藤康毅)