Notes on: Recurrent Neural Networks Tutorial, Part 2 – Implementing a RNN with Python, Numpy and Theano
These notes mainly record the loss calculation and SGD described in the tutorial above.
The author uses the cross-entropy loss:

L(y, o) = -(1/N) * Σ_n y_n log(o_n)

where N is the number of training examples (words), y_n is the correct word at step n, and o_n is the predicted probability of that word.
Here each word is treated as one training example and each sentence as a mini-batch. The code that computes the loss over the whole training corpus is:
def calculate_total_loss(self, x, y):
    L = 0
    # For each sentence...
    for i in np.arange(len(y)):
        o, s = self.forward_propagation(x[i])
        # We only care about our prediction of the "correct" words
        correct_word_predictions = o[np.arange(len(y[i])), y[i]]
        # Add to the loss based on how off we were
        L += -1 * np.sum(np.log(correct_word_predictions))
    return L
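The line `o[np.arange(len(y[i])), y[i]]` uses NumPy fancy indexing to pick out, for each time step, the predicted probability of the correct word. A minimal self-contained check of that trick (vocabulary size 3, sentence of length 2, made-up probabilities):

```python
import numpy as np

# o[t, w] = predicted probability of word w at time step t
o = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.1, 0.8]])
y = np.array([1, 2])  # index of the correct word at each time step

# Pair row t with column y[t]: picks o[0, 1] and o[1, 2]
correct = o[np.arange(len(y)), y]   # -> [0.5, 0.8]

# Cross-entropy contribution of this "sentence"
loss = -np.sum(np.log(correct))
```

This is exactly what the loop above accumulates across all sentences in the corpus.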
def calculate_loss(self, x, y):
    # Divide the total loss by the number of training examples (words)
    N = np.sum([len(y_i) for y_i in y])
    return self.calculate_total_loss(x, y) / N
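The notes also mention SGD. As a minimal sketch of the update rule (not the tutorial's full BPTT-based step): each parameter is moved against its gradient, scaled by the learning rate. Here `grad_fn` is a hypothetical stand-in for whatever supplies the gradients (in the RNN, backpropagation through time):

```python
import numpy as np

def sgd_step(params, grad_fn, learning_rate=0.005):
    # One SGD update: p <- p - learning_rate * dL/dp
    grads = grad_fn(params)
    return [p - learning_rate * g for p, g in zip(params, grads)]

# Toy usage: minimize f(w) = w^2, whose gradient is 2w.
w = [np.array(4.0)]
for _ in range(200):
    w = sgd_step(w, lambda ps: [2 * ps[0]], learning_rate=0.1)
# after enough steps, w[0] approaches the minimum at 0
```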