- I've been working on my graduation project lately, and the data set is large enough that I have to train with mini-batches. After asking some classmates, the idea behind mini-batching turns out to be fairly simple: pad every sentence in a batch to the length of the longest one, keep a mask matrix that records each sentence's real length, and at every step of the LSTM's scan pass the intermediate c and h values through the mask. Enough talk; here is the code:
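The masking trick described above can be sketched in plain numpy (the function name and shapes here are my own illustration; in the real model this runs inside theano.scan): padded positions simply carry the previous c and h forward instead of taking the freshly computed values.

```python
import numpy

def masked_step(h_prev, c_prev, h_new, c_new, mask):
    """Apply a per-sentence mask to one LSTM step.

    mask: shape (batch,), 1.0 where the sentence still has a real token
    at this time step, 0.0 where it is padding.
    """
    m = mask[:, None]                       # broadcast over hidden units
    h = m * h_new + (1.0 - m) * h_prev      # padded rows keep old h
    c = m * c_new + (1.0 - m) * c_prev      # padded rows keep old c
    return h, c
```

With mask = [1, 0], the first sentence's state is updated while the second (already past its length) keeps its previous state unchanged.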
- First, I initialize the parameters the model needs:
import numpy
import theano

# rng (a numpy RandomState), initialize_range, n_in and n_h are defined elsewhere.
# Each weight tensor stacks one matrix per gate along its two leading axes.

# W_s: input-to-hidden weights for the 4 LSTM gates.
W_value = numpy.asarray(rng.uniform(
    low=-initialize_range,
    high=initialize_range,
    size=(2, 4, n_in, n_h)), dtype=theano.config.floatX)
W_s = theano.shared(value=W_value, name="W_s", borrow=True)

# U_s: hidden-to-hidden (recurrent) weights.
U_value = numpy.asarray(rng.uniform(
    low=-initialize_range,
    high=initialize_range,
    size=(2, 4, n_h, n_h)), dtype=theano.config.floatX)
U_s = theano.shared(value=U_value, name="U_s", borrow=True)

# b_s: gate biases.
b_value = numpy.asarray(rng.uniform(
    low=-initialize_range,
    high=initialize_range,
    size=(2, 4, n_h)), dtype=theano.config.floatX)
b_s = theano.shared(value=b_value, name="b_s", borrow=True)
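To make the shapes concrete, here is a numpy sketch (toy sizes of my own choosing, not from the model) of how one gate's pre-activation would be computed from tensors shaped like the ones above; indexing the two leading axes selects one weight stack and one of the four LSTM gates:

```python
import numpy

rng = numpy.random.RandomState(0)
n_in, n_h, batch = 5, 3, 2

# Same layout as the shared variables above: (2, 4, n_in, n_h) etc.
W = rng.uniform(-0.1, 0.1, size=(2, 4, n_in, n_h))
U = rng.uniform(-0.1, 0.1, size=(2, 4, n_h, n_h))
b = rng.uniform(-0.1, 0.1, size=(2, 4, n_h))

x = rng.uniform(-1, 1, size=(batch, n_in))   # one time step of input
h = numpy.zeros((batch, n_h))                # previous hidden state

d, g = 0, 0  # stack index and gate index
pre = x.dot(W[d, g]) + h.dot(U[d, g]) + b[d, g]  # shape (batch, n_h)
```

Each of the 8 (d, g) pairs selects an independent (n_in, n_h) input matrix, an (n_h, n_h) recurrent matrix and an (n_h,) bias.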
v_o_value = numpy.asarray(rng.uniform(
low=-initialize_range,
high=initialize_range,
si