context_vector = math_ops.reduce_sum(
    array_ops.reshape(attn_dist, [batch_size, -1, 1, 1]) * encoder_state,
    [1, 2])  # shape (batch_size, attn_size)
encoder_state: (batch_size, attn_length, 1, attn_size)
array_ops.reshape(attn_dist, [batch_size, -1, 1, 1]): (batch_size, attn_length, 1, 1)
Their product broadcasts to: (batch_size, attn_length, 1, attn_size)
Summing over axes 1 and 2 then gives: (batch_size, attn_size)
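The shape bookkeeping above can be checked with a small NumPy sketch. NumPy stands in for the TensorFlow ops here only so that the example runs on its own. The concrete sizes (2, 3, 4) and the random inputs are assumptions chosen for illustration, and `weights`, `product` and `expected` are hypothetical names that do not appear in the original code.

```python
import numpy as np

# Illustrative sizes (assumed, not from the original code).
batch_size, attn_length, attn_size = 2, 3, 4

rng = np.random.default_rng(0)

# encoder_state: (batch_size, attn_length, 1, attn_size)
encoder_state = rng.standard_normal((batch_size, attn_length, 1, attn_size))

# attn_dist: (batch_size, attn_length), each row normalized to sum to 1
attn_dist = rng.random((batch_size, attn_length))
attn_dist /= attn_dist.sum(axis=1, keepdims=True)

# Reshape so the weights broadcast against encoder_state:
# (batch_size, attn_length) -> (batch_size, attn_length, 1, 1)
weights = attn_dist.reshape(batch_size, -1, 1, 1)

# Broadcast multiply: (batch_size, attn_length, 1, attn_size)
product = weights * encoder_state

# Sum over axes 1 and 2: (batch_size, attn_size)
context_vector = product.sum(axis=(1, 2))

print(weights.shape, product.shape, context_vector.shape)
# (2, 3, 1, 1) (2, 3, 1, 4) (2, 4)

# Cross-check: the same weighted sum written as an explicit contraction
# over the attention positions.
expected = np.einsum("bi,bik->bk", attn_dist, encoder_state[:, :, 0, :])
assert np.allclose(context_vector, expected)
```

The reshape to `[batch_size, -1, 1, 1]` adds two trailing singleton axes so that each attention weight scales the whole `attn_size` vector at its position. The sum over axes 1 and 2 then removes the position axis and the dummy axis of length 1.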