A small difference between training and prediction in Bi-LSTM+CRF

Each RNA Xi in an RNA sequence X is an observed variable, and each label yi in the label sequence y is a hidden variable. First, a Bi-LSTM network takes X as input and generates the emission scores P for y (Pi,yi is the emission score for tagging RNA Xi with label yi). Then we use the hidden-state transition matrix A (Ayi,yi+1 is the transition score from hidden state yi to yi+1), which is a learnable parameter of the CRF layer, to score transitions between adjacent labels. This transition matrix is initialized randomly and updated by backpropagation. We define the score of a label sequence y for an RNA sequence X of length n as below:

score(X, y) = Σ_{i=1}^{n} P_{i,y_i} + Σ_{i=1}^{n-1} A_{y_i,y_{i+1}}
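As a minimal sketch of this score in plain Python (the list layout `emissions[i][j]` for P_{i,j} and `transitions[a][b]` for A_{a,b}, and the absence of START/STOP states, are assumptions of this example, not taken from the original):

```python
def path_score(emissions, transitions, labels):
    """Score of one label sequence: the sum of emission scores P[i][y_i]
    plus transition scores A[y_{i-1}][y_i] between adjacent labels."""
    score = emissions[0][labels[0]]
    for i in range(1, len(labels)):
        score += transitions[labels[i - 1]][labels[i]] + emissions[i][labels[i]]
    return score
```

For a sequence of length n with m possible labels, `emissions` is an n×m table produced by the Bi-LSTM and `transitions` is the m×m matrix A learned by the CRF layer.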

A Softmax over the scores of all possible label sequences (even ones that may never appear in the data) turns them into probabilities; the probability of the correct label sequence y is

P(y|X) = exp(score(X, y)) / Σ_{y′ ∈ Y_X} exp(score(X, y′))

where Y_X is the set of all possible label sequences for X. Taking the logarithm, we get

log P(y|X) = score(X, y) − log Σ_{y′ ∈ Y_X} exp(score(X, y′))

We take the negative of this log-probability as the loss function, so we can train the model with gradient descent. During training, instead of committing to a concrete prediction at each position, we use the emission scores produced by the Bi-LSTM to compute the loss; the normalizing sum over all label sequences is computed efficiently by the forward algorithm. Once training is finished, this computation is no longer needed. To predict the label sequence of a new RNA sequence, we feed the sequence into the Bi-LSTM to get its emission scores, then run the Viterbi algorithm, which takes the emission scores and the learned transition matrix A, to find the highest-scoring path. That best path is the predicted label sequence.
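Viterbi decoding can be sketched as follows (again assuming an `emissions[i][j]` / `transitions[a][b]` score layout and no START/STOP states; these are choices of this example, not of the original):

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label path given the Bi-LSTM emission
    scores and the learned CRF transition matrix A."""
    n, m = len(emissions), len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in label j
    backptr = []                # backpointers for path recovery
    for i in range(1, n):
        new_score, ptr = [], []
        for j in range(m):
            best_k = max(range(m), key=lambda k: score[k] + transitions[k][j])
            ptr.append(best_k)
            new_score.append(score[best_k] + transitions[best_k][j] + emissions[i][j])
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final label.
    best = max(range(m), key=lambda k: score[k])
    path = [best]
    for ptr in reversed(backptr):
        best = ptr[best]
        path.append(best)
    return path[::-1]
```

Unlike the training-time forward algorithm, which sums over all paths, Viterbi maximizes over them, so it needs backpointers to recover the single best label sequence.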
