Welcome to the CSDN-markdown editor

0. Estimate p(sent1, sent2) = ? for every sentence pair? WRONG!
1. The real question is whether p(sent1, sent2) > p(sent3, sent4).
=> a ranking problem; pairwise partial orders may conflict
=> so we need a total order
    1.1 => distance measures, e.g. cosine, Euclidean
    1.2 => can we solve it with probability?
          => yes
          1.2.1 p(sent2 | sent1) = ?, p(sent1 | sent2) = ?. DIST is p(sent2 | sent1) * p(sent1 | sent2) (really a similarity score: larger means closer). Each factor is the probability that one sentence can be inferred from the other.
          1.2.2 p(word2_1, word2_2, ... | word1_1, word1_2, ...)
          1.2.3 p(word2_1, word2_2, ... | word1_1) * p(word2_1, word2_2, ..., word1_2 | word1_1). Word order is of no use, so this is dropped.
          1.2.4 p(sent2 | sent1) = p(sent2 | word1_1) * p(word2_1 | word1_1) * p(sent2 | word2_1). The problem is that p(y | x) here reflects similarity, not order.
          // known: Sim(word1_1, word2_1) = ..., Sim(word1_1, word2_2) = ... =>
          // p(word2_1 | word1_1) = p(word2_1, word1_1) / sum(p(word, word1_1))
      /***************************************************************************/
          1.2.1 P(SENT2 | SENT1) = P(WORD2_1 | WORD1_1, WORD1_2, WORD1_3, ...) * P(WORD2_2 | WORD1_1, WORD1_2, WORD1_3, ...) * ...
          // => P(WORD2_1, WORD1_1, ...) / P(WORD1_1, WORD1_2, WORD1_3, ...) ...
          1.2.2 P(SENT2 | SENT1) = P(SENT2_1 | WORD1_1, WORD1_2, WORD1_3, ...) * P(WORD2_2 | WORD1_1, WORD1_2, WORD1_3, ...)
          => P(SENT2_1, WORD1_1, ...) / P(WORD1_1, WORD1_2, WORD1_3, ...) ...
          => P(SENT2_1) / P(WORD1_1, WORD1_2, WORD1_3, ...) ...
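The probabilistic idea in the notes above can be sketched in Python. This is only one possible reading: the similarity table `SIM` is hypothetical toy data, the word-level probability normalizes similarities as in the commented formula, and averaging over the words of the conditioning sentence to get p(word | sentence) is an assumption that the notes do not make. Logs are used so that a product of many small probabilities does not underflow.

```python
import math

# Hypothetical toy table: SIM[w1][w2] = Sim(w1, w2) >= 0 (not from the notes).
SIM = {
    "cat":    {"cat": 1.0, "kitten": 0.8, "dog": 0.2},
    "kitten": {"cat": 0.8, "kitten": 1.0, "dog": 0.2},
    "dog":    {"cat": 0.2, "kitten": 0.2, "dog": 1.0},
}

def p_word_given_word(w2, w1, sim=SIM):
    """p(w2 | w1) = Sim(w1, w2) / sum over all words w of Sim(w1, w)."""
    row = sim[w1]
    return row.get(w2, 0.0) / sum(row.values())

def p_word_given_sent(w2, sent1, sim=SIM):
    """ASSUMPTION: p(w2 | sent1) is the mean of p(w2 | w1) over w1 in sent1."""
    return sum(p_word_given_word(w2, w1, sim) for w1 in sent1) / len(sent1)

def log_p_sent_given_sent(sent2, sent1, sim=SIM, eps=1e-12):
    """log P(sent2 | sent1) = sum over j of log p(word2_j | sent1)."""
    return sum(math.log(p_word_given_sent(w, sent1, sim) + eps) for w in sent2)

def symmetric_score(sent1, sent2, sim=SIM):
    """Log of p(sent2 | sent1) * p(sent1 | sent2); larger means more similar."""
    return (log_p_sent_given_sent(sent2, sent1, sim)
            + log_p_sent_given_sent(sent1, sent2, sim))

# Ranking question from the notes: is pair (a, b) closer than pair (a, c)?
closer = symmetric_score(["cat"], ["kitten"]) > symmetric_score(["cat"], ["dog"])
```

Because every pair gets a single real-valued score, sorting by `symmetric_score` gives a total order and sidesteps conflicting pairwise comparisons.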
Algorithm:
step1: the top 1000 represent sent1
step2: get P: the number of hits in the top 1000 / len(sent1) = P(SENT2_1) / P(WORD1_1, WORD1_2, WORD1_3, ...)
step3: get R, the same way as in step2
step4: over the top 1000, F1 = 2.0*P*R/(P+R)
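A minimal Python sketch of the four steps follows, under stated assumptions. The notes do not say how the top 1000 are chosen or whose words count as hits. Here each sentence is represented by its own words plus their highest-scoring related words from a hypothetical table `RELATED`, P is the fraction of sent1's words found in sent2's representation, and R is the same with the sentences swapped. The names `top_k_words`, `coverage` and `f1_similarity` are illustrative only.

```python
from collections import Counter

# Hypothetical related-word table with similarity weights (toy data).
RELATED = {
    "cat": {"kitten": 0.9, "pet": 0.5},
    "kitten": {"cat": 0.9},
    "dog": {"puppy": 0.9, "pet": 0.6},
    "puppy": {"dog": 0.9},
}

def top_k_words(sentence, related=RELATED, k=1000):
    """step1: represent a tokenized sentence by its k highest-scoring words."""
    scores = Counter()
    for word in sentence:
        scores[word] += 1.0  # a word always represents itself
        for other, sim in related.get(word, {}).items():
            scores[other] += sim
    return {w for w, _ in scores.most_common(k)}

def coverage(sentence, representation):
    """Fraction of the sentence's tokens that occur in the representation."""
    if not sentence:
        return 0.0
    return sum(1 for w in sentence if w in representation) / len(sentence)

def f1_similarity(sent1, sent2, related=RELATED, k=1000):
    """steps 2-4: P, R, then F1 = 2.0*P*R/(P+R); returns 0.0 when P + R == 0."""
    p = coverage(sent1, top_k_words(sent2, related, k))  # hits / len(sent1)
    r = coverage(sent2, top_k_words(sent1, related, k))  # hits / len(sent2)
    if p + r == 0:
        return 0.0
    return 2.0 * p * r / (p + r)

# Rank candidate sentences against a query by F1, highest first.
query = ["cat", "sleeps"]
candidates = [["dog", "barks"], ["kitten", "sleeps"], ["cat", "barks"]]
ranked = sorted(candidates, key=lambda s: f1_similarity(query, s), reverse=True)
```

Sorting by a single F1 value per pair again yields a total order over candidates, which is what the ranking formulation asks for.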