Welcome to the CSDN-markdown Editor

Latest recommended article published 2021-01-07 23:01:28

rgtjf

Category: Study. Tags: nlp, w2v

Article link: https://blog.csdn.net/rgtjf/article/details/50613075

0. Is the question "p(sent1, sent2) = ?" over all sentence_pair()? WRONG!
1. The real question is whether p(sent1, sent2) > p(sent3, sent4).
=> This is a ranking problem; pairwise partial orders may conflict.
=> So we need a total order.
    1.1 => Use a distance measure, e.g. cosine or Euclidean.
    1.2 => Can we solve it with probability?
          => Yes.
          1.2.1 p(sent2 | sent1) = ?, p(sent1 | sent2) = ? The distance is p(sent2 | sent1) * p(sent1 | sent2), where p represents the probability that one sentence can be inferred from the other.
          1.2.2 p(word2_1, word2_2, ... | word1_1, word1_2, ...)
          1.2.3 p(word2_1, word2_2, ... | word1_1) * p(word2_1, word2_2, ..., word1_2 | word1_1). Word order is of no use here, so this step was dropped.
          1.2.4 p(sent2 | sent1) = p(sent2 | word1_1) * p(word2_1 | word1_1) * p(sent2 | word2_1). The problem is that p(y|x) does not capture order, only similarity.
          // Known: Sim(word1_1, word2_1) = .. , Sim(word1_1, word2_2) = .. =>
          // p(word2_1 | word1_1) = p(word2_1, word1_1) / sum(p(word, word1_1))
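
The normalization in the two comment lines above can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the notes: the similarity scores are treated as unnormalized joint weights, negative scores (possible with cosine similarity) are clipped to zero, and the function name `conditional_from_sims` is hypothetical.

```python
def conditional_from_sims(sims):
    """Turn similarity scores into a conditional distribution.

    sims maps each candidate word to Sim(word, w1) for one fixed w1.
    Returns a dict mapping each word to p(word | w1).
    Assumption: negative similarities are clipped to 0 before normalizing.
    """
    clipped = {word: max(score, 0.0) for word, score in sims.items()}
    total = sum(clipped.values())
    if total == 0.0:
        # No positive evidence at all: return all zeros rather than divide by 0.
        return {word: 0.0 for word in sims}
    return {word: score / total for word, score in clipped.items()}


# Toy scores for a fixed w1 (made-up numbers, for illustration only).
probs = conditional_from_sims({"dog": 0.75, "car": 0.25, "sky": -0.5})
```

Clipping is one of several possible choices; a softmax over the scores would also give a valid distribution, with different behavior for low scores.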
      /***************************************************************************/
          1.2.1 P(SENT2|SENT1) = P(WORD2_1 | WORD1_1, WORD1_2, WORD1_3, ...) * P(WORD2_2 | WORD1_1, WORD1_2, WORD1_3, ...) * ..
          // => P(WORD2_1, WORD1_1, ....) / P(WORD1_1, WORD1_2, WORD1_3, ...) ..
          1.2.2 P(SENT2|SENT1) = P(SENT2_1 | WORD1_1, WORD1_2, WORD1_3, ...) * P(WORD2_2 | WORD1_1, WORD1_2, WORD1_3, ...)
          => P(SENT2_1, WORD1_1, ....) / P(WORD1_1, WORD1_2, WORD1_3, ...) ..
          => P(SENT2_1) / P(WORD1_1, WORD1_2, WORD1_3, ...) ...
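
For readability, the factorization in the revised 1.2.1 can be written out in LaTeX. The notation is an assumption made here: $s_1, s_2$ stand for SENT1 and SENT2, and $w_{i,j}$ for WORDi_j. The first line is the note's own modeling assumption (each word of sent2 is treated separately given all of sent1); the second line only applies the definition of conditional probability.

```latex
\begin{align*}
P(s_2 \mid s_1)
  &= \prod_{j} P\left(w_{2,j} \mid w_{1,1}, w_{1,2}, w_{1,3}, \ldots\right) \\
  &= \prod_{j} \frac{P\left(w_{2,j}, w_{1,1}, w_{1,2}, w_{1,3}, \ldots\right)}
                    {P\left(w_{1,1}, w_{1,2}, w_{1,3}, \ldots\right)}
\end{align*}
```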
Algorithm:
step1: use the top 1000 to represent sent1
step2: get P = (number of hits in the top 1000) / len(sent1) = P(SENT2_1) / P(WORD1_1, WORD1_2, WORD1_3, ...)
step3: get R, same as step2
step4: over the top 1000, F1 = 2.0*P*R/(P+R)
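
The four steps can be sketched roughly as follows. Several things here are assumptions rather than part of the notes: the "top 1000" is read as the k vocabulary words nearest (by cosine similarity of word vectors) to any word in the sentence, R is read as step 2 with the two sentences swapped, and the helpers `top_k_neighbors`, `coverage`, and `f1_similarity` as well as the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_neighbors(sent, vectors, k):
    """Step 1 (assumed reading): the k vocabulary words closest to any word in sent."""
    known = [w for w in sent if w in vectors]
    scores = {
        cand: max((cosine(vec, vectors[w]) for w in known), default=0.0)
        for cand, vec in vectors.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def coverage(src, tgt, vectors, k):
    """Step 2 as written in the notes: hits of tgt words in top-k(src), over len(src)."""
    if not src:
        return 0.0
    neighbors = top_k_neighbors(src, vectors, k)
    hits = sum(1 for w in tgt if w in neighbors)
    return hits / len(src)


def f1_similarity(sent1, sent2, vectors, k=1000):
    """Steps 2 to 4: P, R (assumed: roles swapped), then F1 = 2*P*R/(P+R)."""
    p = coverage(sent1, sent2, vectors, k)
    r = coverage(sent2, sent1, vectors, k)
    return 2.0 * p * r / (p + r) if (p + r) else 0.0


# Toy 2-d "word vectors" (made up for illustration; real ones would come from w2v).
toy_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "truck": [0.1, 0.9],
}
score_close = f1_similarity(["cat"], ["dog"], toy_vectors, k=2)
score_far = f1_similarity(["cat"], ["car"], toy_vectors, k=2)
```

Because step 2 divides by len(sent1) rather than by the length of the sentence whose words are being counted, P can exceed 1 when the other sentence is much longer; dividing by the counted sentence's length instead would keep it in [0, 1], but that would be a change to the notes, not a reading of them.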
