nlp_xiaojiang
AugmentText
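- Back-translation (works fairly well)
- EDA (synonym replacement, random insertion, random swap, and random deletion) (works acceptably)
- HMM-Markov (poor quality)
- syntax (dependency parsing, syntactic parsing, syntax trees) (acceptable for simple sentences)
- seq2seq (deep-learning paraphrase generation; results are unsatisfactory; most of the seq2seq code comes from [https://github.com/qhduan/just_another_seq2seq])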
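
As an illustration of the EDA item above, here is a minimal sketch of the four operations in plain Python. It is not the code from this repository: the `SYNONYMS` table is a hypothetical toy dictionary, the function names are made up for this example, and the input is assumed to be already tokenized (Chinese text would first need a word segmenter).

```python
import random

# Hypothetical toy synonym table; a real setup would load a thesaurus.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "happy": ["glad", "cheerful"],
}


def synonym_replace(tokens, n, rng):
    """Replace up to n tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insert(tokens, n, rng):
    """Insert a synonym of a random known token at a random position, n times."""
    out = list(tokens)
    for _ in range(n):
        known = [t for t in out if t in SYNONYMS]
        if not known:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(known)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def random_swap(tokens, n, rng):
    """Swap two distinct random positions, n times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "the quick dog is happy".split()
    print(synonym_replace(sentence, 2, rng))
    print(random_insert(sentence, 1, rng))
    print(random_swap(sentence, 1, rng))
    print(random_delete(sentence, 0.2, rng))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when comparing augmented training sets.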
ChatBot
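- Retrieval-based ChatBot
  - Direct retrieval in the style of ES (e.g. using fuzzywuzzy); only matches on surface text
  - Build sentence vectors and search the Q&A base; can retrieve sentences that use synonyms
- Generative ChatBot (todo)
  - seq2seq
  - GAN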
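
The two retrieval-based approaches listed above can be contrasted with a small sketch. It is an illustration under stated assumptions, not this repository's implementation: the standard-library `difflib` stands in for fuzzywuzzy, the `QA` base and the 3-dimensional `EMB` word vectors are hypothetical toy data, and a sentence vector is taken to be the mean of its known word vectors.

```python
import difflib
import math

# Hypothetical toy question-answer base.
QA = {
    "how do i reset my password": "Use the 'forgot password' link.",
    "what are your opening hours": "We are open 9am to 6pm.",
}

# Hypothetical toy word embeddings; synonyms get nearby vectors.
DIM = 3
EMB = {
    "reset": (1.0, 0.0, 0.0),
    "change": (0.9, 0.1, 0.0),
    "password": (0.0, 1.0, 0.0),
    "passcode": (0.1, 0.9, 0.0),
    "opening": (0.0, 0.0, 1.0),
    "hours": (0.0, 0.1, 0.9),
}


def literal_retrieve(query, qa=QA):
    """Surface-level fuzzy match (difflib used as a stand-in for fuzzywuzzy)."""
    def score(question):
        return difflib.SequenceMatcher(None, query.lower(), question).ratio()
    best = max(qa, key=score)
    return best, qa[best]


def sentence_vector(sentence):
    """Mean of the known word vectors; zero vector if no word is known."""
    vecs = [EMB[w] for w in sentence.lower().split() if w in EMB]
    if not vecs:
        return (0.0,) * DIM
    return tuple(sum(col) / len(vecs) for col in zip(*vecs))


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_retrieve(query, qa=QA):
    """Pick the stored question whose sentence vector is closest to the query's."""
    qv = sentence_vector(query)
    best = max(qa, key=lambda q: cosine(qv, sentence_vector(q)))
    return best, qa[best]


if __name__ == "__main__":
    print(literal_retrieve("reset my pasword"))  # typo, but similar surface form
    print(vector_retrieve("change passcode"))    # synonyms, different surface form
```

With this toy data, `vector_retrieve("change passcode")` selects the password question even though the two share no words, which is the advantage of sentence vectors the list describes. In practice the word vectors would come from a trained embedding model rather than a hand-written table.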
ClassificationText
- bert + bi-lstm (keras): reaches about 0.78~0.79 accuracy (78~79%) on the weBank Intelligent Customer Service Question Matching Competition
- bert + text-cnn (keras): reaches about 0.78~0.79 accuracy (78~79%) on the weBank Intelligent Customer Service Question Matching Competition
- bert + r-cnn (keras): reaches about 0.78~0.79 accuracy (78~79%) on the weBank Intelligent Customer Service Question Matching Competition
- bert + avt-cnn (keras): reaches about 0.78~0.79 accuracy (78~79%) on the w