A Text Classification Model Based on Bidirectional Long Short-Term Memory Networks and Label Embedding

This article explores the application of bidirectional long short-term memory networks (Bi-LSTM) to text classification, combining them with label embedding to improve classification performance. It draws on a number of related studies, including the principles of LSTM, Deconvolutional Paragraph Representation Learning, and the Topic Compositional Neural Language Model, and reviews the roles of the attention mechanism, pre-trained word embeddings, and deep learning in text understanding.
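To make the described architecture concrete, below is a minimal PyTorch sketch of a Bi-LSTM text classifier that combines trainable label embeddings with a label-word attention step. It is an illustrative reconstruction only, not the paper's actual model: the class name `BiLSTMLabelEmbedding` and all hyperparameters (`vocab_size`, `embed_dim`, `hidden_dim`, `num_labels`) are assumptions chosen for the example.

```python
# Minimal sketch: Bi-LSTM encoder + label embeddings + label-word attention.
# Hyperparameters below are illustrative assumptions, not the paper's settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiLSTMLabelEmbedding(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=300, hidden_dim=128, num_labels=10):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Bi-LSTM encodes each token with both left and right context.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Each class label gets its own trainable embedding vector.
        self.label_embed = nn.Embedding(num_labels, 2 * hidden_dim)
        self.classifier = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        h, _ = self.bilstm(self.word_embed(token_ids))           # (batch, seq_len, 2*hidden)
        labels = self.label_embed.weight                          # (num_labels, 2*hidden)
        # Label-word attention: similarity of every label with every token.
        scores = torch.einsum('bsh,lh->bls', h, labels)           # (batch, num_labels, seq_len)
        attn = F.softmax(scores, dim=-1)
        # Attention-weighted document representation per label, pooled over labels.
        doc = torch.einsum('bls,bsh->blh', attn, h).mean(dim=1)   # (batch, 2*hidden)
        return self.classifier(doc)                               # logits (batch, num_labels)

if __name__ == "__main__":
    model = BiLSTMLabelEmbedding()
    dummy = torch.randint(1, 30000, (4, 50))  # batch of 4 sequences of length 50
    print(model(dummy).shape)                  # torch.Size([4, 10])
```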

[1] ZHANG Y H, SHEN D H, WANG G Y, et al. Deconvolutional paragraph representation learning[C] //Advances in Neural Information Processing Systems. California: Springer, 2017: 4169-4179.

[2] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8):1735-1780.

[3] WANG W L, GAN Z, WANG W Q, et al. Topic compositional neural language model[C] //International Conference on Artificial Intelligence and Statistics. Lanzarote, Spain: PMLR, 2018: 356-365.

[4] GRAVES A, SCHMIDHUBER J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures[J]. Neural Networks, 2005, 18(5/6):602-610.

[5] NOWAK J, TASPINAR A, SCHERER R. LSTM recurrent neural networks for short text and sentiment classification[C] //International Conference on Artificial Intelligence and Soft Computing. Zakopane: Springer, 2017: 553-562.

[6] NIU X L, HOU Y X, WANG P P. Bi-directional LSTM with quantum attention mechanism for sentence modeling[C] //International Conference on Neural Information Processing. Guangzhou: Springer, 2017: 178-188.

[7] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[C] //International Conference on Learning Representations. San Diego: Springer, 2015.

[8] YANG Z, YANG D, DYER C, et al. Hierarchical attention networks for document classification[C] //Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. San Diego: ACL, 2016: 1480-1489.

[9] JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification[C] //Conference of the European Chapter of the Association for Computational Linguistics. Valencia: ACL, 2017: 427-431.

[10] SHEN D, WANG G, WANG W, et al. Baseline needs more love: on simple word-embedding-based models and associated pooling mechanisms[C] //Meeting of the Association for Computational Linguistics. Melbourne: ACL, 2018: 440-450.

[11] REZAEINIA S M, RAHMANI R, GHODSI A, et al. Sentiment analysis based on improved pre-trained word embeddings[J]. Expert Systems with Applications, 2019, 117:139-147.

[12] DEVLIN J, CHANG M, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C] //North American Chapter of the Association for Computational Linguistics. Minneapolis: ACL, 2019: 4171-4186.

[13] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C] //Advances in Neural Information Processing Systems. California: Springer, 2017: 5998-6008.

[14] MIKOLOV T, CHEN K, CORRADO G S, et al. Efficient estimation of word representations in vector space[C] //International Conference on Learning Representations. Scottsdale: Springer, 2013.

[15] PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations[C] //North American Chapter of the Association for Computational Linguistics. New Orleans: ACL, 2018: 2227-2237.

[16] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[R]. San Francisco: OpenAI, 2018.

[17] LUO Y. Recurrent neural networks for classifying relations in clinical notes[J]. Journal of Biomedical Informatics, 2017, 72:85-95.

[18] WU D, CHI M G. Long short-term memory with quadratic connections in recursive neural networks for representing compositional semantics[J]. IEEE Access, 2017, 5:16077-16083.

[19] WANG Y, FENG S, WANG D L, et al. Context-aware Chinese microblog sentiment classification with bidirectional LSTM[C] //Asia-Pacific Web Conference. Suzhou: Springer, 2016: 594-606.

[20] YANG M, TU W, WANG J, et al. Attention-based LSTM for target-dependent sentiment classification[C] //Proceedings of the Thirty-first AAAI Conference on Artificial Intelligence. San Francisco: AAAI Press, 2017: 5013-5014.

[21] DANILUK M, ROCKTÄSCHEL T, WELBL J, et al. Frustratingly short attention spans in neural language modeling[C] //International Conference on Learning Representations. Toulon, 2017.

[22] PARIKH A, TÄCKSTRÖM O, DAS D, et al. A decomposable attention model for natural language inference[C] //Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin: Association for Computational Linguistics, 2016: 2249-2255.

[23] AKATA Z, PERRONNIN F, HARCHAOUI Z, et al. Label-embedding for image classification[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 38(7):1425-1438.

[24] RODRIGUEZ-SERRANO J A, PERRONNIN F, MEYLAN F. Label embedding for text recognition[C] //British Machine Vision Conference. Bristol: BMVA Press, 2013: 5.1-5.12.

[25] TANG J, QU M, MEI Q. PTE: predictive text embedding through large-scale heterogeneous text networks[C] //Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco: ACM, 2015: 1165-1174.

[26] ZHANG H, XIAO L, CHEN W, et al. Multi-task label embedding for text classification[C] //Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels: Association for Computational Linguistics, 2018: 4545-4553.

[27] CHUNG J, GULCEHRE C, CHO K, et al. Gated feedback recurrent neural networks[C] //International Conference on Machine Learning. Lille: PMLR, 2015: 2067-2075.

[28] KIROS R, ZHU Y, SALAKHUTDINOV R R, et al. Skip-thought vectors[C] //Advances in Neural Information Processing Systems. Montreal: Springer, 2015: 3294-3302.

[29] LE Q, MIKOLOV T. Distributed representations of sentences and documents[C] //International Conference on Machine Learning. Beijing: JMLR, 2014: 1188-1196.

[30] ZHANG X, ZHAO J, LECUN Y. Character-level convolutional networks for text classification[C] //Advances in Neural Information Processing Systems. Montreal: Springer, 2015: 649-657.

[31] CONNEAU A, SCHWENK H, BARRAULT L, et al. Very deep convolutional networks for text classification[C] //Conference of the European Chapter of the Association for Computational Linguistics. Valencia: ACL, 2017: 1107-1116.

[32] KINGMA D P, BA J. Adam: a method for stochastic optimization[C] //International Conference on Learning Representations. San Diego, 2015.

[33] HILL F, CHO K, KORHONEN A. Learning distributed representations of sentences from unlabelled data[C] // Proceedings of NAACL-HLT. San Diego: NAACL, 2016: 1367-1377.

[34] AGIRRE E, BANEA C, CARDIE C, et al. SemEval-2014 task 10: multilingual semantic textual similarity[C] //Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Dublin: ACL, 2014: 81-91.

[35] JOHNSON A E W, POLLARD T J, SHEN L, et al. MIMIC-III, a freely accessible critical care database[J]. Scientific Data, 2016, 3: 160035.

[36] KIM Y. Convolutional neural networks for sentence classification[C] //Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: ACL, 2014: 1746-1751.

[37] SHI H R, XIE P T, HU Z T, et al. Towards automated ICD coding using deep learning[EB/OL]. arXiv preprint, 2017.

[38] MULLENBACH J, WIEGREFFE S, DUKE J, et al. Explainable prediction of medical codes from clinical text[C] //Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. New Orleans: ACL, 2018: 1101-1111.

[39] SHEN T, ZHOU T, LONG G, et al. Bi-directional block self-attention for fast and memory-efficient sequence modeling[C] //International Conference on Learning Representations. Vancouver: Springer, 2018: 779-788.
