This is a master's thesis from the University of Waterloo, Canada (author: Yahui Chen), 62 pages in total.
The goal of a Knowledge Base-supported Question Answering (KB-supported QA) system is to answer a natural language query by obtaining the answer from a knowledge database, which stores knowledge in the form of (entity, relation, value) triples. QA systems understand questions by extracting entity and relation pairs. This thesis aims to recognize the relation candidates inside a question. We define a multi-label classification problem for this challenging task. Based on the word2vec representation of words, we propose two convolutional neural networks (CNNs) to solve the multi-label classification problem, namely Parallel CNN and Deep CNN. The Parallel CNN contains four parallel convolutional layers, while the Deep CNN contains two serial convolutional layers. The convolutional layers of both models capture local semantic features. A max-over-time pooling layer is placed on top of the last convolutional layer to select global semantic features. Fully connected layers with dropout are used to summarize the features. Our experiments show that these two models outperform the traditional Support Vector Classification (SVC)-based method by a large margin. Furthermore, we observe that the Deep CNN performs better than the Parallel CNN, indicating that the deep structure provides a much stronger semantic learning capacity than the wide but shallow network.
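The two architectures in the abstract can be contrasted with a small forward-pass sketch. This is not the thesis code from the appendix. It is a minimal pure-Python illustration under several assumptions: random vectors stand in for word2vec embeddings, the four parallel branches use filter widths 1 to 4, the output layer applies one sigmoid per relation label with a 0.5 threshold, and all sizes and helper names (`conv1d_relu`, `max_over_time`, `rand_filters`, and so on) are hypothetical. Dropout is omitted because it only applies during training.

```python
import math
import random

def conv1d_relu(seq, filters, bias):
    """Valid 1D convolution over time, followed by ReLU.

    seq:     list of T vectors, each of length d_in
    filters: list of d_out filters, each a list of `width` vectors of length d_in
    returns: list of (T - width + 1) vectors, each of length d_out
    """
    width = len(filters[0])
    out = []
    for t in range(len(seq) - width + 1):
        window = seq[t:t + width]
        row = []
        for f, b in zip(filters, bias):
            s = b + sum(w * x for wv, xv in zip(f, window) for w, x in zip(wv, xv))
            row.append(max(0.0, s))
        out.append(row)
    return out

def max_over_time(feature_map):
    """Keep the largest activation of each channel across all time steps."""
    return [max(channel) for channel in zip(*feature_map)]

def dense_sigmoid(vec, weights, bias):
    """Fully connected layer with one independent sigmoid per label."""
    return [1.0 / (1.0 + math.exp(-(b + sum(w * v for w, v in zip(ws, vec)))))
            for ws, b in zip(weights, bias)]

def rand_filters(rng, d_out, width, d_in):
    """Random filter bank (hypothetical initialisation, for illustration only)."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(d_in)]
             for _ in range(width)] for _ in range(d_out)]

def parallel_cnn_features(seq, branches):
    """Wide and shallow: every branch sees the same input; pooled outputs are concatenated."""
    feats = []
    for filters, bias in branches:
        feats.extend(max_over_time(conv1d_relu(seq, filters, bias)))
    return feats

def deep_cnn_features(seq, layers):
    """Deep: each convolution consumes the previous one's output; pool once at the end."""
    h = seq
    for filters, bias in layers:
        h = conv1d_relu(h, filters, bias)
    return max_over_time(h)

# Toy sizes (assumptions, not the thesis settings).
EMB, CHANNELS, LABELS, T = 8, 6, 5, 10
rng = random.Random(0)

# Stand-in for the word2vec vectors of a 10-word question.
question = [[rng.uniform(-1.0, 1.0) for _ in range(EMB)] for _ in range(T)]

# Parallel CNN: four branches, assumed filter widths 1, 2, 3, 4.
branches = [(rand_filters(rng, CHANNELS, w, EMB), [0.0] * CHANNELS) for w in (1, 2, 3, 4)]
parallel_feats = parallel_cnn_features(question, branches)

# Deep CNN: two serial convolutions, assumed width 3 each.
layers = [
    (rand_filters(rng, CHANNELS, 3, EMB), [0.0] * CHANNELS),
    (rand_filters(rng, CHANNELS, 3, CHANNELS), [0.0] * CHANNELS),
]
deep_feats = deep_cnn_features(question, layers)

# Multi-label output: a question may express several relations at once,
# so each label gets its own probability instead of a shared softmax.
out_w = [[rng.uniform(-0.5, 0.5) for _ in range(len(deep_feats))] for _ in range(LABELS)]
probs = dense_sigmoid(deep_feats, out_w, [0.0] * LABELS)
predicted_relations = [i for i, p in enumerate(probs) if p >= 0.5]
```

In this sketch the Parallel CNN yields a feature vector four times as wide as the Deep CNN's (one pooled block per branch), whereas the Deep CNN's second layer sees combinations of first-layer features, so each pooled value covers a longer span of the question. That is one way to read the abstract's contrast between "wide but shallow" and "deep".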
1 Introduction
2 Background
3 Related Work
4 Dataset and Environment
5 Main Results
6 Conclusion
Appendix: Python Code for Implementing the CNN