5Lifelong Learning CRF for Supervised Aspect Extraction（2020.10.21）

fuchengguo666

Last modified: 2022-03-03 22:10:55


Category: sentiment analysis. Tags: nlp

First published: 2020-10-21 21:13:27

Article link: https://blog.csdn.net/fuchengguo666/article/details/109206139


Lifelong Learning CRF for Supervised Aspect Extraction

1. Abstract

This paper makes a focused contribution to supervised aspect extraction.
It shows that if the system has performed aspect extraction from many past domains and retained their results as knowledge, Conditional Random Fields (CRF) can leverage this knowledge in a lifelong learning manner to extract in a new domain markedly better than the traditional CRF without using this prior knowledge.
The key innovation is that even after CRF training, the model can still improve its extraction with experiences in its applications.

2. Introduction

This paper focuses on the supervised approach (Jakob and Gurevych, 2010; Choi and Cardie, 2010; Mitchell et al., 2013) using Conditional Random Fields (CRF) (Lafferty et al., 2001).
It shows that the results of CRF can be significantly improved by leveraging prior knowledge automatically mined from the extraction results of previous domains, including domains without labeled data.
Because it leverages knowledge gained from the past to help extraction in the new domain, the approach follows the idea of lifelong machine learning (LML) (Chen and Liu, 2016; Thrun, 1998; Silver et al., 2013), a continuous learning paradigm that retains the knowledge learned in the past and uses it to help future learning and problem solving, with possible adaptations.

2. Conditional Random Fields (CRF)

3. General Dependency Feature (G)

Feature G uses generalized dependency relations. What makes this feature interesting is that it enables L-CRF to use past knowledge in its sequence prediction at test time, which improves its performance. The feature takes dependency patterns as its values; a dependency pattern is generalized from dependency relations.
The general dependency feature (G) of the variable xl takes a set of feature values V^G. Each feature value v^G is a dependency pattern. The Label-G (LG) feature function (FF) is defined as:
[Equation image not preserved]
Such an FF returns 1 when the dependency feature of the variable xl equals a dependency pattern v^G and the variable yl equals the label value i.
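The Label-G feature function described above can be written as a product of two indicators. This is a reconstruction from the verbal description only: the pattern index j and the label set symbol \mathcal{Y} are notation assumed here for illustration.

```latex
f^{LG}_{ij}(y_l, x_l) = \mathbb{1}\{x_l^{G} = v_j^{G}\}\,\mathbb{1}\{y_l = i\},
\qquad \forall i \in \mathcal{Y},\ \forall v_j^{G} \in V^{G}
```

The function is 1 only when both conditions hold at position l, and 0 otherwise, which matches the description of when the FF "returns 1".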

3.1 Dependency Relation
-.Dependency relations have been shown to be useful in many sentiment analysis applications (Johansson and Moschitti, 2010; Jakob and Gurevych, 2010).
-.A dependency relation is a quintuple:
(type, gov, govpos, dep, deppos)
1) 'type' is the type of the dependency relation
2) 'gov' is the governor word
3) 'govpos' is the POS tag of the governor word
4) 'dep' is the dependent word
5) 'deppos' is the POS tag of the dependent word
In a dependency relation, the l-th word can be either the governor word or the dependent word.
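As a small illustration of the quintuple, the five fields can be held in a named tuple. This is only a sketch: the class name `DependencyRelation` and the helper `role_of` are hypothetical names chosen here, not part of the paper or of any parser API.

```python
from typing import NamedTuple


class DependencyRelation(NamedTuple):
    """One dependency relation: (type, gov, govpos, dep, deppos)."""
    type: str    # dependency relation type, e.g. "nmod"
    gov: str     # governor word
    govpos: str  # POS tag of the governor word
    dep: str     # dependent word
    deppos: str  # POS tag of the dependent word


def role_of(rel: DependencyRelation, word: str) -> str:
    """Say whether `word` is the governor or the dependent in `rel`."""
    if word == rel.gov:
        return "governor"
    if word == rel.dep:
        return "dependent"
    raise ValueError(f"{word!r} does not take part in {rel}")


# The relation between "battery" and "camera" in
# "The battery of this camera is great."
rel = DependencyRelation("nmod", "battery", "NN", "camera", "NN")
```

Here `role_of(rel, "battery")` reports the governor role and `role_of(rel, "camera")` the dependent role, reflecting that the l-th word can appear on either side of a relation.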
3.2 Dependency Pattern
-.We generalize dependency relations into dependency patterns using the following steps:

1) For each dependency relation, replace the current word (governor word or dependent word) and its POS tag with a wildcard, since we already have the word (W) and POS tag (P) features.
(We obtain dependency relations using Stanford CoreNLP: http://stanfordnlp.github.io/CoreNLP/.)
2) Replace the context word (the word other than the l-th word) in each dependency relation with a knowledge label to form a more general dependency pattern. Let the set of aspects annotated in the training data be K^t. If the context word in the dependency relation appears in K^t, we replace it with the knowledge label 'A' (aspect); otherwise with 'O' (other).
For example, consider the sentence "The battery of this camera is great."
Table 1 gives its dependency relations.

Suppose the current word is "battery" and "camera" is annotated as an aspect. The original dependency relation between "camera" and "battery" produced by the parser is (nmod, battery, NN, camera, NN). Since the current word's information (the word itself and its POS tag) is redundant in the relation, we replace it with a wildcard, and the relation becomes (nmod, *, camera, NN). Next, because "camera" is in K^t, we replace "camera" with the general label 'A'. The final dependency pattern becomes (nmod, *, A, NN).
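The two generalization steps can be sketched in a few lines of Python. The function name `to_pattern` is hypothetical, and the ordering of fields when the current word is the dependent (rather than the governor) is an assumption made here; the text only works through the governor case.

```python
from typing import Set, Tuple

WILDCARD = "*"


def to_pattern(relation: Tuple[str, str, str, str, str],
               current_word: str,
               known_aspects: Set[str]) -> Tuple[str, str, str, str]:
    """Generalize (type, gov, govpos, dep, deppos) into a dependency pattern."""
    rel_type, gov, govpos, dep, deppos = relation
    if current_word == gov:
        context_word, context_pos, current_is_gov = dep, deppos, True
    elif current_word == dep:
        context_word, context_pos, current_is_gov = gov, govpos, False
    else:
        raise ValueError("current word is not part of the relation")

    # Step 2: the context word becomes a knowledge label.
    label = "A" if context_word in known_aspects else "O"

    # Step 1: the current word and its POS tag collapse into one wildcard.
    if current_is_gov:
        return (rel_type, WILDCARD, label, context_pos)
    # Assumed layout for the dependent case: context first, wildcard last.
    return (rel_type, label, context_pos, WILDCARD)


relation = ("nmod", "battery", "NN", "camera", "NN")
pattern = to_pattern(relation, "battery", known_aspects={"camera"})
```

With "camera" in the known aspect set, `pattern` is `("nmod", "*", "A", "NN")`, the same pattern as in the worked example. With an empty aspect set the label would be 'O' instead.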
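We now explain why dependency patterns enable a CRF model to leverage past knowledge.
The key is the knowledge label 'A' above, which indicates a likely aspect.
Recall the problem setting: when we need to extract from a new domain Dn+1 using a trained CRF model M, we have already extracted from many previous domains D1, …, Dn and retained their extracted aspect sets A1, …, An. We can then mine reliable aspects from A1, …, An and add them to K^t. Because aspects are shared across domains, many knowledge labels can then appear in the dependency patterns of the new data Dn+1. This enriches the dependency pattern features and therefore allows more aspects to be extracted from the new domain Dn+1.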
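The text above does not spell out how "reliable" aspects are mined from A1, …, An. One plausible criterion, assumed here purely for illustration, is domain frequency: keep an aspect if it was extracted in at least some minimum number of past domains. The names `mine_reliable_aspects` and `min_domains` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def mine_reliable_aspects(past_aspect_sets: Iterable[Set[str]],
                          min_domains: int) -> Set[str]:
    """Keep aspects extracted in at least `min_domains` past domains (assumed criterion)."""
    domain_freq: Counter = Counter()
    for aspects in past_aspect_sets:
        # Each domain contributes at most one count per aspect.
        domain_freq.update(set(aspects))
    return {a for a, count in domain_freq.items() if count >= min_domains}


# Toy aspect sets A1..A3 retained from three past domains.
past = [{"battery", "screen"}, {"battery", "price"}, {"price", "battery", "lens"}]
k_train = {"camera"}  # K^t: aspects annotated in the training data
knowledge = k_train | mine_reliable_aspects(past, min_domains=2)
```

In this toy setup "battery" and "price" each occur in at least two past domains, so they join "camera" in the knowledge set and can trigger the 'A' label in the new domain's dependency patterns.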

4. The Proposed L-CRF Algorithm

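Because the dependency patterns of the general dependency feature do not use any actual words, and because they can also use prior knowledge, they are particularly useful for cross-domain extraction (where the test domain is not used in training).
[Image: Algorithm 1]
Lifelong Extraction Phase: Algorithm 1 performs extraction on Dn+1 iteratively.

-.Note: step 3 of the algorithm above
3. If Kn+1 is the same as Kp from the previous iteration, no new aspects have been found and the algorithm exits. We use an iterative process because each extraction produces new results, which may increase the size of K, the set of reliable past aspects (the past knowledge). The enlarged K may produce more dependency patterns, which in turn enable more extractions.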
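The iterate-until-stable idea can be sketched as follows. This is a rough illustration, not a transcription of Algorithm 1: `extract` stands in for feature generation plus applying the trained CRF, `frequent_aspects` is an assumed frequency-based miner, and `max_iters` is a safety bound added here.

```python
from collections import Counter
from typing import Callable, List, Set, Tuple


def frequent_aspects(aspect_sets: List[Set[str]], min_domains: int = 2) -> Set[str]:
    """Assumed miner: aspects that occur in at least `min_domains` sets."""
    freq = Counter(a for s in aspect_sets for a in set(s))
    return {a for a, c in freq.items() if c >= min_domains}


def lifelong_extract(extract: Callable[[Set[str]], Set[str]],
                     past_aspect_sets: List[Set[str]],
                     knowledge: Set[str],
                     max_iters: int = 20) -> Tuple[Set[str], Set[str]]:
    """Re-extract from the new domain until the knowledge set stops changing."""
    extracted: Set[str] = set()
    for _ in range(max_iters):
        extracted = extract(knowledge)
        new_knowledge = frequent_aspects(past_aspect_sets + [extracted])
        if new_knowledge == knowledge:  # no new reliable aspects: stop
            break
        knowledge = new_knowledge
    return extracted, knowledge


# Toy stand-in for the CRF: a word is extracted when its context word is known.
links = {"battery": "camera", "charger": "battery"}


def toy_extract(known: Set[str]) -> Set[str]:
    return {word for word, context in links.items() if context in known}


past = [{"camera", "battery"}, {"camera", "charger"}]
result, final_k = lifelong_extract(toy_extract, past, frequent_aspects(past))
```

In the toy run, the first pass finds "battery", which then becomes reliable and lets the second pass find "charger"; the third pass changes nothing, so the loop stops.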

5. Experiments

We now evaluate the proposed L-CRF method and compare it with baselines.

5.1 Evaluation Datasets
Our experiments use two types of data.
5.2 Baseline Methods
We compare L-CRF with CRF.

CRF
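CRF+R: It treats the reliable aspect set K as a dictionary. It adds to the final results those reliable aspects in K that are not extracted by CRF but appear in the test data. We want to see whether incorporating K into the CRF extraction through dependency patterns in L-CRF is actually needed.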
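The CRF+R baseline amounts to a dictionary lookup on top of the CRF output. A minimal sketch follows, assuming single-token aspects and exact string matching (both simplifications made here); the function name `crf_plus_r` is hypothetical.

```python
from typing import Iterable, Set


def crf_plus_r(crf_extracted: Set[str],
               reliable_aspects: Set[str],
               test_tokens: Iterable[str]) -> Set[str]:
    """Add reliable aspects that occur in the test data but were missed by CRF."""
    present = set(test_tokens)
    added = {a for a in reliable_aspects
             if a in present and a not in crf_extracted}
    return crf_extracted | added


tokens = "the screen and battery are great but the price is high".split()
final = crf_plus_r({"battery"}, {"screen", "price", "lens"}, tokens)
```

Every dictionary word that appears in the test data is added regardless of context, which suggests why such a baseline can lose precision.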

Table 3: Aspect extraction results in precision, recall and F1 score: Cross-Domain and In-Domain (−X means all except domain X)

5.3 Experiment Setting
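To compare the systems using the same training and test data, for each dataset we use 200 sentences for training and 200 sentences for testing. This avoids bias towards any dataset or domain, because we combine multiple domain datasets for CRF training. We conduct both cross-domain and in-domain tests. Our problem setting is cross-domain; in-domain results are included for completeness. In both cases, we assume that extraction has already been done for the 50 domains.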
Cross-domain experiments:
In-domain experiments:
Evaluation Measures: We use the popular precision P, recall R, and F1-score.
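For reference, the three measures can be computed as below. This sketch assumes exact matching between sets of predicted and gold aspect terms; the matching criterion actually used in the paper's evaluation is not stated here.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Set-level precision, recall and F1 under exact matching (assumed)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f1 = precision_recall_f1(
    predicted={"battery", "screen", "price", "case"},
    gold={"battery", "screen", "lens"},
)
```

Here two of the four predictions are correct (P = 0.5) and two of the three gold aspects are found (R = 2/3), so F1 is their harmonic mean, 4/7.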
5.4 Results and Analysis

Cross-domain: In Table 3 above, each −X in column 1 means that domain X is not used in training. CRF+R performs very poorly because of its low precision, which shows that treating the reliable aspect set K as a dictionary isn't a good idea.
In-domain: −X in the training and test columns means that the other 6 domains are used in both training and testing (thus in-domain).

6. Conclusion

This paper proposed a lifelong learning method to enable CRF to leverage the knowledge gained from extraction results of previous domains (unlabeled) to improve its extraction.
In our future work, we plan to modify CRF so that it can consider previous extraction results as well as the knowledge in previous CRF models.
