Extracting Negation Words for Opinion Words with Stanford Parser


gdp5211314


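Problem:

Following on from the previous post: once we have extracted the opinion words attached to the feature words in a text, sentiment analysis or polarity classification (is the user praising or criticizing?) needs more than the sentiment of the opinion word itself. We also need to know whether the sentence negates that opinion word. For example, “我喜欢这个产品” (I like this product) is positive, while “我不喜欢这个产品” (I do not like this product) is negative. Common negation words include “不”, “无”, and “没有”. A negation word usually appears before the opinion word, but not necessarily right next to it: particles or adverbs may sit between them. A purely statistical approach therefore gives low accuracy.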
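To make the limitation concrete, here is a minimal sketch of a position-based baseline that looks for a negation word within a fixed window before the opinion word. The class and method names are hypothetical, and the second sentence (“这个 产品 不 油腻 很 保湿”, roughly "this product is not greasy, very moisturizing") is an illustrative variant rather than data from the original experiment. The Chinese words are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class WindowNegationBaseline {
    // Negation words named in the article: bu ("not"), wu ("without"), meiyou ("do not have").
    static final Set<String> NEGATION_WORDS =
            new HashSet<>(Arrays.asList("\u4e0d", "\u65e0", "\u6ca1\u6709"));

    /** True if a negation word occurs among the `window` tokens before the opinion word. */
    static boolean negatedWithinWindow(List<String> tokens, String opinionWord, int window) {
        int idx = tokens.indexOf(opinionWord);
        if (idx < 0) {
            return false; // opinion word not present
        }
        for (int i = Math.max(0, idx - window); i < idx; i++) {
            if (NEGATION_WORDS.contains(tokens.get(i))) {
                return true;
            }
        }
        return false;
    }

    /** Flips the base polarity (+1 or -1) of an opinion word when it is negated. */
    static int applyNegation(int basePolarity, boolean negated) {
        return negated ? -basePolarity : basePolarity;
    }

    public static void main(String[] args) {
        // wo / bu / xihuan / zhege / chanpin: "I do not like this product"
        List<String> s1 = Arrays.asList(
                "\u6211", "\u4e0d", "\u559c\u6b22", "\u8fd9\u4e2a", "\u4ea7\u54c1");
        System.out.println(applyNegation(+1, negatedWithinWindow(s1, "\u559c\u6b22", 1)));

        // zhege / chanpin / bu / youni / hen / baoshi: "not greasy, very moisturizing"
        List<String> s2 = Arrays.asList(
                "\u8fd9\u4e2a", "\u4ea7\u54c1", "\u4e0d", "\u6cb9\u817b", "\u5f88", "\u4fdd\u6e7f");
        // With a window of 3, the negation of "youni" leaks onto "baoshi" (a false positive).
        System.out.println(negatedWithinWindow(s2, "\u4fdd\u6e7f", 3));
    }
}
```

A narrow window misses negations separated from the opinion word by adverbs, while a wide window picks up negations that belong to a different word. That trade-off is the motivation for using dependency relations instead.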

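Method:

1. Select the text data (the data source, e.g. product review text).

2. Split the text into sentences and segment the sentences into words.

3. Parse the sentences (Stanford Parser).

4. Extract the opinion words (see the previous post).

5. Extract the negation words: search the collection of TypedDependency objects returned by Stanford Parser for a negation relation between the current opinion word and another word.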
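Step 5 can be sketched without the parser jar by modelling each dependency as a plain triple. In the sketch below, `Dep` is a hypothetical stand-in for `TypedDependency`, and the dependency list is written by hand for illustration; it is not actual parser output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NegDependencyFilter {
    /** Hypothetical stand-in for one typed dependency, read as reln(gov, dep). */
    static final class Dep {
        final String reln;
        final String gov;
        final String dep;

        Dep(String reln, String gov, String dep) {
            this.reln = reln;
            this.gov = gov;
            this.dep = dep;
        }
    }

    /** Collects the dependents of every "neg" relation governed by the opinion word. */
    static List<String> negatorsOf(List<Dep> deps, String opinionWord) {
        List<String> negators = new ArrayList<>();
        for (Dep d : deps) {
            if ("neg".equals(d.reln) && d.gov.equals(opinionWord)) {
                negators.add(d.dep);
            }
        }
        return negators;
    }

    public static void main(String[] args) {
        // Hand-written dependencies for a sentence like "hen baoshi dan bu youni"
        // (very moisturizing but not greasy); assumed for illustration only.
        List<Dep> deps = Arrays.asList(
                new Dep("advmod", "\u4fdd\u6e7f", "\u5f88"),  // advmod(baoshi, hen)
                new Dep("neg", "\u6cb9\u817b", "\u4e0d"));    // neg(youni, bu)
        System.out.println(negatorsOf(deps, "\u6cb9\u817b").size()); // one negator for "youni"
        System.out.println(negatorsOf(deps, "\u4fdd\u6e7f").size()); // none for "baoshi"
    }
}
```

Matching on the governor rather than on token position is what lets the negation attach to the right opinion word even when other words intervene.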

Code:

The code here skips the earlier steps. Input: a segmented sentence and an opinion word. Output: the negation word for that opinion word.

    package textAnalysis;

    import java.io.StringReader;
    import java.util.Collection;
    import java.util.List;

    import edu.stanford.nlp.ling.HasWord;
    import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
    import edu.stanford.nlp.parser.lexparser.Test;
    import edu.stanford.nlp.process.Tokenizer;
    import edu.stanford.nlp.trees.GrammaticalStructure;
    import edu.stanford.nlp.trees.GrammaticalStructureFactory;
    import edu.stanford.nlp.trees.Tree;
    import edu.stanford.nlp.trees.TreebankLanguagePack;
    import edu.stanford.nlp.trees.TypedDependency;
    import edu.stanford.nlp.trees.international.pennchinese.ChineseTreebankLanguagePack;

    public class NegWordExtra {

        public static void main(String[] args) {
            // Load the Chinese PCFG grammar (constructor of the 2011-09-14 release).
            LexicalizedParser lp = new LexicalizedParser("grammar/chinesePCFG.ser.gz");
            Test.MAX_ITEMS = 2000000000;

            TreebankLanguagePack tlp = new ChineseTreebankLanguagePack();
            GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();

            String sentence = "这个产品很保湿但不油腻"; // sentence to analyze
            String keyword = "油腻";                    // opinion word to check

            // Tokenize the sentence and build the parse tree.
            Tokenizer<? extends HasWord> toke = tlp.getTokenizerFactory()
                    .getTokenizer(new StringReader(sentence));
            List<? extends HasWord> sentList = toke.tokenize();
            Tree parse = lp.apply(sentList);

            // Convert the tree into typed dependencies.
            GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
            Collection<TypedDependency> tdl = gs.typedDependenciesCollapsedTree();
            //System.out.println(tdl);

            for (TypedDependency td : tdl) {
                String nodDep = td.dep().toString();
                String nodGov = td.gov().toString();
                String relation = td.reln().toString();

                // Report a "neg" relation whose governor is the opinion word.
                // contains() is used because the node string also carries the token index.
                if (nodGov.contains(keyword) && relation.equals("neg")) {
                    System.out.println("Dep:" + nodDep + ", Gov:" + nodGov + " > relation:" + relation);
                }
            }
        }
    }

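Note: this program uses the jar and models that Stanford released on September 14, 2011. I found that the package released on February 3, 2012 folds the negation relation “neg” into “advmod” (the relation in which an adverb modifies an adjective), so the negation word cannot be retrieved this way with that release. I am still looking into the details; if anyone knows more, please let me know. Thanks.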
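One possible workaround for releases that report the negation as “advmod” is to accept an “advmod” relation whenever its dependent is a known negation word. This is only a sketch of that idea, not a verified fix: the class name, the small word list, and the assumption that a node prints as `word-index` (for example `不-6`) are all illustrative and should be checked against the release in use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class NegationRelationFallback {
    // Small negation lexicon taken from the words named in the article:
    // bu ("not"), wu ("without"), meiyou ("do not have").
    static final Set<String> NEGATION_WORDS =
            new HashSet<>(Arrays.asList("\u4e0d", "\u65e0", "\u6ca1\u6709"));

    /** Strips a trailing "-index" suffix, assuming the node prints as word-index. */
    static String stripIndex(String node) {
        int i = node.lastIndexOf('-');
        return i > 0 ? node.substring(0, i) : node;
    }

    /** Accepts an explicit "neg", or an "advmod" whose dependent is a negation word. */
    static boolean isNegation(String reln, String dependentNode) {
        if ("neg".equals(reln)) {
            return true;
        }
        return "advmod".equals(reln) && NEGATION_WORDS.contains(stripIndex(dependentNode));
    }

    public static void main(String[] args) {
        System.out.println(isNegation("advmod", "\u4e0d-6")); // "bu" as advmod: treated as negation
        System.out.println(isNegation("advmod", "\u5f88-3")); // "hen" (very): ordinary adverb
    }
}
```

In the program above, the check `relation.equals("neg")` would be replaced by a call such as `isNegation(relation, nodDep)`. The lexicon then becomes the deciding factor, so it needs to cover the negation words that occur in the review data.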
