Sentiment Analysis with TextBlob: English Text Analysis


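I. Introduction to TextBlob

1. TextBlob: Simplified Text Processing

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, and translation.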

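2. Features

Noun phrase extraction
Part-of-speech tagging
Sentiment analysis
Classification (Naive Bayes, Decision Tree)
Language translation and detection powered by Google Translate
Tokenization (splitting text into words and sentences)
Word and phrase frequencies
Word inflection (pluralization and singularization) and lemmatization
Spelling correction
Adding new models or languages through extensions
WordNet integration

Note: see https://textblob.readthedocs.io/en/dev/ for the detailed documentation.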

II. Installing TextBlob

Run this command in the PyCharm Terminal:

$ pip install -U textblob
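
To confirm that the install succeeded before running the examples, a quick check along these lines may help. It is a minimal sketch that uses only the standard library; the helper name installed_version is made up for illustration and is not part of TextBlob.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# installed_version is a hypothetical helper, not a TextBlob function.
found = installed_version("textblob")
if found is None:
    print("textblob is not installed; run: pip install -U textblob")
else:
    print("textblob version:", found)
```

Features such as tags and noun_phrases also need the NLTK corpora, which the TextBlob documentation installs with python -m textblob.download_corpora.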

III. Code Implementation

1. Splitting text into sentences with TextBlob

import textblob
text1 = "No matter how many characters are available for your password you should be sure to use every one of them. " \
        "The more characters available for your password and the more you use makes it that much harder to figure out the combination. " \
        "Always make use of all characters available for a strong and secure password."
# 1. Create a TextBlob object from the text
blob1 = textblob.TextBlob(text1)

# The sentences property splits the text into sentences
sentences1 = blob1.sentences
print("1. Sentences:", sentences1)

Output:

1. Sentences: [Sentence("No matter how many characters are available for your password you should be sure to use every one of them."), Sentence("The more characters available for your password and the more you use makes it that much harder to figure out the combination."), Sentence("Always make use of all characters available for a strong and secure password.")]
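
To make the idea of sentence splitting concrete, here is a rough standard-library sketch. It is not how TextBlob does it: it swaps in a naive regular-expression rule (split after '.', '!' or '?' followed by whitespace), and the function name naive_sentences is an assumption for illustration. TextBlob's own splitter is more robust, for example with abbreviations such as "Dr.", which this rule would break on.

```python
import re

text1 = (
    "No matter how many characters are available for your password you should be sure to use every one of them. "
    "The more characters available for your password and the more you use makes it that much harder to figure out the combination. "
    "Always make use of all characters available for a strong and secure password."
)


def naive_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


for sentence in naive_sentences(text1):
    print(sentence)
```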

2. Splitting each sentence into words

word_list = []  # holds the word list of every sentence
for sentence in sentences1:
    word_list.append(sentence.words)
print("2. Words:", word_list)

Output:

2. Words: [WordList(['No', 'matter', 'how', 'many', 'characters', 'are', 'available', 'for', 'your', 'password', 'you', 'should', 'be', 'sure', 'to', 'use', 'every', 'one', 'of', 'them']), WordList(['The', 'more', 'characters', 'available', 'for', 'your', 'password', 'and', 'the', 'more', 'you', 'use', 'makes', 'it', 'that', 'much', 'harder', 'to', 'figure', 'out', 'the', 'combination']), WordList(['Always', 'make', 'use', 'of', 'all', 'characters', 'available', 'for', 'a', 'strong', 'and', 'secure', 'password'])]

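3. Counting occurrences of a single word

counts_you_ = blob1.word_counts['you']  # call word_counts on blob1 so that the whole text is counted
print('3. Occurrences of "you":', counts_you_)

Output:

3. Occurrences of "you": 2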
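
The count above can be cross-checked without TextBlob. The sketch below is a stand-in built on collections.Counter, not TextBlob's implementation; it assumes that lowercasing the text and extracting alphanumeric tokens is close enough to TextBlob's tokenization for this sample. The function name count_words is hypothetical.

```python
import re
from collections import Counter

text1 = (
    "No matter how many characters are available for your password you should be sure to use every one of them. "
    "The more characters available for your password and the more you use makes it that much harder to figure out the combination. "
    "Always make use of all characters available for a strong and secure password."
)


def count_words(text):
    """Lowercase the text, extract word tokens, and count them."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


counts = count_words(text1)
print(counts["you"])   # "you" and "your" are separate tokens
print(counts["your"])
```

Because the text is lowercased first, "The" and "the" land in the same bucket, which matches the lowercase keys that word_counts returns.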

4. Counting occurrences of a noun phrase

import textblob
text="Beautiful is better than ugly." \
     " Explicit is better than implicit. " \
     "Simple is better than complex."
# 1. Create a TextBlob object from the text
blob=textblob.TextBlob(text)

# noun_phrases returns the noun phrases; case_sensitive controls case matching
# (False matches regardless of case; True requires an exact-case match)
noun_counts=blob.noun_phrases.count('Simple',case_sensitive=False)
print('4. Occurrences of "Simple":',noun_counts)

Output:

4. Occurrences of "Simple": 1
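
The effect of case_sensitive can be illustrated on an ordinary Python list. This is a sketch of the matching rule only; count_phrase is a hypothetical helper, and the phrase list is made up for illustration rather than taken from real TextBlob output.

```python
def count_phrase(phrases, target, case_sensitive=False):
    """Count how often `target` occurs in `phrases`, optionally ignoring case."""
    if case_sensitive:
        return list(phrases).count(target)
    return [phrase.lower() for phrase in phrases].count(target.lower())


# Hypothetical phrase list, made up for illustration.
phrases = ["simple", "Simple", "complex"]
print(count_phrase(phrases, "Simple"))                       # ignores case
print(count_phrase(phrases, "Simple", case_sensitive=True))  # exact case only
```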

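5. Counting all words in each sentence

for sentence in sentences1:  # iterate over sentences1 from step 1
    print(sentence.word_counts)  # word-to-count mapping for this sentence
print(word_list)  # for comparison: the word lists built in step 2

Output: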

defaultdict(<class 'int'>, {'no': 1, 'matter': 1, 'how': 1, 'many': 1, 'characters': 1, 'are': 1, 'available': 1, 'for': 1, 'your': 1, 'password': 1, 'you': 1, 'should': 1, 'be': 1, 'sure': 1, 'to': 1, 'use': 1, 'every': 1, 'one': 1, 'of': 1, 'them': 1})
defaultdict(<class 'int'>, {'the': 3, 'more': 2, 'characters': 1, 'available': 1, 'for': 1, 'your': 1, 'password': 1, 'and': 1, 'you': 1, 'use': 1, 'makes': 1, 'it': 1, 'that': 1, 'much': 1, 'harder': 1, 'to': 1, 'figure': 1, 'out': 1, 'combination': 1})
defaultdict(<class 'int'>, {'always': 1, 'make': 1, 'use': 1, 'of': 1, 'all': 1, 'characters': 1, 'available': 1, 'for': 1, 'a': 1, 'strong': 1, 'and': 1, 'secure': 1, 'password': 1})
[WordList(['No', 'matter', 'how', 'many', 'characters', 'are', 'available', 'for', 'your', 'password', 'you', 'should', 'be', 'sure', 'to', 'use', 'every', 'one', 'of', 'them']), WordList(['The', 'more', 'characters', 'available', 'for', 'your', 'password', 'and', 'the', 'more', 'you', 'use', 'makes', 'it', 'that', 'much', 'harder', 'to', 'figure', 'out', 'the', 'combination']), WordList(['Always', 'make', 'use', 'of', 'all', 'characters', 'available', 'for', 'a', 'strong', 'and', 'secure', 'password'])]
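
The loop prints one mapping per sentence. With TextBlob, counts for the whole text come from word_counts on the full blob, as in step 3. The underlying idea of merging per-sentence counts can be sketched with collections.Counter; this is an illustration on the three short sentences from step 4, with a simplified tokenizer that is an assumption, not TextBlob's.

```python
from collections import Counter

# The three sentences of the text used in step 4.
sentences = [
    "Beautiful is better than ugly.",
    "Explicit is better than implicit.",
    "Simple is better than complex.",
]


def sentence_counts(sentence):
    """Count lowercase words in one sentence (drops the trailing period)."""
    return Counter(sentence.lower().rstrip(".").split())


per_sentence = [sentence_counts(sentence) for sentence in sentences]
total = sum(per_sentence, Counter())  # adding Counters sums their counts

print(total["better"])
print(total.most_common(3))
```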

6. Part-of-speech tagging

# The tags property returns a (word, tag) pair for every word

tags = blob1.tags
print(tags)

Output:

[('No', 'DT'), ('matter', 'NN'), ('how', 'WRB'), ('many', 'JJ'), ('characters', 'NNS'), ('are', 'VBP'), ('available', 'JJ'), ('for', 'IN'), ('your', 'PRP$'), ('password', 'NN'), ('you', 'PRP'), ('should', 'MD'), ('be', 'VB'), ('sure', 'JJ'), ('to', 'TO'), ('use', 'VB'), ('every', 'DT'), ('one', 'CD'), ('of', 'IN'), ('them', 'PRP'), ('The', 'DT'), ('more', 'JJR'), ('characters', 'NNS'), ('available', 'JJ'), ('for', 'IN'), ('your', 'PRP$'), ('password', 'NN'), ('and', 'CC'), ('the', 'DT'), ('more', 'JJR'), ('you', 'PRP'), ('use', 'VBP'), ('makes', 'VBZ'), ('it', 'PRP'), ('that', 'IN'), ('much', 'JJ'), ('harder', 'JJR'), ('to', 'TO'), ('figure', 'VB'), ('out', 'RP'), ('the', 'DT'), ('combination', 'NN'), ('Always', 'NNS'), ('make', 'VBP'), ('use', 'NN'), ('of', 'IN'), ('all', 'DT'), ('characters', 'NNS'), ('available', 'JJ'), ('for', 'IN'), ('a', 'DT'), ('strong', 'JJ'), ('and', 'CC'), ('secure', 'JJ'), ('password', 'NN')]
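
The tag codes in this output follow the Penn Treebank tag set. A small lookup table makes them easier to read; the sketch below covers only the tags that appear above, and the names PENN_TAGS and describe are made up for illustration.

```python
# Descriptions for the Penn Treebank tags seen in the output above.
PENN_TAGS = {
    "CC": "coordinating conjunction",
    "CD": "cardinal number",
    "DT": "determiner",
    "IN": "preposition or subordinating conjunction",
    "JJ": "adjective",
    "JJR": "adjective, comparative",
    "MD": "modal",
    "NN": "noun, singular or mass",
    "NNS": "noun, plural",
    "PRP": "personal pronoun",
    "PRP$": "possessive pronoun",
    "RP": "particle",
    "TO": "to",
    "VB": "verb, base form",
    "VBP": "verb, non-3rd person singular present",
    "VBZ": "verb, 3rd person singular present",
    "WRB": "wh-adverb",
}


def describe(tagged):
    """Attach a readable description to each (word, tag) pair."""
    return [(word, tag, PENN_TAGS.get(tag, "unknown tag")) for word, tag in tagged]


# A few (word, tag) pairs copied from the output above.
sample = [("No", "DT"), ("matter", "NN"), ("characters", "NNS"), ("should", "MD"), ("harder", "JJR")]
for word, tag, meaning in describe(sample):
    print(f"{word:<12}{tag:<6}{meaning}")
```

Taggers are statistical and can be wrong: in the output above, "Always" is tagged NNS (plural noun) although it is an adverb.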

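7. Sentiment analysis

(1) Polarity (positive vs. negative): a value in [-1.0, 1.0]; the larger the value, the more positive the text.
(2) Subjectivity (subjective vs. objective): a value in [0.0, 1.0]; the larger the value, the more subjective the text.

Note: sentiment returns both values together.

Example 1 (positive):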

text = "JacksonYee is very handsome "
blob = textblob.TextBlob(text)
result_sentiment = blob.sentiment
print(result_sentiment)

Output:

Sentiment(polarity=0.65, subjectivity=1.0)

Example 2 (negative):

text = "mike is very ugly "
blob = textblob.TextBlob(text)
result_sentiment = blob.sentiment
print(result_sentiment)

Output:

Sentiment(polarity=-0.9099999999999999, subjectivity=1.0)
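
In practice the polarity score is often turned into a coarse label. The sketch below shows one way to do that using the two results printed above. The Sentiment namedtuple here is a stand-in with the same field names, not TextBlob's own class, and both label_polarity and its 0.1 threshold are arbitrary assumptions for illustration.

```python
from collections import namedtuple

# Stand-in with the same field names as the printed result; not TextBlob's class.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])


def label_polarity(polarity, threshold=0.1):
    """Map a polarity in [-1.0, 1.0] to a coarse label (threshold is arbitrary)."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Scores copied from the two examples above.
results = {
    "JacksonYee is very handsome": Sentiment(0.65, 1.0),
    "mike is very ugly": Sentiment(-0.9099999999999999, 1.0),
}
for text, (polarity, subjectivity) in results.items():
    print(text, "->", label_polarity(polarity), f"(subjectivity={subjectivity})")
```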

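8. Machine translation

# Note: translate() calls the Google Translate web service. It was deprecated in
# TextBlob 0.16.0 and removed in 0.18.0, so this step needs an older release.
# 1. Translate from English to Chinese

english_test = "Jackson is very handsome "
english_blob = textblob.TextBlob(english_test)
chinese_test = english_blob.translate(from_lang='en',to='zh-CN')  # from_lang is the source language ('en' = English); to is the target ('zh-CN' = Simplified Chinese)
print(chinese_test)
# 2. Translate from Chinese to English

ch_test = "保持饥饿,保持学习"
ch_blob = textblob.TextBlob(ch_test)
en_test = ch_blob.translate(from_lang='zh-CN',to='en')  # source is 'zh-CN' (Simplified Chinese); target is 'en' (English)
print(en_test)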

Stay hungry, keep learning
Jackson_MVP