20191126_2_English Sentiment Analysis

Latest recommended article published on 2022-04-21 20:35:27

Happy丶lazy

Category: 接单 (commissioned jobs). Tags: sentiment analysis, python

Article link: https://blog.csdn.net/qq_39309652/article/details/103448409

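This job was mainly about counting positive, negative, and neutral phrases in English reviews. It relies on a couple of libraries and not much else.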

import pandas as pd
from textblob import TextBlob

# Load the data
test = pd.read_excel('爬虫结果.xls')

# Preview the first few rows
test.head()

	text
0	These are great but not much better then gen1....
1	Everyone is posting that there isn’t a differe...
2	These AirPods are amazing they automatically p...
3	My son really wanted airpods but his parents t...
4	Poor quality microphone. Not suitable for a re...

# Polarity ranges from -1.0 (negative) to 1.0 (positive)
# Reference: https://blog.csdn.net/ziyonghong/article/details/83928347
def function(x):
    testimonial = TextBlob(x)
    # sentiment.polarity is a float in [-1.0, 1.0]: the closer to -1 the more
    # negative, the closer to 1 the more positive
    a = testimonial.sentiment.polarity
    if a < -0.5:
        return '消极'  # negative
    elif a > 0.5:
        return '积极'  # positive
    else:
        return '中立'  # neutral

# Label every row; the result goes into the 'laber' column
test['laber'] = test.apply(lambda x: function(x['text']), axis=1)

test.head()

	text	laber
0	These are great but not much better then gen1....	中立
1	Everyone is posting that there isn’t a differe...	中立
2	These AirPods are amazing they automatically p...	中立
3	My son really wanted airpods but his parents t...	中立
4	Poor quality microphone. Not suitable for a re...	中立
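
All five previewed reviews come out as 中立 (neutral). That is what the cutoffs imply: only a polarity below -0.5 or above 0.5 leaves the neutral band, so mildly worded reviews stay in the middle. A minimal sketch of that mapping on plain floats, without calling TextBlob; the name `polarity_to_label` and the `threshold` parameter are hypothetical and introduced here only for illustration, and the sample polarity values are made up.

```python
def polarity_to_label(polarity: float, threshold: float = 0.5) -> str:
    """Map a polarity score in [-1.0, 1.0] to one of the three labels."""
    if polarity < -threshold:
        return '消极'  # negative
    if polarity > threshold:
        return '积极'  # positive
    return '中立'      # neutral


# Made-up polarity values, only to show where the cutoffs fall.
samples = [-0.9, -0.5, 0.0, 0.35, 0.5, 0.8]
labels = [polarity_to_label(p) for p in samples]
print(labels)
```

Because the comparisons are strict, a score of exactly -0.5 or 0.5 still counts as neutral. Lowering `threshold` would move more reviews out of the neutral class; whether that suits the data is a judgment call.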

# Count how often each label occurs
test['laber'].value_counts()

中立    2496
积极    1044
消极      20
Name: laber, dtype: int64
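
To read these counts as proportions, the arithmetic can be done directly. A small sketch using only the three numbers printed above; the dictionary is typed in by hand here rather than read from `test`.

```python
# Counts copied from the value_counts() output above.
counts = {'中立': 2496, '积极': 1044, '消极': 20}

total = sum(counts.values())
shares = {label: count / total for label, count in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

On the DataFrame itself, `test['laber'].value_counts(normalize=True)` should give the same fractions without the manual step.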

# Group the rows by label with groupby
rawgrp = test.groupby('laber')
# With only string columns, sum concatenates the strings in each group
chapter = rawgrp.agg('sum')

# Convert all text to lower case so phrase matching is case-insensitive
chapter['text'] = chapter['text'].str.lower()

chapter

	text
laber
中立	these are great but not much better then gen1....
消极	estuvieron funcionando bien pero la batería no...
积极	excellent, pretty useful... easy to use and re...
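
What the group-and-concatenate step does can be spelled out without pandas. A rough sketch on a few made-up (label, text) pairs; `rows` and `merge_by_label` are hypothetical names used only for this illustration. One deliberate difference from plain string summation: the texts are joined with a newline, so the last word of one review and the first word of the next do not fuse into a single token.

```python
from collections import defaultdict


def merge_by_label(rows, sep="\n"):
    """Concatenate the lower-cased texts that share a label."""
    buckets = defaultdict(list)
    for label, text in rows:
        buckets[label].append(text.lower())
    return {label: sep.join(texts) for label, texts in buckets.items()}


# Made-up rows for illustration; not taken from the scraped data.
rows = [
    ('中立', 'Works fine'),
    ('积极', 'Easy to use'),
    ('中立', 'Nothing special'),
]
merged = merge_by_label(rows)
print(merged['中立'])
```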

# Neutral
n = []
a = ['works fine', 'describe honestly', 'commonly speed', 'general speed', 'general speed']
# Count the occurrences of each phrase with str.count
for i in a:
    n.append(chapter.loc['中立', 'text'].count(i))
n

[3, 0, 0, 0, 0]

# Negative
n = []
a = ['poor quality', 'unclearly', 'rough', 'slow delivery', 'over time', 'wrong address', 'no reply', 'impatient', 'ineffective']
for i in a:
    n.append(chapter.loc['消极', 'text'].count(i))
n

[0, 0, 0, 0, 0, 0, 0, 0, 0]

# Positive
n = []
# Note: 'wrong address' also appears in the negative list above
a = ['high grade', 'high quality', 'easy to use', 'quick delivery', 'good packaging', 'wrong address', 'intact', 'return in time', 'friendly', 'effective']
for i in a:
    n.append(chapter.loc['积极', 'text'].count(i))
n
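
One caveat with `str.count` is that it counts substrings, not whole words: 'rough' also matches inside 'through', and 'effective' from the positive list also matches inside 'ineffective'. A possible alternative, sketched here with the standard `re` module and word boundaries; the helper name `count_phrase` and the sample sentence are made up for illustration and are not part of the original job.

```python
import re


def count_phrase(text: str, phrase: str) -> int:
    """Count occurrences of phrase that start and end on a word boundary."""
    pattern = r'\b' + re.escape(phrase) + r'\b'
    return len(re.findall(pattern, text))


# Made-up text for illustration.
text = 'came through rough handling intact, effective at first, ineffective later'
for phrase in ['rough', 'effective']:
    # Columns: phrase, substring count, whole-word count
    print(phrase, text.count(phrase), count_phrase(text, phrase))
```

In the loops above, this would amount to replacing `chapter.loc[..., 'text'].count(i)` with `count_phrase(chapter.loc[..., 'text'], i)`, at the cost of somewhat slower matching on long strings.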