Python NLTK Sentiment Analysis

廷益--飞鸟

Article link: https://blog.csdn.net/weixin_45875105/article/details/108253961

File download:
Link: https://pan.baidu.com/s/1WeEyUKfrYoaZNd-jpl_UCA Extraction code: ge7a

"""
    Movie review sentiment analysis
"""
# Requires the NLTK movie_reviews corpus (e.g. via nltk.download("movie_reviews"))
import nltk.corpus as nc
import nltk.classify as cf
import nltk.classify.util as cu
from nltk.tokenize import wordpunct_tokenize

# Load the positive samples: each review becomes a {word: True} feature dict
pos_data = []
fileids = nc.movie_reviews.fileids("pos")

for fileid in fileids:
    sample = {}
    words = nc.movie_reviews.words(fileid)
    for word in words:
        sample[word] = True
    pos_data.append((sample, "POSITIVE"))


print(len(pos_data))

# Load the negative samples
neg_data = []
fileids = nc.movie_reviews.fileids("neg")

for fileid in fileids:
    sample = {}
    words = nc.movie_reviews.words(fileid)
    for word in words:
        sample[word] = True
    neg_data.append((sample, "NEGATIVE"))


# Split the dataset: 80% for training, 20% for testing
pnumb, nnumb = int(len(pos_data) * 0.8), int(len(neg_data) * 0.8)
train_data = pos_data[:pnumb] + neg_data[:nnumb]
test_data = pos_data[pnumb:] + neg_data[nnumb:]

# Build the model: naive Bayes classifier
model = cf.NaiveBayesClassifier.train(train_data)

# Compute the accuracy on the test set
acc = cu.accuracy(model, test_data)
print(acc)

# Simulate a business scenario: classify new, unseen reviews
reviews = [
    'It is an amazing movie.',
    'This is a dull movie. I would never recommend it to anyone.',
    'The cinematography is pretty great in this movie.',
    'The direction was terrible and the story was all over the place.']

for review in reviews:
    sample = {}
    # Lowercase and split off punctuation so the tokens match the corpus
    # vocabulary (a plain split() would yield tokens such as 'movie.')
    words = wordpunct_tokenize(review.lower())
    for word in words:
        sample[word] = True
    pcls = model.classify(sample)
    print(review, '->', pcls)
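
The feature dictionaries built above are bag-of-words presence features: each distinct token maps to True. The classifier can only use a token if the same string also appeared in training, so new text should be tokenized the way the corpus is. Below is a minimal sketch of that idea using only the standard library. The helper names `tokenize` and `bag_of_words` and the regular expression are assumptions made for illustration, not part of the original script or of NLTK.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    # \w+ matches runs of word characters; [^\w\s]+ matches runs of punctuation.
    return re.findall(r"\w+|[^\w\s]+", text.lower())


def bag_of_words(tokens):
    """Map each distinct token to True (presence features)."""
    return {token: True for token in tokens}


review = "It is an amazing movie."

# Whitespace splitting keeps the capital letter and the trailing period.
naive = bag_of_words(review.split())

# Regex tokenization yields lowercase words with punctuation as separate tokens.
cleaned = bag_of_words(tokenize(review))

print(sorted(naive))
print(sorted(cleaned))
```

With whitespace splitting, the last feature is 'movie.' rather than 'movie', which is unlikely to match any token the model saw during training.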
