Looking at "Hi, Mom" (《你好,李焕英》) Through Data on Baidu Brain


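Hi, Mom (《你好,李焕英》)

Synopsis

One day in 2001, Jia Xiaoling (played by Jia Ling), who has just been admitted to university, goes through one of the great ups and downs of her life. She has always wanted to make her mother proud, but her mother suffers a sudden, serious accident, leaving her grief-stricken. In the middle of this emotional breakdown, Jia Xiaoling unexpectedly finds herself back in 1981, where she meets her young mother, Li Huanying (played by Zhang Xiaofei); the two become inseparable, like best friends. Along the way she also gets to know a group of innocent, kind-hearted friends. Xiaoling believes that in this "vast world" she can use her knowledge of the future to help her mother "achieve great things", but the outcome surprises her…

Review clip

Poster

Project structure

I. Crawling the data

II. Data analysis

III. Review analysis

I. Crawling the Data

1. Environment setup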

# For a persistent install, install into a persistent path, as in the example below:
!mkdir -p /home/aistudio/external-libraries
!pip install beautifulsoup4 -t /home/aistudio/external-libraries
!pip install lxml -t /home/aistudio/external-libraries
!pip install wordcloud -t /home/aistudio/external-libraries
import pandas
# Also add the following code; each time the environment (kernel) starts, you only need to run it again:
import sys 
sys.path.append('/home/aistudio/external-libraries')
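
Re-running that cell appends the same directory to `sys.path` again each time. A minimal sketch of an idempotent variant follows; the helper name `add_library_path` is hypothetical and not part of the original notebook.

```python
import sys

def add_library_path(path):
    """Append path to sys.path only if it is not listed yet (hypothetical helper)."""
    if path not in sys.path:
        sys.path.append(path)

# Path taken from the cell above; calling the helper twice still leaves one entry.
add_library_path('/home/aistudio/external-libraries')
add_library_path('/home/aistudio/external-libraries')
print(sys.path.count('/home/aistudio/external-libraries'))
```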

2. Fetching the data

Douban link: https://movie.douban.com/subject/34841067/?from=playing_poster

import json
import re
import requests
from bs4 import BeautifulSoup

def crawl_data(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 Edg/87.0.664.66'
    }
    url = 'https://movie.douban.com/subject/34841067/reviews' + url
    try:
        response = requests.get(url, headers=headers)  # send the request and get the page back
        parse(response)  # parse the page and write each review to the JSON file

    except Exception as e:
        print(e)

# Parse a fetched page and save each review as one JSON line.
# fp is the output file opened in the __main__ block further down.
def parse(response):
    # Passing a document (here a string) to the BeautifulSoup constructor returns a document object
    soup = BeautifulSoup(response.text, 'lxml')

    # All <div> tags whose class is "main review-item"
    review_list = soup.find_all('div', {'class': 'main review-item'})

    for review_div in review_list:
        # Author
        author = review_div.find('a', {'class': 'name'}).text
        # Publication time
        pub_time = review_div.find('span', {'class': 'main-meta'}).text
        # Rating
        rating = review_div.find('span', {'class': 'main-title-rating'})
        if rating:
            rating = rating.get('title')
        else:
            rating = ""
        # Title
        title = review_div.find('div', {'class': 'main-bd'}).find('a').text
        # Is there an "unfold" button?
        is_unfold = review_div.find('a', {'class': 'unfold'})
        if is_unfold:
            # Get the review id
            review_id = review_div.find('div', {'class': 'review-short'}).get('data-rid')
            # Fetch the full content
            content = get_fold_content(review_id)
        else:
            content = review_div.find('div', {'class': 'short-content'}).text
        if content:
            content = re.sub(r"\s", '', content)

        item = {
            "author": author,
            "pub_time": pub_time,
            "rating": rating,
            "title": title,
            "content": content
        }

        fp.write(json.dumps(item) + "\n")
    # If there is a next page (the last page may have no "next" span at all)
    next_span = soup.find('span', {'class': 'next'})
    next_url = next_span.find('a') if next_span else None
    if next_url:
        # Request the next page
        crawl_data(next_url.get('href'))

# Fetch the full text of a folded review and strip its HTML tags
def get_fold_content(review_id):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 Edg/87.0.664.66'
    }
    url = "https://movie.douban.com/j/review/{}/full".format(review_id)
    resp = requests.get(url, headers=headers)
    data = resp.json()
    content = data['html']
    content = re.sub(r"(<.+?>)", "", content)
    return content
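
The scraper strips tags with a regular expression and then removes all whitespace. Character entities such as `&amp;` would survive that step. Below is a small standard-library sketch of the same cleanup with entity decoding added. `strip_tags` is a hypothetical helper, and whether the Douban endpoint returns entities at all is an assumption.

```python
import html
import re

def strip_tags(fragment):
    """Remove HTML tags, decode entities, then drop whitespace (hypothetical helper)."""
    text = re.sub(r"<[^>]+>", "", fragment)   # drop anything that looks like a tag
    text = html.unescape(text)                # turn &amp; into &, and so on
    return re.sub(r"\s", "", text)            # same whitespace removal as the scraper

print(strip_tags('<p>笑中带泪 &amp; 感人</p><br/>'))
```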

This may take a while; a little patience is all it needs.

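if __name__=='__main__':
    # fp is a module-level name, so parse() can write to it
    with open("reviews.json", 'w', encoding='utf-8') as fp:
        start_url = '?sort=time&start=0'
        print("Crawler running, please do not do anything else. A message will appear when it finishes, and reviews.json will be created in the current directory.")
        crawl_data(start_url)   # call the crawl_data function defined above
    print("Crawl finished")
Crawler running, please do not do anything else. A message will appear when it finishes, and reviews.json will be created in the current directory.
Crawl finished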
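
The crawler above follows the "next" link by calling itself again for every page, and a single failed request ends the run. As a hedged alternative, here is a sketch that follows the links in a loop and retries a failed fetch with a growing delay. `fetch_with_retry`, `crawl_all` and the fake pages are illustrative names, not part of the original project, and the fetcher is passed in so the sketch runs without network access.

```python
import time

def fetch_with_retry(fetch, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url); on failure wait and retry, doubling the delay each time."""
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception as exc:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the error
            print("attempt {} failed ({}), retrying".format(attempt + 1, exc))
            sleep(delay * (2 ** attempt))

def crawl_all(fetch, start_url, max_pages=1000):
    """Follow 'next' links in a loop. fetch(url) returns (items, next_url_or_None)."""
    url, collected, pages = start_url, [], 0
    while url and pages < max_pages:
        items, url = fetch_with_retry(fetch, url)
        collected.extend(items)
        pages += 1
    return collected

# Stand-in for a real page fetcher: three fake pages linked by "next" URLs.
FAKE_PAGES = {
    '?start=0': (['r1', 'r2'], '?start=20'),
    '?start=20': (['r3'], '?start=40'),
    '?start=40': (['r4'], None),
}

def fake_fetch(url):
    return FAKE_PAGES[url]

print(crawl_all(fake_fetch, '?start=0'))  # ['r1', 'r2', 'r3', 'r4']
```

In a real run, `fetch` would wrap `requests.get` plus the parsing step and return the parsed reviews together with the `href` of the next page.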

3. Installing a Chinese font

# Show the current operating system
!cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.6 LTS"
NAME="Ubuntu"
VERSION="16.04.6 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.6 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
# Default font directory on Linux
!ls /usr/share/fonts/
cmap  truetype	type1  X11
# List the Chinese TTF fonts available on the system
!fc-list :lang=zh | grep ".ttf"
/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf: Droid Sans Fallback:style=Regular

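### Upload the SimHei (黑体) font from the local machine

# !wget https://mydueros.cdn.bcebos.com/font/simhei.ttf
# Copy the font file into matplotlib's font directory
!cp SIMHEI.TTF /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf/
# Normally copying the font file into the system font directory is enough, but on AI Studio that path is not writable, so this approach does not work here
# !cp simhei.ttf /usr/share/fonts/

# Create a per-user font directory (-p keeps the command safe to re-run)
!mkdir -p .fonts
# Copy the file into that directory
!cp SIMHEI.TTF .fonts/
# Clear matplotlib's font cache so the new font is picked up
!rm -rf .cache/matplotlib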

II. Data Analysis

# Import the required Python packages
import json
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
import seaborn as sns
import jieba
import collections
import wordcloud
sns.set_style("darkgrid")
plt.rcParams['font.sans-serif'] = ['SimHei']

import warnings
warnings.filterwarnings("ignore")
# Convert the JSON data into a pandas DataFrame so that it can be analysed with DataFrame methods. The code is as follows.
# Read the data and convert the format
items = []
with open("reviews.json","r",encoding='utf-8') as fp:
    for line in fp:
        review = json.loads(line)
        items.append(review)
print("First user's review")
print(items[0])
First user's review
{'author': '少年', 'pub_time': '2021-02-18 21:51:36', 'rating': '力荐', 'title': '妈妈说有点无聊', 'content': '看之前我跟妈妈说过程不要玩手机哦“好的!”妈妈爽快地答应道快到结束的时候,我感动地稀里糊涂但往旁边一看,妈妈却是有点坐不住了于是我硬是把泪忍住,不然显得尴尬又突兀电影一播放完她马上掏出手机我还纳闷这电影对于妈妈来说也没这么无聊吧只见她掏出手机按亮了屏幕看了一眼之后说:“快五点了,得赶紧回家做饭,你晚上想吃什么呀”忽然想到一直以来有许多时候爸爸妈妈给我打电话或者发短信时我都假装当时没看到那会的我或许是在看电影、或许是在工作、或许是正好在开车、但也或许是玩..甚至是无所事事我总把自己的大部分事都放在爸爸妈妈的事之前而在爸爸妈妈心里我总是排很前边甚至在看电影的时候都心心念念着是不是到点了晚上回去该给我做什么菜吃电影很棒非常感人但那些镜头所描述的或许不过是天下母亲所做的其中一分respect贾玲!respect妈妈!'}
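
The crawler writes one JSON object per line, so each line can be parsed on its own. As a quick sanity check before moving to pandas, a standard-library sketch like the one below can tally the ratings. `count_ratings` and the three sample records are made up for illustration; only the `rating` field name comes from the scraper.

```python
import json
from collections import Counter

def count_ratings(lines):
    """Count the 'rating' values in an iterable of JSON Lines strings."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        review = json.loads(line)
        # The scraper stores "" when a review carries no star rating
        counts[review.get('rating') or '(no rating)'] += 1
    return counts

# Made-up sample records in the same shape as reviews.json
sample = [
    json.dumps({"author": "a", "rating": "力荐"}),
    json.dumps({"author": "b", "rating": ""}),
    json.dumps({"author": "c", "rating": "力荐"}),
]
print(count_ratings(sample))
```

With the real file, the same function can be called as `count_ratings(open("reviews.json", encoding="utf-8"))`.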
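# Further processing: handle missing values and missing ratings, and convert the time format
item_list = [[item['author'],item['pub_time'],item['rating'],item['title'],item['content']] for item in items]
review_df = pd.DataFrame(item_list,columns=['author','pub_time','rating','title','content'])
# Drop rows with missing values
review_df.dropna(inplace=True)
# Label reviews without a rating as '放弃' (abstained). Use .loc here:
# review_df[mask]['rating'] = ... would assign to a temporary copy and change nothing
review_df.loc[review_df['rating'] == '', 'rating'] = '放弃'
# Convert the time strings to datetime
review_df['pub_time'] = pd.to_datetime(review_df['pub_time'])
review_df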
authorpub_timeratingtitlecontent
0少年2021-02-18 21:51:36力荐妈妈说有点无聊看之前我跟妈妈说过程不要玩手机哦“好的!”妈妈爽快地答应道快到结束的时候,我感动地稀里糊涂但...
1萧碧宰治2021-02-18 21:51:10推荐影评这一部概括来说就是贾玲致敬她妈妈的故事。故事本身确实很感人,尤其是她妈妈达观的生活态度令我动...
2〃May2021-02-18 21:50:52你好,李焕英前面我是真的笑到肚子痛,全场都是一片欢声笑语的,剧情是真的搞笑,演员也是真的演的好,才让我们...
3L.2021-02-18 21:50:06你好,李焕英这是贾玲送给她母亲的礼物,没想到她的母亲还没等到就去世了。这电影又好笑又感人,电影开始是贾玲...
4赦火骑士2021-02-18 21:48:31你好吗?电影《你好,李焕英》不是电影的“电影”,如果这部电影的票房大卖,只有两个原因,一是目前没有真正好...
5纳凉2021-02-18 21:48:19推荐她与时光同行《你好,李焕英》这部影片站在导演贾玲自己的情感角度来看,是她作为孝顺女儿送给她现实生活中的妈...
6喜剧2021-02-18 21:46:37推荐真诚可弥补其它严格意义上来讲,这是一部中规中矩的喜剧,穿越/梦境部分处理的还不够平滑,电影里的笑点/包袱我...
7你身边2021-02-18 21:46:19这简直就不是电影。是加长小品。不知道看过的人什么感受。我的感觉做为一部电影他是不合格的,不说是烂片也差不多。第一,贾玲穿越...
8身后有个小太阳2021-02-18 21:46:01下辈子我们不做母女,做好姐妹吧带着7岁的外甥女去看的电影,整部剧据她说她只看懂了“欢迎光临”,我对此一点也不意外,毕竟整部...
9Proeng2021-02-18 21:42:28力荐关于李焕英和贾玲自由意志选择的思考?是宿命论?还是命定论???代入48岁李焕英穿越视角二刷,(影片开始第30分钟的那一场对话,有一个很明显的察觉到48岁李...
10乌鸦火堂2021-02-18 21:41:23推荐你的笑容,是花季少女时妈妈的模样“从我有记忆开始,妈妈就是中年妇女的模样,所以我会忘记,她也曾是花季少女”。电影《你好,李焕...
11苏蘇2021-02-18 21:40:45❤️今天也去贡献了票房,一个人坐在电影院里,还是落下了眼泪。想起就在今天中午,我妈还一件件地试着...
12畅妈2021-02-18 21:40:23力荐反转得意想不到。之前贾玲的这个小品就看了四五回,舞台上贾玲看着母亲泪流满面,我们在电视机前也是情难自制。当知...
13狗儿荡眼镜儿哥2021-02-18 21:39:58感情85桥段85搞笑75道具59过年档在家人强烈推荐下走进了影院…做为一个没心没肺的男性观众在周围窸窸窣窣的抽纸巾与擤鼻子声...
14六木2021-02-18 21:39:34一家三口哭着出来母亲照顾瘫痪在床的外婆,今天临时抽空我们一家三口去电影院,网上买了两张票,到电影院已经满座了...
15ssppyy2021-02-16 15:44:25力荐【拾遗拾忆】《你好,李焕英》:贾玲以及她母亲●2016年,贾玲开了一家公司叫大碗娱乐,第一个小品是《你好,李焕英》。作品根据贾玲真实故事...
16summertime2021-02-16 16:01:40你好,妈妈!(依兰爱情故事)果然电影李焕英也没有让我失望,比起三年前的小品李焕英,电影版更加细腻真挚,可见贾玲正在以自己...
17好哈享受2021-02-16 16:48:01很差电影始终是一个面向公共的商品,如果你希望公众理解你,花钱的应该是你看这部电影两个重要因素始终不能逃避开。第一个,那部同名小品的影响。第二个,贾玲对自己母亲的缅...
18鸽子和小马2021-02-18 21:37:23推荐哭了在春节前看到了预告就一直想去看,之前还特地搜了27分钟的小品看了。电影故事情节一点也不老套,...
19hht2021-02-18 21:36:58你好 李焕英春节期间电影档很多pyq晒电影也很多,在看了这么多天pyq发现你好,李焕英是唯一一部没有吐槽...
20小勋和妈妈2021-02-18 21:36:52推荐重新解读《你好,李焕英》,贾晓玲的妈妈也许并没有那么美好《你好,李焕英》的票房截止至今已有30亿,远远超过预期,并且网络上好评如潮,就连权威媒体都对...
21健康与理想2021-02-18 21:36:21第二遍才更看懂我的父母也是一直的贫穷,他们对我的期望也从来都是“健康快乐”就好。他们经历了许多许多的苦难,...
22小豆一言2021-02-18 21:35:28母亲,自我记事起便是中年妇女的模样今天,终于带着我的“李焕英”去看了你好,李焕英。不出意外,哭成狗。我的“李焕英”却笑着对我说...
23&简单v快乐&2021-02-18 21:34:33推荐你好,李焕英大过年的,本来不打算看哭唧唧的电影。可经不住朋友推荐就去看了。有笑点,有泪点(请多备纸巾调整...
24静大人2021-02-18 21:34:01你好 李焕英今天一口气看了三个电影,刺杀小说家、人潮汹涌还有李焕英,记忆最深刻的还是李焕英,现在已经回到...
25Preferrr_2021-02-18 21:33:07力荐“因为神不是无处不在,所以创造了妈妈”贾玲在拍这部片子时一定有很感性的因素在里面,因为这是在纪念她妈妈,所以电影里的李焕英其实表现...
26capableAlice2021-02-18 21:31:22愿这是一封寄往人间的信#你好,李焕英#毫无例外地和我妈在影院哭成了两只鹅。“唐探三没法和李焕英相提并论”,电影散场...
27漂亮的格涅2021-02-18 21:29:59力荐你好,李焕英,你好,妈妈!这部2021年02月12日推出的贺岁档高分喜剧电影《你好,李焕英》,截止发稿前本片在豆瓣上已...
28狂算子2021-02-16 12:37:35我的观后感当年贾玲和白凯南在春晚说相声,我特别烦贾玲,后来,我黑转粉了。我觉得妖猫的杨贵妃应该贾玲演。...
29Pretend°🐳2021-02-16 13:56:38Hi,mam个人点评,没有一点专业领域常识不合理地方:1.贾玲的穿越以及降临,应该是全剧最不合理的地方,...
..................
4284沫沫的旅程2021-02-13 00:01:47值8分,推荐首先,故事说的很完整流畅,普通人都能看明白的完整。我觉得面对普通人的电影,首先就应该把故事说...
4285Paki2021-02-12 23:58:29焕英让人惊喜一开始并没有报太大期望去看,看完后却想用“低开高走”形容它,这绝不是个贬义。故事前半段比较平...
4286觅渡2021-02-12 23:43:28真挚的感情最动人1,电影的结构不算精巧,穿越的设定也比较传统,但胜在双穿越后的感情升华到了高潮,从女儿单向的...
4287墨染浅秋2021-02-12 23:15:13推荐你好,李焕英今年春节档电影,有了一个不好的开头,预期被《唐人街探案3》拉的很低。第二场又是爆满的喜剧片,...
4288jayharry2021-02-12 22:10:35力荐“妈妈希望你健康快乐就好~”一直很喜欢玲姐和腾哥的搭档,总觉得只要这俩人在一起就可以笑的很开心,大年初一看电影就选择了《...
4289星辰寒霜2021-02-12 21:20:40很差先喜后悲玩不腻吗?每个人开始有记忆的时候,母亲就是个中年妇女了。可是,每个母亲都曾经是个花季少女啊,是为了儿女...
4290南方有北2021-02-12 20:05:00力荐《你好,李焕英》影评【含轻微剧透,慎瞅,易影响初次观影体验】因为对贾玲和沈腾特别有信心,所以早就很期待这部电影了...
4291西岭2021-02-12 19:54:55力荐不要随便非议相信英子会为有贾玲这样的女儿而自豪。这是是一部纪念自己母亲的电影,不要随便非议了…相信英子会...
4292谢朝青2021-02-12 19:26:42力荐多角度评价你好李焕英(非专业影评人)色调摄影:开头最让我惊艳的是,80年代的色调,从黑白变成彩色,晓玲第一次不再平面的认识那个年...
4293地球守护者2021-02-12 17:38:18非导演出身也可以有好作品我看好多人说导演门槛低什么的,我们这学导演的时候老师就讲过为什么非导演出身能导出好作品,什么...
4294狂风不要浪2021-02-12 16:33:50结尾好评这些给一星的,说贾玲消费自己母亲恰烂钱的,怕都是没有妈的孤儿吧。还说别人不能做导演了,想起一...
4295方聿南2021-02-12 15:57:23推荐妈妈的孩子李焕英像你的妈妈吗?不好说。但如果只说关于她的某一点,相信许多人会点头称是。从小到大,妈妈会...
4296pickhool2021-02-12 12:01:55力荐唐探拉踩你好李焕英我有点看不懂了,大早上也够忙的,贬低你好李焕英剧组也是让我真真切切的看到了陈思诚导演与运营的...
4297Max晓🎵2021-02-09 10:32:30第一次忍不住写影评,只想说,真正的悲伤是不能一次又一次撕开给人看的!写这篇影评之前,真的挺喜欢贾玲的,不管是她在综艺节目中会照顾嘉宾,还是她反应很快能接梗,贾玲...
4298评丫-晓莉2021-02-08 21:17:46不一样的贾玲当年看贾玲小品的时候把我逗乐了,现在看个电影莫名伤感想哭,这部电影,我想一定很精彩(公众号:...
4299评丫-晓莉2021-02-06 22:51:12把我给看哭了怎么说呢,看过小品,令我流泪到不行,因为我女朋友就是这样的经历,本想带她一起看,可惜没坚持到...
4300评丫-晓莉2021-02-06 22:25:34爆米花又要大卖了我居然没看过这个小品,今年过年不能回家,就不能去妈妈坟前看看她了,以看这个电影来怀念一下她吧...
4301段钰2021-02-06 01:12:11较差艺人的门槛低了 导演的门槛也低了 很多人不是在搞艺术就是单纯的背着牌坊赚钱看那个小品的时候觉得矫情比煽情还多如果你觉得工作(艺人算是在工作吧)重要那就不要事后哔哔如果...
4302爬太空摘下月亮2021-02-03 17:31:46期待看到一部搞笑的好看的电影记得但是看完这个小品感动死了,看到沈腾更期待,当然这一切的前提是男主是沈腾,不要学羞羞的铁拳...
4303睡懒觉的猫_10792021-02-02 17:47:15力荐已经订好票了,要去看说实话原来对贾玲的小品感觉一般,但是当初看了她这个《你好,李焕英》的小品,让我很感动。也许是...
4304Hello金智媛2021-02-02 16:36:26我想要一直一直一直在你身旁“我有一个很小很小的愿望说出来可能你不相信我想我会我要一直一直一直在你身旁”2016年十月看...
4305小 hei2021-02-02 14:31:55会哭惨了吧想看又不敢看的一部电影,会在影院哭过去了吧,看预告都是一把泪,有笑有泪,最怕感同身受,想念妈...
4306少年不知愁滋味2021-01-31 20:07:10力荐如果我能回到过去,我不会让我爸妈结婚如题如果我能回到过去,我想我会撮合我妈和另外一个男人,不为别的,以后没有我也无所谓跟我爸爸太...
4307曼景2021-01-28 20:28:00人性之光:善电影还没正式上映,仅凭一些电影片段和之前的小品记忆都满心期待着……我是不是太久没逛豆瓣了,现...
4308南若柔2021-01-19 10:43:56匿号里环影医学有两个目标,一是治病,二是防病。目前的趋势,正由指向疾病的医学转向指向人的健康的医学。这...
4309姬如紫♀2021-01-14 09:24:14腻号离幻音小时候,她父亲贪酒,酒品又不好,喝醉了就耍酒疯,骂人。在她的记忆中,每到傍晚的时候,如果父亲...
4310滕王阁2021-01-12 17:19:26令人汗颜啊!如今,导演的门槛也太低了吧?...................................
4311查理沃伦吉姆2020-10-29 23:48:20先给打个八分,不够再加!贾玲、沈腾、孙集斌、何欢、魏翔,看这阵容,影片应该差不了。贾玲导演的喜剧处女作,应该会在故事...
4312华语娱乐酱2020-03-24 10:58:32贾玲也要当导演了啊?花费三年筹拍,邀来老搭档沈腾、张小斐助阵贾玲是中国内地著名喜剧演员,她不仅放得开而且性格讨喜,收获了很多观众的喜爱。最近贾玲又有好消...
4313影视口碑榜2019-09-29 16:21:55贾玲筹备了三年的电影,到底要拍些什么?故事的小品版早就播出了小编常常以为,明星大都遥不可及,但了解到他们背后的故事以后,我们才能够明白,每个人都不容易,...

4314 rows × 5 columns

# Analyse the review dates
import re
from matplotlib import dates
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
Font = FontProperties(fname='SIMHEI.TTF', size=16)
fig, ax = plt.subplots(figsize=(15, 5))

# 2. Add a new pub_date column
review_df['pub_date'] = review_df['pub_time'].dt.date
# Keep only reviews from 12 to 18 February 2021. Comparing full timestamps avoids
# also keeping dates in other months whose day of month falls between 12 and 18
review_df = review_df[(review_df['pub_time'] >= '2021-02-12') & (review_df['pub_time'] < '2021-02-19')]

# 3. Group by date and plot the number of reviews per day on the axes created above
review_date_df = review_df.groupby('pub_date').size()
review_date_df.plot(kind='line', ax=ax)
[Figure: line chart of the number of reviews per day (output_22_2.png)]
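
When restricting reviews to the opening week, comparing full dates is safer than comparing only the day of the month, because a review from 14 January also has a day between 12 and 18. A small standard-library sketch of the difference follows; the helper names are hypothetical, and the window assumes the 12 to 18 February 2021 range used in this analysis.

```python
from datetime import date

START, END = date(2021, 2, 12), date(2021, 2, 18)

def in_day_of_month_range(d):
    """Looks only at the day number, so any month and year can pass."""
    return 12 <= d.day <= 18

def in_window(d):
    """Compares the full date against the opening-week window."""
    return START <= d <= END

samples = [date(2021, 2, 12), date(2021, 2, 18), date(2021, 1, 14), date(2020, 3, 24)]
for d in samples:
    print(d, in_day_of_month_range(d), in_window(d))
```

The third sample passes the day-of-month check but falls outside the window.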

# Analyse the time of day at which reviews were posted
import datetime
plt.figure(figsize=(10,5))
time_range = [0,2,4,6,8,10,12,14,16,18,20,22,24]
review_time_df = review_df['pub_time'].dt.hour
time_range_counts = pd.cut(review_time_df,bins=time_range,include_lowest=True,right=False).value_counts()
ax = time_range_counts.plot(kind="bar")
_ = ax.set_xticklabels(labels=time_range_counts.index,rotation=0)

[Figure: bar chart of review counts per two-hour time slot (output_23_0.png)]

Analysis

The chart shows that reviewers are night owls: they are most active between 8 p.m. and 2 a.m.

Douban has raised quite a flock of night owls...
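
With `right=False`, `pd.cut` builds left-closed bins such as [20, 22), so hour 22 lands in the [22, 24) slot. The same bucketing can be sketched with the standard library alone. `two_hour_slot` and `count_by_slot` are hypothetical names, and the sample hours are made up.

```python
from collections import Counter

def two_hour_slot(hour):
    """Map an hour (0 to 23) to its left-closed two-hour slot label."""
    start = (hour // 2) * 2
    return "[{}, {})".format(start, start + 2)

def count_by_slot(hours):
    """Return (slot, count) pairs in clock order, including empty slots."""
    counts = Counter(two_hour_slot(h) for h in hours)
    slots = [two_hour_slot(start) for start in range(0, 24, 2)]
    return [(slot, counts[slot]) for slot in slots]

sample_hours = [21, 21, 23, 0, 1, 9]
for slot, n in count_by_slot(sample_hours):
    if n:
        print(slot, n)
```

Keeping the slots in clock order makes a late-evening peak easier to read off than a chart sorted by count.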

def generate_wc(string_data):
    # Text preprocessing
    pattern = re.compile(r'\t|\n|\.|-|:|;|\)|\(|\?|"')  # regex for the characters to strip
    string_data = re.sub(pattern, '', string_data)  # remove every match
    # Word segmentation
    seg_list_exact = jieba.cut(string_data, cut_all=False)  # precise-mode segmentation
    # Load the stop words into a set for fast membership tests
    with open("work/停用词库.txt", 'r', encoding='utf-8') as fp:
        remove_words = {word.replace("\n", "") for word in fp}
    # Keep only the tokens that are not stop words
    object_list = [word for word in seg_list_exact if word not in remove_words]
    # Word-frequency statistics
    word_counts = collections.Counter(object_list)  # count each token
    word_counts_top20 = word_counts.most_common(20)  # the 20 most frequent words
    # Display the frequencies as a word cloud
    wc = wordcloud.WordCloud(
        font_path='SIMHEI.TTF',  # font file
        background_color="#000000",  # background colour
        max_words=150,  # maximum number of words shown
        max_font_size=60,  # maximum font size
        width=707,
        height=490
    )
    wc.generate_from_frequencies(word_counts)  # build the word cloud from the frequency dict
    plt.figure(figsize=(60,20))
    plt.imshow(wc)  # show the word cloud
    plt.axis('off')  # hide the axes
    plt.show()  # display the image

# Concatenate all review texts into one string
content_str = "".join(review_df['content'])

generate_wc(content_str)
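
The counting step inside `generate_wc` can be tried on its own without jieba or wordcloud. The sketch below assumes the text is already split into tokens; `top_words` and the sample tokens are illustrative only.

```python
from collections import Counter

def top_words(tokens, stop_words, n=20):
    """Return the n most frequent tokens that are not stop words."""
    stop = set(stop_words)  # set lookup is O(1) per token
    counts = Counter(t for t in tokens if t.strip() and t not in stop)
    return counts.most_common(n)

# Made-up tokens standing in for jieba output
tokens = ["电影", "的", "妈妈", "电影", "感人", "的", "妈妈", "电影"]
print(top_words(tokens, stop_words=["的"], n=2))  # [('电影', 3), ('妈妈', 2)]
```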

Save the data as CSV

review_df.to_csv('reviews.csv', index=False)
new_data = review_df['content']
# header=False leaves out the column name; index=False leaves out the row index
new_data.to_csv('comment.csv', index=False, header=False)
review = pd.read_csv('reviews.csv')
print(review.describe())
       author             pub_time rating   title  \
count    4284                 4285   3031    4284   
unique   4118                 4213      6    3775   
top      江湖骗子  2021-02-18 00:03:28     力荐  你好,李焕英   
freq       12                    4   1657     155   

                                                  content    pub_date  
count                                                4283        4283  
unique                                               4237          10  
top     《你好,李焕英》无疑是春节热映电影最大的黑马,虽说是贾玲作为导演的电影处女座,却意料之外地好...  2021-02-16  
freq                                                    2         814  

III. Review Analysis

!head comment.csv -n 3
看之前我跟妈妈说过程不要玩手机哦“好的!”妈妈爽快地答应道快到结束的时候,我感动地稀里糊涂但往旁边一看,妈妈却是有点坐不住了于是我硬是把泪忍住,不然显得尴尬又突兀电影一播放完她马上掏出手机我还纳闷这电影对于妈妈来说也没这么无聊吧只见她掏出手机按亮了屏幕看了一眼之后说:“快五点了,得赶紧回家做饭,你晚上想吃什么呀”忽然想到一直以来有许多时候爸爸妈妈给我打电话或者发短信时我都假装当时没看到那会的我或许是在看电影、或许是在工作、或许是正好在开车、但也或许是玩..甚至是无所事事我总把自己的大部分事都放在爸爸妈妈的事之前而在爸爸妈妈心里我总是排很前边甚至在看电影的时候都心心念念着是不是到点了晚上回去该给我做什么菜吃电影很棒非常感人但那些镜头所描述的或许不过是天下母亲所做的其中一分respect贾玲!respect妈妈!
这一部概括来说就是贾玲致敬她妈妈的故事。故事本身确实很感人,尤其是她妈妈达观的生活态度令我动容。其实看到电影最后贾晓玲(角色名)发现她妈也是穿越的时候,我当时第一个想法是,坏了,这个故事可能又会是一个感动但又无奈和无能为力的一个结尾:晓玲发现她妈妈即将去世,但是她无能为力;妈妈意外去世让晓玲的人生受到了震动,但是她可能仍无力改变已有的人生轨迹,或者说很难改变已有的可见的未来。如果结尾硬让贾晓玲最后功成名就,就像让红楼梦中的贾宝玉最后中举一样,也不是假,但就是令人无法真正接受以至于总有狗尾续貂之感,所以看到这里时,我其实很担心这个电影该怎么结这个尾。但是,最后的电影结尾我觉得非常的好:贾晓玲人生受到了震动,她逐一实现了当初许诺给妈妈的那些“画饼”,并且创造性的给了一个晓玲开着车载着“妈妈”兜风的场景,此时她们脸上都是灿烂的笑容,而一瞬之间想象中的妈妈消失,镜头拉远,晓玲开着“豪车”(桑塔纳改装的敞篷车,她此前给妈妈的一个许诺)远去。最妙的地方在于,晓玲既实现的“富贵”又没有实现,即实现了那些千奇百怪的梦想,这些梦想本身并不真的豪华甚至有点“皇帝的金锄头”一般令人发笑,但就是这些梦想的实现在暗示晓玲生活质量的提升、物质条件的改善的同时又反映出她并没有因为走得太远而忘记为什么出发。这样也就避免了直接用“中举”来表现功成名就导致的低俗、粗制滥造和没心没肺感,比如,如果影片中用了一辆真正的豪车(劳斯莱斯这种),那这种感动就会大打折扣
前面我是真的笑到肚子痛,全场都是一片欢声笑语的,剧情是真的搞笑,演员也是真的演的好,才让我们get到笑点。后面就是歌颂了伟大的母爱,无论在哪里,母亲永远爱自己的孩子,最后那里,我旁边的人都哭的稀里哗啦的,还听到了抽泣的声音,我旁边的人一直在看我哭没哭。总之这部剧,我很推荐,有感动,有欢声笑语

There is little point in writing yet another sentiment-analysis network from scratch. Why not simply use PaddleHub?

# Sentiment analysis
!hub install senta_lstm==1.2.0
The version of PaddlePaddle(2.0.0) or PaddleHub(1.6.0) can not match module, please upgrade your PaddlePaddle or PaddleHub according to the form below.
+--------------+---------+--------------+-----------+
| ResourceName | Version | PaddlePaddle | PaddleHub |
+--------------+---------+--------------+-----------+
| senta_lstm   | 1.2.0   | >=1.8.0      | >=1.8.0   |
+--------------+---------+--------------+-----------+
| senta_lstm   | 1.1.0   | >=1.6.3      | >=1.6.0   |
+--------------+---------+--------------+-----------+
| senta_lstm   | 1.0.0   | -            | -         |
+--------------+---------+--------------+-----------+
!pip uninstall   paddlehub -y
Uninstalling paddlehub-2.0.0:
  Successfully uninstalled paddlehub-2.0.0
!pip install   paddlehub==2.0.0
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Collecting paddlehub==2.0.0
Installing collected packages: paddlehub
Successfully installed paddlehub-2.0.0
import pandas as pd

comment=pd.read_csv('comment.csv',header=None)
print(comment.describe())
                                                        0
count                                                4284
unique                                               4238
top     因为看过贾玲当时的小品,所以能想象到这个电影一定很感人,一直犹豫要不要去电影院看,但今天我过...
freq                                                    2
import paddlehub as hub

mylist = []

senta = hub.Module(name="senta_lstm")
for index, row in comment.iterrows():
    text = row[0]
    # sentiment_classify takes a list of texts and returns one result dict per text
    result = senta.sentiment_classify(texts=[text])[0]

    print(result)
    # Keep the text, the predicted label, both probabilities and the label name
    mylist.append([
        text,
        result['sentiment_label'],
        result['positive_probs'],
        result['negative_probs'],
        result['sentiment_key'],
    ])
[2021-02-19 00:02:13,808] [    INFO] - Installing senta_lstm module


Downloading senta_lstm
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpg9qd_34x/senta_lstm
[==================================================] 100.00%


[2021-02-19 00:02:31,325] [    INFO] - Successfully installed senta_lstm-1.1.0
[2021-02-19 00:02:33,474] [    INFO] - Installing lac module


Downloading lac
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmp61wja_ha/lac
[==================================================] 100.00%


[2021-02-19 00:02:34,777] [    INFO] - Successfully installed lac-2.2.0


{'text': '4.5分。感慨万千,好久没看过这么烂的片子了又是毫无世界观逻辑的穿越烂梗,不过喜剧题材嘛!且按下不表。那么重点的喜剧部分呢?也是技法拙劣,味同嚼蜡。所有的笑点,全都是低级的语言游戏,是小品化,舞台化的,完全没有利用电影独到的视听,问问自己,去掉了字幕,还有几个段子能get到?在所谓的反转前,所有的人物塑造、事件推进、情感建立都非常拧巴,毕竟小品凑电影时长,要掌握好节奏,难呀!反转后,搭着轰隆隆的配乐,如PPT式的移步换景,投下一颗颗悼亡浓度极高的情绪炸弹,这就能赢得叫好声一片?都说创作者的真诚最能打动人,不好意思,郭敬明捞钱的真诚难道就不打动人吗?这片子的商业成绩如此之好,第一,因为选择了年代戏。80年代厂院体制下包裹的集体主义,是老一辈人的青春,竟也是年轻一代人的向往。第二,充分说明了我国人民对于小品,有着硬通货般的强烈需求。为了防止有人拿“阳春白雪,下里巴人”那一套来杠,我最后也得说一句:都别骂cw了,承认吧,全TM都是cw的受众!', 'sentiment_label': 0, 'sentiment_key': 'negative', 'positive_probs': 0.0092, 'negative_probs': 0.9908}
from pandas.core.frame import DataFrame

data=DataFrame(mylist)
data.head(5)
  | 0 | 1 | 2 | 3 | 4
0 | 看之前我跟妈妈说过程不要玩手机哦“好的!”妈妈爽快地答应道快到结束的时候,我感动地稀里糊涂但... | 1 | 0.6377 | 0.3623 | positive
1 | 这一部概括来说就是贾玲致敬她妈妈的故事。故事本身确实很感人,尤其是她妈妈达观的生活态度令我动... | 1 | 0.9543 | 0.0457 | positive
2 | 前面我是真的笑到肚子痛,全场都是一片欢声笑语的,剧情是真的搞笑,演员也是真的演的好,才让我们... | 1 | 0.9417 | 0.0583 | positive
3 | 这是贾玲送给她母亲的礼物,没想到她的母亲还没等到就去世了。这电影又好笑又感人,电影开始是贾玲... | 1 | 0.8919 | 0.1081 | positive
4 | 《你好,李焕英》不是电影的“电影”,如果这部电影的票房大卖,只有两个原因,一是目前没有真正好... | 0 | 0.0021 | 0.9979 | negative
data[4].value_counts()
positive    3452
negative     832
Name: 4, dtype: int64
3452/(832+3452)
0.8057889822595705

By this count, about 80.58% of the reviews are classified as positive.
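
The same arithmetic can be wrapped in a small function so that the rate is recomputed from the label counts instead of being typed by hand. `positive_rate` is a hypothetical helper; the two counts are the ones reported by `value_counts()` above.

```python
def positive_rate(counts):
    """Share of 'positive' labels among all labels in a {label: count} mapping."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labels to summarise")
    return counts.get('positive', 0) / total

counts = {'positive': 3452, 'negative': 832}
rate = positive_rate(counts)
print("{:.2%}".format(rate))  # 80.58%
```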

data.to_csv('comment_senta_')

AI Studio project link

Looking at "Hi, Mom" (《你好,李焕英》) Through Data on Baidu Brain: https://aistudio.baidu.com/aistudio/projectdetail/1551780
