Scraping the Douban Top 250 Movies + Displaying Them with Flask

Latest recommended article published on 2023-01-09 14:51:54

KaiKai-G

Category: Python

Article link: https://blog.csdn.net/kaikai_gege/article/details/115413164

1. Import the packages

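from flask import Flask, render_template
import os
import sqlite3
import jieba                            # Chinese word segmentation
import numpy as np                      # array operations (used to convert the image to an array)
from matplotlib import pyplot as plt    # plotting
from wordcloud import WordCloud         # word cloud
from PIL import Image                   # image handling

app = Flask(__name__)                   # application object used by the @app.route decorators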

2. Set up the home page

The home page can be reached at either of the following addresses:
ip-address:5000 (5000 is the port number)
ip-address:5000/index

# Home page
@app.route('/')
def index():
    return render_template('index.html')

# /index serves the same page
@app.route('/index')
def home():
    return index()
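
As a possible alternative to the wrapper function, Flask allows several route decorators on one view. The following is a minimal sketch, not the author's code: it returns a plain string instead of rendering index.html (an assumption made so that it runs without a templates folder).

```python
from flask import Flask

app = Flask(__name__)


# Stacking two route decorators registers both URLs on the same view,
# so no separate wrapper function is needed.
@app.route("/")
@app.route("/index")
def index():
    # The article renders index.html here; a plain string keeps this
    # sketch runnable without a templates/ folder.
    return "home page"


# Flask's built-in test client lets us request both URLs without a server.
client = app.test_client()
print(client.get("/").status_code, client.get("/index").status_code)
```

Calling app.run() would start the development server, which listens on port 5000 by default; that is where the port in the addresses above comes from.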


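3. Display the 250 movie records on a web page

The records are read from SQLite. How they were scraped and stored in the database is covered in a separate post.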

# movie
@app.route('/movie')
def movie():
    # Query the database and collect all rows in a list
    connect = sqlite3.connect('movies.db')
    cursor = connect.cursor()
    sql = "select * from movie250"
    dataList = cursor.execute(sql).fetchall()
    connect.close()     # read-only query, so no commit is needed
    return render_template('movie.html', dataList=dataList)
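
The query step can be tried without movies.db. The sketch below uses an in-memory database with a hypothetical two-column table and made-up rows (the real movie250 schema is not shown in this post), and wraps the connection in contextlib.closing so it is closed even if the query raises.

```python
import sqlite3
from contextlib import closing


def fetch_movies(connect):
    """Return every row of movie250 as a list of tuples."""
    return connect.execute("select * from movie250").fetchall()


# closing() guarantees connect.close() runs when the block exits.
with closing(sqlite3.connect(":memory:")) as connect:
    # Hypothetical schema and placeholder rows, for illustration only.
    connect.execute("create table movie250 (cname text, score real)")
    connect.executemany(
        "insert into movie250 values (?, ?)",
        [("Movie A", 9.0), ("Movie B", 8.5)],
    )
    rows = fetch_movies(connect)

print(rows)
```

Each row arrives as a tuple, so a template would address the columns by position, for example row[0] for the first column.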


4. Query the score distribution from the database and plot it

ECharts is a handy library for drawing charts from code: https://echarts.apache.org/zh/index.html

# score
@app.route('/score')
def score():
    score = []  # distinct scores
    count = []  # number of movies with each score

    connect = sqlite3.connect('movies.db')
    cursor = connect.cursor()
    sql = "select score,count(score) from movie250 group by score"
    execute = cursor.execute(sql)
    for e in execute:
        score.append(e[0])
        count.append(e[1])
    connect.close()     # the original left the connection open

    return render_template('score.html', score=score, count=count)
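
To see what the GROUP BY query returns, here is a small sketch against an in-memory table with made-up scores. The explicit "order by score" is an addition not present in the route above; it is an assumption that the chart's x-axis should be sorted, since SQL does not guarantee row order without it.

```python
import sqlite3
from contextlib import closing


def score_distribution(connect):
    """Return two parallel lists: distinct scores and how many movies have each."""
    sql = ("select score, count(score) from movie250 "
           "group by score order by score")
    rows = connect.execute(sql).fetchall()
    scores = [row[0] for row in rows]
    counts = [row[1] for row in rows]
    return scores, counts


with closing(sqlite3.connect(":memory:")) as connect:
    # Made-up sample scores, for illustration only.
    connect.execute("create table movie250 (score real)")
    connect.executemany(
        "insert into movie250 values (?)",
        [(9.1,), (8.8,), (9.1,), (9.3,)],
    )
    scores, counts = score_distribution(connect)

print(scores, counts)
```

The two lists line up index by index, which is the shape a bar chart needs: one list for the x-axis categories and one for the bar heights.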


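5. Turn the movie summaries into a word cloud

First, fetch the data from the database and merge it into a single string.
Segment the string with jieba, which splits it into individual words (jieba.cut yields them as a generator), then join the words with spaces to get a single string again.
Open the mask image (white background) and convert it to an array with numpy.
Configure WordCloud with a few parameters and combine the mask with the segmented string.
Use plt to finish the figure, then save it.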

# dec: word cloud
@app.route('/dec')
def dec():

    connect = sqlite3.connect('movies.db')
    cursor = connect.cursor()
    sql = "select inq from movie250"
    result = cursor.execute(sql)
    text = ""
    # Concatenate the query results into one string
    for t in result:
        text = t[0] + text
    connect.close()     # read-only query, so no commit is needed

    # Generate cloud.jpg only if it does not exist yet
    save_path = r'static/assets/img/cloud.jpg'
    if not os.path.exists(save_path):
        cloud(text)
    return render_template('dec.html')

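# Turn the string into a word-cloud image
def cloud(text):

    cut = jieba.cut(text)           # segment the text (returns a generator of words)
    string = ' '.join(cut)          # join the words with spaces
    print(len(string))              # length of the joined string, in characters
    img = Image.open(r'static/assets/img/tree.jpg')     # open the mask image
    img_array = np.array(img)       # convert the image to an array
    wc = WordCloud(                 # word-cloud settings
        background_color='white',   # background color
        mask=img_array,             # mask image that shapes the cloud
        font_path="msyh.ttc"        # Microsoft YaHei (or e.g. r'C:\Windows\Fonts\STZHONGS.TTF'); a font with Chinese glyphs is required, otherwise Chinese characters render as empty boxes
    ).generate_from_text(string)
    plt.figure(1)       # create (or select) figure number 1
    plt.imshow(wc)      # draw the word cloud on the axes
    plt.axis('off')     # hide the x and y axes
    # plt.show()        # uncomment to display the figure
    plt.savefig(r'static/assets/img/cloud.jpg', dpi=500)    # save the figure; dpi sets the resolution (matplotlib's default is the figure's dpi, 100 unless changed)
    plt.close()         # release the figure after saving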
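
One thing to watch in the dec() route: if any inq value in the table is NULL, sqlite3 returns None and the string concatenation raises a TypeError. A minimal sketch of a more defensive way to build the text follows; the helper name collect_text and the sample rows are hypothetical, and whether the real table contains NULLs is not stated in this post.

```python
import sqlite3
from contextlib import closing


def collect_text(connect):
    """Join every non-empty inq value into one string."""
    rows = connect.execute("select inq from movie250").fetchall()
    # Skip NULL (None) and empty values; str.join builds the result in
    # one pass instead of creating a new string on every iteration.
    return "".join(row[0] for row in rows if row[0])


with closing(sqlite3.connect(":memory:")) as connect:
    # Made-up sample rows, including one NULL, for illustration only.
    connect.execute("create table movie250 (inq text)")
    connect.executemany(
        "insert into movie250 values (?)",
        [("first quote",), (None,), ("second quote",)],
    )
    text = collect_text(connect)

print(len(text))
```

The route above prepends each row while this sketch appends; for a word cloud the order of the source text should not matter, since only word frequencies are used.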

Finally, the complete source code has been posted separately.
