Python Tricks: How to Send Your Dear Readers a Morning Briefing Every Day

Latest recommended article published on 2024-05-14 15:20:13

小詹学 Python

Article link: https://blog.csdn.net/weixin_40787712/article/details/96410866

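Morning briefing

I was chatting with a friend recently and listened to her complain for quite a while...

She runs online communities. Every morning she collects news, organizes it, lays it out, and pushes it to her groups. That takes about 30 minutes a day and wears her out.

So is there a way to automate this workflow? That is how we arrived at Python, which "can do everything except have babies". Read on!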

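The idea is simple and has 3 steps:

1. Collect information from the target website;

2. Turn the collected information into a morning briefing image;

3. Send the generated image to a WeChat group or friend.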

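I. Collecting the morning briefing data

Plenty of websites publish morning briefings. To avoid advertising I won't name the one I used, although its address appears in the code for demonstration purposes. Below I call it site A. Site A has a dedicated morning briefing section.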

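1. Get the URL of the latest daily briefing

First we get the link to site A's latest morning briefing page. Viewing the page source shows that every entry sits in an li element and the link we want is inside an h2, so with the analysis done we can extract the link.

Start by importing the network request libraries: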

import requests
from bs4 import BeautifulSoup

Clicking the first briefing entry opens https://www.pmtown.com/archives/197318.html, but the link we get with find is the relative path /archives/197318.html, so we have to assemble the full URL ourselves, as shown below.

# Get the URL of the first (latest) morning briefing
obj1 = requests.get('http://www.pmtown.com/archives/category/早报')
url_obj = BeautifulSoup(obj1.text, 'lxml')
url = url_obj.find('h2').find('a').get('href')
first_url = 'http://www.pmtown.com' + url
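Concatenating strings works here because the extracted href starts with a slash. As a small alternative sketch, the standard library's urljoin resolves a relative link against the page it was found on and also copes with an href that is already absolute. The variable names below are illustrative and not taken from the original script.

```python
from urllib.parse import urljoin

# Page on which the link was found (the category page used above)
base_url = 'http://www.pmtown.com/archives/category/早报'

# A site-relative href, as returned by find('h2').find('a').get('href')
relative_link = '/archives/197318.html'
first_url = urljoin(base_url, relative_link)

# An href that is already absolute is returned unchanged
absolute_link = 'https://www.pmtown.com/archives/197318.html'
same_url = urljoin(base_url, absolute_link)

print(first_url)
print(same_url)
```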

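2. Get the content of the daily briefing page

The briefing only needs the news headlines. The page is simple: every headline sits in a p element, so we extract their text directly.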

# Fetch the latest briefing page
obj = requests.get(first_url)
obj_1 = BeautifulSoup(obj.text, 'lxml')
titles = obj_1.find_all('p')

# Collect the news headlines
a = []
for title in titles:
    a.append(title.get_text())
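A page may contain p elements that are empty or hold only whitespace, and those would end up as blank lines in the image. A minimal sketch of a clean-up pass over the extracted strings follows. clean_titles is a hypothetical helper, not part of the original script, and the sample list is made up.

```python
def clean_titles(raw_titles):
    """Strip surrounding whitespace and drop entries that are empty."""
    cleaned = []
    for text in raw_titles:
        text = text.strip()
        if text:
            cleaned.append(text)
    return cleaned


# Made-up sample resembling the output of get_text() on each p element
sample = ['  科技头条 ', '', '\n', '某公司发布新款手机\u3000']
print(clean_titles(sample))
```

In the script above it could be applied as a = clean_titles(a) right after the loop.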

[Screenshot: part of the extracted content]

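3. Text processing

Site A's briefing has four sections: tech headlines, domestic news, overseas news, and investment and acquisitions. In the extracted text only the tech headlines arrive as a list of separate headlines. Each of the other three sections arrives as one long string, so those strings must be split into lists.

We define a function that splits the headlines of the domestic news, overseas news, and investment and acquisitions sections into new lists, so that all four sections share the same format.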

import re

# Normalize a section's text into a list of headlines
def get_text(text_orgin):
    # Replace each headline number (such as '1、') with the marker 'SP', then split on it
    first_list = re.sub(r'\d{1,2}、', 'SP', text_orgin)
    mid_list = first_list.split('SP')
    # The piece before the first number is not a headline, so drop it
    finnal_list = mid_list[1:]
    return finnal_list
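One thing to watch with the marker approach: if a headline itself contains the letters SP, the split cuts it in two. A possible variant, sketched under the same assumption that every headline starts with a one- or two-digit number followed by 、, splits on the number pattern directly with re.split. split_numbered is a hypothetical name and the sample text is made up.

```python
import re


def split_numbered(text):
    """Split '1、...2、...' style text into a list of headlines."""
    parts = re.split(r'\d{1,2}、', text)
    # Drop the empty piece before the first number and any blank pieces
    return [part.strip() for part in parts if part.strip()]


# Made-up sample; note that the first headline contains 'SP'
sample = '1、A公司完成SPAC上市2、B公司发布新系统'
print(split_numbered(sample))
```

Like the original, this would still mis-split a headline that happens to contain a number followed by 、 in its body.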

The step above stripped the headline numbers, so we need to add new ones:

# Number each item and return the numbered list
def inf_list(inf_orgin):
    inf_after = []
    for num, single_info in enumerate(inf_orgin):
        inf_after.append(u'%s、%s' % ((num + 1), single_info))
    return inf_after
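The same numbering can be written more compactly. This sketch uses enumerate with start=1 and an f-string (Python 3.6 or later). number_items is a hypothetical name and is meant to behave like inf_list.

```python
def number_items(items):
    """Prefix each item with its 1-based position, e.g. '1、...'."""
    return [f'{num}、{item}' for num, item in enumerate(items, start=1)]


print(number_items(['first headline', 'second headline']))
```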

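II. Generating the morning briefing image

First import the drawing library. Here we use PIL (installed as the Pillow package).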

from PIL import Image, ImageDraw, ImageFont

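1. Draw the briefing header

Set the fonts and colors; the fonts are reused later. Each font path must point to a font installed on your machine. On Windows, fonts are usually in the C:/Windows/Fonts folder. If the path is wrong, the program raises an error.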

# Font and color settings
font_type = 'C:/Windows/Fonts/simkai.ttf'
font_medium_type = 'C:/Windows/Fonts/simkai.ttf'
header_font = ImageFont.truetype(font_medium_type, 55)
title_font = ImageFont.truetype(font_medium_type, 20)
font = ImageFont.truetype(font_type, 38)
color = "#726053"
color1 = "#294E76"
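Because a wrong font path stops the script, it can help to check a few candidate paths before calling ImageFont.truetype. The sketch below uses only the standard library. pick_font_path is a hypothetical helper, and the non-Windows paths are assumptions that will differ between machines.

```python
import os


def pick_font_path(candidates):
    """Return the first candidate path that exists, or None if none do."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Candidate locations; only the first one comes from this article
candidates = [
    'C:/Windows/Fonts/simkai.ttf',
    '/System/Library/Fonts/PingFang.ttc',              # assumed macOS location
    '/usr/share/fonts/truetype/wqy/wqy-microhei.ttc',  # assumed Linux location
]

font_type = pick_font_path(candidates)
if font_type is None:
    print('No candidate font found; set font_type by hand')
else:
    print('Using font:', font_type)
```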

(1) Draw the title

header_x and header_y are the coordinates where the text is drawn, color is the title color, and header_font is the title font.

# Start drawing. `draw` is assumed to be an ImageDraw.Draw object for the canvas; its creation is not shown here
header = '互联网日报'  # "Internet Daily"
header_x = 130
header_y = 200
draw.text((header_x, header_y), u'%s' % header, color, header_font)
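The drawing calls in this section use a draw object whose creation is not shown. A minimal sketch of how such an object is typically set up with Pillow follows. The canvas size and background color are assumptions, and make_canvas is a hypothetical helper.

```python
from PIL import Image, ImageDraw, ImageFont


def make_canvas(width=900, height=1600, background='#FFFFFF'):
    """Create a blank RGB image and a drawing object bound to it."""
    image = Image.new('RGB', (width, height), background)
    draw = ImageDraw.Draw(image)
    return image, draw


image, draw = make_canvas()

# load_default() avoids depending on a font file; it is unlikely to cover
# Chinese characters, so real use would pass a truetype font as in the article
demo_font = ImageFont.load_default()
draw.text((130, 200), 'Internet Daily', fill='#726053', font=demo_font)

# After all drawing is done the image would be saved, e.g. image.save('日报.jpeg')
print(image.size)
```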

(2) Draw the subtitle

title = '由python脚本自动生成'  # "Generated automatically by a Python script"
title_x = header_x
title_y = header_y + 80
draw.text((title_x, title_y), u'%s' % title, color1, title_font)

(3) Add the current time

Stamping the image with its generation time is purely to make it look fancier.

First import the time module, then draw:

import time

cur_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
cur_time_x = 666
cur_time_y = title_y
cur_time_font = ImageFont.truetype(font_type, 20)
draw.text((cur_time_x, cur_time_y), u'%s' % cur_time, color, cur_time_font)

[Image: preview of the header]

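2. Draw the briefing content

Headlines can be long (up to 2 lines) and the image has a maximum width, so line wrapping has to be handled. My approach is to convert the original list into a new list in which every entry fits the chosen width (I use 750px), as shown below.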

def old_to_new_list(oldlist):
    # Split every headline wider than 750px into several lines.
    # font.getlength() needs Pillow 8.0 or later; older code used
    # font.getsize(text)[0], which Pillow 10 removed.
    newlist = []
    for single_text in oldlist:
        single_text = single_text.strip()
        newStr = ''
        # Take characters from single_text one at a time until the line would exceed 750px
        for item in single_text:
            if newStr and font.getlength(newStr + item) > 750:
                newlist.append(newStr)
                newStr = item
            else:
                newStr += item
        # Keep whatever is left as the last line of this headline
        if newStr:
            newlist.append(newStr)
    print(newlist)
    return newlist
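The wrapping logic can be exercised without a font by passing the width-measuring function in as a parameter. This is a sketch rather than the article's code: wrap_text and measure are hypothetical names, and in the real script measure would be a function returning the pixel width of a string in the chosen font.

```python
def wrap_text(text, max_width, measure):
    """Greedily split text into lines whose measured width fits max_width."""
    lines = []
    current = ''
    for char in text:
        # Start a new line when adding this character would overflow
        if current and measure(current + char) > max_width:
            lines.append(current)
            current = char
        else:
            current += char
    if current:
        lines.append(current)
    return lines


# Demo: treat every character as 1 unit wide, so len() is the measure
print(wrap_text('abcdefgh', 3, len))
```

Unlike a two-line-only approach, this keeps wrapping until the whole headline is consumed, so nothing is dropped when a headline needs three or more lines.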

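Next we define a function that does the drawing. It takes the starting coordinates x and y, the list to draw, the line height, and the section title text. The benefit of a function is that the code doesn't have to be repeated.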

def draw_info(x, y, the_list, linehigh, title_text):
    # Draw the section title, then one headline per line starting 80px below it
    draw.text((x, y), u'%s' % (title_text), color, font)
    for num, info in enumerate(the_list):
        height = num * linehigh
        draw.text((x, y + height + 80), u'%s' % (info), color, font)

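For example, to draw the tech news section we set the coordinates, the title, and the content list, then call the function above. The other sections (domestic news, overseas news, and financing and acquisitions) work the same way, so they are not shown.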

# Draw the tech news section
# gn and linehigh are not defined in the snippets shown here; they come from the full script
keji_x = title_x - 30
keji_y = title_y + 88
title_text = '【科技新闻】'  # "[Tech News]"
keji_text = a[1:gn:2]
keji_newlist = old_to_new_list(keji_text)
draw_info(keji_x, keji_y, keji_newlist, linehigh, title_text)
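To stack the remaining sections under one another, each section's starting y has to account for the lines already drawn. The sketch below derives it from the layout that draw_info uses (title at y, first line 80px below it, one line every linehigh pixels). next_section_y and the gap value are assumptions for illustration.

```python
def next_section_y(y, line_count, linehigh, gap=40):
    """Return the y coordinate at which the following section can start."""
    # draw_info puts the title at y and line i at y + 80 + i * linehigh
    bottom_of_last_line = y + 80 + line_count * linehigh
    return bottom_of_last_line + gap


# Example: a section starting at y=368 with 5 wrapped lines, 50px per line
print(next_section_y(368, 5, 50))
```

The value 368 matches keji_y in the snippet above (title_y is 280, plus 88); the line count and line height are made-up inputs.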

[Image: the finished briefing]

III. Sending the briefing to friends or a WeChat group

First import the wxpy library, a Python library for working with WeChat.

from wxpy import *
import time

# Get the system time
cur_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
# Initialize the bot
bot = Bot()

You can send the briefing to a friend just by filling in the friend's nickname. To send it to several friends, wrap the code in a loop.

myfriends = bot.friends().search('好友昵称')[0]  # replace '好友昵称' with the friend's nickname
myfriends.send('python自动早报到了 ' + cur_time)  # "The automated Python morning briefing has arrived"
myfriends.send_image('日报.jpeg')
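Sending to several friends in a loop might look like the following sketch. It assumes only the calls already used in this article (bot.friends().search(name), send and send_image) and skips names that are not found. send_to_friends is a hypothetical helper, and the small stand-in classes exist only so the example runs without logging in to WeChat.

```python
def send_to_friends(bot, names, text, image_path):
    """Send a text and an image to every friend found; return names not found."""
    missing = []
    for name in names:
        matches = bot.friends().search(name)
        if not matches:
            missing.append(name)
            continue
        friend = matches[0]
        friend.send(text)
        friend.send_image(image_path)
    return missing


# Stand-ins that mimic the few wxpy calls used above (for demonstration only)
class FakeFriend:
    def __init__(self):
        self.sent = []

    def send(self, text):
        self.sent.append(('text', text))

    def send_image(self, path):
        self.sent.append(('image', path))


class FakeFriends:
    def __init__(self, known):
        self.known = known

    def search(self, name):
        return [self.known[name]] if name in self.known else []


class FakeBot:
    def __init__(self, known):
        self._friends = FakeFriends(known)

    def friends(self):
        return self._friends


alice = FakeFriend()
fake_bot = FakeBot({'Alice': alice})
not_found = send_to_friends(fake_bot, ['Alice', 'Bob'], 'hello', '日报.jpeg')
print(not_found)
```

With a real wxpy bot the call would take the bot object created above in place of fake_bot, plus a list of real nicknames.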

You can also send the briefing to a WeChat group in much the same way:

groups = ['微信群的名字']  # replace with the names of your WeChat groups
for send_OBJ in groups:
    # Search for the current group name, not the whole list
    my_groups = bot.groups().search(send_OBJ)[0]
    my_groups.send('python自动早报到了 ' + cur_time)
    my_groups.send_image('日报.jpeg')

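If you run your own WeChat groups and need to put together a morning briefing every day, or you work in a role such as internet operations, this should save you a lot of time!

For things you do once, look for a workable solution; for things you do repeatedly, look for the best one. If you need the complete code, reply 「日报」 to the 小詹学Python official account to get it.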
