Fetching Chinese-Russian translation content from the WeChat official account [俄语摆渡]

随笔备忘录

Last modified: 2023-02-24 14:23:53

Category: Python. Tags: python

First published: 2023-02-24 14:09:28

Article link: https://blog.csdn.net/weixin_42199542/article/details/129199608

# Import the required modules
from requests import request
from bs4 import BeautifulSoup

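# Helper function
# Fetch a page and return its HTML text
def acquirehref(href):
    response = request('GET', href, timeout=10)
    response.raise_for_status()  # fail early on HTTP errors
    return response.text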

# URL of the WeChat article page
htmltext = acquirehref('https://mp.weixin.qq.com/s/nffP8boYAQGkhr39zzj0VQ')

# Parse the page (htmltext is already a decoded str, so no encoding is set on the soup)
BeautS = BeautifulSoup(htmltext, 'html.parser')

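# Get the article title (news_title) and use it as the output file name
news_title = BeautS.find_all('h1', {'class': 'rich_media_title'})
file_name = 'untitled'  # fallback so file_name is defined even if no title is found
for each in news_title:
    file_name = each.text.strip()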

# Get the article body and save it to a file named after the article title
itemsnews = BeautS.find_all('div', {'class': 'rich_media_content'})
with open(f'{file_name}.txt', 'w', encoding='utf-8') as f:
    for each in itemsnews:
        f.write(each.text.strip() + '\n')
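
The script above uses the article title directly as the file name. Titles can contain characters such as `/`, `?` or `:` that some file systems reject, in which case `open` would fail. Below is a minimal sketch of one way to clean the title first. The helper name `safe_filename`, its `fallback` and `max_length` parameters, and the chosen character set are assumptions for illustration and are not part of the original script.

```python
import re

# Hypothetical helper, not part of the original script.
# Matches characters that Windows rejects in file names, plus control characters.
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, fallback='untitled', max_length=100):
    """Return a version of `title` that can be used as a file name."""
    # Replace each invalid character with an underscore, then cap the length.
    cleaned = _INVALID_CHARS.sub('_', title)[:max_length]
    # Drop leading/trailing spaces and dots, which some systems handle badly.
    cleaned = cleaned.strip(' .')
    # Fall back to a fixed name if nothing usable is left.
    return cleaned or fallback


# Example with a made-up title containing characters that need replacing.
print(safe_filename('Что нового? / 新闻: 中俄对照'))
```

In the script, one way to apply it would be `file_name = safe_filename(each.text.strip())` when reading the title.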