Scraping the Complete Recipe List for the Game 创造与魔法 (Creation and Magic)

linx_eric

Published 2023-03-13 15:18:21

License: CC 4.0 BY-SA

Tags: games, python, pandas

Article link: https://blog.csdn.net/linx_eric/article/details/129495049

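This code example uses Python's requests library to fetch a web page, parses the HTML with lxml's etree module, and extracts the text inside <p> tags whose align attribute is "center". The extracted text is written to a file named 食谱大全.txt, giving a simple scrape-and-save-locally workflow.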

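import requests
from lxml import etree


# Spoof the User-Agent so the request looks like it comes from a browser
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.63'
}

# Target URL
url = 'https://www.9game.cn/czymf/2174371.html'
response = requests.get(url, headers=headers)
# print(response)
page = response.text
# print(page)


# Parse the data: select every <p> element with align="center"
tree = etree.HTML(page)
paragraphs = tree.xpath('//p[@align="center"]')
# print(paragraphs)

# Collect the text fragments of every matched paragraph.
# (The original wrote only the last paragraph, because the file was
# written after the loop using the loop variable's final value.)
lines = []
for p in paragraphs:
    title = p.xpath('.//text()')
    print(title)
    lines.extend(title)


# Store the data: one text fragment per line
with open('食谱大全.txt', 'w', encoding='utf8') as f:
    for line in lines:
        f.write(line + "\n")


# Optional: export one row per paragraph to Excel with pandas
# import pandas as pd
# rows = [p.xpath('.//text()') for p in paragraphs]
# df = pd.DataFrame(rows)
# df.to_excel('c.xlsx')


# Optional: persist the raw HTML
# with open('创造与魔法.html', 'w', encoding='utf8') as fp:
#     fp.write(page)
# print("Scraping finished")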
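The text fragments returned by `.//text()` often include whitespace-only strings and padding around the actual text, and these end up as blank or ragged lines in the output file. The following is a minimal sketch of one way to tidy them before writing. The helper names `clean_fragments` and `join_row` are hypothetical and not part of the original script, and the sample strings are invented placeholders rather than data from the page.

```python
def clean_fragments(fragments):
    """Strip surrounding whitespace from each fragment and drop empty ones."""
    cleaned = []
    for text in fragments:
        # Turn non-breaking spaces into ordinary spaces, then trim both ends
        stripped = text.replace('\xa0', ' ').strip()
        if stripped:
            cleaned.append(stripped)
    return cleaned


def join_row(fragments, sep='\t'):
    """Join the cleaned fragments of one paragraph into a single line."""
    return sep.join(clean_fragments(fragments))


if __name__ == '__main__':
    # Invented sample input, shaped like the lists that .//text() returns
    sample = ['\r\n  ', 'dish', '\xa0', ' ingredient x1 \n']
    print(clean_fragments(sample))
    print(join_row(sample))
```

With helpers like these, each paragraph's fragments could be passed through `join_row` before being written, so that one paragraph becomes one line in the text file. Whether a tab is the right separator depends on how the page lays out its paragraphs, which the original post does not describe.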