Reading the content of a specified div with Python

Latest recommended article published 2023-06-10 17:52:50

aixi1895



Original link: http://www.cnblogs.com/ai594ai/p/6929027.html


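This article shows how to scrape web pages with Python, focusing on extracting the information inside a specified div tag from HTML. Using example code, it explains how to parse HTML with libraries such as BeautifulSoup or PyQuery and extract the target content.

Summary generated automatically by CSDN.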

# -*- coding:utf-8 -*-

from bs4 import BeautifulSoup
import urllib.request
import re

# If the source is a URL, the page can be fetched this way
# html_doc = "http://tieba.baidu.com/p/2460150866"
# req = urllib.request.Request(html_doc)
# webpage = urllib.request.urlopen(req)
# html = webpage.read()



html = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title" name="dromouse"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" rel="external nofollow" class="sister" id="xiaodeng"><!-- Elsie --></a>,
<a href="http://example.com/lacie" r
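
The listing above breaks off inside the sample HTML, before any div is read. As a minimal sketch of that step, and not the original author's code, the example below uses BeautifulSoup's `find` and `find_all` to pull the text out of a div selected by id or by class. The sample page, the `content` id, the `post` class, and the helper names `read_div_text` and `read_divs_by_class` are all assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; the ids and class names are illustrative assumptions.
SAMPLE_HTML = """
<html><body>
<div id="header">Site header</div>
<div id="content" class="post">
  <p>First paragraph.</p>
  <p>Second paragraph.</p>
</div>
<div class="post">Another post</div>
</body></html>
"""


def read_div_text(html, div_id):
    """Return the text of the div with the given id, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", id=div_id)
    if div is None:
        return None
    # strip=True drops whitespace-only strings; separator joins the remaining pieces
    return div.get_text(separator=" ", strip=True)


def read_divs_by_class(html, css_class):
    """Return the text of every div carrying the given CSS class."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        div.get_text(separator=" ", strip=True)
        for div in soup.find_all("div", class_=css_class)
    ]


if __name__ == "__main__":
    print(read_div_text(SAMPLE_HTML, "content"))
    print(read_divs_by_class(SAMPLE_HTML, "post"))
```

For a live page, the bytes returned by `urllib.request.urlopen(...).read()` (as in the commented-out lines of the listing) could be passed to these helpers in place of `SAMPLE_HTML`.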
