A Worked Example: Extracting Specific Data from a Web Page

Latest recommended article published on 2024-09-06 16:24:38

Abvedu

Category: Python. Tags: Python, BeautifulSoup, html, data

Article link: https://blog.csdn.net/abvedu/article/details/54866613

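This article shows how to use Python's BeautifulSoup library to extract the data you need from an HTML file. Once the HTML is parsed, specific pieces of information on the page can be located by tag and read out.

(Summary generated automatically by CSDN.)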

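BeautifulSoup lets us find the specific data we want in a web page by way of the page's tags. This example traces, step by step, how an HTML file is turned into the data we are after. The Python program is as follows: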

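from bs4 import BeautifulSoup
import requests

url = 'http://new.cpc.com.tw/division/mb/oil-more4.aspx'

# Download the page and parse it into a BeautifulSoup tree
html = requests.get(url).text
bs = BeautifulSoup(html, 'html.parser')
# print(bs)

# Find every <span id="Showtd">; the first match is used below
data = bs.find_all('span', {'id': 'Showtd'})
# print(data)

# Collect every <tr> (table row) inside that span
rows = data[0].find_all('tr')
# print(rows)

prices = []
for i, row in enumerate(rows):
    if i < 16:
        print(row)  # raw HTML of the first 16 rows
    cols = row.find_all('td')
    # Skip rows with fewer than four <td> cells (for example header rows),
    # which would otherwise raise IndexError, and rows whose second cell is empty
    if len(cols) >= 4 and len(cols[1].text) > 0:
        item = [cols[0].text, cols[1].text, cols[2].text, cols[3].text]
        prices.append(item)

# Print the first 16 extracted records
for p in prices[:16]:
    print(p)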
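
The live page may change or become unavailable, so the same steps can also be tried offline. The following is a minimal sketch that runs the extraction against a small inline HTML string. The markup in `SAMPLE_HTML`, the values in it, and the helper name `extract_rows` are all made up for illustration and are not taken from the real page. Unlike the program above, this sketch also strips surrounding whitespace from each cell.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with invented values; the real page's
# layout is an assumption here, not something verified.
SAMPLE_HTML = """
<span id="Showtd">
  <table>
    <tr><th>Date</th><th>Item A</th><th>Item B</th><th>Item C</th></tr>
    <tr><td>2017/01/02</td><td>24.1</td><td>25.6</td><td>27.6</td></tr>
    <tr><td>2017/01/09</td><td></td><td>25.9</td><td>27.9</td></tr>
    <tr><td>2017/01/16</td><td>24.7</td><td>26.2</td><td>28.2</td></tr>
  </table>
</span>
"""


def extract_rows(html, span_id="Showtd", n_cols=4):
    """Return the text of the first n_cols cells of each usable table row."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", {"id": span_id})
    if span is None:
        # The container was not found, so there is nothing to extract
        return []
    records = []
    for row in span.find_all("tr"):
        cols = row.find_all("td")
        # Keep only rows with enough <td> cells and a non-empty second cell
        if len(cols) >= n_cols and cols[1].text.strip():
            records.append([c.text.strip() for c in cols[:n_cols]])
    return records


prices = extract_rows(SAMPLE_HTML)
for p in prices:
    print(p)
```

In this sample the header row uses `<th>` cells, so `find_all("td")` returns an empty list for it and the row is skipped. The row with an empty second cell is skipped as well, which leaves two records.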

Now, by following how the contents of each variable change along the way, we can understand the extraction