How to extract data from a website with Python: extracting HTML data from a sports website with Python 3

weixin_39890633

I have been trying to extract data from a sports site, so far without success. I am trying to extract the values 35, Shots on Goal, and 23 from the match statistics:

35

Shots on Goal

23

from bs4 import BeautifulSoup
import requests

result = requests.get("https://www.scoreboard.com/uk/match/lvbns58C/#match-statistics;0")
src = result.content
soup = BeautifulSoup(src, 'html.parser')
stats = soup.find("div", {"class": "tab-statistics-0-statistic"})
print(stats)

This is the code I have been trying to use. When I run it, it prints "None". Could someone help me print out the data?

Solution

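The statistics on this site are rendered by JavaScript, so the HTML that requests downloads does not contain them and soup.find() returns None. One option is to load the page with Selenium, wait for the statistics to appear, and then parse the rendered HTML with BeautifulSoup: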

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

# initialize a headless Chrome driver
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
# the first argument is the path to the chromedriver executable ('<>' is left as a placeholder)
wd = webdriver.Chrome('<>', options=chrome_options)

# load the page via selenium
wd.get("https://www.scoreboard.com/uk/match/lvbns58C/#match-statistics;0")

# wait up to 30 seconds for the element with ID 'statistics-content' to be present
table = WebDriverWait(wd, 30).until(EC.presence_of_element_located((By.ID, 'statistics-content')))

# parse the inner HTML of that element
soup = BeautifulSoup(table.get_attribute('innerHTML'), 'html.parser')
print(soup)

# close the selenium driver
wd.quit()
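
Once the rendered HTML is available, the three values from the question still have to be picked out of it. The following is only a minimal sketch of that step: the original markup is not shown in the question, so the sample HTML and the class names statRow, homeValue, titleValue and awayValue are hypothetical stand-ins, and extract_stats is an illustrative helper rather than part of the answer above. The real class names should be taken from the page source in the browser's developer tools.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the rendered statistics block.
# The class names are assumptions; the real site may use different ones.
SAMPLE_HTML = """
<div id="statistics-content">
  <div class="statRow">
    <div class="homeValue">35</div>
    <div class="titleValue">Shots on Goal</div>
    <div class="awayValue">23</div>
  </div>
</div>
"""


def extract_stats(html):
    """Return a list of (home, label, away) tuples, one per statistics row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for row in soup.find_all("div", class_="statRow"):
        home = row.find("div", class_="homeValue")
        title = row.find("div", class_="titleValue")
        away = row.find("div", class_="awayValue")
        # find() returns None when nothing matches, so skip incomplete rows
        if home is None or title is None or away is None:
            continue
        rows.append((
            home.get_text(strip=True),
            title.get_text(strip=True),
            away.get_text(strip=True),
        ))
    return rows


for home, label, away in extract_stats(SAMPLE_HTML):
    print(home, label, away)
```

With the Selenium code above, the same helper could be called as extract_stats(table.get_attribute('innerHTML')) before wd.quit(), once the class names have been adjusted to match the real page.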
