Python requests and dynamically loaded pages: using requests.get to parse HTML that does not load all at once

I am trying to write a Python script that will periodically check a website to see if an item is available. I have used requests.get, lxml.html, and xpath successfully in the past to automate website searches. In the case of this particular URL (http://www.anthropologie.com/anthro/product/4120200892474.jsp?cm_vc=SEARCH_RESULTS#/) and others on the same website, my code was not working.

import requests
from lxml import html

page = requests.get("http://www.anthropologie.com/anthro/product/4120200892474.jsp?cm_vc=SEARCH_RESULTS#/")
tree = html.fromstring(page.text)
html_element = tree.xpath(".//div[@class='product-soldout ng-scope']")

At this point, html_element should be a list of elements (only one in this case, I think), but instead it is empty. I think this is because the website does not load all at once, so when requests.get() fetches the page, it grabs only the first part. So my questions are:

1: Am I correct in my assessment of the problem?

and

2: If so, is there a way to make requests.get() wait before returning the HTML, or perhaps another route entirely to get the whole page?

Thanks

Edit: Thanks to both responses. I used Selenium and got my script working.
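The edit does not show the Selenium code, so the following is only a sketch of what such a script might look like, not the poster's actual solution. It assumes Selenium and a Firefox driver are installed. The names class_token_xpath and is_sold_out are hypothetical, and treating a timeout as "not sold out" is an assumption made for illustration. The XPath matches product-soldout as one class token rather than requiring the exact string 'product-soldout ng-scope', since ng-scope is a class that AngularJS adds at runtime.

```python
def class_token_xpath(tag, css_class):
    """Build an XPath matching `tag` elements that have `css_class` as one of
    their class tokens, regardless of which other classes are present."""
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {css_class} ')]"
    )


def is_sold_out(url, timeout=10):
    """Load `url` in a real browser and report whether a sold-out marker appears.

    Assumes Selenium and a Firefox driver are installed. The imports are done
    lazily so that class_token_xpath stays usable without Selenium.
    """
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    xpath = class_token_xpath("div", "product-soldout")
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Give the page's JavaScript up to `timeout` seconds to render the element.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.XPATH, xpath))
        )
        return True
    except TimeoutException:
        # Assumption: an element that never appears means the item is available.
        return False
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Calling is_sold_out() with the product URL from the question would open a browser window, wait for the page's scripts to run, and return a boolean. Note that presence in the DOM is not the same as being visible on the page, so a stricter check may be needed depending on how the site renders the marker.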

Solution

You are not correct in your assessment of the problem.
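For context: by default (without stream=True), requests.get() returns only after the whole response body has been downloaded, so there is no "first part" to wait past. What a plain HTTP client never does is run the page's JavaScript. Here is a rough sketch of how one might test both points on the returned text. It uses only the standard library, and ClassCollector, diagnose, and the sample markup are hypothetical names and data made up for illustration, not the real page.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects every class token that appears in the static markup."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def diagnose(html_text, wanted_class):
    """Report whether the document looks complete and contains the class."""
    collector = ClassCollector()
    collector.feed(html_text)
    collector.close()
    return {
        # A closing </html> tag suggests the server sent the whole document.
        "looks_complete": "</html>" in html_text.lower(),
        # False means the class is absent from the static markup entirely.
        "class_in_static_html": wanted_class in collector.classes,
    }


# Hypothetical stand-in for page.text: complete markup whose visible content
# would be filled in later by JavaScript running in a browser.
sample = """<html><body>
<div ng-app="store"><div ng-view></div></div>
<script src="app.js"></script>
</body></html>"""

result = diagnose(sample, "product-soldout")
print(result)
```

If looks_complete is true but class_in_static_html is false for the real page.text, the element is most likely created in the browser by a script, and waiting longer in requests would not help.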

You can check the results and see that there's a