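Goal: use Fiddler to capture the real URL behind the AJAX requests made by the page https://movie.douban.com/typerank?type_name=%E5%89%A7%E6%83%85&type=11&interval_id=100:90&action=
Then scroll the browser to the bottom of the page, and the browser sends another AJAX request: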
# coding:utf-8
import urllib
import urllib2
url = "https://movie.douban.com/j/chart/top_list?type=11&interval_id=100%3A90&action="
headers = {"User-Agent" : "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}
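# Note: "start" is sent as the position of the first item to return (an offset), not a page number
startPage = raw_input("Enter the start position (offset of the first item): ")
size = raw_input("Enter the number of items per page: ")
# Build the real URL of the AJAX request
fullurl = url + "&start=" + str(startPage) + "&limit=" + str(size)
# Send the request with a browser-like User-Agent so it looks like a normal browser visit
request = urllib2.Request(fullurl,headers = headers)
response = urllib2.urlopen(request)
# Print the raw response body
print response.read()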
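
The script above is Python 2 and builds the query string by hand. As a rough sketch of the same idea in Python 3, the query can be assembled with `urllib.parse.urlencode`, which also takes care of escaping `100:90` as `100%3A90`. The helper names `build_top_list_url`, `start_for_page`, and `fetch_top_list` are hypothetical and not part of the original script. The sketch assumes that `start` is a zero-based item offset and that the endpoint returns JSON; neither is confirmed by the text above.

```python
import json
import urllib.parse
import urllib.request

# Endpoint captured with Fiddler (taken from the script above)
BASE_URL = "https://movie.douban.com/j/chart/top_list"

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/54.0.2840.99 Safari/537.36"
}


def build_top_list_url(start, limit):
    """Build the AJAX URL; urlencode escapes ':' in interval_id as %3A."""
    params = [
        ("type", "11"),
        ("interval_id", "100:90"),
        ("action", ""),
        ("start", start),
        ("limit", limit),
    ]
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def start_for_page(page, limit):
    """Convert a 1-based page number to an offset (assumes start is an offset)."""
    return (page - 1) * limit


def fetch_top_list(start, limit, timeout=10):
    """Send the request and decode the body, assuming the endpoint returns JSON."""
    request = urllib.request.Request(build_top_list_url(start, limit), headers=HEADERS)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Under these assumptions, `fetch_top_list(start_for_page(2, 20), 20)` would request the second batch of 20 items, which is the kind of request the browser sends when the page is scrolled to the bottom.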